IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING DRIMUX:Dynamic Rumor Influence Minimization with User Experience in Social Networks Biao Wang,Ge Chen,Luoyi Fu,Li Song,and Xinbing Wang,Senior Member,IEEE Abstract-With the soaring development of large scale online social networks,online information sharing is becoming ubiquitous everyday.Various information is propagating through online social networks including both the positive and negative.In this paper,we focus on the negative information problems such as the online rumors.Rumor blocking is a serious problem in large-scale social networks.Malicious rumors could cause chaos in society and hence need to be blocked as soon as possible after being detected.In this paper,we propose a model of dynamic rumor influence minimization with user experience(DRIMUX).Our goal is to minimize the influence of the rumor(i.e.,the number of users that have accepted and sent the rumor)by blocking a certain subset of nodes.A dynamic Ising propagation model considering both the global popularity and individual attraction of the rumor is presented based on realistic scenario.In addition,different from existing problems of influence minimization,we take into account the constraint of user experience utility.Specifically,each node is assigned a tolerance time threshold.If the blocking time of each user exceeds that threshold,the utility of the network will decrease.Under this constraint,we then formulate the problem as a network inference problem with survival theory,and propose solutions based on maximum likelihood principle.Experiments are implemented based on large-scale real world networks and validate the effectiveness of our method. Index Terms-Social network,rumor blocking,survival theory. 1 INTRODUCTION Wggeiowmh2mine0Tac companies then took measures to block related accounts delete relevant posts and fanpages on their social network and Chinese Sina Weibo,etc.,hundreds of millions of people platforms to prevent the ISIS from spreading malicious are able to become friends [2]and share all kinds of infor- information.Additionally,Facebook et al.also have issued mation with each other.Online social network analysis has relevant security policies and standards to claim the au- also attracted growing interest among researchers [3],[4], thority to block accounts of users when they are against [5],[6].On one hand,these online social platforms provide rules or at risk [13].Undoubtedly,malicious rumors should great convenience to the diffusion of positive information be stopped as soon as possible once detected so that their such as new ideas,innovations,and hot topics [7],[8].On negative influence can be minimized. the other hand,however,they may become a channel for the Most of the previous works studied the problem of spreading of malicious rumors or misinformation [9],[10], maximizing the influence of positive information through [11].For example,some people may post on social networks social networks [14],[15],[16].Fast approximation methods a rumor about an upcoming earthquake,which will cause were also proposed to influence maximization problem chaos among the crowd and hence may hinder the normal [17],[18].In contrast,the negative influence minimization public order.In this case,it is necessary to detect the rumor problem has gained much less attention,but still there have source and delete related messages,which may be enough been consistent efforts on designing effective strategies for to prevent the rumor from further spreading.However,in blocking malicious rumors and minimizing the negative certain extreme circumstances such as terrorist online attack, influence.Budak et al.[9]introduced the notion of a "good" it might be necessary to disable or block related Social campaign in a social network to counteract the negative Network(SN)accounts to avoid serious negative influences. influence of a "bad"one by convincing users to adopt For instance,in 2016,the families of three out of the forty the "good"one.Kimura et al.[19]studied the problem nine victims from the Orlando nightclub shooting incident of minimizing the propagation of malicious rumors by filed a lawsuit against Twitter,Facebook and Google for blocking a limited number of links in a social network.They providing "material support"to the terrorism organization provided two different definitions of contamination degree of the Islamic State of Iraq and Syria (ISIS)[12].These and proposed corresponding optimization algorithms. Fan et al.[20]investigated the least cost rumor blocking problem in social networks.They introduced the concept of .B.Wang,Ge Chen,Luoyi Fu,Li Song and Xinbing Wang are with the 'protectors"and try to select a minimal number of them to Department of Electronic Engineering,Shanghai Jiao Tong University, 800 Dongchuan Road,Shanghai,China. limit the bad influence of rumors by triggering a protection E-mail:{sweetmvp24,chenge,yiluofu,song _li,xwang8y@sjtu.edu.cn. cascade against the rumor cascade.However,there are a few limitations in those works.First,they consider the The early version of this paper appeared in the Proceedings of AAAl 2016[1]. rumor popularity as constant during the whole propagation
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 1 DRIMUX: Dynamic Rumor Influence Minimization with User Experience in Social Networks Biao Wang, Ge Chen, Luoyi Fu, Li Song, and Xinbing Wang, Senior Member, IEEE Abstract—With the soaring development of large scale online social networks, online information sharing is becoming ubiquitous everyday. Various information is propagating through online social networks including both the positive and negative. In this paper, we focus on the negative information problems such as the online rumors. Rumor blocking is a serious problem in large-scale social networks. Malicious rumors could cause chaos in society and hence need to be blocked as soon as possible after being detected. In this paper, we propose a model of dynamic rumor influence minimization with user experience (DRIMUX). Our goal is to minimize the influence of the rumor (i.e., the number of users that have accepted and sent the rumor) by blocking a certain subset of nodes. A dynamic Ising propagation model considering both the global popularity and individual attraction of the rumor is presented based on realistic scenario. In addition, different from existing problems of influence minimization, we take into account the constraint of user experience utility. Specifically, each node is assigned a tolerance time threshold. If the blocking time of each user exceeds that threshold, the utility of the network will decrease. Under this constraint, we then formulate the problem as a network inference problem with survival theory, and propose solutions based on maximum likelihood principle. Experiments are implemented based on large-scale real world networks and validate the effectiveness of our method. Index Terms—Social network, rumor blocking, survival theory. ✦ 1 INTRODUCTION WITH the soaring development and rising popularity of large-scale social networks such as Twitter, Facebook, and Chinese Sina Weibo, etc., hundreds of millions of people are able to become friends [2] and share all kinds of information with each other. Online social network analysis has also attracted growing interest among researchers [3], [4], [5], [6]. On one hand, these online social platforms provide great convenience to the diffusion of positive information such as new ideas, innovations, and hot topics [7], [8]. On the other hand, however, they may become a channel for the spreading of malicious rumors or misinformation [9], [10], [11]. For example, some people may post on social networks a rumor about an upcoming earthquake, which will cause chaos among the crowd and hence may hinder the normal public order. In this case, it is necessary to detect the rumor source and delete related messages, which may be enough to prevent the rumor from further spreading. However, in certain extreme circumstances such as terrorist online attack, it might be necessary to disable or block related Social Network (SN) accounts to avoid serious negative influences. For instance, in 2016, the families of three out of the forty nine victims from the Orlando nightclub shooting incident filed a lawsuit against Twitter, Facebook and Google for providing “material support” to the terrorism organization of the Islamic State of Iraq and Syria (ISIS) [12]. These • B. Wang, Ge Chen, Luoyi Fu, Li Song and Xinbing Wang are with the Department of Electronic Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, China. E-mail: {sweetmvp24, chenge, yiluofu, song li, xwang8}@sjtu.edu.cn. The early version of this paper appeared in the Proceedings of AAAI 2016 [1]. companies then took measures to block related accounts, delete relevant posts and fanpages on their social network platforms to prevent the ISIS from spreading malicious information. Additionally, Facebook et al. also have issued relevant security policies and standards to claim the authority to block accounts of users when they are against rules or at risk [13]. Undoubtedly, malicious rumors should be stopped as soon as possible once detected so that their negative influence can be minimized. Most of the previous works studied the problem of maximizing the influence of positive information through social networks [14], [15], [16]. Fast approximation methods were also proposed to influence maximization problem [17], [18]. In contrast, the negative influence minimization problem has gained much less attention, but still there have been consistent efforts on designing effective strategies for blocking malicious rumors and minimizing the negative influence. Budak et al. [9] introduced the notion of a “good” campaign in a social network to counteract the negative influence of a “bad” one by convincing users to adopt the “good” one. Kimura et al. [19] studied the problem of minimizing the propagation of malicious rumors by blocking a limited number of links in a social network. They provided two different definitions of contamination degree and proposed corresponding optimization algorithms. Fan et al. [20] investigated the least cost rumor blocking problem in social networks. They introduced the concept of “protectors” and try to select a minimal number of them to limit the bad influence of rumors by triggering a protection cascade against the rumor cascade. However, there are a few limitations in those works. First, they consider the rumor popularity as constant during the whole propagation
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 2 process,which is not close to the realistic scenarios.Second, in the design of the rumor blocking strategies,either blocking nodes or links,they fail to take into account the issue of user experience in real world social networks.We have to avoid blocking the accounts of users for such a long time that they may lodge complaints or even quit the social network.Therefore,it is necessary to consider the impact of blocking time on both individual node and the entire network. In this paper,we investigate the problem of dynamic rumor influence minimization with user experience.First, based on existing works on information diffusion in social networks [21],[22],[23],[24],we incorporate the rumor popularity dynamics in the diffusion model.We analyze Fig.1.The random graph denotation of online social networks.Nodes existing investigations on topic propagation dynamics [25] with different colors illustrate the different communities they belong to. and bursty topic patterns [26].Then we choose Chi-squared The size of the nodes indicate their degrees or "influence"in the social distribution to approximate the global rumor popularity. network.Normally,the more influential a node is,the more contributions Inspired by the novel energy model proposed by Han et it will make to rumor propagation. al.[27],we then analyze the individual tendency towards the rumor and present the probability of successful rumor formulation in Section 4,the solutions in Section 5,and the propagation between a pair of nodes.Finally,inspired by experiments in Section 6.Finally,we conclude the paper in the concept of Ising model [28],we derive the cooperative Section 7. succeeding probability of rumor propagation that integrates the global rumor popularity with individual tendency.After that,we introduce the concept of user experience utility 2 PRELIMINARIES function and analyze the impact of blocking time of nodes to 2.1 Social Network Definition the rumor propagation process.We then adopt the survival A social network,in mathematical context,can be formu- theory to explain the likelihood of nodes getting activated, and propose both greedy and dynamic algorithms based on lated as a directed graph G=(V,E)consisting of a set of nodes V representing the users,and a set of directed edges maximum likelihood principle. E denoting the relationship between users(e.g.following or The contributions of our work are as follows: being followed).Figure 1 shows the random graph illustra- We propose a rumor propagation model taking into tion of a social network.LetV=N denote the number of account the following three elements:First,the glob- nodes,and(u,v)EE denote the directed edge from node al popularity of the rumor over the entire social u to node v(u,v∈V),and auv∈{0,l}denote the edge network,i.e.,the general topic dynamics.Second, coefficient,where ouv =1 represents the existence of edge the attraction dynamics of the rumor to a potential (u,v),and auv=0,otherwise.We use puv to denote the spreader,i.e.,the individual tendency to forward the probability of u sending the rumor to v and v accepting it, rumor to its neighbors.Third,the acceptance proba- i.e.,the success probability of u activating v.Let D(u)denote bility of the rumor recipients.In our model,inspired the in-degree of node u.From Figure 1,we can see nodes in by the Ising model,we combine all three factors larger size have higher degree than those in smaller size. together to propose a cooperative rumor propagation The degree of a node is also an indication of "influence"in probability. a social network since higher degree denotes more connec- In our rumor blocking strategies,we consider the tions to other nodes,thus it implies more opportunities to influence of blocking time to user experience in real share information (both positive and negative)with other world social networks.Thus we propose a blocking nodes time constraint into the traditional rumor influence minimization objective function.In that case,our 2.2 Rumor Diffusion Model method optimizes the rumor blocking strategy with- Rumor diffusion mechanism is similar with that of epidemic out sacrificing the online user experience. We use survival theory to analyze the likelihood of propagation [29].During the propagation of rumors,each node could have one of the following three states:Suscep- nodes becoming activated or infected by the rumor before a time threshold which is determined by the tible (S),Infected (I)and Recovered (R),which is known user experience constraint.Then we propose both as the SIR model [30],[31].The state of being susceptible greedy and dynamic blocking algorithms using the represents the node has the potential to accept and spread the rumor at any time;Infected indicates the node has maximum likelihood principle. already accepted and spread the rumor;Recovered denotes The rest of the paper is organized as follows.In Section the state of the node identifying the rumor and denying 2,we introduce the preliminaries of social network and it.In this paper,we consider the rumor propagation as a information diffusion models.Next we give an overview of progressive process,i.e.,once a node is infected,it will stay the related work in Section 3.Then we propose the problem infected and not recover,which is the SI model
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 2 process, which is not close to the realistic scenarios. Second, in the design of the rumor blocking strategies, either blocking nodes or links, they fail to take into account the issue of user experience in real world social networks. We have to avoid blocking the accounts of users for such a long time that they may lodge complaints or even quit the social network. Therefore, it is necessary to consider the impact of blocking time on both individual node and the entire network. In this paper, we investigate the problem of dynamic rumor influence minimization with user experience. First, based on existing works on information diffusion in social networks [21], [22], [23], [24], we incorporate the rumor popularity dynamics in the diffusion model. We analyze existing investigations on topic propagation dynamics [25] and bursty topic patterns [26]. Then we choose Chi-squared distribution to approximate the global rumor popularity. Inspired by the novel energy model proposed by Han et al. [27], we then analyze the individual tendency towards the rumor and present the probability of successful rumor propagation between a pair of nodes. Finally, inspired by the concept of Ising model [28], we derive the cooperative succeeding probability of rumor propagation that integrates the global rumor popularity with individual tendency. After that, we introduce the concept of user experience utility function and analyze the impact of blocking time of nodes to the rumor propagation process. We then adopt the survival theory to explain the likelihood of nodes getting activated, and propose both greedy and dynamic algorithms based on maximum likelihood principle. The contributions of our work are as follows: • We propose a rumor propagation model taking into account the following three elements: First,the global popularity of the rumor over the entire social network, i.e., the general topic dynamics. Second, the attraction dynamics of the rumor to a potential spreader, i.e., the individual tendency to forward the rumor to its neighbors. Third, the acceptance probability of the rumor recipients. In our model, inspired by the Ising model, we combine all three factors together to propose a cooperative rumor propagation probability. • In our rumor blocking strategies, we consider the influence of blocking time to user experience in real world social networks. Thus we propose a blocking time constraint into the traditional rumor influence minimization objective function. In that case, our method optimizes the rumor blocking strategy without sacrificing the online user experience. • We use survival theory to analyze the likelihood of nodes becoming activated or infected by the rumor before a time threshold which is determined by the user experience constraint. Then we propose both greedy and dynamic blocking algorithms using the maximum likelihood principle. The rest of the paper is organized as follows. In Section 2, we introduce the preliminaries of social network and information diffusion models. Next we give an overview of the related work in Section 3. Then we propose the problem Fig. 1. The random graph denotation of online social networks. Nodes with different colors illustrate the different communities they belong to. The size of the nodes indicate their degrees or “influence” in the social network. Normally, the more influential a node is, the more contributions it will make to rumor propagation. formulation in Section 4, the solutions in Section 5, and the experiments in Section 6. Finally, we conclude the paper in Section 7. 2 PRELIMINARIES 2.1 Social Network Definition A social network, in mathematical context, can be formulated as a directed graph G = (V, E) consisting of a set of nodes V representing the users, and a set of directed edges E denoting the relationship between users (e.g. following or being followed). Figure 1 shows the random graph illustration of a social network. Let |V | = N denote the number of nodes, and (u, v) ∈ E denote the directed edge from node u to node v (u, v ∈ V ), and αuv ∈ {0, 1} denote the edge coefficient, where αuv = 1 represents the existence of edge (u, v), and αuv = 0, otherwise. We use puv to denote the probability of u sending the rumor to v and v accepting it, i.e., the success probability of u activating v. Let D(u) denote the in-degree of node u. From Figure 1, we can see nodes in larger size have higher degree than those in smaller size. The degree of a node is also an indication of “influence” in a social network since higher degree denotes more connections to other nodes, thus it implies more opportunities to share information (both positive and negative) with other nodes. 2.2 Rumor Diffusion Model Rumor diffusion mechanism is similar with that of epidemic propagation [29]. During the propagation of rumors, each node could have one of the following three states: Susceptible (S), Infected (I) and Recovered (R), which is known as the SIR model [30], [31]. The state of being susceptible represents the node has the potential to accept and spread the rumor at any time; Infected indicates the node has already accepted and spread the rumor; Recovered denotes the state of the node identifying the rumor and denying it. In this paper, we consider the rumor propagation as a progressive process, i.e., once a node is infected, it will stay infected and not recover, which is the SI model
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 3 1400000 Samples of Rumor Evolution Pattern ■Normal 1200000 100000( ■Blocked A0000 0000 Natural Rumor Propagation Propagation Under Blockage 40000 200000 Ap Oct Apr Oct Fig.2.Typical cases of the topic evolution pattern curves extracted from Sina Weibo from Jan.2011 to Jan.2013.Different colors represent different hot topics and their evolution tracks respectively. Fig.3.Illustration of the process of natural rumor propagation and Diffusion models describe the process of information the process with rumor blockage,i.e.,blocking a certain amount of propagating through the network.Two classic diffusion "important"nodes in the propagation path. models are the Linear Threshold (LT)[32]and the Inde- pendent Cascade (IC)model [33],[34].In LT model,an million-scale tweets and blog posts.Crane et al.[35]demon- inactive node u becomes activated if the ratio of its activat- strated the existence of Poisson distribution and Power- ed parent nodes surpasses a certain pre-defined threshold 0<<1.In this paper,since we mainly focus on pairwise law relaxation in controlling the topic evolution over time. Figure 2 shows several typical topic evolution curves based probabilistic rumor propagation among nodes including on the data extracted from Sina Weibo.From the figure, individual tendency of a node to a rumor as well as the in each topic evolution curve,we can see the three phases global popularity probability of a rumor,it is more suitable mentioned above.It is also discernable that every topic has to adopt the IC model in our work. a certain lifespan of its own,which is similar to the case of The IC model has been widely adopted in information d- iffusion problems.The whole propagation process proceeds rumor propagation process.Thus,it is reasonable to utilize in discrete time steps to,t1,t2,....Initially,the cascade is a function to simulate the process of rumor propagation. According to the topic evolution characteristics,in our work, triggered by a set of activated nodes,i.e.,the seed nodes at we use the Chi-squared distribution [36],[37],[38]to simu- to.In our rumor diffusion context,we assume the rumor late the rumor propagation dynamics. is started by one source node s in the network,and the other nodes are inactive.We use su(t)E[0,1}to denote the 3.2 Energy Model state of node u at time step t,where su(t)=1 represents u is activated and s(t)=0,otherwise.For every following Rumor propagation can be considered as a type of social time step t>1,each node u which was activated at time contagion process [39]with several special characteristics. step (t-1)has a single opportunity to activate any of its Firstly,people's interest of a rumor tends to decrease with currently inactive neighbors v with a success probability time,which indicates the probability of a node willing to pv.In our context,it means in each time step,any node forward the rumor.That process is similar to the simulated that has accepted the rumor in previous time step has a annealing process [40].Han et al.[27]proposed a novel en- chance to let their inactive neighbors accept the rumor.For ergy model to describe the rumor propagation process.They simplicity,we assume that once a node is activated by the introduce the heat energy calculation formula AE=cmAT rumor,it will stay activated until the end of the diffusion in Physics to analogize the rumor impact.The rumor's process. influence on individual node is formulated as the amount of accumulated heat energy.Based on the model proposed by Han et al.[27],we define the expression of individual 3 RELATED WORK tendency with respect to the success activation probability 3.1 Topic Dynamics between a pair of nodes.In addition,even though an ac- Researchers have studied the temporal dynamics of we- tivated node does transmit the rumor to its neighbors,the b topics based on real-world statistics.Yang et al.[25] probability of these neighbors accepting the rumor is still to analyzed how the number of tweets related to a specific be determined.In that case,we can define the acceptance theme(i.e.,the popularity of a topic)changes with time, probability of the rumor recipient.By combining the rumor and revealed that a topic evolution generally consists of sending probability at the transmitting end with the rumor three phases,i.e.,a rising phase from the start,a peak acceptance probability at the receiving end,we can obtain period and then a fading phase.Fluctuations in each phase the ultimate rumor propagation probability. may result in different temporal characteristics.Yang et al. [25]proposed K-Spectral Centroids clustering algorithm 3.3 Ising Model for classifying online content according to their temporal The Ising model [41]is a widely applicable model in the patterns and finally extract six representative patterns from research of Physics theory.It is a simple theoretical de-
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 3 Jan 2011 Apr Jul Oct Jan 2012 Apr Jul Oct Jan 2013 Date 0 200000 400000 600000 800000 1000000 1200000 1400000 Amount Samples of Rumor Evolution Pattern Fig. 2. Typical cases of the topic evolution pattern curves extracted from Sina Weibo from Jan. 2011 to Jan. 2013. Different colors represent different hot topics and their evolution tracks respectively. Diffusion models describe the process of information propagating through the network. Two classic diffusion models are the Linear Threshold (LT) [32] and the Independent Cascade (IC) model [33], [34]. In LT model, an inactive node u becomes activated if the ratio of its activated parent nodes surpasses a certain pre-defined threshold 0 < θ < 1. In this paper, since we mainly focus on pairwise probabilistic rumor propagation among nodes including individual tendency of a node to a rumor as well as the global popularity probability of a rumor, it is more suitable to adopt the IC model in our work. The IC model has been widely adopted in information diffusion problems. The whole propagation process proceeds in discrete time steps t0, t1, t2, . . . . Initially, the cascade is triggered by a set of activated nodes, i.e., the seed nodes at t0. In our rumor diffusion context, we assume the rumor is started by one source node s in the network, and the other nodes are inactive. We use su(t) ∈ {0, 1} to denote the state of node u at time step t, where su(t) = 1 represents u is activated and su(t) = 0, otherwise. For every following time step t ≥ 1, each node u which was activated at time step (t − 1) has a single opportunity to activate any of its currently inactive neighbors v with a success probability puv. In our context, it means in each time step, any node that has accepted the rumor in previous time step has a chance to let their inactive neighbors accept the rumor. For simplicity, we assume that once a node is activated by the rumor, it will stay activated until the end of the diffusion process. 3 RELATED WORK 3.1 Topic Dynamics Researchers have studied the temporal dynamics of web topics based on real-world statistics. Yang et al. [25] analyzed how the number of tweets related to a specific theme (i.e., the popularity of a topic) changes with time, and revealed that a topic evolution generally consists of three phases, i.e., a rising phase from the start, a peak period and then a fading phase. Fluctuations in each phase may result in different temporal characteristics. Yang et al. [25] proposed K-Spectral Centroids clustering algorithm for classifying online content according to their temporal patterns and finally extract six representative patterns from Normal Infected Blocked Natural Rumor Propagation Propagation Under Blockage Fig. 3. Illustration of the process of natural rumor propagation and the process with rumor blockage, i.e., blocking a certain amount of “important” nodes in the propagation path. million-scale tweets and blog posts. Crane et al. [35] demonstrated the existence of Poisson distribution and Powerlaw relaxation in controlling the topic evolution over time. Figure 2 shows several typical topic evolution curves based on the data extracted from Sina Weibo. From the figure, in each topic evolution curve, we can see the three phases mentioned above. It is also discernable that every topic has a certain lifespan of its own, which is similar to the case of rumor propagation process. Thus, it is reasonable to utilize a function to simulate the process of rumor propagation. According to the topic evolution characteristics, in our work, we use the Chi-squared distribution [36], [37], [38] to simulate the rumor propagation dynamics. 3.2 Energy Model Rumor propagation can be considered as a type of social contagion process [39] with several special characteristics. Firstly, people’s interest of a rumor tends to decrease with time, which indicates the probability of a node willing to forward the rumor. That process is similar to the simulated annealing process [40]. Han et al. [27] proposed a novel energy model to describe the rumor propagation process. They introduce the heat energy calculation formula ∆E = cm∆T in Physics to analogize the rumor impact. The rumor’s influence on individual node is formulated as the amount of accumulated heat energy. Based on the model proposed by Han et al. [27], we define the expression of individual tendency with respect to the success activation probability between a pair of nodes. In addition, even though an activated node does transmit the rumor to its neighbors, the probability of these neighbors accepting the rumor is still to be determined. In that case, we can define the acceptance probability of the rumor recipient. By combining the rumor sending probability at the transmitting end with the rumor acceptance probability at the receiving end, we can obtain the ultimate rumor propagation probability. 3.3 Ising Model The Ising model [41] is a widely applicable model in the research of Physics theory. It is a simple theoretical de-
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING scription of the concept of ferromagnetism in Physics [42]. 4 PROBLEM FORMULATION Specifically,it describes the phenomenon that when an array of atomic spins align in the way that the magnetic moments 4.1 Dynamic Rumor Propagation with Ising Model associated to them will all point in the same direction. Kempe et al.[14]considered the success probability puv Then it will create a macroscopic magnetic moment.Gen- as a system parameter and is fixed at the very beginning erally speaking,the Ising model contains two parts-the of the cascade.However,based on the topic dynamics we microscopic and macroscopic parts.The microscopic part discussed in a previous section,at different time steps of represents the local or individual behavior which is the the propagation process,a topic can vary dramatically in alignment of each of the atomic spins.Correspondingly,its popularity.Besides,the rumor attraction [27]for each the macroscopic part stands for the global or collective individual node u E V is also a realistic factor we should behavior which is the exterior magnetic moment.Based on take into account.That means the success of rumor prop- its intrinsic attributes,the Ising model can be generalized to agation between neighbors includes two aspects:first,the other similar scenarios.In our work,we utilize it to model activated node u has to be so attracted by the rumor that the rumor propagation process in social networks. it will choose to send the rumor to its neighbors;second, one of u's inactive neighbors v decides to accept the rumor Only after those two steps,we can claim that v is activated. In other words,the success of rumor propagation depends 3.4 User Experience both on the global popularity and the individual tendency of the rumor topic,which can be regarded as a generalized User experience is an important factor for various services feature of the Ising model. including social networks [43].Existing rumor blocking strategies block either nodes (users)or links (connections Now we investigate the two steps of a successful rumor propagation.In the first step,at any time stamp ti,u is between users)in social networks to prevent the rumor from further propagation.However,none has analyzed the one of the activated nodes in time stamp tj-1.Based on the work in [27],we give the modified version of the probability impact of blocking nodes.Generally speaking,the longer the user is blocked,the less satisfactory the user feels about of node u sending the rumor to one of its inactive neighbors vas the social network.Therefore,if the blocked time surpasses Po a certain threshold,it is possible that the user may quit pd(与)=1g(10+t) (1) the social network or at least lodge a complaint to the administrator.Bhatti et al.[44]analyzed the user-perceived where po is the initial sending probability at time stamp 0. quality in web server design and found that users'tolerance On the receiving end,the probability of node v accepting for latency decreases over the duration of interaction with the rumor transmitted by its parent node u is also given as a site.A utility function was presented to measure the piuce =1/Dv, (2) customer satisfaction.Inspired by that,in our work,we apply a modified utility function to measure user experience where D is the in-degree of node v.That denotation can in rumor blocking. be interpreted as that if a node has very high degree,it will receive more information than those with lower degrees.In that case,due to the large number of pieces of information 3.5 Rumor Influence Minimization it receives,every single piece of information will have much lower possibility to be read,recognized and forwarded by Rumor influence minimization addresses the problem of the node.For example,if a user has only one friend on minimizing the propagation effect of undesirable rumors a social network,he or she would probably read all the in social networks.Figure 3 demonstrates the mechanism messages forwarded by his or her friend,and thus if the user of rumor blocking in social networks.It shows both the finds a rumor interesting,he or she will probably believe normal rumor propagation process without any blockage in and forward it.In contrast,if a user has hundreds or even the social network and the process blocking a set of nodes thousands of friends,he or she could easily miss most of the on the path of rumor propagation.The rumor influence information. minimization problem is converse to the classic influence Thus,based on the above analysis,we then give the maximization problem [14]. probability of successful rumor propagation from u to v as It has been investigated in different influence diffusion mod- els in social networks.Fan et al.[20]studied the least cost rumor blocking problem in social networks,and introduced aat,)=pd,哈=D。g10+t 1 Po (3) the notion of "protectors"to limit the bad influence of ru- which can be defined as the individual tendency between mors by initiating a protector cascade to propagate against different pair of nodes in the network. the rumor cascade.Greedy algorithm is proposed for both Now we discuss the global topic popularity of the ru- opportunistic and deterministic cascade models.However, mor.As mentioned in related work,the rumor popularity Kimura et al.[19]proposed the strategy of blocking links generally includes three phases and approximately subject instead of nodes in social networks so as to minimize the to the chi square distribution,which is given by propagation of malicious rumors.Different contamination minimization problems are defined based on different defi- 21-)tk-1e-号 nitions of contamination degree of a network. Pglb(t;k)= T() (4)
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 4 scription of the concept of ferromagnetism in Physics [42]. Specifically, it describes the phenomenon that when an array of atomic spins align in the way that the magnetic moments associated to them will all point in the same direction. Then it will create a macroscopic magnetic moment. Generally speaking, the Ising model contains two parts – the microscopic and macroscopic parts. The microscopic part represents the local or individual behavior which is the alignment of each of the atomic spins. Correspondingly, the macroscopic part stands for the global or collective behavior which is the exterior magnetic moment. Based on its intrinsic attributes, the Ising model can be generalized to other similar scenarios. In our work, we utilize it to model the rumor propagation process in social networks. 3.4 User Experience User experience is an important factor for various services including social networks [43]. Existing rumor blocking strategies block either nodes (users) or links (connections between users) in social networks to prevent the rumor from further propagation. However, none has analyzed the impact of blocking nodes. Generally speaking, the longer the user is blocked, the less satisfactory the user feels about the social network. Therefore, if the blocked time surpasses a certain threshold, it is possible that the user may quit the social network or at least lodge a complaint to the administrator. Bhatti et al. [44] analyzed the user-perceived quality in web server design and found that users’ tolerance for latency decreases over the duration of interaction with a site. A utility function was presented to measure the customer satisfaction. Inspired by that, in our work, we apply a modified utility function to measure user experience in rumor blocking. 3.5 Rumor Influence Minimization Rumor influence minimization addresses the problem of minimizing the propagation effect of undesirable rumors in social networks. Figure 3 demonstrates the mechanism of rumor blocking in social networks. It shows both the normal rumor propagation process without any blockage in the social network and the process blocking a set of nodes on the path of rumor propagation. The rumor influence minimization problem is converse to the classic influence maximization problem [14]. It has been investigated in different influence diffusion models in social networks. Fan et al. [20] studied the least cost rumor blocking problem in social networks, and introduced the notion of “protectors” to limit the bad influence of rumors by initiating a protector cascade to propagate against the rumor cascade. Greedy algorithm is proposed for both opportunistic and deterministic cascade models. However, Kimura et al. [19] proposed the strategy of blocking links instead of nodes in social networks so as to minimize the propagation of malicious rumors. Different contamination minimization problems are defined based on different defi- nitions of contamination degree of a network. 4 PROBLEM FORMULATION 4.1 Dynamic Rumor Propagation with Ising Model Kempe et al. [14] considered the success probability puv as a system parameter and is fixed at the very beginning of the cascade. However, based on the topic dynamics we discussed in a previous section, at different time steps of the propagation process, a topic can vary dramatically in its popularity. Besides, the rumor attraction [27] for each individual node u ∈ V is also a realistic factor we should take into account. That means the success of rumor propagation between neighbors includes two aspects: first, the activated node u has to be so attracted by the rumor that it will choose to send the rumor to its neighbors; second, one of u’s inactive neighbors v decides to accept the rumor. Only after those two steps, we can claim that v is activated. In other words, the success of rumor propagation depends both on the global popularity and the individual tendency of the rumor topic, which can be regarded as a generalized feature of the Ising model. Now we investigate the two steps of a successful rumor propagation. In the first step, at any time stamp tj , u is one of the activated nodes in time stamp tj−1. Based on the work in [27], we give the modified version of the probability of node u sending the rumor to one of its inactive neighbors v as p send u (tj ) = p0 lg(10 + tj ) , (1) where p0 is the initial sending probability at time stamp 0. On the receiving end, the probability of node v accepting the rumor transmitted by its parent node u is also given as p acc v = 1/Dv, (2) where Dv is the in-degree of node v. That denotation can be interpreted as that if a node has very high degree, it will receive more information than those with lower degrees. In that case, due to the large number of pieces of information it receives, every single piece of information will have much lower possibility to be read, recognized and forwarded by the node. For example, if a user has only one friend on a social network, he or she would probably read all the messages forwarded by his or her friend, and thus if the user finds a rumor interesting, he or she will probably believe and forward it. In contrast, if a user has hundreds or even thousands of friends, he or she could easily miss most of the information. Thus, based on the above analysis, we then give the probability of successful rumor propagation from u to v as pind(tj ) = p send u · p acc v = 1 Dv p0 lg(10 + tj ) , (3) which can be defined as the individual tendency between different pair of nodes in the network. Now we discuss the global topic popularity of the rumor. As mentioned in related work, the rumor popularity generally includes three phases and approximately subject to the chi square distribution, which is given by pglb(t; k) = 2 (1− k 2 ) t k−1 e − t 2 2 Γ( k 2 ) , (4)
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 5 where >0 represents the degree of freedom,I()is the propagation process. Gamma function.It explains a common social phenomenon Heterogeneous Networks.In heterogeneous social net- that when a rumor spreads for a while,it may create a works,different nodes have distinct properties.For exam- "rumor atmosphere"that could affect the judgements or ple,in real world social networks like Twitter or Weibo, decisions of users on online social networks. different users have different levels according to the time According to the Ising model [28],the"phase transition" they have spent on it,the number of followers they have of a spin involves both short-range interaction with its or the number of messages they have posted.Typically, nearest neighbors and long-term system evolution,and is we can simply divide all users into VIPs and ordinary a combined result.Inspired by that,we propose the co- users considering their levels.Intuitively,the social network operative propagation probability integrating Palb(t;k)with operators would try to avoid blocking the VIP users as Pind(t)in the form of a logistic function as possible as they can because of the impact they have on 1 the enormous followers and hence on the entire network. Pu(d=1+exp-B'psbt内+Pm可 On the other hand,from the perspective of the VIPs,since they usually have frequent interactions with their followers, 1 there is a high possibility that they can not tolerate to be 1+exp- B,--竖+六品可 blocked for a long time.As a result,the VIPs usually have a T() relatively low tolerance threshold of the blocked time. (5) Let C(u)denote the level of node u,and T(u)denote the where B1,B2 E(0,1)are the balance coefficients which tolerance threshold of node u when it is blocked.Then we satisfy B1+B2=1. propose the following expression: Based on this cooperative propagation probability,the 1 probability of node v getting activated at time stamp t;can T(u=c(西 (8) be given by The mechanism can be explained as follows.When the level Pr[s(G)=1刂=1-Π[1-su(G-)pun(tl, (6) of a user u,C(u),is approximate to zero,i.e.,the user has UEP just joined the social network,its tolerance threshold is close where Pr]represents probability,and P represents the to infinity,because even if it is blocked forever,it can just parent nodes of v. apply for a new account without any loss.On the contrary, if u is a VIP user,the higher its level is,the less blocked time it can endure.When C(u)is large enough,its tolerance will 4.2 User Experience Utility be asymptotic to zero. In the formulated optimization problem,one important There are several metrics to define the level or sig- constraint condition is the user experience utility function.nificance of a node in a social network,such as degree, Therefore,before giving the concrete algorithm,first,we eigenvector or betweenness.Also there is a widely adopt- elaborate on the user experience utility function. ed PageRank algorithm [45]which can be regarded as a It is a common sense that user experience is a critical variant of eigenvector centrality.For simplicity,we here use element in the success of modern business.A large variety the degree of a node u,D(u),to represent its level,i.e., of communication services involve user experience,such as C(u)=D(u),where is a constant coefficient.Thus,the web searching,telephone connecting,et.al.For customers user experience utility in a heterogeneous network can be using those services,the latency plays a tremendous role written as in their satisfaction extent,i.e.,the user experience utility. N Specifically,in our proposed problem,the user experience 11 >(1-SD(u)T6(u) (9) utility lies in the blocked time t of the selected node u. N In our work,we try to find the proper user experience utility function to characterize the impact of nodes being Given the user experience utility,now we analyze the blocked to the entire social network.Specifically,we analyze strategy of blocking nodes of different levels and its influ- both the homogeneous and heterogeneous scenarios. ence to the entire network.Our goal now is to select a certain Homogeneous Networks.For homogeneous social net- number of nodes and then block them so that the final active works,the simple case,we assume that all the nodes have nodes within a duration can be minimized.Nevertheless,it the same blocked time threshold Tih.In other words,we is a challenge to select the proper nodes to block without assume that all the users in the same social network have triggering the complaint of users.Let's assume that every same tolerance of the time being blocked regardless of the time a node is blocked longer than its tolerance threshold, time they have been in the social network.In this scenario, it will trigger a complaint to the operator.Besides,we also based on the user experience definition on web server in assume that a fraction of its neighbors will also trigger a [44],we define the user experience utility function as certain amount of complaints.For example,some users may attract many followers by frequently sharing interesting 1 information,so if they are blocked for a long time,and the Uhom N Th-To(u) N (7 followers fail to receive the information,it is likely that some u=l of them may lodge a complaint.In the following part,we where the Tih represents the tolerance time threshold,T(u)will use survival theory to analyze the process mentioned is used to record the blocked time of node u in the whole above
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 5 where k > 0 represents the degree of freedom, Γ(·) is the Gamma function. It explains a common social phenomenon that when a rumor spreads for a while, it may create a “rumor atmosphere” that could affect the judgements or decisions of users on online social networks. According to the Ising model [28], the “phase transition” of a spin involves both short-range interaction with its nearest neighbors and long-term system evolution, and is a combined result. Inspired by that, we propose the cooperative propagation probability integrating pglb(t; k) with pind(t) in the form of a logistic function as puv(t) = 1 1 + exp − β1 · pglb(t; k) + β2 · pind(t) = 1 1 + exp − β1 2 (1− k 2 ) t k−1e − t2 2 Γ( k 2 ) + β2 1 Dv p0 lg(10+t) , (5) where β1, β2 ∈ (0, 1) are the balance coefficients which satisfy β1 + β2 = 1. Based on this cooperative propagation probability, the probability of node v getting activated at time stamp tj can be given by P r[sv(tj ) = 1] = 1 − Y u∈Pv [1 − su(tj−1)puv(tj )], (6) where P r[·] represents probability, and Pv represents the parent nodes of v. 4.2 User Experience Utility In the formulated optimization problem, one important constraint condition is the user experience utility function. Therefore, before giving the concrete algorithm, first, we elaborate on the user experience utility function. It is a common sense that user experience is a critical element in the success of modern business. A large variety of communication services involve user experience, such as web searching, telephone connecting, et.al. For customers using those services, the latency plays a tremendous role in their satisfaction extent, i.e., the user experience utility. Specifically, in our proposed problem, the user experience utility lies in the blocked time t b u of the selected node u. In our work, we try to find the proper user experience utility function to characterize the impact of nodes being blocked to the entire social network. Specifically, we analyze both the homogeneous and heterogeneous scenarios. Homogeneous Networks. For homogeneous social networks, the simple case, we assume that all the nodes have the same blocked time threshold Tth. In other words, we assume that all the users in the same social network have same tolerance of the time being blocked regardless of the time they have been in the social network. In this scenario, based on the user experience definition on web server in [44], we define the user experience utility function as Uhom = 1 N X N u=1 Tth − Tb(u) Tth , (7) where the Tth represents the tolerance time threshold, Tb(u) is used to record the blocked time of node u in the whole propagation process. Heterogeneous Networks. In heterogeneous social networks, different nodes have distinct properties. For example, in real world social networks like Twitter or Weibo, different users have different levels according to the time they have spent on it, the number of followers they have or the number of messages they have posted. Typically, we can simply divide all users into VIPs and ordinary users considering their levels. Intuitively, the social network operators would try to avoid blocking the VIP users as possible as they can because of the impact they have on the enormous followers and hence on the entire network. On the other hand, from the perspective of the VIPs, since they usually have frequent interactions with their followers, there is a high possibility that they can not tolerate to be blocked for a long time. As a result, the VIPs usually have a relatively low tolerance threshold of the blocked time. Let L(u) denote the level of node u, and T (u) denote the tolerance threshold of node u when it is blocked. Then we propose the following expression: T (u) = 1 L(u) . (8) The mechanism can be explained as follows. When the level of a user u, L(u), is approximate to zero, i.e., the user has just joined the social network, its tolerance threshold is close to infinity, because even if it is blocked forever, it can just apply for a new account without any loss. On the contrary, if u is a VIP user, the higher its level is, the less blocked time it can endure. When L(u) is large enough, its tolerance will be asymptotic to zero. There are several metrics to define the level or significance of a node in a social network, such as degree, eigenvector or betweenness. Also there is a widely adopted PageRank algorithm [45] which can be regarded as a variant of eigenvector centrality. For simplicity, we here use the degree of a node u, D(u), to represent its level, i.e., L(u) = ζD(u), where ζ is a constant coefficient. Thus, the user experience utility in a heterogeneous network can be written as Uhet = 1 N X N u=1 (1 − ζD(u)Tb(u)). (9) Given the user experience utility, now we analyze the strategy of blocking nodes of different levels and its influence to the entire network. Our goal now is to select a certain number of nodes and then block them so that the final active nodes within a duration can be minimized. Nevertheless, it is a challenge to select the proper nodes to block without triggering the complaint of users. Let’s assume that every time a node is blocked longer than its tolerance threshold, it will trigger a complaint to the operator. Besides, we also assume that a fraction of its neighbors will also trigger a certain amount of complaints. For example, some users may attract many followers by frequently sharing interesting information, so if they are blocked for a long time, and the followers fail to receive the information, it is likely that some of them may lodge a complaint. In the following part, we will use survival theory to analyze the process mentioned above.