Impact of Social Relation and Group Size in Multicast Ad Hoc Networks Yi Qin,Riheng Jia,Jinbei Zhang,Weijie Wu,Xinbing Wang Abstract-This paper investigates the multicast capacity of The second characteristic of social networks is the social static wireless social networks.We adopt the two-layer network relation which has been studied for more than thirty years model,which includes the social layer and the networking layer. [9].[16]-[19].In 1979.the authors in [16]focused on about In the social layer,the social group size of each source node is modeled as power-law distribution.Moreover,the rank-based 500 persons in a network.Further research about 130 million model is utilized to describe the relation between source and persons of Myspace was demonstrated in [17].In their works, destinations in the networking layer.Based on the two-layer the social network was modeled as a graph to describe the network model,the probability density function (PDF)of the geographical and social relations among users.Moreover,the destination positions is analyzed and verified by numerical sim- social relation in these papers reflected how users select friends ulation,which is different from the traditional ad hoc networks According to the PDF,the bound of the network capacity is in the network,and it was usually determined by the factors derived,and we propose a Euclidean minimum spanning tree such as friendship,common interest or alliance,which was based transmission scheme,which is proved to achieve the order different from the traditional ad hoc networks.From the above of capacity bound for most cases.Finally,the capacity of social studies of the experiments about how people selected friends, networks is compared with the traditional multicast ad hoc some feasible social relation models were proposed such as networks,which indicates that the capacity scaling performs better in social networks than traditional ones.To our best distance-based model and rank-based model [19].The scaling knowledge,this is the first work of analyzing the impact on laws of these two models were analyzed in [9]and [18], the capacity of social relation and group size in multicast ad hoc respectively.In [9],the probability that node j was a friend of networks for the rank-based model. node i was proportional to d(i,j)-,where d(i,j)represented the distance between i and j.Instead,in [18],the authors I.INTRODUCTION illustrated that the probability was proportional to Rank(j), where Ranki(j)was the rank of j with respect to i. Network scaling law was firstly studied by Gupta and The two social characteristics above were both observed Kumar [1],followed by many works about the network through statistics from online social networks.However,they capacity and throughput [2]-[4].However,their models fell can also be introduced into wireless networks because that short in well characterizing real social networks,and they they are independent from the wireless nature of the networks. had to consider simplified models instead.For example,in Moreover,it is worthwhile to study wireless social networks most of these works,the destinations associated with each due to the two facts:1)The study of wireless social networks is user were drawn according to a uniform distribution,which is promising for the reason that the social wireless networks,such unrealistic.To give another example,it was also an unrealistic as MSN wireless client,facebook wireless client.have gained a assumption that the number of friends was known apriori large popularity nowadays.Furthermore,in cellular networks, in traditional multicast ad hoc networks [2]-[4].Therefore, as the number of wireless terminals increases,one base- many researchers turned to study social characteristics such station needs to service more and more terminals.However, as the way people selected friends (destinations)[5][6]and the number of simultaneously serviced terminals is restricted the number of these friends [7][8].It was only recently that due to the limited up-link resource.This will reduce the studying social phenomena had become a hot topic in ad per-terminal throughput when there are too many terminals. hoc networks [9],peer to peer networks [10]and some other Therefore,wireless multi-hop transmission can be adopted to networks [11]-[14]. improve the per-terminal throughput of social networks.2)The To study the social phenomenon.two characteristics of multi-hop user relaying technology has become an important social networks are considered in our paper,which are the topic and been well studied.It can be demonstrated from [201. social group size and the social relation.Firstly,we introduce which investigated coordination among users through wireless the group size which describes the number of friends for each channel (such as multi-hop),that the user relaying technology node.In [7][8][15],the authors analyzed the probability that helped to improve the capacity of wireless system. an arbitrary node had g friends based on the data of Cyworld, Additionally,the nodes in the network need to transmit their MySpace and orkutwith,each with more than 10 million users. packets to all of their friends by multicasting in this paper, The results showed that the probability satisfied the power-law which is also a common requirement.For example,a user of distribution.which generally matched the fact. facebook may want to send a message to all of his friends. The capacity of traditional multicast ad hoc networks was The authors are with the Department of Electronic Engineering Shanghai Jiao Tong University,China intensively studied in [2]-[4].However,the traditional results Email:[qinyi_33,jiariheng,abelchina,weijiewu,xwang8)@sjtu.edu.cn cannot be directly applied to social networks,and therefore
1 Impact of Social Relation and Group Size in Multicast Ad Hoc Networks Yi Qin, Riheng Jia, Jinbei Zhang, Weijie Wu, Xinbing Wang Abstract—This paper investigates the multicast capacity of static wireless social networks. We adopt the two-layer network model, which includes the social layer and the networking layer. In the social layer, the social group size of each source node is modeled as power-law distribution. Moreover, the rank-based model is utilized to describe the relation between source and destinations in the networking layer. Based on the two-layer network model, the probability density function (PDF) of the destination positions is analyzed and verified by numerical simulation, which is different from the traditional ad hoc networks. According to the PDF, the bound of the network capacity is derived, and we propose a Euclidean minimum spanning tree based transmission scheme, which is proved to achieve the order of capacity bound for most cases. Finally, the capacity of social networks is compared with the traditional multicast ad hoc networks, which indicates that the capacity scaling performs better in social networks than traditional ones. To our best knowledge, this is the first work of analyzing the impact on the capacity of social relation and group size in multicast ad hoc networks for the rank-based model. I. INTRODUCTION Network scaling law was firstly studied by Gupta and Kumar [1], followed by many works about the network capacity and throughput [2]- [4]. However, their models fell short in well characterizing real social networks, and they had to consider simplified models instead. For example, in most of these works, the destinations associated with each user were drawn according to a uniform distribution, which is unrealistic. To give another example, it was also an unrealistic assumption that the number of friends was known apriori in traditional multicast ad hoc networks [2]- [4]. Therefore, many researchers turned to study social characteristics such as the way people selected friends (destinations) [5] [6] and the number of these friends [7] [8]. It was only recently that studying social phenomena had become a hot topic in ad hoc networks [9], peer to peer networks [10] and some other networks [11]- [14]. To study the social phenomenon, two characteristics of social networks are considered in our paper, which are the social group size and the social relation. Firstly, we introduce the group size which describes the number of friends for each node. In [7] [8] [15], the authors analyzed the probability that an arbitrary node had q friends based on the data of Cyworld, MySpace and orkutwith, each with more than 10 million users. The results showed that the probability satisfied the power-law distribution, which generally matched the fact. The authors are with the Department of Electronic Engineering Shanghai Jiao Tong University, China Email: {qinyi 33, jiariheng, abelchina, weijiewu, xwang8}@sjtu.edu.cn The second characteristic of social networks is the social relation which has been studied for more than thirty years [9], [16]- [19]. In 1979, the authors in [16] focused on about 500 persons in a network. Further research about 130 million persons of Myspace was demonstrated in [17]. In their works, the social network was modeled as a graph to describe the geographical and social relations among users. Moreover, the social relation in these papers reflected how users select friends in the network, and it was usually determined by the factors such as friendship, common interest or alliance, which was different from the traditional ad hoc networks. From the above studies of the experiments about how people selected friends, some feasible social relation models were proposed such as distance-based model and rank-based model [19]. The scaling laws of these two models were analyzed in [9] and [18], respectively. In [9], the probability that node j was a friend of node i was proportional to d(i, j) −α, where d(i, j) represented the distance between i and j. Instead, in [18], the authors illustrated that the probability was proportional to Rank−α i (j), where Ranki(j) was the rank of j with respect to i. The two social characteristics above were both observed through statistics from online social networks. However, they can also be introduced into wireless networks because that they are independent from the wireless nature of the networks. Moreover, it is worthwhile to study wireless social networks due to the two facts: 1) The study of wireless social networks is promising for the reason that the social wireless networks, such as MSN wireless client, facebook wireless client, have gained a large popularity nowadays. Furthermore, in cellular networks, as the number of wireless terminals increases, one basestation needs to service more and more terminals. However, the number of simultaneously serviced terminals is restricted due to the limited up-link resource. This will reduce the per-terminal throughput when there are too many terminals. Therefore, wireless multi-hop transmission can be adopted to improve the per-terminal throughput of social networks. 2) The multi-hop user relaying technology has become an important topic and been well studied. It can be demonstrated from [20], which investigated coordination among users through wireless channel (such as multi-hop), that the user relaying technology helped to improve the capacity of wireless system. Additionally, the nodes in the network need to transmit their packets to all of their friends by multicasting in this paper, which is also a common requirement. For example, a user of facebook may want to send a message to all of his friends. The capacity of traditional multicast ad hoc networks was intensively studied in [2]- [4]. However, the traditional results cannot be directly applied to social networks, and therefore
3 we present this paper to improve and generalize the theory II.NETWORK MODEL AND DEFINITIONS of ad hoc network capacity.In particular,the main differ- In this section,the two-layer model of static social networks ence between the social networks and the traditional ad hoc is proposed based on group size and social relation,which networks is the Probability Distribution Function (PDF)of are the main differences between social network model and destination positions.Therefore,a challenging question arises traditional ad hoc network model.Moreover,we also list some when considering the two social characteristics: network performance metrics and notations for our following How do the two characteristics of static social networks analysis. jointly impact the capacity of network,and what is the capacity achieving scheme? A.The Social Network Model To answer this question,we study the two-layer static social network model,which includes social layer and networking In this paper,we adopt the two-layer network model,which layer.In the social layer,the social group size of each source includes node was modeled as power-law distribution model.Moreover, 1)Social layer:This layer captures the social relation the rank-based model was utilized to describe the relation among individuals,which is not related with the network between source and destinations in the networking layer.For topology. a node with g friends,the PDF of its friends'positions 2)Networking layer:This layer reflects the network topol- are derived in our paper.The simulations of the PDF are ogy based on the node positions. also illustrated.which verify the theoretical results.We then In the social layer,each node i selects some friends as its analyze the bound of network capacity and propose a capacity destinations,which form a social group.Moreover,the size of achieving scheme based on the Euclidean Minimum Spanning the social group,denoted as gi.is proved to satisfy the power- Tree. law degree distribution in [9].In particular,the experiments The main contributions of this paper are summarized as of three online social networking services,each with more follows: than 10 million users,indicate that P[qi=g}follows the The relation between rank and geographical position of an distribution in (1) arbitrary node is demonstrated for rank-based static social 1 network model.If there are n nodes in the network which Plqi=q)=Gng5 (1) is of size 1 x 1,the result shows that =e (V n with probability 1 when n goes to infinity,where z is where B≥0 is a constant and Gn=∑a 1 g=1 g8 the distance between source and destination.Moreover. In the networking layer,we consider the network which is we also derive the PDF of the destination positions for a a unit square,and n static nodes (users)are uniformly and randomly distributed in it.The well-known protocol model as given source,which is verified by the simulations. We analyze the capacity bound of the social network in [1]is employed as the interference model in our network. model based on the PDF of destination positions and When node i wants to transmit a packet to node j.the Euclidean Minimum Spanning Tree.Moreover,a corre- transmission is considered to be successful if the following sponding transmission scheme is proposed to achieve the inequality is satisfied capacity bound in most cases. lIXi-Xl≤r(n), (2) Finally,we compare the capacity of social networks with the traditional ad hoc networks,and the results where Xi represents node i's location,and r(n)is the max- indicate that the capacity scaling performs better in social imum transmission range of each node,which is a function networks than traditional ad hoc networks. of n.here is the Euclidean distance.Moreover,any other The rest of this paper is organized as follows.In Section node k who wants to transmit packet at the same time must II,we introduce the network model and definitions.The PDF satisfy the inequality as of destination positions and the capacity bound of the social IXk-X‖≥(1+△)r(n) (3) network model are derived in Section III,and the theoretical results of the PDF is verified by numerical simulations in Sec- where A>0 is a constant factor depending on the acceptable tion IV.In Section V,a capacity achieving scheme is proposed Signal to Interference Noise Ratio (SINR)of the network to demonstrate that the capacity is achievable in most cases.In Furthermore,the bandwidth of the network is finite and Section VI,we compare the capacity of social networks with constant.In this model,the transmission range r(n)is assumed traditional ad hoc networks.Finally,we conclude in Section asr(n)[1].which guarantees the connectivity of VIl. the network.In addition,each node transmits packet to its destinations through multi-hop. IWe use standard asymptotic notations in our paper.Consider t wo nonnegative function f()and g():(1)f(n)=o(g(n)) It is important to study the node mapping from the social means limnoo f(n)/g(n)=0.(2)f(n)= O(g(n))mean- layer to the networking layer.We adopt the rank-based model s limnoo f(n)/g(n)<oo.(3)f(n) w(g(m))mean in [18]and [19].In this model,for a given node i,when it s limn-→ef(n)/g(n)。=o.(4)f(n)=(g(n)means selects one friend from all the nodes,we denote that node j is limnoo f(n)/g(n)>0.(5)f(n)=e(g(n))means f(n)=O(g(n)) and g(n)=O(f(n)).(6)f(n)=e(g(n))means that there exists two con- selected with probability PBased on social experiments stants a and b satisfy f(n)=O(g(n)loga n)and f(n)=(g(n)log n) [19],the friends of a user are more likely to be close to him
2 we present this paper to improve and generalize the theory of ad hoc network capacity. In particular, the main difference between the social networks and the traditional ad hoc networks is the Probability Distribution Function (PDF) of destination positions. Therefore, a challenging question arises when considering the two social characteristics: • How do the two characteristics of static social networks jointly impact the capacity of network, and what is the capacity achieving scheme? To answer this question, we study the two-layer static social network model, which includes social layer and networking layer. In the social layer, the social group size of each source node was modeled as power-law distribution model. Moreover, the rank-based model was utilized to describe the relation between source and destinations in the networking layer. For a node with q friends, the PDF of its friends’ positions are derived in our paper. The simulations of the PDF are also illustrated, which verify the theoretical results. We then analyze the bound of network capacity and propose a capacity achieving scheme based on the Euclidean Minimum Spanning Tree. The main contributions of this paper are summarized as follows: • The relation between rank and geographical position of an arbitrary node is demonstrated for rank-based static social network model. If there are n nodes in the network which is of size 1 × 1, the result shows that x = Θ ✏➮Rank n ✑ 1 with probability 1 when n goes to infinity, where x is the distance between source and destination. Moreover, we also derive the PDF of the destination positions for a given source, which is verified by the simulations. • We analyze the capacity bound of the social network model based on the PDF of destination positions and Euclidean Minimum Spanning Tree. Moreover, a corresponding transmission scheme is proposed to achieve the capacity bound in most cases. • Finally, we compare the capacity of social networks with the traditional ad hoc networks, and the results indicate that the capacity scaling performs better in social networks than traditional ad hoc networks. The rest of this paper is organized as follows. In Section II, we introduce the network model and definitions. The PDF of destination positions and the capacity bound of the social network model are derived in Section III, and the theoretical results of the PDF is verified by numerical simulations in Section IV. In Section V, a capacity achieving scheme is proposed to demonstrate that the capacity is achievable in most cases. In Section VI, we compare the capacity of social networks with traditional ad hoc networks. Finally, we conclude in Section VII. 1We use standard asymptotic notations in our paper. Consider two nonnegative function f(·) and g(·): (1) f(n) = o(g(n)) means limn→∞ f(n)/g(n) = 0. (2) f(n) = O(g(n)) means limn→∞ f(n)/g(n) < ∞. (3) f(n) = ω(g(n)) means limn→∞ f(n)/g(n) = ∞. (4) f(n) = Ω(g(n)) means limn→∞ f(n)/g(n) > 0. (5) f(n) = Θ(g(n)) means f(n) = O(g(n)) and g(n) = O(f(n)). (6) f(n) = Θ( ˜ g(n)) means that there exists two constants a and b satisfy f(n) = O(g(n) loga n) and f(n) = Ω(g(n) logb n) II. NETWORK MODEL AND DEFINITIONS In this section, the two-layer model of static social networks is proposed based on group size and social relation, which are the main differences between social network model and traditional ad hoc network model. Moreover, we also list some network performance metrics and notations for our following analysis. A. The Social Network Model In this paper, we adopt the two-layer network model, which includes 1) Social layer: This layer captures the social relation among individuals, which is not related with the network topology. 2) Networking layer: This layer reflects the network topology based on the node positions. In the social layer, each node i selects some friends as its destinations, which form a social group. Moreover, the size of the social group, denoted as qi , is proved to satisfy the powerlaw degree distribution in [9]. In particular, the experiments of three online social networking services, each with more than 10 million users, indicate that P{qi = q} follows the distribution in (1) P{qi = q} = 1 Gnq β , (1) where β ≥ 0 is a constant and Gn = Pn q=1 1 q β . In the networking layer, we consider the network which is a unit square, and n static nodes (users) are uniformly and randomly distributed in it. The well-known protocol model as in [1] is employed as the interference model in our network. When node i wants to transmit a packet to node j, the transmission is considered to be successful if the following inequality is satisfied kXi − Xjk ≤ r(n), (2) where Xi represents node i’s location, and r(n) is the maximum transmission range of each node, which is a function of n. k · k here is the Euclidean distance. Moreover, any other node k who wants to transmit packet at the same time must satisfy the inequality as kXk − Xjk ≥ (1 + ∆)r(n), (3) where ∆ > 0 is a constant factor depending on the acceptable Signal to Interference Noise Ratio (SINR) of the network. Furthermore, the bandwidth of the network is finite and constant. In this model, the transmission range r(n) is assumed as r(n) = √ log n √ n [1], which guarantees the connectivity of the network. In addition, each node transmits packet to its destinations through multi-hop. It is important to study the node mapping from the social layer to the networking layer. We adopt the rank-based model in [18] and [19]. In this model, for a given node i, when it selects one friend from all the nodes, we denote that node j is selected with probability Pi→j . Based on social experiments [19], the friends of a user are more likely to be close to him
Denoting P as the set consisting all the nodes and d(i,j)= TABLE I:Important Definitions of Symbols and Notations Xi-Xill as the distance between i and j,the rank of j with Symbol Definition respect to i is defined as N Number of users Ranki(j)=k EP:d(i,k)<d(i,j), (4) r(n) The maximum transmission range qi Number of destinations (friends)of user i where|represented the number of elements in the set a Parameter of rank-based model According to [18]and [19].P satisfies B Parameter of power-law distribution of 1 group size P-→Rank?⑦ (5) Ranki(j) The rank of j with respect to i where a>0 is a constant.When a =0,the network is the fn(x,θ) The probability density function of destina- traditional ad hoc network.(5)shows the fact that a node is tion positions more likely to select a friend that nearby.Furthermore,this EMST The total length of the EMST model focuses on relative location instead of geographic lo- C(n) The capacity of the network cation.We defineHTherefore,by normalizing. Na(n) The minimum total number of hops for a (5)can be represented as multicast tree with g+1 nodes The capacity ratio of social networks and 1 P=H.Rank (j) (6 Ra traditional ad hoc networks when each node has q friends In [19],the authors verify the (6)by online data.It should be noticed that we consider the case that one node selects multiple destinations instead of one.Therefore,the distribution of the destinations will be analyzed based on both social group size Thus,in order to minimize the length of transmission path,the and (6). Euclidean Minimum Spanning Tree (EMST)is investigated. The total length for this EMST is proved in [21]to satisfy the B.Network Performance Metrics and Notations following lemma. We give the definitions of throughput and capacity in this Lemma 1 Let f(x)denote the PDF of the related nodes in subsection.Moreover,some notations are also demonstrated. Definition of throughput:For a given scheme,we define the the network,where x is the position vector.Then,for large number of nodes n and the network dimension d 1,if f(x) throughput as the maximum achievable transmission rate.In t time slots,any i's destination (denoted as k)receives Mi(k,t) is independent from n,the total length for the EMST satisfies packets and the number of i's friends is qi.The long term per-node throughput of this multicast session is defined by 照EMST可=d网f学k, (9) (n)as(n)=liminf(t).Then the average with probability 1,where c(d)is constant. throughput of this scheme is defined by T(n) The proof of this lemma is demonstrated in [21],and d=2 in our paper.However,when f(x)is related with n,Lemma T(n)=liminf二>λ(n) (7)1 does not hold.As a result,we denote the PDF as fr(x)in this paper and give the following lemma which indicates the Definition of capacity:In this social network,C(n)is said bound of the total length of the EMST for infinite n. to be the asymptotic per-node social multicast capacity with order e(To(n)),where To(n)satisfies (8). Lemma 2 If fn(x)in Lemma I is related with infinite n and the following two conditions are satisfied, T(n) (Tw:io问 =0∞}=0, (8) 。Condition I:There exists a function gn(x)= T(n) {Tm):imo简 >0}≠0, ∑e:ewTile:satisfying for any feasible T(n). lgn(x)-fn(x)ldx→0, (10) Furthermore,we list some important parameters in Table I that will be used in later analysis,proofs and discussions. when n goes to infinity,where is the range of total network,Ei(i=l,2,·)is the partition of业separating III.CAPACITY ANALYSIS FOR SOCIAL NETWORKS 亚into many nor-overlapping parts(i.e,ξi∩ξj=0,i≠ In this section,firstly,the relation between rank and ge- jand Ei=亚,lξ:is an indicative function and ographical position of an arbitrary node is analyzed.Based egn(x)dx=Je fn(x)dx.Moreover,each part Ei is a on such relation,we derive the PDF of the destinations,and d-dimensional hypercube,and the number of its adjacent finally obtain the bound of capacity. parts is limited by a constant The links between the nodes belonging to one multicast Condition 2:lim ngn(x)dx=(1)with probabil- n+0 session are considered,which can organize a spanning tree. ity 1
3 Denoting P as the set consisting all the nodes and d(i, j) = kXi − Xjk as the distance between i and j, the rank of j with respect to i is defined as Ranki(j) = |{k ∈ P : d(i, k) < d(i, j)}|, (4) where | · | represented the number of elements in the set. According to [18] and [19], Pi→j satisfies Pi→j ∝ 1 Rankα i (j) , (5) where α ≥ 0 is a constant. When α = 0, the network is the traditional ad hoc network. (5) shows the fact that a node is more likely to select a friend that nearby. Furthermore, this model focuses on relative location instead of geographic location. We define Hn = Pn i=1 1 iα . Therefore, by normalizing, (5) can be represented as Pi→j = 1 HnRankα i (j) . (6) In [19], the authors verify the (6) by online data. It should be noticed that we consider the case that one node selects multiple destinations instead of one. Therefore, the distribution of the destinations will be analyzed based on both social group size and (6). B. Network Performance Metrics and Notations We give the definitions of throughput and capacity in this subsection. Moreover, some notations are also demonstrated. Definition of throughput: For a given scheme, we define the throughput as the maximum achievable transmission rate. In t time slots, any i’s destination (denoted as k) receives Mi(k, t) packets and the number of i’s friends is qi . The long term per-node throughput of this multicast session is defined by λi(n) as λi(n) = lim inf t→∞ 1 tqi Pqi k=1 Mi(k, t). Then the average throughput of this scheme is defined by T(n) T(n) = lim inf n→∞ 1 n ❳n i=1 λi(n). (7) Definition of capacity: In this social network, C(n) is said to be the asymptotic per-node social multicast capacity with order Θ(T0(n)), where T0(n) satisfies (8). {T(n) : lim n→∞ T(n) T0(n) = ∞} = ∅, {T(n) : lim n→∞ T(n) T0(n) > 0} 6= ∅, (8) for any feasible T(n). Furthermore, we list some important parameters in Table I that will be used in later analysis, proofs and discussions. III. CAPACITY ANALYSIS FOR SOCIAL NETWORKS In this section, firstly, the relation between rank and geographical position of an arbitrary node is analyzed. Based on such relation, we derive the PDF of the destinations, and finally obtain the bound of capacity. The links between the nodes belonging to one multicast session are considered, which can organize a spanning tree. TABLE I: Important Definitions of Symbols and Notations Symbol Definition n Number of users r(n) The maximum transmission range qi Number of destinations (friends) of user i α Parameter of rank-based model β Parameter of power-law distribution of group size Ranki(j) The rank of j with respect to i fn(x, θ) The probability density function of destination positions kEMSTk The total length of the EMST C(n) The capacity of the network Nq(n) The minimum total number of hops for a multicast tree with q + 1 nodes Rq The capacity ratio of social networks and traditional ad hoc networks when each node has q friends Thus, in order to minimize the length of transmission path, the Euclidean Minimum Spanning Tree (EMST) is investigated. The total length for this EMST is proved in [21] to satisfy the following lemma. Lemma 1 Let f(x) denote the PDF of the related nodes in the network, where x is the position vector. Then, for large number of nodes n and the network dimension d > 1, if f(x) is independent from n, the total length for the EMST satisfies limn→∞ n − d−1 d kEMSTk = c(d) ❩ Rd f(x) d−1 d dx, (9) with probability 1, where c(d) is constant. The proof of this lemma is demonstrated in [21], and d = 2 in our paper. However, when f(x) is related with n, Lemma 1 does not hold. As a result, we denote the PDF as fn(x) in this paper and give the following lemma which indicates the bound of the total length of the EMST for infinite n. Lemma 2 If fn(x) in Lemma 1 is related with infinite n and the following two conditions are satisfied, • Condition 1: P There exists a function gn(x) = ξi∈Ψ γi1ξi satisfying ❩ Ψ |gn(x) − fn(x)|dx → 0, (10) when n goes to infinity, where Ψ is the range of total network, ξi(i = 1, 2, · · ·) is the partition of Ψ separating Ψ into many non-overlapping parts (i.e., ξi∩ξj = ∅, ∀i 6= j and ❙ ξi = Ψ), 1ξi is an indicative function and ❘ ξi gn(x)dx = ❘ ξi fn(x)dx. Moreover, each part ξi is a d-dimensional hypercube, and the number of its adjacent parts is limited by a constant ζ. • Condition 2: limn→∞ n ❘ ξi gn(x)dx = Ω(1) with probability 1
One than has that with probability I lim n牛EsT=cd0, (11)Lemma 4 Node i is the source node.If the rank of node j toi n→心∫nan(x)字dk is Ranki(j)=X,the distance between node i and j satisfies where c(d)is constant. x=e(√)with probability,1.when n→o. Proof:The proof can be found in the Appendix. ◆ Proof:The probability that the distance between node i Moreover,according to the sum of p-series,the Gn in (1) and Hn in (6)can be represented as and j does not follow =e is 日(n1-a)0≤a<1, 1- 日(logn) a=1, (12) (=(V周)<(sV+(V圆 i=l Θ(1) a>1, (19 and where 0<a (2e) 日(m1-8)0≤B<1, andb>2e.TheP(z≤Vg) 1 satisfies 日(logn) 3=1, (13) 9=1 Θ(1) B>1. To calculate the PDF of user i's gi friends,the relation between rank and geographical position is important.We will aX =∑P(There areknodes within元n away from node i) show this relation in Lemma 4 which is supported by Lemma k=X 3.It should be noticed that the impact of boundary effect on n! scaling laws can be ignored in the proofs of this paper,which = can also be found in other works of scaling laws [2]. (20) Lemma 3 We denote two constants a and b where 0<a <1 amdb>1.For any X∈{1,2,…,n,there must be Denotingg())(1)-based on Lemma 3,there must be g(x)<ci(ael-a)x.Furthermore, ”()产-兴) <a(ae1-a)',(14 the ratio of g(k)and g(+1)can be represented as 9() (兴)(1-兴)"- k+1 where e is the base of the natural logarithm and c is constant, when n is large enough.Moreover,if bx n,there must be gk+五= +0-k-可(g+11-器)n-k1> ax (21) (贷)产(-)<e- (15) Thus,for any k>X,g()satisfies where c2 is constant and n is large enough. g内<2k-)<…<a k! =9(X). (22) Proof:Firstly,the left-hand side of(14)satisfies Therefore.P in (19 can be upper-bounded as (e)产-)” n! g()”元 axxx (n-ax)n-x (e≤V)立a k! ()x()vm-下nx k=X nn-X ()V收+ 2 (ax-x g(X) () (23) In-axn-x n-X 三(笑)广(白)V (16) where c3>1 is constant,and (a)holds due to the fact that sgX(ae-<2csg】 n! (17) (a)holds due to(17).ForP)we denote "( Moreover,for any c>0,there must be 1 (1+c)<e. (s)(1-祭)-and P(≥√)<2ceg*(内) Hence,g(x)satisfies can be proved in the similar way as above. +二) (1-a)X Hence,the probability that the distance between node i and g(X)<2csa 2ca(ael-a)x. j does not satisfy can be represented as (18) Therefore,if we define c1=2c3,equation (14)is proved. 1-(e=(VA) <2csci(ael-")x+2csca(bel-b)x. Furthermore,equation(15)can be proved in the similar way. (24)
4 One than has that with probability 1 limn→∞ n −d−1 d kEMSTk ❘ Rd fn(x) d−1 d dx = c(d), (11) where c(d) is constant. Proof: The proof can be found in the Appendix. Moreover, according to the sum of p-series, the Gn in (1) and Hn in (6) can be represented as Hn = ❳n i=1 1 i α = ✽ ❁ ✿ Θ n 1−α ✁ 0 ≤ α < 1, Θ (log n) α = 1, Θ(1) α > 1, (12) and Gn = ❳n q=1 1 q β = ✽ ❁ ✿ Θ n 1−β ✁ 0 ≤ β < 1, Θ (log n) β = 1, Θ(1) β > 1. (13) To calculate the PDF of user i’s qi friends, the relation between rank and geographical position is important. We will show this relation in Lemma 4 which is supported by Lemma 3. It should be noticed that the impact of boundary effect on scaling laws can be ignored in the proofs of this paper, which can also be found in other works of scaling laws [2]. Lemma 3 We denote two constants a and b where 0 < a < 1 and b > 1. For any X ∈ {1, 2, · · · , n}, there must be n! X!(n − X)! ✏ aX n ✑X ✏ 1 − aX n ✑n−X < c1(ae 1−a ) X, (14) where e is the base of the natural logarithm and c1 is constant, when n is large enough. Moreover, if bX ≤ n, there must be n! X!(n − X)! ✏ bX n ✑X ✏ 1 − bX n ✑n−X < c2(be1−b ) X, (15) where c2 is constant and n is large enough. Proof: Firstly, the left-hand side of (14) satisfies g(X) = n! X!(n − X)! ✏ aX n ✑X ✏ 1 − aX n ✑n−X (a) < c3 n e ✁n √ n X e ✁X √ X n−X e ✁n−X √ n − X a XX X nX (n − aX) n−X nn−X =c3a X ✏ n − aX n − X ✑n−X ➱ 1 X + 1 n − X <2c3a X ✏ n − aX n − X ✑n−X , (16) where c3 > 1 is constant, and (a) holds due to the fact that lim n→∞ ⑨n e ❾n √ 2πn n! = 1. (17) Moreover, for any c > 0, there must be 1 < (1 + c) 1 c < e. Hence, g(X) satisfies g(X) < 2c3a X ❶⑩ 1 + (1 − a)X n − X ❿ n−X (1−a)X ➀(1−a)X < 2c3(ae 1−a ) X. (18) Therefore, if we define c1 = 2c3, equation (14) is proved. Furthermore, equation (15) can be proved in the similar way. Lemma 4 Node i is the source node. If the rank of node j to i is Ranki(j) = X, the distance between node i and j satisfies x = Θ ✏➮X n ✑ with probability 1, when n → ∞. Proof: The probability that the distance between node i and j does not follow x = Θ ✏➮X n ✑ is 1 − P ✒ x = Θ ✒➱ X n ✓✓ < P ✒ x ≤ ➱ aX πn ✓ + P ✒ x ≥ ➱ bX πn ✓ , (19) where 0 < a < (2e) −1 and b > 2e. The P ✏ x ≤ ➮aX n ✑ satisfies P ✒ x ≤ ➱ aX πn ✓ = ❳n k=X P(There are k nodes within ➱ aX πn away from node i) = ❳n k=X n! k!(n − k)!( aX n ) k (1 − aX n ) n−k . (20) Denoting g(k) = n! k!(n−k)!( aX n ) k (1 − aX n ) n−k , based on Lemma 3, there must be g(X) < c1(ae1−a ) X. Furthermore, the ratio of g(k) and g(k + 1) can be represented as g(k) g(k + 1) = n! k!(n−k)!( aX n ) k (1 − aX n ) n−k n! (k+1)!(n−k−1)!( aX n ) k+1(1 − aX n ) n−k−1 > k + 1 aX . (21) Thus, for any k > X, g(k) satisfies g(k) < aX k g(k − 1) < · · · < (aX) k−XX! k! g(X). (22) Therefore, P ✏ x ≤ ➮aX n ✑ in (19) can be upper-bounded as P ✒ x ≤ ➱ aX πn ✓ < ❳n k=X (aX) k−XX! k! g(X) (a) < c3 ❳n k=X (aX) k−X X e ✁X √ X k e ✁k √ k g(X) = c3 ❳n k=X ✏ aXe k ✑k ✏ 1 ae✑X ➱ X k g(X) < c3g(X) ❳n k=X (ae) k−X < 2c3g(X), (23) (a) holds due to (17). For P ✏ x ≥ ➮bX πn ✑ , we denote g ∗ (k) = n! k!(n−k)!( bX n ) k (1 − bX n ) n−k and P ✏ x ≥ ➮bX πn ✑ < 2c3g ∗ (k) can be proved in the similar way as above. Hence, the probability that the distance between node i and j does not satisfy x = Θ ✏➮X n ✑ can be represented as 1 − P ✒ x = Θ ✒➱ X n ✓✓ < 2c3c1(ae 1−a ) X + 2c3c2(be1−b ) X. (24)
It should be noticed that ci(ael-a)x and c2(bel-b)x are Denoting C(X,qi,n)as the monotonic functions of a and b with lower-bound 0. respectively.For any given constant c>0,there must exist a 0,4 and b satisfying cac1(ael-a)X+c3c2(be1-6)x<c.Therefore, C(X,4,m)=44- (3 based on the definition of o(.),the following equation can be and thus PfRanki(j)=X}can be transformed as obtained. 1 PRanki(j)=x)=+C(X.qin)X (33) 1-(e=a(V图)》 o1) (25) Then,we prove that the order of C(X,qi,n)is independent which means that thewith probability 1 when from X when C(X,qi,n)X=w(qi)(i.e.,for any constant n goes to infinity. ca there must be C(X,qi,n)X>cqi).We arbitrarily select two integers Xa and X.Based on (32)and C(X,gi,n)X> To derive the fn(x)in Lemma 2,we first show the distri- c4qi we can obtain bution of rank of i's friends when it has gi friends in the next Lemma.Afterwards,we derive the PDF in Theorem 1 based 0X1X2,91十 0X1X2a-1>c4 OX1X3.9-1X1X29-2 X9 Xe X9X8 on the relation between rank and geographical position. ,xg+三之c4 0x,4=+9-2 X X X9X9 Lemma 5 If i is a source node with qi destinations,and j is (34) one of them.The probability that Ranki(j)=X satisfies If c.can be lower-bounded as y P{Rank:()=X}=日 qi+nl-axa (26) 0x1X2,44> C-1)ax-1)ox (35) 2X9 2X9 when0≤a<l,and According to the Newton's inequality in [22]that for any P{Rank:()=X}=日 i+XCi(qi,n) (27) constant cs>can be lower-bounded as when a =1 and Ci(qi,n)is given in (45),and 0x4-1>50E40a2 (36 0X1X2,4-1 1 P{Rank()=X}=日 gi+qi-axa (28) Therefore,the relation between and can be derived as when a>1. 0X1X2,94-1> C50X1X2,4:0X1X2,9-2 Proof:We sort the nodes based on the rank respects X1X2,94-1 to i as v,·,vn-l,where Rank(vx)=X.The set of = C50x1X9X1X2,9-2 X9 X1X2,94-1 (37 destinations of i is denoted as Qi,where Qi=gi.Therefore, X the probability that Ranki(j)=X is cs(c4-1)0x1X,g-2 P{Rank:()=X}=P{i=UXUX∈Q}P{vx∈Q}, X (29) Thus,we calculate the bound of C(X1,qi,n)as follows which is shown in [9].In particular,PjxQ)= and Pfvx EQi}can be represented as C(X19:,n)=9 3,+e X x-1+巴 1 H Xo 94:0X1X24 (38) P{vx∈Q}= 1≤1<<ig-1≤n,h≠Xh=1 (ad-可+10xx9- 1 1 1≤ii<…<ig:≤nh=1 C(X1,9,n)< (s-+)g:0x1da4 (30) 0X1X2,4t-1 We define ox.q-1- Therefore,it is obvious that 1i1<…<g4-1≤n,h≠Xh C(X1,9,n)=9 40xX1X29L) (39) 0X1X2,4-1 l≤i1<…<ig,≤nh=1 Similarly, C(X2,qi,n)=曰 90X1X2:44 =日(C(X1,q,n), (40) 1≤i1<… 0X1X2,44-1 which means that the order of C(X,gi,n)is independent from X when C(X,qi,n)=w(gi).Hence,we denote (31) C(X,qi,n)as cxC(qi,n)where cx may be related to X,qi
5 It should be noticed that c1(ae1−a ) X and c2(be1−b ) X are the monotonic functions of a and b with lower-bound 0, respectively. For any given constant c > 0, there must exist a and b satisfying c3c1(ae1−a ) X +c3c2(be1−b ) X < c. Therefore, based on the definition of o(·), the following equation can be obtained. 1 − P ❶ x = Θ ❶r X n ➀➀ = o(1), (25) which means that the x = Θ ✏➮X n ✑ with probability 1 when n goes to infinity. To derive the fn(x) in Lemma 2, we first show the distribution of rank of i’s friends when it has qi friends in the next Lemma. Afterwards, we derive the PDF in Theorem 1 based on the relation between rank and geographical position. Lemma 5 If i is a source node with qi destinations, and j is one of them. The probability that Ranki(j) = X satisfies P{Ranki(j) = X} = Θ ⑩ 1 qi + n1−αXα ❿ , (26) when 0 ≤ α < 1, and P{Ranki(j) = X} = Θ ⑩ 1 qi + XC1(qi , n) ❿ , (27) when α = 1 and C1(qi , n) is given in (45), and P{Ranki(j) = X} = Θ ✒ 1 qi + q 1−α i Xα ✓ , (28) when α > 1. Proof: We sort the nodes based on the rank respects to i as v1, · · · , vn−1, where Ranki(vX) = X. The set of destinations of i is denoted as Qi , where |Qi | = qi . Therefore, the probability that Ranki(j) = X is P{Ranki(j) = X} = P{j = vX|vX ∈ Qi}P{vX ∈ Qi}, (29) which is shown in [9]. In particular, P{j = vX|vX ∈ Qi} = 1 qi and P{vX ∈ Qi} can be represented as P{vX ∈ Qi} = 1 HnXα P 1≤i1<···<iqi−1≤n,ih6=X qi◗−1 h=1 1 Hni α h P 1≤i1<···<iqi≤n ◗qi h=1 1 Hni α h . (30) We define σX,qi−1 = P 1≤i1<···<iqi−1≤n,ih6=X qi◗−1 h=1 1 i α h and hence ❳ 1≤i1<···<iqi≤n ❨qi h=1 1 i α h = ❳ 1≤i1<···<iqi≤n,∃h:ih=X ❨qi h=1 1 i α h + ❳ 1≤i1<···<iqi≤n,ih6=X ❨qi h=1 1 i α h = 1 Xα σX,qi−1 + σX,qi . (31) Denoting Cσ(X, qi , n) as Cσ(X, qi , n) = qi σX,qi σX,qi−1 , (32) and thus P{Ranki(j) = X} can be transformed as P{Ranki(j) = X} = 1 qi + Cσ(X, qi, n)Xα . (33) Then, we prove that the order of C(X, qi , n) is independent from X when C(X, qi , n)Xα = ω(qi) (i.e., for any constant c4 there must be C(X, qi , n)Xα > c4qi). We arbitrarily select two integers Xa and Xb. Based on (32) and C(X, qi , n)Xα > c4qi we can obtain σX1X2,qi + σX1X2,qi−1 Xα 2 > c4 ⑩σX1X2,qi−1 Xα 1 + σX1X2,qi−2 Xα 2 Xα 1 ❿ , σX1X2,qi + σX1X2,qi−1 Xα 1 > c4 ⑩σX1X2,qi−1 Xα 2 + σX1X2,qi−2 Xα 2 Xα 1 ❿ . (34) If c4 > 1, σX1X2,qi can be lower-bounded as σX1X2,qi > (c4 − 1)σX1X2,qi−1 2Xα 2 + (c4 − 1)σX1X2,qi−1 2Xα 1 . (35) According to the Newton’s inequality in [22] that for any constant c5 > 0, σX1X2,qi−1 can be lower-bounded as σX1X2,qi−1 > c5σX1X2,qi σX1X2,qi−2 σX1X2,qi−1 . (36) Therefore, the relation between σX1X2,qi−1 and σX1X2,qi−2 Xα 2 can be derived as σX1X2,qi−1 > c5σX1X2,qi σX1X2,qi−2 σX1X2,qi−1 = c5 Xα 2 σX1X2,qi σX1X2,qi−2 σX1X2,qi−1 Xα 2 > c5(c4 − 1)σX1X2,qi−2 Xα 2 . (37) Thus, we calculate the bound of C(X1, qi , n) as follows C(X1, qi, n) = qi σX1X2,qi + σX2X2,qi−1 Xα 2 σX1X2,qi−1 + σX1X2,qi−2 Xα 2 > qiσX1X2,qi ⑨ 1 c5(c4−1) + 1❾ σX1X2,qi−1 , C(X1, qi, n) < ⑨ 1 c5(c4−1) + 1❾ qiσX1X2,qi σX1X2,qi−1 . (38) Therefore, it is obvious that C(X1, qi, n) = Θ( qiσX1X2,qi σX1X2,qi−1 ). (39) Similarly, C(X2, qi, n) = Θ ✒ qiσX1X2,qi σX1X2,qi−1 ✓ = Θ(C(X1, qi, n)), (40) which means that the order of C(X, qi , n) is independent from X when C(X, qi , n)Xα = ω(qi). Hence, we denote C(X, qi , n) as cXC(qi , n) where cX may be related to X, qi