IEEE TRANSACTIONS ON INFORMATION THEORY

On the Similarity between von Neumann Graph Entropy and Structural Information: Interpretation, Computation, and Applications

Xuecheng Liu, Luoyi Fu, Xinbing Wang, and Chenghu Zhou

Abstract—The von Neumann graph entropy is a measure of graph complexity based on the Laplacian spectrum. It has recently found applications in various learning tasks driven by networked data. However, it is computationally demanding and hard to interpret using simple structural patterns. Due to the close relation between the Laplacian spectrum and the degree sequence, we conjecture that the structural information, defined as the Shannon entropy of the normalized degree sequence, might be a good approximation of the von Neumann graph entropy that is both scalable and interpretable. In this work, we thereby study the difference between the structural information and the von Neumann graph entropy, which we name the entropy gap. Based on the knowledge that the degree sequence is majorized by the Laplacian spectrum, we for the first time prove that the entropy gap is between 0 and log2 e in any undirected unweighted graph. Consequently, we certify that the structural information is a good approximation of the von Neumann graph entropy that achieves provable accuracy, scalability, and interpretability simultaneously. This approximation is further applied to two entropy-related tasks: network design and graph similarity measure, where a novel graph similarity measure and the corresponding fast algorithms are proposed. Meanwhile, we show empirically and theoretically that maximizing the von Neumann graph entropy can effectively hide the community structure, and we then propose an alternative metric called spectral polarization to guide community obfuscation. Our experimental results on graphs of various scales and types show that the entropy gap remains very small across a wide range of simple/weighted graphs.
As an approximation of the von Neumann graph entropy, the structural information is the only one that achieves both high efficiency and high accuracy among the prominent methods. It is at least two orders of magnitude faster than SLaQ [1] with comparable accuracy. Our structural information based methods also exhibit superior performance in downstream tasks such as entropy-driven network design, graph comparison, and community obfuscation.

Index Terms—Spectral graph theory, Laplacian spectrum, spectral polarization, community obfuscation.

Manuscript received February 18, 2021; revised November 9, 2021; accepted January 2, 2022. Date of publication TBD; date of current version TBD. This work was supported by National Key R&D Program of China (No. 2018YFB2100302), NSF China (No. 42050105, 62020106005, 62061146002, 61960206002, 61822206, 61832013, 61829201), 2021 Tencent AI Lab RhinoBird Focused Research Program (No. JR202132), and the Program of Shanghai Academic/Technology Research Leader under Grant No. 18XD1401800. (Corresponding author: Xinbing Wang.)
Xuecheng Liu and Xinbing Wang are with the Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, 200240 China (email: liuxuecheng@sjtu.edu.cn, xwang8@sjtu.edu.cn).
Luoyi Fu is with the Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240 China (email: yiluofu@sjtu.edu.cn).
Chenghu Zhou is with the Institute of Geographical Science and Natural Resources Research, Chinese Academy of Sciences, Beijing, 100101 China (email: zhouch@lreis.ac.cn).
Communicated by R. Talmon, Associate Editor for Signal Processing.
Copyright (c) 2017 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.

I. INTRODUCTION

Evidence has rapidly grown in the past few years that graphs are ubiquitous in our daily life; online social networks, metabolic networks, transportation networks, and collaboration networks are just a few examples that can be represented precisely by graphs. One important issue in graph analysis is to measure the complexity of these graphs [2], [3], which refers to the level of organization of structural features such as the scaling behavior of the degree distribution, community structure, core-periphery structure, etc. In order to capture the inherent structural complexity of graphs, many entropy based graph measures [3]–[8] have been proposed, each of which is a specific form of the Shannon entropy for a different type of distribution extracted from the graph.

As one of the aforementioned entropy based graph complexity measures, the von Neumann graph entropy, defined as the Shannon entropy of the spectrum of the trace rescaled Laplacian matrix of a graph (see Definition 1), is of special interest to scholars and practitioners [1], [9]–[15]. This spectrum based entropy measure distinguishes between different graph structures. For instance, it is maximal for complete graphs, minimal for graphs with only a single edge, and takes on intermediate values for ring graphs. Historically, the entropy measure originates from quantum information theory, where it describes the mixedness of a quantum system represented as a density matrix. Braunstein et al. were the first to use the von Neumann entropy to measure the complexity of graphs, by viewing the scaled Laplacian matrix as a density matrix [8].

Built upon the Laplacian spectrum, the von Neumann graph entropy is a natural choice to capture graph complexity, since the Laplacian spectrum is well known to contain rich information about the multi-scale structure of graphs [16], [17].
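As a small numerical illustration of these claims (a sketch of ours assuming NumPy, not code from the paper), the von Neumann graph entropy can be computed directly as the Shannon entropy of the trace-rescaled Laplacian spectrum; on eight vertices it is largest for the complete graph, zero for a single edge, and intermediate for the ring:

```python
import numpy as np

def von_neumann_entropy(A):
    """Shannon entropy (bits) of the trace-rescaled Laplacian spectrum."""
    L = np.diag(A.sum(axis=1)) - A            # combinatorial Laplacian L = D - A
    mu = np.linalg.eigvalsh(L) / np.trace(L)  # rescale eigenvalues to sum to 1
    mu = mu[mu > 1e-12]                       # convention: 0 * log 0 = 0
    return float(-(mu * np.log2(mu)).sum())

n = 8
complete = np.ones((n, n)) - np.eye(n)
ring = np.zeros((n, n))
for i in range(n):
    ring[i, (i + 1) % n] = ring[(i + 1) % n, i] = 1
single_edge = np.zeros((n, n))
single_edge[0, 1] = single_edge[1, 0] = 1

# Complete graph attains the largest entropy, a single edge the smallest,
# and the ring lies in between, matching the qualitative claim above.
print(von_neumann_entropy(complete), von_neumann_entropy(ring),
      von_neumann_entropy(single_edge))
```

For the complete graph K_n the rescaled spectrum is uniform over n-1 values, so the entropy is exactly log2(n-1) bits, which is the maximal case mentioned above.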
As a result, it has recently found applications in downstream tasks of complex network analysis and pattern recognition. For example, the von Neumann graph entropy facilitates the measure of graph similarity via Jensen-Shannon divergence, which could be used to compress multilayer networks [13] and detect anomalies in graph streams [9]. As another example, the von Neumann graph entropy could be used to design edge
centrality measure [15], vertex centrality measure [18], and entropy-driven networks [19].

A. Motivations

However, despite the popularity it has received in applications, the main obstacle encountered in practice is the computational inefficiency of the exact von Neumann graph entropy. Indeed, as a spectrum based entropy measure, the von Neumann graph entropy suffers from computational inefficiency, since the computational complexity of obtaining the graph spectrum is cubic in the number of nodes. Meanwhile, the existing approximation approaches [1], [9], [10], such as the quadratic approximation, fail to capture the presence of non-trivial structural patterns that seem to be encapsulated in the spectrum based entropy measure. Therefore, there is a strong desire to find a good approximation that achieves accuracy, scalability, and interpretability simultaneously.

Instead of starting from scratch, we are inspired by the well-known fact that there is a close relationship between the combinatorial characteristics of a graph and the algebraic properties of its associated matrices [20]. To illustrate this, we plot the Laplacian spectrum and the degree sequence together in the same figure for four representative real-world graphs and four synthetic graphs. As shown in Fig. 1, the sorted spectrum sequence and the sorted degree sequence almost coincide with each other. A similar phenomenon can also be observed in larger scale-free graphs, which indicates that it is possible to reduce the approximation of the von Neumann graph entropy to the time-efficient computation of simple node degree statistics. Therefore, we ask the first research question,

RQ1: Does there exist some non-polynomial function $\phi$ such that $\sum_{i=1}^{n} \phi\left(d_i / \sum_{j=1}^{n} d_j\right)$ is close to the von Neumann graph entropy? Here $d_i$ is the degree of node $i$ in a graph of order $n$.
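The near-coincidence observed in Fig. 1 is easy to reproduce. The sketch below (our illustration assuming NumPy, not the authors' code) sorts the Laplacian spectrum and the degree sequence of a random graph in non-increasing order and measures how closely the two sequences track each other:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 0.05
A = (rng.random((n, n)) < p).astype(float)
A = np.triu(A, 1)
A = A + A.T                                  # symmetric simple graph

deg = A.sum(axis=1)
lam = np.linalg.eigvalsh(np.diag(deg) - A)   # Laplacian spectrum

d_sorted = np.sort(deg)[::-1]
l_sorted = np.sort(lam)[::-1]

# The two sorted sequences nearly coincide except near the extremes;
# their totals agree exactly since tr(L) = sum of degrees.
gap = np.abs(d_sorted - l_sorted)
print(gap.mean(), gap.max())
```

The mean pointwise gap is small relative to the mean degree, which is the visual phenomenon Fig. 1 reports for real-world and synthetic graphs alike.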
We emphasize the non-polynomial property of the function φ since most previous works based on polynomial approximations fail to fulfill interpretability. The challenges from both scalability and interpretability translate directly into two requirements on the function φ to be determined. First, the explicit expression of φ must exist and be kept simple to ensure the interpretability of the sum over degree statistics. Second, the function φ should be graph-agnostic to meet the scalability requirement, that is, φ should be independent of the graph to be analyzed. One natural choice for the non-polynomial function φ, suggested by the entropic nature of the graph complexity measure, is φ(x) = −x log2 x. The sum $-\sum_{i=1}^{n} \left(d_i/\sum_{j=1}^{n} d_j\right) \log_2 \left(d_i/\sum_{j=1}^{n} d_j\right)$ has been named the one-dimensional structural information by Li et al. [3] for connected graphs, since it has an entropic form and captures the information of a classic random walker in a graph. We extend this notion to arbitrary undirected graphs. Following RQ1, we raise the second research question,

RQ2: Is the structural information an accurate proxy of the von Neumann graph entropy?

To address the second question, we conduct, to our knowledge, the first study of the difference between the structural information and the von Neumann graph entropy, which we name the entropy gap.

B. Contributions

To study the entropy gap, we build on a fundamental relationship between the Laplacian spectrum λ and the degree sequence d in undirected graphs: d is majorized by λ. In other words, there is a doubly stochastic matrix P such that Pλ = d. Leveraging the majorization and the classic Jensen's inequality, we prove that the entropy gap is strictly larger than 0 in arbitrary undirected graphs.
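The majorization d ≺ λ can be checked directly from its definition: every prefix sum of the non-increasingly sorted degree sequence is bounded by the corresponding prefix sum of the sorted spectrum, with equal totals. A small sketch (NumPy assumed; the helper name is ours):

```python
import numpy as np

def is_majorized(d, lam, tol=1e-6):
    """True iff d is majorized by lam (both treated as real multisets)."""
    d = np.sort(d)[::-1]
    lam = np.sort(lam)[::-1]
    prefix_ok = np.all(np.cumsum(d) <= np.cumsum(lam) + tol)
    same_total = abs(d.sum() - lam.sum()) < tol
    return bool(prefix_ok and same_total)

rng = np.random.default_rng(1)
A = (rng.random((50, 50)) < 0.1).astype(float)
A = np.triu(A, 1)
A = A + A.T
deg = A.sum(axis=1)
lam = np.linalg.eigvalsh(np.diag(deg) - A)

# d is majorized by the Laplacian spectrum; the reverse fails
# because the largest Laplacian eigenvalue exceeds the largest degree.
print(is_majorized(deg, lam), is_majorized(lam, deg))
```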
By exploiting the Jensen's gap [21], which is an inverse version of the classic Jensen's inequality, we further prove that the entropy gap is no more than $\frac{\log_2 e \cdot \operatorname{tr}(A^2)}{\delta \cdot \operatorname{vol}(G)}$ for any undirected graph G, where A is the weight matrix, δ is the minimum degree, and vol(G) is the volume of the graph. The upper bound on the entropy gap turns out to be log2 e for any unweighted graph. Both the constant lower and upper bounds on the entropy gap can be further sharpened using more advanced knowledge about the Laplacian spectrum and the degree sequence, such as the Grone-Merris majorization [22]. In a nutshell, our paper makes the following contributions:

• Theory and interpretability: Inspired by the close relation between the Laplacian spectrum and the degree sequence, we for the first time bridge the gap between the von Neumann graph entropy and the structural information by proving that the entropy gap is between 0 and log2 e in any unweighted graph. To the best of our knowledge, these constant bounds on the approximation error in unweighted graphs are sharper than those of any existing approach with provable accuracy, such as FINGER [9]. Therefore, the answers to both RQ1 and RQ2 are YES! As shown in Table I, the relative approximation error is around 1% for small graphs, which is practically good. Besides, the structural information provides a simple geometric interpretation of the von Neumann graph entropy as a measure of degree heterogeneity. Thus, the structural information is a good approximation of the von Neumann graph entropy that achieves provable accuracy, scalability, and interpretability simultaneously.

• Applications and efficient algorithms: Using the structural information as a proxy of the von Neumann graph entropy with bounded error (the entropy gap), we develop fast algorithms for two entropy based applications: network design and graph similarity measure.
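These bounds are easy to probe numerically. The sketch below (NumPy assumed; helper names are ours) computes both entropies on an unweighted graph with minimum degree at least 2 and checks the stated bound, which for an unweighted graph reduces to log2 e / δ since tr(A²) = vol(G):

```python
import numpy as np

def entropies(A):
    """Return (structural information H1, von Neumann entropy Hvn) in bits."""
    deg = A.sum(axis=1)
    p = deg / deg.sum()                       # normalized degree sequence
    h1 = float(-(p[p > 0] * np.log2(p[p > 0])).sum())
    L = np.diag(deg) - A
    mu = np.linalg.eigvalsh(L) / np.trace(L)  # trace-rescaled spectrum
    mu = mu[mu > 1e-12]
    hvn = float(-(mu * np.log2(mu)).sum())
    return h1, hvn

rng = np.random.default_rng(2)
n = 100
A = np.zeros((n, n))
for i in range(n):                            # ring backbone: min degree >= 2
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
extra = np.triu((rng.random((n, n)) < 0.05).astype(float), 1)
A = np.clip(A + extra + extra.T, 0, 1)        # sprinkle extra random edges
deg = A.sum(axis=1)

h1, hvn = entropies(A)
gap = h1 - hvn                                # entropy gap  Delta H
bound = np.log2(np.e) * np.trace(A @ A) / (deg.min() * deg.sum())
print(gap, bound)                             # 0 < gap <= bound <= log2(e)
```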
Since the network design problem aims to maximize the von Neumann graph entropy, we combine the greedy method and a pruning strategy to accelerate the searching process. For the graph similarity measure, we propose a new distance measure based on the structural information and the Jensen-Shannon divergence. We further show that the proposed distance measure is a pseudometric and devise a fast incremental algorithm to compute the similarity between adjacent graphs in a graph stream.
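The paper's actual distance measure is defined in Section V; purely to illustrate the general recipe, the sketch below (our simplified stand-in, assuming NumPy, not the authors' definition) takes the square root of the Jensen-Shannon divergence between the normalized degree distributions of two graphs on the same node set, which is known to be a metric:

```python
import numpy as np

def degree_distribution(A):
    d = A.sum(axis=1)
    return d / d.sum()

def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence (base 2); lies in [0, 1]."""
    def h(x):
        x = x[x > 0]
        return float(-(x * np.log2(x)).sum())
    m = 0.5 * (p + q)
    jsd = h(m) - 0.5 * h(p) - 0.5 * h(q)
    return float(np.sqrt(max(jsd, 0.0)))      # clamp tiny negative round-off

n = 6
star = np.zeros((n, n))
star[0, 1:] = star[1:, 0] = 1                 # heterogeneous degrees
ring = np.zeros((n, n))
for i in range(n):
    ring[i, (i + 1) % n] = ring[(i + 1) % n, i] = 1   # uniform degrees

d = js_distance(degree_distribution(star), degree_distribution(ring))
print(d)                                      # strictly between 0 and 1
```

Identical distributions give distance 0, consistent with a pseudometric; the incremental algorithm mentioned above would update the degree distribution per edge change rather than recompute it.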
[Fig. 1: eight panels plotting Laplacian spectrum and degree sequence against index: (a) Zachary's karate club, (b) Dolphins, (c) Email, (d) Celegans, (e) ER graph of order 500, (f) BA graph of order 500, (g) Complete graph of order 500, (h) Ring graph of order 500.]

Fig. 1: The close relation between Laplacian spectra and degree sequence in four representative real-world graphs (a-d) and four common synthetic graphs (e-h). Both the Laplacian spectra and degree sequence are sorted in non-increasing order. The x-axis represents the index of the sorted sequences, and the y-axis represents the value of Laplacian spectrum and degree.

TABLE I: Structural information and von Neumann graph entropy of the graphs in Fig. 1.

Measurements                    | Zachary | Dolphins | Email  | Celegans | ER     | BA     | Complete | Ring
structural information H1       | 4.7044  | 5.7005   | 9.5665 | 7.9257   | 8.9497 | 8.5739 | 8.9659   | 8.9658
von Neumann graph entropy Hvn   | 4.5504  | 5.5489   | 9.5029 | 7.8631   | 8.9302 | 8.4935 | 8.9629   | 8.5231
entropy gap ∆H                  | 0.1540  | 0.1516   | 0.0636 | 0.0626   | 0.0195 | 0.0804 | 0.0030   | 0.4427
relative error ∆H/Hvn           | 3.38%   | 2.73%    | 0.67%  | 0.80%    | 0.22%  | 0.95%  | 0.03%    | 5.19%

• Connection with community structure: While the two-dimensional structural information [3] encodes the community structure, we find empirically that both the von Neumann graph entropy and the one-dimensional structural information are uninformative of the community structure. However, they are effective in adversarial attacks on community detection, since maximizing the von Neumann graph entropy makes the Laplacian spectrum uninformative of the community structure. Using a similar idea, we propose an alternative metric called spectral polarization which is both effective and efficient in hiding the community structure.
• Extensive experiments and evaluations: We use 3 random graph models, 9 real-world static graphs, and 2 real-world temporal graphs to evaluate the properties of the entropy gap and the proposed algorithms. The results show that the entropy gap is small in a wide range of simple/weighted graphs, and that it is insensitive to changes in graph size. Compared with the prominent methods for approximating the von Neumann graph entropy, the structural information is superior in both accuracy and computational speed. It is at least 2 orders of magnitude faster than the accurate SLaQ [1] algorithm with comparable accuracy. Our proposed algorithms based on structural information also exhibit superb performance in downstream tasks such as entropy-driven network design, graph comparison, and community obfuscation.

An earlier version of this work appeared in our WWW 2021 conference paper [23]. In addition to revising the conference version, this TIT submission includes the following new materials:

• More experimental results to illustrate the relationship between the Laplacian spectrum and the degree sequence (Fig. 1);
• More straightforward results to illustrate the tightness of the approximation (Table I);
• A fine-grained analysis on the lower bound of the entropy gap to show that the entropy gap is actually strictly larger than 0 (Theorem 1);
• A theoretical analysis on the entropy gap of several classes of graphs, including the complete graph, complete bipartite graph, path graph, and ring graph (Table III and Appendix B);
• An analysis and discussion of the connection between graph entropy and community structure (Section VI and Appendix A-D).

Roadmap: The remainder of this paper is organized as follows. We review three related issues in Section II. In Section III we introduce the definitions of the von Neumann graph entropy, structural information, and the notion of entropy gap.
Section IV shows the close relationship between von Neumann graph entropy and structural information by bounding the entropy gap. Section V presents efficient algorithms for two graph entropy based applications. In Section VI we discuss the connection between von Neumann graph entropy and community structure. Section VII provides experimental results. Section VIII offers some conclusions and directions for future research. II. RELATED WORK We review three issues related to the von Neumann graph entropy: computation, interpretation, and connection with the community structure. The first two issues arise from the broad applications [13]–[15], [19], [24]–[27] of the von Neumann
graph entropy, whereas the last issue comes from spectral clustering [28] and two-dimensional structural information based clustering [3].

TABLE II: Comparison of methods for approximating the von Neumann graph entropy in terms of fulfilled (✓) and missing (✗) properties.

                     [9]  [1]  [10]  Structural Information (Ours)
Provable accuracy     ✓    ✗    ✗    ✓
Scalability           ✓    ✓    ✗    ✓
Interpretability      ✗    ✗    ✗    ✓

A. Approximate Computation of the von Neumann Graph Entropy

In an effort to overcome the computational inefficiency of the von Neumann graph entropy, past works have resorted to various numerical approximations. Chen et al. [9] first compute a quadratic approximation of the entropy via Taylor expansion, then derive two finer approximations with accuracy guarantees by spectrum-based and degree-based rescaling, respectively. Before Chen's work, the Taylor expansion was widely adopted to give computationally efficient approximations [29], but with no theoretical guarantee on the approximation accuracy. Following Chen's work, Choi et al. [10] propose several more complicated quadratic approximations based on advanced polynomial approximation methods, the superiority of which is verified through experiments. Besides, there is a trend to approximate spectral sums using stochastic trace estimation based approximations [30], the merit of which is the provable error-bounded estimation of the spectral sums. For example, Kontopoulou et al. [11] propose three randomized algorithms based on Taylor series, Chebyshev polynomials, and random projection matrices to approximate the von Neumann entropy of density matrices. As another example, based on the stochastic Lanczos quadrature technique [31], Tsitsulin et al. [1] propose an efficient and effective approximation technique called SLaQ to estimate the von Neumann entropy and other spectral descriptors for web-scale graphs.
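The stochastic trace estimators underlying [30], [11], and SLaQ [1] build on Hutchinson's identity E[zᵀMz] = tr(M) for random sign vectors z. A minimal sketch of the idea (assuming numpy; the 4-node graph and the choice M = L² are purely illustrative — the actual methods apply this to matrix functions such as f(L) via polynomial or Lanczos approximations):

```python
import numpy as np
from itertools import product

# A 4-node graph Laplacian; M = L @ L stands in for a spectral-sum matrix.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
M = L @ L

def hutchinson(M, num_samples, rng):
    """Monte-Carlo trace estimate: mean of z^T M z over random +/-1 vectors z."""
    Z = rng.choice([-1.0, 1.0], size=(num_samples, M.shape[0]))
    return np.einsum('si,ij,sj->s', Z, M, Z).mean()

est = hutchinson(M, 2000, np.random.default_rng(0))

# Unbiasedness holds exactly: averaging z^T M z over ALL 2^4 sign vectors
# recovers tr(M), since the cross terms E[z_i z_j] vanish for i != j.
exact = np.mean([np.array(z) @ M @ np.array(z)
                 for z in product([-1.0, 1.0], repeat=4)])
print(np.isclose(exact, np.trace(M)))  # True
```

The estimate is unbiased but its accuracy scales with the number of random vectors, which is exactly the drawback discussed below.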
However, the approximation error bound of SLaQ for the von Neumann graph entropy is not provided. The disadvantages of such stochastic approximations are also obvious: their computational efficiency depends on the number of random vectors used in stochastic trace estimation, and they are not suitable for applications like anomaly detection in graph streams and entropy-driven network design.

The comparison of methods for approximating the von Neumann graph entropy is presented in Table II. One common drawback of the aforementioned methods is the lack of interpretability; that is, none of these methods provide enough evidence to interpret this spectrum based entropy measure in terms of structural patterns. By contrast, as a good proxy of the von Neumann graph entropy, the structural information offers us the intuition that the spectrum based entropy measure is closely related to the degree heterogeneity of graphs.

B. Spectral Descriptor of Graphs and Its Structural Counterpart

Researchers in spectral graph theory have long been interested in establishing connections between the combinatorial characteristics of a graph and the algebraic properties of its associated matrices. For example, the algebraic connectivity (also known as the Fiedler eigenvalue), defined as the second smallest eigenvalue of a graph Laplacian matrix, has been used to measure the robustness [16] and synchronizability [32] of graphs. The magnitude of the algebraic connectivity has also been found to reflect how well connected the overall graph is [17]. As another example, the Fiedler vector, defined as the eigenvector corresponding to the Fiedler eigenvalue of a graph Laplacian matrix, has been found to be a good indicator of the bi-partition structure of a graph [33]. However, there are some other spectral descriptors that have found applications in graph analytics but still call for more structural interpretation, such as the heat kernel trace [34], [35] and the von Neumann graph entropy.
Simmons et al. [36] suggest interpreting the von Neumann graph entropy as the centralization of graphs, which is very similar to our interpretation using the structural information. They derive both upper and lower bounds on the von Neumann graph entropy in terms of graph centralization under some hard assumptions on the range of the von Neumann graph entropy. Therefore, their results cannot be directly converted to accuracy-guaranteed approximations of the von Neumann graph entropy for arbitrary simple graphs. By contrast, our work shows that the structural information is an accurate, scalable, and interpretable proxy of the von Neumann graph entropy for arbitrary simple graphs. Besides, the techniques used in our proof are quite different from [36].

C. Spectrum, Detectability, and Significance of Community Structure

Community structure is one of the most recognized characteristics of large scale real-world graphs, in which similar nodes tend to cluster together. It has thus found applications in classification, recommendation, link prediction, etc. Starting from the Fiedler vector, spectral algorithms have been widely studied for detecting the community structure in a graph [37], [38] because they are simple and theoretically warranted. Cheeger's inequality $\mu_2/2 \le h_G \le \sqrt{2\mu_2}$ bounds the conductance $h_G$ of a graph $G$ using the second smallest eigenvalue $\mu_2$ of the normalized Laplacian matrix. This is later generalized to multiway spectral partitioning [39], yielding the higher-order Cheeger inequalities $\mu_k/2 \le \rho_G(k) \le O(k^2)\sqrt{\mu_k}$ for each $k$, where $\mu_k$ is the $k$-th smallest eigenvalue of the normalized Laplacian matrix and $\rho_G(k)$ is the $k$-way expansion constant. Since both $h_G$ and $\rho_G(k)$ measure the significance of community structure, the graph spectrum is closely related to the community structure.
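Cheeger's inequality can be verified by brute force on a toy graph (a sketch assuming numpy; the two-triangle graph is an arbitrary example with an obvious two-community structure):

```python
import numpy as np
from itertools import combinations

# Two triangles joined by a single bridge edge: communities {0,1,2} and {3,4,5}.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
d = A.sum(axis=1)

# Second smallest eigenvalue of the normalized Laplacian I - D^{-1/2} A D^{-1/2}.
Dinv = np.diag(1.0 / np.sqrt(d))
mu2 = np.linalg.eigvalsh(np.eye(n) - Dinv @ A @ Dinv)[1]

def conductance(S):
    """cut(S, V \\ S) / min(vol(S), vol(V \\ S))."""
    S = set(S)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    vol_S = sum(d[v] for v in S)
    return cut / min(vol_S, d.sum() - vol_S)

# h_G: exhaustive minimum over all nontrivial vertex subsets.
h = min(conductance(S) for r in range(1, n) for S in combinations(range(n), r))

print(mu2 / 2 <= h <= np.sqrt(2 * mu2))  # True, as Cheeger's inequality predicts
print(h)  # cutting the bridge gives cut = 1 and vol = 7, so h_G = 1/7
```

The optimal cut is exactly the bridge between the two triangles, illustrating why a small $\mu_2$ signals significant community structure.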
The coupling between graph spectrum and community structure has been empirically validated in [37], where Newman found that if the second smallest eigenvalue $\lambda_2$ of the Laplacian matrix is well separated from the eigenvalues above it, the spectral clustering based on the Laplacian matrix often
does very well. However, community detection by spectral algorithms in sparse graphs often fails, because the spectrum contains no clear evidence of community structure. This is exemplified under the sparse stochastic block model with two clusters of equal size [38], [40], where the second smallest eigenvalue of the adjacency matrix gets lost in the bulk of uninformative eigenvalues. Our experiments complement the correlation between graph spectrum and community structure by showing that the spikes in a sequence of spectral gaps are good indicators of the community structure.

The significance and detectability of community structure has found application in an emerging area called community obfuscation [41]–[47], where the graph structure is minimally perturbed to protect its community structure from being detected. None of these practical algorithms exploit the correlation between graph spectrum and community structure except for the structural entropy proposed by Liu et al. [45]. Our work bridges the one-dimensional structural entropy in [45] with the spectral entropy, demonstrates both empirically and theoretically that maximizing the spectral entropy is effective in community obfuscation, and thus provides a theoretical foundation for the success of the structural entropy [45].

III. PRELIMINARIES

In this paper, we study undirected graphs $G = (V, E, A)$ with positive edge weights, where $V = [n] \triangleq \{1, \dots, n\}$ is the node set, $E$ is the edge set, and $A \in \mathbb{R}^{n \times n}_{+}$ is the symmetric weight matrix whose positive entry $A_{ij}$ denotes the weight of an edge $(i, j) \in E$. If the node pair $(i, j) \notin E$, then $A_{ij} = 0$. If the graph $G$ is unweighted, the weight matrix $A \in \{0, 1\}^{n \times n}$ is called the adjacency matrix of $G$. The degree of node $i \in V$ in the graph $G$ is defined as $d_i = \sum_{j=1}^{n} A_{ij}$. The Laplacian matrix of the graph $G$ is defined as $L \triangleq D - A$, where $D = \mathrm{diag}(d_1, \dots, d_n)$ is the degree matrix.
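These preliminaries translate directly into a few lines of numpy (a sketch; the 4-node weighted graph is an arbitrary example):

```python
import numpy as np

# A 4-node weighted undirected graph: symmetric A with positive weights.
A = np.array([[0.0, 2.0, 0.0, 1.0],
              [2.0, 0.0, 3.0, 0.0],
              [0.0, 3.0, 0.0, 4.0],
              [1.0, 0.0, 4.0, 0.0]])
d = A.sum(axis=1)   # degrees d_i = sum_j A_ij
D = np.diag(d)      # degree matrix
L = D - A           # Laplacian L = D - A

assert np.allclose(L, L.T)                   # L is symmetric
assert np.allclose(L @ np.ones(4), 0.0)      # L 1 = 0, so lambda_1 = 0
assert np.linalg.eigvalsh(L).min() > -1e-9   # L is positive semidefinite
print(np.trace(L), d.sum())  # both equal vol(G) = 20
```

The last line previews the identity vol(G) = tr(L) used throughout the analysis.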
Let $\{\lambda_i\}_{i=1}^{n}$ be the sorted eigenvalues of $L$ such that $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$; this sequence is called the Laplacian spectrum. We define $\mathrm{vol}(G) = \sum_{i=1}^{n} d_i$ as the volume of the graph $G$; then $\mathrm{vol}(G) = \mathrm{tr}(L) = \sum_{i=1}^{n} \lambda_i$, where $\mathrm{tr}(\cdot)$ is the trace operator. For the convenience of delineation, we define a special function $f(x) \triangleq x \log_2 x$ on the support $[0, \infty)$, where $f(0) \triangleq \lim_{x \downarrow 0} f(x) = 0$ by convention.

In the following, we present the formal definitions of the von Neumann graph entropy, the structural information, and the entropy gap. Slightly different from the one-dimensional structural information proposed by Li et al. [3], our definition of structural information does not require the graph $G$ to be connected.

Definition 1 (von Neumann graph entropy): The von Neumann graph entropy of an undirected graph $G = (V, E, A)$ is defined as $H_{\mathrm{vn}}(G) = -\sum_{i=1}^{n} f(\lambda_i/\mathrm{vol}(G))$, where $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ are the eigenvalues of the Laplacian matrix $L = D - A$ of the graph $G$, and $\mathrm{vol}(G) = \sum_{i=1}^{n} \lambda_i$ is the volume of $G$.

Definition 2 (Structural information): The structural information of an undirected graph $G = (V, E, A)$ is defined as $H_1(G) = -\sum_{i=1}^{n} f(d_i/\mathrm{vol}(G))$, where $d_i$ is the degree of node $i$ in $G$ and $\mathrm{vol}(G) = \sum_{i=1}^{n} d_i$ is the volume of $G$.

Definition 3 (Entropy gap): The entropy gap of an undirected graph $G = (V, E, A)$ is defined as $\Delta H(G) = H_1(G) - H_{\mathrm{vn}}(G)$.

The von Neumann graph entropy and the structural information are well-defined for all undirected graphs except those with an empty edge set, for which $\mathrm{vol}(G) = 0$. When $E = \emptyset$, we adopt the convention that $H_1(G) = H_{\mathrm{vn}}(G) = 0$.

IV. APPROXIMATION ERROR ANALYSIS

In this section we bound the entropy gap in undirected graphs of order $n$. Since nodes with degree 0 contribute to neither the structural information nor the von Neumann graph entropy, without loss of generality we assume that $d_i > 0$ for every node $i \in V$.

A.
Bounds on the Approximation Error

We first provide bounds on the additive approximation error in Theorem 1, Corollary 1, and Corollary 2, then analyze the multiplicative approximation error in Theorem 2.

Theorem 1 (Bounds on the absolute approximation error): For any undirected graph $G = (V, E, A)$, the inequality

$$0 < \Delta H(G) \le \frac{\log_2 e}{\delta} \cdot \frac{\mathrm{tr}(A^2)}{\mathrm{vol}(G)} \quad (1)$$

holds, where $\delta = \min\{d_i \mid d_i > 0\}$ is the minimum positive degree.

Before proving Theorem 1, we introduce two techniques: majorization and the Jensen gap. The former is a preorder on vectors of reals, while the latter is an inverse version of Jensen's inequality; their definitions are as follows.

Definition 4 (Majorization [48]): For a vector $x \in \mathbb{R}^d$, we denote by $x^{\downarrow} \in \mathbb{R}^d$ the vector with the same components but sorted in descending order. Given $x, y \in \mathbb{R}^d$, we say that $x$ majorizes $y$ (written as $x \succ y$) if and only if $\sum_{i=1}^{k} x_i^{\downarrow} \ge \sum_{i=1}^{k} y_i^{\downarrow}$ for every $k \in [d]$ and $x^{\top}\mathbf{1} = y^{\top}\mathbf{1}$.

Lemma 1 (Jensen's gap [21]): Let $X$ be a one-dimensional random variable with mean $\mu$ and support $\Omega$. Let $\psi(x)$ be a twice differentiable function on $\Omega$ and define the function $h(x) \triangleq \frac{\psi(x) - \psi(\mu)}{(x - \mu)^2} - \frac{\psi'(\mu)}{x - \mu}$; then $\mathbb{E}[\psi(X)] - \psi(\mathbb{E}[X]) \le \sup_{x \in \Omega}\{h(x)\} \cdot \mathrm{var}(X)$. Additionally, if $\psi'(x)$ is convex, then $h(x)$ is monotonically increasing in $x$, and if $\psi'(x)$ is concave, then $h(x)$ is monotonically decreasing in $x$.

Lemma 2: The function $f(x) = x \log_2 x$ is convex, and its first order derivative $f'(x) = \log_2 x + \log_2 e$ is concave.

Proof: The second order derivative $f''(x) = (\log_2 e)/x > 0$ for $x > 0$, thus $f(x) = x \log_2 x$ is convex.

We can see that majorization characterizes the degree of concentration between the two vectors $x$ and $y$. Specifically, $x \succ y$ means that the entries of $y$ are more concentrated around their mean $y^{\top}\mathbf{1}/\mathbf{1}^{\top}\mathbf{1}$ than the entries of $x$. An equivalent definition of majorization [48] using linear algebra says that $x \succ y$ if and only if there exists a doubly stochastic matrix $P$ such that $Px = y$.
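Both Theorem 1 and the Schur-Horn majorization underpinning it can be checked numerically (a sketch assuming numpy; the random-graph parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 30
# Random simple graph; resample until every node has positive degree.
while True:
    U = np.triu(rng.random((n, n)) < 0.2, k=1).astype(float)
    A = U + U.T
    d = A.sum(axis=1)
    if d.min() > 0:
        break

vol = d.sum()
lam = np.linalg.eigvalsh(np.diag(d) - A)   # Laplacian spectrum
f = lambda p: np.where(p > 1e-12, p * np.log2(np.clip(p, 1e-300, None)), 0.0)
H1, Hvn = -f(d / vol).sum(), -f(lam / vol).sum()
gap = H1 - Hvn

# Theorem 1: 0 < Delta H <= (log2 e / delta) * tr(A^2) / vol(G).
delta = d.min()
bound = np.log2(np.e) / delta * np.trace(A @ A) / vol
print(0 < gap <= bound)  # True

# Schur-Horn: the Laplacian spectrum majorizes the degree sequence,
# i.e. prefix sums of the sorted spectrum dominate those of the degrees.
lam_dec, d_dec = np.sort(lam)[::-1], np.sort(d)[::-1]
print(np.all(np.cumsum(lam_dec) + 1e-9 >= np.cumsum(d_dec)))  # True
```

The majorization check is exactly the structural fact the proof exploits: the degrees are the diagonal of the positive semidefinite matrix $L$, hence majorized by its eigenvalues.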
As a famous example of majorization, the Schur-Horn theorem [48] states that the diagonal elements of a positive semidefinite Hermitian matrix are majorized by its eigenvalues. Since $x^{\top} L x = \sum_{(i,j) \in E} A_{ij} (x_i - x_j)^2 \ge 0$