Protein Complex Identification Given: a protein-protein interaction network a set of proteins that regulate a gene that a biologist wishes to study What other proteins should she study? those contained in a compact dense subgraph containing the given proteins
Protein Complex Identification • Given: a protein-protein interaction network • A set of proteins that regulate a gene that a biologist wishes to study. • What other proteins should she study? those contained in a compact dense subgraph containing the given proteins. 16
Challenges Complexity of underlying community models Responsiveness requirements of query processing Dynamic network structures Massive volume of big graphs
Challenges • Complexity of underlying community models • Responsiveness requirements of query processing • Dynamic network structures • Massive volume of big graphs 17
Related Work Community Detection( Finding all communities in the entire network) non-overlapping community detection /Girvan and Newman, PNAS02J overlapping community detection /Ahn et al, Nature 10 Community Search( Finding communities containing given query nodes) Different community models are proposed for various types of networks and query processing techniques Structural Networks --- Densely-connected community search Attributed Graphs ---> Attributed community search Ego-networks---> Social circle discovery Location-based Social Networks---> Querying geo-social groups
Related Work • Community Detection (Finding all communities in the entire network) – non-overlapping community detection [Girvan and Newman, PNAS’02] – overlapping community detection [Ahn et al, Nature’10] • Community Search (Finding communities containing given query nodes) Different community models are proposed for various types of networks and query processing techniques. – Structural Networks ---> Densely-connected community search – Attributed Graphs ---> Attributed community search – Ego-networks ---> Social circle discovery – Location-based Social Networks ---> Querying geo-social groups 18
Part 1: Densely-connected Community Search In the simplest way, a graph represents a structure of interactions within a group of vertices Task: finding densely-connected communities containing query nodes Quasi-clique model [Cui et al. SIGMOD'131 Query-biased densest subgraph model [Wu et al. PVLDB'151 K-core model [Sozio& Gionis KDD'10, Cuiet al. SIGMOD'14, Liet al PVLDB'15, Narbieriet al. DMKD15/ K-truss model (Huang et al. SIGMOD' 14, Huang et al. PVLDB/
Part 1: Densely-connected Community Search • In the simplest way, a graph represents a structure of interactions within a group of vertices. • Task: finding densely-connected communities containing query nodes. – Quasi-clique model [Cui et al. SIGMOD’13] – Query-biased densest subgraph model [Wu et al. PVLDB’15] – K-core model [Sozio & Gionis KDD’10, Cui et al. SIGMOD’14, Li et al. PVLDB’15, Narbieri et al. DMKD’15] – K-truss model [Huang et al. SIGMOD’14, Huang et al. PVLDB’16] 19
[Cuiet al, SIGMOD'13 Quasi-Clique based model a-adjacency-y-quasi-k-clique community model Y-quasi-k-clique: a k-node graph with at least l yk(k-1)/2]edges a-adjacency-y-quasi-k-clique: overlap a vertices, where ask-1 k-clique: a complete graph of k nodes with k(k-1)/2 edges Y-qubsilgaeques a-adjacency-y-quasi-k-cliques (=Q84=4) (a=2,y=08,k=4)
[Cui et al., SIGMOD’13] k-cliques (k=4) γ-quasi-k-cliques (γ=0.8, k=4) α-adjacency-γ-quasi-k-cliques (α=2, γ=0.8, k=4) k-clique: a complete graph of k nodes with k(k−1)/2 edges. 20 Quasi-Clique based Model