Problem statement and complexity analysis > FDS: finding dense subgraphs find a connected temporal subgraph with the greatest cohesive density 235 N2).3.5.12351 7.77.72 density(H=44 density(G)=36 P Bogdanov, M. Mongiov, A K Singh. Mining heavy subgraphs in time-evolving networks. In /CDM, 2011. 6
Problem statement and complexity analysis P. Bogdanov, M. Mongiov, A. K. Singh. Mining heavy subgraphs in time-evolving networks. In ICDM, 2011. 6 ➢ FDS: finding dense subgraphs • find a connected temporal subgraph with the greatest cohesive density cdensity(H)=44 cdensity(G)=36
Problem statement and complexity analysis > FDS: finding dense subgraphs find a connected temporal subgraph with the greatest cohesive density Problem hardness[bogdanov et al. 111 the FDs problem is NP-complete, even for a temporal network with a single snapshot and with +1 or-1 edge weights only Our approximation hardness result the cohesive density of the optimal dense temporal subgraph is NP-hard to approximate within any constant factor. proof sketch: building an approximation factor preserving reduction from the net worth maximization problem, which is NP- hard to approximate within any constant factor P Bogdanov, M. Mongiov, A K Singh. Mining heavy subgraphs in time-evolving networks. In /CDM, 2011. 7
Problem statement and complexity analysis ➢ Problem hardness[Bogdanov et al. 11] P. Bogdanov, M. Mongiov, A. K. Singh. Mining heavy subgraphs in time-evolving networks. In ICDM, 2011. 7 ➢ Our approximation hardness result • the FDS problem is NP-complete, even for a temporal network with a single snapshot and with +1 or -1 edge weights only. • the cohesive density of the optimal dense temporal subgraph is NP-hard to approximate within any constant factor. ✓ proof sketch: building an approximation factor preserving reduction from the net worth maximization problem, which is NPhard to approximate within any constant factor. ➢ FDS: finding dense subgraphs • find a connected temporal subgraph with the greatest cohesive density
Challenges find a dense temporal determine a time interval [,j] subgraph find a dense subgraph given [,j Filter-and-Verification[ Bogdanov et al. 11 consider all time intervals [, j] and find dense subgraphs by fixing[i, j] each time filter [ j] if its upper bound of cohesive density is worse than the best cohesive density achieved prune 99%of a total of T*(T+1)2 time intervals T 447 1,414 14,142 T*T+1/2 10 10 10 108 unpruned 10 10 104 106 Filter-and-Verification is insuficient for large temporal graphs! A new and better algorithm design philosophy is needed P BogdanoV, M. Mongiov, A K Singh. Mining heavy subgraphs in time-evolving networks. In /CDM, 2011. 8
Challenges ➢ Filter-and-Verification[Bogdanov et al. 11] • consider all time intervals [i, j] and find dense subgraphs by fixing [i, j] each time • filter [i, j] if its upper bound of cohesive density is worse than the best cohesive density achieved • prune 99% of a total of T*(T+1)/2 time intervals P. Bogdanov, M. Mongiov, A. K. Singh. Mining heavy subgraphs in time-evolving networks. In ICDM, 2011. 8 T 141 447 1,414 ··· 14,142 T*(T+1)/2 104 105 106 ··· 108 # unpruned 102 103 104 ··· 106 Filter-and-Verification is insufficient for large temporal graphs! A new and better algorithm design philosophy is needed. determine a time interval [i, j] find a dense subgraph given [i, j] find a dense temporal subgraph
Outline The fds problem: analyses and challenges a data-driven approach Experimental stud Summary
Outline ➢ The FDS problem: analyses and challenges ➢ A data-driven approach ➢ Experimental study ➢ Summary 9
Main ideas Big graph friendly Employ hidden data statistics to explore k time intervals k is typically a small constant independent of T, e.g., 10 T 141 447 1,414 14,142 T*(T+12 10 106 108 unpruned 10 105 our approach k Our data-driven approach fIDEs step 1: identify k time intervals involved with dense subgraphs employing hidden data statistics and drawing characteristics of targeted time intervals step 2: compute dense subgraphs given time intervals v building the connections with the NWM problem, and exploiting effective and efficient optimization techniques 1000x faster while remain comparable quality of dense subgraphs 10
Main ideas ➢ Employ hidden data statistics to explore k time intervals • k is typically a small constant independent of T, e.g., 10 10 T 141 447 1,414 ··· 14,142 T*(T+1)/2 104 105 106 ··· 108 # unpruned 102 103 104 ··· 106 our approach k k k … k ➢ Our data-driven approach FIDES • step 1: identify k time intervals involved with dense subgraphs ✓ employing hidden data statistics and drawing characteristics of targeted time intervals • step 2: compute dense subgraphs given time intervals ✓ building the connections with the NWM problem, and exploiting effective and efficient optimization techniques Big graph friendly 1000x faster while remain comparable quality of dense subgraphs