Maximum Defective Clique Computation: Improved Time Complexities and Practical Performance (2403.07561v1)
Abstract: The concept of a $k$-defective clique, a relaxation of a clique that allows up to $k$ missing edges, has received increasing interest recently. Although the problem of finding the maximum $k$-defective clique is NP-hard, several practical algorithms have recently been proposed in the literature, with kDC being the state of the art. kDC not only runs the fastest in practice, but also achieves the best time complexity. Specifically, it runs in $O^*(\gamma_k^n)$ time when ignoring polynomial factors; here, $\gamma_k$ is a constant that is smaller than two and depends only on $k$, and $n$ is the number of vertices in the input graph $G$. In this paper, we propose the kDC-Two algorithm to improve both the time complexity and the practical performance. kDC-Two runs in $O^*((\alpha\Delta)^{k+2} \gamma_{k-1}^{\alpha})$ time when the maximum $k$-defective clique size $\omega_k(G)$ is at least $k+2$, and in $O^*(\gamma_{k-1}^n)$ time otherwise, where $\alpha$ and $\Delta$ are the degeneracy and maximum degree of $G$, respectively. In addition, with a slight modification, kDC-Two also runs in $O^*((\alpha\Delta)^{k+2} (k+1)^{\alpha+k+1-\omega_k(G)})$ time by using the degeneracy gap $\alpha+k+1-\omega_k(G)$ parameterization; this is better than $O^*((\alpha\Delta)^{k+2} \gamma_{k-1}^{\alpha})$ when $\omega_k(G)$ is close to the degeneracy-based upper bound $\alpha+k+1$. Finally, to further improve the practical performance, we propose a new degree-sequence-based reduction rule that can be applied efficiently, and theoretically demonstrate its effectiveness compared with those proposed in the literature. Extensive empirical studies on three benchmark graph collections show that our algorithm outperforms the existing fastest algorithm by several orders of magnitude.
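To make the quantities in the abstract concrete, the following minimal Python sketch checks the $k$-defective clique condition on a small graph and computes the degeneracy $\alpha$, from which the upper bound $\alpha+k+1$ and the degeneracy gap follow. It is only an illustration of the definitions, not the kDC-Two algorithm; the function names (`missing_edges`, `is_k_defective_clique`, `degeneracy`) and the example graph `adj` are assumptions introduced here and do not come from the paper.

```python
from itertools import combinations


def missing_edges(adj, vertices):
    """Count the vertex pairs in `vertices` that are not adjacent in `adj`."""
    return sum(1 for u, v in combinations(vertices, 2) if v not in adj[u])


def is_k_defective_clique(adj, vertices, k):
    """A vertex set is a k-defective clique if it misses at most k edges."""
    return missing_edges(adj, vertices) <= k


def degeneracy(adj):
    """Peel a minimum-degree vertex repeatedly; the largest degree seen
    at removal time is the degeneracy (simple quadratic-time version)."""
    deg = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    alpha = 0
    while remaining:
        v = min(remaining, key=deg.__getitem__)
        alpha = max(alpha, deg[v])
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return alpha


# Hypothetical example graph: K4 on {0, 1, 2, 3} without edge (2, 3),
# plus a pendant vertex 4 attached to vertex 3.
adj = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1},
    3: {0, 1, 4},
    4: {3},
}

k = 1
alpha = degeneracy(adj)
upper_bound = alpha + k + 1          # degeneracy-based bound on omega_k(G)
candidate = [0, 1, 2, 3]             # misses only the edge (2, 3)
gap = upper_bound - len(candidate)   # degeneracy gap if candidate is maximum
```

In this example the candidate set is a 1-defective clique whose size meets the bound $\alpha+k+1$, so the degeneracy gap is zero. This is the regime in which the abstract states that the gap-parameterized running time is the better one.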