fbpx
Wikipedia

Planted clique

In computational complexity theory, a planted clique or hidden clique in an undirected graph is a clique formed from another graph by selecting a subset of vertices and adding edges between each pair of vertices in the subset. The planted clique problem is the algorithmic problem of distinguishing random graphs from graphs that have a planted clique. This is a variation of the clique problem; it may be solved in quasi-polynomial time but is conjectured not to be solvable in polynomial time for intermediate values of the clique size. The conjecture that no polynomial time solution exists is called the planted clique conjecture; it has been used as a computational hardness assumption.

Definition edit

A clique in a graph is a subset of vertices, all of which are adjacent to each other. A planted clique is a clique created from another graph by adding edges between all pairs of a selected subset of vertices.

The planted clique problem can be formalized as a decision problem over a random distribution on graphs, parameterized by two numbers, n (the number of vertices), and k (the size of the clique). These parameters may be used to generate a graph, by the following random process:[1]

  1. Create an Erdős–Rényi random graph on n vertices by choosing independently for each pair of vertices whether to include an edge connecting that pair, with probability 1/2 for each pair.
  2. Decide whether or not to add a clique to the graph, with probability 1/2; if not, return the graph formed in step 1.
  3. Choose randomly a subset of k of the n vertices and add an edge (if one is not already present) between each pair of the selected vertices.

The problem is then to determine algorithmically whether one of the graphs resulting from this process contains a clique of at least k vertices.

Upper and lower bounds edit

There exists a function   such that asymptotically almost surely, the size of the largest clique in an n-vertex random graph is either   or  ,[2] and there exists some constant   such that the expected number of cliques of size   converges to infinity. Consequently, one should expect that the planting a clique of size   cannot be detected with high probability.

By the central limit theorem, the vertex degrees of the random graph would be distributed close to a standard normal distribution with mean   and standard deviation  . Consequently, when   is on the order of   it would create a detectable change in the shape of the distribution. Namely, if you plot out the vertex degree distribution, it would look like a deformed bell curve. Therefore, the most interesting range of values for the parameter k is between these two values,[1]

 

Algorithms edit

Large cliques edit

For sufficiently large values of the parameter k, the planted clique problem can be solved (with high probability) in polynomial time.[1]

Kučera (1995) observes that, when   then almost surely all vertices of the planted clique have higher degree than all vertices outside the clique, making the clique very easy to find. He describes a modification to the random process for generating planted clique instances, that makes the vertex degrees more uniform even for large values of k, but shows that despite this modification the planted clique may still be found quickly.[3]

Alon, Krivelevich & Sudakov (1998) prove for   a planted clique can be found with high probability by the following method:

  1. Compute the eigenvector of the adjacency matrix corresponding to its second highest eigenvalue.
  2. Select the k vertices whose coordinates in this eigenvector have the largest absolute values.
  3. Return the set of vertices that are adjacent to at least 3/4 of the selected vertices.

They show how to modify this technique so that it continues to work whenever k is at least proportional to some multiple of the square root of the number of vertices.[4] Large planted cliques can also be found using semidefinite programming.[5] A combinatorial technique based on randomly sampling vertices can achieve the same bound on k and runs in linear time.[6]

Quasipolynomial time edit

It is also possible to solve the planted clique problem, regardless of the choice of k, in quasi-polynomial time.[7] Because the largest clique in a random graph typically has size near 2 log2 n,[8] a planted clique of size k (if it exists) can be found with high probability by the following method:

  1. Loop through all sets S of   vertices.
  2. For each choice of S, test whether S is a clique. If it is, and  , return S. Otherwise, find the set T of vertices that are adjacent to all vertices in S. If  , return T.

The running time of this algorithm is quasipolynomial, because there are quasipolynomially many choices of S to loop over. This method is guaranteed to try a set S that is a subset of the planted clique; with high probability, the set T will consist only of other members of the planted clique.

As a hardness assumption edit

The planted clique conjecture is the conjecture that there is no polynomial time algorithm that takes as input graphs produced by the planted clique process and distinguishes the ones with planted cliques from the ones that don't have planted cliques with probability significantly better than random chance.[9]

Hazan & Krauthgamer (2011) used the assumption that finding planted cliques is hard as a computational hardness assumption to prove that, if so, it is also hard to approximate the best Nash equilibrium in a two-player game.[7] The planted clique conjecture has also been used as a hardness assumption to show the difficulty of property testing k-independence of random distributions,[10] finding clusters in social networks,[11] and machine learning.[12]

References edit

  1. ^ a b c Arora, Sanjeev; Barak, Boaz (2009), Computational Complexity: A Modern Approach, Cambridge University Press, pp. 362–363, ISBN 9780521424264.
  2. ^ Bollobas, B.; Erdös, P. (November 1976). "Cliques in random graphs". Mathematical Proceedings of the Cambridge Philosophical Society. 80 (3): 419–427. doi:10.1017/S0305004100053056. ISSN 1469-8064. S2CID 16619643.
  3. ^ Kučera, Luděk (1995), "Expected complexity of graph partitioning problems", Discrete Applied Mathematics, 57 (2–3): 193–212, doi:10.1016/0166-218X(94)00103-K, hdl:11858/00-001M-0000-0014-B73F-2, MR 1327775.
  4. ^ Alon, Noga; Krivelevich, Michael; Sudakov, Benny (1998), "Finding a large hidden clique in a random graph", Random Structures & Algorithms, 13 (3–4): 457–466, CiteSeerX 10.1.1.24.6419, doi:10.1002/(SICI)1098-2418(199810/12)13:3/4<457::AID-RSA14>3.3.CO;2-K, MR 1662795
  5. ^ Feige, U.; Krauthgamer, R. (2000), "Finding and certifying a large hidden clique in a semirandom graph", Random Structures and Algorithms, 16 (2): 195–208, doi:10.1002/(SICI)1098-2418(200003)16:2<195::AID-RSA5>3.0.CO;2-A.
  6. ^ Dekel, Yael; Gurel-Gurevich, Ori; Peres, Yuval (2014), "Finding hidden cliques in linear time with high probability", Combinatorics, Probability and Computing, 23 (1): 29–49, arXiv:1010.2997, doi:10.1017/S096354831300045X, MR 3197965, S2CID 14356678.
  7. ^ a b Hazan, Elad; Krauthgamer, Robert (2011), "How hard is it to approximate the best Nash equilibrium?", SIAM Journal on Computing, 40 (1): 79–91, CiteSeerX 10.1.1.511.4422, doi:10.1137/090766991, MR 2765712.
  8. ^ Grimmett, G. R.; McDiarmid, C. J. H. (1975), "On colouring random graphs", Mathematical Proceedings of the Cambridge Philosophical Society, 77 (2): 313–324, Bibcode:1975MPCPS..77..313G, doi:10.1017/S0305004100051124, MR 0369129, S2CID 3421302.
  9. ^ Braverman, Mark; Ko, Young Kun; Rubinstein, Aviad; Weinstein, Omri (2015), ETH hardness for densest-k-subgraph with perfect completeness, arXiv:1504.08352, Bibcode:2015arXiv150408352B.
  10. ^ Alon, Noga; Andoni, Alexandr; Kaufman, Tali; Matulef, Kevin; Rubinfeld, Ronitt; Xie, Ning (2007), "Testing k-wise and almost k-wise independence", STOC'07—Proceedings of the 39th Annual ACM Symposium on Theory of Computing, New York: ACM, pp. 496–505, doi:10.1145/1250790.1250863, ISBN 9781595936318, MR 2402475, S2CID 5050980.
  11. ^ Balcan, Maria-Florina; Borgs, Christian; Braverman, Mark; Chayes, Jennifer; Teng, Shang-Hua (2013), "Finding Endogenously Formed Communities", Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '13), SIAM, pp. 767–783, ISBN 978-1-611972-51-1.
  12. ^ Berthet, Quentin; Rigollet, Philippe (2013), "Complexity theoretic lower bounds for sparse principal component detection", Conference on Learning Theory, Journal of Machine Learning Research, 30: 1046–1066.

planted, clique, computational, complexity, theory, planted, clique, hidden, clique, undirected, graph, clique, formed, from, another, graph, selecting, subset, vertices, adding, edges, between, each, pair, vertices, subset, planted, clique, problem, algorithm. In computational complexity theory a planted clique or hidden clique in an undirected graph is a clique formed from another graph by selecting a subset of vertices and adding edges between each pair of vertices in the subset The planted clique problem is the algorithmic problem of distinguishing random graphs from graphs that have a planted clique This is a variation of the clique problem it may be solved in quasi polynomial time but is conjectured not to be solvable in polynomial time for intermediate values of the clique size The conjecture that no polynomial time solution exists is called the planted clique conjecture it has been used as a computational hardness assumption Contents 1 Definition 1 1 Upper and lower bounds 2 Algorithms 2 1 Large cliques 2 2 Quasipolynomial time 3 As a hardness assumption 4 ReferencesDefinition editA clique in a graph is a subset of vertices all of which are adjacent to each other A planted clique is a clique created from another graph by adding edges between all pairs of a selected subset of vertices The planted clique problem can be formalized as a decision problem over a random distribution on graphs parameterized by two numbers n the number of vertices and k the size of the clique These parameters may be used to generate a graph by the following random process 1 Create an Erdos Renyi random graph on n vertices by choosing independently for each pair of vertices whether to include an edge connecting that pair with probability 1 2 for each pair Decide whether or not to add a clique to the graph with probability 1 2 if not return the graph formed in step 1 Choose randomly a subset of k of the n vertices and add an edge if one is not already present between each pair of the selected vertices The problem is then to determine algorithmically whether one of the graphs resulting from this process contains a clique of at least k vertices Upper and lower bounds edit There exists a function f n 2 log 2 n displaystyle f n sim 2 log 2 n nbsp such that asymptotically almost surely the size of the largest clique in an n vertex random graph is either f n displaystyle f n nbsp or f n 1 displaystyle f n 1 nbsp 2 and there exists some constant c displaystyle c nbsp such that the expected number of cliques of size f n c displaystyle geq f n c nbsp converges to infinity Consequently one should expect that the planting a clique of size 2 log 2 n displaystyle sim 2 log 2 n nbsp cannot be detected with high probability By the central limit theorem the vertex degrees of the random graph would be distributed close to a standard normal distribution with mean n 2 displaystyle frac n 2 nbsp and standard deviation n 2 displaystyle frac sqrt n 2 nbsp Consequently when k displaystyle k nbsp is on the order of n displaystyle sqrt n nbsp it would create a detectable change in the shape of the distribution Namely if you plot out the vertex degree distribution it would look like a deformed bell curve Therefore the most interesting range of values for the parameter k is between these two values 1 2 log 2 n k n displaystyle 2 log 2 n ll k ll sqrt n nbsp Algorithms editLarge cliques edit For sufficiently large values of the parameter k the planted clique problem can be solved with high probability in polynomial time 1 Kucera 1995 observes that when k W n log n displaystyle k Omega sqrt n log n nbsp then almost surely all vertices of the planted clique have higher degree than all vertices outside the clique making the clique very easy to find He describes a modification to the random process for generating planted clique instances that makes the vertex degrees more uniform even for large values of k but shows that despite this modification the planted clique may still be found quickly 3 Alon Krivelevich amp Sudakov 1998 prove for k gt 10 n displaystyle k gt 10 sqrt n nbsp a planted clique can be found with high probability by the following method Compute the eigenvector of the adjacency matrix corresponding to its second highest eigenvalue Select the k vertices whose coordinates in this eigenvector have the largest absolute values Return the set of vertices that are adjacent to at least 3 4 of the selected vertices They show how to modify this technique so that it continues to work whenever k is at least proportional to some multiple of the square root of the number of vertices 4 Large planted cliques can also be found using semidefinite programming 5 A combinatorial technique based on randomly sampling vertices can achieve the same bound on k and runs in linear time 6 Quasipolynomial time edit It is also possible to solve the planted clique problem regardless of the choice of k in quasi polynomial time 7 Because the largest clique in a random graph typically has size near 2 log2 n 8 a planted clique of size k if it exists can be found with high probability by the following method Loop through all sets S of min k 3 log 2 n displaystyle min k 3 log 2 n nbsp vertices For each choice of S test whether S is a clique If it is and S k displaystyle S k nbsp return S Otherwise find the set T of vertices that are adjacent to all vertices in S If T k displaystyle T geq k nbsp return T The running time of this algorithm is quasipolynomial because there are quasipolynomially many choices of S to loop over This method is guaranteed to try a set S that is a subset of the planted clique with high probability the set T will consist only of other members of the planted clique As a hardness assumption editThe planted clique conjecture is the conjecture that there is no polynomial time algorithm that takes as input graphs produced by the planted clique process and distinguishes the ones with planted cliques from the ones that don t have planted cliques with probability significantly better than random chance 9 Hazan amp Krauthgamer 2011 used the assumption that finding planted cliques is hard as a computational hardness assumption to prove that if so it is also hard to approximate the best Nash equilibrium in a two player game 7 The planted clique conjecture has also been used as a hardness assumption to show the difficulty of property testing k independence of random distributions 10 finding clusters in social networks 11 and machine learning 12 References edit a b c Arora Sanjeev Barak Boaz 2009 Computational Complexity A Modern Approach Cambridge University Press pp 362 363 ISBN 9780521424264 Bollobas B Erdos P November 1976 Cliques in random graphs Mathematical Proceedings of the Cambridge Philosophical Society 80 3 419 427 doi 10 1017 S0305004100053056 ISSN 1469 8064 S2CID 16619643 Kucera Ludek 1995 Expected complexity of graph partitioning problems Discrete Applied Mathematics 57 2 3 193 212 doi 10 1016 0166 218X 94 00103 K hdl 11858 00 001M 0000 0014 B73F 2 MR 1327775 Alon Noga Krivelevich Michael Sudakov Benny 1998 Finding a large hidden clique in a random graph Random Structures amp Algorithms 13 3 4 457 466 CiteSeerX 10 1 1 24 6419 doi 10 1002 SICI 1098 2418 199810 12 13 3 4 lt 457 AID RSA14 gt 3 3 CO 2 K MR 1662795 Feige U Krauthgamer R 2000 Finding and certifying a large hidden clique in a semirandom graph Random Structures and Algorithms 16 2 195 208 doi 10 1002 SICI 1098 2418 200003 16 2 lt 195 AID RSA5 gt 3 0 CO 2 A Dekel Yael Gurel Gurevich Ori Peres Yuval 2014 Finding hidden cliques in linear time with high probability Combinatorics Probability and Computing 23 1 29 49 arXiv 1010 2997 doi 10 1017 S096354831300045X MR 3197965 S2CID 14356678 a b Hazan Elad Krauthgamer Robert 2011 How hard is it to approximate the best Nash equilibrium SIAM Journal on Computing 40 1 79 91 CiteSeerX 10 1 1 511 4422 doi 10 1137 090766991 MR 2765712 Grimmett G R McDiarmid C J H 1975 On colouring random graphs Mathematical Proceedings of the Cambridge Philosophical Society 77 2 313 324 Bibcode 1975MPCPS 77 313G doi 10 1017 S0305004100051124 MR 0369129 S2CID 3421302 Braverman Mark Ko Young Kun Rubinstein Aviad Weinstein Omri 2015 ETH hardness for densest k subgraph with perfect completeness arXiv 1504 08352 Bibcode 2015arXiv150408352B Alon Noga Andoni Alexandr Kaufman Tali Matulef Kevin Rubinfeld Ronitt Xie Ning 2007 Testing k wise and almost k wise independence STOC 07 Proceedings of the 39th Annual ACM Symposium on Theory of Computing New York ACM pp 496 505 doi 10 1145 1250790 1250863 ISBN 9781595936318 MR 2402475 S2CID 5050980 Balcan Maria Florina Borgs Christian Braverman Mark Chayes Jennifer Teng Shang Hua 2013 Finding Endogenously Formed Communities Proceedings of the Twenty Fourth Annual ACM SIAM Symposium on Discrete Algorithms SODA 13 SIAM pp 767 783 ISBN 978 1 611972 51 1 Berthet Quentin Rigollet Philippe 2013 Complexity theoretic lower bounds for sparse principal component detection Conference on Learning Theory Journal of Machine Learning Research 30 1046 1066 Retrieved from https en wikipedia org w index php title Planted clique amp oldid 1131429254, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.