
Analysis of Boolean functions

In mathematics and theoretical computer science, analysis of Boolean functions is the study of real-valued functions on $\{0,1\}^n$ or $\{-1,1\}^n$ (such functions are sometimes known as pseudo-Boolean functions) from a spectral perspective.[1] The functions studied are often, but not always, Boolean-valued, making them Boolean functions. The area has found many applications in combinatorics, social choice theory, random graphs, and theoretical computer science, especially in hardness of approximation, property testing, and PAC learning.

Basic concepts

We will mostly consider functions defined on the domain $\{-1,1\}^n$. Sometimes it is more convenient to work with the domain $\{0,1\}^n$ instead. If $f$ is defined on $\{-1,1\}^n$, then the corresponding function defined on $\{0,1\}^n$ is

$$f_{01}(x_1,\ldots,x_n) = f((-1)^{x_1},\ldots,(-1)^{x_n}).$$

Similarly, for us a Boolean function is a $\{-1,1\}$-valued function, though often it is more convenient to consider $\{0,1\}$-valued functions instead.

Fourier expansion

Every real-valued function $f\colon\{-1,1\}^n\to\mathbb{R}$ has a unique expansion as a multilinear polynomial:

$$f(x) = \sum_{S\subseteq[n]} \hat{f}(S)\,\chi_S(x), \qquad \chi_S(x) = \prod_{i\in S} x_i.$$

(Note that even if the function is 0-1 valued this is not a sum mod 2, but just an ordinary sum of real numbers.)

This is the Hadamard transform of the function $f$, which is the Fourier transform in the group $\mathbb{Z}_2^n$. The coefficients $\hat{f}(S)$ are known as Fourier coefficients, and the entire sum is known as the Fourier expansion of $f$. The functions $\chi_S$ are known as Fourier characters, and they form an orthonormal basis for the space of all functions over $\{-1,1\}^n$, with respect to the inner product $\langle f,g\rangle = 2^{-n}\sum_{x\in\{-1,1\}^n} f(x)g(x)$.

The Fourier coefficients can be calculated using an inner product:

$$\hat{f}(S) = \langle f, \chi_S\rangle.$$

In particular, this shows that $\hat{f}(\emptyset) = \operatorname{E}[f]$, where the expected value is taken with respect to the uniform distribution over $\{-1,1\}^n$. Parseval's identity states that

$$\|f\|^2 = \operatorname{E}[f^2] = \sum_{S\subseteq[n]} \hat{f}(S)^2.$$

If we skip $S = \emptyset$, then we get the variance of $f$:

$$\operatorname{Var}[f] = \sum_{S\neq\emptyset} \hat{f}(S)^2.$$
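These identities can be verified by brute force for small $n$. The following sketch (helper names such as `fourier_coefficients` are mine, not a standard API; coordinates are 0-indexed) computes the Fourier expansion of the majority function on 3 bits and checks Parseval's identity:

```python
from itertools import product
from math import prod

def fourier_coefficients(f, n):
    """f_hat(S) = <f, chi_S> = E[f(x) * chi_S(x)], x uniform on {-1,1}^n."""
    cube = list(product([-1, 1], repeat=n))
    coeffs = {}
    for mask in range(1 << n):
        S = [i for i in range(n) if mask >> i & 1]
        coeffs[frozenset(S)] = sum(f(x) * prod(x[i] for i in S) for x in cube) / 2 ** n
    return coeffs

def maj3(x):
    return 1 if sum(x) > 0 else -1

fhat = fourier_coefficients(maj3, 3)
# Maj_3 = (1/2)(x_1 + x_2 + x_3) - (1/2) x_1 x_2 x_3:
assert abs(fhat[frozenset({0})] - 0.5) < 1e-9
assert abs(fhat[frozenset({0, 1, 2})] + 0.5) < 1e-9
# Parseval: E[f^2] = sum of squared coefficients; here f is +-1-valued, so it is 1.
assert abs(sum(c * c for c in fhat.values()) - 1.0) < 1e-9
# Variance: same sum with the empty set skipped; maj3 has mean 0, so Var[f] = 1 too.
assert abs(sum(c * c for S, c in fhat.items() if S) - 1.0) < 1e-9
```

The brute-force transform takes $4^n$ operations; it is meant only as an executable restatement of the definitions.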

Fourier degree and Fourier levels

The degree of a function $f\colon\{-1,1\}^n\to\mathbb{R}$ is the maximum $d$ such that $\hat{f}(S)\neq 0$ for some set $S$ of size $d$. In other words, the degree of $f$ is its degree as a multilinear polynomial.

It is convenient to decompose the Fourier expansion into levels: the Fourier coefficient $\hat{f}(S)$ is on level $|S|$.

The degree $d$ part of $f$ is

$$f^{=d} = \sum_{|S|=d} \hat{f}(S)\chi_S.$$

It is obtained from $f$ by zeroing out all Fourier coefficients not on level $d$.

We similarly define $f^{>d}, f^{<d}, f^{\geq d}, f^{\leq d}$.

Influence

The $i$'th influence of a function $f\colon\{-1,1\}^n\to\mathbb{R}$ can be defined in two equivalent ways:

$$\operatorname{Inf}_i[f] = \operatorname{E}\left[\left(\frac{f - f^{\oplus i}}{2}\right)^2\right] = \sum_{S\ni i}\hat{f}(S)^2, \qquad f^{\oplus i}(x_1,\ldots,x_n) = f(x_1,\ldots,x_{i-1},-x_i,x_{i+1},\ldots,x_n).$$

If $f$ is Boolean then $\operatorname{Inf}_i[f]$ is the probability that flipping the $i$'th coordinate flips the value of the function:

$$\operatorname{Inf}_i[f] = \Pr[f(x) \neq f^{\oplus i}(x)].$$

If $\operatorname{Inf}_i[f] = 0$ then $f$ doesn't depend on the $i$'th coordinate.

The total influence of $f$ is the sum of all of its influences:

$$\operatorname{Inf}[f] = \sum_{i=1}^n \operatorname{Inf}_i[f] = \sum_S |S|\,\hat{f}(S)^2.$$

The total influence of a Boolean function is also the average sensitivity of the function. The sensitivity of a Boolean function $f$ at a given point is the number of coordinates $i$ such that if we flip the $i$'th coordinate, the value of the function changes. The average value of this quantity is exactly the total influence.
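The equality between total influence and average sensitivity can be checked exhaustively. In this sketch (helper names `flip`, `influence`, `sensitivity` are mine) both quantities are computed for the 3-bit majority function:

```python
from itertools import product

def flip(x, i):
    """x with the i-th coordinate negated (the point x^{(+)i})."""
    y = list(x)
    y[i] = -y[i]
    return tuple(y)

def influence(f, n, i):
    """Inf_i[f] = Pr[f(x) != f(x^{(+)i})] for Boolean f, x uniform on {-1,1}^n."""
    cube = list(product([-1, 1], repeat=n))
    return sum(f(x) != f(flip(x, i)) for x in cube) / 2 ** n

def sensitivity(f, n, x):
    """Number of coordinates whose flip changes f at the point x."""
    return sum(f(x) != f(flip(x, i)) for i in range(n))

def maj3(x):
    return 1 if sum(x) > 0 else -1

cube = list(product([-1, 1], repeat=3))
total_inf = sum(influence(maj3, 3, i) for i in range(3))
avg_sens = sum(sensitivity(maj3, 3, x) for x in cube) / 8
# Flipping a coordinate of Maj_3 matters exactly when the other two disagree,
# so each Inf_i[Maj_3] = 1/2 and the total influence is 3/2.
assert total_inf == avg_sens == 1.5
```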

The total influence can also be defined using the discrete Laplacian of the Hamming graph, suitably normalized: $\operatorname{Inf}[f] = \langle f, Lf\rangle$.

A generalized form of influence is the $\rho$-stable influence, defined by:

$$\operatorname{Inf}_i^{(\rho)}[f] = \operatorname{Stab}_\rho[\operatorname{D}_i f] = \sum_{S\ni i} \rho^{|S|-1}\hat{f}(S)^2.$$

The corresponding total influence is

$$I^{(\rho)}[f] = \frac{d}{d\rho}\operatorname{Stab}_\rho[f] = \sum_S |S|\,\rho^{|S|-1}\hat{f}(S)^2.$$

One can prove that a function $f\colon\{-1,1\}^n\to\{-1,1\}$ has at most "constantly" many "stably-influential" coordinates:

$$\left|\left\{i\in[n] : \operatorname{Inf}_i^{(1-\delta)}[f] \geq \epsilon\right\}\right| \leq \frac{1}{\delta\epsilon}.$$

Noise stability

Given $-1\leq\rho\leq 1$, we say that two random vectors $x,y\in\{-1,1\}^n$ are $\rho$-correlated if the marginal distributions of $x,y$ are uniform, and $\operatorname{E}[x_i y_i] = \rho$. Concretely, we can generate a pair of $\rho$-correlated random variables by first choosing $x,z\in\{-1,1\}^n$ uniformly at random, and then choosing $y$ according to one of the following two equivalent rules, applied independently to each coordinate:

$$y_i = \begin{cases} x_i & \text{w.p. } \rho \\ z_i & \text{w.p. } 1-\rho\end{cases} \qquad \text{or} \qquad y_i = \begin{cases} x_i & \text{w.p. } \tfrac{1+\rho}{2} \\ -x_i & \text{w.p. } \tfrac{1-\rho}{2}\end{cases}$$

We denote this distribution by $y\sim N_\rho(x)$.

The noise stability of a function $f\colon\{-1,1\}^n\to\mathbb{R}$ at $\rho$ can be defined in two equivalent ways:

$$\operatorname{Stab}_\rho[f] = \operatorname{E}_{x,\, y\sim N_\rho(x)}[f(x)f(y)] = \sum_{S\subseteq[n]} \rho^{|S|}\hat{f}(S)^2.$$

For $0\leq\delta\leq 1$, the noise sensitivity of $f$ at $\delta$ is

$$\operatorname{NS}_\delta[f] = \tfrac{1}{2} - \tfrac{1}{2}\operatorname{Stab}_{1-2\delta}[f].$$

If $f$ is Boolean, then this is the probability that the value of $f$ changes if we flip each coordinate with probability $\delta$, independently.
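The two definitions of noise stability can be compared directly by exact enumeration for a small function (helper names `stab_spectral` and `stab_direct` are mine):

```python
from itertools import product
from math import prod

def stab_spectral(f, n, rho):
    """Stab_rho[f] = sum_S rho^|S| * f_hat(S)^2, via brute-force Fourier coefficients."""
    cube = list(product([-1, 1], repeat=n))
    total = 0.0
    for mask in range(1 << n):
        S = [i for i in range(n) if mask >> i & 1]
        fhat = sum(f(x) * prod(x[i] for i in S) for x in cube) / 2 ** n
        total += rho ** len(S) * fhat ** 2
    return total

def stab_direct(f, n, rho):
    """Stab_rho[f] = E[f(x) f(y)], where y_i = x_i w.p. (1+rho)/2 and -x_i otherwise."""
    cube = list(product([-1, 1], repeat=n))
    total = 0.0
    for x in cube:
        for y in cube:
            p = prod((1 + rho) / 2 if y[i] == x[i] else (1 - rho) / 2 for i in range(n))
            total += f(x) * f(y) * p
    return total / 2 ** n

def maj3(x):
    return 1 if sum(x) > 0 else -1

rho = 0.6
# Both definitions agree; for Maj_3 the value is (3/4)rho + (1/4)rho^3.
assert abs(stab_spectral(maj3, 3, rho) - stab_direct(maj3, 3, rho)) < 1e-9
assert abs(stab_spectral(maj3, 3, rho) - 0.504) < 1e-9
```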

Noise operator

The noise operator $T_\rho$ is an operator taking a function $f\colon\{-1,1\}^n\to\mathbb{R}$ and returning another function $T_\rho f\colon\{-1,1\}^n\to\mathbb{R}$ given by

$$T_\rho f(x) = \operatorname{E}_{y\sim N_\rho(x)}[f(y)] = \sum_{S\subseteq[n]} \rho^{|S|}\hat{f}(S)\chi_S(x).$$

When $\rho > 0$, the noise operator can also be defined using a continuous-time Markov chain in which each bit is flipped independently with rate 1. The operator $T_\rho$ corresponds to running this Markov chain for $\tfrac{1}{2}\log\tfrac{1}{\rho}$ steps starting at $x$, and taking the average value of $f$ at the final state. This Markov chain is generated by the Laplacian of the Hamming graph, and this relates total influence to the noise operator.

Noise stability can be defined in terms of the noise operator: $\operatorname{Stab}_\rho[f] = \langle f, T_\rho f\rangle$.

Hypercontractivity

For $1\leq q<\infty$, the $L_q$-norm of a function $f\colon\{-1,1\}^n\to\mathbb{R}$ is defined by

$$\|f\|_q = \sqrt[q]{\operatorname{E}[|f|^q]}.$$

We also define $\|f\|_\infty = \max_{x\in\{-1,1\}^n} |f(x)|$.

The hypercontractivity theorem states that for any $q>2$ and $q' = 1/(1-1/q)$, if $0\leq\rho\leq 1/\sqrt{q-1}$ then

$$\|T_\rho f\|_q \leq \|f\|_2 \qquad \text{and} \qquad \|T_\rho f\|_2 \leq \|f\|_{q'}.$$

Hypercontractivity is closely related to the logarithmic Sobolev inequalities of functional analysis.[2]

A similar result for $q < 2$ is known as reverse hypercontractivity.[3]
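The contractive inequality $\|T_\rho f\|_q \leq \|f\|_2$ can be checked numerically at the critical noise rate $\rho = 1/\sqrt{q-1}$; the sketch below does this for $q=4$ and the 3-bit majority function (helper names `noise_op` and `lq_norm` are mine):

```python
from itertools import product
from math import prod, sqrt

def noise_op(f, n, rho):
    """T_rho f as a dict x -> sum_S rho^|S| f_hat(S) chi_S(x) (brute force)."""
    cube = list(product([-1, 1], repeat=n))
    fhat = {}
    for mask in range(1 << n):
        S = tuple(i for i in range(n) if mask >> i & 1)
        fhat[S] = sum(f(x) * prod(x[i] for i in S) for x in cube) / 2 ** n
    return {x: sum(rho ** len(S) * c * prod(x[i] for i in S) for S, c in fhat.items())
            for x in cube}

def lq_norm(values, q):
    """(E |g|^q)^{1/q} under the uniform measure."""
    return (sum(abs(v) ** q for v in values) / len(values)) ** (1 / q)

def maj3(x):
    return 1 if sum(x) > 0 else -1

q = 4
rho = 1 / sqrt(q - 1)                      # critical rho for (2, q)-hypercontractivity
cube = list(product([-1, 1], repeat=3))
lhs = lq_norm(noise_op(maj3, 3, rho).values(), q)
rhs = lq_norm([maj3(x) for x in cube], 2)  # equals 1, since maj3 is +-1-valued
assert lhs <= rhs + 1e-12
```

A single numeric check is of course no proof; it only illustrates that applying $T_\rho$ smooths $f$ enough that its 4-norm drops below its 2-norm.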

p-Biased analysis

In many situations the input to the function is not uniformly distributed over $\{-1,1\}^n$, but instead has a bias toward $-1$ or $1$. In these situations it is customary to consider functions over the domain $\{0,1\}^n$. For $0<p<1$, the p-biased measure $\mu_p$ is given by

$$\mu_p(x) = p^{\sum_i x_i}(1-p)^{\sum_i (1-x_i)}.$$

This measure can be generated by choosing each coordinate independently to be 1 with probability $p$ and 0 with probability $1-p$.

The classical Fourier characters are no longer orthogonal with respect to this measure. Instead, we use the following characters:

$$\omega_S(x) = \left(\sqrt{\frac{p}{1-p}}\right)^{\#\{i\in S \,:\, x_i=0\}}\left(-\sqrt{\frac{1-p}{p}}\right)^{\#\{i\in S \,:\, x_i=1\}}$$

The p-biased Fourier expansion of $f$ is the expansion of $f$ as a linear combination of p-biased characters:

$$f = \sum_{S\subseteq[n]} \hat{f}(S)\omega_S.$$

We can extend the definitions of influence and the noise operator to the p-biased setting by using their spectral definitions.
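The defining property of the $\omega_S$, that they form an orthonormal basis with respect to $\mu_p$, can be verified by enumeration (helper names `mu_p`, `omega`, `inner` are mine):

```python
from itertools import product
from math import prod, sqrt

def mu_p(x, p):
    """p-biased measure on {0,1}^n: each coordinate is 1 w.p. p, 0 w.p. 1-p."""
    return prod(p if xi == 1 else 1 - p for xi in x)

def omega(xi, p):
    """One-coordinate p-biased character (mean 0 and variance 1 under mu_p)."""
    return sqrt(p / (1 - p)) if xi == 0 else -sqrt((1 - p) / p)

def inner(S, T, n, p):
    """<omega_S, omega_T> under mu_p, by summing over all of {0,1}^n."""
    cube = list(product([0, 1], repeat=n))
    return sum(mu_p(x, p)
               * prod(omega(x[i], p) for i in S)
               * prod(omega(x[i], p) for i in T)
               for x in cube)

n, p = 3, 0.3
assert abs(inner({0, 1}, {0, 1}, n, p) - 1) < 1e-9   # unit norm
assert abs(inner({0, 1}, {0, 2}, n, p)) < 1e-9       # distinct sets are orthogonal
assert abs(inner({0}, set(), n, p)) < 1e-9           # each character has mean zero
```

Setting $p = 1/2$ recovers (up to the $0/1$ versus $\pm1$ encoding) the usual characters $\chi_S$.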

Influence

The $i$'th influence is given by

$$\operatorname{Inf}_i[f] = \sum_{S\ni i}\hat{f}(S)^2 = p(1-p)\operatorname{E}\left[\left(f - f^{\oplus i}\right)^2\right].$$

The total influence is the sum of the individual influences:

$$\operatorname{Inf}[f] = \sum_{i=1}^n \operatorname{Inf}_i[f] = \sum_S |S|\,\hat{f}(S)^2.$$

Noise operator

A pair of $\rho$-correlated random variables can be obtained by choosing $x,z\sim\mu_p$ independently and $y\sim N_\rho(x)$, where $N_\rho$ is given by

$$y_i = \begin{cases} x_i & \text{w.p. } \rho \\ z_i & \text{w.p. } 1-\rho\end{cases}$$

The noise operator is then given by

$$T_\rho f(x) = \sum_{S\subseteq[n]} \rho^{|S|}\hat{f}(S)\omega_S(x) = \operatorname{E}_{y\sim N_\rho(x)}[f(y)].$$

Using this we can define the noise stability and the noise sensitivity, as before.

Russo–Margulis formula

The Russo–Margulis formula (also called the Margulis–Russo formula[1]) states that for monotone Boolean functions $f\colon\{0,1\}^n\to\{0,1\}$,

$$\frac{d}{dp}\operatorname{E}_{x\sim\mu_p}[f(x)] = \frac{\operatorname{Inf}[f]}{p(1-p)} = \sum_{i=1}^n \Pr[f\neq f^{\oplus i}].$$

Both the influence and the probabilities are taken with respect to $\mu_p$, and on the right-hand side we have the average sensitivity of $f$. If we think of $f$ as a property, then the formula states that as $p$ varies, the derivative of the probability that $f$ occurs at $p$ equals the average sensitivity at $p$.

The Russo–Margulis formula is key for proving sharp threshold theorems such as Friedgut's.
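The formula can be sanity-checked numerically: for a small monotone function, a central-difference estimate of $\frac{d}{dp}\operatorname{E}_{\mu_p}[f]$ should match the average sensitivity under $\mu_p$ (helper names are mine):

```python
from itertools import product
from math import prod

def mu_p(x, p):
    """p-biased measure on {0,1}^n."""
    return prod(p if xi else 1 - p for xi in x)

def maj3(x):
    """Monotone: 3-bit majority in 0/1 notation."""
    return 1 if sum(x) >= 2 else 0

def expectation(f, n, p):
    cube = list(product([0, 1], repeat=n))
    return sum(mu_p(x, p) * f(x) for x in cube)

def avg_sensitivity(f, n, p):
    """sum_i Pr_{mu_p}[f != f^{(+)i}]."""
    cube = list(product([0, 1], repeat=n))
    total = 0.0
    for i in range(n):
        for x in cube:
            y = list(x)
            y[i] = 1 - y[i]
            total += mu_p(x, p) * (f(x) != f(tuple(y)))
    return total

p, h = 0.3, 1e-6
deriv = (expectation(maj3, 3, p + h) - expectation(maj3, 3, p - h)) / (2 * h)
# E_{mu_p}[Maj_3] = 3p^2 - 2p^3, whose derivative 6p(1-p) equals the
# average sensitivity 3 * 2p(1-p).
assert abs(deriv - avg_sensitivity(maj3, 3, p)) < 1e-6
```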

Gaussian space

One of the deepest results in the area, the invariance principle, connects the distribution of functions on the Boolean cube $\{-1,1\}^n$ to their distribution on Gaussian space, which is the space $\mathbb{R}^n$ endowed with the standard $n$-dimensional Gaussian measure.

Many of the basic concepts of Fourier analysis on the Boolean cube have counterparts in Gaussian space:

  • The counterpart of the Fourier expansion in Gaussian space is the Hermite expansion, which is an expansion to an infinite sum (converging in $L^2$) of multivariate Hermite polynomials.
  • The counterpart of total influence or average sensitivity for the indicator function of a set is Gaussian surface area, which is the Minkowski content of the boundary of the set.
  • The counterpart of the noise operator is the Ornstein–Uhlenbeck operator (related to the Mehler transform), given by $U_\rho f(x) = \operatorname{E}_{z\sim N(0,I_n)}\left[f\left(\rho x + \sqrt{1-\rho^2}\,z\right)\right]$, or alternatively by $U_\rho f(x) = \operatorname{E}[f(y)]$, where $(x,y)$ is a pair of $\rho$-correlated standard Gaussians.
  • Hypercontractivity holds (with appropriate parameters) in Gaussian space as well.

Gaussian space is more symmetric than the Boolean cube (for example, it is rotation invariant), and supports continuous arguments that may be harder to carry out in the discrete setting of the Boolean cube. The invariance principle links the two settings, and allows deducing results on the Boolean cube from results on Gaussian space.

Basic results

Friedgut–Kalai–Naor theorem

If $f\colon\{-1,1\}^n\to\{-1,1\}$ has degree at most 1, then $f$ is either constant, equal to a coordinate, or equal to the negation of a coordinate. In particular, $f$ is a dictatorship: a function depending on at most one coordinate.

The Friedgut–Kalai–Naor theorem,[4] also known as the FKN theorem, states that if $f$ almost has degree 1 then it is close to a dictatorship. Quantitatively, if $f\colon\{-1,1\}^n\to\{-1,1\}$ and $\|f^{>1}\|^2<\varepsilon$, then $f$ is $O(\varepsilon)$-close to a dictatorship, that is, $\|f-g\|^2 = O(\varepsilon)$ for some Boolean dictatorship $g$, or equivalently, $\Pr[f\neq g] = O(\varepsilon)$ for some Boolean dictatorship $g$.

Similarly, a Boolean function of degree at most $d$ depends on at most $C_W 2^d$ coordinates, making it a junta (a function depending on a constant number of coordinates), where $C_W$ is an absolute constant equal to at least 1.5 and at most 4.41, as shown by Wellens.[5] The Kindler–Safra theorem[6] generalizes the Friedgut–Kalai–Naor theorem to this setting. It states that if $f\colon\{-1,1\}^n\to\{-1,1\}$ satisfies $\|f^{>d}\|^2<\varepsilon$ then $f$ is $O(\varepsilon)$-close to a Boolean function of degree at most $d$.

Kahn–Kalai–Linial theorem

The Poincaré inequality for the Boolean cube (which follows from formulas appearing above) states that for a function $f\colon\{-1,1\}^n\to\mathbb{R}$,

$$\operatorname{Var}[f] \leq \operatorname{Inf}[f] \leq \deg f \cdot \operatorname{Var}[f].$$

This implies that $\max_i \operatorname{Inf}_i[f] \geq \frac{\operatorname{Var}[f]}{n}$.
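Since $\operatorname{Var}[f]$, $\operatorname{Inf}[f]$ and $\deg f$ are all read off the level weights $W_k = \sum_{|S|=k}\hat{f}(S)^2$, the Poincaré inequality is easy to verify for a concrete function (helper name `fourier_sq_by_level` is mine):

```python
from itertools import product
from math import prod

def fourier_sq_by_level(f, n):
    """Level weights W_k = sum over |S| = k of f_hat(S)^2."""
    cube = list(product([-1, 1], repeat=n))
    W = [0.0] * (n + 1)
    for mask in range(1 << n):
        S = [i for i in range(n) if mask >> i & 1]
        fhat = sum(f(x) * prod(x[i] for i in S) for x in cube) / 2 ** n
        W[len(S)] += fhat ** 2
    return W

def maj3(x):
    return 1 if sum(x) > 0 else -1

W = fourier_sq_by_level(maj3, 3)
var = sum(W[1:])                           # Var[f] = sum over nonempty S
inf = sum(k * w for k, w in enumerate(W))  # Inf[f] = sum |S| f_hat(S)^2
deg = max(k for k, w in enumerate(W) if w > 1e-12)
# For Maj_3: Var = 1, Inf = 3/2, deg = 3, so 1 <= 3/2 <= 3.
assert var <= inf <= deg * var + 1e-12
```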

The Kahn–Kalai–Linial theorem,[7] also known as the KKL theorem, states that if $f$ is Boolean then $\max_i \operatorname{Inf}_i[f] = \Omega\left(\frac{\log n}{n}\right)$.

The bound given by the Kahn–Kalai–Linial theorem is tight, and is achieved by the Tribes function of Ben-Or and Linial:[8]

$$(x_{1,1}\land\cdots\land x_{1,w}) \lor \cdots \lor (x_{2^w,1}\land\cdots\land x_{2^w,w})$$

The Kahn–Kalai–Linial theorem was one of the first results in the area, and it introduced hypercontractivity into the study of Boolean functions.

Friedgut's junta theorem

If $f\colon\{-1,1\}^n\to\{-1,1\}$ is an $M$-junta (a function depending on at most $M$ coordinates) then $\operatorname{Inf}[f]\leq M$ according to the Poincaré inequality.

Friedgut's theorem[9] is a converse to this result. It states that for any $\varepsilon>0$, the function $f$ is $\varepsilon$-close to a Boolean junta depending on $2^{O(\operatorname{Inf}[f]/\varepsilon)}$ coordinates.

Combined with the Russo–Margulis lemma, Friedgut's junta theorem implies that for every $\varepsilon>0$, every monotone function is close to a junta with respect to $\mu_p$ for some $p$.

Invariance principle

The invariance principle[10] generalizes the Berry–Esseen theorem to non-linear functions.

The Berry–Esseen theorem states (among other things) that if $f = \sum_{i=1}^n c_i x_i$ and no $c_i$ is too large compared to the rest, then the distribution of $f$ over $x\in\{-1,1\}^n$ is close to a normal distribution with the same mean and variance.

The invariance principle (in a special case) informally states that if $f$ is a multilinear polynomial of bounded degree over $x_1,\ldots,x_n$ and all influences of $f$ are small, then the distribution of $f$ under the uniform measure over $\{-1,1\}^n$ is close to its distribution in Gaussian space.

More formally, let $\psi$ be a univariate Lipschitz function, let $f = \sum_{S\subseteq[n]}\hat{f}(S)\chi_S$, let $k = \deg f$, and let $\varepsilon = \max_i \operatorname{Inf}_i[f]$. Suppose that $\operatorname{Var}[f] \leq 1$. Then

$$\left|\operatorname{E}_{x\sim\{-1,1\}^n}[\psi(f(x))] - \operatorname{E}_{g\sim N(0,I_n)}[\psi(f(g))]\right| \leq \delta(k,\varepsilon),$$

where $\delta(k,\varepsilon)$ depends only on $k$, $\varepsilon$ and the Lipschitz constant of $\psi$, and tends to $0$ as $\varepsilon\to 0$ for fixed $k$.

By choosing appropriate $\psi$, this implies that the distributions of $f$ under both measures are close in CDF distance, which is given by $\sup_t \left|\Pr[f(x) < t] - \Pr[f(g) < t]\right|$.

The invariance principle was the key ingredient in the original proof of the Majority is Stablest theorem.

Some applications

Linearity testing

A Boolean function $f\colon\{0,1\}^n\to\{0,1\}$ is linear if it satisfies $f(x\oplus y) = f(x)\oplus f(y)$, where $x\oplus y = (x_1\oplus y_1,\ldots,x_n\oplus y_n)$. It is not hard to show that the Boolean linear functions are exactly the characters $\chi_S$.

In property testing we want to test whether a given function is linear. It is natural to try the following test: choose $x,y\in\{0,1\}^n$ uniformly at random, and check that $f(x\oplus y) = f(x)\oplus f(y)$. If $f$ is linear then it always passes the test. Blum, Luby and Rubinfeld[11] showed that if the test passes with probability $1-\varepsilon$ then $f$ is $O(\varepsilon)$-close to a Fourier character. Their proof was combinatorial.

Bellare et al.[12] gave an extremely simple Fourier-analytic proof, that also shows that if the test succeeds with probability $\frac{1}{2}+\varepsilon$, then $f$ is correlated with a Fourier character. Their proof relies on the following formula for the success probability of the test:

$$\Pr[f(x\oplus y) = f(x)\oplus f(y)] = \frac{1}{2} + \frac{1}{2}\sum_{S\subseteq[n]}\hat{g}(S)^3, \qquad g = (-1)^f.$$
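This success-probability formula can be confirmed exhaustively for a small example (the corrupted near-linear function below and the helper names are mine):

```python
from itertools import product

def pass_probability(f, n):
    """Exact Pr over uniform x, y in {0,1}^n that f(x XOR y) == f(x) XOR f(y)."""
    cube = list(product([0, 1], repeat=n))
    hits = sum(f(tuple(a ^ b for a, b in zip(x, y))) == (f(x) ^ f(y))
               for x in cube for y in cube)
    return hits / 4 ** n

def fourier_cubed_sum(f, n):
    """sum_S g_hat(S)^3 for g = (-1)^f, with chi_S(x) = (-1)^{sum_{i in S} x_i}."""
    cube = list(product([0, 1], repeat=n))
    total = 0.0
    for mask in range(1 << n):
        S = [i for i in range(n) if mask >> i & 1]
        ghat = sum((-1) ** (f(x) + sum(x[i] for i in S)) for x in cube) / 2 ** n
        total += ghat ** 3
    return total

def f(x):
    # x_0 XOR x_1, corrupted on a single input -- so f is close to, but not, linear.
    return (x[0] ^ x[1]) ^ (1 if x == (1, 1, 1) else 0)

assert abs(pass_probability(f, 3) - (0.5 + 0.5 * fourier_cubed_sum(f, 3))) < 1e-9
assert pass_probability(f, 3) < 1      # the corruption makes some pairs fail
```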

Arrow's theorem

Arrow's impossibility theorem states that for three or more candidates, the only unanimous voting rule for which there is always a Condorcet winner is a dictatorship.

The usual proof of Arrow's theorem is combinatorial. Kalai[13] gave an alternative proof of this result in the case of three candidates using Fourier analysis. If $f\colon\{-1,1\}^n\to\{-1,1\}$ is the rule that assigns a winner among two candidates given their relative orders in the votes, then the probability that there is a Condorcet winner given a uniformly random vote is $\frac{3}{4} - \frac{3}{4}\operatorname{Stab}_{-1/3}[f]$, from which the theorem easily follows.

The FKN theorem implies that if $f$ is a rule for which there is almost always a Condorcet winner, then $f$ is close to a dictatorship.
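Kalai's formula can be checked against a direct enumeration. With 3 voters and pairwise majority, $\operatorname{Stab}_{-1/3}[\operatorname{Maj}_3] = -\tfrac{7}{27}$, so the formula predicts a Condorcet winner with probability $\tfrac{3}{4}(1+\tfrac{7}{27}) = \tfrac{17}{18}$; the sketch below (helper names are mine) recovers the same value by enumerating all $6^3$ ranking profiles:

```python
from fractions import Fraction
from itertools import permutations, product

def condorcet_winner_probability(n_voters):
    """Exact Pr that pairwise majority yields a Condorcet winner, with voters
    i.i.d. uniform over the 6 rankings of candidates {a, b, c}."""
    rankings = list(permutations("abc"))
    wins = total = 0
    for profile in product(rankings, repeat=n_voters):
        total += 1

        def beats(u, v):
            # u beats v if a strict majority of voters rank u above v.
            return sum(r.index(u) < r.index(v) for r in profile) * 2 > n_voters

        # A Condorcet winner is a candidate who beats both others;
        # with 3 candidates, one exists unless the majority relation is a cycle.
        if any(all(beats(w, o) for o in "abc" if o != w) for w in "abc"):
            wins += 1
    return Fraction(wins, total)

# Condorcet's paradox occurs with probability 1/18 for 3 voters under majority.
assert condorcet_winner_probability(3) == Fraction(17, 18)
```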

Sharp thresholds

A classical result in the theory of random graphs states that the probability that a $G(n,p)$ random graph is connected tends to $e^{-e^{-c}}$ if $p = \frac{\log n + c}{n}$. This is an example of a sharp threshold: the width of the "threshold window", which is $O(1/n)$, is asymptotically smaller than the threshold itself, which is roughly $\frac{\log n}{n}$. In contrast, the probability that a $G(n,p)$ graph contains a triangle tends to $1 - e^{-c^3/6}$ when $p = \frac{c}{n}$. Here both the threshold window and the threshold itself are $\Theta(1/n)$, and so this is a coarse threshold.

Friedgut's sharp threshold theorem[14] states, roughly speaking, that a monotone graph property (a graph property is a property which doesn't depend on the names of the vertices) has a sharp threshold unless it is correlated with the appearance of small subgraphs. This theorem has been widely applied to analyze random graphs and percolation.

On a related note, the KKL theorem implies that the width of the threshold window is always at most $O(1/\log n)$.[15]

Majority is stablest

Let $\operatorname{Maj}_n$ denote the majority function on $n$ coordinates. Sheppard's formula gives the asymptotic noise stability of majority:

$$\lim_{n\to\infty} \operatorname{Stab}_\rho[\operatorname{Maj}_n] = 1 - \frac{2}{\pi}\arccos\rho.$$

This is related to the probability that if we choose $x$ uniformly at random and form $y$ by flipping each bit of $x$ with probability $\delta$, then the majority stays the same:

$$\lim_{n\to\infty} \Pr[\operatorname{Maj}_n(x) = \operatorname{Maj}_n(y)] = 1 - \frac{\arccos(1-2\delta)}{\pi}.$$
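The convergence in Sheppard's formula can be observed numerically. The sketch below (helper name `stab_majority` is mine) computes $\operatorname{Stab}_\rho[\operatorname{Maj}_n]$ exactly for small odd $n$ via a fast Walsh–Hadamard transform and compares it with the limit:

```python
from math import acos, pi

def stab_majority(n, rho):
    """Stab_rho[Maj_n] = sum_S rho^|S| f_hat(S)^2, computed from the truth
    table with an in-place Walsh-Hadamard transform (n odd and small)."""
    N = 1 << n
    # Index t encodes the input: bit i of t set means x_i = -1.
    a = [1 if 2 * bin(t).count("1") < n else -1 for t in range(N)]
    h = 1
    while h < N:                      # in-place Walsh-Hadamard transform
        for s in range(0, N, 2 * h):
            for j in range(s, s + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    # Now a[m] = 2^n * f_hat(S_m), where S_m is the set of bits of the mask m.
    return sum(rho ** bin(m).count("1") * (c / N) ** 2 for m, c in enumerate(a))

rho = 0.5
sheppard = 1 - 2 * acos(rho) / pi     # the n -> infinity limit; equals 1/3 here
s3, s7, s11 = (stab_majority(n, rho) for n in (3, 7, 11))
assert abs(s3 - 0.40625) < 1e-9       # (3/4) rho + (1/4) rho^3 at rho = 1/2
assert sheppard < s11 < s7 < s3       # stability decreases toward the limit
```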

There are Boolean functions with larger noise stability. For example, a dictatorship $x_i$ has noise stability $\rho$.

The Majority is Stablest theorem states, informally, that the only functions having noise stability larger than majority have influential coordinates. Formally, for every $0<\rho<1$ and $\varepsilon>0$ there exists $\tau>0$ such that if $f\colon\{-1,1\}^n\to[-1,1]$ has expectation zero and $\max_i \operatorname{Inf}_i[f] \leq \tau$, then $\operatorname{Stab}_\rho[f] \leq 1 - \frac{2}{\pi}\arccos\rho + \varepsilon$.

The first proof of this theorem used the invariance principle in conjunction with an isoperimetric theorem of Borell in Gaussian space; since then, more direct proofs have been devised.[16][17]

Majority is Stablest implies that the Goemans–Williamson approximation algorithm for MAX-CUT is optimal, assuming the unique games conjecture. This implication, due to Khot et al.,[18] was the impetus behind proving the theorem.

References

  1. ^ a b O'Donnell, Ryan (2014). Analysis of Boolean functions. Cambridge University Press. arXiv:2105.10386. ISBN 978-1-107-03832-5.
  2. ^ P. Diaconis; L. Saloff-Coste (August 1996). "Logarithmic Sobolev inequalities for finite Markov chains". Annals of Applied Probability. 6 (3): 695–750. doi:10.1214/AOAP/1034968224. ISSN 1050-5164. MR 1410112. Zbl 0867.60043. Wikidata Q62111462.
  3. ^ Mossel, Elchanan; Oleszkiewicz, Krzysztof; Sen, Arnab (2013). "On reverse hypercontractivity". Geometric and Functional Analysis. 23 (3): 1062–1097. arXiv:1108.1210. doi:10.1007/s00039-013-0229-4. S2CID 15933352.
  4. ^ Friedgut, Ehud; Kalai, Gil; Naor, Assaf (2002). "Boolean functions whose Fourier transform is concentrated on the first two levels". Advances in Applied Mathematics. 29 (3): 427–437. doi:10.1016/S0196-8858(02)00024-6.
  5. ^ Wellens, Jake (2020). "Relationships between the number of inputs and other complexity measures of Boolean functions". arXiv:2005.00566. doi:10.19086/da.57741.
  6. ^ Kindler, Guy (2002). "Chapter 16" (PDF). Property testing, PCP, and juntas (Thesis). Tel Aviv University.
  7. ^ Kahn, Jeff; Kalai, Gil; Linial, Nati (1988). "The influence of variables on Boolean functions.". Proc. 29th Symp. on Foundations of Computer Science. SFCS'88. White Plains: IEEE. pp. 68–80. doi:10.1109/SFCS.1988.21923.
  8. ^ Ben-Or, Michael; Linial, Nathan (1985). "Collective coin flipping, robust voting schemes and minima of Banzhaf values". Proc. 26th Symp. on Foundations of Computer Science. SFCS'85. Portland, Oregon: IEEE. pp. 408–416. doi:10.1109/SFCS.1985.15.
  9. ^ Friedgut, Ehud (1998). "Boolean functions with low average sensitivity depend on few coordinates". Combinatorica. 18 (1): 474–483. CiteSeerX 10.1.1.7.5597. doi:10.1007/PL00009809. S2CID 15534278.
  10. ^ Mossel, Elchanan; O'Donnell, Ryan; Oleszkiewicz, Krzysztof (2010). "Noise stability of functions with low influences: Invariance and optimality". Annals of Mathematics. 171 (1): 295–341. arXiv:math/0503503. doi:10.4007/annals.2010.171.295.
  11. ^ Blum, Manuel; Luby, Michael; Rubinfeld, Ronitt (1993). "Self-testing/correcting with applications to numerical problems". J. Comput. Syst. Sci. 47 (3): 549–595. doi:10.1016/0022-0000(93)90044-W.
  12. ^ Bellare, Mihir; Coppersmith, Don; Håstad, Johan; Kiwi, Marcos; Sudan, Madhu (1995). "Linearity testing in characteristic two". Proc. 36th Symp. on Foundations of Computer Science. FOCS'95.
  13. ^ Kalai, Gil (2002). "A Fourier-theoretic perspective on the Condorcet paradox and Arrow's theorem" (PDF). Advances in Applied Mathematics. 29 (3): 412–426. doi:10.1016/S0196-8858(02)00023-4.
  14. ^ Friedgut, Ehud (1999). "Sharp thresholds of graph properties and the k-SAT problem". Journal of the American Mathematical Society. 12 (4): 1017–1054. doi:10.1090/S0894-0347-99-00305-7.
  15. ^ Friedgut, Ehud; Kalai, Gil (1996). "Every monotone graph property has a sharp threshold". Proceedings of the American Mathematical Society. 124 (10): 2993–3002. doi:10.1090/S0002-9939-96-03732-X.
  16. ^ De, Anindya; Mossel, Elchanan; Neeman, Joe (2016), "Majority is Stablest: Discrete and SoS" (PDF), Theory of Computing, 12 (4): 1–50, CiteSeerX 10.1.1.757.3048, doi:10.4086/toc.2016.v012a004
  17. ^ Eldan, Ronen; Mikulincer, Dan; Raghavendra, Prasad (June 2023). "Noise stability on the Boolean hypercube via a renormalized Brownian motion". STOC 2023: Proceedings of the 55th Annual ACM Symposium on Theory of Computing. STOC. Orlando, Florida: ACM. pp. 661–671. arXiv:2208.06508. doi:10.1145/3564246.3585118.
  18. ^ Khot, Subhash; Kindler, Guy; Mossel, Elchanan; O'Donnell, Ryan (2007), "Optimal inapproximability results for MAX-CUT and other two-variable CSPs?" (PDF), SIAM Journal on Computing, 37 (1): 319–357, CiteSeerX 10.1.1.130.8886, doi:10.1137/S0097539705447372, S2CID 2090495

analysis, boolean, functions, mathematics, theoretical, computer, science, analysis, boolean, functions, study, real, valued, functions, displaystyle, displaystyle, such, functions, sometimes, known, pseudo, boolean, functions, from, spectral, perspective, fun. In mathematics and theoretical computer science analysis of Boolean functions is the study of real valued functions on 0 1 n displaystyle 0 1 n or 1 1 n displaystyle 1 1 n such functions are sometimes known as pseudo Boolean functions from a spectral perspective 1 The functions studied are often but not always Boolean valued making them Boolean functions The area has found many applications in combinatorics social choice theory random graphs and theoretical computer science especially in hardness of approximation property testing and PAC learning Contents 1 Basic concepts 1 1 Fourier expansion 1 2 Fourier degree and Fourier levels 1 3 Influence 1 4 Noise stability 1 5 Noise operator 1 6 Hypercontractivity 1 7 p Biased analysis 1 7 1 Influence 1 7 2 Noise operator 1 7 3 Russo Margulis formula 1 8 Gaussian space 2 Basic results 2 1 Friedgut Kalai Naor theorem 2 2 Kahn Kalai Linial theorem 2 3 Friedgut s junta theorem 2 4 Invariance principle 3 Some applications 3 1 Linearity testing 3 2 Arrow s theorem 3 3 Sharp thresholds 3 4 Majority is stablest 4 ReferencesBasic concepts editWe will mostly consider functions defined on the domain 1 1 n displaystyle 1 1 n nbsp Sometimes it is more convenient to work with the domain 0 1 n displaystyle 0 1 n nbsp instead If f displaystyle f nbsp is defined on 1 1 n displaystyle 1 1 n nbsp then the corresponding function defined on 0 1 n displaystyle 0 1 n nbsp is f 01 x 1 x n f 1 x 1 1 x n displaystyle f 01 x 1 ldots x n f 1 x 1 ldots 1 x n nbsp Similarly for us a Boolean function is a 1 1 displaystyle 1 1 nbsp valued function though often it is more convenient to consider 0 1 displaystyle 0 1 nbsp valued functions instead Fourier expansion edit Every real valued function f 1 
1 n R displaystyle f colon 1 1 n to mathbb R nbsp has a unique expansion as a multilinear polynomial f x S n f S x S x x S x i S x i displaystyle f x sum S subseteq n hat f S chi S x quad chi S x prod i in S x i nbsp Note that even if the function is 0 1 valued this is not a sum mod 2 but just an ordinary sum of real numbers This is the Hadamard transform of the function f displaystyle f nbsp which is the Fourier transform in the group Z 2 n displaystyle mathbb Z 2 n nbsp The coefficients f S displaystyle hat f S nbsp are known as Fourier coefficients and the entire sum is known as the Fourier expansion of f displaystyle f nbsp The functions x S displaystyle chi S nbsp are known as Fourier characters and they form an orthonormal basis for the space of all functions over 1 1 n displaystyle 1 1 n nbsp with respect to the inner product f g 2 n x 1 1 n f x g x displaystyle langle f g rangle 2 n sum x in 1 1 n f x g x nbsp The Fourier coefficients can be calculated using an inner product f S f x S displaystyle hat f S langle f chi S rangle nbsp In particular this shows that f E f displaystyle hat f emptyset operatorname E f nbsp where the expected value is taken with respect to the uniform distribution over 1 1 n displaystyle 1 1 n nbsp Parseval s identity states that f 2 E f 2 S f S 2 displaystyle f 2 operatorname E f 2 sum S hat f S 2 nbsp If we skip S displaystyle S emptyset nbsp then we get the variance of f displaystyle f nbsp Var f S f S 2 displaystyle operatorname Var f sum S neq emptyset hat f S 2 nbsp Fourier degree and Fourier levels edit The degree of a function f 1 1 n R displaystyle f colon 1 1 n to mathbb R nbsp is the maximum d displaystyle d nbsp such that f S 0 displaystyle hat f S neq 0 nbsp for some set S displaystyle S nbsp of size d displaystyle d nbsp In other words the degree of f displaystyle f nbsp is its degree as a multilinear polynomial It is convenient to decompose the Fourier expansion into levels the Fourier coefficient f S displaystyle 
hat f S nbsp is on level S displaystyle S nbsp The degree d displaystyle d nbsp part of f displaystyle f nbsp is f d S d f S x S displaystyle f d sum S d hat f S chi S nbsp It is obtained from f displaystyle f nbsp by zeroing out all Fourier coefficients not on level d displaystyle d nbsp We similarly define f gt d f lt d f d f d displaystyle f gt d f lt d f geq d f leq d nbsp Influence edit The i displaystyle i nbsp th influence of a function f 1 1 n R displaystyle f colon 1 1 n to mathbb R nbsp can be defined in two equivalent ways Inf i f E f f i 2 2 S i f S 2 f i x 1 x n f x 1 x i 1 x i x i 1 x n displaystyle begin aligned amp operatorname Inf i f operatorname E left left frac f f oplus i 2 right 2 right sum S ni i hat f S 2 5pt amp f oplus i x 1 ldots x n f x 1 ldots x i 1 x i x i 1 ldots x n end aligned nbsp If f displaystyle f nbsp is Boolean then Inf i f displaystyle operatorname Inf i f nbsp is the probability that flipping the i displaystyle i nbsp th coordinate flips the value of the function Inf i f Pr f x f i x displaystyle operatorname Inf i f Pr f x neq f oplus i x nbsp If Inf i f 0 displaystyle operatorname Inf i f 0 nbsp then f displaystyle f nbsp doesn t depend on the i displaystyle i nbsp th coordinate The total influence of f displaystyle f nbsp is the sum of all of its influences Inf f i 1 n Inf i f S S f S 2 displaystyle operatorname Inf f sum i 1 n operatorname Inf i f sum S S hat f S 2 nbsp The total influence of a Boolean function is also the average sensitivity of the function The sensitivity of a Boolean function f displaystyle f nbsp at a given point is the number of coordinates i displaystyle i nbsp such that if we flip the i displaystyle i nbsp th coordinate the value of the function changes The average value of this quantity is exactly the total influence The total influence can also be defined using the discrete Laplacian of the Hamming graph suitably normalized Inf f f L f displaystyle operatorname Inf f langle f Lf rangle nbsp A 
generalized form of influence is the r displaystyle rho nbsp stable influence defined by Inf i r f Stab r D i f S i r S 1 f S 2 displaystyle operatorname Inf i rho f operatorname Stab rho operatorname D i f sum S ni i rho S 1 hat f S 2 nbsp The corresponding total influences is I r f d d r Stab r f S S r S 1 f S 2 displaystyle operatorname I rho f frac d d rho operatorname Stab rho f sum S S rho S 1 hat f S 2 nbsp One can prove that a function f 1 1 n 1 1 displaystyle f 1 1 n to 1 1 nbsp has at most constantly many stably influential coordinates i n Inf i 1 d f ϵ 1 d ϵ displaystyle i in n operatorname Inf i 1 delta f geq epsilon leq frac 1 delta epsilon nbsp Noise stability edit Given 1 r 1 displaystyle 1 leq rho leq 1 nbsp we say that two random vectors x y 1 1 n displaystyle x y in 1 1 n nbsp are r displaystyle rho nbsp correlated if the marginal distributions of x y displaystyle x y nbsp are uniform and E x i y i r displaystyle operatorname E x i y i rho nbsp Concretely we can generate a pair of r displaystyle rho nbsp correlated random variables by first choosing x z 1 1 n displaystyle x z in 1 1 n nbsp uniformly at random and then choosing y displaystyle y nbsp according to one of the following two equivalent rules applied independently to each coordinate y i x i w p r z i w p 1 r or y i x i w p 1 r 2 x i w p 1 r 2 displaystyle y i begin cases x i amp text w p rho z i amp text w p 1 rho end cases quad text or quad y i begin cases x i amp text w p frac 1 rho 2 x i amp text w p frac 1 rho 2 end cases nbsp We denote this distribution by y N r x displaystyle y sim N rho x nbsp The noise stability of a function f 1 1 n R displaystyle f colon 1 1 n to mathbb R nbsp at r displaystyle rho nbsp can be defined in two equivalent ways Stab r f E x y N r x f x f y S n r S f S 2 displaystyle operatorname Stab rho f operatorname E x y sim N rho x f x f y sum S subseteq n rho S hat f S 2 nbsp For 0 d 1 displaystyle 0 leq delta leq 1 nbsp the noise sensitivity of f 
displaystyle f nbsp at d displaystyle delta nbsp is NS d f 1 2 1 2 Stab 1 2 d f displaystyle operatorname NS delta f frac 1 2 frac 1 2 operatorname Stab 1 2 delta f nbsp If f displaystyle f nbsp is Boolean then this is the probability that the value of f displaystyle f nbsp changes if we flip each coordinate with probability d displaystyle delta nbsp independently Noise operator edit The noise operator T r displaystyle T rho nbsp is an operator taking a function f 1 1 n R displaystyle f colon 1 1 n to mathbb R nbsp and returning another function T r f 1 1 n R displaystyle T rho f colon 1 1 n to mathbb R nbsp given by T r f x E y N r x f y S n r S f S x S displaystyle T rho f x operatorname E y sim N rho x f y sum S subseteq n rho S hat f S chi S nbsp When r gt 0 displaystyle rho gt 0 nbsp the noise operator can also be defined using a continuous time Markov chain in which each bit is flipped independently with rate 1 The operator T r displaystyle T rho nbsp corresponds to running this Markov chain for 1 2 log 1 r displaystyle frac 1 2 log frac 1 rho nbsp steps starting at x displaystyle x nbsp and taking the average value of f displaystyle f nbsp at the final state This Markov chain is generated by the Laplacian of the Hamming graph and this relates total influence to the noise operator Noise stability can be defined in terms of the noise operator Stab r f f T r f displaystyle operatorname Stab rho f langle f T rho f rangle nbsp Hypercontractivity edit For 1 q lt displaystyle 1 leq q lt infty nbsp the L q displaystyle L q nbsp norm of a function f 1 1 n R displaystyle f colon 1 1 n to mathbb R nbsp is defined by f q E f q q displaystyle f q sqrt q operatorname E f q nbsp We also define f max x 1 1 n f x displaystyle f infty max x in 1 1 n f x nbsp The hypercontractivity theorem states that for any q gt 2 displaystyle q gt 2 nbsp and q 1 1 1 q displaystyle q 1 1 1 q nbsp T r f q f 2 and T r f 2 f q displaystyle T rho f q leq f 2 quad text and quad T rho f 2 leq f q 
nbsp Hypercontractivity is closely related to the logarithmic Sobolev inequalities of functional analysis 2 A similar result for q lt 2 displaystyle q lt 2 nbsp is known as reverse hypercontractivity 3 p Biased analysis edit In many situations the input to the function is not uniformly distributed over 1 1 n displaystyle 1 1 n nbsp but instead has a bias toward 1 displaystyle 1 nbsp or 1 displaystyle 1 nbsp In these situations it is customary to consider functions over the domain 0 1 n displaystyle 0 1 n nbsp For 0 lt p lt 1 displaystyle 0 lt p lt 1 nbsp the p biased measure m p displaystyle mu p nbsp is given by m p x p i x i 1 p i 1 x i displaystyle mu p x p sum i x i 1 p sum i 1 x i nbsp This measure can be generated by choosing each coordinate independently to be 1 with probability p displaystyle p nbsp and 0 with probability 1 p displaystyle 1 p nbsp The classical Fourier characters are no longer orthogonal with respect to this measure Instead we use the following characters w S x p 1 p i S x i 0 1 p p i S x i 1 displaystyle omega S x left sqrt frac p 1 p right i in S x i 0 left sqrt frac 1 p p right i in S x i 1 nbsp The p biased Fourier expansion of f displaystyle f nbsp is the expansion of f displaystyle f nbsp as a linear combination of p biased characters f S n f S w S displaystyle f sum S subseteq n hat f S omega S nbsp We can extend the definitions of influence and the noise operator to the p biased setting by using their spectral definitions Influence edit The i displaystyle i nbsp s influence is given by Inf i f S i f S 2 p 1 p E f f i 2 displaystyle operatorname Inf i f sum S ni i hat f S 2 p 1 p operatorname E f f oplus i 2 nbsp The total influence is the sum of the individual influences Inf f i 1 n Inf i f S S f S 2 displaystyle operatorname Inf f sum i 1 n operatorname Inf i f sum S S hat f S 2 nbsp Noise operator edit A pair of r displaystyle rho nbsp correlated random variables can be obtained by choosing x z m p displaystyle x z sim mu p nbsp 
independently and y N r x displaystyle y sim N rho x nbsp where N r displaystyle N rho nbsp is given by y i x i w p r z i w p 1 r displaystyle y i begin cases x i amp text w p rho z i amp text w p 1 rho end cases nbsp The noise operator is then given by T r f x S n r S f S w S x E y N r x f y displaystyle T rho f x sum S subseteq n rho S hat f S omega S x operatorname E y sim N rho x f y nbsp Using this we can define the noise stability and the noise sensitivity as before Russo Margulis formula edit The Russo Margulis formula also called the Margulis Russo formula 1 states that for monotone Boolean functions f 0 1 n 0 1 displaystyle f colon 0 1 n to 0 1 nbsp d d p E x m p f x Inf f p 1 p i 1 n Pr f f i displaystyle frac d dp operatorname E x sim mu p f x frac operatorname Inf f p 1 p sum i 1 n Pr f neq f oplus i nbsp Both the influence and the probabilities are taken with respect to m p displaystyle mu p nbsp and on the right hand side we have the average sensitivity of f displaystyle f nbsp If we think of f displaystyle f nbsp as a property then the formula states that as p displaystyle p nbsp varies the derivative of the probability that f displaystyle f nbsp occurs at p displaystyle p nbsp equals the average sensitivity at p displaystyle p nbsp The Russo Margulis formula is key for proving sharp threshold theorems such as Friedgut s Gaussian space edit One of the deepest results in the area the invariance principle connects the distribution of functions on the Boolean cube 1 1 n displaystyle 1 1 n nbsp to their distribution on Gaussian space which is the space R n displaystyle mathbb R n nbsp endowed with the standard n displaystyle n nbsp dimensional Gaussian measure Many of the basic concepts of Fourier analysis on the Boolean cube have counterparts in Gaussian space The counterpart of the Fourier expansion in Gaussian space is the Hermite expansion which is an expansion to an infinite sum converging in L 2 displaystyle L 2 nbsp of multivariate Hermite 
Gaussian space

One of the deepest results in the area, the invariance principle, connects the distribution of functions on the Boolean cube $\{-1,1\}^n$ to their distribution on Gaussian space, which is the space $\mathbb{R}^n$ endowed with the standard $n$-dimensional Gaussian measure.

Many of the basic concepts of Fourier analysis on the Boolean cube have counterparts in Gaussian space:

- The counterpart of the Fourier expansion in Gaussian space is the Hermite expansion, which is an expansion into an infinite sum (converging in $L^2$) of multivariate Hermite polynomials.
- The counterpart of total influence or average sensitivity, for the indicator function of a set, is Gaussian surface area, which is the Minkowski content of the boundary of the set.
- The counterpart of the noise operator is the Ornstein–Uhlenbeck operator (related to the Mehler transform), given by

$$U_\rho f(x) = \operatorname{E}_{z \sim N(0,I)}\left[f\left(\rho x + \sqrt{1-\rho^2}\, z\right)\right],$$

or alternatively by $U_\rho f(x) = \operatorname{E}[f(y)]$, where $(x, y)$ is a pair of $\rho$-correlated standard Gaussians.
- Hypercontractivity (with appropriate parameters) holds in Gaussian space as well.

Gaussian space is more symmetric than the Boolean cube (for example, it is rotation-invariant) and supports continuous arguments which may be harder to get through in the discrete setting of the Boolean cube. The invariance principle links the two settings, and allows deducing results on the Boolean cube from results on Gaussian space.

Basic results

Friedgut–Kalai–Naor theorem

If $f \colon \{-1,1\}^n \to \{-1,1\}$ has degree at most 1, then $f$ is either constant, equal to a coordinate, or equal to the negation of a coordinate. In particular, $f$ is a dictatorship: a function depending on at most one coordinate.

The Friedgut–Kalai–Naor theorem,[4] also known as the FKN theorem, states that if $f$ almost has degree 1, then it is close to a dictatorship. Quantitatively, if $f \colon \{-1,1\}^n \to \{-1,1\}$ and $\|f^{>1}\|^2 < \varepsilon$, then $f$ is $O(\varepsilon)$-close to a dictatorship; that is, $\|f - g\|^2 = O(\varepsilon)$ for some Boolean dictatorship $g$, or equivalently, $\Pr[f \neq g] = O(\varepsilon)$ for some Boolean dictatorship $g$.
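The exact degree-1 statement can be confirmed by exhaustive search for small $n$. The sketch below (with $n = 3$, an arbitrary illustrative choice) enumerates all 256 Boolean functions on three bits and checks that those whose Fourier weight sits entirely on levels 0 and 1 are exactly the eight dictatorships:

```python
from itertools import product

n = 3
points = list(product([-1, 1], repeat=n))
subsets = [tuple(i for i in range(n) if b[i]) for b in product([0, 1], repeat=n)]

def chi(S, x):  # Fourier character: product of the coordinates in S
    r = 1
    for i in S:
        r *= x[i]
    return r

def fhat(f, S):  # Fourier coefficient under the uniform measure
    return sum(f[x] * chi(S, x) for x in points) / len(points)

# The dictatorships: the two constants and the six (possibly negated) coordinates.
dictators = [{x: 1 for x in points}, {x: -1 for x in points}]
for i in range(n):
    dictators.append({x: x[i] for x in points})
    dictators.append({x: -x[i] for x in points})

count = 0
for values in product([-1, 1], repeat=len(points)):
    f = dict(zip(points, values))
    degree_le_1 = all(abs(fhat(f, S)) < 1e-9 for S in subsets if len(S) >= 2)
    if degree_le_1:
        count += 1
        assert any(all(f[x] == g[x] for x in points) for g in dictators)

assert count == 8  # only the dictatorships have degree at most 1
```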
Similarly, a Boolean function of degree at most $d$ depends on at most $C_W \cdot 2^d$ coordinates, making it a junta (a function depending on a constant number of coordinates), where $C_W$ is an absolute constant equal to at least 1.5 and at most 4.41, as shown by Wellens.[5] The Kindler–Safra theorem[6] generalizes the Friedgut–Kalai–Naor theorem to this setting. It states that if $f \colon \{-1,1\}^n \to \{-1,1\}$ satisfies $\|f^{>d}\|^2 < \varepsilon$, then $f$ is $O(\varepsilon)$-close to a Boolean function of degree at most $d$.

Kahn–Kalai–Linial theorem

The Poincaré inequality for the Boolean cube (which follows from formulas appearing above) states that for a function $f \colon \{-1,1\}^n \to \mathbb{R}$,

$$\operatorname{Var}[f] \leq \operatorname{Inf}[f] \leq \deg f \cdot \operatorname{Var}[f].$$

This implies that $\max_i \operatorname{Inf}_i[f] \geq \operatorname{Var}[f]/n$.

The Kahn–Kalai–Linial theorem,[7] also known as the KKL theorem, states that if $f$ is Boolean, then

$$\max_i \operatorname{Inf}_i[f] = \Omega\left(\frac{\log n}{n}\right).$$

The bound given by the Kahn–Kalai–Linial theorem is tight, and is achieved by the Tribes function of Ben-Or and Linial:[8]

$$(x_{1,1} \land \cdots \land x_{1,w}) \lor \cdots \lor (x_{2^w,1} \land \cdots \land x_{2^w,w}).$$

The Kahn–Kalai–Linial theorem was one of the first results in the area, and the one introducing hypercontractivity into the context of Boolean functions.
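For small $w$ the influences of the Tribes function can be computed exactly by enumeration. The sketch below (with $w = 2$, so $n = 8$; an arbitrary illustrative choice) shows that every coordinate has the same influence, small yet still above the trivial lower bound $\operatorname{Var}[f]/n$:

```python
from itertools import product

w = 2
tribes_count = 2 ** w          # the Tribes function uses 2^w tribes of size w
n = tribes_count * w           # n = 8 for w = 2

def tribes(x):  # OR of ANDs over consecutive blocks of w bits; +-1 valued
    hit = any(all(x[t * w + j] for j in range(w)) for t in range(tribes_count))
    return 1 if hit else -1

points = list(product([0, 1], repeat=n))

def influence(i):  # Pr over uniform x that flipping bit i changes the output
    changed = 0
    for x in points:
        y = list(x); y[i] ^= 1
        if tribes(x) != tribes(tuple(y)):
            changed += 1
    return changed / len(points)

infs = [influence(i) for i in range(n)]
mean = sum(tribes(x) for x in points) / len(points)
var = 1 - mean ** 2  # Var[f] for a +-1-valued function

# All coordinates have equal, small influence, well above Var[f]/n.
assert max(infs) - min(infs) < 1e-12
assert max(infs) >= var / n
```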
Friedgut's junta theorem

If $f \colon \{-1,1\}^n \to \{-1,1\}$ is an $M$-junta (a function depending on at most $M$ coordinates), then $\operatorname{Inf}[f] \leq M$, according to the Poincaré inequality. Friedgut's theorem[9] is a converse to this result. It states that for any $\varepsilon > 0$, the function $f$ is $\varepsilon$-close to a Boolean junta depending on $\exp(\operatorname{Inf}[f]/\varepsilon)$ coordinates.

Combined with the Russo–Margulis lemma, Friedgut's junta theorem implies that for every $p$, every monotone function is close to a junta with respect to $\mu_q$ for some $q \approx p$.

Invariance principle

The invariance principle[10] generalizes the Berry–Esseen theorem to non-linear functions.

The Berry–Esseen theorem states (among other things) that if $f = \sum_{i=1}^n c_i x_i$ and no $c_i$ is too large compared to the rest, then the distribution of $f$ over $\{-1,1\}^n$ is close to a normal distribution with the same mean and variance.

The invariance principle (in a special case) informally states that if $f$ is a multilinear polynomial of bounded degree over $x_1, \ldots, x_n$ and all influences of $f$ are small, then the distribution of $f$ under the uniform measure over $\{-1,1\}^n$ is close to its distribution in Gaussian space.

More formally, let $\psi$ be a univariate Lipschitz function, let $f = \sum_{S \subseteq [n]} \hat{f}(S) \chi_S$, let $k = \deg f$, and let $\varepsilon = \max_i \sum_{S \ni i} \hat{f}(S)^2$. Suppose that $\sum_{S \neq \emptyset} \hat{f}(S)^2 \leq 1$. Then

$$\left| \operatorname{E}_{x \sim \{-1,1\}^n}[\psi(f(x))] - \operatorname{E}_{g \sim N(0,I)}[\psi(f(g))] \right| = O\left(k 9^k \varepsilon\right).$$

By choosing appropriate $\psi$, this implies that the distributions of $f$ under both measures are close in CDF distance, which is given by $\sup_t |\Pr[f(x) < t] - \Pr[f(g) < t]|$.
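A degree-1 example already illustrates the flavor of the principle: for $f = (x_1 + \cdots + x_n)/\sqrt{n}$, every influence is $1/n$, and under Gaussian inputs $f(g)$ is exactly a standard Gaussian, so the CDF distance can be computed exactly. In the sketch below, $n = 50$ and the cutoff 0.07 are illustrative choices, not bounds from the article:

```python
from math import comb, erf, sqrt

n = 50
# f(x) = (x_1 + ... + x_n)/sqrt(n): degree 1, each influence equal to 1/n.
# Under the uniform measure, f takes value (2k - n)/sqrt(n) with prob C(n,k)/2^n.

def boolean_cdf(t):  # Pr[f(x) <= t] under uniform {-1,1}^n
    return sum(comb(n, k) for k in range(n + 1)
               if (2 * k - n) / sqrt(n) <= t) / 2 ** n

def gauss_cdf(t):  # Pr[g <= t] for a standard Gaussian; f(g) ~ N(0,1) exactly
    return 0.5 * (1 + erf(t / sqrt(2)))

# The supremum of the CDF difference is attained at (or just below) the atoms
# of the discrete distribution, so it suffices to check both sides of each atom.
atoms = [(2 * k - n) / sqrt(n) for k in range(n + 1)]
dist = max(max(abs(boolean_cdf(t) - gauss_cdf(t)),
               abs(boolean_cdf(t - 1e-9) - gauss_cdf(t))) for t in atoms)
assert dist < 0.07  # already small for n = 50, and shrinking as n grows
```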
The invariance principle was the key ingredient in the original proof of the Majority is Stablest theorem.

Some applications

Linearity testing

A Boolean function $f \colon \{-1,1\}^n \to \{-1,1\}$ is linear if it satisfies $f(xy) = f(x)f(y)$, where $xy = (x_1 y_1, \ldots, x_n y_n)$. It is not hard to show that the Boolean linear functions are exactly the characters $\chi_S$.

In property testing we want to test whether a given function is linear. It is natural to try the following test: choose $x, y \in \{-1,1\}^n$ uniformly at random, and check that $f(xy) = f(x)f(y)$. If $f$ is linear, then it always passes the test. Blum, Luby and Rubinfeld[11] showed that if the test passes with probability $1 - \varepsilon$, then $f$ is $O(\varepsilon)$-close to a Fourier character. Their proof was combinatorial.

Bellare et al.[12] gave an extremely simple Fourier-analytic proof, which also shows that if the test succeeds with probability $1/2 + \varepsilon$, then $f$ is correlated with a Fourier character. Their proof relies on the following formula for the success probability of the test:

$$\frac{1}{2} + \frac{1}{2} \sum_{S \subseteq [n]} \hat{f}(S)^3.$$
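The success-probability formula can be checked exactly for a small non-linear function. The sketch below (using 3-bit majority as an arbitrary example) compares exhaustive counting of test outcomes against $\frac12 + \frac12\sum_S \hat f(S)^3$:

```python
from itertools import product

n = 3
points = list(product([-1, 1], repeat=n))
subsets = [tuple(i for i in range(n) if b[i]) for b in product([0, 1], repeat=n)]

maj = {x: 1 if sum(x) > 0 else -1 for x in points}  # a non-linear example

def fhat(f, S):  # Fourier coefficient under the uniform measure
    total = 0
    for x in points:
        c = 1
        for i in S:
            c *= x[i]
        total += f[x] * c
    return total / len(points)

# Exact success probability of the BLR test: fraction of pairs (x, y)
# with f(xy) = f(x) f(y), where xy is the coordinatewise product.
success = sum(1 for x in points for y in points
              if maj[tuple(a * b for a, b in zip(x, y))] == maj[x] * maj[y]) \
          / len(points) ** 2

formula = 0.5 + 0.5 * sum(fhat(maj, S) ** 3 for S in subsets)
assert abs(success - formula) < 1e-12
```

For 3-bit majority the Fourier coefficients are $\pm\frac12$ on levels 1 and 3, so both sides come out to $\frac12 + \frac12 \cdot \frac14 = 0.625$.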
Arrow's theorem

Arrow's impossibility theorem states that for three and more candidates, the only unanimous voting rule for which there is always a Condorcet winner is a dictatorship.

The usual proof of Arrow's theorem is combinatorial. Kalai[13] gave an alternative proof of this result in the case of three candidates using Fourier analysis. If $f \colon \{-1,1\}^n \to \{-1,1\}$ is the rule that assigns a winner among two candidates given their relative orders in the votes, then the probability that there is a Condorcet winner given a uniformly random vote is

$$\frac{3}{4} - \frac{3}{4} \operatorname{Stab}_{-1/3}[f],$$

from which the theorem easily follows.

The FKN theorem implies that if $f$ is a rule for which there is almost always a Condorcet winner, then $f$ is close to a dictatorship.

Sharp thresholds

A classical result in the theory of random graphs states that the probability that a $G(n,p)$ random graph is connected tends to $e^{-e^{-c}}$ if $p \sim \frac{\log n + c}{n}$. This is an example of a sharp threshold: the width of the threshold window, which is $O(1/n)$, is asymptotically smaller than the threshold itself, which is roughly $\frac{\log n}{n}$. In contrast, the probability that a $G(n,p)$ graph contains a triangle tends to $1 - e^{-c^3/6}$ when $p \sim \frac{c}{n}$. Here both the threshold window and the threshold itself are $\Theta(1/n)$, and so this is a coarse threshold.

Friedgut's sharp threshold theorem[14] states, roughly speaking, that a monotone graph property (a graph property is a property which doesn't depend on the names of the vertices) has a sharp threshold unless it is correlated with the appearance of small subgraphs. This theorem has been widely applied to analyze random graphs and percolation.

On a related note, the KKL theorem implies that the width of the threshold window is always at most $O(1/\log n)$.[15]
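Kalai's formula from the Arrow's theorem section can be sanity-checked in the smallest case, $n = 3$ voters, where the probability of a Condorcet winner under pairwise majority is the classical $17/18$. The sketch below (helper names are illustrative) enumerates all $6^3$ voter profiles and compares against the formula:

```python
from fractions import Fraction
from itertools import permutations, product

candidates = "ABC"
rankings = list(permutations(candidates))  # 6 possible strict rankings per voter

def maj_prefers(profile, a, b):  # does a majority of voters rank a above b?
    votes = sum(1 if r.index(a) < r.index(b) else -1 for r in profile)
    return votes > 0

def has_condorcet_winner(profile):
    return any(all(maj_prefers(profile, a, b) for b in candidates if b != a)
               for a in candidates)

profiles = list(product(rankings, repeat=3))  # 6^3 = 216 voter profiles
count = sum(1 for prof in profiles if has_condorcet_winner(prof))
prob = Fraction(count, len(profiles))
assert prob == Fraction(17, 18)

# Matches Kalai's formula with f = Maj_3, whose noise stability is
# Stab_rho = (3/4) rho + (1/4) rho^3, so Stab_{-1/3} = -7/27 and
# 3/4 - (3/4)(-7/27) = 17/18.
stab = Fraction(3, 4) * Fraction(-1, 3) + Fraction(1, 4) * Fraction(-1, 3) ** 3
assert Fraction(3, 4) - Fraction(3, 4) * stab == prob
```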
Majority is stablest

Let $\operatorname{Maj}_n \colon \{-1,1\}^n \to \{-1,1\}$ denote the majority function on $n$ coordinates. Sheppard's formula gives the asymptotic noise stability of majority:

$$\operatorname{Stab}_\rho[\operatorname{Maj}_n] \longrightarrow 1 - \frac{2}{\pi} \arccos \rho \quad \text{as } n \to \infty.$$

This is related to the probability that if we choose $x \in \{-1,1\}^n$ uniformly at random and form $y \in \{-1,1\}^n$ by flipping each bit of $x$ with probability $\frac{1-\rho}{2}$, then the majority stays the same:

$$\operatorname{Stab}_\rho[\operatorname{Maj}_n] = 2\Pr[\operatorname{Maj}_n(x) = \operatorname{Maj}_n(y)] - 1.$$

There are Boolean functions with larger noise stability; for example, a dictatorship $x_i$ has noise stability $\rho$.

The Majority is Stablest theorem states, informally, that the only functions having noise stability larger than majority have influential coordinates. Formally, for every $\varepsilon > 0$ there exists $\tau > 0$ such that if $f \colon \{-1,1\}^n \to \{-1,1\}$ has expectation zero and $\max_i \operatorname{Inf}_i[f] \leq \tau$, then

$$\operatorname{Stab}_\rho[f] \leq 1 - \frac{2}{\pi} \arccos \rho + \varepsilon.$$

The first proof of this theorem used the invariance principle in conjunction with an isoperimetric theorem of Borell in Gaussian space; since then, more direct proofs were devised.[16][17]

Majority is Stablest implies that the Goemans–Williamson approximation algorithm for MAX-CUT is optimal, assuming the unique games conjecture. This implication, due to Khot et al.,[18] was the impetus behind proving the theorem.

References

1. O'Donnell, Ryan (2014). Analysis of Boolean Functions. Cambridge University Press. arXiv:2105.10386. ISBN 978-1-107-03832-5.
2. Diaconis, P.; Saloff-Coste, L. (August 1996). "Logarithmic Sobolev inequalities for finite Markov chains". Annals of Applied Probability. 6 (3): 695–750. doi:10.1214/aoap/1034968224.
3. Mossel, Elchanan; Oleszkiewicz, Krzysztof; Sen, Arnab (2013). "On reverse hypercontractivity". Geometric and Functional Analysis. 23 (3): 1062–1097. arXiv:1108.1210. doi:10.1007/s00039-013-0229-4.
4. Friedgut, Ehud; Kalai, Gil; Naor, Assaf (2002). "Boolean functions whose Fourier transform is concentrated on the first two levels". Advances in Applied Mathematics. 29 (3): 427–437. doi:10.1016/S0196-8858(02)00024-6.
5. Wellens, Jake (2020). "Relationships between the number of inputs and other complexity measures of Boolean functions". arXiv:2005.00566. doi:10.19086/da.57741.
6. Kindler, Guy (2002). "Chapter 16" (PDF). Property testing, PCP and juntas (Thesis). Tel Aviv University.
7. Kahn, Jeff; Kalai, Gil; Linial, Nati (1988). "The influence of variables on Boolean functions". Proc. 29th Symp. on Foundations of Computer Science. SFCS'88. White Plains: IEEE. pp. 68–80. doi:10.1109/SFCS.1988.21923.
8. Ben-Or, Michael; Linial, Nathan (1985). "Collective coin flipping, robust voting schemes and minima of Banzhaf values". Proc. 26th Symp. on Foundations of Computer Science. SFCS'85. Portland, Oregon: IEEE. pp. 408–416. doi:10.1109/SFCS.1985.15.
9. Friedgut, Ehud (1998). "Boolean functions with low average sensitivity depend on few coordinates". Combinatorica. 18 (1): 474–483. doi:10.1007/PL00009809.
10. Mossel, Elchanan; O'Donnell, Ryan; Oleszkiewicz, Krzysztof (2010). "Noise stability of functions with low influences: Invariance and optimality". Annals of Mathematics. 171 (1): 295–341. arXiv:math/0503503. doi:10.4007/annals.2010.171.295.
11. Blum, Manuel; Luby, Michael; Rubinfeld, Ronitt (1993). "Self-testing/correcting with applications to numerical problems". J. Comput. Syst. Sci. 47 (3): 549–595. doi:10.1016/0022-0000(93)90044-W.
12. Bellare, Mihir; Coppersmith, Don; Håstad, Johan; Kiwi, Marcos; Sudan, Madhu (1995). "Linearity testing in characteristic two". Proc. 36th Symp. on Foundations of Computer Science. FOCS'95.
13. Kalai, Gil (2002). "A Fourier-theoretic perspective on the Condorcet paradox and Arrow's theorem" (PDF). Advances in Applied Mathematics. 29 (3): 412–426. doi:10.1016/S0196-8858(02)00023-4.
14. Friedgut, Ehud (1999). "Sharp thresholds of graph properties, and the k-SAT problem". Journal of the American Mathematical Society. 12 (4): 1017–1054. doi:10.1090/S0894-0347-99-00305-7.
15. Friedgut, Ehud; Kalai, Gil (1996). "Every monotone graph property has a sharp threshold". Proceedings of the American Mathematical Society. 124 (10): 2993–3002. doi:10.1090/S0002-9939-96-03732-X.
16. De, Anindya; Mossel, Elchanan; Neeman, Joe (2016). "Majority is Stablest: Discrete and SoS" (PDF). Theory of Computing. 12 (4): 1–50. doi:10.4086/toc.2016.v012a004.
17. Eldan, Ronen; Mikulincer, Dan; Raghavendra, Prasad (June 2023). "Noise stability on the Boolean hypercube via a renormalized Brownian motion". STOC 2023: Proceedings of the 55th Annual ACM Symposium on Theory of Computing. Orlando, Florida: ACM. pp. 661–671. arXiv:2208.06508. doi:10.1145/3564246.3585118.
18. Khot, Subhash; Kindler, Guy; Mossel, Elchanan; O'Donnell, Ryan (2007). "Optimal inapproximability results for MAX-CUT and other two-variable CSPs?" (PDF). SIAM Journal on Computing. 37 (1): 319–357. doi:10.1137/S0097539705447372.
