Lewis's triviality result

In the mathematical theory of probability, David Lewis's triviality result is a theorem about the impossibility of systematically equating the conditional probability $P(B \mid A)$ with the probability of a so-called conditional event, $A \rightarrow B$.

Conditional probability and conditional events

The statement "The probability that if $A$, then $B$, is 20%" means (put intuitively) that event $B$ may be expected to occur in 20% of the outcomes where event $A$ occurs. The standard formal expression of this is $P(B \mid A) = 0.20$, where the conditional probability $P(B \mid A)$ equals, by definition, $P(A \cap B)/P(A)$.
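The definition can be checked on a small finite outcome space. The following is a minimal sketch, assuming a hypothetical space of ten equiprobable outcomes; the helper names `prob` and `cond_prob` and the particular events are illustrative and not part of the original article.

```python
from fractions import Fraction

def prob(event, weights):
    """Probability of an event (a set of outcomes) under the given outcome weights."""
    return sum((weights[o] for o in event), Fraction(0))

def cond_prob(b, a, weights):
    """P(B | A) = P(A ∩ B) / P(A); returns None when P(A) = 0 (undefined)."""
    p_a = prob(a, weights)
    if p_a == 0:
        return None
    return prob(a & b, weights) / p_a

# Hypothetical space: ten equiprobable outcomes 0..9.
weights = {o: Fraction(1, 10) for o in range(10)}
A = {0, 1, 2, 3, 4}
B = {4, 7, 8}

# B occurs in one of the five outcomes where A occurs, i.e. in 20% of them.
p_b_given_a = cond_prob(B, A, weights)
```

Exact fractions are used instead of floats so that equalities between probabilities can be tested without rounding error.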

Beginning in the 1960s, several philosophical logicians, most notably Ernest Adams and Robert Stalnaker, floated the idea that one might also write $P(A \rightarrow B) = 0.20$, where $A \rightarrow B$ is the conditional event "If $A$, then $B$".[1] That is, given events $A$ and $B$, one might suppose there is an event, $A \rightarrow B$, such that $P(A \rightarrow B)$ could be counted on to equal $P(B \mid A)$, so long as $P(A) > 0$.

Part of the appeal of this move would be the possibility of embedding conditional expressions within more complex constructions. One could write, say, $P(A \cup (B \rightarrow C)) = 0.75$, to express someone's high subjective degree of confidence ("75% sure") that either $A$, or else if $B$, then $C$. Compound constructions containing conditional expressions might also be useful in the programming of automated decision-making systems.[2]

 
Fig. 1 – A diagram of $A$, $B$, and $A \rightarrow B$. The $\rightarrow$ symbol is not assumed to represent any particular operation. Specifically, it is not assumed that $A \rightarrow B$ can be identified with $A' \cup B$.

How might such a convention be combined with standard probability theory? The most direct extension of the standard theory would be to treat $A \rightarrow B$ as an event like any other, i.e., as a set of outcomes. Adding $A \rightarrow B$ to the familiar Venn or Euler diagram of $A$ and $B$ would then result in something like Fig. 1, where $s, t, \ldots, z$ are probabilities allocated to the eight respective regions, such that $s + t + \cdots + z = 1$.

For $P(A \rightarrow B)$ to equal $P(B \mid A)$ requires that $t + v + w + y = (s + t)/(s + t + x + y)$, i.e., that the probability inside the $A \rightarrow B$ region equal the $A \cap B$ region's proportional share of the probability inside the $A$ region. In general the equality will of course not be true, so that making it reliably true requires a new constraint on probability functions: in addition to satisfying Kolmogorov's probability axioms, they must also satisfy a new constraint, namely that $P(A \rightarrow B) = P(B \mid A)$ for any events $A$ and $B$ such that $P(A) > 0$.
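The region equation can be evaluated directly for concrete allocations. The sketch below assumes only what the equation itself states: that $A \cap B$ consists of regions $s, t$, that $A$ consists of $s, t, x, y$, and that $A \rightarrow B$ consists of $t, v, w, y$. The function name `constraint_gap` and both allocations are hypothetical.

```python
from fractions import Fraction as F

def constraint_gap(p):
    """Return P(A -> B) - P(B | A) for region probabilities p; zero means the constraint holds."""
    a_and_b = p["s"] + p["t"]                     # P(A ∩ B)
    a = a_and_b + p["x"] + p["y"]                 # P(A)
    arrow = p["t"] + p["v"] + p["w"] + p["y"]     # P(A -> B)
    return arrow - a_and_b / a

# Allocation 1: all eight regions equally likely. The constraint happens to hold.
uniform = {r: F(1, 8) for r in "stuvwxyz"}

# Allocation 2: region s carries 3/10, the other seven carry 1/10 each.
skewed = {r: F(1, 10) for r in "stuvwxyz"}
skewed["s"] = F(3, 10)

gap_uniform = constraint_gap(uniform)   # 1/2 - 1/2
gap_skewed = constraint_gap(skewed)     # 2/5 - 2/3
```

The first allocation satisfies the equality and the second does not, which illustrates why the equality has to be imposed as an extra constraint instead of following from the Kolmogorov axioms.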

Lewis's result

Lewis (1976) pointed out a seemingly fatal problem with the above proposal: assuming a nontrivial set of events, the new, restricted class of $P$-functions will not be closed under conditioning, the operation that turns probability function $P$ into new function $P_C(\cdot) = P(\cdot \mid C)$, predicated on event $C$'s occurrence. That is, if $P(A \rightarrow B) = P(B \mid A)$, it will not in general be true that $P_C(A \rightarrow B) = P_C(B \mid A)$ as long as $P(C) > 0$. This implies that if rationality requires having a well-behaved probability function, then a fully rational person (or computing system) would become irrational simply in virtue of learning that arbitrary event $C$ had occurred. Bas van Fraassen called this result "a veritable bombshell" (1976, p. 273).

Lewis's proof is as follows. Let a set of events be non-trivial if it contains two possible events, $A$ and $B$, that are mutually exclusive but do not together exhaust all possibilities, so that $P(A) > 0$, $P(B) > 0$, $P(A \cap B) = 0$, and $P(A \cup B) < 1$. The existence of two such events implies the existence of the event $A \cup B$, as well, and, if conditional events are admitted, the event $(A \cup B) \rightarrow A$. The proof derives a contradiction from the assumption that such a minimally non-trivial set of events exists.

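  1. Consider the probability of $(A \cup B) \rightarrow A$ after conditioning, first on $A$ and then instead on $A'$.
    • Conditioning on $A$ gives $P_A((A \cup B) \rightarrow A) = P(((A \cup B) \rightarrow A) \cap A)/P(A)$. But also, by the new constraint on $P$-functions, $P_A((A \cup B) \rightarrow A) = P_A((A \cup B) \cap A)/P_A(A \cup B) = P((A \cup B) \cap A \mid A)/P(A \cup B \mid A) = 1/1 = 1$. Therefore, $P(((A \cup B) \rightarrow A) \cap A) = P(A)$.
    • Conditioning on $A'$ gives $P_{A'}((A \cup B) \rightarrow A) = P(((A \cup B) \rightarrow A) \cap A')/P(A')$. But also, $P_{A'}((A \cup B) \rightarrow A) = P_{A'}((A \cup B) \cap A)/P_{A'}(A \cup B) = P((A \cup B) \cap A \mid A')/P(A \cup B \mid A') = 0/P(A \cup B \mid A') = 0$. (The mutual exclusivity of $B$ and $A$, together with $P(B) > 0$, ensures that $P(A \cup B \mid A') \neq 0$.) Therefore, $P(((A \cup B) \rightarrow A) \cap A') = 0$.
  2. Instantiate the identity $P(X \cap Y) + P(X \cap Y') = P(X)$ as $P(((A \cup B) \rightarrow A) \cap A) + P(((A \cup B) \rightarrow A) \cap A') = P((A \cup B) \rightarrow A)$. By the results from Step 1, the left side reduces to $P(A)$, while the right side, by the new constraint on $P$-functions, equals $P((A \cup B) \cap A)/P(A \cup B) = P(A)/P(A \cup B)$. Therefore, $P(A) = P(A)/P(A \cup B)$, which means that $P(A \cup B) = 1$, which contradicts the stipulation that $P(A \cup B) < 1$. This completes the proof.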
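The two steps above can be illustrated by brute force on a small finite space. The sketch below is an illustration under stated assumptions, not a substitute for the proof: it assumes a hypothetical six-outcome space with disjoint $A$ and $B$ that leave some outcomes uncovered, and it tries every subset of the space as a candidate for the event $(A \cup B) \rightarrow A$. All names and weights are illustrative.

```python
from fractions import Fraction
from itertools import chain, combinations

def prob(event, weights):
    """Probability of a set of outcomes under the given outcome weights."""
    return sum((weights[o] for o in event), Fraction(0))

def condition(weights, c):
    """Return P_C: zero weight outside C, weights inside C rescaled by 1/P(C)."""
    p_c = prob(c, weights)
    return {o: (w / p_c if o in c else Fraction(0)) for o, w in weights.items()}

def powerset(outcomes):
    """Yield every subset of the outcome set as a Python set."""
    items = sorted(outcomes)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return (set(s) for s in subsets)

def satisfies_constraint(e, a, a_or_b, weights):
    """Check P(E) == P(A | A ∪ B), with E standing in for (A ∪ B) -> A."""
    return prob(e, weights) == prob(a & a_or_b, weights) / prob(a_or_b, weights)

# Hypothetical non-trivial setup: P(A) = 1/4, P(B) = 1/4, P(A ∪ B) = 1/2 < 1.
omega = set(range(6))
A, B = {0, 1}, {2, 3}
not_A = omega - A
eighth, quarter = Fraction(1, 8), Fraction(1, 4)
weights = {0: eighth, 1: eighth, 2: eighth, 3: eighth, 4: quarter, 5: quarter}

p_a = condition(weights, A)
p_not_a = condition(weights, not_A)

# Candidates that satisfy the constraint under P alone.
under_p_only = [e for e in powerset(omega) if satisfies_constraint(e, A, A | B, weights)]

# Candidates that still satisfy it after conditioning on A and on A'.
survivors = [e for e in under_p_only
             if satisfies_constraint(e, A, A | B, p_a)
             and satisfies_constraint(e, A, A | B, p_not_a)]
```

In this setup several candidate events satisfy the constraint under $P$ itself, but none continues to satisfy it under both $P_A$ and $P_{A'}$, which is the failure of closure under conditioning that the proof establishes in general.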

Graphical version

 
Fig. 2 – A diagram of disjoint $A$ and $B$, and $(A \cup B) \rightarrow A$.

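A graphical version of the proof starts with Fig. 2, where the $A$ and $B$ from Fig. 1 are now disjoint and $A \rightarrow B$ has been replaced by $(A \cup B) \rightarrow A$.[3] By the assumption that $A$ and $B$ are possible, $x + y > 0$ and $u + v > 0$. By the assumption that $A$ and $B$ do not together exhaust all possibilities, $u + v + x + y < 1$. And by the new constraint on probability functions, $P((A \cup B) \rightarrow A) = P(A \mid A \cup B) = P(A \cap (A \cup B))/P(A \cup B) = P(A)/P(A \cup B)$, which means that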

(1)  $y + v + w = \dfrac{x + y}{x + y + u + v}$

Conditioning on an event involves zeroing out the probabilities outside the event's region and increasing the probabilities inside the region by a common scale factor. Here, conditioning on $A$ will zero out $u$, $v$, and $w$ and scale up $x$ and $y$, to $x_A$ and $y_A$, respectively, and so

(2)  $y_A + 0 + 0 = \dfrac{x_A + y_A}{x_A + y_A + 0 + 0}$, which simplifies to $y_A = 1$

Conditioning instead on $A'$ will zero out $x$ and $y$ and scale up $u$, $v$, and $w$, to $u_{A'}$, $v_{A'}$, and $w_{A'}$, respectively, and so

(3)  $0 + v_{A'} + w_{A'} = \dfrac{0 + 0}{0 + 0 + u_{A'} + v_{A'}}$, which simplifies to $v_{A'} + w_{A'} = 0$

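From (2), it follows that $x_A = 0$ (because $x_A + y_A = 1$ after conditioning on $A$), and since $x_A$ is the scaled-up value of $x$, it must also be that $x = 0$. Similarly, from (3), $v = w = 0$. But then (1) reduces to $y = y/(y + u)$, which, since $y > 0$, implies that $y + u = 1$, which contradicts the stipulation that $u + v + x + y < 1$.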
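The zero-and-rescale picture of conditioning can be worked through with concrete numbers. The following sketch assumes the region layout implied by equations (1) to (3): $A$ consists of regions $x, y$, $B$ of $u, v$, and $(A \cup B) \rightarrow A$ of $y, v, w$. The key `rest`, which collects all probability outside those five regions, and the particular allocation are hypothetical.

```python
from fractions import Fraction as F

A_REGIONS = {"x", "y"}
B_REGIONS = {"u", "v"}
ARROW_REGIONS = {"y", "v", "w"}   # regions making up (A ∪ B) -> A

def total(regions, p):
    """Sum of the probabilities of the named regions."""
    return sum((p[r] for r in regions), F(0))

def condition(p, kept):
    """Zero out regions outside `kept` and scale the rest by a common factor."""
    scale = 1 / total(kept, p)
    return {r: (w * scale if r in kept else F(0)) for r, w in p.items()}

def constraint_holds(p):
    """Equation (1): P((A ∪ B) -> A) == P(A) / P(A ∪ B)."""
    lhs = total(ARROW_REGIONS, p)
    rhs = total(A_REGIONS, p) / total(A_REGIONS | B_REGIONS, p)
    return lhs == rhs

# Hypothetical allocation chosen so that (1) holds: 1/2 == (1/4) / (1/2).
p = {"u": F(1, 8), "v": F(1, 8), "w": F(1, 4),
     "x": F(1, 8), "y": F(1, 8), "rest": F(1, 4)}

p_a = condition(p, A_REGIONS)               # x_A = y_A = 1/2
p_not_a = condition(p, set(p) - A_REGIONS)  # u, v -> 1/6; w, rest -> 1/3

before = constraint_holds(p)           # (1) holds
after_a = constraint_holds(p_a)        # (2) would need y_A = 1, but y_A = 1/2
after_not_a = constraint_holds(p_not_a)  # (3) would need v + w = 0 after scaling
```

This allocation satisfies (1) but violates both (2) and (3) after conditioning. The argument in the text shows that any allocation satisfying all three would force $y + u = 1$.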

Later developments

In a follow-up article, Lewis (1986) noted that the triviality proof can proceed by conditioning not on $A$ and $A'$ but instead, by turns, on each of a finite set of mutually exclusive and jointly exhaustive events $A, C, D, E, \ldots$. He also gave a variant of the proof that involved not total conditioning, in which the probability of either $A$ or $A'$ is set to 1, but partial conditioning (i.e., Jeffrey conditioning), by which probability is incrementally shifted from $A'$ to $A$.

Separately, Hájek (1989) pointed out that even without conditioning, if the number of outcomes is large but finite, then in general $P(B \mid A) = P(A \cap B)/P(A)$, being a ratio of two outputs of the $P$-function, will take on more values than any single output of the function can. So, for instance, if in Fig. 1 $s, t, \ldots$ are all multiples of 0.01 (as would be the case if there were exactly 100 equiprobable outcomes), then $P(A \rightarrow B)$ must be a multiple of 0.01, as well, but $P(A \cap B)/P(A)$ need not be. That being the case, $P(A \rightarrow B)$ cannot reliably be made to equal $P(B \mid A)$.
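The counting point can be made concrete for the 100-outcome case mentioned above. This is a minimal sketch, assuming equiprobable outcomes so that every probability is a count divided by 100; the function names are illustrative.

```python
from fractions import Fraction

def unconditional_values(n):
    """All values P(E) can take when there are n equiprobable outcomes."""
    return {Fraction(k, n) for k in range(n + 1)}

def conditional_values(n):
    """All values P(B | A) = |A ∩ B| / |A| can take, over nonempty A."""
    return {Fraction(j, size_a)
            for size_a in range(1, n + 1)
            for j in range(size_a + 1)}

n = 100
single = unconditional_values(n)   # multiples of 0.01
ratios = conditional_values(n)     # every fraction j/k with 0 <= j <= k <= 100

# Conditional probabilities that no single event's probability can match.
unreachable = ratios - single
one_third_unreachable = Fraction(1, 3) in unreachable
```

For instance, if $A$ has three outcomes and $A \cap B$ has one, then $P(B \mid A) = 1/3$, which is not a multiple of 0.01, so no event in this space can have that probability.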

Hájek (1994) also argued that the condition $P(A \rightarrow B) = P(B \mid A)$ caused acceptable $P$-functions to be implausibly sparse and isolated from one another. One way to put the point: standardly, any weighted average of two probability functions is itself a probability function, so that between any two $P$-functions there will be a continuum of weighted-average $P$-functions along which one of the original $P$-functions gradually transforms into the other. But these continua disappear if the added $P(A \rightarrow B) = P(B \mid A)$ condition is imposed. Now an average of two acceptable $P$-functions will in general not be an acceptable $P$-function.
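A small numerical case shows how averaging can break the condition. The sketch below is a simplification: it assumes a hypothetical four-outcome space, fixes one candidate event `E` to play the role of $A \rightarrow B$, and checks the condition only for a single pair $A$, $B$. All names and weights are illustrative.

```python
from fractions import Fraction as F

def prob(event, weights):
    """Probability of a set of outcomes under the given outcome weights."""
    return sum((weights[o] for o in event), F(0))

def condition_holds(weights, e, a, b):
    """Check P(E) == P(A ∩ B) / P(A) for the fixed candidate event E."""
    return prob(e, weights) == prob(a & b, weights) / prob(a, weights)

def mixture(p1, p2, lam):
    """Weighted average lam * p1 + (1 - lam) * p2 of two probability functions."""
    return {o: lam * p1[o] + (1 - lam) * p2[o] for o in p1}

A, B = {0, 1}, {0, 2}
E = {0, 3}   # hypothetical stand-in for the event A -> B

p1 = {0: F(1, 4), 1: F(1, 4), 2: F(1, 4), 3: F(1, 4)}    # P(E) = 1/2 = P(B | A)
p2 = {0: F(1, 2), 1: F(1, 4), 2: F(1, 12), 3: F(1, 6)}   # P(E) = 2/3 = P(B | A)
mix = mixture(p1, p2, F(1, 2))

holds_p1 = condition_holds(p1, E, A, B)
holds_p2 = condition_holds(p2, E, A, B)
holds_mix = condition_holds(mix, E, A, B)   # P(E) = 7/12 but P(B | A) = 3/5
```

Both original functions satisfy the condition, yet their even mixture is a perfectly good probability function that does not, because $P(B \mid A)$ is a ratio and ratios do not average linearly.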

Possible rejoinders

Assuming that $P(A \rightarrow B) = P(B \mid A)$ holds for a minimally nontrivial set of events and for any $P$-function leads to a contradiction. Thus $P(A \rightarrow B) = P(B \mid A)$ can hold for any $P$-function only for trivial sets of events; that is the triviality result. However, the proof relies on background assumptions that may be challenged. It may be proposed, for instance, that the referent event of an expression like “$A \rightarrow B$” is not fixed for a given $A$ and $B$, but instead changes as the probability function changes. Or it may be proposed that conditioning on $C$ should follow a rule other than $P_C(\cdot) = P(\cdot \mid C)$.

But the most common response, among proponents of the $P(A \rightarrow B) = P(B \mid A)$ condition, has been to explore ways to model conditional events as something other than subsets of a universe set of outcomes. Even before Lewis published his result, Schay (1968) had modeled conditional events as ordered pairs of sets of outcomes. With that approach and others in the same spirit, conditional events and their associated combination and complementation operations do not constitute the usual algebra of sets of standard probability theory, but rather a more exotic type of structure, known as a conditional event algebra.

Notes

  1. Hájek and Hall (1994) give a historical summary. The debate was actually framed as being about the probabilities of conditional sentences, rather than conditional events. However, this is merely a difference of idiom, so long as sentences are taken to express propositions and propositions are thought of as sets of possible worlds.
  2. Reading "If $A$, then $B$" as "Not $A$, unless also $B$" makes compounding straightforward, since $A \rightarrow B$ becomes equivalent to the Boolean expression $A' \cup B$. However, this has the unsatisfactory consequence that $P(A \rightarrow B) \geq P(A')$; then "If $A$, then $B$" is assigned high probability whenever $A$ is highly unlikely, even if $A$'s occurrence would make $B$ highly unlikely. This is a version of what in logic is called a paradox of material implication.
  3. A proof starting with overlapping $A$ and $B$, as in Fig. 1, would use mutually exclusive events $A \cap B'$ and $A' \cap B$ in place of $A$ and $B$.

References

  • Hájek, Alan (1989). "Probabilities of conditionals – Revisited". Journal of Philosophical Logic. 18 (4): 423–428. doi:10.1007/BF00262944. JSTOR 30226421. S2CID 31355969.
  • Hájek, Alan (1994). "Triviality on the cheap?". In Eells, Ellery; Skyrms, Brian (eds.). Probability and Conditionals. Cambridge UP. pp. 113–140. ISBN 978-0521039338.
  • Hájek, Alan; Hall, Ned (1994). "The hypothesis of the conditional construal of conditional probability". In Eells, Ellery; Skyrms, Brian (eds.). Probability and Conditionals. Cambridge UP. pp. 75–111. ISBN 978-0521039338.
  • Lewis, David (1976). "Probabilities of conditionals and conditional probabilities". Philosophical Review. 85 (3): 297–315. doi:10.2307/2184045. JSTOR 2184045.
  • Lewis, David (1986). "Probabilities of conditionals and conditional probabilities II". Philosophical Review. 95 (4): 581–589. doi:10.2307/2185051. JSTOR 2185051.
  • Schay, Geza (1968). "An algebra of conditional events". Journal of Mathematical Analysis and Applications. 24 (2): 334–344. doi:10.1016/0022-247X(68)90035-8.
  • van Fraassen, Bas C. (1976). "Probabilities of conditionals". In Harper, W.; Hooker, C. (eds.). Foundations and Philosophy of Epistemic Applications of Probability Theory. Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, Volume I. D. Reidel. pp. 261–308. ISBN 978-9027706171.
