In a discussion with Alessandro Strumia and others, Tommaso Dorigo has repeated some of his opinions about the Jeffreys-Lindley paradox (Wikipedia) which, in Dorigo's opinion, makes Bayesian thinking unusable in experimental particle physics (and probably everywhere because all other situations are analogous). He has previously written about it in 2012 and the paradox was also discussed by W.M. Briggs and others.
Czechia: Off-topic, geography: the U.N. databases now list "Czechia" as an official name of my homeland. Czech report, Bloomberg. Its usage is not mandatory but those who use wrong names will be stabbed to death by Mr Ban Ki-moon. Despite widespread fearmongering, the Prague Castle hasn't collapsed yet.
The paradox is meant to be one in which frequentism and Bayesianism give opposite verdicts about the validity of a hypothesis. Just to be sure, the frequentist definition of the probability is that every probability should be measured (and measurable) as the ratio \(N_{\rm OK}/N_{\rm total}\) for a limiting, very large number of trials \(N_{\rm total}\). The Bayesian probability admits that probabilities are used to quantify the subjective belief in the validity of statements and there exists a rational method (involving Bayes' theorem) how to correctly update these probabilities (beliefs) even if we can't always make measurements with \(N_{\rm total}\to\infty\).
What's the paradox? Wikipedia gives a simple example. Try to test whether half of the newborn infants are boys. You wait for the birth of some 1 or 1.4 or 2 million children, I don't know which it is, and the "one-half hypothesis" predicts that half of those kids will be boys, plus or minus 1,000 (the standard deviation). The distribution is binomial, which is almost exactly normal.
However, the measured number of boys in a country will be some number roughly 3,000 boys below the exact "one-half prediction". The question is whether this evidence makes the "one-half hypothesis" proven or disproven.
The frequentist only cares about the hypothesis, not about the negation, so he sees a deviation of the experimental fact from the prediction by three sigma and falsifies the "one-half hypothesis". In effect, the frequentist is satisfied to see a small conditional probability \(P(E|H)\), the probability of the observed evidence predicted by a hypothesis, to falsify \(H\). The probability of a newborn child to be male isn't exactly one-half. The 3-sigma deviation ruled this theory out.
On the other hand, the Bayesian calculates the probability of all hypotheses, including the "negations", and he insists on showing that \(P(H|E)\) is small (note the opposite order) if he wants to eliminate \(H\). The negation of the "one-half hypothesis" is basically saying that the percentage of the boys is an unknown number uniformly distributed between 0% and 100%. Because the uncertainty of the Bayesian prediction for the percentage of boys made by the "not 50% hypothesis" was so much higher, it is very unlikely to produce a number close to 50%. In other words, according to the Bayesian, the "not 50% hypothesis" has no explanation why the percentage was rather close to 50%. Bayes' theorem punishes this "not half hypothesis" for this vagueness of the prediction and the final result may end up being that the posterior probability for the "not half hypothesis" is even (much) smaller than that of the "one-half hypothesis".
For the Bayesian, the "one-half hypothesis" did a lousy job but the opposite hypothesis did no job at all (it had no clue about the percentage, not even roughly), so the "one-half hypothesis" wins over the negation despite the lousy fit.
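The two verdicts can be reproduced numerically. The following is a minimal sketch under assumed numbers, not the figures from the Wikipedia example: `n` is chosen so that the standard deviation \(\sqrt{n}/2\) equals 1,000, and the boy count `k` sits three sigma below one half. The helper `log_binom_pmf` is a name introduced here for illustration.

```python
import math

# Assumed, illustrative numbers: n is picked so that sqrt(n)/2 = 1,000,
# and the boy count is 3,000 (three sigma) below one half.
n = 4_000_000
k = n // 2 - 3_000

def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

# Frequentist view: how far is the count from the one-half prediction?
sigma = math.sqrt(n) / 2
z = (k - n / 2) / sigma
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation

# Bayesian view: probability of the observed count under each hypothesis.
evidence_half = math.exp(log_binom_pmf(k, n, 0.5))
# With p uniform on [0, 1], every count 0..n is equally likely: 1 / (n + 1).
evidence_uniform = 1 / (n + 1)
bayes_factor = evidence_half / evidence_uniform

print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
print(f"Bayes factor (one-half : uniform) = {bayes_factor:.1f}")
```

With these assumed inputs the p-value is about 0.003 while the Bayes factor favors the "one-half hypothesis" over the uniform one by roughly 18 to 1, which is the tension described above.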
OK, is there a paradox? No. The very specific hypothesis that the fraction is 0.50000 was ruled out at some 99.7% level, indeed. But what was considered the "negation" of the hypothesis (that it's a random number between 0% and 100%) could have been falsified, too. My view is that the reason why there's no contradiction is that the purported "negation" isn't really a negation at all. More precisely, the uniform distribution on the interval from 0% to 100% in no way follows from the assumption that "the percentage of boys isn't 50%".
Why? Because the actual "negation of the half hypothesis" is an extremely vague hypothesis that doesn't really allow you to make any predictions at all. For this reason, you can't really calculate the "probabilities of an observation" predicted by this "it is not 50% of boys" hypothesis. And that's what prevents you from applying Bayes' theorem really accurately.
Bayesian inference works great but the competing hypotheses should be sufficiently well-defined so that the probabilities of various observations may be quantitatively predicted from these hypotheses. This is satisfied for the "uniform distribution between 0% and 100%" but that hypothesis may justifiably be heavily disfavored even relative to hypotheses that are ruled out. On the other hand, a better interpretation of the "not 50% of boys hypothesis" may give more reasonable predictions for the percentage (uncertain but close to 48-52 percent) but it's not clear what the predictions are and why.
Equivalently, the statement "the rule is something else than 50% for boys" may have a probability that is nearly 100% when the "50% theory" is ruled out at three sigma. This nearly 100% is the sum of the probabilities of all the detailed alternative, mutually exclusive theories. However, if we decide that this "not 50% for boys" should be one hypothesis, its probability is something else: it's a weighted average of the individual hypotheses it contains, and that can be low, even lower than the 3-sigma-excluded "50% theory".
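One way to write the "weighted average" down, as a sketch in notation that is assumed here rather than taken from the discussion: let \(H_i\) be mutually exclusive detailed alternatives that together make up "not 50%", and let \(w_i\) be their prior weights inside that composite hypothesis. The probability that the composite assigns to the evidence \(E\) is then\[
P(E\mid \neg H_{50}) = \sum_i w_i\, P(E\mid H_i), \qquad \sum_i w_i = 1,
\] so if most of the weight \(w_i\) sits on alternatives that predict the data poorly (as it does for the uniform distribution), the average is small even when a few of the \(H_i\) fit the data very well.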
How should we proceed in this case of the 49% of boys? Well, we may formulate another hypothesis, i.e. that the percentage of boys is 50% plus or minus 2%, and this hypothesis will beat both the "exactly 50% hypothesis" and "the uniform distribution between 0% and 100% hypothesis". It's common sense because 50% plus or minus 2% is what we're usually getting for boys (well, maybe 49% plus or minus 2%). But where does this better hypothesis fit? Does it belong to the "exactly 50% slice" or "its negation"?
Well, in principle, it contradicts the "exactly 50% hypothesis". When you measure the percentage many times, you may decide whether the fraction is exactly 50% or not. But this better hypothesis is clearly inequivalent to the negation mentioned above, the negation that assumes the uniform distribution. The better hypothesis is a slice of the negation which is very close to the "exactly 50% hypothesis" in some metric.
At any rate, the right way to deal with these situations is to have sufficiently well-defined yet really viable or realistic hypotheses to act as competitors. When you believe that a quantity is close to 50% but it isn't quite there, you should say it, e.g. define a distribution preferring numbers close to 50% but not necessarily 50%. (This situation appears in particle physics all the time, with all the constants that are much smaller than one or the order-of-magnitude estimates if those are dimensionful.)
When you have such promising hypotheses, you may compare them in the Bayesian way. This competition will be fair and you should trust the results. So the better "medium" hypothesis will beat all the extreme ones. In principle, when lots of evidence is accumulated, the "medium" hypothesis is mutually exclusive with the theories on both sides. But one must always appreciate that if the total amount of data is known to be limited, hypotheses that sound different may still overlap, so they are not mutually exclusive.
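Such a three-way comparison might look as follows. This is a sketch under stated assumptions: the birth numbers are illustrative (chosen so that one standard deviation is 1,000 boys and the count is three sigma low), the "medium" hypothesis is modeled as a normal prior on the fraction with mean 0.5 and width 0.02, and the three hypotheses get equal prior probabilities. The helper names are introduced here for illustration.

```python
import math

# Assumed, illustrative numbers: sqrt(n)/2 = 1,000 and a count three sigma low.
n = 4_000_000
k = n // 2 - 3_000

def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def evidence(prior_pdf, steps=4000, half_width_sigmas=12):
    """Trapezoid integral of likelihood(p) * prior_pdf(p) around k / n."""
    p_hat = k / n
    width = math.sqrt(p_hat * (1 - p_hat) / n)  # width of the likelihood in p
    lo = p_hat - half_width_sigmas * width
    hi = p_hat + half_width_sigmas * width
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        p = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(log_binom_pmf(k, n, p)) * prior_pdf(p)
    return total * h

evidences = {
    "exactly 50%": math.exp(log_binom_pmf(k, n, 0.5)),
    "uniform 0-100%": evidence(lambda p: 1.0),
    "50% +/- 2%": evidence(lambda p: normal_pdf(p, 0.5, 0.02)),
}
norm = sum(evidences.values())  # equal prior probabilities assumed
for name, value in evidences.items():
    print(f"{name:>15}: evidence = {value:.3e}, posterior = {value / norm:.3f}")
```

In this toy setup the "medium" hypothesis comes out ahead of both competitors: far ahead of the uniform one and, at three sigma, only modestly ahead of exactly 50%. The size of the margin depends on the assumed width and on how much data is available.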
For example, the hypothesis that the percentage of boys follows a normal distribution around 50% plus or minus 1%, and the hypothesis that it is 50% plus or minus 1.01%: those are two "different" hypotheses. If the distribution may be measured (and you need a huge number of births for that), they may be strictly distinguished. But for any realistic finite number of births, the hypotheses are effectively equivalent. So when you demand that the total probability of all such hypotheses is 100%, you are doing something fishy. Because this pair of hypotheses is basically equivalent, you are really double- (or multiple-) counting the probability of this hypothesis if you compute the sum of their probabilities, so this sum should be allowed to be greater than 100%.
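How close these two hypotheses are can be checked with a short sketch. It assumes a hypothetical measured fraction `p_hat = 0.49` and uses the approximation that, when the data determine the fraction far more sharply than either prior is wide, each evidence is the prior density at `p_hat` times a common factor, so the Bayes factor reduces to a ratio of prior densities.

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Hypothetical measured fraction of boys (assumed for illustration).
p_hat = 0.49

# Narrow-likelihood approximation: the evidence of each hypothesis is
# proportional to its prior density at p_hat, with the same proportionality
# constant, so the Bayes factor is just the ratio of the two densities.
density_a = normal_pdf(p_hat, 0.5, 0.0100)  # "50% plus or minus 1%"
density_b = normal_pdf(p_hat, 0.5, 0.0101)  # "50% plus or minus 1.01%"
bayes_factor = density_a / density_b

print(f"Bayes factor (1% : 1.01%) = {bayes_factor:.4f}")
```

The ratio stays within about one percent of 1, so a single measurement of this kind cannot tell the two hypotheses apart.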
Hypotheses that "compete" in the Bayesian reasoning should be sufficiently well-defined for them to predict some probabilities of observations; they should produce predictions that sufficiently differ from those of other hypotheses; and they should be sufficiently realistic not to predict completely wrong values most of the time.
Let's apply this to the cosmological constant. We may have a hypothesis that \(\Lambda=0\); the hypothesis that \(\Lambda\) is nonzero and uniformly distributed in an interval of numbers comparable to \(m_{\rm Planck}^4\); and a more realistic yet seemingly "artificial" hypothesis that\[
\Lambda=m_{\rm Planck}^4 \cdot 10^{-100E}
\] where \(E\) has a normal distribution around zero with the standard deviation one. Clearly, the latter hypothesis will win. (I could have invented nicer or more justified similar distributions but I wanted to make things simple.) The experiments indicate that \(E\sim 1.23\) because the cosmological constant is 123 orders of magnitude away from the "Planckian estimate" hypothesis. And \(E\sim 1.23\) is perfectly probable according to the normal distribution.
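A rough numerical comparison of the uniform and the "artificial" hypothesis, as a sketch under assumptions: it works in Planck units, takes the uniform range to be \([0, 1]\), and asks how probable each hypothesis makes the decade between \(10^{-124}\) and \(10^{-123}\). It takes the exponential in base 10, which is the reading under which 123 orders of magnitude give \(E = 1.23\); with a natural exponential the same value would correspond to \(E \approx 2.83\).

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Planck units (m_Planck^4 = 1). The decade below is an assumption based on
# the "123 orders of magnitude" quoted in the text.
lam_hi = 10.0 ** -123
lam_lo = 10.0 ** -124

# Uniform hypothesis: Lambda uniform on [0, 1] (assumed range, Planck units).
p_uniform = lam_hi - lam_lo

# "Artificial" hypothesis: Lambda = 10**(-100 * E) with E ~ Normal(0, 1).
# Lambda in [1e-124, 1e-123] corresponds to E in [1.23, 1.24].
p_lognormal = normal_cdf(1.24) - normal_cdf(1.23)

print(f"P(decade | uniform)    = {p_uniform:.1e}")
print(f"P(decade | log-normal) = {p_lognormal:.1e}")
print(f"log10 of the ratio     = {math.log10(p_lognormal / p_uniform):.1f}")
```

Under these assumptions the log-normal hypothesis assigns a probability of order \(10^{-3}\) to the observed decade, while the uniform one assigns a probability of order \(10^{-123}\).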
The only objection you might have is that my hypothesis was really built after I learned the measured value of \(\Lambda\) so this hypothesis is "artificial". I was "cheating". I don't have any explanation for this form of the distribution for \(\Lambda\). Right. Except that even if I don't have an explanation for this distribution, one may exist and it's a legitimate possibility to assume that such an explanation exists, even if no one knows what it could be. The absence of an explanation for the exponential form of \(\Lambda\) is a disadvantage in the eyes of a theorist but it shouldn't be a disadvantage in the eyes of an experimenter. An experimenter should be able to impartially compare well-defined hypotheses whether they sound motivated to him or not.
After all, Weinberg basically did calculate a similar distribution (but one with \(E\) more tightly focused on \(E\sim 1.23\)) using the anthropic observation (the existence of stars and the remarkable ability of the early cosmology to easily make stars impossible). When you compare "motivated" hypotheses about the cosmological constant, i.e. those with a plausible detailed explanation of their structure, you will surely see that Weinberg's distribution for \(\Lambda\) is among the best hypotheses, probably the best one.
But there may exist other explanations of the tiny cosmological constant which are possibly (even) more quantitative than Weinberg's anthropic estimate. String theory might have an explanation why \(\Lambda\) is proportional to an exponential of something simpler; and why the exponent is naturally of the form \(100E\) where \(E\) is even simpler (and \(100\) is an estimate of the number of homology classes of a typical compactification manifold, for example). The fact that we don't know the precise logic does not mean that we may eliminate these possibilities a priori. These explanations might be right even if their details (the detailed reasons why they predict what they predict) are unknown at this moment.
(Analogous comments hold not only for the cosmological constant but also for the Higgs mass in the Planck units, CP-violating phases, Yukawa couplings, and several other "small" parameters we know in Nature.)
This "nonzero probability" of these "so far unknown" hypotheses is something totally different than the claim that physicists should work on them much of the time. These hypotheses of an unknown form may be so hard that physicists might think that it's a waste of time to try to attack these hard puzzles right now. But they may still believe that these hypotheses are likely. The time spent on a theory isn't quite proportional to the probability that the theory is right, although the proportionality should probably be roughly true if "all other aspects are equal" (which they almost never are, however).
To summarize, there is no paradox. To avoid mistakes including the incorrect claim that there's a paradox, we should:
- distinguish the vague "I say nothing about the parameters" from the "parameter is distributed uniformly" hypotheses: the latter may easily be falsified, which doesn't falsify the former (this warning of mine is at least morally equivalent to my criticism of the unjustifiable "typicality" assumption of the believers in the anthropic principle)
- consider hypotheses of the "medium" type that make small but nonzero values (e.g. small deviations of boys from 50% or the cosmological constant) reasonably likely
- try to adjust the form of the hypotheses in such a way that they're distinguishable by realistic portions of the data we want to consider, i.e. that they are mutually exclusive even in practice (otherwise the given package of the empirical data is not useful for the discrimination, and whenever it's so, we should be aware of this fact and consider it a vice of the experiment in the context of the theories, not a vice of any competing hypothesis itself)
- appreciate that the typical scenario in all of science is that theories that predict quantities "pretty well" yet imprecisely are more useful than theories that predict nothing about these quantities; in this typical case, the "pretty good" yet imprecise theories may be falsified but they may still beat the "no prediction" theories with the uniform distribution; when it's so, the next step should be to try to stand on the shoulders of giants and define a "refined" or "corrected" theory that makes predictions similar to those of the specific theory but with some corrections that are said to be small
When one is careful, the paradox isn't there. In particular, it is not true, as Tommaso Dorigo tries to claim, that the Jeffreys-Lindley paradox shows that the Bayesian thinking about scientific theories is unusable.
At the end, the experiments' main purpose is to produce verdicts about the relative likelihood of various explanations and hypotheses. If experiments don't tell us whether some theory or hypothesis is viable or not, they're useless for theorists.
Dorigo's focus on the frequentist thinking is ultimately unusable for all of science and for theorists because those simply need to know the probability that some theory or hypothesis or statement is true. But this focus of Dorigo's is legitimate to a limited extent. Namely, the "frequentist probabilities" that the experimenters produce may be viewed as an "isolated part" coming from the evaluation of an experiment. Theorists may combine this isolated part (a job that the experimenters are trained for and should be good at) with other reasoning, which also includes some Bayesian reasoning involving their prior probabilities of different hypotheses (vague but vital information that a theorist should always try to have some clue about), and at this moment, the experimental data are useful for the theorists.
Dorigo has told us that he believes that the probability of any new physics at the LHC is \(10^{-10}\). So even if he found 6-sigma evidence for a new BSM particle, he would believe that it's not real! This fact obviously means that Dorigo is a prejudiced bigot. But as long as he (and others at the LHC) won't cheat in the experiments or hide some inconvenient evidence for new physics and as long as they inform us about the correctly measured frequentist probabilities, we may consider his claims about the "near certainty that physics is over" to be just some stupid hobby or religious ritual unrelated to his actual work.
In other words, Dorigo (like others) is a dog who is barking up the wrong tree but he may still be useful in biting an unwelcome guest.