TY - GEN

T1 - Self-calibrating probability forecasting

AU - Vovk, Vladimir

AU - Shafer, Glenn

AU - Nouretdinov, Ilia

PY - 2004

Y1 - 2004

N2 - In the problem of probability forecasting, the learner's goal is to output, given a training set and a new object, a suitable probability measure on the possible values of the new object's label. An on-line algorithm for probability forecasting is said to be well-calibrated if the probabilities it outputs agree with the observed frequencies. We give a natural nonasymptotic formalization of the notion of well-calibratedness, which we then study under the assumption of randomness (the object/label pairs are independent and identically distributed). It turns out that, although no probability forecasting algorithm is automatically well-calibrated in our sense, there exists a wide class of algorithms for "multiprobability forecasting" (such algorithms are allowed to output a set, ideally very narrow, of probability measures) which satisfy this property; we call the algorithms in this class "Venn probability machines". Our experimental results demonstrate that a 1-Nearest Neighbor Venn probability machine performs reasonably well on a standard benchmark data set, and one of our theoretical results asserts that a simple Venn probability machine asymptotically approaches the true conditional probabilities regardless, and without knowledge, of the true probability measure generating the examples.

UR - http://www.scopus.com/inward/record.url?scp=26944459867&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=26944459867&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:26944459867

SN - 0262201526

SN - 9780262201520

T3 - Advances in Neural Information Processing Systems

BT - Advances in Neural Information Processing Systems 16 - Proceedings of the 2003 Conference, NIPS 2003

PB - Neural Information Processing Systems Foundation

T2 - 17th Annual Conference on Neural Information Processing Systems, NIPS 2003

Y2 - 8 December 2003 through 13 December 2003

ER -