Hosted by the Interactive AI Centre for Doctoral Training
Abstract: Bayes’ rule is known as an information acquisition paradigm: it tells us how the likelihood function turns a prior belief into a posterior belief. This presentation is mainly about the calibration of likelihood functions in the context of classification and decision making. We are therefore interested in classifiers that output likelihood functions rather than probability distributions.
In the binary classification case, the likelihood function can be written in the form of a Log-Likelihood-Ratio (LLR). Known as the weight-of-evidence in Bayesian updating, the LLR is considered a good candidate to indicate which hypothesis or class the data supports and how strongly. Bayes’ rule can then be written as a sum of the LLR and the log-ratio of the prior probabilities, thereby decoupling the contribution of the data from that of the initial personal belief.
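Concretely, using only the quantities named above (hypotheses H1 and H2, data x), the decomposition reads:

```latex
\log \frac{P(H_1 \mid x)}{P(H_2 \mid x)}
  = \underbrace{\log \frac{p(x \mid H_1)}{p(x \mid H_2)}}_{\text{LLR: contribution of the data}}
  \;+\; \underbrace{\log \frac{P(H_1)}{P(H_2)}}_{\text{prior log-odds: initial belief}}
```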
During this presentation, calibration and evaluation strategies for LLRs will be briefly discussed. Then, a property of calibrated LLRs known as the idempotence property will be presented. We will see how this property leads to a constraint on the distribution of calibrated LLRs, and how this constraint allows us to design a new non-linear discriminant analysis. The latter learns a diffeomorphism between the feature space and a base space where the discriminant component forms calibrated LLRs. As a brief digression, I will present the concept of perfect privacy, on which I worked as part of my PhD thesis, and show how the discriminant analysis can be used for this purpose.
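As a minimal numerical sketch of the idempotence property (an illustration under assumed Gaussian class-conditional scores, not the speaker's construction): for a calibrated score, recomputing the LLR from the score's own class-conditional distributions returns the score unchanged.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical class-conditional distributions, chosen for illustration:
# x | H1 ~ N(+1, 1) and x | H2 ~ N(-1, 1), so that llr(x) = 2x exactly.
mu1, mu2, sigma = 1.0, -1.0, 1.0

def llr(x):
    # Log-likelihood ratio log p(x|H1) / p(x|H2).
    return norm.logpdf(x, mu1, sigma) - norm.logpdf(x, mu2, sigma)

# The score s = llr(x) = 2x is N(+2, sd=2) under H1 and N(-2, sd=2) under H2.
# Idempotence: the LLR of the LLR scores equals the scores themselves.
x = np.linspace(-3.0, 3.0, 7)
s = llr(x)
llr_of_llr = norm.logpdf(s, 2.0, 2.0) - norm.logpdf(s, -2.0, 2.0)
print(np.allclose(llr_of_llr, s))  # True: LLR(LLR(x)) == LLR(x)
```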
However, the appealing properties of the LLR cannot be generalized straightforwardly to non-binary cases. We will see how treating discrete probability distributions and likelihood functions as compositional data helps to get around this issue. The sample space of compositional data is a simplex, on which a Euclidean vector space structure (known as the Aitchison geometry) can be defined. In the coordinate representation given by the Isometric-Log-Ratio (ILR) approach, Bayes’ rule becomes the translation of the prior distribution by the likelihood function. Within this space, the likelihood function, in the form of an ILR transformation of the likelihood vector (ILRL), can be considered the multiclass and multidimensional extension of the LLR.
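A short sketch of this translation property, assuming one standard (Helmert-type) choice of ILR basis; the specific basis used in the talk is not stated. Bayes’ rule on the simplex is the Aitchison perturbation of the prior by the likelihood vector, which becomes vector addition in ILR coordinates:

```python
import numpy as np

def closure(v):
    # Normalize a positive vector onto the simplex.
    return v / v.sum()

def perturbation(p, q):
    # Aitchison perturbation: componentwise product followed by closure.
    # In this geometry, Bayes' rule reads posterior = prior (+) likelihood.
    return closure(p * q)

def helmert_basis(D):
    # One standard orthonormal ILR contrast matrix V (D x (D-1));
    # every column is orthogonal to the all-ones vector.
    V = np.zeros((D, D - 1))
    for i in range(1, D):
        V[:i, i - 1] = 1.0 / np.sqrt(i * (i + 1))
        V[i, i - 1] = -i / np.sqrt(i * (i + 1))
    return V

def ilr(p):
    # Isometric log-ratio coordinates; since V.T @ ones = 0, this equals
    # V.T @ clr(p) and is invariant to the closure constant.
    V = helmert_basis(len(p))
    return V.T @ np.log(p)

prior = closure(np.array([0.5, 0.3, 0.2]))
likelihood = closure(np.array([0.1, 0.6, 0.3]))  # likelihood vector, closed to the simplex
posterior = perturbation(prior, likelihood)

# Bayes' rule is a translation in ILR coordinates: ilr(posterior) = ilr(prior) + ILRL,
# where ilr(likelihood) plays the role of the ILRL described above.
print(np.allclose(ilr(posterior), ilr(prior) + ilr(likelihood)))  # True
```

In the two-class case the single ILR coordinate of the likelihood vector is proportional to the LLR, which is why the ILRL can be read as its multiclass extension.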
Then, we will see how the idempotence property and its constraint on the distribution of calibrated LLRs generalize to ILRLs for multiclass applications. This allows the above discriminant analysis to be extended naturally to cases where more than two classes are involved. This approach, called Compositional Discriminant Analysis, maps the data into a space where the discriminant components form calibrated likelihood functions expressed by the ILRLs.
Finally, we will briefly discuss potential future work on the treatment of likelihood functions and probability distributions as compositional data for machine learning applications.
Part of the Lunch and Learn seminar series. Email iai-cdt@bristol.ac.uk if you would like to attend.