Publication - Mr Kacper Sokol

    Counterfactual explanations of machine learning predictions

    Opportunities and challenges for AI safety

    Citation

    Sokol, K & Flach, P 2019, ‘Counterfactual explanations of machine learning predictions: Opportunities and challenges for AI safety’, in Proceedings of the AAAI Workshop on Artificial Intelligence Safety 2019, co-located with the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019), Honolulu, Hawaii, January 27, 2019. CEUR Workshop Proceedings.

    Abstract

    One necessary condition for creating a safe AI system is making it transparent enough to uncover any unintended or harmful behaviour. Transparency can be achieved by explaining the predictions of an AI system with counterfactual statements, which are becoming a de facto standard for explaining algorithmic decisions. The popularity of counterfactuals is mainly attributed to their compliance with the “right to explanation” introduced by the European Union’s General Data Protection Regulation, and to their being understandable by lay audiences and domain experts alike. In this paper we describe our experience and the lessons learnt from explaining decision tree models trained on the UCI German Credit and FICO Explainable Machine Learning Challenge data sets with class-contrastive counterfactual statements. We review how counterfactual explanations can affect an artificial intelligence system and its safety by investigating their risks and benefits. We present example explanations, discuss their strengths and weaknesses, demonstrate how they can be used to debug the underlying model and inspect its fairness, and unveil the security and privacy challenges that they pose.
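    As a purely illustrative sketch (not the authors’ method), the following Python snippet shows what a class-contrastive counterfactual for a decision tree can look like: a toy credit-scoring model is trained, and a brute-force search finds the smallest single-feature change that flips a rejected applicant’s prediction. The feature names, data and search grid are invented assumptions for illustration.

        # Hypothetical sketch: class-contrastive counterfactual for a
        # decision tree. The toy data, features and grid are assumptions.
        import numpy as np
        from sklearn.tree import DecisionTreeClassifier

        # Toy credit data: [income (k), credit history length (years)].
        X = np.array([[20, 1], [25, 2], [40, 5], [55, 8], [60, 10],
                      [30, 1], [45, 7], [22, 3], [70, 12], [35, 4]])
        y = np.array([0, 0, 1, 1, 1, 0, 1, 0, 1, 0])  # 0 = reject, 1 = accept

        clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

        def counterfactual(instance, target_class):
            """Brute-force search for the smallest single-feature change
            that flips the tree's prediction to `target_class`."""
            best = None
            for feature in range(instance.shape[0]):
                for value in np.linspace(0, 100, 201):  # candidate grid
                    candidate = instance.astype(float).copy()
                    candidate[feature] = value
                    if clf.predict(candidate.reshape(1, -1))[0] == target_class:
                        cost = abs(value - instance[feature])
                        if best is None or cost < best[2]:
                            best = (feature, value, cost)
            return best

        applicant = np.array([30, 1])  # an applicant the model rejects
        assert clf.predict(applicant.reshape(1, -1))[0] == 0
        feature, value, _ = counterfactual(applicant, target_class=1)
        names = ["income", "credit history length"]
        print(f"Had your {names[feature]} been {value:.1f} instead of "
              f"{applicant[feature]}, the application would have been accepted.")

    The printed statement has the “had X been different, the outcome would have changed” form of the class-contrastive counterfactuals discussed in the paper.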

    Full details in the University publications repository