Counterfactual explanations for a TensorFlow neural network using Microsoft's DiCE library: An Experimental Prototype & Design
- Subject: Bachelor's or Master's thesis with the goal of implementing a neural network using TensorFlow, implementing counterfactual explanations, and designing an online experiment to evaluate the explanation approach.
- Type: Bachelor / Master Thesis
- Add-on:
From supporting healthcare intervention decisions to informing criminal justice, artificial intelligence (AI) is now increasingly entering the mainstream and supporting high-consequence human decisions (Wang et al., 2019). However, the effectiveness of these systems will be limited by their inability to explain their internal processing to human users in these critical contexts (Wang et al., 2019). The high complexity of the underlying machine learning (ML) models translates into a lack of interpretability of their decisions. This represents a major problem in many applications such as healthcare and finance, where a rationale for the model’s decision might be a requirement for accountability (Dosilovic et al., 2018). As a consequence of the need for users to understand the behavior of these models, new regulations, such as the European Union General Data Protection Regulation (GDPR), have been implemented to provide the “right to explanation” of all decisions made or supported by artificial intelligence algorithms (Cheng et al., 2019; Ribeiro et al., 2018). However, it is challenging for people who are not machine learning experts to understand the logic behind the algorithms. Due to this, recipients of the algorithm’s output have difficulty understanding how or why certain inputs lead to a particular decision (Cheng et al., 2019).
There has been a resurgence in the area of explainable artificial intelligence (XAI) because researchers and practitioners seek to provide more transparency to AI systems (Miller, 2019). However, there is no standard and generally accepted definition of XAI. The term XAI refers to the movements, initiatives, and efforts made in response to AI transparency (Adadi & Berrada, 2018). To enable end-users to understand, trust, and effectively interact with AI-based systems, researchers have produced many different algorithms, visualizations, interfaces, and toolkits. Recent research has shown that people do not explain the causes of an event in isolation; instead, they explain the cause of an event relative to some other event that did not occur (Miller, 2019). Thus, theories of contrastive explanations and counterfactual causality have gained attention among researchers (Mittelstadt et al., 2019). Empirical evidence indicates that humans psychologically prefer contrastive explanations over non-contrastive explanations (Miller, 2019). As a result, a contrastive explanation will be perceived as more understandable by a user seeking an explanation.
Goals of the Thesis
There are three goals for the thesis. First, to implement a machine learning classifier by building a neural network with TensorFlow for an open-source dataset. Second, to implement counterfactual explanations for this TensorFlow classifier using Microsoft's DiCE library (Mothilal et al., 2019). The implementation is done in Python. Third, to design and pre-test an online experiment in which the counterfactual explanation prototype is compared to an existing baseline explanation approach.
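To make the second goal concrete, the following toy sketch illustrates the core idea behind counterfactual explanations that DiCE operationalizes: given an instance the model rejects, find a minimally changed input that flips the classifier's decision. The linear "classifier", the feature names, and all numbers below are illustrative assumptions for this sketch, not part of the thesis setup; the actual prototype would connect DiCE to the trained TensorFlow network instead.

```python
import numpy as np

# Toy linear classifier: predicts "approved" (1) if w.x + b > 0.
# Stand-in for the TensorFlow network; weights are illustrative.
w = np.array([1.5, -2.0])   # hypothetical features: [income, debt]
b = -0.5

def predict(x):
    return 1 if w @ x + b > 0 else 0

# Query instance that the model rejects.
x0 = np.array([0.2, 0.8])
assert predict(x0) == 0

def counterfactual(x0, steps=500, lr=0.05, lam=0.1):
    """Gradient search for a nearby input with the opposite prediction.

    Minimizes  loss(x) = -(w.x + b) + lam * ||x - x0||^2,
    i.e. push the decision score up while staying close to x0.
    """
    x = x0.copy()
    for _ in range(steps):
        if w @ x + b > 0.01:            # prediction flipped with a small margin
            break
        grad = -w + 2 * lam * (x - x0)  # gradient of the loss above
        x -= lr * grad
    return x

x_cf = counterfactual(x0)
print(predict(x_cf))  # 1: a small change to x0 flips the decision
print(x_cf - x0)      # the "what to change" part of the explanation
```

In the thesis itself, DiCE's `dice_ml.Data`, `dice_ml.Model` (with a TensorFlow backend), and `generate_counterfactuals` would replace this hand-rolled search; DiCE additionally optimizes for diversity across several counterfactuals rather than returning a single one (Mothilal et al., 2019).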
- Strong analytical skills
- Very good time management, organizational, and communication skills
- Interest in Machine Learning and Data Analysis
- Development skills
- Programming skills in Python and knowledge of common machine learning libraries such as scikit-learn and TensorFlow
- Good English skills (as the language of the thesis is English)
If you are interested in this topic and want to apply for this thesis, please contact Miguel Angel Meza Martinez with a short motivation statement, your CV, and a current transcript of records. Feel free to reach out beforehand if you have any questions.
Adadi, A., & Berrada, M. (2018). Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access, 6, 52138–52160. https://doi.org/10.1109/ACCESS.2018.2870052
Cheng, H.-F., Wang, R., Zhang, Z., O’Connell, F., Gray, T., Harper, F. M., & Zhu, H. (2019). Explaining Decision-Making Algorithms through UI: Strategies to Help Non-Expert Stakeholders. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems - CHI ’19, 1–12. https://doi.org/10.1145/3290605.3300789
Dosilovic, F. K., Brcic, M., & Hlupic, N. (2018). Explainable artificial intelligence: A survey. 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2018 - Proceedings, 210–215. https://doi.org/10.23919/MIPRO.2018.8400040
Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267, 1–38. https://doi.org/10.1016/j.artint.2018.07.007
Mittelstadt, B., Russell, C., & Wachter, S. (2019). Explaining explanations in AI. FAT* 2019 - Proceedings of the 2019 Conference on Fairness, Accountability, and Transparency, 279–288. https://doi.org/10.1145/3287560.3287574
Mothilal, R. K., Sharma, A., & Tan, C. (2019). Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations. FAT* 2020 - Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 607–617. https://doi.org/10.1145/3351095.3372850
Ribeiro, M. T., Singh, S., & Guestrin, C. (2018). Anchors: High-Precision Model-Agnostic Explanations. Thirty-Second AAAI Conference on Artificial Intelligence. www.aaai.org
Wang, D., Yang, Q., Abdul, A., & Lim, B. Y. (2019). Designing theory-driven user-centric explainable AI. Conference on Human Factors in Computing Systems - Proceedings, 1–15. https://doi.org/10.1145/3290605.3300831