Controllably Sparse Perturbations of Robust Classifiers for Explaining Predictions and Probing Learned Concepts

Roberts, Jay; Tsiligkaridis, Theodoros

dc.contributor.author	Roberts, Jay	en_US
dc.contributor.author	Tsiligkaridis, Theodoros	en_US
dc.contributor.editor	Archambault, Daniel and Nabney, Ian and Peltonen, Jaakko	en_US
dc.date.accessioned	2021-06-12T11:28:25Z
dc.date.available	2021-06-12T11:28:25Z
dc.date.issued	2021
dc.identifier.isbn	978-3-03868-146-5
dc.identifier.uri	https://doi.org/10.2312/mlvis.20211072
dc.identifier.uri	https://diglib.eg.org:443/handle/10.2312/mlvis20211072
dc.description.abstract	Explaining the predictions of a deep neural network (DNN) in image classification is an active area of research. Many methods focus on localizing pixels, or groups of pixels, which maximize a relevance metric for the prediction. Others create local "proxy" explainers that account for an individual prediction of a model. We aim to explore "why" a model made a prediction by perturbing inputs to robust classifiers and interpreting the semantically meaningful results. For such an explanation to be useful for humans it is desirable for it to be sparse; however, generating sparse perturbations can be computationally expensive and infeasible on high-resolution data. Here we introduce controllably sparse explanations that can be efficiently generated on higher-resolution data to provide improved counter-factual explanations. Further, we use these controllably sparse explanations to probe what the robust classifier has learned. These explanations could provide insight for model developers as well as assist in detecting dataset bias.	en_US
dc.publisher	The Eurographics Association	en_US
dc.subject	Computing methodologies
dc.subject	Machine learning
dc.subject	Artificial intelligence
dc.title	Controllably Sparse Perturbations of Robust Classifiers for Explaining Predictions and Probing Learned Concepts	en_US
dc.description.seriesinformation	Machine Learning Methods in Visualisation for Big Data
dc.description.sectionheaders	Papers
dc.identifier.doi	10.2312/mlvis.20211072
dc.identifier.pages	1-5

Files in this item

Name: 001-005.pdf
Size: 658.3Kb
Format: PDF

This item appears in the following Collection(s)

Machine Learning Methods in Visualisation for Big Data 2021
ISBN 978-3-03868-146-5
