This book presents a taxonomy and a survey of methods for explaining the decisions and analyzing the inner workings of Natural Language Processing (NLP) models. It is intended to provide a snapshot of Explainable NLP, though the field continues to grow rapidly, and to be both readable by first-year M.Sc. students and interesting to an expert audience. The book opens by motivating the need for a consistent taxonomy, pointing out inconsistencies and redundancies in previous taxonomies. It goes on to present (i) a taxonomy, or framework, for thinking about how approaches to explainable NLP relate to one another; (ii) brief surveys of each of the classes in the taxonomy, with a focus on methods that are relevant for NLP; and (iii) a discussion of the inherent limitations of some classes of methods, as well as how best to evaluate them. The book closes with a list of resources for further research on explainability.
Acknowledgments
Introduction
  Two Common Distinctions
  Local and Global Explanations
  Intrinsic and Post-Hoc Explanations
  Shortcomings of Existing Taxonomies
  Guidotti et al. (2018)
  Adadi and Berrada (2019)
  Carvalho et al. (2019)
  Molnar (2019)
  Zhang et al. (2020)
  Danilevsky et al. (2020)
  Das et al. (2020)
  Atanasova et al. (2020)
  Kotonya and Toni (2020)
  The Method-Form Fallacy
  Inconsistent Classifications
  A Novel Taxonomy
A Framework for Explainable NLP
  NLP Architectures
  Linear and Nonlinear Classification
  Recurrent Models
  Transformers
  Overview of Applications and Architectures
  Local and Global Explanations
  Backward Methods
  Forward Explaining by Intermediate Representations
  Forward Explaining by Continuous Outputs
  Forward Explaining by Discrete Outputs
Local-Backward Explanations
  Vanilla Gradients
  Guided Back-Propagation
  Layer-Wise Relevance Propagation
  Deep Taylor Decomposition
  Integrated Gradients
  DeepLift
Global-Backward Explanations
  Post-Hoc Unstructured Pruning
  Lottery Tickets
  Dynamic Sparse Training
  Binary Networks and Sparse Coding
Local-Forward Explanations of Intermediate Representations
  Gates
  Attention
  Attention Roll-Out and Attention Flow
  Layer-Wise Attention Tracing
  Attention Decoding
Global-Forward Explanations of Intermediate Representations
  Gate Pruning
  Attention Head Pruning
Local-Forward Explanations of Continuous Output
  Word Association Norms
  Word Analogies
  Time Step Dynamics
Global-Forward Explanations of Continuous Output
  Correlation of Representations
  Clustering
  Probing Classifiers
  Concept Activation
  Influential Examples
Local-Forward Explanations of Discrete Output
  Challenge Datasets
  Local Uptraining
  Influential Examples
Global-Forward Explanations of Discrete Output
  Uptraining
  Meta-Analysis
  Downstream Evaluation
Evaluating Explanations
  Flavors of Explanations
  Heuristics
  Human Annotations
  Human Experiments
Perspectives
  General Observations
  Beyond Taxonomy
  Moral Foundations of Explanations
Resources
  Code
  Datasets and Benchmarks
Bibliography
Author's Biography