Interpretable DL Playground

This repository is a comprehensive resource for AI practitioners and enthusiasts who want to explore and practice interpretable deep learning techniques. It features a curated collection of practice modules, intuitive explanations, and companion material for the book "Interpretability in Deep Learning" (Springer, 2023).

While we recommend basic deep learning knowledge, we have also included fundamental concepts for the convenience of our readers. Our goal is to help you understand the most prevalent practices of explainable AI, with intuitive explanations of the techniques at the beginning of each chapter that do not delve too deeply into the mathematics.

About the Repository

Are you tired of black-box deep learning models that are difficult to interpret and explain? Look no further! Our repository provides a curated collection of practice modules (to be expanded over time) for interpretable AI and trustworthy model development.

Whether you're an AI practitioner or simply interested in learning more about interpretable deep learning techniques, our repository is a good place to start. But that's not all! We've also included resources to further your understanding of the topic. We plan to expand the practice modules over time to cover more complex topics, so you can apply the techniques hands-on and see the benefits of interpretable AI for yourself.

Research Papers - To Read

| Paper Title | Conference/Journal | Author | Description |
| --- | --- | --- | --- |
| Towards Robust Interpretability with Self-Explaining Neural Networks | NeurIPS 2018 | D. Alvarez-Melis and T. S. Jaakkola | Proposes self-explaining neural networks, which learn to provide interpretable explanations for their predictions by incorporating explicit explanatory factors into the model architecture. |
| Sanity Checks for Saliency Maps | NeurIPS 2018 | Adebayo et al. | Highlights the limitations of saliency methods and proposes sanity checks to ensure that the explanations these methods provide are meaningful and reliable (a minimal saliency sketch follows the paper list below). |
| On the (In)fidelity and Sensitivity of Explanations | NeurIPS 2019 | Yeh et al. | Introduces two metrics, (in)fidelity and sensitivity, to evaluate the quality of explanations provided by various interpretability methods, including saliency maps and CAMs. |
| Invariant Risk Minimization | arXiv 2019 | Arjovsky et al. | Introduces Invariant Risk Minimization, a learning framework that encourages models to rely on features that are invariant across different environments, leading to more robust and interpretable predictions. |
| Explanation by Progressive Exaggeration | ICLR 2020 | Singla et al. | A method that iteratively exaggerates the most important features in the input to generate more robust and interpretable explanations. GitHub |
| Relevance-CAM: Your Model Already Knows Where to Look | CVPR 2021 | Lee et al. | A method to generate class-discriminative visual explanations using pre-trained deep neural networks without additional training or modification. |
| Neural Prototype Trees for Interpretable Fine-Grained Image Recognition | CVPR 2021 | Nauta et al. | A hierarchical approach to interpretable fine-grained image recognition that combines neural networks with decision trees. |
| Concept-Monitor: Understanding DNN training through individual neurons | arXiv, April 2023 | Khan et al. | A framework for demystifying black-box training processes using a unified embedding space and a concept diversity metric, enabling interpretable visualization, improved training performance, and application to various training paradigms. |
Recent and Interesting Archived Papers on ChatGPT for Research:

[1] Differentiate ChatGPT-generated and Human-written Medical Texts | Liao et al. (2023)
[2] In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT | Shen et al. (2023)
[3] Toxicity in ChatGPT: Analyzing Persona-assigned Language Models | Deshpande et al. (2023)

Intriguing Model Improvement Application Paper:

[1] Attention-based Dropout Layer for Weakly Supervised Object Localization | Junsuk Choe, Hyunjung Shim (CVPR 2019) | GitHub
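
Several of the papers above discuss gradient-based saliency maps. As a minimal, hedged sketch of what such an attribution looks like in practice (illustrative only, not a module from this repository; it assumes PyTorch, torchvision, and Pillow are installed, and the ResNet-18 model and "example.jpg" path are placeholders):

```python
# Hypothetical vanilla gradient-saliency sketch (illustrative; not repo code).
# Assumes: torch, torchvision, Pillow installed; "example.jpg" is a placeholder image path.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # any pretrained classifier works
model.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

x = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
x.requires_grad_(True)

# Gradient of the top-class score with respect to the input pixels
scores = model(x)
top_class = scores.argmax(dim=1).item()
scores[0, top_class].backward()

# Saliency map: maximum absolute gradient across the colour channels
saliency = x.grad.abs().max(dim=1).values.squeeze(0)  # shape (224, 224)
print(saliency.shape)
```

Starting from a map like this, the sanity checks of Adebayo et al. can be approximated by, for example, randomizing the model's weights and verifying that the saliency map changes accordingly.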

Citation

If you wish to cite the book "Interpretability in Deep Learning", feel free to use this BibTeX reference:

@book{somani2023interpretability,
  title={Interpretability in Deep Learning},
  author={Somani, Ayush and Horsch, Alexander and Prasad, Dilip K},
  year={2023},
  publisher={Springer Nature}
}

Book Cover

Contributing

Would you like to extend the range of interpretable methods and make AI more trustworthy? Or perhaps contribute a paper implementation? Any sort of contribution is greatly appreciated!
