Cohere For AI - Guest Speaker: Walid Bousselham, PhD Student

Date: May 27, 2024

Time: 6:00 PM - 7:00 PM

Location: Online

Abstract: Vision Transformers (ViTs), with their ability to model long-range dependencies through self-attention mechanisms, have become a standard architecture in computer vision. However, the interpretability of these models remains a challenge. To address this, we propose LeGrad, an explainability method specifically designed for ViTs. LeGrad computes the gradient with respect to the attention maps of ViT layers, treating the gradient itself as the explainability signal. We aggregate the signal over all layers, combining the activations of the last as well as intermediate tokens to produce the merged explainability map. This makes LeGrad a conceptually simple and easy-to-implement tool for enhancing the transparency of ViTs. We evaluate LeGrad in challenging segmentation, perturbation, and open-vocabulary settings, showcasing its versatility compared to other state-of-the-art explainability methods and demonstrating its superior spatial fidelity and robustness to perturbations.
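The aggregation step described above can be sketched in a few lines. The following is a minimal, illustrative sketch only, not the authors' reference implementation: it assumes the per-layer gradients of a score with respect to the attention maps have already been computed (here they are random stand-ins), and the shapes, layer count, and the choice to read off the CLS-token row are assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)
num_layers, num_heads, num_tokens = 4, 3, 197  # 14x14 = 196 patches + 1 CLS token

# Stand-ins for d(score)/d(attention_map) at each layer; in a real ViT
# these would come from autograd, not random sampling.
grads = [rng.standard_normal((num_heads, num_tokens, num_tokens))
         for _ in range(num_layers)]

def merge_gradient_signals(per_layer_grads):
    """Merge per-layer gradient signals into one explainability heatmap."""
    maps = []
    for g in per_layer_grads:
        g = np.maximum(g, 0.0)   # keep positive evidence only
        g = g.mean(axis=0)       # average over attention heads
        maps.append(g[0, 1:])    # CLS row -> relevance of the 196 patch tokens
    merged = np.stack(maps).mean(axis=0)  # aggregate over layers
    # normalise to [0, 1] for visualisation
    span = merged.max() - merged.min()
    merged = (merged - merged.min()) / (span + 1e-8)
    return merged.reshape(14, 14)         # back to the patch grid

heatmap = merge_gradient_signals(grads)
print(heatmap.shape)  # (14, 14)
```

In practice the resulting 14x14 map would be upsampled to the input resolution and overlaid on the image; the key conceptual point from the abstract is that the gradient itself, aggregated across layers, serves as the explainability signal.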

About the speaker: I'm a PhD student at Bonn University, advised by Prof. Hilde Kuehne. I'm also participating in the MIT-IBM Watson Sight and Sound Project. My primary research area is deep learning for multimodal models; in particular, I am interested in zero-shot adaptation of pretrained models for emerging behavior. Prior to this, I completed my Master of Engineering in Applied Mathematics at ENSTA Paris in France and my Master of Science in Statistics and Applied Probabilities at the National University of Singapore (NUS).
