Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models
Nishad Singhi*, Jae Myung Kim, Karsten Roth, Zeynep Akata
Abstract
Concept Bottleneck Models (CBMs) ground image classification in human-understandable concepts, which allows for interpretable model decisions as well as human interventions: expert users can correct mispredicted concepts to influence the model's decision in an interpretable way. However, existing approaches often require numerous human interventions per image to achieve strong performance, posing practical challenges in scenarios where obtaining human feedback is expensive. In this paper, we find that this is driven by an independent treatment of concepts during intervention, wherein a change to one concept does not influence the use of other ones. To address this issue, we introduce a trainable concept intervention realignment module, which leverages concept relations to realign concept assignments post-intervention. Across standard benchmarks, we find that concept realignment significantly improves intervention efficacy and reduces the number of interventions needed to reach a target classification performance or concept prediction accuracy. Moreover, it integrates easily into existing concept-based architectures without requiring changes to the models themselves. This reduced cost of human-model collaboration is crucial to enhance the feasibility of CBMs in resource-constrained environments. Our code is available at https://github.com/ExplainableML/
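The contrast between independent intervention and post-intervention realignment can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `intervene` and `realign`, the single linear layer, and the hand-set weights `W` and `b` are all assumptions made for illustration, standing in for a trained realignment module.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def intervene(c_pred, mask, c_true):
    """Independent intervention: overwrite only the concepts the expert corrected."""
    return np.where(mask, c_true, c_pred)

def realign(c_pred, mask, c_true, W, b):
    """Hypothetical realignment step (an assumption, not the paper's module):
    re-predict the untouched concepts from the intervened concept vector and
    the intervention mask, while keeping the expert-provided values fixed."""
    c_int = intervene(c_pred, mask, c_true)
    features = np.concatenate([c_int, mask.astype(float)])
    c_new = sigmoid(W @ features + b)
    return np.where(mask, c_true, c_new)

# Toy example with three concepts; only concept 0 is corrected by the expert.
c_pred = np.array([0.2, 0.3, 0.9])      # model's concept probabilities
c_true = np.array([1.0, 1.0, 1.0])      # ground-truth concept values
mask = np.array([True, False, False])   # which concepts were intervened on

# Hand-set weights standing in for a trained module (illustrative only):
# concept 1 is made to follow concept 0, concept 2 follows its own prediction.
W = np.zeros((3, 6))
W[1, 0] = 4.0
W[2, 2] = 4.0
b = np.array([0.0, -2.0, -2.0])

independent = intervene(c_pred, mask, c_true)
realigned = realign(c_pred, mask, c_true, W, b)
print("independent:", independent)
print("realigned:  ", realigned)
```

Under independent intervention, concept 1 keeps its original (wrong) prediction even though the correlated concept 0 was corrected. In the realigned version, the single correction to concept 0 also moves concept 1 toward the correct value, which is the effect the abstract attributes to exploiting concept relations.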