Content suppresses style: dimensionality collapse in contrastive learning

Abstract

Contrastive learning is a highly successful yet simple self-supervised learning technique that minimizes the representational distance between similar (positive) samples while maximizing it between dissimilar (negative) samples. Despite its success, our theoretical understanding of contrastive learning is still incomplete. Most importantly, it is unclear why the inferred representation suffers a dimensionality collapse after SimCLR training and why downstream performance improves when the feature encoder's last layers (the projector) are removed. We show that collapse might be induced by an inductive bias of the InfoNCE loss that favors features varying little within a positive pair (content) while suppressing more strongly varying features (style). When at least one content variable is present, we prove that a low-rank projector reduces downstream task performance while simultaneously minimizing the InfoNCE objective. This result elucidates a potential reason why removing the projector could lead to better downstream performance. Subsequently, we propose a simple strategy that leverages adaptive temperature factors in the loss to equalize content and style latents, mitigating dimensionality collapse. Finally, we validate our theoretical findings on controlled synthetic data and natural images.
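To make the two central terms concrete, the following is a minimal NumPy sketch of a generic InfoNCE loss with a single scalar temperature, together with one common way to inspect dimensionality collapse: the singular value spectrum of the embedding covariance. It is an illustration under stated assumptions, not the implementation used in the paper. The function names `info_nce` and `singular_value_spectrum`, the cosine-similarity form of the loss, and the default temperature of 0.1 are all choices made for this example.

```python
import numpy as np


def info_nce(z_a, z_b, temperature=0.1):
    """Generic InfoNCE loss (assumed form, scalar temperature).

    Row i of z_a and row i of z_b form a positive pair; every other
    row of z_b serves as a negative for row i of z_a.
    """
    # L2-normalize so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature  # shape (n, n)
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return -np.mean(np.diag(log_prob))


def singular_value_spectrum(z):
    """Singular values (descending) of the covariance of embeddings z.

    Values close to zero indicate directions of the embedding space
    that the representation does not use, i.e. collapsed dimensions.
    """
    z = z - z.mean(axis=0, keepdims=True)
    cov = z.T @ z / (len(z) - 1)
    return np.linalg.svd(cov, compute_uv=False)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(64, 8))

    # Perfectly aligned pairs give a lower loss than unrelated pairs.
    print("aligned loss:  ", info_nce(z, z))
    print("unrelated loss:", info_nce(z, rng.normal(size=(64, 8))))

    # Zeroing half of the dimensions mimics a collapsed representation.
    z_collapsed = z.copy()
    z_collapsed[:, 4:] = 0.0
    print("spectrum:", singular_value_spectrum(z_collapsed))
```

In this toy setting, the zeroed dimensions appear as vanishing singular values, which is the signature usually meant by dimensionality collapse. The adaptive temperature factors proposed in the abstract are not modeled here; the sketch uses a single fixed temperature.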

Patrik Reizinger
PhD candidate

I am a final-year Ph.D. student supervised by Wieland Brendel, Ferenc Huszár, Matthias Bethge, and Bernhard Schölkopf. I am part of the ELLIS and IMPRS-IS programs. I have also spent time at the Vector Institute and at the University of Cambridge.

Wieland Brendel
Principal Investigator (PI)

Wieland Brendel received his Diploma in physics from the University of Regensburg (2010) and his Ph.D. in computational neuroscience from the École normale supérieure in Paris (2014). He joined the University of Tübingen as a postdoctoral researcher in the group of Matthias Bethge, became a Principal Investigator and Team Lead in the Tübingen AI Center (2018) and an Emmy Noether Group Leader for Robust Machine Learning (2020). In May 2022, Wieland joined the Max Planck Institute for Intelligent Systems as an independent Group Leader and is now a Hector-endowed Fellow at the ELLIS Institute Tübingen (since September 2023). He received the 2023 German Pattern Recognition Award for his substantial contributions to robust, generalisable and interpretable machine vision. Aside from his research, Wieland co-founded a nationwide school competition (bw-ki.de) and a machine learning startup focused on visual quality control.