\[ \renewcommand{\d}{{\bf{d}}} \renewcommand{\b}{{\bf{b}}} \newcommand{\J}{{\bf{J}}} \newcommand{\A}{\bf{A}} \newcommand{\B}{\bf{B}} \newcommand{\RR}{\mathbf{R}} \newcommand{\h}{{\bf{h}}} \newcommand{\x}{{\bf{x}}} \newcommand{\bfa}{{\bf{a}}} \newcommand{\bfb}{{\bf{b}}} \newcommand{\bfc}{{\bf{c}}} \newcommand{\y}{{\bf{y}}} \newcommand{\z}{{\bf{z}}} \newcommand{\w}{{\bf{w}}} \newcommand{\f}{{\bf{f}}} \newcommand{\tf}{{\bf{\tilde f}}} \newcommand{\tx}{{\bf{\tilde x}}} \renewcommand{\d}{{\rm{d}}} \newcommand{\s}{{\bf{s}}} \newcommand{\g}{{\bf{g}}} \newcommand{\W}{{\bf{W}}} \newcommand{\vol}{{\operatorname{vol}}} \newcommand{\zz}{\mathbf{z}} \newcommand{\xx}{\mathbf{x}} \newcommand{\bdelta}{\bm{\delta}} \renewcommand{\H}{\mathbf{H}} \newcommand{\txx}{{\tilde{\mathbf{x}}}} \newcommand{\tzz}{{\tilde{\mathbf{z}}}} \newcommand{\tyy}{{\tilde{\mathbf{y}}}} \newcommand{\invf}{f^{-1}} \newcommand{\Sp}{\mathbb{S}} \]

Contrastive Learning Inverts the Data Generating Process

Roland S. Zimmermann*
University of Tübingen & IMPRS-IS
Yash Sharma*
University of Tübingen & IMPRS-IS
Steffen Schneider*
University of Tübingen & IMPRS-IS & EPFL
Matthias Bethge
University of Tübingen
Wieland Brendel
University of Tübingen

tl;dr: We show that a popular contrastive learning method can invert the data generating process and find the factors of variation underlying the data. Our findings may explain the empirical success of contrastive learning and pave the way towards more effective contrastive learning losses.


May '21 The paper was accepted at ICML 2021!
February '21 The pre-print is now available on arXiv: arxiv.org/abs/2102.08850
December '20 A shorter workshop version of the paper was accepted for poster presentation at the NeurIPS 2020 Workshop on Self-Supervised Learning - Theory and Practice.


Contrastive learning has recently seen tremendous success in self-supervised learning. So far, however, it is largely unclear why the learned representations generalize so effectively to a large variety of downstream tasks. We here prove that feedforward models trained with objectives belonging to the commonly used InfoNCE family learn to implicitly invert the underlying generative model of the observed data. While the proofs make certain statistical assumptions about the generative model, we observe empirically that our findings hold even if these assumptions are severely violated. Our theory highlights a fundamental connection between contrastive learning, generative modeling, and nonlinear independent component analysis, thereby furthering our understanding of the learned representations as well as providing a theoretical foundation to derive more effective contrastive losses.

Overview: We analyze the setup of contrastive learning, in which a feature encoder \(f\) is trained with the InfoNCE objective (Gutmann & Hyvärinen, 2012; van den Oord et al., 2018; Chen et al., 2020) using positive samples (green) and negative samples (orange). We assume the observations are generated by an (unknown) injective generative model \(g\) that maps unobservable latent variables from a hypersphere to observations in another manifold. Under these assumptions, the feature encoder \(f\) implicitly learns to invert the ground-truth generative process \(g\) up to linear transformations, i.e., \(f = \mathbf{A} g^{-1}\) with an orthogonal matrix \(\mathbf{A}\), if \(f\) minimizes the InfoNCE objective.


  1. We establish a theoretical connection between the InfoNCE family of objectives, which is commonly used in self-supervised learning, and nonlinear ICA. We show that training with InfoNCE inverts the data generating process if certain statistical assumptions on the data generating process hold.
  2. We empirically verify our predictions when the assumed theoretical conditions are fulfilled. In addition, we show successful inversion of the data generating process even if theoretical assumptions are partially violated.
  3. We build on top of the CLEVR rendering pipeline (Johnson et al., 2017) to generate a more visually complex disentanglement benchmark, called 3DIdent, that contains hallmarks of natural environments (shadows, different lighting conditions, a 3D object, etc.). We demonstrate that a contrastive loss derived from our theoretical framework can identify the ground-truth factors of such complex, high-resolution images.


We start with the well-known formulation of a contrastive loss (often called InfoNCE), \[ L(f; \tau, M) := \underset{\substack{ (\x, \tx) \sim p_\mathsf{pos} \\ \{\xx^-_i\}_{i=1}^M \overset{\text{i.i.d.}}{\sim} p_\mathsf{data} }}{\mathbb{E}} \left[\, {- \log \frac{e^{f(\xx)^{\mathsf{T}} f(\tx) / \tau }}{e^{f(\xx)^{\mathsf{T}} f(\tx) / \tau } + \sum\limits_{i=1}^M e^{f(\xx^-_i)^{\mathsf{T}} f(\tx) / \tau }}}\,\right]. \nonumber \]
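As a rough illustration of the loss above, the following NumPy sketch computes a Monte Carlo estimate of \(L(f; \tau, M)\) from precomputed encodings. The function name `info_nce`, the helper `normalize`, the array shapes, and the random stand-in encodings are assumptions made for this example. It is not the reference implementation from our repository.

```python
import numpy as np


def normalize(v):
    """Project vectors onto the unit hypersphere (last axis)."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def info_nce(f_x, f_x_tilde, f_neg, tau=0.1):
    """Batch estimate of the InfoNCE loss L(f; tau, M).

    f_x:       (B, D) encodings f(x) of the samples x
    f_x_tilde: (B, D) encodings f(x~) of their positive partners
    f_neg:     (B, M, D) encodings f(x_i^-) of M negatives per pair
    """
    # Similarity of the positive pair, f(x)^T f(x~) / tau, shape (B,)
    pos_logit = np.sum(f_x * f_x_tilde, axis=-1) / tau
    # Similarities of the negatives, f(x_i^-)^T f(x~) / tau, shape (B, M)
    neg_logits = np.einsum("bmd,bd->bm", f_neg, f_x_tilde) / tau
    # The denominator contains the positive term plus all M negative terms
    logits = np.concatenate([pos_logit[:, None], neg_logits], axis=1)
    # Numerically stable log-sum-exp over the M + 1 terms
    m = logits.max(axis=1, keepdims=True)
    log_denominator = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    # -log(numerator / denominator), averaged over the batch
    return float(np.mean(log_denominator - pos_logit))


# Usage with random stand-in encodings (assumed sizes: B=8, M=16, D=10)
rng = np.random.default_rng(0)
f_x = normalize(rng.standard_normal((8, 10)))
f_x_tilde = normalize(f_x + 0.1 * rng.standard_normal((8, 10)))
f_neg = normalize(rng.standard_normal((8, 16, 10)))
loss = info_nce(f_x, f_x_tilde, f_neg, tau=0.1)
print(loss)
```

In practice the encodings would come from the trained encoder \(f\), and the negatives are often taken from the other samples in the batch rather than drawn separately.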

Our theoretical approach consists of three steps:

  • We demonstrate that the contrastive loss \(L\) can be interpreted as the cross-entropy between the (conditional) ground-truth and an inferred latent distribution.
  • Next, we show that encoders minimizing the contrastive loss preserve distances, i.e., two latent vectors with distance \(\alpha\) in the ground-truth generative model are mapped to points with the same distance \(\alpha\) in the inferred representation.
  • Finally, we use distance preservation to show that minimizers of the contrastive loss \(L\) invert the generative process up to certain invertible linear transformations.
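The link between distance preservation and orthogonal transformations can be checked numerically in the easy direction: an orthogonal matrix leaves all pairwise distances on the hypersphere unchanged, while a generic invertible linear map does not. The sketch below only illustrates this direction and is not a substitute for the proof. All names, dimensions, and the choice of a diagonal scaling as the counterexample are assumptions for this example.

```python
import numpy as np


def sample_sphere(n, d, rng):
    """Draw n points uniformly from the unit hypersphere in R^d."""
    z = rng.standard_normal((n, d))
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def random_orthogonal(d, rng):
    """Random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q


def pairwise_distances(z):
    """Euclidean distances between all pairs of rows of z, shape (n, n)."""
    diff = z[:, None, :] - z[None, :, :]
    return np.linalg.norm(diff, axis=-1)


rng = np.random.default_rng(0)
z = sample_sphere(100, 5, rng)        # stand-in for ground-truth latents
A = random_orthogonal(5, rng)         # orthogonal transformation
S = np.diag([1.0, 2.0, 1.0, 1.0, 1.0])  # invertible but not orthogonal

d_true = pairwise_distances(z)
max_gap_orthogonal = np.abs(d_true - pairwise_distances(z @ A.T)).max()
max_gap_scaled = np.abs(d_true - pairwise_distances(z @ S.T)).max()

print("largest distance change under A:", max_gap_orthogonal)
print("largest distance change under S:", max_gap_scaled)
```

The first gap is zero up to floating-point error, whereas the second is clearly nonzero, which matches the intuition that only orthogonal maps are compatible with the distance-preservation property.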

We follow this approach for the contrastive loss \(L\) defined above and also use our theory as a starting point to design new contrastive losses (e.g., for latents within a hypercube). We validate predictions regarding identifiability of the latent variables (up to a transformation) with extensive experiments.
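One simple way to quantify identifiability up to a linear transformation is to fit an affine map from the inferred representation to the ground-truth latents and report the coefficient of determination \(R^2\). The sketch below shows this on synthetic data. The function name `linear_r2`, the sample sizes, and the synthetic latents are assumptions for illustration, and the evaluation code in our repository may differ in its details.

```python
import numpy as np


def linear_r2(z_true, z_hat):
    """R^2 of the least-squares affine map from z_hat to z_true.

    A value close to 1 means z_true is recovered from z_hat up to an
    affine transformation; a value close to 0 means no linear relation.
    """
    # Append a constant column so the fit includes an offset
    X = np.hstack([z_hat, np.ones((len(z_hat), 1))])
    W, *_ = np.linalg.lstsq(X, z_true, rcond=None)
    residual = z_true - X @ W
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((z_true - z_true.mean(axis=0)) ** 2)
    return float(1.0 - ss_res / ss_tot)


rng = np.random.default_rng(0)

# Synthetic ground-truth latents on the unit hypersphere (assumed n=2000, d=5)
z_true = rng.standard_normal((2000, 5))
z_true /= np.linalg.norm(z_true, axis=1, keepdims=True)

# Case 1: representation equals the latents up to an orthogonal matrix
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
r2_identified = linear_r2(z_true, z_true @ Q.T)

# Case 2: representation is unrelated to the latents
z_unrelated = rng.standard_normal((2000, 5))
r2_unrelated = linear_r2(z_true, z_unrelated)

print("R^2, orthogonally transformed latents:", r2_identified)
print("R^2, unrelated representation:", r2_unrelated)
```

Because the score is computed on the same samples used for the fit, it is slightly optimistic. A held-out split would give a more conservative estimate.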


We introduce 3DIdent, a dataset with hallmarks of natural environments (shadows, different lighting conditions, 3D rotations, etc.). We have publicly released the full dataset, including both the training and test sets. Reference code for evaluation is available in our repository.
3DIdent: Influence of the latent factors \(\z\) on the renderings \(\x\). Each column corresponds to a traversal in one of the ten latent dimensions while the other dimensions are kept fixed.

Acknowledgements & Funding

We thank Ivan Ustyuzhaninov, David Klindt, Lukas Schott and Luisa Eck for helpful discussions. We thank Bozidar Antic, Shubham Krishna and Jugoslav Stojcheski for ideas on the design of 3DIdent.

We thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting RSZ, YS and StS. StS acknowledges his membership in the European Laboratory for Learning and Intelligent Systems (ELLIS) PhD program. We acknowledge support from the German Federal Ministry of Education and Research (BMBF) through the Competence Center for Machine Learning (TUE.AI, FKZ 01IS18039A) and the Bernstein Computational Neuroscience Program Tübingen (FKZ: 01GQ1002). WB acknowledges support via his Emmy Noether Research Group funded by the German Science Foundation (DFG) under grant no. BR 6382/1-1 as well as support by Open Philanthropy and the Good Ventures Foundation.


If you find our analysis helpful, please cite our paper:

@article{zimmermann2021contrastive,
  author = {
    Zimmermann, Roland S. and
    Sharma, Yash and
    Schneider, Steffen and
    Bethge, Matthias and
    Brendel, Wieland
  },
  title = {
    Contrastive Learning Inverts
    the Data Generating Process
  },
  journal = {CoRR},
  volume = {abs/2102.08850},
  year = {2021},
}