Mode collapse


In machine learning, mode collapse is a failure mode observed in generative models, originally noted in Generative Adversarial Networks (GANs). It occurs when the model produces outputs that are less diverse than expected, effectively "collapsing" to generate only a few modes of the data distribution while ignoring others. This phenomenon undermines the goal of generative models to capture the full diversity of the training data.


There are typically two stages at which a model can collapse: during initial training or during post-training finetuning.

Mode collapse reduces the practical utility of generative models in downstream applications.

Distinctions

Mode collapse is distinct from overfitting, also called memorization, in which a model learns detailed patterns of the training data that do not generalize to the test data, although the two phenomena share some commonalities.

In terms of learning a probability distribution, mode collapse corresponds to the collapse of the entire distribution to one or a few points, which may or may not correspond to points with high likelihood in the target distribution.

Overfitting, on the other hand, corresponds to learning a distribution that is highly peaked around the training data points. In a sense, it can be seen as a form of near-complete or complete mode collapse in which the modes are most or all of the training data points. However, overfitting is usually caused by overparametrization of the model rather than by the training procedure itself, whereas in GANs mode collapse arises from the training procedure.

Underfitting, by contrast, shares little with mode collapse. In this case the model is insufficiently parametrized or trained, and the learned distribution is far from the target distribution, often remaining close to the distribution at initialization.
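To make the distinction concrete, the following minimal NumPy sketch (the four-mode target and the "collapsed" sampler are hypothetical illustrations, not taken from any cited source) compares a sampler that covers a multi-modal target with one that has collapsed onto a single mode, by counting how many target modes receive any samples at all.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target distribution: a 1-D mixture of four well-separated Gaussian modes.
mode_centers = np.array([-6.0, -2.0, 2.0, 6.0])

def sample_target(n):
    """Draw n samples from the full four-mode mixture."""
    modes = rng.integers(0, len(mode_centers), size=n)
    return mode_centers[modes] + 0.3 * rng.standard_normal(n)

def sample_collapsed(n):
    """A mode-collapsed sampler: every sample comes from a single mode."""
    return mode_centers[0] + 0.3 * rng.standard_normal(n)

def mode_coverage(samples, centers, radius=1.0):
    """Fraction of target modes with at least one nearby sample."""
    return np.mean([np.any(np.abs(samples - c) < radius) for c in centers])

print(mode_coverage(sample_target(1000), mode_centers))     # ~1.0: all modes covered
print(mode_coverage(sample_collapsed(1000), mode_centers))  # 0.25: one mode covered
```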

In GANs

Training-time mode collapse was originally noted and studied in GANs, where it arises primarily from imbalances in the training dynamics between the generator and the discriminator. In the original GAN paper, it was also called the "Helvetica scenario". [1] [2]

Common causes include an imbalance in capacity or learning speed between the generator and the discriminator, as well as the sensitivity of GAN training to hyperparameters and random initialization. [3]

Several GAN-specific strategies were developed to mitigate mode collapse, including separate learning-rate schedules for the generator and discriminator (the two time-scale update rule), [4] minibatch discrimination and feature matching, [5] unrolling discriminator updates during generator training, [6] Wasserstein losses with gradient penalties, [7] large-batch, large-scale training, [8] and spectral normalization of the discriminator weights. [9]
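As a rough illustration of one such technique, the sketch below (assuming PyTorch; the architecture and hyperparameters are illustrative and not taken from the cited papers) trains a small GAN on a 2-D ring of Gaussians, a common test bed for mode collapse, and gives the discriminator a simplified minibatch standard-deviation feature in the spirit of minibatch discrimination, so that a batch of collapsed fakes is easy to reject.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy target: a 2-D mixture of eight Gaussians arranged on a circle,
# a setting in which vanilla GANs frequently drop several modes.
angles = torch.arange(8) * (2 * math.pi / 8)
CENTERS = 2.0 * torch.stack([torch.cos(angles), torch.sin(angles)], dim=1)

def sample_real(n):
    idx = torch.randint(0, 8, (n,))
    return CENTERS[idx] + 0.05 * torch.randn(n, 2)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU(),
                                 nn.Linear(64, 2))

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Discriminator with a minibatch standard-deviation feature: the average
    per-feature spread of the batch is appended to every example, so a batch
    of near-identical (collapsed) samples is easy to classify as fake."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, x):
        batch_std = x.std(dim=0).mean() * torch.ones(x.shape[0], 1)
        return self.net(torch.cat([x, batch_std], dim=1))

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(5000):
    real, z = sample_real(128), torch.randn(128, 16)
    fake = G(z)

    # Discriminator update: push real toward label 1 and fake toward label 0.
    d_loss = (bce(D(real), torch.ones(128, 1)) +
              bce(D(fake.detach()), torch.zeros(128, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update: make the discriminator label fresh fakes as real.
    g_loss = bce(D(G(z)), torch.ones(128, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```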

Finetuning

Large language models are usually trained in two steps. In the first step ("pretraining"), the model is trained to simply generate text sampled from a large dataset. In the second step ("finetuning"), the model is trained to perform specific tasks by training it on a small dataset containing just the task-specific data. For example, to build a chatbot this way, one first pretrains a large transformer model on a few trillion words of text scraped from the Internet, then finetunes it on a few million words of example chatlogs that the model should imitate.
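A minimal sketch of this two-step pipeline is shown below (assuming PyTorch; the tiny recurrent model and the random token streams are placeholders for a real transformer and real corpora). The point is structural: the next-token objective is identical in both stages, and only the data changes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder corpora: a large general stream for pretraining and a much
# smaller task-specific stream (e.g. chatlogs) for finetuning.
VOCAB = 1000
pretrain_corpus = torch.randint(0, VOCAB, (50_000,))
finetune_corpus = torch.randint(0, VOCAB, (2_000,))

class TinyLM(nn.Module):
    """A deliberately small stand-in for a large transformer language model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 64)
        self.rnn = nn.GRU(64, 128, batch_first=True)
        self.head = nn.Linear(128, VOCAB)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.head(h)

def train_next_token(model, corpus, steps, seq_len=32, batch_size=16, lr=1e-3):
    """Standard next-token prediction on random windows of the corpus."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        starts = torch.randint(0, len(corpus) - seq_len - 1, (batch_size,))
        batch = torch.stack([corpus[s:s + seq_len + 1] for s in starts.tolist()])
        inputs, targets = batch[:, :-1], batch[:, 1:]
        loss = loss_fn(model(inputs).reshape(-1, VOCAB), targets.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

model = TinyLM()
train_next_token(model, pretrain_corpus, steps=2000)  # step 1: "pretraining"
train_next_token(model, finetune_corpus, steps=200)   # step 2: "finetuning"
```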

Mode collapse may occur during finetuning: the model learns to generate text that accomplishes the specific task, but loses the ability to generate other forms of text, or collapses to producing only a small subset of the texts that would accomplish the task. It is hypothesized that there is a tradeoff between quality and diversity: given a single pretrained model finetuned for a specific task, more finetuning results in higher average task performance but less diverse outputs, while less finetuning results in lower average performance but more diverse outputs. [10] A similar tradeoff has been observed in image generation models [11] and GAN-based text generators. [12]
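Output diversity in such comparisons is often summarized with simple corpus statistics such as distinct-n, the fraction of unique n-grams among all generated n-grams (the metric choice and the example outputs below are illustrative, not taken from the cited studies). A collapsed set of generations scores markedly lower:

```python
from collections import Counter

def distinct_n(samples, n=2):
    """Distinct-n: the fraction of n-grams across all samples that are unique.
    Lower values indicate less diverse (more collapsed) output."""
    ngrams = Counter()
    for text in samples:
        tokens = text.split()
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0

# Hypothetical outputs from two finetuned models answering the same prompts.
diverse_model = ["the cat sat on the mat", "a dog ran through the park",
                 "birds were singing at dawn"]
collapsed_model = ["the answer is yes", "the answer is yes", "the answer is yes"]

print(distinct_n(diverse_model))    # close to 1.0: nearly all bigrams unique
print(distinct_n(collapsed_model))  # much lower: repeated bigrams dominate
```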

Similarly, mode collapse may occur during reinforcement learning from human feedback (RLHF), for example via reward hacking of the reward model, or through other mechanisms. [13] [14]
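A standard mitigation in RLHF pipelines, common practice rather than something specific to the sources cited here, is to penalize the finetuned policy's divergence from the pretrained reference model with a KL term, which bounds how far the output distribution can drift, and collapse, away from the pretraining distribution:

```latex
\max_{\pi_\theta}\;
\mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}
\Bigl[\, r_\phi(x, y) \;-\; \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)} \,\Bigr]
```

Here π_θ is the policy being finetuned, π_ref is the pretrained reference model, r_φ is the learned reward model, and β controls how strongly the policy is kept close to the reference; in expectation the log-ratio term is the KL divergence between π_θ and π_ref, so a larger β preserves more of the pretrained model's diversity at the cost of lower reward.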


References

  1. Goodfellow, Ian; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). "Generative Adversarial Nets". Advances in Neural Information Processing Systems. 27. Curran Associates, Inc.
  2. Kossale, Youssef; Airaj, Mohammed; Darouichi, Aziz (2022-10-06). "Mode Collapse in Generative Adversarial Networks: An Overview". 2022 8th International Conference on Optimization and Applications (ICOA). IEEE. pp. 1–6. doi:10.1109/ICOA55659.2022.9934291. ISBN 978-1-6654-7681-2.
  3. Lucic, Mario; Kurach, Karol; Michalski, Marcin; Gelly, Sylvain; Bousquet, Olivier (2018). "Are GANs Created Equal? A Large-Scale Study". Advances in Neural Information Processing Systems. 31. Curran Associates, Inc.
  4. Heusel, Martin; Ramsauer, Hubert; Unterthiner, Thomas; Nessler, Bernhard; Hochreiter, Sepp (2017). "GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium". arXiv: 1706.08500 [cs.LG].
  5. Salimans, Tim; Goodfellow, Ian; Zaremba, Wojciech; Cheung, Vicki; Radford, Alec; Chen, Xi (2016). "Improved Techniques for Training GANs". Advances in Neural Information Processing Systems. 29. Curran Associates, Inc.
  6. Metz, Luke; Poole, Ben; Pfau, David; Sohl-Dickstein, Jascha (2016). "Unrolled Generative Adversarial Networks". arXiv: 1611.02163 [cs.LG].
  7. Gulrajani, Ishaan; Ahmed, Faruk; Arjovsky, Martin; Dumoulin, Vincent; Courville, Aaron C (2017). "Improved Training of Wasserstein GANs". Advances in Neural Information Processing Systems. 30. Curran Associates, Inc.
  8. Brock, Andrew; Donahue, Jeff; Simonyan, Karen (2018). "Large Scale GAN Training for High Fidelity Natural Image Synthesis". arXiv: 1809.11096 [cs.LG].
  9. Miyato, Takeru; Kataoka, Toshiki; Koyama, Masanori; Yoshida, Yuichi (2018). "Spectral Normalization for Generative Adversarial Networks". arXiv: 1802.05957 [cs.LG].
  10. Zhang, Hugh; Duckworth, Daniel; Ippolito, Daphne; Neelakantan, Arvind (2020). "Trading off Diversity and Quality in Natural Language Generation". arXiv: 2004.10450 [cs.CL].
  11. Astolfi, Pietro; Careil, Marlene; Hall, Melissa; Mañas, Oscar; Muckley, Matthew; Verbeek, Jakob; Romero Soriano, Adriana; Drozdzal, Michal (2024). "Consistency-diversity-realism Pareto fronts of conditional image generative models". arXiv: 2406.10429 [cs.CV].
  12. Caccia, Massimo; Caccia, Lucas; Fedus, William; Larochelle, Hugo; Pineau, Joelle; Charlin, Laurent (2018). "Language GANs Falling Short". arXiv: 1811.02549 [cs.CL].
  13. Wen, Jiaxin; Zhong, Ruiqi; Khan, Akbir; Perez, Ethan; Steinhardt, Jacob; Huang, Minlie; Bowman, Samuel R.; He, He; Feng, Shi (2024). "Language Models Learn to Mislead Humans via RLHF". arXiv: 2409.12822 [cs.CL].
  14. Casper, Stephen; et al. (2023). "Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback". arXiv: 2307.15217 [cs.AI].