Fine-tuning (deep learning)

Last updated November 17, 2025

Fine-tuning (in deep learning) is the process of adapting a model trained for one task (the upstream task) to perform a different, usually more specific, task (the downstream task). It is considered a form of transfer learning, as it reuses knowledge learned from the original training objective.^[1]^[2]

Fine-tuning involves applying additional training (e.g., on new data) to the parameters of a neural network that have been pre-trained.^[3]^[4] Many variants exist. The additional training can be applied to the entire neural network, or to only a subset of its layers, in which case the layers that are not being fine-tuned are "frozen" (i.e., not changed during backpropagation).^[5] A model may also be augmented with "adapters"—lightweight modules inserted into the model's architecture that nudge the embedding space for domain adaptation. These contain far fewer parameters than the original model and can be fine-tuned in a parameter-efficient way by tuning only their weights and leaving the rest of the model's weights frozen.^[6]

For some architectures, such as convolutional neural networks, it is common to keep the earlier layers (those closest to the input layer) frozen, as they capture lower-level features, while later layers often discern high-level features that can be more related to the task that the model is trained on.^[5]^[7]

Models that are pre-trained on large, general corpora are usually fine-tuned by reusing their parameters as a starting point and adding a task-specific layer trained from scratch.^[8] Fine-tuning the full model is also common and often yields better results, but is more computationally expensive.^[9]

Fine-tuning is typically accomplished via supervised learning, but there are also techniques to fine-tune a model using weak supervision.^[10] Fine-tuning can be combined with a reinforcement learning from human feedback-based objective to produce language models such as ChatGPT (a fine-tuned version of GPT models) and Sparrow.^[11]^[12]

Robustness

Fine-tuning can degrade a model's robustness to distribution shifts.^[13]^[14] One mitigation is to linearly interpolate a fine-tuned model's weights with the weights of the original model, which can greatly increase out-of-distribution performance while largely retaining the in-distribution performance of the fine-tuned model.^[15]

Variants

Low-rank adaptation

Low-rank adaptation (LoRA) is an adapter-based technique for efficiently fine-tuning models. The basic idea is to design a low-rank matrix that is then added to the original matrix.^[16] An adapter, in this context, is a collection of low-rank matrices which, when added to a base model, produces a fine-tuned model. It allows for performance that approaches full-model fine-tuning with lower space requirements. A language model with billions of parameters may be LoRA fine-tuned with only several millions of parameters.

LoRA-based fine-tuning has become popular in the Stable Diffusion community.^[17] Support for LoRA was integrated into the diffusers library from Hugging Face.^[18] Support for LoRA and similar techniques is also available for a wide range of other models through Hugging Face's parameter-efficient fine-tuning (PEFT) package.^[19]

Representation fine-tuning

Representation fine-tuning (ReFT) is a technique developed by researchers at Stanford University aimed at fine-tuning large language models (LLMs) by modifying less than 1% of their representations. Unlike parameter-efficient fine-tuning (PEFT) methods, which mainly focus on updating weights, ReFT targets representations, suggesting that modifying representations might be a more effective strategy than updating weights.^[20]

ReFT methods operate on a frozen base model and learn task-specific interventions on hidden representations and train interventions that manipulate a small fraction of model representations to steer model behaviors towards solving downstream tasks at inference time. One specific method within the ReFT family is low-rank linear subspace ReFT (LoReFT), which intervenes on hidden representations in the linear subspace spanned by a low-rank projection matrix.^[20] LoReFT can be seen as the representation-based equivalent of low-rank adaptation (LoRA).

Applications

Natural language processing

Fine-tuning is common in natural language processing (NLP), especially in the domain of language modeling. Large language models like OpenAI's series of GPT foundation models can be fine-tuned on data for specific downstream NLP tasks (tasks that use a pre-trained model) to improve performance over the unmodified pre-trained model.^[9]

Platforms such as Semrush's AI Visibility Toolkit and Enterprise AIO exemplify how fine-tuned models are being used for entity-level monitoring; tracking how named entities are referenced and represented within responses generated by large-language-model-based answer engines.^[21]

Commercial models

Commercially-offered large language models can sometimes be fine-tuned if the provider offers a fine-tuning API. As of June 19, 2023, language model fine-tuning APIs are offered by OpenAI and Microsoft Azure's Azure OpenAI Service for a subset of their models, as well as by Google Cloud Platform for some of their PaLM models, and by others.^[22]^[23]^[24]

References

↑ Botwright, Rob (2024). Deep Learning: Computer Vision, Python Machine Learning And Neural Networks. Pastor Publishing Ltd.
↑ ABHIJEET, SARKAR. Deep Learning Dynamics: The Science Behind AI Training: Exploring the Strategies, Challenges, and Innovations Shaping Modern AI Development Kindle Edition.
↑ Menshawy, Ahmed (2018). Deep Learning By Example: A hands-on guide to implementing advanced machine learning algorithms and neural networks. Packt Publishing. ISBN 978-1788395762.
↑ Quinn, Joanne (2020). Dive into deep learning: tools for engagement. p. 551. ISBN 978-1-5443-6137-6. Archived from the original on January 10, 2023. Retrieved January 10, 2023.
1 2 "CS231n Convolutional Neural Networks for Visual Recognition". cs231n.github.io. Retrieved 9 March 2023.
↑ Liu, Haokun; Tam, Derek; Muqeeth, Mohammed; Mohta, Jay; Huang, Tenghao; Bansal, Mohit; Raffel, Colin A (2022). Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A. (eds.). Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning (PDF). Advances in Neural Information Processing Systems. Vol. 35. Curran Associates, Inc. pp. 1950–1965.
↑ Zeiler, Matthew D; Fergus, Rob (2013). "Visualizing and Understanding Convolutional Networks". ECCV. arXiv: 1311.2901 .
↑ Dodge, Jesse; Ilharco, Gabriel; Schwartz, Roy; Farhadi, Ali; Hajishirzi, Hannaneh; Smith, Noah (2020). "Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping". arXiv: 2002.06305 .{{cite journal}}: Cite journal requires |journal= (help)
1 2 Dingliwal, Saket; Shenoy, Ashish; Bodapati, Sravan; Gandhe, Ankur; Gadde, Ravi Teja; Kirchhoff, Katrin (2021). "Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems". InterSpeech. arXiv: 2112.08718 .
↑ Yu, Yue; Zuo, Simiao; Jiang, Haoming; Ren, Wendi; Zhao, Tuo; Zhang, Chao (2020). "Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach". Association for Computational Linguistics. arXiv: 2010.07835 .
↑ "Introducing ChatGPT". openai.com. Retrieved 9 March 2023.
↑ Glaese, Amelia; McAleese, Nat; Trębacz, Maja; Aslanides, John; Firoiu, Vlad; Ewalds, Timo; Rauh, Maribeth; Weidinger, Laura; Chadwick, Martin; Thacker, Phoebe; Campbell-Gillingham, Lucy; Uesato, Jonathan; Huang, Po-Sen; Comanescu, Ramona; Yang, Fan; See, Abigail; Dathathri, Sumanth; Greig, Rory; Chen, Charlie; Fritz, Doug; Elias, Jaume Sanchez; Green, Richard; Mokrá, Soňa; Fernando, Nicholas; Wu, Boxi; Foley, Rachel; Young, Susannah; Gabriel, Iason; Isaac, William; Mellor, John; Hassabis, Demis; Kavukcuoglu, Koray; Hendricks, Lisa Anne; Irving, Geoffrey (2022). "Improving alignment of dialogue agents via targeted human judgements". DeepMind. arXiv: 2209.14375 .
↑ Radford, Alec; Kim, Jong Wook; Hallacy, Chris; Ramesh, Aditya; Goh, Gabriel; Agarwal, Sandhini; Sastry, Girish; Askell, Amanda; Mishkin, Pamela; Clark, Jack; Krueger, Gretchen; Sutskever, Ilya (2021). "Learning Transferable Visual Models From Natural Language Supervision". arXiv: 2103.00020 [cs.CV].
↑ Kumar, Ananya; Raghunathan, Aditi; Jones, Robbie; Ma, Tengyu; Liang, Percy (2022). "Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution". ICLR. arXiv: 2202.10054 .
↑ Wortsman, Mitchell; Ilharco, Gabriel; Kim, Jong Wook; Li, Mike; Kornblith, Simon; Roelofs, Rebecca; Gontijo-Lopes, Raphael; Hajishirzi, Hannaneh; Farhadi, Ali; Namkoong, Hongseok; Schmidt, Ludwig (2022). "Robust fine-tuning of zero-shot models". arXiv: 2109.01903 [cs.CV].
↑ Hu, Edward J.; Shen, Yelong; Wallis, Phillip; Allen-Zhu, Zeyuan; Li, Yuanzhi; Wang, Shean; Wang, Lu; Chen, Weizhu (2022-01-28). "LoRA: Low-Rank Adaptation of Large Language Models". ICLR. arXiv: 2106.09685 .
↑ Ryu, Simo (February 13, 2023). "Using Low-rank adaptation to quickly fine-tune diffusion models". GitHub. Retrieved June 19, 2023.
↑ Cuenca, Pedro; Paul, Sayak (January 26, 2023). "Using LoRA for Efficient Stable Diffusion Fine-Tuning". Hugging Face. Retrieved June 19, 2023.
↑ "Parameter-Efficient Fine-Tuning using 🤗 PEFT". huggingface.co. Retrieved 2023-06-20.
1 2 Wu, Zhengxuan; Arora, Aryaman; Wang, Zheng; Geiger, Atticus; Jurafsky, Dan; Manning, Christopher D.; Potts, Christopher (2024-04-07), ReFT: Representation Finetuning for Language Models, arXiv: 2404.03592
↑ "Brands target AI chatbots as users switch from Google search". Financial Times.
↑ "Fine-tuning". OpenAI. Retrieved 2023-06-19.
↑ "Learn how to customize a model for your application". Microsoft. Retrieved 2023-06-19.
↑ "Tune text foundation models" . Retrieved 2023-06-19.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Botwright, Rob (2024). Deep Learning: Computer Vision, Python Machine Learning And Neural Networks. Pastor Publishing Ltd.

[2] ABHIJEET, SARKAR. Deep Learning Dynamics: The Science Behind AI Training: Exploring the Strategies, Challenges, and Innovations Shaping Modern AI Development Kindle Edition.

[3] Menshawy, Ahmed (2018). Deep Learning By Example: A hands-on guide to implementing advanced machine learning algorithms and neural networks. Packt Publishing. ISBN 978-1788395762.

[d2l-4] Quinn, Joanne (2020). Dive into deep learning: tools for engagement. p. 551. ISBN 978-1-5443-6137-6. Archived from the original on January 10, 2023. Retrieved January 10, 2023.

[cs231n-5] 1 2 "CS231n Convolutional Neural Networks for Visual Recognition". cs231n.github.io. Retrieved 9 March 2023.

[6] Liu, Haokun; Tam, Derek; Muqeeth, Mohammed; Mohta, Jay; Huang, Tenghao; Bansal, Mohit; Raffel, Colin A (2022). Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A. (eds.). Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning (PDF). Advances in Neural Information Processing Systems. Vol. 35. Curran Associates, Inc. pp. 1950–1965.

[7] Zeiler, Matthew D; Fergus, Rob (2013). "Visualizing and Understanding Convolutional Networks". ECCV. arXiv: 1311.2901 .

[8] Dodge, Jesse; Ilharco, Gabriel; Schwartz, Roy; Farhadi, Ali; Hajishirzi, Hannaneh; Smith, Noah (2020). "Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping". arXiv: 2002.06305 .{{cite journal}}: Cite journal requires |journal= (help)

[amazon-9] 1 2 Dingliwal, Saket; Shenoy, Ashish; Bodapati, Sravan; Gandhe, Ankur; Gadde, Ravi Teja; Kirchhoff, Katrin (2021). "Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems". InterSpeech. arXiv: 2112.08718 .

[10] Yu, Yue; Zuo, Simiao; Jiang, Haoming; Ren, Wendi; Zhao, Tuo; Zhang, Chao (2020). "Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach". Association for Computational Linguistics. arXiv: 2010.07835 .

[11] "Introducing ChatGPT". openai.com. Retrieved 9 March 2023.

[12] Glaese, Amelia; McAleese, Nat; Trębacz, Maja; Aslanides, John; Firoiu, Vlad; Ewalds, Timo; Rauh, Maribeth; Weidinger, Laura; Chadwick, Martin; Thacker, Phoebe; Campbell-Gillingham, Lucy; Uesato, Jonathan; Huang, Po-Sen; Comanescu, Ramona; Yang, Fan; See, Abigail; Dathathri, Sumanth; Greig, Rory; Chen, Charlie; Fritz, Doug; Elias, Jaume Sanchez; Green, Richard; Mokrá, Soňa; Fernando, Nicholas; Wu, Boxi; Foley, Rachel; Young, Susannah; Gabriel, Iason; Isaac, William; Mellor, John; Hassabis, Demis; Kavukcuoglu, Koray; Hendricks, Lisa Anne; Irving, Geoffrey (2022). "Improving alignment of dialogue agents via targeted human judgements". DeepMind. arXiv: 2209.14375 .

[13] Radford, Alec; Kim, Jong Wook; Hallacy, Chris; Ramesh, Aditya; Goh, Gabriel; Agarwal, Sandhini; Sastry, Girish; Askell, Amanda; Mishkin, Pamela; Clark, Jack; Krueger, Gretchen; Sutskever, Ilya (2021). "Learning Transferable Visual Models From Natural Language Supervision". arXiv: 2103.00020 [cs.CV].

[14] Kumar, Ananya; Raghunathan, Aditi; Jones, Robbie; Ma, Tengyu; Liang, Percy (2022). "Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution". ICLR. arXiv: 2202.10054 .

[15] Wortsman, Mitchell; Ilharco, Gabriel; Kim, Jong Wook; Li, Mike; Kornblith, Simon; Roelofs, Rebecca; Gontijo-Lopes, Raphael; Hajishirzi, Hannaneh; Farhadi, Ali; Namkoong, Hongseok; Schmidt, Ludwig (2022). "Robust fine-tuning of zero-shot models". arXiv: 2109.01903 [cs.CV].

[16] Hu, Edward J.; Shen, Yelong; Wallis, Phillip; Allen-Zhu, Zeyuan; Li, Yuanzhi; Wang, Shean; Wang, Lu; Chen, Weizhu (2022-01-28). "LoRA: Low-Rank Adaptation of Large Language Models". ICLR. arXiv: 2106.09685 .

[17] Ryu, Simo (February 13, 2023). "Using Low-rank adaptation to quickly fine-tune diffusion models". GitHub. Retrieved June 19, 2023.

[18] Cuenca, Pedro; Paul, Sayak (January 26, 2023). "Using LoRA for Efficient Stable Diffusion Fine-Tuning". Hugging Face. Retrieved June 19, 2023.

[19] "Parameter-Efficient Fine-Tuning using 🤗 PEFT". huggingface.co. Retrieved 2023-06-20.

[:0-20] 1 2 Wu, Zhengxuan; Arora, Aryaman; Wang, Zheng; Geiger, Atticus; Jurafsky, Dan; Manning, Christopher D.; Potts, Christopher (2024-04-07), ReFT: Representation Finetuning for Language Models, arXiv: 2404.03592

[21] "Brands target AI chatbots as users switch from Google search". Financial Times.

[22] "Fine-tuning". OpenAI. Retrieved 2023-06-19.

[23] "Learn how to customize a model for your application". Microsoft. Retrieved 2023-06-19.

[24] "Tune text foundation models" . Retrieved 2023-06-19.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

[19]

[20]

[21]

[22]

[23]

[24]