Toloka

Type of site: Crowdsourcing, Microwork
Available in: English, Russian, Spanish, French, Arabic, etc. [1]
Founded: 2014
Country of origin: Russia, Switzerland [2] [3]
Owner: Yandex Inc.
Founder(s): Olga Megorskaya
URL: toloka.ai

Toloka is a crowdsourcing platform and microtasking project launched by Yandex in 2014 [2] to quickly label large amounts of data, which are then used for machine learning and for improving search algorithms. [4] The proposed tasks are usually simple and require no special training from the performer. [2] Most tasks are designed to improve algorithms used by modern technologies spanning self-driving vehicles, smart web searches, advanced voice assistants, and e-commerce.[ citation needed ] Upon completing each task, the performer receives a reward based on the volume of data processed, such as images, videos, and unstructured text. [3] The service has two app versions, for Android and iOS.

About Toloka

Origin of the platform's name

A toloka was a form of mutual assistance among villagers of Russia, Ukraine, Belarus, Estonia, Latvia, and Lithuania. It was organized in villages to perform urgent work requiring a large number of workers, such as harvesting, logging, or building houses. A toloka was sometimes also organized for community works, such as building churches, schools, or roads. [3]

Types of tasks and scope of results

Data labeling helps improve search quality and tune the result-ranking algorithms of a search engine. [3]
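To illustrate how crowd-sourced relevance judgments feed into ranking-algorithm tuning, the sketch below computes normalized discounted cumulative gain (NDCG), a standard metric that compares a ranked result list against the ordering implied by human-assigned relevance grades. The grades and list here are invented for the example and are not Toloka data.

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: relevance at higher positions counts more,
    # discounted by log2 of the (1-based) rank plus one.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize against the ideal (descending) ordering of the same grades.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0

# Hypothetical crowd-assigned relevance grades (0-3), in the engine's ranked order:
print(ndcg([3, 2, 3, 0, 1]))  # ≈ 0.97: close to, but not exactly, the ideal ranking
```

A higher NDCG over many labeled queries indicates the ranking algorithm places highly relevant results nearer the top, which is why large volumes of such judgments are valuable.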

Machine learning

Training a machine learning algorithm requires large volumes of labeled data containing positive and negative examples. Toloka performers receive tasks to determine the presence or absence of computer-defined objects in a content item. [3] [5] In another type of task, performers are given the context of a dialogue and a rating scale, and must assess whether a chatbot's answer in that context is appropriate, interesting, and so on. [6] A further group of tasks in Toloka is translation verification, performed by collecting example translations from different performers.[ citation needed ]
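Because several performers typically answer each task, their responses must be aggregated into a single label. A common baseline is majority voting; the source does not specify Toloka's aggregation method, so the sketch below is a minimal illustration with made-up item names.

```python
from collections import Counter

def aggregate_labels(votes):
    """Majority vote over the label lists collected for each item.

    Returns, per item, the winning label and the share of performers
    who agreed with it (a rough confidence signal)."""
    result = {}
    for item, labels in votes.items():
        winner, count = Counter(labels).most_common(1)[0]
        result[item] = (winner, round(count / len(labels), 2))
    return result

# Hypothetical responses from three performers per image:
votes = {
    "img_001": ["cat", "cat", "dog"],
    "img_002": ["dog", "dog", "dog"],
}
print(aggregate_labels(votes))
```

Low-agreement items can then be routed to additional performers or to more sophisticated aggregation models (e.g. Dawid-Skene) that weight performers by their estimated accuracy.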

Audit and marketing research

These tasks include checking the quality of an online store or delivery service and writing reviews of products and services. Such audits allow businesses to control service quality and identify weaknesses, which can then be addressed to improve the service and eliminate the identified problems.[ citation needed ]

Users

Toloka users, also known as performers or tolokers, are people who earn money by completing system testing and improvement tasks on the Toloka crowdsourcing platform.[ citation needed ] In 2018, more than a million people participated in Toloka projects. Most performers are young people under 35, typically engineering students or mothers on maternity leave. Performers mainly see Toloka as an additional source of income, but many note that they like doing meaningful work and helping clean up the internet. As of March 2022, Toloka had 245,000 monthly active performers in 123 countries, generating over 15 million labels per day. [1] [7]

Requesters

All tasks in Toloka are placed by requesters. The main uses of Toloka are data collection and processing for machine learning, speech technology, computer vision, smart search algorithms, and other projects, as well as content moderation, field tasks, and optimization of internal business processes. [3]

Toloka Research

In May 2019, the service's team started publishing datasets for non-commercial and academic purposes to support the scientific community and attract researchers to Toloka. These datasets are addressed to researchers in fields such as linguistics, computer vision, testing of result-aggregation models, and chatbot training. [8] Toloka research has been showcased at a range of conferences, including the Conference on Neural Information Processing Systems (NeurIPS), [9] the International Conference on Machine Learning (ICML), [10] and the International Conference on Very Large Data Bases (VLDB). [11]

References

  1. "It helps me learn and earn: Toloka reports results of a global survey of Tolokers in 2022". toloka.ai. 2022-03-23. Retrieved 2022-09-16.
  2. "Toloka rolls out 20000 new jobs opportunities for Ghanaians". Ghana Education News. 2021-06-15. Retrieved 2022-09-17.
  3. Alex Woodie (2021-04-27). "Toloka Expands Data Labeling Service". Datanami. Retrieved 2022-09-17.
  4. Daria Baidakova (2021-09-29). "Data-Labeling Instructions: Gateway to Success in Crowdsourcing and Enduring Impact on AI". Data Science Central. Retrieved 2022-09-17.
  5. Frederik Bussler (2021-12-07). "Data labeling will fuel the AI revolution". VentureBeat. Retrieved 2022-09-17.
  6. Kumar Gandharv (2021-04-29). "Why Are Data Labelling Firms Eyeing Indian Market?". Analytics India Magazine. Retrieved 2022-09-17.
  7. "Olga Megorskaya/Toloka: Practical Lessons About Data Labeling". TheSequence. 2021-10-27. Retrieved 2022-09-16.
  8. "Toloka to present new dataset at prestigious Data-Centric AI workshop launched by Andrew Ng". The AI Journal. Retrieved 2022-09-17.
  9. "Toloka to present new dataset at prestigious Data-Centric AI workshop launched by Andrew Ng". FE News. 2021-11-18. Retrieved 2022-02-10.
  10. "Toloka". icml.cc. Retrieved 2022-02-10.
  11. "VLDB 2021 Challenge". crowdscience.ai. Retrieved 2022-02-10.