The SWiP project uses language, data and knowledge technologies to promote language equality among all of South Africa's official languages. The hegemonic status of English (and, to a lesser extent, Afrikaans) has made English the language of learning and teaching, [1] which downplays an African epistemology; [2] as a result, local African languages are commonly under-resourced. [3] The acronym "SWiP" stands for the three main partners in a national collaboration: SADiLaR, the free encyclopedia Wikipedia, and PanSALB, which are working alongside local speech and language communities within academia to address language equality using digital technologies, especially Wikipedia. [4]
Under apartheid, certain languages were marginalised, including isiNdebele, Siswati, Xitsonga and Tshivenda. [5] To address the underrepresentation of South Africa's indigenous languages, three organisations are collaborating to build better corpora for low-resource languages. These organisations are SADiLaR, Wikipedia and PanSALB. [6]
Wikipedia is a common source of language data for natural language processing (NLP). [7] Low-resource languages have limited corpora (speech data, annotated text and other forms of linguistic data) for large language models (LLMs) and other NLP systems to draw on. The SWiP project has introduced a variety of alternative ways to collect and compile corpora of suitable text for low-resource languages, and has rolled these out on a national scale. These corpora can be used to create corpus-based dictionaries or to support semi-automatic translation. [8]
This collaborative project is also intended to promote, preserve, and digitise South Africa's indigenous languages and cultural knowledge by enhancing their presence on digital platforms such as Wikipedia. [9] By partnering with cultural and linguistic organisations, the project was designed to close the digital gap and ensure that local languages and cultural narratives are preserved and shared online. [6]
It is anticipated that the SWiP Project will: [9]
Phase 1 of the SWiP Project was launched on 20 September 2023 at UNISA, with His Royal Majesty Enock Makhosoke II Mabhena, the King of amaNdebele, in attendance. [5] The launch opened a series of events, listed below, and the phase was successfully completed. Phase 2 of the project began in November 2024 and continues through 2025 at venues such as Nelson Mandela University, the University of Mpumalanga and the University of Limpopo. [10]
An early success of the project was the integration of isiNdebele into Wikipedia. Initially represented by only 11 articles in the Wikipedia Incubator, the language saw rapid growth to over 140 articles within a year (as of 29 May 2025, there are 180 articles), [11] marking its transition to Wikipedia's main platform. [6]
The project has conducted extensive training sessions, engaging over 300 participants from various South African universities. Trainers introduced academics to Wikipedia; participants learned article authorship skills (adding content, citations and photographs) and practised translation using the Wikipedia translation tool.
These sessions led to the creation of hundreds of new articles, thousands of edits, and significant contributions of written content, references, and multimedia. The initiatives have fostered digital literacy and community engagement while significantly enhancing Wikipedia's indigenous language content. [9] [12]
Since its inception, the SWiP Project has: [citation needed]
Dashboard from SWiP Phase 2 at the University of Mpumalanga.
Dashboard from SWiP Phase 2 at the University of Limpopo.
The SWiP Resource Page is accessible to anyone interested in learning how to edit Wikipedia.