CEDICT

The CEDICT project was started by Paul Denisowski in 1997 and is now maintained by a team on mdbg.net under the name CC-CEDICT. It aims to provide a complete Chinese-to-English dictionary with pinyin pronunciations for the Chinese characters. [1] [2] [3] [4] [5]

Content

CEDICT is a text file; other programs (or simply Notepad or egrep or equivalent) are needed to search and display it. This project is used by several other Chinese-English projects. The Unihan Database uses CEDICT data for most of its information about character compounds, but this is auxiliary and is explicitly not a part of the main Unicode database. [6]

Features:

The basic format of a CEDICT entry is:

Traditional Simplified [pin1 yin1] /American English equivalent 1/equivalent 2/
漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/
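A line in this format can be split into its fields with a regular expression. The Python sketch below is one possible approach, not an official parser; the function name `parse_entry` and the dictionary keys are illustrative assumptions, and the pattern covers only the basic layout shown above.

```python
import re

# Pattern for one entry: Traditional Simplified [pinyin] /def 1/def 2/
ENTRY_RE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


def parse_entry(line):
    """Split one CEDICT-style line into its fields, or return None."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    traditional, simplified, pinyin, senses = match.groups()
    return {
        "traditional": traditional,
        "simplified": simplified,
        "pinyin": pinyin.split(),          # one syllable per list item
        "definitions": senses.split("/"),  # senses are slash-separated
    }


entry = parse_entry("漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/")
print(entry["pinyin"])
print(entry["definitions"])
```

Lines that do not match the basic layout return `None`, so a caller can skip them rather than fail.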

Example of a simple egrep search:

$ egrep -i 有勇無謀 cedict.txt
有勇無謀 有勇无谋 [you3 yong3 wu2 mou2] /bold but not very astute/
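A similar lookup can be sketched in Python for environments without egrep. This is only an approximation: it matches a plain substring rather than a regular expression, the function name `search_entries` is hypothetical, and skipping lines that start with `#` assumes the file marks comments that way.

```python
def search_entries(lines, term):
    """Return entry lines containing term, ignoring case (like egrep -i)."""
    needle = term.casefold()
    return [
        line.rstrip("\n")
        for line in lines
        # Assumption: lines beginning with "#" are comments, not entries.
        if not line.startswith("#") and needle in line.casefold()
    ]


sample = [
    "# comment line (assumed convention)\n",
    "有勇無謀 有勇无谋 [you3 yong3 wu2 mou2] /bold but not very astute/\n",
    "漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/\n",
]
for hit in search_entries(sample, "有勇無謀"):
    print(hit)
```

To search a real copy of the dictionary, an open file object can be passed in place of `sample`, for example one opened with `open("cedict.txt", encoding="utf-8")`; the UTF-8 encoding is an assumption about the file.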

History

Year  Event
1991  EDICT Japanese dictionary project was started by Jim Breen.
1997  CEDICT project started by Paul Denisowski, on the model of EDICT. Continued by Erik Peterson.
2007  MDBG started a new project called CC-CEDICT, which continues the CEDICT project with a new license: Creative Commons Attribution-Share Alike 3.0 License, allowing more projects to use it. [8] Additionally, a workflow has been set up to streamline the process of submitting, reviewing and processing new entries.

CEDICT has shown the way to some other projects:

References

  1. Ken Lunde (1999). CJKV information processing. O'Reilly. ISBN 978-1-56592-224-2.
  2. Mottaz Jiang, Claire-Lise. "Hypertext Interfaces for Chinese Character and Word Dictionaries" (PDF).
  3. "Chinese Retrieval System Using Hangeul Pronunciation of Chinese Language - ProQuest". www.proquest.com. Retrieved 2025-05-16.
  4. Peng, Gang; Minett, James W.; Wang, William S.-Y. (2008-06-20). "The networks of syllables and characters in Chinese". Journal of Quantitative Linguistics. 15 (3): 243–255. doi:10.1080/09296170802159488. ISSN 0929-6174.
  5. Applied Natural Language Processing Conference (6th : 2000 : Seattle, Wash.) (2000). Proceedings of the conferences and proceedings of the ANLP-NAACL 2000 student research workshop : 6th Applied Natural Language Processing Conference [and] 1st Meeting of the North American Chapter of the Association for Computational Linguistics : April 29 - May 4, Seattle, Washington, USA. Internet Archive. [New Brunswick, N.J.] : Association for Computational Linguistics ; San Francisco : Distributed by Morgan Kaufman Publishers. ISBN 978-1-55860-704-0.
  6. "Unihan Database Lookup". unicode.org.
  7. "CC-CEDICT download - MDBG Chinese Dictionary". www.mdbg.net.
  8. The original CEDICT license was for non-commercial use only, and did not allow entries to be added without permission.
  9. "HanDeDict @ Zydeo Wörterbuch Chinesisch-Deutsch". handedict.zydeo.net. Retrieved 18 July 2025.
  10. "Dictionnaire chinois français / 汉法词典 — Chine Informations". chine.in. Retrieved 16 July 2025.
  11. "CHDICT Chinese ⇔ Hungarian dictionary and corpus". chdict.zydeo.net. Retrieved 29 July 2025.
  12. "CC-Canto - A Cantonese dictionary for everyone". cantonese.org.
  13. "Cantonese CEDICT Project". 廣府話小研究Cantonese Resources. 2012-02-04. Retrieved 2025-07-17.
  14. "StarDict". Stardict.sourceforge.net. Retrieved 18 November 2011.