Text watermarking

Text watermarking is a technique for embedding hidden information in textual content so that its authenticity, origin, or ownership can be verified.[1] Research on text watermarking began in 1997.[1] With the rise of generative AI systems built on large language models (LLMs), much recent work has focused on watermarking AI-generated text.[2] Potential applications include detecting fake news and academic cheating, and excluding AI-generated material from LLM training data.[3] For LLMs the focus is on linguistic approaches, which select words so that they form patterns in the text that can later be identified.[1] The results of the first reported large-scale public deployment, a trial using Google's Gemini chatbot, appeared in October 2024: across 20 million responses, users found watermarked and unwatermarked text to be of equal quality.[3]
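The idea of selecting words to form a detectable pattern can be illustrated with a toy sketch. The example below is not the scheme used in the Gemini trial or any specific method from the cited sources; it assumes a hypothetical shared secret key (SECRET_KEY) and a keyed hash that marks roughly half of all words as "green". The embedder prefers a green word whenever interchangeable candidates are available, and the detector measures the fraction of green words, which should sit near one half in unmarked text and noticeably higher in marked text. All function names are illustrative.

```python
import hashlib

SECRET_KEY = "demo-key"  # hypothetical shared secret, assumed for illustration


def is_green(word: str, key: str = SECRET_KEY) -> bool:
    """Pseudo-randomly assign a word to the 'green' half of the vocabulary."""
    digest = hashlib.sha256(f"{key}|{word.lower()}".encode("utf-8")).digest()
    return digest[0] % 2 == 0


def embed(slots: list[list[str]], key: str = SECRET_KEY) -> list[str]:
    """Pick one word per slot, preferring the first green candidate.

    Each slot lists interchangeable words (for example, synonyms).
    If no candidate is green, the first candidate is used unchanged.
    """
    chosen = []
    for candidates in slots:
        pick = next((w for w in candidates if is_green(w, key)), candidates[0])
        chosen.append(pick)
    return chosen


def green_fraction(words: list[str], key: str = SECRET_KEY) -> float:
    """Detector statistic: share of words that fall in the green set."""
    if not words:
        return 0.0
    return sum(is_green(w, key) for w in words) / len(words)


if __name__ == "__main__":
    slots = [
        ["big", "large", "huge"],
        ["quick", "fast", "rapid"],
        ["start", "begin", "commence"],
    ]
    marked = embed(slots)
    baseline = [s[0] for s in slots]
    print("marked:  ", marked, green_fraction(marked))
    print("baseline:", baseline, green_fraction(baseline))
```

A practical detector would apply a statistical test over much longer text rather than compare raw fractions, and without the key the green set cannot be reconstructed, which is what keeps the pattern hidden from ordinary readers.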

References

  1. Kamaruddin, Nurul Shamimi; Kamsin, Amirrudin; Por, Lip Yee; Rahman, Hameedur (2018). "A Review of Text Watermarking: Theory, Methods, and Applications". IEEE Access. 6: 8011–8028. Bibcode:2018IEEEA...6.8011K. doi:10.1109/ACCESS.2018.2796585. ISSN 2169-3536.
  2. Liu, Aiwei; Pan, Leyi; Lu, Yijian; Li, Jingjing; Hu, Xuming; Zhang, Xi; Wen, Lijie; King, Irwin; Xiong, Hui; Yu, Philip (2024-09-03). "A Survey of Text Watermarking in the Era of Large Language Models". ACM Computing Surveys. 57 (2): 1–36. arXiv:2312.07913. doi:10.1145/3691626. ISSN 0360-0300.
  3. Gibney, Elizabeth (Oct 23, 2024). "Google unveils invisible 'watermark' for AI-generated text". Nature. 634 (8036): 1027–1028. Bibcode:2024Natur.634.1027G. doi:10.1038/d41586-024-03462-7. PMID 39443774. Retrieved Oct 26, 2024.