Article spinning is a writing technique that generates text that deceptively appears to be original by systematically substituting words, phrases, or sentence structures in pre-existing works. The practice is commonly employed by websites seeking to improve search engine optimization (SEO) rankings and by individuals attempting to conceal plagiarism.
Content spinning is the process of replacing words, phrases, sentences, or even entire paragraphs with synonyms or alternative constructions to create multiple variations of a text—a practice also known as Rogeting. The process may be performed manually or through automated software. While basic methods often yield awkward and difficult-to-read results, more advanced techniques can produce seemingly original and readable articles upon superficial inspection. However, spin-generated texts are typically superficial and uninformative, often frustrating readers due to their lack of substantive content. [1]
The practice can fall under the category of spamdexing, a black hat SEO practice. Website authors use article spinning to avoid the penalties that search engine results pages (SERPs) apply to duplicate content. Article spinning is also used in other applications, such as message personalization and chatbots.
Automatic rewriting can change the meaning of a sentence through the use of words with similar but subtly different meanings to the original. For example, the word "picture" could be replaced by the word "image" or "photo". Thousands of word-for-word combinations are stored in either a text file or database thesaurus. This ensures that a large percentage of words are different from the original article. [2]
Simpler automated techniques cannot recognize context or grammar. Poorly done article spinning can create unidiomatic phrasing that no human writer would choose. Some spinning may substitute a synonym with the wrong part of speech when encountering a word that can be used as either a noun or a verb, use an obscure word that is only used within very specific contexts, or improperly substitute proper nouns. [3] [4] For example, while "great" can be a synonym for "good", "Great Britain" does not have the same meaning as "Good Britain".
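The context-blind substitution described above can be illustrated with a minimal Python sketch. The two-entry thesaurus and the function name `naive_spin` are hypothetical and chosen only for illustration; the code does not reproduce any particular spinning tool.

```python
import re

# Hypothetical miniature thesaurus. Real spinners are described as storing
# thousands of such word-for-word pairs in a text file or database.
THESAURUS = {
    "picture": "image",
    "great": "good",
}

def naive_spin(text, thesaurus=THESAURUS):
    """Replace every word found in the thesaurus, ignoring context and grammar."""
    def swap(match):
        word = match.group(0)
        replacement = thesaurus.get(word.lower())
        if replacement is None:
            return word  # no synonym stored: keep the original word
        # Keep a leading capital so the result still resembles the source.
        return replacement.capitalize() if word[0].isupper() else replacement
    return re.sub(r"[A-Za-z]+", swap, text)

print(naive_spin("A great picture of Great Britain"))
# prints: A good image of Good Britain
```

Because the lookup works one word at a time, the proper noun "Great Britain" is rewritten along with the ordinary adjective, which is the kind of error the paragraph above describes.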
One type of article spinning is "spintax". Spintax (or spin syntax) uses a marked-up version of text to indicate which parts of the text should be altered or rearranged. The variants of a paragraph, of one or several sentences, or of individual words or groups of words are marked. Spintax can be complex, with many depth levels (nested spinning). It acts as a tree with large branches, then many smaller branches, down to the leaves. To create readable articles out of spintax, a software application chooses one of the possible paths through the tree; this results in variations of the base article without significant alteration to its meaning. [citation needed]
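The tree structure of spintax can be sketched in a few lines of Python. The article does not specify a concrete markup, so this sketch assumes the widely seen brace-and-pipe notation, in which `{a|b}` marks a choice between `a` and `b` and groups may be nested. The function names `spin` and `all_variants` and the sample template are hypothetical.

```python
import random
import re

# An innermost group is a pair of braces that contains no other braces.
_INNERMOST = re.compile(r"\{([^{}]*)\}")

def spin(spintax, rng=None):
    """Return one random variant by resolving groups from the leaves upward."""
    rng = rng or random.Random()
    text = spintax
    while True:
        # Replace every innermost group with one of its options, then repeat
        # until no braces are left.
        text, count = _INNERMOST.subn(
            lambda m: rng.choice(m.group(1).split("|")), text
        )
        if count == 0:
            return text

def all_variants(spintax):
    """Enumerate every path through the tree (practical only for small inputs)."""
    match = _INNERMOST.search(spintax)
    if match is None:
        return {spintax}
    variants = set()
    for option in match.group(1).split("|"):
        variants |= all_variants(
            spintax[:match.start()] + option + spintax[match.end():]
        )
    return variants

TEMPLATE = "{Hello|Hi} {world|{dear|kind} reader}"
print(spin(TEMPLATE))
print(len(all_variants(TEMPLATE)))  # prints: 6
```

Each call to `spin` follows one path from the root of the tree to its leaves, while `all_variants` shows how quickly the number of possible outputs grows as groups are added or nested.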
By the mid-2010s, there were a number of websites that would automatically spin content for an author, often with the end goal of attracting viewers to a website in order to display advertisements to them. [5] Such early search engine optimization often relied on article spinning to create large volumes of superficially unique text. [6] In contrast, modern SEO and content strategies increasingly focus on AI-driven visibility and analytics. [7]
404 Media reported in June 2024 that the black hat SEO industry had adopted generative AI tools, making it easy to "[create] an entirely autonomous, ChatGPT-powered technology news site that steals other people’s original reporting for just $365.63." [8] Also, platforms such as Semrush have introduced tools such as "AI Visibility Toolkit" and "Enterprise AIO" to track brand presence in AI-generated responses, reflecting a shift from text manipulation to performance monitoring in AI-based discovery systems. [9] [10]
Because of the problems with automated spinning, website owners may pay writers or specialized companies to perform higher-quality spinning manually. Writers may also spin their own articles, allowing them to sell the same articles with slight variations to a number of clients or to use an article for multiple purposes, for example as content and also for article marketing. [citation needed]
In academia, article spinning is sometimes used by students as a way to plagiarise other people's work while evading detection by their teachers or by automated checking tools such as Turnitin or its iThenticate system. [11] There are many websites offering text-spinning services to students. Unlike large language models, they are not designed to produce natural-sounding writing; rather, they are designed to take a source text and preserve the meaning and structure, but swap out enough synonyms that the plagiarism remains undetected. [12]
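Why synonym swaps can defeat a purely textual comparison can be illustrated with a simple overlap measure. The sketch below assumes a detector that compares sets of three-word sequences (word shingles) using Jaccard similarity. This is an illustrative assumption only: the article does not describe how Turnitin, iThenticate, or any search engine actually compares texts, and the function names and sample sentences are hypothetical.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences that occur in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b, k=3):
    """Share of k-word sequences that two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two texts too short to form any shingle count as identical
    return len(sa & sb) / len(sa | sb)

original = "the quick brown fox jumps over the lazy dog"
spun = "the fast brown fox leaps over the idle dog"

print(jaccard(original, original))  # prints: 1.0
print(jaccard(original, spun))      # prints: 0.0
```

Replacing only three of the nine words leaves no three-word sequence intact, so this particular measure reports no overlap even though the structure and meaning of the sentence are unchanged.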
Google representatives say that Google doesn't penalize websites that host duplicate content, but the advances in filtering techniques mean that duplicate content will rarely feature well in SERPs, which is a form of penalty. [13] In 2010 and 2011, changes to Google's search algorithm targeting content farms aimed to penalize sites containing significant duplicate content. [14] In this context, article spinning might help, as it's not detected as duplicate content.
In November 2023, Seth Weintraub of 9to5Google reported that Google News was "riddled with AI copies of actual articles" stolen from sites like Electrek. LLM-generated clone articles monetized via Google AdSense were posted "[w]ithin minutes" of the original article's publication. [15]
In January 2024, 404 Media reported that journalists' articles were being overtaken by LLM-generated clones in Google News rankings. [16] [17] [18] Google responded that it did not have a policy against AI-generated articles because it focused on quality instead of method of production, [16] [19] and claimed only certain types of filtered searches were affected. [20]
Article spinning is a technique used to generate text that appears to be new by rephrasing or rewording existing content. It is often considered unethical, as it may involve paraphrasing copyrighted material to evade copyright restrictions, deceiving readers into consuming low-value content for the benefit of the creator, or both. [21]