Spelling suggestion

Last updated February 23, 2019

Spelling suggestion is a feature of many computer software applications used to suggest plausible replacements for words that are likely to have been misspelled.

Spelling suggestion features are commonly included in Internet search engines, word processors, spell checkers, medical transcription, automatic query reformulation, and frequency-log statistics reporting.

The Internet is the global system of interconnected computer networks that use the Internet protocol suite (TCP/IP) to link devices worldwide. It is a network of networks that consists of private, public, academic, business, and government networks of local to global scope, linked by a broad array of electronic, wireless, and optical networking technologies. The Internet carries a vast range of information resources and services, such as the inter-linked hypertext documents and applications of the World Wide Web (WWW), electronic mail, telephony, and file sharing.

A word processor (WP) is a computer program or device that provides for input, editing, formatting and output of text, often plus other features.

A spell checker is a software feature that checks for misspellings in a text. Spell-checking features are often embedded in other software, such as a word processor, email client, electronic dictionary, or search engine.

Algorithms

Algorithms for spelling suggestion may incorporate several information resources to determine candidates for misspelled words and the words' likely correct replacements, including the following:

Frequency of a word's occurrence in reliable data sources
Search query logs
List of frequently misspelled words
Domain-specific data resources (e.g., corpora of medical terms)
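
As a rough illustration of how the resources listed above might be combined, the following Python sketch checks a list of frequently misspelled words first and otherwise ranks similar-looking dictionary words by frequency. The word counts, the misspelling table, the function name suggest, and the use of difflib string similarity with a 0.7 cutoff are all assumptions made for this example; they are not taken from any particular spelling suggestion system.

```python
from collections import Counter
import difflib

# Hypothetical data for illustration only. A real system would derive
# these from reliable corpora, query logs, or domain-specific term lists.
WORD_FREQUENCIES = Counter({
    "receive": 120, "recipe": 45, "relieve": 30, "believe": 200, "separate": 80,
})
KNOWN_MISSPELLINGS = {"recieve": "receive", "seperate": "separate"}


def suggest(word, max_suggestions=3):
    """Return plausible replacements for a possibly misspelled word."""
    word = word.lower()
    if word in WORD_FREQUENCIES:
        return [word]  # already a known word, nothing to correct
    if word in KNOWN_MISSPELLINGS:
        return [KNOWN_MISSPELLINGS[word]]  # direct hit in the misspelling list
    # Assumption: similarity is measured with difflib's ratio; other
    # systems may use edit distance or phonetic matching instead.
    candidates = difflib.get_close_matches(
        word, list(WORD_FREQUENCIES), n=10, cutoff=0.7
    )
    # Prefer candidates that occur more often in the reference data.
    candidates.sort(key=lambda w: WORD_FREQUENCIES[w], reverse=True)
    return candidates[:max_suggestions]
```

Calling suggest("recieve") uses the misspelling list directly, while suggest("beleive") falls back to the frequency-ranked similar words.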

In linguistics, a corpus or text corpus is a large and structured set of texts. In corpus linguistics, corpora are used for statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

Related Research Articles

Google Search, also referred to as Google Web Search or simply Google, is a web search engine developed by Google LLC. It is the most used search engine on the World Wide Web across all platforms, with 92.74% market share as of October 2018, handling more than 3.5 billion searches each day.

Spelling is a linguistic process of phonemic orthography with the necessary letters and diacritics present in a comprehensible order, usually with some degree of standardization; it is "the conventions which determine how the graphemes of a writing system are used to write a language". In other words, it is the rendering of speech sounds (phonemes) into writing (graphemes). Spelling is one of the elements of orthography, and highly standardized spelling is a prescriptive element.

Dogpile is a metasearch engine for information on the World Wide Web that fetches results from Google, Yahoo!, Yandex, Bing, and other popular search engines, including results from audio and video content providers such as Yahoo!.

Autocomplete, or word completion, is a feature in which an application predicts the rest of a word a user is typing. In graphical user interfaces, users can typically press the tab key to accept a suggestion or the down arrow key to accept one of several.

Spotlight is a system-wide desktop search feature of Apple's macOS and iOS operating systems. Spotlight is a selection-based search system, which creates an index of all items and files on the system. It is designed to allow the user to quickly locate a wide variety of items on the computer, including documents, pictures, music, applications, and System Preferences. In addition, specific words in documents and in web pages in a web browser's history or bookmarks can be searched. It also allows the user to narrow down searches with creation dates, modification dates, sizes, types and other attributes. Spotlight also offers quick access to definitions from the built-in New Oxford American Dictionary and to calculator functionality. There are also command-line tools to perform functions such as Spotlight searches.

A typographical error, also called misprint, is a mistake made in the typing of printed material. Historically, this referred to mistakes in manual type-setting (typography). The term includes errors due to mechanical failure or slips of the hand or finger, but excludes errors of ignorance, such as spelling errors, or the flip-flopping of words such as "than" and "then". Before the arrival of printing, the "copyist's mistake" or "scribal error" was the equivalent for manuscripts. Most typos involve simple duplication, omission, transposition, or substitution of a small number of characters.
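
The four kinds of slip named above can be enumerated mechanically. The Python sketch below generates every string that differs from a word by one duplication, omission, transposition, or substitution; the function name single_edit_variants and the restriction to lowercase ASCII letters are assumptions made for this illustration.

```python
import string


def single_edit_variants(word, alphabet=string.ascii_lowercase):
    """Return all strings one simple typo away from `word`."""
    # Every way of cutting the word into a left and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]

    # Omission: drop the first character of the right part.
    omissions = {left + right[1:] for left, right in splits if right}
    # Duplication: type the first character of the right part twice.
    duplications = {left + right[0] + right for left, right in splits if right}
    # Transposition: swap the first two characters of the right part.
    transpositions = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits if len(right) > 1
    }
    # Substitution: replace the first character of the right part.
    substitutions = {
        left + ch + right[1:]
        for left, right in splits if right
        for ch in alphabet
    }

    variants = omissions | duplications | transpositions | substitutions
    variants.discard(word)  # some edits reproduce the word itself
    return variants
```

For the word "the", the result includes "teh" (transposition), "thee" (duplication), "te" (omission), and "tha" (substitution). A table built this way could serve as one possible source of candidate misspellings.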

In information retrieval, tf–idf or TFIDF, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in searches of information retrieval, text mining, and user modeling. The tf–idf value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which helps to adjust for the fact that some words appear more frequently in general. Tf–idf is one of the most popular term-weighting schemes today; 83% of text-based recommender systems in digital libraries use tf–idf.
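
A minimal Python sketch of the weighting described above follows. It uses one common variant among many: term frequency is the count divided by document length, and inverse document frequency is the unsmoothed natural logarithm of (number of documents / documents containing the term). The function name tf_idf and whitespace tokenization are assumptions for this example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            # The weight grows with the in-document count and is offset
            # by how many documents contain the term.
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores
```

With the three documents "the cat sat", "the dog sat", and "the cat ran", the word "the" appears everywhere and receives a weight of zero, while "dog" outweighs "sat" in the second document because it occurs in fewer documents.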

In computer science, an inverted index is a database index storing a mapping from content, such as words or numbers, to its locations in a table, or in a document or a set of documents. The purpose of an inverted index is to allow fast full-text searches, at a cost of increased processing when a document is added to the database. The inverted file may be the database file itself, rather than its index. It is the most popular data structure used in document retrieval systems, used on a large scale for example in search engines. Additionally, several significant general-purpose mainframe-based database management systems have used inverted list architectures, including ADABAS, DATACOM/DB, and Model 204.
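
The mapping from words to document locations can be sketched in a few lines of Python. This is a simplified in-memory illustration, not how any particular database or search engine implements it; the function names build_inverted_index and search, whitespace tokenization, and the use of list positions as document identifiers are assumptions.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    # The extra work happens here, when documents are added.
    for doc_id, text in enumerate(documents):
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A query then only touches the entries for its own words instead of scanning every document, which is the trade-off the paragraph above describes.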

Google Trends is a website by Google that analyzes the popularity of top search queries in Google Search across various regions and languages. The website uses graphs to compare the search volume of different queries over time.

The HTTP referer is an optional HTTP header field that identifies the address of the webpage that linked to the resource being requested. By checking the referrer, the new webpage can see where the request originated.

A foreign language writing aid is a computer program or any other instrument that assists a non-native language user in writing decently in their target language. Assistive operations can be classified into two categories: on-the-fly prompts and post-writing checks. Assisted aspects of writing include: lexical, syntactic, lexical semantic and idiomatic expression transfer, etc. Different types of foreign language writing aids include automated proofreading applications, text corpora, dictionaries, translation aids and orthography aids.

Query expansion (QE) is the process of reformulating a given query to improve retrieval performance in information retrieval operations, particularly in the context of query understanding. In the context of search engines, query expansion involves evaluating a user's input and expanding the search query to match additional documents. Query expansion involves techniques such as:

A web search query is a query based on a specific search term that a user enters into a web search engine to satisfy his or her information needs. Web search queries are distinctive in that they are often plain text or hypertext with optional search-directives. They vary greatly from standard query languages, which are governed by strict syntax rules as command languages with keyword or positional parameters.

A concept search is an automated information retrieval method that is used to search electronically stored unstructured text for information that is conceptually similar to the information provided in a search query. In other words, the ideas expressed in the information retrieved in response to a concept search query are relevant to the ideas contained in the text of the query.

Muphry's law is an adage that states: "If you write anything criticizing editing or proofreading, there will be a fault of some kind in what you have written." The name is a deliberate misspelling of "Murphy's law".

Commonly misspelled English words are words that are often unintentionally misspelled in general writing. A selected list of common words is presented below, under Documented list of common misspellings. Although the word "common" is subjective depending on the situation, the focus is on general writing, rather than in a specific field. Official spellings also vary by country or region, with some rejecting the American or British variants as incorrect for the region.

Vector space model or term vector model is an algebraic model for representing text documents as vectors of identifiers, such as, for example, index terms. It is used in information filtering, information retrieval, indexing and relevancy rankings. Its first use was in the SMART Information Retrieval System.

Sketch Engine is a corpus manager and text analysis software developed by Lexical Computing Limited since 2003. Its purpose is to enable people studying language behaviour to search large text collections according to complex and linguistically motivated queries. Sketch Engine gained its name after one of the key features, word sketches: one-page, automatic, corpus-derived summaries of a word's grammatical and collocational behaviour. Currently, it supports and provides corpora in 90+ languages.

Query understanding is the process of inferring the intent of a search engine user by extracting semantic meaning from the searcher’s keywords. Query understanding methods generally take place before the search engine retrieves and ranks results. It is related to natural language processing but specifically focused on the understanding of search queries. Query understanding is at the heart of technologies like Amazon Alexa, Apple's Siri, Google Assistant, IBM's Watson, and Microsoft's Cortana.