Autocomplete

Example of the partially typed search term baby st being autocompleted to various options

Autocomplete, or word completion, is a feature in which an application predicts the rest of a word a user is typing. In Android and iOS [1] smartphones, this is called predictive text. In graphical user interfaces, users can typically press the tab key to accept a suggestion or the down arrow key to accept one of several.

Autocomplete speeds up human-computer interactions when it correctly predicts the word a user intends to enter after only a few characters have been typed into a text input field. It works best in domains with a limited number of possible words (such as in command line interpreters), when some words are much more common than others (such as when addressing an e-mail), or when writing structured and predictable text (as in source code editors).

Many autocomplete algorithms learn new words after the user has written them a few times, and can suggest alternatives based on the learned habits of the individual user.

Definition

Original purpose

The original purpose of word prediction software was to help people with physical disabilities increase their typing speed, [2] as well as to help them decrease the number of keystrokes needed to complete a word or a sentence. [3] The need for greater speed is underscored by the fact that people who use speech-generating devices generally produce speech at less than 10% of the rate of people who use oral speech. [4] The function is also useful for anybody who writes text, particularly people, such as medical doctors, who frequently use long, hard-to-spell technical or medical terminology.

Description

Autocomplete or word completion works as follows: when the writer types the first letter or letters of a word, the program predicts one or more possible words as choices. If the intended word is in the list, the writer can select it, for example by using the number keys. If the intended word is not predicted, the writer must enter the next letter of the word. The list of choices is then updated so that the words offered begin with the letters entered so far. When the intended word appears, the writer selects it and it is inserted into the text. [5] [6]

In another form of word prediction, the words most likely to follow the one just written are predicted, based on recently used word pairs. [6] Word prediction uses language modeling, in which the words most likely to occur within a set vocabulary are calculated. [7] Along with language modeling, basic word prediction on augmentative and alternative communication (AAC) devices is often coupled with a frecency model, in which words the AAC user has used recently and frequently are more likely to be predicted. [4]

Word prediction software often also allows users to add their own words to the word prediction dictionaries, either directly or by "learning" words that have been written. [5] [6] Search suggestions related to genitals or other vulgar terms are often omitted from autocompletion technologies, as are morbid terms. [8] [9]
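The behaviour described above can be illustrated with a minimal Python sketch. It is not taken from any particular product: the class name `WordPredictor`, its methods, and the ranking rule (frequency first, recency as a tie-breaker, as a crude stand-in for a frecency model) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class WordPredictor:
    """Toy word predictor: prefix completion plus next-word prediction."""

    def __init__(self):
        self.word_counts = Counter()           # how often each word was written
        self.last_used = {}                    # word -> position of its latest use
        self.followers = defaultdict(Counter)  # word -> counts of words that followed it
        self.clock = 0

    def learn(self, text):
        """Update the model from text the user has written."""
        previous = None
        for word in text.lower().split():
            self.clock += 1
            self.word_counts[word] += 1
            self.last_used[word] = self.clock
            if previous is not None:
                self.followers[previous][word] += 1
            previous = word

    def complete(self, prefix, limit=5):
        """Known words starting with prefix, most frequent (then most recent) first."""
        prefix = prefix.lower()
        matches = [w for w in self.word_counts if w.startswith(prefix)]
        matches.sort(key=lambda w: (-self.word_counts[w], -self.last_used[w]))
        return matches[:limit]

    def predict_next(self, word, limit=5):
        """Words that most often followed the given word (recent word pairs)."""
        return [w for w, _ in self.followers[word.lower()].most_common(limit)]


predictor = WordPredictor()
predictor.learn("the baby stroller and the baby stroller and the baby store")
print(predictor.complete("st"))        # ['stroller', 'store']
print(predictor.predict_next("baby"))  # ['stroller', 'store']
```

Typing a further letter simply means calling `complete` again with a longer prefix, which narrows the list in the way the description outlines.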

History

The autocomplete and predictive text technology was invented by Chinese scientists and linguists in the 1950s to solve the input inefficiency of the Chinese typewriter, [10] as the typing process involved finding and selecting thousands of logographic characters on a tray, [11] drastically slowing down the word processing speed. [12] [13]

In the 1950s, typists came to rearrange the character layout from the standard dictionary layout to groups of common words and phrases. [14] Chinese typewriter engineers devised mechanisms that made common characters accessible as quickly as possible through word prediction, a technique used today in Chinese input methods for computers and in text messaging in many languages. According to Stanford University historian Thomas Mullaney, the development of modern Chinese typewriters from the 1960s to 1970s influenced the development of modern computer word processors and affected the development of computers themselves. [15] [11] [14]

Types of autocomplete tools

There are standalone tools that add autocomplete functionality to existing applications. These programs monitor user keystrokes and suggest a list of words based on the first letter or letters typed. Examples are TypingAid and LetMeType. [16] [17] LetMeType is freeware that is no longer developed; its author has published the source code and allows anybody to continue development. TypingAid, also freeware, is actively developed. IntelliComplete, available in both freeware and payware versions, works only in certain programs that hook into the IntelliComplete server program. [18] Many autocomplete programs can also be used to create a shorthand list. The original autocomplete software was Smartype, which dates back to the late 1980s and is still available today. It was initially developed for medical transcriptionists working in WordPerfect for MS-DOS, but it now works with any Windows or web-based application.

Shorthand

Shorthand, also called Autoreplace, is a related feature that automatically replaces a particular string with another one, usually one that is longer and harder to type, such as "myname" with "Lee John Nikolai François Al Rahman". It can also quietly fix simple typing errors, such as turning "teh" into "the". Several autocomplete programs based on word lists, whether standalone or integrated in text editors, also include a shorthand function for frequently used phrases.[ citation needed ]
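A shorthand table can be sketched in a few lines of Python. The table contents come from the examples above; the function name `expand_shorthand` and the choice to replace only whole words are assumptions for illustration, not a description of any specific program.

```python
import re

# Abbreviation -> expansion. The entries mirror the examples in the text.
SHORTHAND = {
    "myname": "Lee John Nikolai François Al Rahman",
    "teh": "the",
}


def expand_shorthand(text, table=SHORTHAND):
    """Replace whole-word occurrences of each abbreviation with its expansion."""
    if not table:
        return text
    # \b anchors keep "teh" from matching inside a longer word.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, table)) + r")\b")
    return pattern.sub(lambda match: table[match.group(1)], text)


print(expand_shorthand("teh letter is signed myname"))
# the letter is signed Lee John Nikolai François Al Rahman
```

Matching on word boundaries is one possible design choice; a real tool would more likely trigger the replacement when the user types a space or punctuation mark after the abbreviation.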

Context completion

Context completion is a text editor feature, similar to word completion, that completes words (or entire phrases) based on the current context and the context of other similar words within the same document or within some training data set. The main advantage of context completion is the ability to predict anticipated words more precisely, even with no initial letters. The main disadvantage is the need for a training data set, which is typically larger for context completion than for simpler word completion. Context completion is most commonly seen in advanced programming language editors and IDEs, where a training data set is inherently available and context completion makes more sense to the user than broad word completion would.[ citation needed ]

Line completion is a type of context completion, first introduced by Juraj Šimlovič in TED Notepad in July 2006. The context in line completion is the current line, while the current document serves as the training data set. When the user begins a line that starts with a frequently used phrase, the editor automatically completes it up to the position where similar lines differ, or proposes a list of common continuations.[ citation needed ]
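The idea of completing a line up to the point where similar lines differ can be sketched as follows. This is only an illustration and not TED Notepad's actual implementation: the function name `complete_line` is hypothetical, and the sketch ignores how frequently each phrase occurs.

```python
import os


def complete_line(typed, document_lines):
    """Extend a partly typed line using other lines of the same document.

    Returns (completion, continuations): the text shared by every matching
    line, and the distinct full lines that could follow from it.
    """
    if not typed:
        return typed, []
    candidates = sorted({line for line in document_lines
                         if line.startswith(typed) and line != typed})
    if not candidates:
        return typed, []
    # commonprefix compares character by character, so it stops exactly
    # at the position where the similar lines start to differ.
    return os.path.commonprefix(candidates), candidates


lines = ["Dear Mr. Smith,", "Dear Mr. Jones,", "Kind regards"]
print(complete_line("De", lines))
# ('Dear Mr. ', ['Dear Mr. Jones,', 'Dear Mr. Smith,'])
print(complete_line("Ki", lines))
# ('Kind regards', ['Kind regards'])
```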

Action completion tools are standalone tools that add autocomplete functionality to existing applications, or to all applications of an OS, based on the current context. The main advantage of action completion is the ability to predict anticipated actions. The main disadvantage is the need for a data set. Action completion is most commonly seen in advanced programming language editors and IDEs, but there are also action completion tools that work globally, in parallel, across all applications of the entire PC without substantially hindering the action completion of the respective applications.[ citation needed ]

Software integration

In web browsers

Autocomplete of the search box in Mozilla Firefox

In web browsers, autocomplete is done in the address bar (using items from the browser's history) and in text boxes on frequently used pages, such as a search engine's search box. Autocomplete for web addresses is particularly convenient because the full addresses are often long and difficult to type correctly. HTML5 has an autocomplete form attribute.[ citation needed ]

In e-mail programs

In e-mail programs autocomplete is typically used to fill in the e-mail addresses of the intended recipients. Generally, there are a small number of frequently used e-mail addresses, hence it is relatively easy to use autocomplete to select among them. Like web addresses, e-mail addresses are often long, hence typing them completely is inconvenient.[ citation needed ]

For instance, Microsoft Outlook Express will find addresses based on the name that is used in the address book. Google's Gmail will find addresses by any string that occurs in the address or stored name.[ citation needed ]
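The difference between the two lookup styles can be illustrated with a small Python sketch. It does not reproduce how either product works internally: the contact list is invented, both function names are hypothetical, and treating name-based lookup as a prefix match is an assumption.

```python
# Invented address book: (stored name, e-mail address).
CONTACTS = [
    ("Alice Example", "alice@example.org"),
    ("Bob Sample", "bob.sample@example.com"),
    ("Carol Demo", "carol@sample.example.net"),
]


def match_by_name_prefix(query, contacts=CONTACTS):
    """Addresses of contacts whose stored name starts with the query."""
    q = query.lower()
    return [addr for name, addr in contacts if name.lower().startswith(q)]


def match_by_substring(query, contacts=CONTACTS):
    """Addresses where the query occurs anywhere in the name or the address."""
    q = query.lower()
    return [addr for name, addr in contacts
            if q in name.lower() or q in addr.lower()]


print(match_by_name_prefix("sam"))  # []
print(match_by_substring("sam"))
# ['bob.sample@example.com', 'carol@sample.example.net']
```

Substring matching finds more candidates for the same input, at the cost of a longer list to choose from.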

In search engines

In search engines, autocomplete user interface features provide users with suggested queries or results as they type their query in the search box. This is also commonly called autosuggest or incremental search. This type of search often relies on matching algorithms that forgive entry errors, such as the phonetic Soundex algorithm or the language-independent Levenshtein algorithm. The challenge remains to search large indices or popular query lists in under a few milliseconds so that the user sees results pop up while typing.
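As a rough illustration of error-forgiving matching, the following Python sketch computes the Levenshtein distance and uses it to suggest queries whose opening characters are close to what was typed. The `suggest` function, its `max_distance` parameter, and the sample queries are assumptions for illustration; the sketch scans every query in turn, so it does not address the speed challenge mentioned above.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def suggest(typed, queries, max_distance=1):
    """Queries whose first len(typed) characters are within max_distance edits."""
    scored = []
    for query in queries:
        distance = levenshtein(typed.lower(), query[:len(typed)].lower())
        if distance <= max_distance:
            scored.append((distance, query))
    return [query for _, query in sorted(scored)]


popular = ["baby stroller", "baby store", "bath towel"]
print(suggest("babu st", popular))  # ['baby store', 'baby stroller']
```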

Autocomplete can have an adverse effect on individuals and businesses when negative search terms are suggested during a search. Autocomplete has become a part of reputation management, as companies linked to negative search terms such as scam, complaints and fraud seek to alter the results. Google in particular has listed some of the aspects that affect how its algorithm works, but this is an area that is open to manipulation. [19]

In source code editors

Code completion in Qt Creator 5.0: The programmer types some code, and when the software detects a recognizable string such as a variable identifier or class name it presents a menu to the programmer which contains the complete name of the identified variable or the methods applicable to the detected class, and the programmer makes a choice with her or his mouse or with the keyboard arrow keys. If the programmer continues typing without making a choice, then the menu disappears

Autocompletion of source code is also known as code completion. In a source code editor, autocomplete is greatly simplified by the regular structure of the programming language. There are usually only a limited number of words meaningful in the current context or namespace, such as names of variables and functions. An example of code completion is Microsoft's IntelliSense design. It involves showing a pop-up list of possible completions for the current input prefix to allow the user to choose the right one. This is particularly useful in object-oriented programming because the programmer often will not know exactly what members a particular class has. Autocomplete therefore serves as a form of convenient documentation as well as an input method.
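Member completion can be approximated in Python with the built-in `dir()` function, which lists the attributes of an object at run time. This is only a sketch of the idea and not how IntelliSense or any other editor is implemented (editors typically analyse the source code rather than inspect live objects); the function name `complete_member` is hypothetical.

```python
def complete_member(obj, prefix):
    """List public attribute names of obj that start with prefix."""
    return sorted(name for name in dir(obj)
                  if name.startswith(prefix) and not name.startswith("_"))


print(complete_member(str, "sp"))  # ['split', 'splitlines']
print(complete_member([], "ex"))   # ['extend']
```

The returned list doubles as documentation in the sense described above: it shows which members exist even when the programmer does not remember them.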

Another beneficial feature of autocomplete for source code is that it encourages the programmer to use longer, more descriptive variable names, making the source code more readable. Typing long camel-case identifiers such as numberOfWordsPerParagraph can be difficult, but autocomplete allows a programmer to enter the identifier using a fraction of the keystrokes.

In database query tools

Autocompletion in database query tools allows the user to autocomplete the table names in an SQL statement and the column names of the tables referenced in the SQL statement. As text is typed into the editor, the context of the cursor within the SQL statement indicates whether the user needs a table completion or a column completion. The table completion provides a list of tables available in the database server the user is connected to. The column completion provides a list of columns only for tables referenced in the SQL statement. SQL Server Management Studio provides autocomplete in query tools.[ citation needed ]
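The choice between table and column completion can be sketched with a simple heuristic in Python. Everything here is an assumption for illustration: the schema is invented, the function name `sql_completions` is hypothetical, and real query tools parse the statement properly rather than looking only at the preceding keyword.

```python
import re

# Invented schema: table name -> column names.
SCHEMA = {
    "customers": ["id", "name", "email"],
    "orders": ["id", "customer_id", "total"],
}

# Keywords after which a table name is expected (a simplifying assumption).
TABLE_KEYWORDS = {"from", "join", "update", "into"}


def sql_completions(statement, cursor, schema=SCHEMA):
    """Suggest table or column names for the word ending at the cursor."""
    before = statement[:cursor]
    prefix = re.search(r"\w*$", before).group(0)
    words = re.findall(r"\w+", before[:len(before) - len(prefix)])
    previous = words[-1].lower() if words else ""
    if previous in TABLE_KEYWORDS:
        names = list(schema)  # table completion
    else:
        # Column completion, restricted to tables referenced in the statement.
        referenced = re.findall(r"\b(?:from|join)\s+(\w+)", statement, re.IGNORECASE)
        names = [col for table in referenced
                 for col in schema.get(table.lower(), [])]
    return sorted({name for name in names if name.startswith(prefix.lower())})


print(sql_completions("SELECT na FROM customers", 9))  # ['name']
print(sql_completions("SELECT * FROM or", 16))         # ['orders']
```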

In word processors

In many word processing programs, autocompletion decreases the amount of time spent typing repetitive words and phrases. The source material for autocompletion is either gathered from the rest of the current document or from a list of common words defined by the user. Currently Apache OpenOffice, Calligra Suite, KOffice, LibreOffice and Microsoft Office include support for this kind of autocompletion, as do advanced text editors such as Emacs and Vim.

In command-line interpreters

Command-line completion in PowerShell.

In a command-line interpreter, such as Unix's sh or bash, or Windows's cmd.exe or PowerShell, or in similar command line interfaces, autocomplete of command names and file names may be accomplished by keeping track of all the possible names of things the user may access. Here autocomplete is usually done by pressing the Tab ↹ key after typing the first several letters of the word. For example, if the only file in the current directory that starts with x is xLongFileName, the user may prefer to type x and autocomplete to the complete name. If there were another file name or command starting with x in the same scope, the user would type more letters or press the Tab key repeatedly to select the appropriate text.
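A single press of the Tab key can be modelled roughly as follows. This Python sketch is not the behaviour of any specific shell: the function name `tab_complete` is hypothetical, and extending the input to the longest shared prefix when several names match is one common convention assumed here.

```python
import os


def tab_complete(typed, names):
    """Return (new_text, candidates) for one press of Tab."""
    candidates = sorted(name for name in names if name.startswith(typed))
    if not candidates:
        return typed, []                      # nothing matches: leave input alone
    if len(candidates) == 1:
        return candidates[0], candidates      # unique match: complete fully
    # Several matches: extend only as far as they all agree.
    return os.path.commonprefix(candidates), candidates


print(tab_complete("x", ["xLongFileName", "notes.txt"]))
# ('xLongFileName', ['xLongFileName'])
print(tab_complete("x", ["xLongFileName", "xLongOtherName"]))
# ('xLong', ['xLongFileName', 'xLongOtherName'])
```

In the second call the user would type one more letter, or cycle through the candidates, to pick the intended name.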

Efficiency

Research

Although research has shown that word prediction software does decrease the number of keystrokes needed and improves the written productivity of children with disabilities, [2] results are mixed as to whether word prediction actually increases the speed of output. [20] [21] Word prediction is thought not to always increase the rate of text entry because of the increased cognitive load and the need to move eye gaze from the keyboard to the monitor. [2]

To reduce this cognitive load, parameters such as limiting the list to five likely words and laying those words out vertically may be used. [2] The vertical layout is meant to keep head and eye movements to a minimum, and it also gives additional visual cues because word length becomes apparent. [22] Although many software developers believe that having the word prediction list follow the cursor reduces eye movements, [2] a study of children with spina bifida by Tam, Reid, O'Keefe & Naumann (2002) showed that typing was more accurate, and that the children preferred it, when the list appeared at the bottom edge of the screen, at the midline. Several studies have found that word prediction performance and satisfaction increase when the word list is closer to the keyboard, because fewer eye movements are needed. [23]

Software with word prediction is produced by multiple manufacturers. The software can be bought as an add-on to common programs such as Microsoft Word (for example, WordQ+SpeakQ, Typing Assistant, [24] Co:Writer,[ citation needed ] Wivik,[ citation needed ] Ghotit Dyslexia[ citation needed ]), or as one of many features on an AAC device (PRC's Pathfinder,[ citation needed ] Dynavox Systems,[ citation needed ] Saltillo's ChatPC products[ citation needed ]). Well-known programs include IntelliComplete,[ citation needed ] which is available in both a freeware and a payware version but works only with programs that are made to work with it, and LetMeType[ citation needed ] and TypingAid,[ citation needed ] both freeware programs that work in any text editor.

An early version of autocompletion was described in 1967 by H. Christopher Longuet-Higgins in his Computer-Assisted Typewriter (CAT): [25] "such words as 'BEGIN' or 'PROCEDURE' or identifiers introduced by the programmer, would be automatically completed by the CAT after the programmer had typed only one or two symbols."


References

  1. "How to use Auto-Correction and predictive text on your iPhone, iPad, or iPod touch". Apple Support. Apple.
  2. Tam, Cynthia; Wells, David (2009). "Evaluating the Benefits of Displaying Word Prediction Lists on a Personal Digital Assistant at the Keyboard Level". Assistive Technology. 21 (3): 105–114. doi:10.1080/10400430903175473. PMID 19908678. S2CID 23183632.
  3. Anson, D.; Moist, P.; Przywara, M.; Wells, H.; Saylor, H.; Maxime, H. (2006). "The Effects of Word Completion and Word Prediction on Typing Rates Using On-Screen Keyboards". Assistive Technology. 18 (2): 146–154. doi:10.1080/10400435.2006.10131913. PMID 17236473. S2CID 11193172.
  4. Trnka, K.; Yarrington, J.M.; McCoy, K.F. (2007). "The Effects of Word Prediction on Communication Rate for AAC". NAACL-Short '07: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics. Vol. Companion Volume, Short Papers. Association for Computational Linguistics. pp. 173–6. CiteSeerX 10.1.1.363.2416.
  5. Beukelman, D.R.; Mirenda, P. (2005). Augmentative and Alternative Communication: Supporting Children and Adults with Complex Communication Needs (3rd ed.). Baltimore, MD: Brookes. p. 77. ISBN 9781557666840. OCLC 254228982.
  6. Witten, I.H.; Darragh, John J. (1992). The reactive keyboard. Cambridge University Press. pp. 43–44. ISBN 978-0-521-40375-7.
  7. Jelinek, F. (1990). "Self-Organized Language Modeling for Speech Recognition". In Waibel, A.; Lee, Kai-Fu (eds.). Readings in Speech Recognition. Morgan Kaufmann. p. 450. ISBN 9781558601246.
  8. Oster, Jan (2015). "Communication, defamation and liability of intermediaries". Legal Studies. 35 (2): 348–368. doi:10.1111/lest.12064. S2CID 143005665.
  9. McCulloch, Gretchen (11 February 2019). "Autocomplete Presents the Best Version of You". Wired. Retrieved 11 February 2019.
  10. Mcclure, Max (12 November 2012). "Chinese typewriter anticipated predictive text, finds historian".
  11. Sorrel, Charlie (February 23, 2009). "How it Works: The Chinese Typewriter". Wired.
  12. Greenwood, Veronique (14 December 2016). "Why predictive text is making you forget how to write". New Scientist.
  13. O'Donovan, Caroline (16 August 2016). "How This Decades-Old Technology Ushered In Predictive Text". Buzzfeed.
  14. Mullaney, Thomas S. (2018-07-16). "90,000 Characters on 1 Keyboard". Foreign Policy. Retrieved 25 April 2020.
  15. Featured Research – world's first history of the Chinese typewriter, Humanities at Stanford, January 2, 2010
  16. "[AHK 1.1]TypingAid v2.22.0 — Word AutoCompletion Utility". AutoHotkey. 2010.
  17. Clasohm, Carsten (2011). "LetMeType". Archived from the original on 2012-05-27. Retrieved 2012-05-09.
  18. "Medical Transcription Software — IntelliComplete". FlashPeak. 2014.
  19. Davids, Neil (2015-06-03). "Changing Autocomplete Search Suggestions". Reputation Station. Retrieved 19 June 2015.
  20. Dabbagh, H.H.; Damper, R.I. (1985). "Average Selection Length and Time as Predictors of Communication Rate". In Brubaker, C.; Hobson, D.A. (eds.). Technology, a Bridge to Independence: Proceedings of the Eighth Annual Conference on Rehabilitation Technology, Memphis, Tennessee, June 24–28th, 1985. Rehabilitation Engineering Society of North America. pp. 404–6. OCLC 15055289.
  21. Goodenough-Trepagnier, C.; Rosen, M.J. (1988). "Predictive Assessment for Communication Aid Prescription: Motor-Determined Maximum Communication Rate". In Bernstein, L.E. (ed.). The vocally impaired: Clinical Practice and Research. Philadelphia: Grune & Stratton. pp. 165–185. ISBN 9780808919087. OCLC 567938402. as cited in Tam & Wells 2009
  22. Swiffin, A.L.; Arnott, J.L.; Pickering, J.A.; Newell, A.F. (1987). "Adaptive and predictive techniques in a communication prosthesis". Augmentative and Alternative Communication. 3 (4): 181–191. doi:10.1080/07434618712331274499. as cited in Tam & Wells 2009
  23. Tam, C.; Reid, D.; Naumann, S.; O'Keefe, B. (2002). "Perceived benefits of word prediction intervention on written productivity in children with spina bifida and hydrocephalus". Occupational Therapy International. 9 (3): 237–255. doi:10.1002/oti.167. PMID 12374999. as cited in Tam & Wells 2009.
  24. Sumit Software (2010). "Typing Assistant – New generation of word prediction software". PRLog: Press Release Distribution.
  25. Longuet-Higgins, H.C.; Ortony, A. (1968). "The Adaptive Memorization of Sequences". Machine Intelligence 3, Proceedings of the Third Annual Machine Intelligence Workshop, University of Edinburgh, September 1967. Edinburgh University Press. pp. 311–322.