Image translation

Last updated January 20, 2023

Image translation is the machine translation of images of printed text (posters, banners, menus, screenshots, etc.). Optical character recognition (OCR) is first applied to an image to extract any text it contains. The extracted text is then translated into a language of the user's choice, and digital image processing is applied to the original image to produce a translated image in the new language.
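The three stages described above (text extraction, translation, and redrawing) can be sketched as a small pipeline. This is a minimal illustration in Python, not the method of any particular product: the names TextRegion and translate_image and the toy_* stand-ins are hypothetical, and a real system would plug in an actual OCR engine, translation service, and image renderer in their place.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (left, top, right, bottom) in pixels; a hypothetical convention.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class TextRegion:
    """A piece of text and the rectangle it occupies in the image."""
    text: str
    box: Box


def translate_image(
    image: object,
    target_lang: str,
    ocr: Callable[[object], List[TextRegion]],
    translate: Callable[[str, str], str],
    render: Callable[[object, List[TextRegion]], object],
) -> object:
    """Run the OCR -> translate -> redraw stages in order."""
    regions = ocr(image)                                  # 1. extract text
    translated = [
        TextRegion(translate(r.text, target_lang), r.box)  # 2. translate it
        for r in regions
    ]
    return render(image, translated)                      # 3. redraw in place


# --- Toy stand-ins so the sketch runs without external libraries ---------

def toy_ocr(image):
    # Assumption: the "image" is a dict that already lists its text regions.
    return [TextRegion(text, box) for text, box in image["regions"]]


TOY_DICTIONARY = {("Ausgang", "en"): "Exit", ("Eingang", "en"): "Entrance"}


def toy_translate(text, target_lang):
    # Unknown words are passed through unchanged.
    return TOY_DICTIONARY.get((text, target_lang), text)


def toy_render(image, regions):
    # Build a new "image" rather than modifying the original.
    return {"size": image["size"],
            "regions": [(r.text, r.box) for r in regions]}


sign = {"size": (200, 100), "regions": [("Ausgang", (10, 10, 120, 40))]}
translated_sign = translate_image(sign, "en", toy_ocr, toy_translate, toy_render)
```

Keeping each stage as a separate callable mirrors the description: the recognized text keeps its bounding box through translation, so the final stage knows where on the original image to draw the new text.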

General

Machine translation made available on the internet (web and mobile) is a notable advance in multilingual communication, eliminating the need for an intermediary translator or interpreter. Translating foreign texts nevertheless still poses a problem, because users cannot be expected to type a foreign text that they wish to translate and understand. Entering the text manually is especially difficult when it is written in a script the user cannot read, e.g. Cyrillic, Chinese or Japanese for an English speaker or any other reader of a Latin-based script, or vice versa.

Technical advances in OCR made it possible to recognize text in images. Using a mobile device's camera to capture and extract printed text is known as mobile OCR, and was first introduced in Japanese-manufactured mobile telephones in 2004. With the handheld's camera, one could take a picture of a line of text and have it extracted (digitized) for further use, such as storing the information in a contacts list, opening it as a web page address (URL), or using it as text in an SMS or email message.
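One way to picture this further handling of extracted text is a small router that decides what a scanned snippet is before passing it to the contacts list, the browser, or a message editor. The sketch below is an assumption made for illustration only: the function name classify_scanned_text and the patterns are hypothetical, and they are far looser than what a production scanner would need.

```python
import re

# Deliberately simple, hypothetical patterns for illustration.
URL_RE = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
PHONE_RE = re.compile(r"\+?[\d\s().-]+")


def classify_scanned_text(text: str) -> str:
    """Return 'url', 'email', 'phone', or 'text' for an OCR result."""
    cleaned = text.strip()
    if URL_RE.fullmatch(cleaned):
        return "url"
    if EMAIL_RE.fullmatch(cleaned):
        return "email"
    # Require at least seven digits so stray punctuation is not a number.
    digit_count = sum(ch.isdigit() for ch in cleaned)
    if PHONE_RE.fullmatch(cleaned) and digit_count >= 7:
        return "phone"
    return "text"


for sample in ["www.example.com", "jane@example.com",
               "+1 (555) 010-9999", "Daily specials"]:
    print(sample, "->", classify_scanned_text(sample))
```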

Presently, mobile devices with a camera resolution of 2 megapixels or above and auto-focus often feature a text scanner service. Image translation took the text scanning facility one step further, giving users the ability to capture text with their mobile phone's camera, extract the text, and have it translated into their own language.

More and more applications built on this technology emerged, including Word Lens.^[1] After being acquired by Google, Word Lens was made part of the Google Translate mobile app.

Simultaneous advances in image processing have also made it possible to replace the text in the image with the translated text and create a new image altogether.^[2]

History

The development of the image translation service springs from the advances in OCR technology (miniaturization and reduction of memory resources consumed) enabling text scanning on mobile telephones.

Among the first to announce mobile software capable of “reading” text using the mobile device's camera was International Wireless Inc., which in February 2003 released its “CheckPoint” and “WebPoint” applications. “CheckPoint” reads critical symbolic information on checks and is aimed at reducing the losses that mobile merchants suffer from “bounced” checks by scanning the MICR number at the bottom of a check, while “WebPoint” enables the visual recognition and decoding of printed URLs, which are then opened by the device's web browser.^[3]

The first commercial release of a mobile text scanner, however, took place in December 2004, when Vodafone and Sharp began selling the 902SH mobile, which was the first to feature a 2-megapixel digital camera with optical zoom. Among the device's multimedia features was a built-in text, bar code and QR code scanner. The text scanner function could handle up to 60 alphabetical characters at a time. The scanned text could then be sent as an email or SMS message, added as a dictionary entry or, in the case of scanned URLs, opened via the device's web browser. All subsequent Sharp mobiles feature the text scanner functionality.^[4]

In September 2005, NEC Corporation and the Nara Institute of Science and Technology in Japan (NAIST) announced new software capable of transforming camera phones into text scanners. The application differs substantially from similarly equipped mobile telephones in Japan (which can scan business cards and small pieces of text and use OCR to convert them to editable text or URLs) in its ability to scan a whole page. The two organizations, however, said they would not release the software commercially before the end of 2008.^[5]

The text scanner function was first combined with machine translation technology by US company RantNetwork, which in July 2007 started selling the Communilator, a machine translation application for mobile devices featuring image translation functionality. Using the built-in camera, the mobile user could take a picture of some printed text, apply OCR to recognize the text, and then translate it into any one of more than 25 available languages.^[6]

In April 2008, Nokia showcased its Shoot-to-Translate application for the N73 model, which can take a picture using the device's camera, extract the text, and then translate it. The application offered only Chinese-to-English translation and did not handle large segments of text. Nokia said it was developing its Multiscanner product which, besides scanning text and business cards, would be able to translate between 52 languages.^[7]

Also in April 2008, Korean company Unichal Inc. released its handheld Dixau text scanner, capable of scanning and recognizing English text and then translating it into Korean using online tools such as Wikipedia or Google Translate. The device connects to a PC or laptop via a USB port.^[8]

In February 2009, Bulgarian company Interlecta presented its mobile translator, which includes image recognition and speech synthesis, at the Mobile World Congress in Barcelona. The application handles all European languages along with Chinese, Japanese and Korean. The software connects to a server over the Internet to perform the image recognition and the translation.^[9]

In May 2014, Google acquired Word Lens to improve the quality of visual and voice translation. The app can scan text or a picture with the user's device and translate it instantly.^[10]

As OCR has improved, many companies and websites have started combining OCR and translation to read the text in an image and show the translated text.

In August 2018, an Indian company created ImageTranslate, which can read and translate the text in an image and re-create the image in another language.

Currently, image translation is offered by the following companies:

Google Translate app with camera
ImageTranslate^[11]
Yandex^[12]

Related Research Articles

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo or from subtitle text superimposed on an image.

Handwriting recognition (HWR), also known as handwritten text recognition (HTR), is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other devices. The image of the written text may be sensed "off line" from a piece of paper by optical scanning or intelligent word recognition. Alternatively, the movements of the pen tip may be sensed "on line", for example by a pen-based computer screen surface, a generally easier task as there are more clues available. A handwriting recognition system handles formatting, performs correct segmentation into characters, and finds the most plausible words.

A barcode reader is an optical scanner that can read printed barcodes, decode the data contained in the barcode, and send it to a computer. Like a flatbed scanner, it consists of a light source, a lens and a light sensor for translating optical impulses into electrical signals. Additionally, nearly all barcode readers contain decoder circuitry that can analyse the barcode's image data provided by the sensor and send the barcode's content to the scanner's output port.

An image scanner—often abbreviated to just scanner—is a device that optically scans images, printed text, handwriting or an object and converts it to a digital image. Commonly used in offices are variations of the desktop flatbed scanner where the document is placed on a glass window for scanning. Hand-held scanners, where the device is moved by hand, have evolved from text scanning "wands" to 3D scanners used for industrial design, reverse engineering, test and measurement, orthotics, gaming and other applications. Mechanically driven scanners that move the document are typically used for large-format documents, where a flatbed design would be impractical.

Optical mark recognition is the process of reading information that people mark on surveys, tests and other paper documents.

Enterprise content management (ECM) extends the concept of content management by adding a timeline for each content item and, possibly, enforcing processes for its creation, approval and distribution. Systems using ECM generally provide a secure repository for managed items, analog or digital. They also include one or more methods for importing content to bring new items under management, and several presentation methods to make items available for use. Although ECM content may be protected by digital rights management (DRM), it is not required. ECM is distinguished from general content management by its cognizance of the processes and procedures of the enterprise for which it is created.

VueScan is a computer program for image scanning, especially of photographs, including negatives. It supports optical character recognition (OCR) of text documents. The software can be downloaded and used free of charge, but adds a watermark on scans until a license is purchased.

Book scanning or book digitization is the process of converting physical books and magazines into digital media such as images, electronic text, or electronic books (e-books) by using an image scanner. Large scale book scanning projects have made many books available online.

High Capacity Color Barcode (HCCB) is a technology developed by Microsoft for encoding data in a 2D "barcode" using clusters of colored triangles instead of the square pixels conventionally associated with 2D barcodes or QR codes. Data density is increased by using a palette of 4 or 8 colors for the triangles, although HCCB also permits the use of black and white when necessary. It has been licensed by the ISAN International Agency for use in its International Standard Audiovisual Number standard, and serves as the basis for the Microsoft Tag mobile tagging application.

The HTC TyTN II is an Internet-enabled Windows Mobile Pocket PC smartphone designed and marketed by HTC Corporation of Taiwan. It has a tilting touchscreen with a right-side slide-out QWERTY keyboard. The TyTN II's functions include those of a camera phone and a portable media player in addition to text messaging and multimedia messaging. It also offers Internet services including e-mail, instant messaging, web browsing, and local Wi-Fi connectivity. It is a quad-band GSM phone with GPRS, EDGE, UMTS, HSDPA, and HSUPA.

CuneiForm Cognitive OpenOCR is a freely distributed open-source OCR system developed by Russian software company Cognitive Technologies.

VirusTotal is a website created by the Spanish security company Hispasec Sistemas. Launched in June 2004, it was acquired by Google in September 2012. The company's ownership switched in January 2018 to Chronicle, a subsidiary of Google.

Mobile translation is any electronic device or software application that provides audio translation. The concept includes any handheld electronic device that is specifically designed for audio translation. It also includes any machine translation service or software application for hand-held devices, including mobile telephones, Pocket PCs, and PDAs. Mobile translation provides hand-held device users with the advantage of instantaneous and non-mediated translation from one human language to another, usually against a service fee that is, nevertheless, significantly smaller than a human translator charges.

The Ricoh 500SE digital compact camera is designed for outdoor photography and networked use. Its capabilities include embedding external information such as GPS position or barcode numbers within the image headers. External vendors sell hardware and software for workflows involving GPS positioning or barcode scanning. Most NMEA-compliant Bluetooth GPS receivers can be used with this camera through its built-in Bluetooth communication capability. The body is resistant to dust and water, making it robust in many environments.

Digital mailroom is the automation of incoming mail processes. Using document scanning, document capture, and cloud storage technologies, companies can digitise incoming mail and automate the classification and distribution of mail within the organization. Both paper and electronic mail (email) can be managed through the same process allowing companies to standardize their internal mail distribution procedures and adhere to company compliance policies.

OCRFeeder is an optical character recognition suite for GNOME, which also supports virtually any command-line OCR engine, such as CuneiForm, GOCR, Ocrad and Tesseract. It converts paper documents to digital document files and can serve to make them accessible to visually impaired users.

Project Naptha is a browser extension software for Google Chrome that allows users to highlight, copy, edit and translate text from within images. It was created by developer Kevin Kwok, and released in April 2014 as a Chrome add-on. This software was first made available only on Google Chrome, downloadable from the Chrome Web Store. It was then made available on Mozilla Firefox, downloadable from the Mozilla Firefox add-ons repository but was soon removed. The reason behind the removal remains unknown.

Yandex Translate is a web service provided by Yandex, intended for the translation of text or web pages into another language.

Scanitto Pro is a Windows-based software application for image scanning, direct printing and copying, basic editing and text recognition (OCR).

A barcode library or barcode SDK is a software library that can be used to add barcode features to desktop, web, mobile or embedded applications. A barcode library provides sets of subroutines or objects that make it possible to create barcode images and place them on surfaces, or to recognize machine-encoded text and data in scanned or camera-captured images containing barcodes. A library can support two modes, generation and recognition: some libraries support both reading and writing barcodes, while others support only one mode.

References

[1] WordLens Tm: wiki. Retrieved 2019-03-23.
[2] "ImageTranslate Tm: website". Retrieved 2019-03-23.
[3] "International Wireless, Inc. Reads Personal Checks with Cell Phones. - Free Online Library". Thefreelibrary.com. Archived from the original on 2016-01-27. Retrieved 2012-02-24.
[4] "The Sharp 902 - Europe's First 3G Mobile with 2 Megapixel Digital Camera". UMTS Forum. Retrieved 2012-02-24.
[5] "Camera phones will be high-precision scanners - tech - 14 September 2005". New Scientist. Retrieved 2012-02-24.
[6] "RantNetwork Tm: PressRelease". Rantnetwork.com. Retrieved 2012-02-24.
[7] Archived 2009-04-12 at the Wayback Machine.
[8] "1-Click Dictionary Dixau - News - Dixau text scanner" (in Korean). En.dixau.com. Retrieved 2012-02-24.
[9] "Translation Software with Speech for BlackBerry". Archived from the original on 2010-01-13. Retrieved 2017-04-28.
[10] Google Translate.
[11] "ImageTranslate". Retrieved 2019-03-23.
[12] "Yandex ocr translate". Retrieved 2019-03-23.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
