VoTT

Original author(s): Commercial Software Engineering (CSE) group at Microsoft in Israel
Developer(s): Microsoft and community
Initial release: 2018
Stable release: v2.2.0 / June 3, 2020
Repository: github.com/microsoft/VoTT
Written in: TypeScript
Operating system: Windows, Linux, macOS
Platform: Cross-platform
Type: Image annotation tool
License: MIT License
Website: vott.z22.web.core.windows.net

VoTT (Visual Object Tagging Tool) is a free and open-source Electron app for image annotation and labeling developed by Microsoft. [1] The software is written in the TypeScript programming language and is used to build end-to-end object detection models from image and video assets for computer vision algorithms. [2]

Overview

VoTT is a React and Redux web application; running it from source requires Node.js and npm. [3] It is also available as a stand-alone web application that can be used in any modern web browser. [4]

Notable features include the ability to label images or video frames, [2] support for importing data from local or cloud storage providers, [3] and support for exporting labeled data to local or cloud storage providers.

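Labeled assets can be exported into the following formats: comma-separated values (CSV), Microsoft Azure Custom Vision Service, Microsoft Cognitive Toolkit (CNTK), TensorFlow (Pascal VOC and TFRecords), and VoTT (generic JSON schema). [5]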
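
As a rough illustration of what exporting labeled data involves, the following TypeScript sketch serializes tagged bounding boxes as CSV rows. The `BoundingBox` and `LabeledRegion` types, the column order, and the sample file name are hypothetical and chosen only for this example; they are not taken from VoTT's source code or its actual export schema.

```typescript
// Hypothetical shapes for illustration only; not VoTT's actual export schema.
interface BoundingBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface LabeledRegion {
  image: string; // file name of the annotated image or video frame
  tag: string;   // label assigned by the annotator
  box: BoundingBox;
}

// Convert a left/top/width/height box to corner coordinates
// (xmin, ymin, xmax, ymax), a layout many detection pipelines expect.
function toCorners(box: BoundingBox): [number, number, number, number] {
  return [box.left, box.top, box.left + box.width, box.top + box.height];
}

// Quote a CSV field if it contains a comma, a double quote, or a newline.
function csvField(value: string): string {
  return /[",\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Serialize labeled regions as CSV text, one row per region.
function toCsv(regions: LabeledRegion[]): string {
  const header = "image,xmin,ymin,xmax,ymax,label";
  const rows = regions.map((r) => {
    const [xmin, ymin, xmax, ymax] = toCorners(r.box);
    return [csvField(r.image), xmin, ymin, xmax, ymax, csvField(r.tag)].join(",");
  });
  return [header, ...rows].join("\n");
}

const sample: LabeledRegion[] = [
  { image: "frame_001.jpg", tag: "car", box: { left: 10, top: 20, width: 100, height: 50 } },
];

console.log(toCsv(sample));
```

Storing corner coordinates rather than width and height is a design choice of this sketch; a real exporter would follow whatever layout the target training framework requires.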

The VoTT source code is licensed under the MIT License and is available on GitHub. [6]

References

  1. Tung, Liam. "Free AI developer app: IBM's new tool can label objects in videos for you". ZDNet.
  2. Bornstein, Aaron (Ari) (February 4, 2019). "Using Object Detection for Complex Image Classification Scenarios Part 4". Medium.
  3. Solawetz, Jacob (July 27, 2020). "Getting Started with VoTT Annotation Tool for Computer Vision". Roboflow Blog.
  4. "Best Open Source Annotation Tools for Computer Vision". www.sicara.ai.
  5. "Beyond Sentiment Analysis: Object Detection with ML.NET". September 20, 2020.
  6. "GitHub - microsoft/VoTT: Visual Object Tagging Tool: An electron app for building end to end Object Detection Models from Images and Videos". GitHub. November 15, 2020.