Original author(s) | Commercial Software Engineering (CSE) group at Microsoft in Israel |
---|---|
Developer(s) | Microsoft and community |
Initial release | 2018 |
Stable release | v2.2.0 / June 3, 2020 |
Repository | github |
Written in | TypeScript |
Operating system | Windows, Linux, macOS |
Platform | Cross-platform |
Type | Image annotation tool |
License | MIT License |
Website | vott |
VoTT (Visual Object Tagging Tool) is a free and open-source Electron app for image annotation and labeling developed by Microsoft. [1] The software is written in the TypeScript programming language and is used to build end-to-end object detection models from image and video assets for computer vision algorithms.
VoTT is built as a React and Redux web application, and building it from source requires Node.js and npm. [2] It is also available as a stand-alone web application that can be used in any modern web browser. [3]
Notable features include the ability to label images or video frames, support for importing data from local or cloud storage providers, [2] and support for exporting labeled data to local or cloud storage providers.
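Labeled assets can be exported into the following formats: comma-separated values (CSV), Microsoft Azure Custom Vision Service, Microsoft Cognitive Toolkit (CNTK), TensorFlow (Pascal VOC and TFRecords), and VoTT (generic JSON schema).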
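As an illustration of what a tabular export of labeled regions might look like, the following minimal TypeScript sketch writes one CSV row per bounding box. The `BoundingBox` and `LabeledRegion` types, the `regionsToCsv` and `csvField` functions, the column layout, and the sample file names are all hypothetical and chosen for illustration; they are not taken from VoTT's source code and may differ from the files VoTT actually produces.

```typescript
// Hypothetical types for illustration; these are not VoTT's actual data model.
interface BoundingBox {
  xmin: number;
  ymin: number;
  xmax: number;
  ymax: number;
}

interface LabeledRegion {
  image: string; // file name of the image or extracted video frame
  label: string; // tag assigned by the annotator
  box: BoundingBox;
}

// Quote a field if it contains a comma, a double quote, or a line break.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize labeled regions as a header line plus one CSV row per bounding box.
// The column order is an assumption made for this sketch.
function regionsToCsv(regions: LabeledRegion[]): string {
  const header = ["image", "xmin", "ymin", "xmax", "ymax", "label"];
  const rows = regions.map((r) =>
    [r.image, r.box.xmin, r.box.ymin, r.box.xmax, r.box.ymax, r.label]
      .map(csvField)
      .join(",")
  );
  return [header.join(","), ...rows].join("\n");
}

// Made-up sample data: two regions tagged on the same frame.
const exampleRegions: LabeledRegion[] = [
  { image: "frame_001.jpg", label: "car", box: { xmin: 10, ymin: 20, xmax: 110, ymax: 80 } },
  { image: "frame_001.jpg", label: "traffic light, red", box: { xmin: 200, ymin: 5, xmax: 220, ymax: 60 } },
];

console.log(regionsToCsv(exampleRegions));
```

A flat layout of this kind repeats the image name on every row, which keeps the file easy to load into spreadsheet or data-frame tools at the cost of some redundancy when an image has many regions.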
The VoTT source code is licensed under the MIT License and is available on GitHub. [5]