Manual image annotation is the process of manually defining regions in an image and creating a textual description of those regions. Such annotations can for instance be used to train machine learning algorithms for computer vision applications.
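As an illustration of what "regions plus a textual description" can look like in practice, the following minimal sketch stores rectangular regions with a label and serializes them to JSON. The class name `RegionAnnotation`, the field names, and the JSON layout are hypothetical and chosen only for this example; they do not reproduce the export format of any tool listed below.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class RegionAnnotation:
    """One manually drawn rectangular region plus its textual description."""
    label: str     # textual description of the region, e.g. "car"
    x: float       # left edge of the rectangle, in pixels
    y: float       # top edge of the rectangle, in pixels
    width: float   # rectangle width, in pixels
    height: float  # rectangle height, in pixels


def to_json(image_name: str, regions: List[RegionAnnotation]) -> str:
    """Serialize all annotated regions of one image to a JSON string."""
    record = {"image": image_name, "regions": [asdict(r) for r in regions]}
    return json.dumps(record, indent=2)


# Hypothetical annotations for a hypothetical image file.
regions = [
    RegionAnnotation("car", 34, 120, 200, 95),
    RegionAnnotation("pedestrian", 310, 88, 45, 130),
]
annotation_json = to_json("street.jpg", regions)
print(annotation_json)
```

Records of this kind are what a training pipeline would typically read alongside the image files; real tools usually support richer shapes such as polygons and masks.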
This is a list of computer software which can be used for manual annotation of images.
Software | Description | Platform | License | References |
---|---|---|---|---|
Computer Vision Annotation Tool (CVAT) | Open-source, web-based tool for labeling video and images for computer vision algorithms. Features include interpolation of bounding boxes between key frames, automatic annotation using the TensorFlow Object Detection API and deep learning models in Intel OpenVINO IR format, shortcuts for most critical actions, a dashboard listing annotation tasks, and LDAP and basic authorization. It was created for and is used by a professional data annotation team, and its UI and UX are optimized for computer vision annotation tasks. | JavaScript, HTML, CSS, Python, Django | MIT License | [1] [2] [3] |
LabelMe | Online annotation tool to build image databases for computer vision research. | Perl, JavaScript, HTML, CSS [4] | MIT License | |
TagLab | Open-source interactive desktop software for precise annotation of benthic species in orthophotos of the seabed. | Python [5] | GPL | [6] [7] |
VoTT (Visual Object Tagging Tool) | Free and open-source Electron app for image annotation and labeling, developed by Microsoft. | TypeScript/Electron (Windows, Linux, macOS) | MIT License | [8] [9] [10] [11] [12] |
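
The table mentions interpolation of bounding boxes between key frames as a feature for video annotation. The idea can be sketched as follows, assuming plain linear interpolation of each box coordinate; this is an illustrative assumption, not a description of how CVAT or any other listed tool implements the feature. The `Box` type and the function `interpolate_box` are hypothetical names.

```python
from typing import Tuple

# A box is (x, y, width, height) in pixels; this layout is an assumption.
Box = Tuple[float, float, float, float]


def interpolate_box(frame: int, key_a: int, box_a: Box,
                    key_b: int, box_b: Box) -> Box:
    """Linearly interpolate a box for `frame` between two annotated key frames."""
    if key_a >= key_b:
        raise ValueError("key_a must come before key_b")
    if not key_a <= frame <= key_b:
        raise ValueError("frame must lie between the two key frames")
    # Fraction of the way from key_a to key_b, between 0.0 and 1.0.
    t = (frame - key_a) / (key_b - key_a)
    x, y, w, h = (a + t * (b - a) for a, b in zip(box_a, box_b))
    return (x, y, w, h)


# The annotator draws boxes on frames 0 and 10; frame 5 is filled in.
middle = interpolate_box(5, 0, (0, 0, 10, 10), 10, (100, 50, 20, 30))
print(middle)  # (50.0, 25.0, 15.0, 20.0)
```

With this approach an annotator only draws boxes on selected frames and corrects the intermediate ones where the object's motion is not close to linear.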