Computer Vision Annotation Tool

CVAT version 2.0.0
Developer(s) CVAT.ai
Initial release June 29, 2018
Repository github.com/opencv/cvat
Written in JavaScript, CSS, Python, HTML, Django
Operating system Windows 7 or later, OS X 10.11 or later, Linux
Available in English (US)
Type Image and video annotation tool
License MIT License [1]
Website opencv.github.io/cvat/about/

Computer Vision Annotation Tool (CVAT) is a free, open source, web-based image and video annotation tool used to label data for computer vision algorithms. Originally developed by Intel, CVAT is designed for professional data annotation teams, with a user interface optimized for computer vision annotation tasks. [2]

CVAT supports the primary supervised machine learning tasks in computer vision: object detection, image classification, and image segmentation. Users can annotate data for each of these tasks. [3]

CVAT's features include interpolation of shapes between key frames, semi-automatic annotation using deep learning models, shortcuts for most critical actions, a dashboard listing annotation projects and tasks, and LDAP and basic access authentication. [4]
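Interpolation between key frames means an annotator draws a shape on a few frames and the tool fills in the frames between them. The following is a minimal sketch of that idea for bounding boxes, assuming simple linear interpolation of the corner coordinates. The `Box` type and the `interpolate_box` function are hypothetical names chosen for illustration and are not part of CVAT's code or API; CVAT's own implementation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    xtl: float  # x of the top-left corner
    ytl: float  # y of the top-left corner
    xbr: float  # x of the bottom-right corner
    ybr: float  # y of the bottom-right corner


def interpolate_box(frame: int, key_a: int, box_a: Box,
                    key_b: int, box_b: Box) -> Box:
    """Linearly interpolate a box for `frame` between two key frames.

    `key_a` and `key_b` are the frame numbers on which the annotator drew
    `box_a` and `box_b`. Assumption: linear motion between key frames.
    """
    if key_a >= key_b:
        raise ValueError("key_a must come before key_b")
    if not key_a <= frame <= key_b:
        raise ValueError("frame must lie between the two key frames")

    # Fraction of the way from key_a (0.0) to key_b (1.0).
    t = (frame - key_a) / (key_b - key_a)

    def lerp(a: float, b: float) -> float:
        return a + t * (b - a)

    return Box(
        xtl=lerp(box_a.xtl, box_b.xtl),
        ytl=lerp(box_a.ytl, box_b.ytl),
        xbr=lerp(box_a.xbr, box_b.xbr),
        ybr=lerp(box_a.ybr, box_b.ybr),
    )


if __name__ == "__main__":
    start = Box(0, 0, 10, 10)    # drawn by hand on frame 0
    end = Box(20, 40, 30, 50)    # drawn by hand on frame 10
    print(interpolate_box(5, 0, start, 10, end))
```

With this approach, only the key frames need manual work; every frame between them is computed, which is why interpolation reduces effort on video annotation.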

CVAT is written mainly in TypeScript, CSS, and Python, and is built on React, Ant Design, and Django. It is distributed under the MIT License, and its source code is available on GitHub.

The CVAT team hosts an online version of the data annotation platform at cvat.ai as a SaaS offering.

Related Research Articles

OpenCV Computer vision library

OpenCV is a library of programming functions mainly aimed at real-time computer vision. Originally developed by Intel, it was later supported by Willow Garage, then Itseez. The library is cross-platform and free for use under the open-source Apache 2 License. Since 2011, OpenCV has featured GPU acceleration for real-time operations.

As of the early 2000s, several speech recognition (SR) software packages exist for Linux. Some of them are free and open-source software and others are proprietary software. Speech recognition usually refers to software that attempts to distinguish thousands of words in a human language. Voice control may refer to software used for communicating operational commands to a computer.

Probabilistic programming (PP) is a programming paradigm in which probabilistic models are specified and inference for these models is performed automatically. It represents an attempt to unify probabilistic modeling and traditional general purpose programming in order to make the former easier and more widely applicable. It can be used to create systems that help make decisions in the face of uncertainty.

Google Fonts Web font library

Google Fonts is a computer font and web font service owned by Google. This includes free and open source font families, an interactive web directory for browsing the library, and APIs for using the fonts via CSS and Android. Popular fonts in the Google Fonts library include Roboto, Open Sans, Lato, Oswald, Montserrat, Source Sans Pro, and Raleway.

Deeplearning4j Open-source deep learning library

Eclipse Deeplearning4j is a programming library written in Java for the Java virtual machine (JVM). It is a framework with wide support for deep learning algorithms. Deeplearning4j includes implementations of the restricted Boltzmann machine, deep belief net, deep autoencoder, stacked denoising autoencoder and recursive neural tensor network, word2vec, doc2vec, and GloVe. These algorithms all include distributed parallel versions that integrate with Apache Hadoop and Spark.

Mapillary Swedish service for sharing crowdsourced geotagged photos

Mapillary is a service for sharing crowdsourced geotagged photos, developed by remote company Mapillary AB, based in Malmö, Sweden. Mapillary was launched in 2013 and acquired by Facebook, Inc. in 2020. This is one of the few alternative platforms that has street level imagery like Google Street View.

GitLab Open-source Git software package

GitLab Inc. is an open-core company that provides GitLab, a DevOps software package that combines the ability to develop, secure, and operate software in a single application. The open source software project was created by Ukrainian developer Dmitriy Zaporozhets and Dutch developer Sytse Sijbrandij. In 2018, GitLab Inc. was considered the first partly-Ukrainian unicorn.

The following table compares notable software frameworks, libraries and computer programs for deep learning.

spaCy

spaCy is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython. The library is published under the MIT license and its main developers are Matthew Honnibal and Ines Montani, the founders of the software company Explosion.

Caffe (software) Deep learning framework

Caffe is a deep learning framework, originally developed at University of California, Berkeley. It is open source, under a BSD license. It is written in C++, with a Python interface.

SqueezeNet is the name of a deep neural network for computer vision that was released in 2016. SqueezeNet was developed by researchers at DeepScale, University of California, Berkeley, and Stanford University. In designing SqueezeNet, the authors' goal was to create a smaller neural network with fewer parameters that can more easily fit into computer memory and can more easily be transmitted over a computer network.

openpilot Open source driver assistance system

openpilot is an open source, semi-automated driving system developed by comma.ai. openpilot operates as a replacement for OEM Advanced driver-assistance systems with the objective of improving visual perception and electromechanical actuator control. It allows users to modify their existing car with increased computing power, enhanced sensors, and continuously-updated driver assistance features that improve with user-submitted data.

OpenVINO toolkit is a free toolkit for optimizing a deep learning model from a framework and deploying it with an inference engine onto Intel hardware. The toolkit has two versions: OpenVINO toolkit, which is supported by the open source community, and Intel Distribution of OpenVINO toolkit, which is supported by Intel. OpenVINO was developed by Intel. The toolkit is cross-platform and free for use under Apache License version 2.0. The toolkit enables a write-once, deploy-anywhere approach to deep learning deployments on Intel platforms, including CPU, integrated GPU, Intel Movidius VPU, and FPGAs.

oneAPI (compute acceleration) Open standard for parallel computing

oneAPI is an open standard for a unified application programming interface intended to be used across different compute accelerator (coprocessor) architectures, including GPUs, AI accelerators and field-programmable gate arrays. It is intended to eliminate the need for developers to maintain separate code bases, multiple programming languages, and different tools and workflows for each architecture.

Neural Network Intelligence Microsoft open source library

NNI is a free and open source AutoML toolkit developed by Microsoft. It is used to automate feature engineering, model compression, neural architecture search, and hyper-parameter tuning.

VoTT is a free and open source Electron app for image annotation and labeling developed by Microsoft. The software is written in the TypeScript programming language and is used for building end-to-end object detection models from image and video assets for computer vision algorithms.

Meta AI is an artificial intelligence laboratory that belongs to Meta Platforms Inc. Meta AI intends to develop various forms of artificial intelligence, improving augmented and artificial reality technologies. Meta AI is an academic research laboratory focused on generating knowledge for the AI community. This is in contrast to Facebook's Applied Machine Learning (AML) team, which focuses on practical applications of its products.

References

  1. "cvat_LICENSE at develop · opencv/cvat". GitHub. CVAT.ai.
  2. "Intel open-sources CVAT, a toolkit for data labeling". VentureBeat. 2019-03-05. Retrieved 2019-08-13.
  3. "Computer Vision Annotation Tool: A Universal Approach to Data Annotation". software.intel.com. 2019-03-04. Retrieved 2019-08-13.
  4. "User's guide of Computer Vision Annotation Tool". CVAT.ai. 2019-08-13. Retrieved 2021-07-29.