Computer Vision Annotation Tool

Developer(s) CVAT.ai
Initial release June 29, 2018; 6 years ago (2018-06-29)
Repository github.com/cvat-ai/cvat
Written in JavaScript, CSS, Python, HTML, Django
Operating system Windows 7 or later, OS X 10.11 or later, Linux
Available in English (US)
Type Image and video annotation tool
License MIT License [1]
Website opencv.github.io/cvat/about/

Computer Vision Annotation Tool (CVAT) is a free, open-source, web-based image and video annotation tool used to label data for computer vision algorithms. Originally developed by Intel, CVAT is designed for use by professional data annotation teams, with a user interface optimized for computer vision annotation tasks. [2]

CVAT supports annotation for the primary tasks of supervised machine learning in computer vision: object detection, image classification, and image segmentation. [3]

CVAT's features include interpolation of shapes between key frames, semi-automatic annotation using deep learning models, keyboard shortcuts for the most critical actions, a dashboard listing annotation projects and tasks, and LDAP and basic access authentication. [4]
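The idea behind interpolating shapes between key frames can be illustrated with a minimal sketch. The code below is not CVAT's implementation; it assumes axis-aligned bounding boxes and plain linear interpolation, and the names `Box` and `interpolate_box` are hypothetical. The annotator draws a box on a few key frames, and the boxes on the frames in between are computed rather than drawn by hand.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def interpolate_box(keyframes: Mapping[int, Box], frame: int) -> Box:
    """Estimate the box at `frame` by linear interpolation between key frames.

    `keyframes` maps a frame number to the box drawn by the annotator.
    Assumption: frames outside the annotated range reuse the nearest key frame.
    """
    frames = sorted(keyframes)
    if not frames:
        raise ValueError("at least one key frame is required")
    if frame <= frames[0]:
        return keyframes[frames[0]]
    if frame >= frames[-1]:
        return keyframes[frames[-1]]

    # Find the pair of consecutive key frames that surrounds `frame`.
    for left, right in zip(frames, frames[1:]):
        if left <= frame <= right:
            t = (frame - left) / (right - left)  # 0.0 at left, 1.0 at right
            a, b = keyframes[left], keyframes[right]
            return Box(
                a.x1 + t * (b.x1 - a.x1),
                a.y1 + t * (b.y1 - a.y1),
                a.x2 + t * (b.x2 - a.x2),
                a.y2 + t * (b.y2 - a.y2),
            )
    raise AssertionError("unreachable: frame lies within the key frame range")


# Two key frames drawn by hand; frame 5 is filled in automatically.
track = {0: Box(0, 0, 10, 10), 10: Box(10, 20, 30, 40)}
print(interpolate_box(track, 5))
```

With this approach only the key frames need manual work, so an object moving steadily across a video can be annotated with a handful of boxes. A real tool would also have to handle other shape types and objects that leave the frame, which this sketch omits.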

CVAT is written mainly in TypeScript, Python, and CSS, using React and Ant Design for the user interface and Django on the server side. It is distributed under the MIT License, and its source code is available on GitHub.

The CVAT team hosts an online version of the data annotation platform at cvat.ai as a software-as-a-service (SaaS) offering.

Related Research Articles

OpenCV Computer vision library

OpenCV is a library of programming functions mainly for real-time computer vision. Originally developed by Intel, it was later supported by Willow Garage, then Itseez. The library is cross-platform and licensed as free and open-source software under Apache License 2. Starting in 2011, OpenCV features GPU acceleration for real-time operations.

Computer-assisted qualitative data analysis software (CAQDAS) offers tools that assist with qualitative research such as transcription analysis, coding and text interpretation, recursive abstraction, content analysis, discourse analysis, grounded theory methodology, etc.

Eclipse Deeplearning4j is a programming library written in Java for the Java virtual machine (JVM). It is a framework with wide support for deep learning algorithms. Deeplearning4j includes implementations of the restricted Boltzmann machine, deep belief net, deep autoencoder, stacked denoising autoencoder and recursive neural tensor network, word2vec, doc2vec, and GloVe. These algorithms all include distributed parallel versions that integrate with Apache Hadoop and Spark.

The following tables compare notable software frameworks, libraries, and computer programs for deep learning applications.

Caffe is a deep learning framework, originally developed at University of California, Berkeley. It is open source, under a BSD license. It is written in C++, with a Python interface.

Project Jupyter Open source data science software

Project Jupyter is a project to develop open-source software, open standards, and services for interactive computing across multiple programming languages.

openpilot Open source driver assistance system

openpilot is an open-source, semi-automated driving software by comma.ai, Inc. When paired with comma hardware, it replaces advanced driver-assistance systems in various cars, improving over the original system. As of 2023, openpilot supports 250+ car models and has 6000+ users, accumulating over 90 million miles (140,000,000 km).

oneAPI (compute acceleration) Open standard for parallel computing

oneAPI is an open standard, adopted by Intel, for a unified application programming interface (API) intended to be used across different computing accelerator (coprocessor) architectures, including GPUs, AI accelerators and field-programmable gate arrays. It is intended to eliminate the need for developers to maintain separate code bases, multiple programming languages, tools, and workflows for each architecture.

VoTT is a free and open source Electron app for image annotation and labeling developed by Microsoft. The software is written in the TypeScript programming language and is used for building end-to-end object detection models from image and video assets for computer vision algorithms.

Hugging Face, Inc. is an American company incorporated under the Delaware General Corporation Law and based in New York City that develops computation tools for building applications using machine learning. It is most notable for its transformers library built for natural language processing applications and its platform that allows users to share machine learning models and datasets and showcase their work.

Stable Diffusion Image-generating machine learning model

Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of Stability AI and is considered to be a part of the ongoing artificial intelligence boom.

Albumentations is an open-source image augmentation library created in June 2018 by a group of researchers and engineers, including Alexander Buslaev, Vladimir Iglovikov, and Alex Parinov. The library was designed to provide a flexible and efficient framework for data augmentation in computer vision tasks.

A vector database, vector store or vector search engine is a database that can store vectors along with other data items. Vector databases typically implement one or more Approximate Nearest Neighbor algorithms, so that one can search the database with a query vector to retrieve the closest matching database records.

Medical Open Network for AI (MONAI) is an open-source, community-supported framework for deep learning (DL) in healthcare imaging. MONAI provides a collection of domain-optimized implementations of various DL algorithms and utilities specifically designed for medical imaging tasks. MONAI is used in research and industry, aiding the development of various medical imaging applications, including image segmentation, image classification, image registration, and image generation.

Open-source artificial intelligence is an AI system that is freely available to use, study, modify, and share. These attributes extend to each of the system’s components, including datasets, code, and model parameters, promoting a collaborative and transparent approach to AI development.

llama.cpp Software library for LLM inference

llama.cpp is an open source software library written mostly in C++ that performs inference on various large language models such as Llama. It is co-developed alongside the GGML project, a general-purpose tensor library.

References

  1. "cvat_LICENSE at develop · opencv/cvat". GitHub. CVAT.ai.
  2. "Intel open-sources CVAT, a toolkit for data labeling". VentureBeat. 2019-03-05. Retrieved 2019-08-13.
  3. "Computer Vision Annotation Tool: A Universal Approach to Data Annotation". software.intel.com. 2019-03-04. Retrieved 2019-08-13.
  4. "User's guide of Computer Vision Annotation Tool". CVAT.ai. 2019-08-13. Retrieved 2021-07-29.