Original author(s) | Davis E. King |
---|---|
Initial release | 2002 |
Stable release | 19.24.4 [1] / 31 March 2024 |
Repository | |
Written in | C++ |
Operating system | Cross-platform |
Type | Library, machine learning |
License | Boost |
Website | dlib |
Dlib is a general-purpose, cross-platform software library written in the programming language C++. Its design is heavily influenced by ideas from design by contract and component-based software engineering. It is therefore, first and foremost, a set of independent software components. It is open-source software released under the Boost Software License.
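As a rough illustration of what design by contract means for a small, independent component, the sketch below uses only the C++ standard library. The `require` helper and the `bounded_stack` class are hypothetical names invented for this example; they are not part of Dlib and do not reproduce its actual checking mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a Dlib function): throws when a caller
// breaks a documented precondition.
inline void require(bool condition, const std::string& clause) {
    if (!condition) {
        throw std::logic_error("contract violated: " + clause);
    }
}

// A small self-contained component: a stack of ints with a fixed capacity.
// Each member function states its contract in a comment and checks it.
class bounded_stack {
public:
    // Precondition: capacity > 0.
    explicit bounded_stack(std::size_t capacity) : capacity_(capacity) {
        require(capacity > 0, "capacity > 0");
    }

    std::size_t size() const { return items_.size(); }
    bool is_full() const { return items_.size() == capacity_; }

    // Precondition: !is_full().
    // Postcondition: size() is one larger than before the call.
    void push(int value) {
        require(!is_full(), "push requires !is_full()");
        const std::size_t old_size = size();
        items_.push_back(value);
        assert(size() == old_size + 1);  // internal postcondition check
    }

    // Precondition: size() > 0.
    // Postcondition: returns the most recently pushed value and removes it.
    int pop() {
        require(size() > 0, "pop requires size() > 0");
        const int top = items_.back();
        items_.pop_back();
        return top;
    }

private:
    std::size_t capacity_;
    std::vector<int> items_;
};
```

A caller that respects the contract, for example constructing `bounded_stack s(2)` and calling `s.push(1)` followed by `s.pop()`, never triggers a check. Pushing onto a full stack throws `std::logic_error`, which reports the misuse at the component boundary instead of letting it corrupt state silently.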
Since development began in 2002, Dlib has grown to include a wide variety of tools. As of 2016, it contains software components for networking, threads, graphical user interfaces, data structures, linear algebra, machine learning, image processing, data mining, XML and text parsing, numerical optimization, Bayesian networks, and many other tasks. In recent years, much of the development has focused on creating a broad set of statistical machine learning tools, and in 2009 Dlib was published in the Journal of Machine Learning Research. [2] Since then it has been used in a wide range of domains. [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15]