Dlib

Original author(s): Davis E. King
Initial release: 2002
Stable release: 19.24.4 [1] / 31 March 2024
Written in: C++
Operating system: Cross-platform
Type: Library, machine learning
License: Boost
Website: dlib.net

Dlib is a general-purpose, cross-platform software library written in the programming language C++. Its design is heavily influenced by ideas from design by contract and component-based software engineering; it is thus, first and foremost, a set of independent software components. It is open-source software released under the Boost Software License.
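
As an illustration of the design-by-contract idea mentioned above, the following is a minimal sketch of a self-contained C++ component whose preconditions ("requires") and postconditions ("ensures") are stated next to each operation and checked at run time. The class `bounded_stack` and all of its members are hypothetical names chosen for this example; they are not part of Dlib's API, and the sketch makes no claim about how Dlib itself implements its checks.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical component (not part of Dlib): a fixed-capacity stack of ints
// whose contract is written beside each operation and enforced at run time.
class bounded_stack {
public:
    // requires: capacity > 0
    explicit bounded_stack(std::size_t capacity) : capacity_(capacity) {
        if (capacity == 0)
            throw std::invalid_argument("bounded_stack: capacity must be > 0");
        items_.reserve(capacity);
    }

    std::size_t size() const { return items_.size(); }
    std::size_t capacity() const { return capacity_; }

    // requires: size() < capacity()
    // ensures:  size() has grown by one and top() == value
    void push(int value) {
        if (size() >= capacity_)
            throw std::logic_error("push: requires size() < capacity()");
        const std::size_t old_size = size();
        items_.push_back(value);
        // Postcondition check; compiled out when NDEBUG is defined.
        assert(size() == old_size + 1 && items_.back() == value);
    }

    // requires: size() > 0
    int top() const {
        if (items_.empty())
            throw std::logic_error("top: requires size() > 0");
        return items_.back();
    }

    // requires: size() > 0
    // ensures:  size() has shrunk by one
    void pop() {
        if (items_.empty())
            throw std::logic_error("pop: requires size() > 0");
        const std::size_t old_size = size();
        items_.pop_back();
        assert(size() + 1 == old_size);
    }

private:
    std::size_t capacity_;
    std::vector<int> items_;
};
```

In this sketch a caller that violates a precondition, for example by pushing onto a full stack, receives a `std::logic_error` instead of silently corrupting state. The component depends only on the C++ standard library, which reflects the "independent component" idea in a small way.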

Since development began in 2002, Dlib has grown to include a wide variety of tools. As of 2016, it contains software components for dealing with networking, threads, graphical user interfaces, data structures, linear algebra, machine learning, image processing, data mining, XML and text parsing, numerical optimization, Bayesian networks, and many other tasks. In recent years, much of the development has focused on creating a broad set of statistical machine learning tools. In 2009, Dlib was published in the Journal of Machine Learning Research. [2] Since then it has been used in a wide range of domains. [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15]

References

  1. "Release 19.24.4". 31 March 2024. Retrieved 22 April 2024.
  2. King, D. E. (2009). "Dlib-ml: A Machine Learning Toolkit" (PDF). J. Mach. Learn. Res. 10 (Jul): 1755–1758. CiteSeerX 10.1.1.156.3584.
  3. Scholarly research using Dlib
  4. Dlib on mloss.org
  5. Autonome Mobile Systeme 2009
  6. ESS: Extremely Simple Serialization for C++
  7. Gould, S. (2012). "Darwin: A Framework for Machine Learning and Computer Vision Research and Development" (PDF). J. Mach. Learn. Res. 13 (Dec): 3533–3537. CiteSeerX 10.1.1.413.8518.
  8. Yan, J.; Tian, C.; Wang, Y.; Huang, J. (2012). "Online incremental regression for electricity price prediction". Proceedings of 2012 IEEE International Conference on Service Operations and Logistics, and Informatics. p. 31. doi:10.1109/SOLI.2012.6273500. ISBN 978-1-4673-2401-4. S2CID 19017900.
  9. Kuijf, H. J.; Viergever, M. A.; Vincken, K. L. (2013). "Automatic Extraction of the Curved Midsagittal Brain Surface on MR Images". Medical Computer Vision. Recognition Techniques and Applications in Medical Imaging. Lecture Notes in Computer Science. Vol. 7766. p. 225. doi:10.1007/978-3-642-36620-8_22. ISBN 978-3-642-36619-2.
  10. Bormann, Richard Klaus Eduard (2010). "Vision-based place categorization".
  11. Brodu, Nicolas; Lague, Dimitri (2012). "3D terrestrial lidar data classification of complex natural scenes using a multi-scale dimensionality criterion: Applications in geomorphology". ISPRS Journal of Photogrammetry and Remote Sensing. 68: 121–134.
  12. Aung, Zeyar; et al. (2012). "Towards accurate electricity load forecasting in smart grids". DBKDA 2012, The Fourth International Conference on Advances in Databases, Knowledge, and Data Applications.
  13. Rodriguez, A.; Mason, M. T.; Srinivasa, S. S.; Bernstein, M.; Zirbel, A. (2011). "Abort and retry in grasping". 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. p. 1804. doi:10.1109/IROS.2011.6095100. ISBN 978-1-61284-456-5. S2CID 6637718.
  14. Mohan, Vandana; et al. (2010). "Intraoperative prediction of tumor cell concentration from Mass Spectrometry Imaging". Int. Symp. Math. Theo. Netw. Syst.
  15. Nakashima, Y.; Babaguchi, N.; Fan, J. (2010). "Detecting intended human objects in human-captured videos". 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops. p. 33. doi:10.1109/CVPRW.2010.5543721. ISBN 978-1-4244-7029-7. S2CID 16767135.