Dlib

Original author(s) Davis E. King
Initial release 2002
Stable release
19.22 [1] / 28 March 2021
Written in C++
Operating system Cross-platform
Type Library, machine learning
License Boost
Website dlib.net

Dlib is a general purpose cross-platform software library written in the programming language C++. Its design is heavily influenced by ideas from design by contract and component-based software engineering. Thus it is, first and foremost, a set of independent software components. It is open-source software released under a Boost Software License.
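To make the design-by-contract idea concrete, the following is a minimal sketch of a component whose operations state and check their preconditions and postconditions. It is not taken from Dlib: the class name bounded_stack and all of its members are hypothetical, chosen only for illustration, and the choice to report violations with exceptions is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical component (not part of Dlib): a fixed-capacity stack whose
// member functions document and enforce their contracts.
class bounded_stack {
public:
    // Precondition: capacity > 0.
    explicit bounded_stack(std::size_t capacity) : capacity_(capacity) {
        if (capacity == 0)
            throw std::invalid_argument("bounded_stack: capacity must be > 0");
        items_.reserve(capacity);
    }

    std::size_t size() const { return items_.size(); }
    bool empty() const { return items_.empty(); }
    bool full() const { return items_.size() == capacity_; }

    // Precondition: !full(). Postcondition: size() has grown by one.
    void push(int value) {
        if (full())
            throw std::logic_error("bounded_stack::push: requires !full()");
        const std::size_t old_size = size();
        items_.push_back(value);
        assert(size() == old_size + 1);  // postcondition, checked in debug builds
        (void)old_size;                  // silence unused warning when NDEBUG is set
    }

    // Precondition: !empty(). Postcondition: size() has shrunk by one.
    int pop() {
        if (empty())
            throw std::logic_error("bounded_stack::pop: requires !empty()");
        const int value = items_.back();
        items_.pop_back();
        return value;
    }

private:
    std::size_t capacity_;
    std::vector<int> items_;
};
```

A caller that pushes onto a full stack or pops from an empty one receives a std::logic_error instead of silently corrupting state, which is the kind of explicit, checkable interface that design by contract aims for.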


Since development began in 2002, Dlib has grown to include a wide variety of tools. As of 2016, it contains software components for dealing with networking, threads, graphical user interfaces, data structures, linear algebra, machine learning, image processing, data mining, XML and text parsing, numerical optimization, Bayesian networks, and many other tasks. In recent years, much of the development has focused on creating a broad set of statistical machine learning tools, and in 2009 Dlib was published in the Journal of Machine Learning Research. [2] Since then it has been used in a wide range of domains. [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15]


Related Research Articles

Computer vision is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to understand and automate tasks that the human visual system can do.

Artificial neural network Computational model used in machine learning, based on connected, hierarchical functions

Artificial neural networks (ANNs), usually simply called neural networks (NNs), are computing systems vaguely inspired by the biological neural networks that constitute animal brains.

Jürgen Schmidhuber German computer scientist

Jürgen Schmidhuber is a computer scientist most noted for his work in the field of artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artificial Intelligence Research in Manno, in the district of Lugano, in Ticino in southern Switzerland. He is sometimes called the "father of (modern) AI"; note that other sources point at Frank Rosenblatt.

Gesture recognition

Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. It is a subdiscipline of computer vision. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current focuses in the field include emotion recognition from the face and hand gesture recognition. Users can use simple gestures to control or interact with devices without physically touching them. Many approaches use cameras and computer vision algorithms to interpret sign language. However, the identification and recognition of posture, gait, proxemics, and human behaviors is also the subject of gesture recognition techniques. Gesture recognition can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans than primitive text user interfaces or even GUIs, which still limit the majority of input to keyboard and mouse. It lets people interact naturally with machines without any mechanical devices. Using gesture recognition, it is possible to point a finger at a screen so that the cursor moves accordingly. This could make conventional input devices redundant.

Claytronics is an abstract future concept that combines nanoscale robotics and computer science to create individual nanometer-scale computers called claytronic atoms, or catoms, which can interact with each other to form tangible 3D objects that a user can interact with. This idea is more broadly referred to as programmable matter. Claytronics has the potential to greatly affect many areas of daily life, such as telecommunication, human-computer interfaces, and entertainment.

Automatic image annotation is the process by which a computer system automatically assigns metadata in the form of captioning or keywords to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest from a database.

Thomas Huang Chinese-American engineer and computer scientist

Thomas Shi-Tao Huang was a Chinese-born American computer scientist, electrical engineer, and writer. He was a researcher and professor emeritus at the University of Illinois at Urbana-Champaign (UIUC). Huang was one of the leading figures in computer vision, pattern recognition and human computer interaction.

Gregory Dudek is a chaired professor of computer science at McGill University. He was the Director of the McGill Center for Intelligent Machines from 2004 to 2007 and the Director of the McGill University School of Computer Science from 2008 to 2016. He served as the Scientific Director of the NSERC Canadian Field Robotics Network from 2012 to 2018, and became Scientific Director and Lead Investigator of its successor, the NSERC Canadian Robotics Network. In 2018, Samsung announced that he would become a VP of Research and lead their new Samsung AI Center in Montreal (SAIC-Montreal). He is the son of poet Louis Dudek. He was made a Dawson Scholar of McGill University and subsequently James McGill Chair, and directs the mobile robotics laboratory there. He has written over 300 refereed articles on computer vision and robotics, and is co-author of the book Computational Principles of Mobile Robotics, which is used to teach robotics at a number of universities.

Conditional random field

Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without considering "neighboring" samples, a CRF can take context into account. To do so, the prediction is modeled as a graphical model, which implements dependencies between the predictions. What kind of graph is used depends on the application. For example, in natural language processing, linear chain CRFs are popular, which implement sequential dependencies in the predictions. In image processing the graph typically connects locations to nearby and/or similar locations to enforce that they receive similar predictions.
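The effect of sequential dependencies in a linear chain can be illustrated with a small sketch of the decoding step only: given a score for each label at each position and a score for each pair of adjacent labels, find the highest-scoring label sequence by dynamic programming (Viterbi decoding). This is a sketch under stated assumptions, not a full CRF: the function name decode_chain and the score tables are hypothetical, and a real CRF would learn the scores from data rather than take them as input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: returns the label sequence with the highest total score.
//   unary[t][y]    : score of assigning label y at position t
//   pairwise[a][b] : score of label a being followed directly by label b
std::vector<std::size_t> decode_chain(const Matrix& unary, const Matrix& pairwise) {
    const std::size_t T = unary.size();
    if (T == 0) return {};
    const std::size_t K = unary[0].size();
    assert(K > 0 && pairwise.size() == K);

    Matrix best(T, std::vector<double>(K, 0.0));  // best score of a path ending in y at t
    std::vector<std::vector<std::size_t>> back(T, std::vector<std::size_t>(K, 0));
    best[0] = unary[0];

    for (std::size_t t = 1; t < T; ++t) {
        for (std::size_t y = 0; y < K; ++y) {
            std::size_t arg = 0;
            double top = best[t - 1][0] + pairwise[0][y];
            for (std::size_t p = 1; p < K; ++p) {
                const double s = best[t - 1][p] + pairwise[p][y];
                if (s > top) { top = s; arg = p; }
            }
            best[t][y] = top + unary[t][y];
            back[t][y] = arg;  // remember which previous label was best
        }
    }

    // Pick the best final label, then follow the back-pointers.
    std::size_t last = 0;
    for (std::size_t y = 1; y < K; ++y)
        if (best[T - 1][y] > best[T - 1][last]) last = y;

    std::vector<std::size_t> labels(T);
    labels[T - 1] = last;
    for (std::size_t t = T - 1; t > 0; --t)
        labels[t - 1] = back[t][labels[t]];
    return labels;
}
```

With all pairwise scores set to zero, the result is the same as classifying each position independently. With pairwise scores that reward equal neighboring labels, a weakly supported label in the middle of a sequence can be overruled by its context.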

Visual odometry

In robotics and computer vision, visual odometry is the process of determining the position and orientation of a robot by analyzing the associated camera images. It has been used in a wide variety of robotic applications, such as on the Mars Exploration Rovers.

Velocity obstacle

In robotics and motion planning, a velocity obstacle, commonly abbreviated VO, is the set of all velocities of a robot that will result in a collision with another robot at some moment in time, assuming that the other robot maintains its current velocity. If the robot chooses a velocity inside the velocity obstacle, then the two robots will eventually collide; if it chooses a velocity outside the velocity obstacle, such a collision is guaranteed not to occur.
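The definition can be turned into a simple membership test. The sketch below assumes two disc-shaped robots in the plane, constant velocities, and an unbounded time horizon; these simplifications, and the names Vec2, Disc and in_velocity_obstacle, are assumptions made for illustration. A candidate velocity is inside the velocity obstacle when the relative motion brings the two centers closer than the sum of the radii at some time t ≥ 0.

```cpp
#include <cassert>

struct Vec2 { double x; double y; };

inline Vec2 operator-(Vec2 a, Vec2 b) { return {a.x - b.x, a.y - b.y}; }
inline double dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }

// Hypothetical robot model: a disc with a position, a velocity and a radius.
struct Disc { Vec2 position; Vec2 velocity; double radius; };

// Returns true if `candidate` (a velocity for robot a) lies inside the
// velocity obstacle induced by robot b, which keeps its current velocity.
bool in_velocity_obstacle(const Disc& a, Vec2 candidate, const Disc& b) {
    assert(a.radius >= 0.0 && b.radius >= 0.0);
    const Vec2 p = b.position - a.position;   // relative position of b
    const Vec2 v = candidate - b.velocity;    // velocity of a relative to b
    const double reach = a.radius + b.radius;
    const double reach_sq = reach * reach;

    if (dot(p, p) < reach_sq) return true;    // discs already overlap
    const double closing = dot(p, v);
    if (closing <= 0.0) return false;         // not approaching, never collides

    // Squared distance between centers at the time of closest approach.
    // dot(v, v) > 0 here because closing > 0 implies v is non-zero.
    const double miss_sq = dot(p, p) - closing * closing / dot(v, v);
    return miss_sq < reach_sq;
}
```

For example, with robot a at the origin and a stationary robot b ten units away along the x axis, both of radius 1, a velocity of (1, 0) is inside the velocity obstacle, while (0, 1) is outside it.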

Masakatsu G. Fujie is a Japanese scientist who has played a major role in cutting-edge research in biomedical engineering. He has been responsible for many advances in the field of robotics.

Matti Kalevi Pietikäinen is a computer scientist. He is currently Professor (emer.) in the Center for Machine Vision and Signal Analysis, University of Oulu, Finland. His research interests are in texture-based computer vision, face analysis, affective computing, biometrics, and vision-based perceptual interfaces. He was Director of the Center for Machine Vision Research, and Scientific Director of Infotech Oulu.

Tendon-driven robots (TDR) are robots whose limbs mimic biological musculoskeletal systems. They use plastic straps to mimic muscles and tendons. Such robots are claimed to move in a "more natural" way than traditional robots that use rigid metal or plastic limbs controlled by geared actuators. TDRs can also help understand how biomechanics relates to embodied intelligence and cognition.

Cloud robotics is a field of robotics that attempts to invoke cloud technologies such as cloud computing, cloud storage, and other Internet technologies centered on the benefits of converged infrastructure and shared services for robotics. When connected to the cloud, robots can benefit from the powerful computation, storage, and communication resources of modern data centers in the cloud, which can process and share information from various robots or agents. Humans can also delegate tasks to robots remotely through networks. Cloud computing technologies enable robot systems to be endowed with powerful capabilities while reducing costs. Thus, it is possible to build lightweight, low-cost, smarter robots with an intelligent "brain" in the cloud. The "brain" consists of a data center, knowledge base, task planners, deep learning, information processing, environment models, communication support, etc.

A facial expression database is a collection of images or video clips with facial expressions of a range of emotions. Well-annotated (emotion-tagged) media content of facial behavior is essential for training, testing, and validation of algorithms for the development of expression recognition systems. The emotion annotation can be done in discrete emotion labels or on a continuous scale. Most of the databases are usually based on the basic emotions theory which assumes the existence of six discrete basic emotions. However, some databases include the emotion tagging in continuous arousal-valence scale.

René Vidal

René Vidal is a Chilean electrical engineer and computer scientist who is known for his research in machine learning, computer vision, medical image computing, robotics, and control theory. He is the Herschel L. Seder Professor of the Johns Hopkins Department of Biomedical Engineering, and the founding director of the Mathematical Institute for Data Science (MINDS).

Gregory D. Hager

Gregory D. Hager is the Mandell Bellmore Professor of Computer Science and founding director of the Johns Hopkins Malone Center for Engineering in Healthcare at Johns Hopkins University.

R Chli is an Assistant Professor and leader of the Vision for Robotics Lab at ETH Zürich in Switzerland. Chli is a leader in the field of computer vision and robotics and was on the team of researchers to develop the first fully autonomous helicopter with onboard localization and mapping. Chli is also the Vice Director of the Institute of Robotics and Intelligent Systems and an Honorary Fellow of the University of Edinburgh in the United Kingdom. Her research currently focuses on developing visual perception and intelligence in flying autonomous robotic systems.

References

  1. "Release 19.22". 28 March 2021. Retrieved 10 April 2021.
  2. King, D. E. (2009). "Dlib-ml: A Machine Learning Toolkit" (PDF). J. Mach. Learn. Res. 10 (Jul): 1755–1758. CiteSeerX 10.1.1.156.3584.
  3. Scholarly research using Dlib
  4. Dlib on mloss.org
  5. Autonome Mobile Systeme 2009
  6. ESS: Extremely Simple Serialization for C++
  7. Gould, S. (2012). "Darwin: A Framework for Machine Learning and Computer Vision Research and Development" (PDF). J. Mach. Learn. Res. 13 (Dec): 3533–3537. CiteSeerX 10.1.1.413.8518.
  8. Yan, J.; Tian, C.; Wang, Y.; Huang, J. (2012). "Online incremental regression for electricity price prediction". Proceedings of 2012 IEEE International Conference on Service Operations and Logistics, and Informatics. p. 31. doi:10.1109/SOLI.2012.6273500. ISBN 978-1-4673-2401-4.
  9. Kuijf, H. J.; Viergever, M. A.; Vincken, K. L. (2013). "Automatic Extraction of the Curved Midsagittal Brain Surface on MR Images". Medical Computer Vision. Recognition Techniques and Applications in Medical Imaging. Lecture Notes in Computer Science. 7766. p. 225. doi:10.1007/978-3-642-36620-8_22. ISBN 978-3-642-36619-2.
  10. Bormann, Richard Klaus Eduard. "Vision-based place categorization." (2010).
  11. Brodu, Nicolas, and Dimitri Lague. "3D terrestrial lidar data classification of complex natural scenes using a multi-scale dimensionality criterion: Applications in geomorphology." ISPRS Journal of Photogrammetry and Remote Sensing 68 (2012): 121–134.
  12. Aung, Zeyar, et al. "Towards accurate electricity load forecasting in smart grids." DBKDA 2012, The Fourth International Conference on Advances in Databases, Knowledge, and Data Applications. 2012.
  13. Rodriguez, A.; Mason, M. T.; Srinivasa, S. S.; Bernstein, M.; Zirbel, A. (2011). "Abort and retry in grasping". 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. p. 1804. doi:10.1109/IROS.2011.6095100. ISBN 978-1-61284-456-5.
  14. Mohan, Vandana, et al. "Intraoperative prediction of tumor cell concentration from Mass Spectrometry Imaging." Int. Symp. Math. Theo. Netw. Syst. 2010.
  15. Nakashima, Y.; Babaguchi, N.; Fan, J. (2010). "Detecting intended human objects in human-captured videos". 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops. p. 33. doi:10.1109/CVPRW.2010.5543721. ISBN 978-1-4244-7029-7.