Gary Bradski | |
---|---|
Born | United States |
Nationality | American |
Alma mater | University of California at Berkeley, Boston University |
Awards | DARPA Grand Challenge First Place (2005) |
Scientific career | |
Fields | Computer Science, Computer Vision |
Institutions | Intel, Willow Garage, Industrial Perception, Magic Leap [1] |
Gary Bradski is an American scientist, engineer, entrepreneur, and author. He co-founded Industrial Perception, a company that developed perception applications for industrial robotics (acquired by Google in 2012 [2] ), and has worked on the OpenCV computer vision library, about which he has also published a book. [3]
OpenCV is a software library for computer vision.
Originally published in 2006, the book Learning OpenCV (O'Reilly) serves as an introduction to the library and its use. An updated version of the book, which covers OpenCV 3, was published by O'Reilly Media in 2016.
Bradski has published a wide variety of articles in computer science on the topics of computer vision and optimization. The following are his most highly cited works: [4]
Computer vision tasks include methods for acquiring, processing, analyzing and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical or symbolic information, e.g. in the form of decisions. Understanding in this context means the transformation of visual images into descriptions of the world that make sense to thought processes and can elicit appropriate action. This image understanding can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory.
Machine vision is the technology and methods used to provide imaging-based automatic inspection and analysis for such applications as automatic inspection, process control, and robot guidance, usually in industry. Machine vision refers to many technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of computer science. It attempts to integrate existing technologies in new ways and apply them to solve real world problems. The term is the prevalent one for these functions in industrial automation environments but is also used for these functions in other environments, such as vehicle guidance.
OpenCV is a library of programming functions mainly for real-time computer vision. Originally developed by Intel, it was later supported by Willow Garage, then Itseez. The library is cross-platform and licensed as free and open-source software under Apache License 2. Starting in 2011, OpenCV features GPU acceleration for real-time operations.
libavcodec is a free and open-source library of codecs for encoding and decoding video and audio data.
Stanley is an autonomous car created by Stanford University's Stanford Racing Team in cooperation with the Volkswagen Electronics Research Laboratory (ERL). It won the 2005 DARPA Grand Challenge, earning the Stanford Racing Team a $2 million prize.
The following outline is provided as an overview of and topical guide to artificial intelligence:
Computer stereo vision is the extraction of 3D information from digital images, such as those obtained by a CCD camera. By comparing information about a scene from two vantage points, 3D information can be extracted by examining the relative positions of objects in the two panels. This is similar to the biological process of stereopsis.
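As a minimal sketch of how comparing two vantage points yields 3D information, the example below assumes an idealised, rectified pair of pinhole cameras, for which depth equals focal length times baseline divided by disparity. The function name `depth_from_disparity` and all numeric values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth in metres of a point seen at column x_left in the left image
    and column x_right in the right image of a rectified stereo pair.

    focal_px   -- focal length in pixels (assumed equal for both cameras)
    baseline_m -- distance between the two camera centres in metres
    """
    disparity = x_left - x_right  # horizontal shift between the two views
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity


# Hypothetical measurements: a large shift means a near object,
# a small shift means a distant one.
near = depth_from_disparity(420.0, 380.0, focal_px=700.0, baseline_m=0.12)
far = depth_from_disparity(405.0, 400.0, focal_px=700.0, baseline_m=0.12)
print(near, far)
```

Real stereo systems must first calibrate and rectify the cameras and find matching points between the images; this sketch covers only the final triangulation step.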
Willow Garage was a robotics research lab and technology incubator devoted to developing hardware and open source software for personal robotics applications. The company was best known for its open source software suite Robot Operating System (ROS), which rapidly became a common, standard tool among robotics researchers upon its initial release in 2010. It was begun in late 2006 by Scott Hassan, who had worked with Larry Page and Sergey Brin to develop the technology that became the Google Search engine. Steve Cousins was the president and CEO. Willow Garage was located in Menlo Park, California.
The following outline is provided as an overview of and topical guide to robotics:
Andrew Yan-Tak Ng is a British-American computer scientist and technology entrepreneur focusing on machine learning and artificial intelligence (AI). Ng was a cofounder and head of Google Brain and was formerly Chief Scientist at Baidu, where he built the company's Artificial Intelligence Group into a team of several thousand people.
AForge.NET is a computer vision and artificial intelligence library originally developed by Andrew Kirillov for the .NET Framework.
Deep learning is the subset of machine learning methods based on neural networks with representation learning. The adjective "deep" refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.
Google Brain was a deep learning artificial intelligence research team under the umbrella of Google AI, a research division at Google dedicated to artificial intelligence. Formed in 2011, it combined open-ended machine learning research with information systems and large-scale computing resources. It created tools such as TensorFlow, which allow neural networks to be used by the public, and multiple internal AI research projects, and aimed to create research opportunities in machine learning and natural language processing. It was merged into former Google sister company DeepMind to form Google DeepMind in April 2023.
Adrian Kaehler is an American scientist, engineer, entrepreneur, inventor and author. He is best known for his work on the OpenCV Computer Vision library, as well as two books on that library.
Movidius is a company based in San Mateo, California, that designs low-power processor chips for computer vision. The company was acquired by Intel in September 2016.
An AI accelerator, deep learning processor or neural processing unit (NPU) is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and computer vision. Typical applications include algorithms for robotics, Internet of Things, and other data-intensive or sensor-driven tasks. They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. As of 2024, a typical AI integrated circuit chip contains tens of billions of MOSFETs.
David Stavens is an American entrepreneur and scientist. He was co-founder and CEO of Udacity; a co-creator of Stanley, the winning self-driving car of the DARPA Grand Challenge; and co-founder and CEO of Nines, a creator of AI-enabled FDA-approved medical devices. Stavens has published in the fields of robotics, machine learning, and artificial intelligence and has helped start organizations with an aggregate market value of over $30 billion.
DeepScale, Inc. was an American technology company headquartered in Mountain View, California, that developed perceptual system technologies for automated vehicles. On October 1, 2019, the company was acquired by Tesla, Inc.
Studierfenster or StudierFenster (SF) is a free, non-commercial open science client/server-based medical imaging processing online framework. It offers capabilities, like viewing medical data (computed tomography (CT), magnetic resonance imaging (MRI), etc.) in two- and three-dimensional space directly in the standard web browsers, like Google Chrome, Mozilla Firefox, Safari, and Microsoft Edge. Other functionalities are the calculation of medical metrics (dice score and Hausdorff distance), manual slice-by-slice outlining of structures in medical images (segmentation), manual placing of (anatomical) landmarks in medical image data, viewing medical data in virtual reality, a facial reconstruction and registration of medical data for augmented reality, one click showcases for COVID-19 and veterinary scans, and a Radiomics module.
A vision transformer (ViT) is a transformer designed for computer vision. A ViT breaks down an input image into a series of patches, serialises each patch into a vector, and maps it to a smaller dimension with a single matrix multiplication. These vector embeddings are then processed by a transformer encoder as if they were token embeddings.
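The patch step described above can be sketched in plain Python. This is an illustrative toy under stated assumptions, not the code of any particular ViT implementation: the image is a 4x4 single-channel grid, the helper names `split_into_patches` and `project` are hypothetical, and the projection matrix is fixed by hand, whereas in a trained model it would be learned. Position embeddings and the transformer encoder itself are omitted.

```python
def split_into_patches(image, patch):
    """Cut an H x W image (list of rows) into non-overlapping patch x patch
    squares and flatten each square into one vector, in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    vectors = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            vectors.append([image[top + r][left + c]
                            for r in range(patch)
                            for c in range(patch)])
    return vectors


def project(vectors, weights):
    """Map each length-n vector to length d with one n x d matrix multiplication."""
    out_dim = len(weights[0])
    return [[sum(v[i] * weights[i][j] for i in range(len(v)))
             for j in range(out_dim)]
            for v in vectors]


# Toy 4x4 image whose pixel values are 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, 2)        # four patches, each of length 4
weights = [[1, 0], [0, 1], [1, 0], [0, 1]]    # hypothetical 4 -> 2 projection
tokens = project(patches, weights)            # one embedding per patch
print(tokens)
```

Each entry of `tokens` plays the role of a token embedding that a transformer encoder would then process.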