Vladlen Koltun | |
---|---|
Born | 1980 |
Nationality | Israeli–American |
Citizenship | United States |
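Alma mater | Tel Aviv University (BS, PhD) |
Known for | CARLA, OpenBot, photorealism enhancement, critique of the h-index |
Awards | Sloan Research Fellowship, NSF CAREER Award |
Scientific career | |
Fields | Artificial intelligence, computer vision, machine learning, robotics |
Institutions | Stanford University, Adobe, Intel, Apple Inc. |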
Thesis | Arrangements in four dimensions and related structures (2002) |
Doctoral advisor | Micha Sharir |
Other academic advisors | Christos Papadimitriou |
Website | vladlen |
Vladlen Koltun (born 1980) is an Israeli-American computer scientist and intelligent systems researcher. He currently serves as a distinguished scientist at Apple Inc. His main areas of research are artificial intelligence, computer vision, machine learning, and pattern recognition. He has also made significant contributions to robotics and autonomous driving. [6]
Koltun's research and publications span convolutional neural networks, reality simulation, view synthesis, [7] photorealistic rendering, simulation of urban self-driving cars, 3D computer graphics, robot locomotion, [1] and drone maneuverability in dynamic environments. [8] [9]
He is also known for his work on a photorealism enhancement system, [10] [11] and for a critical study of the Hirsch index (h-index), a widely used metric of scientists' output. [12]
Vladlen Koltun was born in 1980 in Kiev, Ukraine, and grew up in Israel. [13] He received his BS degree in computer science magna cum laude from Tel Aviv University in 2000. He continued his studies at the university and completed his PhD in computer science with honors in 2002 with the thesis Arrangements in four dimensions and related structures; his doctoral advisor was Micha Sharir. [13] [14] He then held a postdoctoral fellowship under the supervision of Christos Papadimitriou at the University of California, Berkeley, where he conducted research in theoretical computer science from 2002 to 2005. [15]
Koltun served as an assistant professor at Stanford University from 2005 to 2013, where he lectured on computer science, computer graphics, and geometric algorithms. [16] During his tenure at Stanford, he supervised PhD students and postdoctoral researchers. [14] While there, he received the Sloan Research Fellowship [2] and the National Science Foundation's CAREER Award. [3]
Koltun's research at Stanford contributed to the development of data-driven 3D modeling technology in collaboration with Siddhartha Chaudhuri. [17] Chaudhuri's work with Koltun, Evangelos Kalogerakis, and Leonidas Guibas resulted in a SIGGRAPH publication in 2011. [18] Mixamo subsequently licensed the technology from Stanford. Adobe Inc. later acquired Mixamo and further developed Adobe Fuse CC, 3D computer graphics software that enabled users to create 3D characters. [19] In 2014, Koltun joined Adobe to conduct research in visual computing, with a primary focus on three-dimensional reconstruction. [20]
Koltun left Adobe to join Intel, where he held various positions in the company's intelligent systems R&D until 2021. [6] [5]
Since August 2021, Koltun has been serving as a distinguished scientist at Apple Inc. [16]
At Intel, Koltun contributed to the development of virtual reality simulators for urban autonomous driving, robots, and drones, focusing on deep reinforcement learning with neural networks in virtual environments. The networks learned by trial and error in simulation before being transferred to robots or drones for real-world use. This method was applied to ANYmal, a quadrupedal robot that uses proprioceptive feedback for locomotion control. [21] [8] [22]
Research on urban autonomous driving led Koltun's group to develop the Car Learning to Act (CARLA) project in 2017. [23] CARLA is an open-source simulator, powered by Unreal Engine, that can be used to test self-driving technologies in realistic environments with randomly generated dangerous situations. [23] [24] [25] The project was funded by Intel Labs and the Toyota Research Institute. [23] [26]
In 2020, inspired by Google Cardboard, Koltun developed OpenBot together with German scientist Matthias Müller. [1] [27] OpenBot turns Android smartphones into four-wheeled robots capable of navigation, object tracking, and obstacle avoidance. The robot features a 3D-printable chassis that accommodates a controller, LEDs, a smartphone mount, and a USB cable. [27] The system consists of an Arduino Nano board, which links the smartphone to the motors and batteries, and an Android app responsible for integrating data. [1] [27] The project was released as open-source software for robotics-related applications, with the software development kit available on GitHub. [28] [29]
Koltun also contributed to 3D photorealistic view synthesis and rendering. In 2021, the photorealism enhancement system described in Enhancing Photorealism Enhancement, [30] his work with other researchers at Intel, was tested on Grand Theft Auto V. [31] [32] [33]
Koltun co-authored research that developed Swift, an autonomous drone racing system that relies on onboard sensors and can match the performance of human world champions. [34] [35] The system combines deep reinforcement learning in simulation with data collected in the real world, enabling the drone to perform effectively in physical environments. [34]
Koltun has questioned the reliability of the h-index, arguing that its values are inflated by the prevalence of papers with many co-authors. This critique was presented in collaboration with David Hafner. [36] [37]
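The co-authorship effect can be illustrated with a short Python sketch. The `h_index` function follows the standard definition of the metric. The `fractional_h_index` function, the even split of citations among authors, and the sample publication record are all hypothetical choices made for illustration; they are not taken from the cited study and may differ from the measure it proposes.

```python
from typing import Sequence, Tuple


def h_index(citations: Sequence[float]) -> int:
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def fractional_h_index(papers: Sequence[Tuple[int, int]]) -> int:
    """h-index computed after splitting each paper's citations among its authors.

    Each paper is a (citations, number_of_authors) pair. The even split is an
    illustrative assumption, not necessarily the method used in the cited work.
    """
    shares = [cites / authors for cites, authors in papers]
    return h_index(shares)


# Hypothetical record: (citations, number_of_authors) for each paper.
papers = [(120, 30), (90, 30), (60, 20), (40, 2), (12, 1), (9, 3), (5, 1)]

print(h_index([cites for cites, _ in papers]))  # 6
print(fractional_h_index(papers))               # 4
```

In this made-up record, three heavily cited papers with 20 to 30 authors each lift the conventional h-index to 6. Once each paper's citations are divided among its authors, the value drops to 4.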