Lumitrack is a motion capture technology developed by Robert Xiao, Chris Harrison and Scott Hudson at Carnegie Mellon University. It combines projectors and sensors to provide high-fidelity motion tracking. Sensors of this type are used in video game systems, such as Microsoft's Kinect, and in motion capture for movie and television production.
Motion capture is the process of recording the movement of objects or people. It is used in military, entertainment, sports, and medical applications, and for the validation of computer vision and robotics. In filmmaking and video game development, it refers to recording the actions of human actors and using that information to animate digital character models in 2D or 3D computer animation. When it includes face and fingers or captures subtle expressions, it is often referred to as performance capture. In many fields, motion capture is sometimes called motion tracking, but in filmmaking and games, motion tracking usually refers more to match moving.
Chris Harrison is a British-born, American computer scientist and entrepreneur, working in the fields of human-computer interaction, machine learning and sensor-driven interactive systems. He is a professor at Carnegie Mellon University and director of the Future Interfaces Group within the Human-Computer Interaction Institute. He has previously conducted research at AT&T Labs, Microsoft Research, IBM Research and Disney Research. He is also the CTO and co-founder of Qeexo, a machine learning and interaction technology startup.
Scott E. Hudson is a professor in the Human-Computer Interaction Institute at Carnegie Mellon University. He was previously an associate professor in the College of Computing at the Georgia Institute of Technology, and prior to that, an assistant professor of computer science at the University of Arizona. He earned his Ph.D. in computer science at the University of Colorado in 1986.
Although the research prototype of Lumitrack currently uses visible light, it could be adapted to utilize invisible infrared light. According to the university, the sensors require little power and should be cheap to mass-produce. They could even be built into smartphones.
The projectors cover the tracked area with structured patterns called binary m-sequences, which resemble barcodes. Each pattern consists of vertical lines of varying thickness, arranged so that no combination of seven adjacent line types repeats anywhere in the projected image. [1] The sensors read the bars to assess motion. The initial implementation offers sub-millimeter accuracy. When two m-sequences are projected at right angles to each other, a sensor can determine its position in two dimensions; additional sensors enable 3D tracking. [2]
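As an illustrative sketch (not the authors' published code), the window property behind this scheme can be reproduced with a 7-bit linear-feedback shift register: with a primitive feedback polynomial, the register cycles through all 127 nonzero states, so every 7-bit window in one period of its output is unique, and a sensor that observes any seven adjacent bars can locate itself in the pattern.

```python
# Minimal sketch: generate a binary m-sequence with a 7-bit Fibonacci LFSR
# (primitive feedback, x^7 + x^6 + 1 in this shift convention) and verify
# the window property that Lumitrack-style patterns rely on.

def m_sequence(taps=(7, 1), n_bits=7, seed=0b1111111):
    """Emit one full period (2**n_bits - 1 values) of the LFSR output."""
    period = 2 ** n_bits - 1
    state, bits = seed, []
    for _ in range(period):
        bits.append(state & 1)                     # output the low bit
        feedback = 0
        for t in taps:                             # XOR the tapped bits
            feedback ^= (state >> (t - 1)) & 1
        state = (state >> 1) | (feedback << (n_bits - 1))
    return bits

seq = m_sequence()
# Every cyclic 7-bit window of one period should be distinct.
windows = {tuple(seq[(i + j) % len(seq)] for j in range(7))
           for i in range(len(seq))}
assert len(windows) == len(seq) == 127
print("all 127 seven-bar windows are unique")
```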
The sensors are simple to manufacture, require little power, and feature response times in the range of 2.5 milliseconds, [1] making them candidates for incorporation into other devices, such as phones. The sensors can be attached to tracked objects or to fixed objects such as walls. [2]
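Given such a pattern, position decoding reduces to a constant-time table lookup, which is consistent with the fast response times described above. A hypothetical continuation of the previous sketch (the bar width and helper names are illustrative assumptions, not values from the Lumitrack paper):

```python
# Illustrative continuation: map every cyclic 7-bar window of the pattern to
# its offset, so one reading of seven adjacent bars yields an absolute position.

def build_lookup(seq, k=7):
    """Precompute window -> starting offset for all cyclic k-bar windows."""
    n = len(seq)
    return {tuple(seq[(i + j) % n] for j in range(k)): i for i in range(n)}

def decode_position(window, table, bar_width_mm=1.0):
    """Convert k observed bar values into a position along the pattern."""
    return table[tuple(window)] * bar_width_mm     # assumed 1 mm bar pitch

seq = m_sequence()                 # pattern from the previous sketch
table = build_lookup(seq)
observed = seq[41:48]              # suppose the sensor saw bars 41..47
print(decode_position(observed, table))  # -> 41.0
```

Two such lookups against perpendicular patterns would yield an (x, y) position, matching the crossed-pattern scheme described above.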
The developers target video games as an initial application. Other possibilities include CGI for movies and television and human–robot interaction. [2]
Human–robot interaction is the study of interactions between humans and robots. It is often referred to as HRI by researchers. Human–robot interaction is a multidisciplinary field with contributions from human–computer interaction, artificial intelligence, robotics, natural language understanding, design, and social sciences.
A barcode is a method of representing data in a visual, machine-readable form. Initially, barcodes represented data by varying the widths and spacings of parallel lines. These barcodes, now commonly referred to as linear or one-dimensional (1D), can be scanned by special optical scanners, called barcode readers. Later, two-dimensional (2D) variants were developed, using rectangles, dots, hexagons and other geometric patterns, called matrix codes or 2D barcodes, although they do not use bars as such. 2D barcodes can be read or deconstructed using application software on mobile devices with inbuilt cameras, such as smartphones.
A barcode reader is an optical scanner that can read printed barcodes, decode the data contained in the barcode, and send the data to a computer. Like a flatbed scanner, it consists of a light source, a lens, and a light sensor for translating optical impulses into electrical signals. Additionally, nearly all barcode readers contain decoder circuitry that analyzes the barcode's image data provided by the sensor and sends the barcode's content to the scanner's output port.
A rotary encoder, also called a shaft encoder, is an electro-mechanical device that converts the angular position or motion of a shaft or axle to analog or digital output signals.
A handheld projector is an image projector in a handheld device. It was developed as a computer display device for compact portable devices such as mobile phones, personal digital assistants, and digital cameras, which have sufficient storage capacity to handle presentation materials but are too small to accommodate a display screen that an audience can see easily. Handheld projectors involve miniaturized hardware, and software that can project digital images onto a nearby viewing surface.
Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current focuses in the field include emotion recognition from the face and hand gesture recognition. Users can use simple gestures to control or interact with devices without physically touching them. Many approaches have been made using cameras and computer vision algorithms to interpret sign language. However, the identification and recognition of posture, gait, proxemics, and human behaviors is also the subject of gesture recognition techniques. Gesture recognition can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans than primitive text user interfaces or even GUIs, which still limit the majority of input to keyboard and mouse. Using gesture recognition, it is possible to point a finger at the computer screen so that the cursor moves accordingly, allowing users to interact naturally without any mechanical devices. This could make conventional input devices such as the mouse, keyboard, and even touchscreen redundant.
An interactive whiteboard (IWB), also commonly known as an interactive board or smart whiteboard, is a large interactive display in the form factor of a whiteboard. It can either be a standalone touchscreen computer used independently to perform tasks and operations, or a connectable apparatus used as a touchpad to control computers from a projector. They are used in a variety of settings, including classrooms at all levels of education, in corporate board rooms and work groups, in training rooms for professional sports coaching, in broadcasting studios, and others.
Facial motion capture is the process of electronically converting the movements of a person's face into a digital database using cameras or laser scanners. This database may then be used to produce computer graphics (CG) animation for movies, games, or real-time avatars. Because the motion of CG characters is derived from the movements of real people, it results in a more realistic and nuanced computer character animation than if the animation were created manually.
Range imaging is the name for a collection of techniques that are used to produce a 2D image showing the distance to points in a scene from a specific point, normally associated with some type of sensor device.
Document cameras, also known as visual presenters, visualisers, digital overheads, or docucams, are real-time image capture devices for displaying an object to a large audience. Like an opaque projector, a document camera is able to magnify and project the images of actual, three-dimensional objects, as well as transparencies. They are, in essence, high-resolution web cams mounted on arms so as to facilitate their placement over a page. This allows a teacher, lecturer or presenter to write on a sheet of paper or to display a two- or three-dimensional object while the audience watches. Theoretically, all objects can be displayed by a document camera: most objects are simply placed under the camera, which captures a live image that is then shown on a projector or monitor. Different types of document camera/visualizer allow great flexibility in the placement of objects. Larger objects, for example, can simply be placed in front of the camera and the camera rotated as necessary, or a ceiling-mounted document camera can be used to allow a larger working area.
A structured-light 3D scanner is a 3D scanning device for measuring the three-dimensional shape of an object using projected light patterns and a camera system.
In electrical engineering, capacitive sensing is a technology, based on capacitive coupling, that can detect and measure anything that is conductive or has a dielectric different from air.
In the field of gesture recognition and image processing, finger tracking is a high-resolution technique employed to determine the successive positions of a user's fingers and hence represent objects in 3D. In addition, the finger-tracking technique can serve as an input device for the computer, similar to a keyboard or mouse.
PrimeSense was an Israeli 3D sensing company based in Tel Aviv. PrimeSense had offices in Israel, North America, Japan, Singapore, Korea, China and Taiwan. PrimeSense was bought by Apple Inc. for $360 million on November 24, 2013.
A projector or image projector is an optical device that projects an image onto a surface, commonly a projection screen. Most projectors create an image by shining a light through a small transparent lens, but some newer types of projectors can project the image directly, by using lasers. A virtual retinal display, or retinal projector, is a projector that projects an image directly on the retina instead of using an external projection screen.
IllumiRoom is a Microsoft Research project that augments a television screen with images projected onto the wall and surrounding objects. The current proof-of-concept uses a Kinect sensor and video projector. The Kinect sensor captures the geometry and colors of the area of the room that surrounds the television, and the projector displays video around the television that corresponds to a video source on the television, such as a video game or movie.
The Sony Xperia XZs is an Android smartphone manufactured and marketed by Sony. Part of the Xperia X series, the device was announced to the public along with the Xperia XZ Premium at the annual Mobile World Congress in February 2017.