Pose tracking


In virtual reality (VR) and augmented reality (AR), a pose tracking system detects the precise pose of head-mounted displays, controllers, other objects, or body parts within Euclidean space. Pose tracking is often referred to as 6DOF tracking, after the six degrees of freedom in which the pose is tracked. [1]


Pose tracking is sometimes referred to as positional tracking, but the two are distinct: pose tracking includes orientation, whereas positional tracking does not. In some consumer GPS systems, orientation data is added using magnetometers, which give partial orientation information, but not the full orientation that pose tracking provides.

Pose tracking in virtual reality

In VR, it is paramount that pose tracking is both accurate and precise so as not to break the illusion of being in a virtual world. Several methods of tracking the position and orientation (pitch, yaw and roll) of the display and any associated objects or devices have been developed to achieve this. Many methods utilize sensors which repeatedly record signals from transmitters on or near the tracked object(s), and then send that data to the computer in order to maintain an approximation of their physical locations. A popular tracking method is Lighthouse tracking. By and large, these physical locations are identified and defined using one or more of three coordinate systems: the Cartesian rectilinear system, the spherical polar system, and the cylindrical system. Many interfaces have also been designed to monitor and control one's movement within and interaction with the virtual 3D space; such interfaces must work closely with positional tracking systems to provide a seamless user experience. [2]
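
Converting between the coordinate systems mentioned above is routine; a minimal sketch of spherical-to-Cartesian conversion, under the assumed convention that theta is the polar angle from the z-axis and phi the azimuth in the x-y plane:

```python
import math

# Convert a tracked point from spherical polar (r, theta, phi)
# coordinates to Cartesian (x, y, z).
def spherical_to_cartesian(r, theta, phi):
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

# A point 2 m away, straight ahead on the x-axis:
print(spherical_to_cartesian(2.0, math.pi / 2, 0.0))  # ≈ (2.0, 0.0, 0.0)
```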

Another type of pose tracking, used more often in newer systems, is referred to as inside-out tracking; it includes simultaneous localization and mapping (SLAM) and visual-inertial odometry (VIO). One example of a device that uses inside-out pose tracking is the Oculus Quest 2.

Wireless tracking

Wireless tracking uses a set of anchors placed around the perimeter of the tracking space and one or more tags that are tracked. The system is similar in concept to GPS, but works both indoors and outdoors, and is sometimes referred to as indoor GPS. The tags triangulate their 3D position using the anchors placed around the perimeter. A wireless technology called Ultra Wideband has enabled position tracking to reach a precision of under 100 mm. By using sensor fusion and high-speed algorithms, the tracking precision can reach the 5 mm level with update rates of 200 Hz, or 5 ms latency.
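
A minimal sketch of how a tag could recover its position from ranges to known anchors. The anchor layout and tag position below are illustrative, and the least-squares linearization is a standard textbook technique rather than any particular product's method:

```python
import numpy as np

def multilaterate(anchors, ranges):
    """Estimate a tag's 3D position from ranges to known anchors.

    Linearizes the sphere equations |x - a_i|^2 = r_i^2 by subtracting
    the first one, then solves the linear system in a least-squares sense.
    """
    a = np.asarray(anchors, dtype=float)
    r = np.asarray(ranges, dtype=float)
    # 2 (a_i - a_0) . x = r_0^2 - r_i^2 + |a_i|^2 - |a_0|^2  for i = 1..n-1
    A = 2.0 * (a[1:] - a[0])
    b = r[0]**2 - r[1:]**2 + np.sum(a[1:]**2, axis=1) - np.sum(a[0]**2)
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos

# Four UWB anchors at the corners of a room, tag somewhere inside:
anchors = [(0, 0, 0), (5, 0, 0), (0, 5, 0), (5, 5, 2.5)]
tag = np.array([2.0, 3.0, 1.2])
ranges = [np.linalg.norm(tag - np.array(a)) for a in anchors]
print(multilaterate(anchors, ranges))  # ≈ [2.0, 3.0, 1.2]
```

With more than four anchors the same code averages out ranging noise, which is one reason real UWB systems deploy redundant anchors.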


Optical tracking

Markerless pose tracking

Optical tracking uses cameras placed on or around the headset to determine position and orientation based on computer vision algorithms. This method is based on the same principle as stereoscopic human vision. When a person looks at an object using binocular vision, they are able to judge approximately how far away the object is due to the difference in perspective between the two eyes. In optical tracking, cameras are calibrated to determine the distance to the object and its position in space. Optical systems are reliable and relatively inexpensive, but they can be difficult to calibrate. Furthermore, the system requires a direct line of sight without occlusions, otherwise it will receive incorrect data.
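
The stereoscopic principle reduces to a simple relation: a feature appears at different image positions in the two cameras, and that disparity determines depth via Z = f·B/d. A sketch with illustrative focal-length and baseline values:

```python
# Depth from stereo disparity, the principle behind two-camera optical
# tracking: Z = f * B / d, where f is the focal length (pixels), B the
# baseline between the cameras (metres), and d the disparity (pixels).
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    if disparity_px <= 0:
        raise ValueError("zero disparity corresponds to a point at infinity")
    return focal_px * baseline_m / disparity_px

# A feature seen 40 px apart by two cameras 10 cm apart, f = 800 px:
print(stereo_depth(800, 0.10, 40))  # 2.0 metres
```

Note how depth resolution degrades with distance: halving the disparity doubles the computed depth, so small pixel errors matter more for far objects.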

Optical tracking can be done either with or without markers. Tracking with markers involves targets with known patterns to serve as reference points, and cameras constantly seek these markers and then use various algorithms (for example, the POSIT algorithm) to extract the position of the object. Markers can be visible, such as printed QR codes, but many systems use infrared (IR) light that can only be picked up by cameras. Active implementations feature markers with built-in IR LED lights which can turn on and off to sync with the camera, making it easier to block out other IR lights in the tracking area. [5] Passive implementations use retroreflective markers, which reflect the IR light back toward the source with little scattering. Markerless tracking does not require any pre-placed targets, instead using the natural features of the surrounding environment to determine position and orientation. [6]

Outside-in tracking

In this method, cameras are placed in stationary locations in the environment to track the position of markers on the tracked device, such as a head-mounted display or controllers. Having multiple cameras allows for different views of the same markers, and this overlap allows for accurate readings of the device position. [5] The original Oculus Rift utilizes this technique, placing a constellation of IR LEDs on its headset and controllers to allow external cameras in the environment to read their positions. [7] This method is the most mature, having applications not only in VR but also in motion capture technology for film. [8] However, this solution is space-limited, needing external sensors in constant view of the device.


Inside-out tracking

In this method, the camera is placed on the tracked device and looks outward to determine its location in the environment. Headsets that use this approach have multiple cameras facing different directions to get views of their entire surroundings. This method can work with or without markers. The Lighthouse system used by the HTC Vive is an example of active markers. Each external Lighthouse module contains IR LEDs as well as a laser array that sweeps in horizontal and vertical directions, and sensors on the headset and controllers can detect these sweeps and use the timings to determine position. [10] [11] Markerless tracking, such as on the Oculus Quest, does not require anything mounted in the outside environment. It uses cameras on the headset for a process called SLAM, or simultaneous localization and mapping, in which a 3D map of the environment is generated in real time. [6] Machine learning algorithms then determine where the headset is positioned within that 3D map, using feature detection to reconstruct and analyze its surroundings. [12] [13] This technology allows high-end headsets like the Microsoft HoloLens to be self-contained, but it also opens the door for cheaper mobile headsets that do not need to be tethered to external computers or sensors. [14]
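
The sweep-timing idea can be sketched numerically. The 60 Hz rotation rate and the helper function below are illustrative assumptions, not the actual Lighthouse protocol:

```python
import math

# Lighthouse-style principle (hedged sketch): a base station sweeps a
# laser plane at a known rotation rate, and a photodiode on the headset
# records when the sweep hits it. The hit time relative to a sync pulse
# encodes the angle of the sensor as seen from the base station.
SWEEP_PERIOD_S = 1 / 60          # assumed: one full rotation per 60 Hz cycle

def sweep_angle(hit_time_s: float, sync_time_s: float) -> float:
    """Angle (radians) of the sensor from the sweep's start position."""
    fraction = (hit_time_s - sync_time_s) / SWEEP_PERIOD_S
    return 2 * math.pi * fraction

# A hit 1/240 s after the sync pulse lies a quarter-turn into the sweep:
print(sweep_angle(1 / 240, 0.0))  # ≈ 1.5708 (π/2)
```

Two such angles (horizontal and vertical sweeps) per base station give a bearing ray; intersecting rays, or combining bearings with the known sensor layout on the device, yields the full pose.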


Inertial tracking

Inertial tracking uses data from accelerometers and gyroscopes, and sometimes magnetometers. Accelerometers measure linear acceleration. Since the derivative of position with respect to time is velocity and the derivative of velocity is acceleration, the output of the accelerometer can be integrated to find the velocity and then integrated again to find the position relative to some initial point. Gyroscopes measure angular velocity, which can likewise be integrated to determine angular position relative to the initial point. Magnetometers measure magnetic fields and magnetic dipole moments. The direction of Earth's magnetic field can be used as an absolute orientation reference and to compensate for gyroscopic drift. [15] Modern inertial measurement units (IMUs) are based on MEMS technology, which allows orientation (roll, pitch, yaw) to be tracked with high update rates and minimal latency. Gyroscopes are always used for rotational tracking, but different techniques are used for positional tracking based on factors like cost, ease of setup, and tracking volume. [16]
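
The double integration described above can be sketched as follows; the 100 Hz sample rate and constant sensor readings are illustrative:

```python
import numpy as np

# Dead-reckoning sketch: integrate accelerometer samples twice to get
# position, and gyroscope samples once to get heading.
dt = 0.01                                   # 100 Hz IMU sample period
accel = np.full(100, 2.0)                   # constant 2 m/s^2 for 1 s
gyro = np.full(100, np.deg2rad(90.0))       # constant 90 deg/s for 1 s

velocity = np.cumsum(accel) * dt            # first integration
position = np.cumsum(velocity) * dt         # second integration
heading = np.sum(gyro) * dt                 # single integration

print(position[-1])           # ≈ 1.01 m (rectangle rule; exact 0.5*a*t^2 = 1.0)
print(np.rad2deg(heading))    # ≈ 90 degrees turned
```

Even in this noise-free example the rectangle-rule integration overshoots the exact answer slightly, hinting at why real IMU pipelines use careful numerical integration and drift correction.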

Dead reckoning is used to track positional data, updating the virtual environment with the user's motion changes. [17] The dead reckoning update rate and prediction algorithm used in a virtual reality system affect the user experience, but there is no consensus on best practices as many different techniques have been used. [17] It is hard to rely on inertial tracking alone to determine the precise position because dead reckoning leads to drift, so this type of tracking is not used in isolation in virtual reality. [18] A lag between the user's movement and the virtual reality display of more than 100 ms has been found to cause nausea. [19]
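
Why drift is unavoidable is easy to see numerically: any constant sensor bias, integrated twice, produces a position error that grows quadratically with time. The bias value below is illustrative:

```python
# A tiny uncorrected accelerometer bias, integrated twice, gives a
# position error of approximately 0.5 * bias * t^2.
bias = 0.01                     # m/s^2, a small constant sensor bias
for t in (1, 10, 60):
    error = 0.5 * bias * t**2
    print(f"after {t:3d} s: {error:.2f} m of drift")
# after   1 s: 0.01 m
# after  10 s: 0.50 m
# after  60 s: 18.00 m
```

A centimetre-scale error after one second becomes many metres after a minute, which is why inertial data is always anchored by some absolute reference in practice.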

Inertial sensors are capable of tracking not only rotational movement (roll, pitch, yaw), but also translational movement. These two types of movement together are known as the six degrees of freedom (6DOF). Many applications of virtual reality need to track not only the user's head rotations, but also how their body moves (left/right, back/forth, up/down). [20] Six degrees of freedom capability is not necessary for all virtual reality experiences, but it is useful when the user needs to move things other than their head.


Sensor fusion

Sensor fusion combines data from several tracking algorithms and can yield better outputs than any single technology. One variant of sensor fusion is to merge inertial and optical tracking. These two techniques are often used together because while inertial sensors are optimal for tracking fast movements, they also accumulate errors quickly, whereas optical sensors offer absolute references to compensate for inertial weaknesses. [16] Further, inertial tracking can offset some shortfalls of optical tracking. For example, optical tracking can be the main tracking method, but when an occlusion occurs, inertial tracking estimates the position until the objects are visible to the optical camera again. Inertial tracking can also generate position data in between optical tracking samples because inertial tracking has a higher update rate. Optical tracking also helps to cope with the drift of inertial tracking. Combining optical and inertial tracking has been shown to reduce the misalignment errors that commonly occur when a user moves their head too fast. [21] Advances in microelectromechanical systems have made magnetic/electric tracking more common due to their small size and low cost. [22]
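
One of the simplest ways to realize this combination is a complementary filter: trust the fast inertial deltas between frames, and nudge the estimate toward each absolute optical fix when one arrives. A hedged one-dimensional sketch with illustrative rates, gain, and drift values:

```python
# Complementary-filter sketch of optical/inertial fusion. The gain,
# update rates, and drift magnitude below are illustrative, not taken
# from any particular tracking product.
ALPHA = 0.2           # how strongly an optical fix corrects the estimate

def fuse(estimate, inertial_delta, optical_fix=None):
    estimate = estimate + inertial_delta          # fast inertial update
    if optical_fix is not None:                   # occasional camera fix
        estimate = (1 - ALPHA) * estimate + ALPHA * optical_fix
    return estimate

x = 0.0                                           # fused estimate
true_x = 0.0                                      # ground-truth position
for step in range(1000):
    true_x += 0.001                               # user moves 1 mm/step
    drifty_delta = 0.001 + 0.0002                 # inertial delta + drift
    fix = true_x if step % 16 == 0 else None      # fix every 16th step
    x = fuse(x, drifty_delta, fix)

print(abs(x - true_x) < 0.05)  # True: drift stays bounded
```

Without the optical fixes the same loop would accumulate 0.2 m of error; with them, the error stays bounded at the centimetre level, which is the essence of the fusion argument above.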

Acoustic tracking

Acoustic tracking systems use techniques for identifying an object or device's position similar to those found naturally in animals that use echolocation. Analogous to bats locating objects using differences in soundwave return times to their two ears, acoustic tracking systems in VR may use sets of at least three ultrasonic sensors and at least three ultrasonic transmitters on devices in order to calculate the position and orientation of an object (e.g. a handheld controller). [23] There are two ways to determine the position of the object: to measure the time-of-flight of the sound wave from the transmitter to the receivers, or to measure the phase coherence of the sinusoidal sound wave as it is received.

Time-of-flight methods

Given a set of three noncollinear sensors (or receivers) with distances between them d1 and d2, as well as the travel times of an ultrasonic soundwave (a wave with frequency greater than 20 kHz) from a transmitter to those three receivers, the relative Cartesian position of the transmitter can be calculated as follows. Choosing coordinates so that receiver 1 sits at the origin, receiver 2 at (d1, 0, 0) and receiver 3 at (0, d2, 0):

x = (l1² − l2² + d1²) / (2·d1)
y = (l1² − l3² + d2²) / (2·d2)
z = √(l1² − x² − y²)

Here, each li represents the distance from the transmitter to receiver i, calculated from the travel time t of the ultrasonic wave using the equation l = c·t. The constant c denotes the speed of sound, which is equal to 343.2 m/s in dry air at a temperature of 20 °C. Because at least three receivers are required, these calculations are commonly known as triangulation (strictly speaking trilateration, since distances rather than angles are measured).
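
A sketch of this time-of-flight calculation in code, with receiver 1 at the origin, receiver 2 at (d1, 0, 0) and receiver 3 at (0, d2, 0); the transmitter position and receiver spacing below are illustrative:

```python
import math

SPEED_OF_SOUND = 343.2  # m/s in dry air at 20 °C

def locate(t1, t2, t3, d1, d2):
    """Closed-form transmitter position from three travel times, with
    receiver 1 at the origin, receiver 2 at (d1, 0, 0) and
    receiver 3 at (0, d2, 0)."""
    l1, l2, l3 = (SPEED_OF_SOUND * t for t in (t1, t2, t3))
    x = (l1**2 - l2**2 + d1**2) / (2 * d1)
    y = (l1**2 - l3**2 + d2**2) / (2 * d2)
    z = math.sqrt(max(l1**2 - x**2 - y**2, 0.0))   # clamp tiny negatives
    return x, y, z

# Transmitter at (0.4, 0.3, 1.0) m with receivers 1 m apart:
src = (0.4, 0.3, 1.0)
rx = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
times = [math.dist(src, r) / SPEED_OF_SOUND for r in rx]
print(locate(*times, d1=1.0, d2=1.0))  # ≈ (0.4, 0.3, 1.0)
```

Note the sign ambiguity in z: the square root only recovers the transmitter's distance from the receiver plane, so real systems assume the transmitter is on a known side of that plane.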

Beyond its position, determining a device's orientation (i.e. its degree of rotation in all directions) requires at least three noncollinear points on the tracked object to be known, requiring at least three ultrasonic transmitters per tracked device in addition to the three aforementioned receivers. The transmitters emit ultrasonic waves in sequence toward the three receivers, which can then be used to derive spatial data on the three transmitters using the methods described above. The device's orientation can then be derived from the known placement of the transmitters on the device and their spatial locations relative to one another. [24]
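
Once the three transmitter positions are known, the orientation step amounts to building an orthonormal frame from them. A minimal sketch; the transmitter layout below is illustrative:

```python
import numpy as np

# Recover a device's orientation from the tracked positions of three
# noncollinear points (e.g. transmitters) mounted on it.
def frame_from_points(p1, p2, p3):
    """Rotation matrix whose columns are the device's local axes."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    x = p2 - p1
    x /= np.linalg.norm(x)                   # first axis: along one edge
    n = np.cross(p2 - p1, p3 - p1)           # normal of the point plane
    z = n / np.linalg.norm(n)                # third axis: plane normal
    y = np.cross(z, x)                       # second axis completes frame
    return np.column_stack([x, y, z])

# A device lying flat, with transmitters along its edges:
R = frame_from_points([0, 0, 0], [0.1, 0, 0], [0, 0.1, 0])
print(R)  # identity matrix: the device is in its reference orientation
```

Comparing this frame against the frame computed from the transmitters' known positions on the device yields the rotation of the device itself.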

Phase-coherent methods

As opposed to TOF methods, phase-coherent (PC) tracking methods have also been used to locate objects acoustically. PC tracking involves comparing the phase of the current soundwave received by sensors to that of a prior reference signal, so that the relative change in position of the transmitters since the last measurement can be determined. Because this method operates only on observed changes in position, and not on absolute measurements, any measurement errors tend to compound over more observations. Consequently, this method has lost popularity with developers over time.



In summary, implementation of acoustic tracking is optimal in cases where one has total control over the ambient environment that the VR or AR system resides in, such as a flight simulator. [2] [25] [26]

Magnetic tracking

Magnetic tracking relies on measuring the intensity of inhomogeneous magnetic fields with electromagnetic sensors. A base station, often referred to as the system's transmitter or field generator, generates an alternating or a static electromagnetic field, depending on the system's architecture.

To cover all directions in three-dimensional space, three magnetic fields are generated sequentially. The magnetic fields are generated by three electromagnetic coils which are perpendicular to each other. These coils are placed in a small housing mounted on the moving target whose position is to be tracked. Current sequentially passing through the coils turns them into electromagnets, which allows their position and orientation in space to be determined.

Because magnetic tracking does not require a head-mounted display, which is frequently used in virtual reality, it is often the tracking system used in fully immersive virtual reality displays. [21] Conventional equipment like head-mounted displays is obtrusive to the user in fully enclosed virtual reality experiences, so alternative equipment such as that used in magnetic tracking is favored. Magnetic tracking has been implemented by Polhemus and in the Razer Hydra by Sixense. The system works poorly near any electrically conductive material, such as metal objects and devices, that can affect the electromagnetic field. Magnetic tracking worsens as the user moves away from the base emitter, [21] and the scalable area is limited, to no more than about 5 meters.



Related Research Articles

Virtual reality

Virtual reality (VR) is a simulated experience that employs 3D near-eye displays and pose tracking to give the user an immersive feel of a virtual world. Applications of virtual reality include entertainment, education and business. VR is one of the key technologies in the reality-virtuality continuum. As such, it is different from other digital visualization solutions, such as augmented virtuality and augmented reality.

Augmented reality

Augmented reality (AR) is an interactive experience that combines the real world and computer-generated 3D content. The content can span multiple sensory modalities, including visual, auditory, haptic, somatosensory and olfactory. AR can be defined as a system that incorporates three basic features: a combination of real and virtual worlds, real-time interaction, and accurate 3D registration of virtual and real objects. The overlaid sensory information can be constructive, or destructive. As such, it is one of the key technologies in the reality-virtuality continuum.

Motion capture

Motion capture is the process of recording the movement of objects or people. It is used in military, entertainment, sports, medical applications, and for validation of computer vision and robots. In filmmaking and video game development, it refers to recording actions of human actors and using that information to animate digital character models in 2D or 3D computer animation. When it includes face and fingers or captures subtle expressions, it is often referred to as performance capture. In many fields, motion capture is sometimes called motion tracking, but in filmmaking and games, motion tracking usually refers more to match moving.

Cave automatic virtual environment

A cave automatic virtual environment is an immersive virtual reality environment where projectors are directed to between three and six of the walls of a room-sized cube. The name is also a reference to the allegory of the Cave in Plato's Republic in which a philosopher contemplates perception, reality, and illusion.

Head-mounted display

A head-mounted display (HMD) is a display device, worn on the head or as part of a helmet, that has a small display optic in front of one or each eye. HMDs have many uses including gaming, aviation, engineering, and medicine.

Wired glove

A wired glove is an input device for human–computer interaction worn like a glove.

The Sword of Damocles is widely misattributed as the name of the first AR display prototype. According to Ivan Sutherland, this was merely a joke name for the mechanical system that supported and tracked the actual HMD below it. It happened to look like a giant overhead cross, hence the joke. Ivan Sutherland's 1968 ground-breaking AR prototype was actually called "the head-mounted display", which is perhaps the first recorded use of the term "HMD", and he preferred "Stereoscopic-Television Apparatus for Individual Use."

Six degrees of freedom

Six degrees of freedom (6DOF), or sometimes six degrees of movement, refers to the six mechanical degrees of freedom of movement of a rigid body in three-dimensional space. Specifically, the body is free to change position as forward/backward (surge), up/down (heave), left/right (sway) translation in three perpendicular axes, combined with changes in orientation through rotation about three perpendicular axes, often termed yaw, pitch, and roll.

A positioning system is a system for determining the position of an object in space. One of the most well-known and commonly used positioning systems is the Global Positioning System (GPS).

Indoor positioning system

An indoor positioning system (IPS) is a network of devices used to locate people or objects where GPS and other satellite technologies lack precision or fail entirely, such as inside multistory buildings, airports, alleys, parking garages, and underground locations.

Motion controller

In computing, a motion controller is a type of input device that uses accelerometers, gyroscopes, cameras, or other sensors to track motion.

Virtuality (product)

Virtuality was a range of virtual reality machines produced by Virtuality Group, and found in video arcades in the early 1990s. The machines delivered real-time VR gaming via a stereoscopic VR headset, joysticks, tracking devices and networked units for a multi-player experience.

In computing, 3D interaction is a form of human-machine interaction where users are able to move and perform interaction in 3D space. Both human and machine process information where the physical position of elements in the 3D space is relevant.

Input device

In computing, an input device is a piece of equipment used to provide data and control signals to an information processing system, such as a computer or information appliance. Examples of input devices include keyboards, computer mice, scanners, cameras, joysticks, and microphones.

Finger tracking

In the field of gesture recognition and image processing, finger tracking is a high-resolution technique developed in 1969 that is employed to determine the consecutive positions of the user's fingers and hence represent objects in 3D. The finger tracking technique is also used as a computer input tool, acting as an external device similar to a keyboard and a mouse.

Oculus Rift

Oculus Rift is a discontinued line of virtual reality headsets developed and manufactured by Oculus VR, a virtual reality company founded by Palmer Luckey that is widely credited with reviving the virtual reality industry. It was the first virtual reality headset to provide a realistic experience at an accessible price, utilizing novel technology to increase quality and reduce cost by orders of magnitude compared to earlier systems. The first headset in the line was the Oculus Rift DK1, released on March 28, 2013. The last was the Oculus Rift S, discontinued in April 2021.

Oculus Touch is a line of motion controller systems used by Meta Platforms virtual reality headsets. The controller was first introduced in 2016 as a standalone accessory for the Oculus Rift CV1, and began to be bundled with the headset and all future Oculus products beginning in July 2017. Since their original release, Touch controllers have undergone revisions for later generations of Oculus/Meta hardware, including a switch to inside-out tracking, and other design changes.

Virtual reality headset

A virtual reality headset is a head-mounted device that uses 3D near-eye displays and positional tracking to provide a virtual reality environment for the user. VR headsets are widely used with VR video games, but they are also used in other applications, including simulators and trainers. VR headsets typically include a stereoscopic display, stereo sound, and sensors like accelerometers and gyroscopes for tracking the pose of the user's head to match the orientation of the virtual camera with the user's eye positions in the real world.

Virtual reality game

A virtual reality game, or VR game, is a video game played on virtual reality (VR) hardware. Most VR games are based on player immersion, typically through a head-mounted display unit or headset with stereoscopic displays and one or more controllers.

Oculus Rift CV1

Oculus Rift CV1, also known simply as Oculus Rift, is a virtual reality headset developed by Oculus VR, a subsidiary of Meta Platforms, known at the time as Facebook Inc. It was announced in January 2016, and released in March the same year. The device constituted the first commercial release in the Oculus Rift lineup.

References

  1. "What is a 3 DoF vs 6 DoF in VR?".
  2. Aukstakalnis, Steve. Practical Augmented Reality: A Guide to the Technologies, Applications, and Human Factors for AR and VR. Boston. ISBN 978-0-13-409429-8. OCLC 958300989.
  3. Emura, Satoru; Tachi, Susumu (August 1998). "Multisensor Integrated Prediction for Virtual Reality". Presence: Teleoperators and Virtual Environments. 7 (4): 410–422. doi:10.1162/105474698565811. ISSN 1054-7460. S2CID 34491936.
  4. https://indotraq.com/?page_id=1949
  5. Road to VR (2014-06-02). "Overview of Positional Tracking Technologies for Virtual Reality". Road to VR. Retrieved 2020-11-06.
  6. "How Oculus squeezed sophisticated tracking into pipsqueak hardware". TechCrunch. 22 August 2019. Retrieved 2020-11-06.
  7. "Oculus App Store Will Require Pre-Approvals, Comfort Ratings, Tax". TechCrunch. 12 June 2015. Retrieved 2020-11-06.
  8. Pustka, D.; Hülß, J.; Willneff, J.; Pankratz, F.; Huber, M.; Klinker, G. (November 2012). "Optical outside-in tracking using unmodified mobile phones". 2012 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). pp. 81–89. doi:10.1109/ISMAR.2012.6402542. ISBN 978-1-4673-4662-7. S2CID 18349919.
  9. "Inside-out v Outside-in: How VR tracking works, and how it's going to change". Wareable. 2017-05-03. Retrieved 2020-11-06.
  10. Dempsey, P. (2016-08-01). "The Teardown: HTC Vive virtual reality headset". Engineering & Technology. 11 (7): 80–81. doi:10.1049/et.2016.0731. ISSN 1750-9637.
  11. Niehorster, Diederick C.; Li, Li; Lappe, Markus (June 2017). "The Accuracy and Precision of Position and Orientation Tracking in the HTC Vive Virtual Reality System for Scientific Research". i-Perception. 8 (3): 204166951770820. doi:10.1177/2041669517708205. ISSN 2041-6695. PMC 5439658. PMID 28567271.
  12. Chen, Liyan; Peng, Xiaoyuan; Yao, Junfeng; Qiguan, Hong; Chen, Chen; Ma, Yihan (August 2016). "Research on the augmented reality system without identification markers for home exhibition". 2016 11th International Conference on Computer Science & Education (ICCSE). Nagoya, Japan: IEEE. pp. 524–528. doi:10.1109/ICCSE.2016.7581635. ISBN 978-1-5090-2218-2. S2CID 17281382.
  13. Rasmussen, Loki; Basinger, Jay; Milanova, Mariofanna (March 2019). "Networking Consumer Systems to Provide a Development Environment for Inside-Out Marker-Less Tracking for Virtual Reality Headsets". 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). Osaka, Japan: IEEE. pp. 1132–1133. doi:10.1109/VR.2019.8798349. ISBN 978-1-7281-1377-7. S2CID 201066258.
  14. hferrone. "How inside-out tracking works - Enthusiast Guide". docs.microsoft.com. Retrieved 2020-11-06.
  15. "The Optimal Number of Axes for Motion Sensors". CEVA's Experts Blog. 5 February 2019. Retrieved 8 September 2022.
  16. Bleser, Gabriele; Stricker, Didier (February 2009). "Advanced tracking through efficient image processing and visual–inertial sensor fusion". Computers & Graphics. 33 (1): 59–72. doi:10.1016/j.cag.2008.11.004. S2CID 5645304.
  17. Bleser, Gabriele; Stricker, Didier (February 2009). "Advanced tracking through efficient image processing and visual–inertial sensor fusion". Computers & Graphics. 33 (1): 59–72. doi:10.1016/j.cag.2008.11.004. S2CID 5645304.
  18. "How virtual reality positional tracking works". VentureBeat. 2019-05-05. Retrieved 2020-11-06.
  19. Emura, Satoru; Tachi, Susumu (August 1998). "Multisensor Integrated Prediction for Virtual Reality". Presence: Teleoperators and Virtual Environments. 7 (4): 410–422. doi:10.1162/105474698565811. ISSN 1054-7460. S2CID 34491936.
  20. "A quick guide to Degrees of Freedom in Virtual Reality". Kei Studios. 2018-02-12. Retrieved 2020-11-06.
  21. Hogue, A.; Jenkin, M. R.; Allison, R. S. (May 2004). "An optical-inertial tracking system for fully-enclosed VR displays". First Canadian Conference on Computer and Robot Vision, 2004. Proceedings. pp. 22–29. doi:10.1109/CCCRV.2004.1301417. ISBN 0-7695-2127-4. S2CID 1010865.
  22. Atrsaei, Arash; Salarieh, Hassan; Alasty, Aria; Abediny, Mohammad (May 2018). "Human Arm Motion Tracking by Inertial/Magnetic Sensors Using Unscented Kalman Filter and Relative Motion Constraint". Journal of Intelligent & Robotic Systems. 90 (1–2): 161–170. doi:10.1007/s10846-017-0645-z. ISSN 0921-0296. S2CID 3887896.
  23. Jones, Gareth (July 2005). "Echolocation". Current Biology. 15 (13): R484–R488. doi:10.1016/j.cub.2005.06.051. ISSN 0960-9822. PMID 16005275.
  24. Mihelj, Matjaž; Novak, Domen; Beguš, Samo (2014). Virtual Reality Technology and Applications. Intelligent Systems, Control and Automation: Science and Engineering. 68. doi:10.1007/978-94-007-6910-6. ISBN 978-94-007-6909-0. ISSN 2213-8986.
  25. T. Mazuryk, Virtual Reality History, Applications, Technology and Future. Vienna, Austria: Vienna University of Technology, 1996.
  26. R. Holloway and A. Lastra, "Virtual Environments: A Survey of the Technology," cs.unc.edu. Available: http://www.cs.unc.edu/techreports/93-033.pdf.
