FreeTrack

Stable release: v2.2 / October 7, 2008
Operating system: Microsoft Windows
Type: Optical motion tracking
License: GNU General Public License
Website: www.free-track.net

FreeTrack is a general-purpose optical motion tracking application for Microsoft Windows, released under the GNU General Public License, that can be used with common inexpensive cameras. Its primary focus is head tracking, with uses in virtual reality, simulation, video games, 3D modeling, computer-aided design and general hands-free computing to improve computer accessibility. Tracking can be made sensitive enough that only small head movements are required, so the user's eyes never leave the screen.

A camera is positioned to observe a rigid point model worn by the user, the points of which need to be isolated from background light by means of physical and software filtering. Motion is tracked with up to six degrees of freedom (6DOF): yaw, pitch, roll, left/right, up/down and forward/back. Windows-compatible video devices like webcams are supported, as well as the Nintendo Wii Remote camera, the iPhone TrueDepth camera (via Eyeware Beam), and NaturalPoint cameras (TrackIR, SmartNav and OptiTrack).

FreeTrack can output head-tracking data to programs directly using its own open interface, as well as TrackIR, SimConnect and FSUIPC interfaces. Programs that support these interfaces are regarded as being FreeTrack-compatible. FreeTrack can also emulate mouse, keyboard, and joystick (via PPJoy) if a program does not support a direct interface.

FreeTrack is coded in Delphi 7 and uses DirectShow and DirectX. Head tracking is achieved using implementations of DeMenthon's four-point iterative pose estimation algorithm (POSIT) [1] and Alter's three-point geometric algorithm [2].

Software

FreeTrack uses a camera to collect real-time information on the point model worn by the user, specifically the image coordinates of the model points, which are either received directly from the camera or extracted from a video stream. These coordinates are used to generate an estimate of the real head pose, which the user can transform in a number of ways to create a virtual pose. One of the most fundamental transformations is amplifying rotation so that only small head movements are required. Finally, the virtual pose is sent to the user's choice of outputs. All of this is done in the background, with tracking status displayed in the system tray.

A 3D preview is available that shows the virtual head position and orientation for a given real head pose and can be viewed from multiple perspectives, including first-person. This greatly assists with testing and makes it easier to experiment with different settings.

Each degree of freedom (axis) has a response curve that can be modified to change the way the virtual head moves for a given real head movement. This is commonly used to create a central deadzone region so that the user’s head can be more relaxed there.
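As a rough illustration of such a response curve, the following sketch maps a real head angle to a virtual one using a central deadzone followed by linear amplification. The function name, the parameters and their default values are assumptions chosen for the example; FreeTrack's own curves are user-editable and are not necessarily linear.

```python
def response_curve(real_deg, deadzone_deg=2.0, gain=4.0, max_virtual_deg=180.0):
    """Map a real head angle (degrees) to a virtual angle.

    Hypothetical curve: inside the deadzone the output is zero, outside it
    the remaining movement is amplified by `gain` and clamped.
    """
    magnitude = abs(real_deg)
    if magnitude <= deadzone_deg:
        # Small movements around the centre are ignored.
        return 0.0
    # Amplify only the part of the movement beyond the deadzone edge.
    virtual = min((magnitude - deadzone_deg) * gain, max_virtual_deg)
    # Restore the original direction of the movement.
    return virtual if real_deg > 0 else -virtual


if __name__ == "__main__":
    for angle in (-12.0, -1.0, 0.0, 1.5, 12.0, 100.0):
        print(angle, "->", response_curve(angle))
```

With these example values, a 12° real rotation becomes a 40° virtual rotation, while anything within 2° of centre leaves the view still.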

Keyboard, mouse and joystick buttons can be used to control tracking settings, such as setting the virtual centre location (like adjusting the seat position in a car) and toggling individual axes and outputs.

For NaturalPoint cameras, FreeTrack can provide advanced features and a level of customization that is not available with official software.

Camera

Comparison of some cameras compatible with FreeTrack
| Camera | Sensor resolution | FPS | Sensor | Angle (°) | Output | CPU usage | Subpixel precision | IR LEDs | Approx. price (USD) |
|---|---|---|---|---|---|---|---|---|---|
| Ideal webcam | 640×480 | ≥60 | monochrome | 42 | highly compressed | small | Software-dependent | Yes | ? |
| OEM IR webcam [3] | 640×480 | 30 | color | 42 | JPEG compressed | small | Software-dependent | Yes | $5 |
| Sony PlayStation EyeToy | 640×480 | 30 | color | 56 | JPEG compressed [4] | small | Software-dependent | No | $16 |
| Sony PlayStation 3 Eye | 640×480 | 187@320×240 (CL Eye), 125@320×240 (DirectShow), 75@640×480 (DirectShow) [5] | color | 75, 56 | JPEG compressed, raw | small | Software-dependent | No | $24 |
| Microsoft Xbox Live Vision | 640×480 | 60@320×240, 30@640×480 | color | ? | JPEG compressed, [6] raw | small | Software-dependent | No | $14 |
| Nintendo Wii Remote | 128×96 | 100 (Bluetooth), 250 (I2C) [7] | color | 41 | point coordinates | none | 1/8 | No | $23 |
| NaturalPoint TrackIR 1 [8] | 60k pixels (e.g. 300×200) | 60 | monochrome | 33 | binary threshold | minimal | | Yes | retired |
| NaturalPoint TrackIR 2 [8] | 60k pixels (e.g. 300×200) | 100 | monochrome | 33 | binary threshold [9] | minimal | | Yes | retired |
| NaturalPoint TrackIR 3 [10] | 355×288 | 80 | monochrome | 33 | binary threshold | minimal | | Yes | retired |
| NaturalPoint TrackIR 3 Pro [10] | 355×288 | 120 | monochrome | 33 | binary threshold | minimal | | Yes | retired |
| NaturalPoint TrackIR 4 Pro [10] | 355×288 (subsampled at 710×480) | 120 | monochrome | 46 | binary threshold [11] | minimal | 1/20th | Yes | $99.95 |
| NaturalPoint TrackIR 5 [10] | 640×480 | 120 | monochrome | 51.7 | grayscale threshold [12] | minimal | 1/150th | Yes | $149.95 |
| NaturalPoint SmartNav 1/2 | 60k pixels (e.g. 300×200) | 60 | monochrome | 33 | binary threshold | minimal | | Yes | retired |
| NaturalPoint SmartNav 3 [13] | 355×288 | 120 | monochrome | 33 | binary threshold | minimal | 1/20th | Yes | retired |
| NaturalPoint SmartNav 4 [13] | 640×480 (subsampled at 1280×480) | 100 | monochrome | 41 | grayscale threshold [14] | minimal | 1/150th | Yes | $400 to $500 |

Resolution

In most cases a resolution of 320×240 is sufficient, because point positions can be resolved to a much finer sub-pixel resolution, enough to allow accurate cursor control on a high-resolution monitor. Resolutions of 640×480 and above have diminishing returns and correspond to a sharp increase in CPU usage when the video is not sufficiently compressed before reaching the computer, since the number of pixels grows with the square of the linear resolution. Higher resolutions become more important at greater distances from the camera. The Wii Remote uses a low-resolution 128×96 sensor, which some users find produces jittery tracking and may require smoothing to improve stability at the cost of decreased responsiveness. [15]
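One common way to obtain sub-pixel positions from a low-resolution image is an intensity-weighted centroid. The sketch below illustrates the idea for a single bright point; it is an assumption about how sub-pixel precision can be achieved in general, not a description of FreeTrack's actual implementation, and the function name and sample frame are made up for the example.

```python
def weighted_centroid(frame, threshold):
    """Return the intensity-weighted centroid (x, y) of all pixels brighter
    than `threshold`, or None if no pixel qualifies.

    `frame` is a list of rows of grayscale values. This hypothetical helper
    assumes the frame contains only one tracking point.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value > threshold:
                # Brighter pixels pull the centroid towards themselves.
                weight = value - threshold
                total += weight
                sum_x += weight * x
                sum_y += weight * y
    if total == 0:
        return None
    return (sum_x / total, sum_y / total)


# A 4×4 sample frame with a blob that is brighter on its right-hand side.
frame = [
    [0, 0, 0, 0],
    [0, 100, 200, 0],
    [0, 100, 200, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print(weighted_centroid(frame, 50))  # (1.75, 1.5)
```

The blob covers whole pixels at x = 1 and x = 2, yet the estimated position is x = 1.75, which shows how brightness information yields a position finer than the pixel grid.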

Sensor

For the same resolution, monochrome sensors can resolve finer details much better than color sensors due to the lack of a color filter array.

Frame rate

FreeTrack uses interpolation with low-frame-rate video devices to improve panning smoothness. However, responsiveness is fundamentally limited by the frame rate: a 30 frame/s webcam has a maximal response delay of 33.3 milliseconds, compared with 8.33 milliseconds for a 120 frame/s camera. To put this into perspective, a human's reaction time to a visual stimulus (finger reflex) is typically around 200 ms, 30 ms can be regarded as a competitive ping in online reflex-based games, and a typical 60 Hz LCD monitor refreshes about every 17 ms.
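The delay figures above follow directly from the frame interval, 1000 ms divided by the frame rate. The sketch below reproduces that arithmetic and shows linear interpolation between two camera samples as one simple way to smooth panning; the function names are illustrative, and the source does not specify which interpolation scheme FreeTrack uses.

```python
def frame_interval_ms(fps):
    """Time between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


def interpolate(prev_value, next_value, fraction):
    """Linearly interpolate between two samples.

    `fraction` runs from 0.0 (at the previous sample) to 1.0 (at the next
    sample). Linear interpolation is an assumption made for this example.
    """
    return prev_value + (next_value - prev_value) * fraction


if __name__ == "__main__":
    for fps in (30, 60, 120):
        print(fps, "frame/s ->", round(frame_interval_ms(fps), 2), "ms")
    # Yaw a quarter of the way between two camera samples.
    print(interpolate(10.0, 20.0, 0.25))
```

Interpolation fills in intermediate values for display, but it cannot deliver new information sooner than the next camera frame arrives.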

Higher responsiveness gives a greater feeling of control, but since virtual head motion is amplified, it can also cause it to move unrealistically fast. For this reason, some programs limit head movement speed, wasting some of the responsiveness of higher-frame-rate cameras.
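A speed limit of this kind can be pictured as clamping how far the virtual angle may change per update. The following is a minimal sketch under that assumption; the function name and the example limit are hypothetical and do not describe any particular program.

```python
def limit_speed(previous_deg, target_deg, max_speed_deg_per_s, dt_s):
    """Move from `previous_deg` towards `target_deg`, but by no more than
    `max_speed_deg_per_s * dt_s` degrees in this update."""
    max_step = max_speed_deg_per_s * dt_s
    step = target_deg - previous_deg
    # Clamp the step to the allowed range in either direction.
    if step > max_step:
        step = max_step
    elif step < -max_step:
        step = -max_step
    return previous_deg + step


if __name__ == "__main__":
    # With a 10°/s limit and a 0.5 s update, a 10° jump is cut to 5°.
    print(limit_speed(0.0, 10.0, 10.0, 0.5))
```

Once the limit is reached, a faster camera no longer makes the view move sooner, which is why part of its responsiveness goes unused.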

Angle

A wider viewing angle allows a larger tracking region when in close proximity to the camera. At greater distances a wide angle is not desirable, because more of the frame is unused and the effective resolution drops more rapidly. More peripheral light can also be seen, which can interfere with tracking. The viewing angle can be reduced by using digital zoom, at the cost of resolution.
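The trade-off can be estimated with basic geometry: at distance d, a camera with viewing angle θ covers a width of 2·d·tan(θ/2). The sketch below assumes that the angle is the horizontal field of view and that the lens is undistorted; the source does not say how the angles in the comparison table are measured, and the function names are illustrative.

```python
import math


def view_width_cm(angle_deg, distance_cm):
    """Width of the area seen by the camera at a given distance, assuming
    `angle_deg` is the full horizontal viewing angle."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def pixels_per_cm(horizontal_pixels, angle_deg, distance_cm):
    """Effective resolution at the user's distance from the camera."""
    return horizontal_pixels / view_width_cm(angle_deg, distance_cm)


if __name__ == "__main__":
    for angle in (42, 56, 75):
        width = view_width_cm(angle, 60)
        density = pixels_per_cm(640, angle, 60)
        print(f"{angle}° at 60 cm: {width:.1f} cm wide, {density:.1f} px/cm")
```

For a fixed sensor width, the pixel density at the user's position falls as the angle widens and halves each time the distance doubles.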

CPU usage

The Nintendo Wii Remote effectively uses no CPU, NaturalPoint cameras use a small amount, and general video devices can use a significant amount, depending on the brand and the specific camera settings in use. A PlayStation Eye running at the same resolution and frame rate as a TrackIR 4 would be very demanding on a single-core CPU. However, modern multi-core CPUs are making this less of an issue. Resolution and frame rate can always be reduced to conserve CPU resources.

Filters

FreeTrack requires the tracking points to be isolated from all other light; this is best done using infrared LEDs and a visible-light blocking filter in front of the camera. Photographic film or the magnetic storage medium inside floppy disks can be used as inexpensive visible-light filters. Further filtering can be done in software by adjusting exposure and threshold.

Video devices such as webcams generally have a built-in infrared-blocking filter, which can be removed to improve sensitivity to infrared light, allowing better point isolation and the possibility of retroreflective tracking. For most webcams this is a straightforward and reversible procedure.

Wii Remotes and NaturalPoint cameras are designed for infrared point tracking, so they already have visible-light-blocking filters.

Point model

Model configurations

Models can be made in a DIY fashion at minimal expense using readily available electronic components. Component kits and fully constructed models are also available for purchase from some members of the FreeTrack community.

Active points

An active point model uses visible or infrared LEDs (5 mm or larger) to represent the tracking points, powered by battery, transformer (plug pack) or USB. The electric circuit is very basic and can be made by someone with little or no experience with electronics.

Common LEDs, like those found in remote controls, have a narrow, highly focused beam which is not suitable for optical motion tracking. They can be easily turned into wide angle LEDs by filing their lens tips down flat. Alternatively, wide angle LEDs can be purchased from specialist electronics retailers, like the infrared Siemens/Osram SFH485P, with a half-angle of 40 degrees.

Reflective points

Retroreflective material can be used to represent the tracking points by illumination with an infrared light source. This configuration doesn’t require wires or batteries connected to the user but is more susceptible to interference by background light. In most cases a webcam’s internal infrared blocking filter needs to be removed to increase sensitivity enough that the infrared light reflected by the tracking points can be seen.

FreeTrack interface

FreeTrack has a simple interface that can be freely used by third party programs such as Eyeware Beam to access 6DOF tracking data, both real raw measurements and virtual. It is hardware agnostic, so is not dependent on a specific brand or version of hardware and can be used without restriction. Bohemia Interactive's ARMA 2 is the first game to support the FreeTrack interface [16] and GP Bikes is the first to have exclusive support. [17]

TrackIR interface

FreeTrack is compatible with the unencrypted version of NaturalPoint's TrackIR head-tracking interface, which has widespread support in simulation games. NaturalPoint have been supplying game developers with an encrypted version of the interface for more popular titles since late 2008. These titles can be identified as requiring TrackIR software version 4.1.036 or higher and are incompatible with FreeTrack. [18] The developers of the first game affected, DCS: Black Shark, [19] tried to release their own head tracking interface but soon after canceled it at NaturalPoint's request. [20] FreeTrack compatibility is still possible using TrackIRFixer to remove the encryption requirement in games. [21]

TIRViews.dll is a dynamic-link library file distributed with TrackIR software that provides tailored support for a small number of mostly older games, using special interfaces or memory hacks to facilitate view control. [22] Though a violation of the TrackIR software's EULA, [23] it is possible to use it with FreeTrack.

NaturalPoint's TrackIR interface SDK is only available under a signed license agreement [24] and is covered by an NDA, so while FreeTrack is free software, the TrackIR interface component is required to be closed source. [25]

References

  1. DeMenthon, Daniel; Larry S. Davis (1992). "Model-Based Object Pose in 25 Lines of Code". European Conference on Computer Vision. 15: 335–343. CiteSeerX 10.1.1.50.9280.
  2. Alter, T. D. (1992). "3D Pose from Three Corresponding Points Under Weak-Perspective Projection". A.I. Memo. 1378 (AIM–1378): 43. CiteSeerX 10.1.1.18.1908.
  3. "8.0 Mega 6 IR LED Webcam Web Cam Camera Skype MSN Mic". Retrieved 2010-09-07.
  4. "Using ov519 webcams (Eyetoy) with pdp/Gem (jpeg frames)". Retrieved 2010-05-08.
  5. "CL Eye Platform SDK Changelog". Retrieved 2010-10-30.
  6. "XBOX Live Vision Camera in Ubuntu". Retrieved 2010-05-08.
  7. "Automatic Take Off, Hovering and Landing Control for Miniature Helicopters with Low-Cost Onboard Hardware" (PDF). Retrieved 2010-05-08.
  8. "TrackIR3 Pro heads-up game controller". Ars Technica. 25 August 2004. Retrieved 2007-10-13.
  9. "TrackIR2, Track IR2 headtracking buy, review, featured". Retrieved 2010-05-08.
  10. "TrackIR Product Comparison". NaturalPoint. Retrieved 2007-10-13.
  11. "TrackIR 4 Grayscale". 11 June 2009. Retrieved 2010-05-08.
  12. "TrackIR 5 Grayscale". Retrieved 2010-05-08.
  13. "SmartNav Older Model Comparison". NaturalPoint. Retrieved 2008-11-01.
  14. "SmartNav 4 Grayscale". Retrieved 2010-05-08.
  15. "Wii resolution and latency". Retrieved 2010-12-07.
  16. "Arma 2: Patch v1.05". Retrieved 2010-07-20.
  17. "PiBoSo Alpha 6 released". Retrieved 2010-03-16.
  18. "NaturalPointofView - The NaturalPoint TrackIR Monopoly". Retrieved 2010-07-20.
  19. "TrackIR Enhanced Games : DCS: Black Shark". NaturalPoint. Retrieved 2008-10-26.
  20. Tez - ED Team (14 November 2008). "HeadTracker interface - ED Forums". Eagle Dynamics. Retrieved 2010-03-16.
  21. "NaturalPointofView - The NaturalPoint TrackIR Monopoly: TrackIRFixer". Retrieved 2010-07-20.
  22. "FreeTrack Forum V2.2 & FSX/FS9". Retrieved 2010-02-20.
  23. "TrackIR software download page". NaturalPoint. Retrieved 2010-02-20.
  24. "TrackIR Developers : Which SDK Do I Need?". Retrieved 2010-02-20.
  25. "Head banging...". Archived from the original on 2011-06-05. Retrieved 2010-02-20.