Stable release | v2.2 / October 7, 2008 |
---|---|
Operating system | Microsoft Windows |
Type | Optical motion tracking |
License | GNU General Public License |
Website | www.free-track.net |
FreeTrack is a general-purpose optical motion tracking application for Microsoft Windows, released under the GNU General Public License, that can be used with common inexpensive cameras. Its primary focus is head tracking with uses in virtual reality, simulation, video games, 3D modeling, computer aided design and general hands-free computing to improve computer accessibility. Tracking can be made sensitive enough that only small head movements are required so that the user's eyes never leave the screen.
A camera is positioned to observe a rigid point model worn by the user, the points of which need to be isolated from background light by means of physical and software filtering. Motion is tracked with up to six degrees of freedom (6DOF): yaw, pitch, roll, left/right, up/down and forward/back. Windows-compatible video devices such as webcams are supported, as are the Nintendo Wii Remote camera, the iPhone TrueDepth camera (via Eyeware Beam), and NaturalPoint cameras (TrackIR, SmartNav and OptiTrack).
FreeTrack can output head-tracking data to programs directly using its own open interface, as well as TrackIR, SimConnect and FSUIPC interfaces. Programs that support these interfaces are regarded as being FreeTrack-compatible. FreeTrack can also emulate mouse, keyboard, and joystick (via PPJoy) if a program does not support a direct interface.
FreeTrack is coded in Delphi 7 and uses DirectShow and DirectX. Head tracking is achieved using implementations of DeMenthon's four-point iterative pose estimation algorithm (POSIT) [1] and Alter's three point geometric algorithm. [2]
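As an illustration of the four-point iterative approach, the following is a minimal Python sketch of the POSIT iteration for exactly four non-coplanar model points. It is not FreeTrack's Delphi code. The function names, the fixed iteration count, and the assumptions that image coordinates are measured relative to the image centre and that the focal length is known in pixels are all choices made for this example.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def solve3(rows, rhs):
    """Solve rows @ v = rhs for a 3x3 system via the cross-product inverse."""
    a, b, c = rows
    det = dot(a, cross(b, c))  # zero if the model points are coplanar
    bc, ca, ab = cross(b, c), cross(c, a), cross(a, b)
    return tuple((rhs[0] * bc[n] + rhs[1] * ca[n] + rhs[2] * ab[n]) / det
                 for n in range(3))

def posit(model, image, focal, iterations=30):
    """Estimate pose from 4 non-coplanar model points and their image points.

    model: four (x, y, z) points; model[0] is the reference point.
    image: four (x, y) points relative to the image centre (assumption).
    focal: focal length in pixels (assumption: known in advance).
    Returns (rotation rows (i, j, k), translation of model[0]).
    """
    m0 = model[0]
    rows = [tuple(p[n] - m0[n] for n in range(3)) for p in model[1:]]
    x0, y0 = image[0]
    eps = [0.0, 0.0, 0.0]  # start from the scaled-orthographic approximation
    for _ in range(iterations):
        xs = [image[n + 1][0] * (1.0 + eps[n]) - x0 for n in range(3)]
        ys = [image[n + 1][1] * (1.0 + eps[n]) - y0 for n in range(3)]
        I = solve3(rows, xs)
        J = solve3(rows, ys)
        s1, s2 = math.sqrt(dot(I, I)), math.sqrt(dot(J, J))
        i = tuple(v / s1 for v in I)
        j = tuple(v / s2 for v in J)
        k = cross(i, j)
        z0 = focal / math.sqrt(s1 * s2)  # depth of the reference point
        eps = [dot(r, k) / z0 for r in rows]  # perspective correction terms
    return (i, j, k), (x0 * z0 / focal, y0 * z0 / focal, z0)
```

The sketch assumes noise-free point correspondences in a known order; a practical tracker would also need to match image points to model points and handle missing points.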
FreeTrack uses a camera to collect real-time information on the point model worn by the user, specifically the image coordinates of the model points, which are either received directly from the camera or extracted from a video stream. These coordinates are used to generate an estimate of the real head pose, which can be transformed by the user in a number of ways to create a virtual pose. One of the most fundamental transformations involves amplifying rotation so that only small head movements are required. Finally, the virtual pose is sent to the user's choice of outputs. This is all done in the background, with tracking status displayed in the system tray.
A 3D preview is available that shows the virtual head position and orientation for a given real head pose and can be viewed from multiple perspectives, including first-person. This greatly assists with testing and makes it easier to experiment with different settings.
Each degree of freedom (axis) has a response curve that can be modified to change the way the virtual head moves for a given real head movement. This is commonly used to create a central deadzone region so that the user’s head can be more relaxed there.
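A response curve with a central deadzone can be sketched as a simple piecewise-linear function. The shape, the parameter names (`deadzone`, `gain`, `limit`) and their default values are assumptions for illustration; FreeTrack's curves are user-editable and need not be linear.

```python
import math

def response_curve(real, deadzone=2.0, gain=4.0, limit=180.0):
    """Map a real head angle (degrees) to a virtual angle (degrees).

    Inside the deadzone the virtual head does not move; outside it the
    movement is amplified by `gain` and clamped to +/- `limit`.
    All parameter names and defaults are hypothetical.
    """
    magnitude = abs(real)
    if magnitude <= deadzone:
        return 0.0
    virtual = min((magnitude - deadzone) * gain, limit)
    return math.copysign(virtual, real)
```

With these example values, a real yaw of 12 degrees maps to a virtual yaw of 40 degrees, while movements of 2 degrees or less are ignored.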
Keyboard, mouse and joystick buttons can be used to change tracking settings, such as the virtual centre location (like adjusting the seat position in a car), and to toggle individual axes and outputs.
For NaturalPoint cameras, FreeTrack can provide advanced features and a level of customization that is not available with official software.
Camera | Sensor resolution | FPS | Sensor | Angle(°) | Output | CPU usage | Subpixel precision | IR LEDs | Approx. price (USD) |
---|---|---|---|---|---|---|---|---|---|
Ideal webcam | 640×480 | ≥60 | monochrome | 42 | highly compressed | small | Software-dependent | Yes | ? |
OEM IR webcam [3] | 640×480 | 30 | color | 42 | JPEG compressed | small | Software-dependent | Yes | $5 |
Sony PlayStation EyeToy | 640×480 | 30 | color | 56 | JPEG compressed [4] | small | Software-dependent | No | $16 |
Sony PlayStation 3 Eye | 640×480 | 187@320×240 (CL Eye), 125@320×240 (DirectShow), 75@640×480 (DirectShow) [5] | color | 75, 56 | JPEG compressed, raw | small | Software-dependent | No | $24 |
Microsoft Xbox Live Vision | 640×480 | 60@320×240, 30@640×480 | color | ? | JPEG compressed, [6] raw | small | Software-dependent | No | $14 |
Nintendo Wii Remote | 128×96 | 100(Bluetooth), 250(I2C) [7] | color | 41 | point coordinates | none | 1/8 | No | $23 |
NaturalPoint TrackIR 1 [8] | 60k pixels (e.g. 300×200) | 60 | monochrome | 33 | binary threshold | minimal | | Yes | retired |
NaturalPoint TrackIR 2 [8] | 60k pixels (e.g. 300×200) | 100 | monochrome | 33 | binary threshold [9] | minimal | | Yes | retired |
NaturalPoint TrackIR 3 [10] | 355×288 | 80 | monochrome | 33 | binary threshold | minimal | | Yes | retired |
NaturalPoint TrackIR 3 Pro [10] | 355×288 | 120 | monochrome | 33 | binary threshold | minimal | | Yes | retired |
NaturalPoint TrackIR 4 Pro [10] | 355×288 (subsampled at 710×480) | 120 | monochrome | 46 | binary threshold [11] | minimal | 1/20th | Yes | $99.95 |
NaturalPoint TrackIR 5 [10] | 640×480 | 120 | monochrome | 51.7 | grayscale threshold [12] | minimal | 1/150th | Yes | $149.95 |
NaturalPoint SmartNav 1/2 | 60k pixels (e.g. 300×200) | 60 | monochrome | 33 | binary threshold | minimal | | Yes | retired |
NaturalPoint SmartNav 3 [13] | 355×288 | 120 | monochrome | 33 | binary threshold | minimal | 1/20th | Yes | retired |
NaturalPoint SmartNav 4 [13] | 640×480 (subsampled at 1280×480) | 100 | monochrome | 41 | grayscale threshold [14] | minimal | 1/150th | Yes | $400 to $500 |
In most cases a resolution of 320×240 is sufficient, because it is capable of producing a much higher sub-pixel resolution, enough to allow accurate cursor control on a high-resolution monitor. Resolutions of 640×480 and above have diminishing returns and correspond to a steep increase in CPU usage when the video is not sufficiently compressed before reaching the computer. Higher resolutions become more important at greater distances from the camera. The Wii Remote uses a low-resolution 128×96 sensor, which some users find produces jittery tracking and which may require smoothing to improve stability at the cost of decreased responsiveness. [15]
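One common way to obtain sub-pixel precision from a low-resolution image is to take the intensity-weighted centroid of each bright point. The sketch below shows the idea for a single point. It is an assumption that this resembles what any particular tracking software does, and the function name and `threshold` default are hypothetical.

```python
def weighted_centroid(pixels, threshold=128):
    """Return the (x, y) intensity-weighted centroid of bright pixels.

    pixels: 2D list of grayscale intensities (rows of columns).
    threshold: pixels below this value are treated as background.
    Returns None if no pixel reaches the threshold.
    """
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(pixels):
        for x, value in enumerate(row):
            if value >= threshold:
                total += value
                sum_x += x * value
                sum_y += y * value
    if total == 0:
        return None
    return sum_x / total, sum_y / total
```

Because the result is a weighted average over several pixels, the estimated position can fall between pixel centres, which is why a 320×240 image can still give fine-grained position estimates.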
For the same resolution, monochrome sensors can resolve finer details much better than color sensors due to the lack of a color filter array.
FreeTrack uses interpolation with low-frame-rate video devices to improve panning smoothness. However, responsiveness is fundamentally limited by the frame rate; a 30 frame/s webcam has a maximal response delay of 33.3 milliseconds compared with 8.33 milliseconds for a 120 frame/s camera. To put this into perspective, a human’s reaction time to visual stimulus (finger reflex) is typically around 200 ms; 30 ms can be regarded as a competitive ping in online reflex-based games, and an LCD monitor typically refreshes about every 17 ms (60 Hz).
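The delay figures above follow directly from the frame interval, and interpolation fills in poses between camera frames. The following sketch shows both calculations. Linear interpolation is an assumption here, since the text does not say which method FreeTrack uses, and the function names are hypothetical.

```python
def frame_interval_ms(fps):
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps

def interpolate(previous, current, t):
    """Linearly interpolate one pose value between two camera frames.

    t is the fraction of the frame interval elapsed (0.0 to 1.0).
    """
    return previous + (current - previous) * t
```

Interpolation smooths the output between frames but cannot reveal movement sooner than the next frame arrives, which is why the frame interval still bounds responsiveness.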
Higher responsiveness gives a greater feeling of control, but since virtual head motion is amplified, it can also cause it to move unrealistically fast. For this reason, some programs limit head movement speed, wasting some of the responsiveness of higher-frame-rate cameras.
A wider viewing angle allows a larger tracking region when in close proximity to the camera. At greater distances a wide angle is not desirable, because more of the frame is unused and the effective resolution drops more rapidly. More peripheral light can also be seen, which can interfere with tracking. Viewing angle can be reduced by using digital zoom at the cost of resolution.
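The trade-off between viewing angle, distance and effective resolution can be estimated with basic trigonometry. This sketch assumes an ideal pinhole camera and treats the viewing angle as the full horizontal field of view; the function names are hypothetical.

```python
import math

def tracking_width(distance, view_angle_deg):
    """Width of the region visible at a given distance from the camera.

    The result is in the same unit as `distance`.
    """
    return 2.0 * distance * math.tan(math.radians(view_angle_deg) / 2.0)

def width_per_pixel(distance, view_angle_deg, horizontal_pixels):
    """Approximate real-world width covered by one pixel at that distance."""
    return tracking_width(distance, view_angle_deg) / horizontal_pixels
```

For a fixed sensor resolution, the width covered by each pixel grows with both distance and viewing angle, so a wide-angle camera loses effective resolution faster as the user moves away.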
The Nintendo Wii Remote effectively uses no CPU, NaturalPoint cameras use a small amount, and general video devices can use a significant amount, depending on the brand and the specific camera settings in use. A PlayStation Eye running at the same resolution and frame rate as a TrackIR 4 would be very demanding on a single-core CPU. However, modern multi-core CPUs are making this less of an issue. Resolution and frame rate can always be reduced to conserve CPU resources.
FreeTrack requires the tracking points to be isolated from all other light; this is best done using infrared LEDs and a visible-light blocking filter in front of the camera. Photographic film or the magnetic storage medium inside floppy disks can be used as inexpensive visible-light filters. Further filtering can be done in software by adjusting exposure and threshold.
Video devices such as webcams generally have a built-in infrared-blocking filter, which can be removed to improve sensitivity to infrared light, allowing better point isolation and the possibility of retroreflective tracking. For most webcams this is a straightforward and reversible procedure.
Wii Remotes and NaturalPoint cameras are designed for infrared point tracking, so they already have visible-light-blocking filters.
Models can be made in a DIY fashion at minimal expense using readily available electronic components. Component kits and fully constructed models are also available for purchase from some members of the FreeTrack community.
An active point model uses visible or infrared LEDs (5 mm or larger) to represent the tracking points, powered by battery, transformer (plug pack) or USB. The electric circuit is very basic and can be made by someone with little or no experience with electronics.
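A typical basic circuit places the LEDs in series with a current-limiting resistor, sized with Ohm's law. The sketch below shows that calculation. The series layout and all numeric values in the usage note are illustrative assumptions, not values from the FreeTrack documentation; real values should come from the LED's datasheet.

```python
def series_resistor(supply_v, led_forward_v, led_count, current_a):
    """Resistance (ohms) needed to limit current through LEDs in series.

    supply_v: supply voltage, e.g. from USB or a battery pack.
    led_forward_v: forward voltage of one LED (from its datasheet).
    led_count: number of LEDs wired in series.
    current_a: desired LED current in amperes.
    """
    headroom = supply_v - led_forward_v * led_count
    if headroom <= 0:
        raise ValueError("supply voltage too low for this many LEDs in series")
    return headroom / current_a
```

For example, with an assumed 5 V supply, three LEDs at an assumed 1.5 V forward voltage each, and a target current of 50 mA, the function returns about 10 ohms.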
Common LEDs, like those found in remote controls, have a narrow, highly focused beam which is not suitable for optical motion tracking. They can be easily turned into wide angle LEDs by filing their lens tips down flat. Alternatively, wide angle LEDs can be purchased from specialist electronics retailers, like the infrared Siemens/Osram SFH485P, with a half-angle of 40 degrees.
Retroreflective material can be used to represent the tracking points by illumination with an infrared light source. This configuration doesn’t require wires or batteries connected to the user but is more susceptible to interference by background light. In most cases a webcam’s internal infrared blocking filter needs to be removed to increase sensitivity enough that the infrared light reflected by the tracking points can be seen.
FreeTrack has a simple interface that can be freely used by third-party programs such as Eyeware Beam to access 6DOF tracking data, both the raw measurements of the real pose and the virtual pose. It is hardware-agnostic: it does not depend on a specific brand or version of hardware and can be used without restriction. Bohemia Interactive's ARMA 2 is the first game to support the FreeTrack interface [16] and GP Bikes is the first to have exclusive support. [17]
FreeTrack is compatible with the unencrypted version of NaturalPoint's head tracking TrackIR interface that has widespread support in simulation games. NaturalPoint have been supplying game developers with an encrypted version of the interface for more popular titles since late 2008; these titles can be identified as requiring TrackIR software version 4.1.036 or higher and are incompatible with FreeTrack. [18] The developers of the first game affected, DCS: Black Shark, [19] tried to release their own head tracking interface but soon after canceled it at NaturalPoint's request. [20] FreeTrack compatibility is still possible using TrackIRFixer to remove the encryption requirement in games. [21]
TIRViews.dll is a dynamic-link library file distributed with TrackIR software that provides tailored support for a small number of mostly older games, using special interfaces or memory hacks to facilitate view control. [22] Though a violation of the TrackIR software's EULA, [23] it is possible to use it with FreeTrack.
NaturalPoint's TrackIR interface SDK is only available under a signed license agreement [24] and is covered by an NDA, so while FreeTrack is free software, the TrackIR interface component is required to be closed source. [25]