A point cloud is a discrete set of data points in space. The points may represent a 3D shape or object. Each point position is given by a set of Cartesian coordinates (X, Y, Z). [1] [2] Points may carry data beyond position, such as RGB color, [2] normals, [3] and timestamps. [4] Point clouds are generally produced by 3D scanners or by photogrammetry software, which measure many points on the external surfaces of the objects around them. As the output of 3D scanning processes, point clouds serve many purposes, including creating 3D computer-aided design (CAD) or geographic information system (GIS) models of manufactured parts, metrology and quality inspection, and a multitude of visualization, animation, rendering, and mass-customization applications.
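In memory, a point cloud is commonly held as an array of coordinates with optional parallel attribute arrays. A minimal illustrative sketch in Python with NumPy (the names, sizes, and random values are placeholders):

```python
import numpy as np

# A point cloud as an (N, 3) array of Cartesian coordinates,
# with per-point attributes stored in parallel arrays of length N.
points  = np.random.rand(1000, 3)                               # x, y, z positions
colors  = np.random.randint(0, 256, (1000, 3), dtype=np.uint8)  # RGB per point
normals = np.zeros((1000, 3))                                   # surface normals, if estimated
```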
When scanning a real-world scene with LiDAR, the captured point clouds are snippets of the scene, which must be aligned to generate a full map of the scanned environment.
Point clouds are often aligned with 3D models or with other point clouds, a process termed point set registration.
The iterative closest point (ICP) algorithm can be used to align two point clouds that overlap and are related by a rigid transformation. [5] Point clouds related by elastic deformations can be aligned with a non-rigid variant of ICP (NICP). [6] With recent advances in machine learning, point cloud registration can also be performed by end-to-end neural networks. [7]
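A minimal point-to-point ICP sketch in Python, assuming NumPy and SciPy are available; the function name, iteration cap, and convergence test are illustrative choices, not a reference implementation:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, max_iters=50, tol=1e-6):
    """Point-to-point ICP: align `source` (N x 3) to `target` (M x 3).
    Returns the transformed source points and the accumulated R, t."""
    src = source.copy()
    tree = cKDTree(target)            # nearest-neighbor index over the target
    R_acc, t_acc = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(max_iters):
        # 1. Correspondences: closest target point for each source point.
        dist, idx = tree.query(src)
        matched = target[idx]
        # 2. Best rigid transform in closed form (Kabsch / SVD method).
        mu_s, mu_t = src.mean(axis=0), matched.mean(axis=0)
        H = (src - mu_s).T @ (matched - mu_t)
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflection
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = mu_t - R @ mu_s
        # 3. Apply the increment and accumulate the total transform.
        src = src @ R.T + t
        R_acc, t_acc = R @ R_acc, R @ t_acc + t
        err = dist.mean()
        if abs(prev_err - err) < tol:               # converged
            break
        prev_err = err
    return src, R_acc, t_acc
```

Each iteration pairs every source point with its nearest target point, solves for the best rigid transform in closed form, applies it, and repeats until the mean correspondence distance stops improving.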
For industrial metrology or inspection using industrial computed tomography, the point cloud of a manufactured part can be aligned to an existing model and compared to check for differences. Geometric dimensions and tolerances can also be extracted directly from the point cloud.
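As an illustration, once the scanned cloud has been registered to the reference geometry, deviations can be estimated from nearest-neighbor distances; a sketch in Python, assuming SciPy and a reference model densely sampled as a point set (the names and tolerance are hypothetical):

```python
import numpy as np
from scipy.spatial import cKDTree

def deviation_map(scan, reference, tolerance=0.1):
    """Distance from each scanned point (N x 3) to its nearest point on a
    densely sampled reference model (M x 3); both clouds already aligned."""
    dist, _ = cKDTree(reference).query(scan)
    out_of_spec = dist > tolerance        # boolean mask of deviating points
    return dist, out_of_spec
```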
While point clouds can be directly rendered and inspected, [10] [11] they are often converted to polygon mesh or triangle mesh models, non-uniform rational B-spline (NURBS) surface models, or CAD models through a process commonly referred to as surface reconstruction.
There are many techniques for converting a point cloud to a 3D surface. [12] Some approaches, like Delaunay triangulation, alpha shapes, and ball pivoting, build a network of triangles over the existing vertices of the point cloud, while other approaches convert the point cloud into a volumetric distance field and reconstruct the implicit surface so defined through a marching cubes algorithm. [13]
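For terrain-like clouds with a single height per ground position, the Delaunay approach reduces to triangulating the (x, y) projection and lifting the triangles to the measured heights; a sketch in Python with SciPy (the input file name is hypothetical; closed objects require alpha shapes, ball pivoting, or an implicit-surface method instead):

```python
import numpy as np
from scipy.spatial import Delaunay

points = np.loadtxt("terrain.xyz")   # hypothetical N x 3 file of x, y, z samples
tri = Delaunay(points[:, :2])        # 2D Delaunay triangulation of the projection
faces = tri.simplices                # (K, 3) vertex indices forming the triangles
# `points` and `faces` together define a 2.5D triangle mesh of the surface.
```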
In geographic information systems, point clouds are one of the sources used to make digital elevation models of terrain. [14] They are also used to generate 3D models of urban environments. [15] Drones are often used to collect a series of RGB images that can later be processed on a computer vision platform such as Agisoft PhotoScan, Pix4D, DroneDeploy or Hammer Missions to create RGB point clouds, from which distances and volumetric estimates can be made.
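A common way to derive a digital elevation model from such a cloud is to grid the points and aggregate heights per cell; a minimal sketch in Python with NumPy (the cell size and aggregation by mean are illustrative choices):

```python
import numpy as np

def rasterize_dem(points, cell=1.0):
    """Bin an (N, 3) point cloud into a regular grid and return the mean
    height per cell; cells containing no points come out as NaN."""
    xy = points[:, :2]
    ij = np.floor((xy - xy.min(axis=0)) / cell).astype(int)
    shape = tuple(ij.max(axis=0) + 1)
    total = np.zeros(shape)
    count = np.zeros(shape)
    np.add.at(total, (ij[:, 0], ij[:, 1]), points[:, 2])   # sum of heights per cell
    np.add.at(count, (ij[:, 0], ij[:, 1]), 1)              # number of points per cell
    with np.errstate(invalid="ignore"):
        return total / count
```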
Point clouds can also be used to represent volumetric data, as is sometimes done in medical imaging; this representation supports multi-sampling and data compression. [16]
MPEG began standardizing point cloud compression (PCC) with a call for proposals (CfP) in 2017. [17] [18] [19] Three categories of point clouds were identified: category 1 for static point clouds, category 2 for dynamic point clouds, and category 3 for dynamically acquired LiDAR sequences. Two technologies were ultimately defined: G-PCC (geometry-based PCC, ISO/IEC 23090 part 9) [20] for categories 1 and 3, and V-PCC (video-based PCC, ISO/IEC 23090 part 5) [21] for category 2. The first test models were developed in October 2017, one for G-PCC (TMC13) and another for V-PCC (TMC2). Since then, the two test models have evolved through technical contributions and collaboration, and the first version of the PCC standard specifications was expected to be finalized in 2020 as part of the ISO/IEC 23090 series on the coded representation of immersive media content. [22]
The Moving Picture Experts Group (MPEG) is an alliance of working groups established jointly by ISO and IEC that sets standards for media coding, including compression coding of audio, video, graphics, and genomic data; and transmission and file formats for various applications. Together with JPEG, MPEG is organized under ISO/IEC JTC 1/SC 29 – Coding of audio, picture, multimedia and hypermedia information.
MPEG-4 is a group of international standards for the compression of digital audio and visual data, multimedia systems, and file storage formats. It was originally introduced in late 1998 as a group of audio and video coding formats and related technology agreed upon by the ISO/IEC Moving Picture Experts Group (MPEG) under the formal standard ISO/IEC 14496 – Coding of audio-visual objects. Uses of MPEG-4 include compression of audiovisual data for Internet video, CD distribution, voice, and broadcast television applications. The MPEG-4 standard was developed by a group led by Touradj Ebrahimi and Fernando Pereira.
JPEG 2000 (JP2) is an image compression standard and coding system. It was developed from 1997 to 2000 by a Joint Photographic Experts Group committee chaired by Touradj Ebrahimi, with the intention of superseding their original JPEG standard, which is based on a discrete cosine transform (DCT), with a newly designed, wavelet-based method. The standardized filename extension is .jp2 for ISO/IEC 15444-1 conforming files and .jpx for the extended part-2 specifications, published as ISO/IEC 15444-2. The registered MIME types are defined in RFC 3745. For ISO/IEC 15444-1 it is image/jp2.
X3D is a set of royalty-free ISO/IEC standards for declaratively representing 3D computer graphics. X3D includes multiple graphics file formats, programming-language API definitions, and run-time specifications for both delivery and integration of interactive network-capable 3D data. X3D version 4.0 has been approved by the Web3D Consortium, and is under final review by ISO/IEC as a revised International Standard (IS).
In scientific visualization and computer graphics, volume rendering is a set of techniques used to display a 2D projection of a 3D discretely sampled data set, typically a 3D scalar field.
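One of the simplest volume rendering techniques, the maximum-intensity projection, reduces the field to an image by keeping the largest sample along each viewing ray; an illustrative Python sketch with NumPy (the axis-aligned ray direction and the random field are placeholders):

```python
import numpy as np

volume = np.random.rand(128, 128, 128)   # placeholder 3D scalar field
image = volume.max(axis=2)               # maximum-intensity projection along z
```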
Registration authorities (RAs) exist for many standards organizations, such as ANNA, the Object Management Group, W3C, and others. In general, registration authorities all perform a similar function, promoting the use of a particular standard by facilitating its use. This may be by applying the standard, where appropriate, or by verifying that a particular application satisfies the standard's tenets. Maintenance agencies, in contrast, may change an element in a standard based on set rules, such as the creation or change of a currency code when a currency is created or revalued. The Object Management Group has an additional concept of certified provider, an entity permitted to perform some functions on behalf of the registration authority, under specific processes and procedures documented within the standard for such a role.
In computer vision and image processing, motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another; usually from adjacent frames in a video sequence. It is an ill-posed problem as the motion happens in three dimensions (3D) but the images are a projection of the 3D scene onto a 2D plane. The motion vectors may relate to the whole image or specific parts, such as rectangular blocks, arbitrary shaped patches or even per pixel. The motion vectors may be represented by a translational model or many other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom.
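In block-based motion estimation, the vector for each block is typically found by searching a window in the previous frame for the best-matching block; a sketch in Python with NumPy, using the sum of absolute differences (SAD) as the match criterion (block and search sizes are illustrative):

```python
import numpy as np

def block_match(prev, curr, y, x, block=16, search=8):
    """Exhaustive-search block matching: return the (dy, dx) motion vector
    minimizing the SAD between the block at (y, x) in `curr` and candidate
    blocks in `prev`."""
    ref = curr[y:y + block, x:x + block].astype(np.int32)
    best, mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + block > prev.shape[0] or xx + block > prev.shape[1]:
                continue                                  # candidate outside the frame
            cand = prev[yy:yy + block, xx:xx + block].astype(np.int32)
            sad = np.abs(cand - ref).sum()
            if sad < best:
                best, mv = sad, (dy, dx)
    return mv
```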
Iterative closest point (ICP) is an algorithm employed to minimize the difference between two clouds of points. ICP is often used to reconstruct 2D or 3D surfaces from different scans, to localize robots and achieve optimal path planning, to co-register bone models, etc.
3D scanning is the process of analyzing a real-world object or environment to collect three-dimensional data of its shape and possibly its appearance. The collected data can then be used to construct digital 3D models.
DSSP stands for digital shape sampling and processing. It is an alternative and often preferred way of describing "reverse engineering" software and hardware. The term originated in a 2005 Society of Manufacturing Engineers' "Blue Book" on the topic, which referenced numerous suppliers of both scanning hardware and processing software.
The Video Coding Experts Group or Visual Coding Experts Group is a working group of the ITU Telecommunication Standardization Sector (ITU-T) concerned with standards for compression coding of video, images, audio signals, biomedical waveforms, and other signals. It is responsible for standardization of the "H.26x" line of video coding standards, the "T.8xx" line of image coding standards, and related technologies.
Image-based meshing is the automated process of creating computer models for computational fluid dynamics (CFD) and finite element analysis (FEA) from 3D image data. Although a wide range of mesh generation techniques are currently available, these were usually developed to generate models from computer-aided design (CAD), and therefore have difficulties meshing from 3D imaging data.
In 3D computer graphics, 3D modeling is the process of developing a mathematical coordinate-based representation of a surface of an object in three dimensions via specialized software by manipulating edges, vertices, and polygons in a simulated 3D space.
The Point Cloud Library (PCL) is an open-source library of algorithms for point cloud processing tasks and 3D geometry processing, such as occur in three-dimensional computer vision. The library contains algorithms for filtering, feature estimation, surface reconstruction, 3D registration, model fitting, object recognition, and segmentation. Each module is implemented as a smaller library that can be compiled separately. PCL has its own data format for storing point clouds, PCD (Point Cloud Data), but also allows datasets to be loaded and saved in many other formats. It is written in C++ and released under the BSD license.
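For illustration, a minimal ASCII PCD file begins with a header describing the fields, followed by one point per line; the field layout and point values below are hypothetical:

```
# .PCD v0.7 - Point Cloud Data file format
VERSION 0.7
FIELDS x y z
SIZE 4 4 4
TYPE F F F
COUNT 1 1 1
WIDTH 3
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 3
DATA ascii
0.937 0.337 0.002
0.908 0.350 0.014
0.929 0.344 0.007
```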
3D reconstruction from multiple images is the creation of three-dimensional models from a set of images. It is the reverse process of obtaining 2D images from 3D scenes.
A digital outcrop model (DOM), also called a virtual outcrop model, is a digital 3D representation of the outcrop surface, mostly in the form of a textured polygon mesh.
CloudCompare is 3D point cloud processing software. It can also handle triangular meshes and calibrated images.
Volumetric capture or volumetric video is a technique that captures a three-dimensional space, such as a location or performance. This type of volumography acquires data that can be viewed on flat screens as well as using 3D displays and VR goggles. Consumer-facing formats are numerous and the required motion capture techniques lean on computer graphics, photogrammetry, and other computation-based methods. The viewer generally experiences the result in a real-time engine and has direct input in exploring the generated volume.
JPEG XS is an interoperable, visually lossless, low-latency and lightweight image and video coding system used in professional applications. Applications of the standard include streaming high-quality content for virtual reality, drones, autonomous vehicles using cameras, gaming, and broadcasting. It was the first ISO codec ever designed for this specific purpose. JPEG XS, built on core technology from both intoPIX and Fraunhofer IIS, is formally standardized as ISO/IEC 21122 by the Joint Photographic Experts Group with the first edition published in 2019. Although not official, the XS acronym was chosen to highlight the eXtra Small and eXtra Speed characteristics of the codec. Today, the JPEG committee is still actively working on further improvements to XS, with the second edition scheduled for publication and initial efforts being launched towards a third edition.
Gaussian Splatting is a volume rendering technique that deals with the direct rendering of volume data without converting the data into surface or line primitives. The technique was originally introduced as splatting by Lee Westover in the early 1990s. With advancements in computer graphics, newer methods such as 3D and 4D Gaussian splatting have been developed to offer real-time radiance field rendering and dynamic scene rendering respectively.
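In classical splatting, each sample contributes a reconstruction-kernel "footprint" to the image; an illustrative 2D toy version in Python with NumPy, using an isotropic Gaussian footprint (a sketch of Westover-style splatting, not the full 3D Gaussian splatting pipeline):

```python
import numpy as np

def splat(points, values, size=256, sigma=2.0):
    """Accumulate a Gaussian footprint per point into a 2D image.
    `points` are (x, y) pairs in [0, 1); `values` are per-point intensities."""
    img = np.zeros((size, size))
    r = int(3 * sigma)                                  # truncate the kernel at 3 sigma
    ax = np.arange(-r, r + 1)
    kernel = np.exp(-(ax[:, None]**2 + ax[None, :]**2) / (2 * sigma**2))
    for (x, y), v in zip(points, values):
        i, j = int(y * size), int(x * size)
        i0, i1 = max(i - r, 0), min(i + r + 1, size)    # clip footprint to image bounds
        j0, j1 = max(j - r, 0), min(j + r + 1, size)
        img[i0:i1, j0:j1] += v * kernel[i0 - (i - r):i1 - (i - r),
                                        j0 - (j - r):j1 - (j - r)]
    return img
```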