In machine learning, a neural field (also known as implicit neural representation, neural implicit, or coordinate-based neural network) is a mathematical field that is fully or partially parametrized by a neural network. Initially developed to tackle visual computing tasks, such as rendering or reconstruction (e.g., neural radiance fields), neural fields have emerged as a promising strategy to deal with a wider range of problems, including surrogate modelling of partial differential equations, such as in physics-informed neural networks. [1]
Unlike traditional machine learning algorithms, such as feed-forward neural networks, convolutional neural networks, or transformers, neural fields do not work with discrete data (e.g., sequences, images, tokens), but map continuous inputs (e.g., spatial coordinates, time) to continuous outputs (e.g., scalars, vectors). This makes neural fields not only discretization-independent, but also easily differentiable. Moreover, dealing with continuous data allows for a significant reduction in space complexity, which translates to a much more lightweight network. [1]
According to the universal approximation theorem, provided adequate learning, a sufficient number of hidden units, and a deterministic relationship between the input and the output, a neural network can approximate any function to any degree of accuracy. [2]
Hence, in mathematical terms, given a field $f : X \to Y$, with $X \subseteq \mathbb{R}^m$ and $Y \subseteq \mathbb{R}^n$, a neural field $\Phi_\theta : X \to Y$, with parameters $\theta$, is such that: [1]

$\Phi_\theta(x) \approx f(x), \quad \forall x \in X$
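As a minimal sketch (the PyTorch framework, layer sizes, and ReLU activation below are illustrative assumptions, not part of the definition), such a field can be parametrized by a small fully connected network that maps coordinates to field values:

```python
import torch
import torch.nn as nn

class NeuralField(nn.Module):
    """Coordinate-based MLP: maps continuous coordinates x in R^m to field values in R^n."""
    def __init__(self, in_dim=2, out_dim=1, hidden=128, depth=4):
        super().__init__()
        layers, dim = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(dim, hidden), nn.ReLU()]
            dim = hidden
        layers.append(nn.Linear(dim, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # x: (batch, in_dim) continuous coordinates; returns (batch, out_dim) field values
        return self.net(x)
```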
For supervised tasks, given examples in the training dataset (i.e., input-output pairs $(x_i, f(x_i))$, $i = 1, \dots, N$), the neural field parameters can be learned by minimizing a loss function $\mathcal{L}$ (e.g., mean squared error). The parameters that satisfy the optimization problem are found as: [1] [3] [4]

$\theta^{*} = \arg\min_{\theta} \sum_{i=1}^{N} \mathcal{L}\big(\Phi_{\theta}(x_i), f(x_i)\big)$

Notably, it is not necessary to know the analytical expression of $f$, since this training procedure only requires input-output pairs. Indeed, a neural field can offer a continuous and differentiable surrogate of the true field, even when learned from purely experimental data. [1]
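A sketch of this supervised fitting procedure, reusing the NeuralField class above (the synthetic target field and the optimizer settings are illustrative assumptions):

```python
# Fit Phi_theta(x) ≈ f(x) from input-output pairs only (no analytical form of f needed).
field = NeuralField(in_dim=2, out_dim=1)
optimizer = torch.optim.Adam(field.parameters(), lr=1e-3)

# Training data: coordinates x_i and observed values f(x_i) (here a synthetic stand-in).
x = torch.rand(1024, 2)                                  # coordinates in [0, 1]^2
y = torch.sin(2 * torch.pi * x[:, :1]) * x[:, 1:]        # stand-in for measured f(x)

for step in range(2000):
    optimizer.zero_grad()
    loss = torch.mean((field(x) - y) ** 2)               # mean squared error over the pairs
    loss.backward()
    optimizer.step()
```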
Moreover, neural fields can be used in unsupervised settings, with training objectives that depend on the specific task. For example, physics-informed neural networks may be trained on the residual of the governing differential equation alone, without any labelled field values. [4]
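For instance (the toy equation du/dx + u = 0 with u(0) = 1 and the optimizer settings below are illustrative assumptions), the residual can be evaluated with automatic differentiation at sampled collocation points and minimized without any labelled field values:

```python
pinn = NeuralField(in_dim=1, out_dim=1)
optimizer = torch.optim.Adam(pinn.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(256, 1, requires_grad=True)           # collocation points in (0, 1)
    u = pinn(x)
    du_dx = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    residual = du_dx + u                                  # residual of du/dx + u = 0
    bc = (pinn(torch.zeros(1, 1)) - 1.0) ** 2             # boundary condition u(0) = 1
    loss = torch.mean(residual ** 2) + bc.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```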
As for any artificial neural network, neural fields may be characterized by a spectral bias (i.e., the tendency to preferentially learn the low-frequency content of a field), possibly leading to a poor representation of the ground truth. [5] In order to overcome this limitation, several strategies have been developed. For example, SIREN uses sinusoidal activations, [6] while the Fourier-features approach embeds the input coordinates through sines and cosines before feeding them to the network. [7]
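A sketch of the Fourier-features idea, reusing the NeuralField class above (the number of frequencies and the Gaussian scale are illustrative assumptions): the coordinates are projected onto random frequencies and embedded with sines and cosines before entering the network.

```python
class FourierFeatures(nn.Module):
    """Embeds coordinates with sines and cosines of random projections to mitigate spectral bias."""
    def __init__(self, in_dim=2, num_frequencies=64, scale=10.0):
        super().__init__()
        # Random (fixed) frequency matrix B; a larger `scale` favours higher-frequency content.
        self.register_buffer("B", torch.randn(in_dim, num_frequencies) * scale)

    def forward(self, x):
        proj = 2 * torch.pi * x @ self.B                  # (batch, num_frequencies)
        return torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)

# The embedding is prepended to the coordinate MLP, whose input dimension must match it.
embed = FourierFeatures(in_dim=2, num_frequencies=64)
field = NeuralField(in_dim=128, out_dim=1)                # 128 = 2 * num_frequencies
y_hat = field(embed(torch.rand(8, 2)))
```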
In many real-world cases, however, learning a single field is not enough. For example, when reconstructing 3D vehicle shapes from lidar data, it is desirable to have a machine learning model that can work with arbitrary shapes (e.g., a car, a bicycle, a truck). The solution is to include additional parameters, the latent variables (or latent code) $z$, which vary the field and adapt it to diverse tasks. [1]
When dealing with conditional neural fields, the first design choice is the way in which the latent code is produced. Specifically, two main strategies can be identified: the latent code can be produced by a second neural network, an encoder, which maps the observations of an example to $z$ in a single forward pass; alternatively, with auto-decoding, each example is assigned its own latent code, which is learned directly by optimization, jointly with the neural field parameters. [1]
Since the latter strategy requires additional optimization steps at inference time, it sacrifices speed, but keeps the overall model smaller. Moreover, despite being simpler to implement, an encoder may harm the generalization capabilities of the model. [1] For example, when dealing with a physical scalar field (e.g., the pressure of a 2D fluid), an auto-decoder-based conditional neural field can map a single point $x$ to the corresponding value of the field, following a learned latent code $z$. [10] However, if the latent variables were produced by an encoder, it would require access to the entire set of points and corresponding values (e.g., a regular grid or a mesh graph), leading to a less robust model. [1]
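A minimal sketch of auto-decoding, reusing the NeuralField class above (the concatenation-based conditioning and the latent dimension are illustrative assumptions): each training example owns a latent code, stored in an embedding table and optimized jointly with the shared network weights.

```python
num_examples, latent_dim = 100, 32
codes = nn.Embedding(num_examples, latent_dim)            # one latent code z_j per training example
field = NeuralField(in_dim=2 + latent_dim, out_dim=1)     # conditioned by concatenating z to x
optimizer = torch.optim.Adam(list(field.parameters()) + list(codes.parameters()), lr=1e-3)

def predict(x, example_ids):
    z = codes(example_ids)                                # (batch, latent_dim)
    return field(torch.cat([x, z], dim=-1))

# At inference on a new example, the shared weights are frozen and only a fresh
# latent code is optimized against the available observations.
```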
In a neural field with global conditioning, the latent code does not depend on the input and, hence, it offers a global representation (e.g., the overall shape of a vehicle). However, depending on the task, it may be more useful to divide the domain of $f$ into several subdomains, and learn a different latent code for each of them (e.g., splitting a large and complex scene into sub-scenes for a more efficient rendering). This is called local conditioning. [1]
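A sketch of local conditioning over a two-dimensional domain, reusing the NeuralField class above (the coarse grid and nearest-cell lookup are illustrative assumptions): each grid cell stores its own latent code, and a query point is conditioned on the code of the cell it falls in.

```python
grid_res, latent_dim = 16, 32
local_codes = nn.Embedding(grid_res * grid_res, latent_dim)   # one code per subdomain (grid cell)
field = NeuralField(in_dim=2 + latent_dim, out_dim=1)

def predict_local(x):                                     # x in [0, 1]^2, shape (batch, 2)
    cell = (x * grid_res).long().clamp(max=grid_res - 1)  # which cell each point falls in
    idx = cell[:, 0] * grid_res + cell[:, 1]
    z = local_codes(idx)                                  # latent code of that cell
    return field(torch.cat([x, z], dim=-1))
```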
There are several strategies to include the conditioning information in the neural field. In the general mathematical framework, conditioning the neural field on the latent variables $z$ is equivalent to mapping them, through a function $\Psi$, to a subset $\theta_z$ of the neural field parameters: [1]

$\theta_z = \Psi(z)$

In practice, notable strategies are conditioning by concatenation (appending $z$ to the input coordinates), hypernetworks (a second network that outputs the field parameters from $z$), and feature-wise linear modulation (FiLM), in which $z$ produces scale and shift factors for the hidden activations.
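A sketch of the FiLM-style strategy (layer sizes and latent dimension are illustrative assumptions, and the imports from the earlier sketches are reused): a small linear layer maps the latent code to per-feature scale and shift factors that modulate a hidden layer of the field network.

```python
class FiLMField(nn.Module):
    """Latent code z is mapped to scale/shift parameters that modulate a hidden layer."""
    def __init__(self, in_dim=2, latent_dim=32, hidden=128, out_dim=1):
        super().__init__()
        self.inp = nn.Linear(in_dim, hidden)
        self.film = nn.Linear(latent_dim, 2 * hidden)     # produces (gamma, beta) from z
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x, z):
        gamma, beta = self.film(z).chunk(2, dim=-1)
        h = torch.relu(gamma * self.inp(x) + beta)        # feature-wise linear modulation
        return self.out(h)
```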
Instead of relying on the latent code to adapt the neural field to a specific task, it is also possible to exploit gradient-based meta-learning. In this case, the neural field is seen as the specialization of an underlying meta-neural-field, whose parameters are modified to fit the specific task through a few steps of gradient descent. [13] [14] An extension of this meta-learning framework is the CAVIA algorithm, which splits the trainable parameters into context-specific and shared groups, improving parallelization and interpretability while reducing meta-overfitting. This strategy is similar to the auto-decoding conditional neural field, but the training procedure is substantially different. [15]
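A sketch of such an inner loop in the CAVIA spirit, reusing the NeuralField class above (the number of inner steps, learning rate, and context size are illustrative assumptions): only a small context vector is adapted per task, while the shared weights are updated in the outer loop.

```python
field = NeuralField(in_dim=2 + 32, out_dim=1)             # shared ("meta") parameters

def adapt(x, y, inner_steps=5, inner_lr=1e-2):
    """Specialize the meta-field to one task by adapting only the context vector."""
    context = torch.zeros(1, 32, requires_grad=True)
    for _ in range(inner_steps):
        pred = field(torch.cat([x, context.expand(x.shape[0], -1)], dim=-1))
        loss = torch.mean((pred - y) ** 2)
        grad, = torch.autograd.grad(loss, context, create_graph=True)
        context = context - inner_lr * grad               # few gradient steps on the context only
    return context
# The shared parameters of `field` are then updated in an outer loop across tasks.
```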
Thanks to the possibility of efficiently modelling diverse mathematical fields with neural networks, neural fields have been applied to a wide range of problems, including rendering and novel-view synthesis (e.g., neural radiance fields), 3D shape reconstruction from point clouds such as lidar scans, and surrogate modelling of partial differential equations (e.g., physics-informed neural networks).