A visual sensor network is a network of spatially distributed smart camera devices capable of processing and fusing images of a scene from a variety of viewpoints into some form more useful than the individual images. A visual sensor network may be a type of wireless sensor network, and much of the theory and application of the latter applies to the former. The network generally consists of the cameras themselves, which have some local image processing, communication and storage capabilities, and possibly one or more central computers, where image data from multiple cameras is further processed and fused (this processing may, however, simply take place in a distributed fashion across the cameras and their local controllers). Visual sensor networks also provide some high-level services to the user so that the large amount of data can be distilled into information of interest using specific queries.
The primary difference between visual sensor networks and other types of sensor networks is the nature and volume of information the individual sensors acquire: unlike most sensors, cameras are directional in their field of view, and they capture a large amount of visual information which may be partially processed independently of data from other cameras in the network. Alternatively, one may say that while most sensors measure some value such as temperature or pressure, visual sensors measure patterns. In light of this, communication in visual sensor networks differs substantially from that in traditional sensor networks.
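The directional nature of a camera can be illustrated with a simple coverage test. The sketch below is only an illustration under stated assumptions: it uses a flat 2D model in which a camera has a position, a heading, an angular field of view and a maximum useful range. The function name in_field_of_view and all of its parameters are hypothetical and are not taken from any particular system.

```python
import math

def in_field_of_view(cam_xy, heading_deg, fov_deg, max_range, target_xy):
    """Return True if target_xy lies inside the camera's 2D viewing sector.

    Assumed model (hypothetical): the camera sees a circular sector centred
    on heading_deg, fov_deg wide, reaching out to max_range.
    """
    dx = target_xy[0] - cam_xy[0]
    dy = target_xy[1] - cam_xy[1]
    distance = math.hypot(dx, dy)
    if distance > max_range:
        return False  # too far away to be observed usefully
    if distance == 0.0:
        return True   # target coincides with the camera position
    # Bearing from camera to target, in degrees, measured from the +x axis.
    bearing = math.degrees(math.atan2(dy, dx))
    # Signed angular difference wrapped into [-180, 180).
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= fov_deg / 2.0
```

An omnidirectional sensor such as a thermometer would need only the range check; the angular test is what makes camera placement and orientation matter for coverage.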
Visual sensor networks are most useful in applications involving area surveillance, tracking, and environmental monitoring. Of particular use in surveillance applications is the ability to perform a dense 3D reconstruction of a scene and to store the data over time, so that operators can view events from any period (including the current moment) and from an arbitrary viewpoint in the covered area, even "flying" around the scene in real time. High-level analysis using object recognition and other techniques can intelligently track objects (such as people or cars) through a scene, and even determine what they are doing, so that certain activities can be automatically brought to the operator's attention. Another possibility is the use of visual sensor networks in telecommunications, where the network would automatically select the "best" view (perhaps even an arbitrarily generated one) of a live event.
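What counts as the "best" view depends on the application. As a minimal sketch, assume a flat 2D scene and an illustrative criterion that favours cameras which are close to the target and have it near the centre of their field of view. The Camera class, view_score and select_best_view are hypothetical names introduced here, and the scoring formula is an assumption, not a standard method.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

@dataclass
class Camera:
    name: str
    x: float
    y: float
    heading_deg: float  # direction of the optical axis, from the +x axis
    fov_deg: float      # full angular field of view, assumed below 180

def view_score(cam: Camera, target: Tuple[float, float]) -> Optional[float]:
    """Score a camera's view of target, or None if the target is not visible.

    Assumed criterion (illustrative only): cos(angular offset) / distance,
    so closer and better-centred views score higher.
    """
    dx, dy = target[0] - cam.x, target[1] - cam.y
    distance = math.hypot(dx, dy)
    if distance == 0.0:
        return None  # target at the camera position: no meaningful view
    bearing = math.degrees(math.atan2(dy, dx))
    # Absolute angular offset from the optical axis, in [0, 180].
    offset = abs((bearing - cam.heading_deg + 180.0) % 360.0 - 180.0)
    if offset > cam.fov_deg / 2.0:
        return None  # outside the field of view
    return math.cos(math.radians(offset)) / distance

def select_best_view(cameras: Sequence[Camera],
                     target: Tuple[float, float]) -> Optional[str]:
    """Return the name of the highest-scoring camera, or None if none sees it."""
    best_name, best_score = None, 0.0
    for cam in cameras:
        score = view_score(cam, target)
        if score is not None and score > best_score:
            best_name, best_score = cam.name, score
    return best_name

cameras = [
    Camera("A", 0.0, 0.0, 0.0, 60.0),
    Camera("B", 10.0, 0.0, 180.0, 60.0),
    Camera("C", 0.0, 10.0, 90.0, 60.0),
]
print(select_best_view(cameras, (8.0, 0.0)))
```

A practical system would likely also weigh occlusion, image resolution and available bandwidth, and might synthesise a virtual viewpoint rather than pick an existing camera.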