A vision-guided robot (VGR) system is a robot fitted with one or more cameras that act as sensors, providing a secondary feedback signal to the robot controller so that it can move more accurately to a variable target position. VGR is rapidly transforming production processes by making robots highly adaptable and easier to implement, while dramatically reducing the cost and complexity of the fixed tooling previously associated with the design and setup of robotic cells, whether for material handling, automated assembly, agricultural applications, [1] life sciences, or other uses. [2]
In one classic but rather dated example of VGR used in industrial manufacturing, the vision system (camera and software) determines the position of products fed randomly onto a recycling conveyor. The components are spread out randomly within the camera's field of view, and the vision system provides their exact location coordinates to the robot, enabling the robot arm(s) to move the attached end effector (gripper) to the selected component and pick it from the conveyor belt. The conveyor may stop under the camera to allow the position of the part to be determined or, if the cycle time is sufficient, a component can be picked without stopping the conveyor. This uses a control scheme that tracks the moving component through the vision software, typically by fitting an encoder to the conveyor and using its feedback signal to update and synchronize the vision and motion control loops.
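The encoder-based tracking described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a vendor implementation: the function name `track_part`, the `counts_per_mm` scale factor, and the choice of belt travel along the robot's +X axis are all hypothetical.

```python
def track_part(x_at_capture_mm, y_at_capture_mm,
               encoder_at_capture, encoder_now, counts_per_mm):
    """Estimate where a part is now, given where the camera saw it.

    Assumes (hypothetically) that the belt travels along the robot's +X
    axis and that the encoder count increases as the belt advances.
    """
    if counts_per_mm <= 0:
        raise ValueError("counts_per_mm must be positive")
    # Belt travel since the image was taken, converted from counts to mm.
    belt_travel_mm = (encoder_now - encoder_at_capture) / counts_per_mm
    # Only the coordinate along the belt changes; the cross-belt one is kept.
    return x_at_capture_mm + belt_travel_mm, y_at_capture_mm


# Part seen at (120.0, 35.0) mm when the encoder read 10_000 counts.
# The encoder now reads 12_500 counts at 50 counts per mm of belt travel.
x_now, y_now = track_part(120.0, 35.0, 10_000, 12_500, 50.0)
print(x_now, y_now)
```

In a real cell the robot controller would re-evaluate this estimate every control cycle, so that the pick target keeps moving with the belt until the gripper closes.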
Such functionality is now common in the field of vision-guided robotics (VGR). It is a rapidly evolving technology that is proving to be economically advantageous in countries with high manufacturing overheads and skilled labor costs by reducing manual intervention, improving safety, increasing quality, and raising productivity rates, among other benefits. [3]
The expansion of vision-guided robotic systems is part of the broader growth within the machine vision market, which is expected to grow to $17.72 billion by 2028. This growth can be attributed to the increasing demand for automation and precision, as well as the broad adoption of smart technologies across industries.
A vision system comprises a camera and a microprocessor or computer, with associated software. This broad definition covers many different types of systems, which aim to solve a large variety of tasks. Vision systems can be implemented in virtually any industry for any purpose. They can be used for quality control to check dimensions, angles, colour, or surface structure, or for the recognition of an object, as in VGR systems.
A camera can be anything from a standard compact camera system with an integrated vision processor to more complex laser sensors and high-resolution and high-speed cameras. Combinations of several cameras to build up 3D images of an object are also available.
Matching the camera to the expectations set for an integrated vision system is a recurring difficulty. In most cases, this is caused by a lack of knowledge on the part of the integrator or machine builder. Many vision systems can be applied successfully to virtually any production activity, as long as the user knows exactly how to set up the system parameters. This setup, however, requires a large amount of knowledge on the part of the integrator, and the number of possibilities can make the solution complex. Lighting in industrial environments can be another major pitfall for many vision systems.
An advantage of 3D vision technology is its relative independence from lighting conditions. Unlike 2D systems, which rely on specific lighting for accurate imaging, 3D vision systems can perform reliably under a variety of lighting scenarios. This is because 3D imaging typically involves capturing spatial information, which is less sensitive to contrast and shadows than 2D imaging.
In recent years, start-ups have begun to appear offering software that simplifies the programming and integration of these 3D systems, in order to make them more accessible to industry. By leveraging 3D vision technologies, robots can navigate and perform tasks in environments with dynamic or uncontrolled lighting, which significantly expands their applications in real-world settings.
Typically, vision guidance systems fall into two categories: stationary camera mount or robot arm-mounted camera. A stationary camera is typically mounted on a gantry or other structure from which it can observe the entire robot cell area. Because the camera's position is fixed and known, this approach provides a stable point of reference for all activity within the cell. Its disadvantages are the additional infrastructure cost and the robot arm occasionally obstructing the camera's view. It also typically requires high-resolution images (5 Mpixel or more), and therefore large image files, since the image must cover the entire work area.
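Because a stationary camera does not move relative to the robot base, one fixed mapping from image pixels to robot coordinates can serve the whole cell. The sketch below illustrates the idea for the simplest case, a camera looking straight down at a flat work surface. The function name and every calibration value are hypothetical, and a real installation would obtain them from a calibration procedure and would usually also correct for lens distortion.

```python
import math


def pixel_to_robot(u, v, mm_per_pixel, theta_deg, origin_x_mm, origin_y_mm):
    """Map an image pixel (u, v) to robot-base XY in millimetres.

    Assumes a downward-looking camera over a flat surface, so one scale
    factor, one rotation angle, and one offset are enough. It also assumes
    the image axes need no mirroring; a real setup may need an axis flip.
    All calibration values used here are hypothetical.
    """
    theta = math.radians(theta_deg)
    # Scale from pixels to millimetres in the camera's own axes.
    x_cam = u * mm_per_pixel
    y_cam = v * mm_per_pixel
    # Rotate into the robot's axes, then shift by the image origin's
    # position expressed in robot coordinates.
    x = origin_x_mm + x_cam * math.cos(theta) - y_cam * math.sin(theta)
    y = origin_y_mm + x_cam * math.sin(theta) + y_cam * math.cos(theta)
    return x, y


# Pixel (400, 200) at 0.5 mm/pixel, camera rotated 90 degrees relative to
# the robot, image origin located at (300, -100) mm in robot coordinates.
x, y = pixel_to_robot(400, 200, 0.5, 90.0, 300.0, -100.0)
print(round(x, 3), round(y, 3))
```

The same four calibration values are reused for every image, which is what makes the fixed mount a stable reference.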
These may be 2D or 3D cameras, although the vast majority of installations (as of 2019) use 2D machine vision cameras offered by companies such as Keyence, Basler, Sick, Datalogic, Cognex, and many others. Emerging players such as Leopard Imaging, Pickit3D, Zivid, and Photoneo are offering 3D cameras for stationary use. Cognex recently acquired EnShape to add 3D capabilities to its lineup as well. 3D stationary mount cameras create large image files and point clouds that require substantial computing resources to process.
A camera mounted on a robot arm has some advantages and disadvantages. Some 3D cameras are simply too large to be practical when mounted on a robot, but Pickit 3D's Xbox cameras and 2D cameras such as Robotiq's wrist camera are compact and/or light enough to not meaningfully affect available robot working payload. An arm-mounted camera has a smaller field of view and can operate successfully at a lower resolution, even VGA, because it only surveys a fraction of the entire work cell at any point in time. This leads to faster image processing times.
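The resolution trade-off between the two mounting styles can be made concrete with a little arithmetic. The numbers below are illustrative assumptions only (a 1.2 m wide cell viewed by a 2448 x 2048 sensor, and a 160 mm wide patch viewed by a 640 x 480 VGA sensor); they are not taken from any specific product.

```python
def mm_per_pixel(field_of_view_mm, pixels_across):
    """Width of the scene covered by one pixel, in millimetres."""
    return field_of_view_mm / pixels_across


# Hypothetical stationary camera: about 5 Mpixel, viewing a 1200 mm wide cell.
stationary = mm_per_pixel(1200.0, 2448)

# Hypothetical arm-mounted camera: VGA, viewing a 160 mm wide patch.
arm_mounted = mm_per_pixel(160.0, 640)

# Ratio of pixels that must be processed per image.
pixel_ratio = (2448 * 2048) / (640 * 480)

print(f"stationary : {stationary:.3f} mm/pixel")
print(f"arm-mounted: {arm_mounted:.3f} mm/pixel")
print(f"pixels per image ratio: {pixel_ratio:.1f}x")
```

Under these assumptions the VGA camera resolves finer detail on the part (0.25 mm per pixel against roughly 0.49 mm) while handling about one sixteenth of the pixels per image, which is the source of the faster processing times.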
However, arm-mounted cameras, whether 2D or 3D, typically suffer from XYZ disorientation because they are continually moving and have no way of knowing the robot arm's position. The typical workaround is to interrupt each robot cycle long enough for the camera to take another image and get reoriented. This is visible in essentially all published videos of arm-mounted camera performances, whether 2D or 3D, and can increase cycle times by as much as double what would otherwise be required.
Pickit 3D's Xbox camera has been arm-mounted for some applications. While capable of more complex 3D tasks such as bin picking, it still requires the stop-take-a-picture re-orientation mentioned above. Its 3D awareness does not help with that problem.
Visual Robotics claims to eliminate this cycle interruption with its "Vision-in-Motion" capabilities. Their system combines a 2D imager with internal photogrammetry and software to perform 3D tasks at high speed, owing to the smaller image files. The company claims a pending patent covering techniques for ensuring the camera knows its location in 3D space without stopping to get reoriented, leading to substantially faster cycle times. While much faster than other 3D approaches, it is not likely to be able to handle the more complex 3D tasks a true stereo camera can. On the other hand, many 3D applications require relatively simple object identification easily supported by the technique. To date, their ability to visually pick objects in motion (e.g. items on a conveyor) using an arm-mounted camera appears to be unprecedented.
Conversely, Inbolt offers a platform-independent 3D vision-based robotic guidance system that integrates a 3D camera, advanced algorithms, and what it describes as the fastest point cloud processing AI currently available. The system is designed to handle high-frequency data processing efficiently, allowing for real-time tracking. This means the robot can adjust to variations in the position and orientation of objects within its field of vision. This adaptability is critical in environments where precision and flexibility are essential, making the system well suited to unstructured and unplanned environments. By enabling robots to operate without mechanical constraints, it also eliminates the need for expensive jigs and fixtures.
These new solutions are changing the paradigm of manufacturing industries by offering unique solutions that cater to the evolving needs of modern manufacturing processes.
Traditional automation means serial production with large batch sizes and limited flexibility. Complete automation lines are usually built around a single product or possibly a small family of similar products that can run on the same production line. If a component is changed or a completely new product is introduced, this usually causes large changes in the automation process. In most cases, new component fixtures are required, with time-consuming setup procedures. If components are delivered to the process by traditional hoppers and vibrating feeders, new bowl feeder tooling or additional bowl feeder tops are required. Where different products must be manufactured on the same process line, the cost of pallets, fixtures, and bowl feeders can often be a large part of the investment. Other areas to be considered are space constraints, storage of change parts, spare components, and changeover time between products.
VGR systems can run side by side with very little mechanical setup. In the most extreme cases, a gripper change is the only requirement, and the need to present components at a set pick-up position is eliminated. With its vision system and control software, a VGR system can handle different types of components. Parts of various geometries can be fed to the system in any random orientation and be picked and placed without any mechanical changes to the machine, resulting in quick changeover times. Other features and benefits of VGR systems are: [6]
Computer vision tasks include methods for acquiring, processing, analyzing, and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical or symbolic information, e.g. in the form of decisions. "Understanding" in this context signifies the transformation of visual images into descriptions of the world that make sense to thought processes and can elicit appropriate action. This image understanding can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory.
Machine vision is the technology and methods used to provide imaging-based automatic inspection and analysis for such applications as automatic inspection, process control, and robot guidance, usually in industry. Machine vision refers to many technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of computer science. It attempts to integrate existing technologies in new ways and apply them to solve real world problems. The term is the prevalent one for these functions in industrial automation environments but is also used for these functions in other environments, such as vehicle guidance.
Motion capture is the process of recording the movement of objects or people. It is used in military, entertainment, sports, medical applications, and for validation of computer vision and robots. In films, television shows and video games, motion capture refers to recording actions of human actors and using that information to animate digital character models in 2D or 3D computer animation. When it includes face and fingers or captures subtle expressions, it is often referred to as performance capture. In many fields, motion capture is sometimes called motion tracking, but in filmmaking and games, motion tracking usually refers more to match moving.
Logistics automation is the application of computer software or automated machinery to logistics operations in order to improve their efficiency. Typically this refers to operations within a warehouse or distribution center, with broader tasks undertaken by supply chain engineering systems and enterprise resource planning systems.
Automatix Inc., founded in January 1980, was the first company to market industrial robots with built-in machine vision. Its founders were Victor Scheinman, inventor of the Stanford arm; Phillippe Villers, Michael Cronin, and Arnold Reinhold of Computervision; Jake Dias and Dan Nigro of Data General; Gordon VanderBrug of NBS; Donald L. Pieper of General Electric; and Norman Wittels of Clark University.
Cognex Corporation is an American manufacturer of machine vision systems, software and sensors used in automated manufacturing to inspect and identify parts, detect defects, verify product assembly, and guide assembly robots. Cognex is headquartered in Natick, Massachusetts, USA and has offices in more than 20 countries.
FANUC is a Japanese group of companies that provide automation products and services such as robotics and computer numerical control wireless systems. These companies are principally FANUC Corporation of Japan, Fanuc America Corporation of Rochester Hills, Michigan, USA, and FANUC Europe Corporation S.A. of Luxembourg.
Surface-mount technology (SMT) component placement systems, commonly called pick-and-place machines or P&Ps, are robotic machines which are used to place surface-mount devices (SMDs) onto a printed circuit board (PCB). They are used for high speed, high precision placing of a broad range of electronic components onto the PCBs which are in turn used in computers, consumer electronics, and industrial, medical, automotive, military and telecommunications equipment. Similar equipment exists for through-hole components. This type of equipment is sometimes used to package microchips using the flip chip method.
Product and manufacturing information, also abbreviated PMI, conveys non-geometric attributes in 3D computer-aided design (CAD) and Collaborative Product Development systems necessary for manufacturing product components and assemblies. PMI may include geometric dimensions and tolerances, 3D annotation (text) and dimensions, surface finish, and material specifications. PMI is used in conjunction with the 3D model within model-based definition to allow for the elimination of 2D drawings for data set utilization.
3D scanning is the process of analyzing a real-world object or environment to collect three dimensional data of its shape and possibly its appearance. The collected data can then be used to construct digital 3D models.
The following are common definitions related to the machine vision field.
Document cameras, also known as visual presenters, visualizers, digital overheads, or docucams, are high-resolution, real-time image capture devices used to display an object to a large audience, such as in a classroom or a lecture hall, or in online presentations such as webinars.
Visual servoing, also known as vision-based robot control and abbreviated VS, is a technique which uses feedback information extracted from a vision sensor to control the motion of a robot. One of the earliest papers that talks about visual servoing was from the SRI International Labs in 1979.
Vibratory bowl feeders, also known as bowl feeders, are common devices used to orient and feed individual component parts for assembly on industrial production lines. They are used when a randomly sorted bulk package of small components must be fed into another machine one by one, oriented in a particular direction.
A time-of-flight camera, also known as time-of-flight sensor, is a range imaging camera system for measuring distances between the camera and the subject for each point of the image based on time-of-flight, the round trip time of an artificial light signal, as provided by a laser or an LED. Laser-based time-of-flight cameras are part of a broader class of scannerless LIDAR, in which the entire scene is captured with each laser pulse, as opposed to point-by-point with a laser beam such as in scanning LIDAR systems. Time-of-flight camera products for civil applications began to emerge around 2000, as the semiconductor processes allowed the production of components fast enough for such devices. The systems cover ranges of a few centimeters up to several kilometers.
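The ranging principle reduces to one formula: the light travels to the subject and back, so the distance is half the round-trip time multiplied by the speed of light. The short sketch below applies it to a single measurement. The function name is hypothetical, and a real sensor performs this per pixel in hardware, often by measuring phase shift rather than timing a pulse directly.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact, by definition of the metre


def tof_distance_m(round_trip_time_s):
    """Distance to the subject from a measured round-trip time."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # Halve the path length because the light travels out and back.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


# A round trip of 20 nanoseconds corresponds to roughly 3 metres.
distance = tof_distance_m(20e-9)
print(f"{distance:.3f} m")
```

The example also shows why fast electronics are required: resolving a 1 cm difference in distance means resolving a difference of about 67 picoseconds in round-trip time.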
Automated Imaging Association (AIA) is the world's largest machine vision trade group. AIA has more than 350 members from 32 countries, including system integrators, camera, lighting and other vision components manufacturers, vision software providers, OEMs and distributors. The association's headquarters is located in Ann Arbor, Michigan. Now part of A3, the Association for Advancing Automation, AIA joins RIA (Robotic Industries Association), MCMA (Motion Control & Motor Association), and A3 Mexico to form one of the largest collaborative trade associations. All organizations offer industry training, news and member benefits.
RNA Automation, a member of Rhein-Nadel Automation, was established in Birmingham UK, in 1986, and has progressed into becoming the major supplier of parts handling equipment in the UK. The company operates in the area of specialized Automation Engineering, providing automatic parts handling equipment for high volume production in the cosmetics, pharmaceutical, electronics, food and metal working industries, with seven manufacturing facilities across Europe and North America and a network of sales and service outlets across the globe.
Component placement is an electronics manufacturing process that places electrical components precisely on printed circuit boards (PCBs) to create electrical interconnections between functional components and the interconnecting circuitry in the PCBs (leads-pads). The component leads must be accurately immersed in the solder paste previously deposited on the PCB pads. The next step after component placement is soldering.
Zivid is a Norwegian machine vision technology company headquartered in Oslo, Norway. It designs and sells 3D color cameras with vision software that are used in autonomous industrial robot cells, collaborative robot (cobot) cells and other industrial automation systems.
A high performance positioning system (HPPS) is a type of positioning system consisting of a piece of electromechanical equipment (e.g. an assembly of linear stages and rotary stages) that is capable of moving an object in three-dimensional space within a work envelope. Positioning can be done point to point or along a desired path of motion. Position is typically defined in six degrees of freedom: linear position in an x, y, z Cartesian coordinate system, and angular orientation in yaw, pitch, and roll. HPPS are used in many manufacturing processes to move an object (tool or part) smoothly and accurately in six degrees of freedom, along a desired path, at a desired orientation, with high acceleration, high deceleration, high velocity and low settling time. An HPPS is designed to stop its motion quickly and place the moving object accurately at its desired final position and orientation with minimal jitter.