How Do Your Machines See?

EP Editorial Staff | April 11, 2024

The machine-vision system you choose depends on available data and the required task.

By Alex Wirtz, Motion Automation Intelligence

In simple terms, machine-vision systems are image-capturing and processing systems used in industrial processes. These systems serve as a rich source of production information, revealing valuable insights into the manufacturing process. Additionally, data and measurements can be stored in databases and used for real-time display of process metrics. Machine vision is used in metrology, to read 1D and 2D codes or text, and to perform complex tasks such as anomaly detection and defect classification for process monitoring.

Traditionally, machine-vision solutions use rule-based algorithms for various manufacturing functions. A rule-based system is simply a model type that uses prewritten rules to solve problems. Tried and true, these methods are highly effective at guaranteeing quality in various applications. However, rule-based methods struggle to solve several issues, and some solutions are difficult to realistically implement in a production environment. Recent developments in machine learning, including deep learning, have helped vision engineers solve previously tricky problems.

When machine learning is mentioned, people tend to automatically think of deep learning. However, deep learning is only one subset of machine learning; both fall under the umbrella of artificial intelligence. This is an important distinction because, in machine vision, rule-based, machine-learning, and deep-learning methods each have benefits and drawbacks for a given application.

Rule-Based Algorithms

Rule-based machine-vision algorithms are quick to develop and deploy. They generally only require a small amount of data to create, often needing only a few samples of each form of inspection. This means less time is needed to gather data during development and, in a manufacturing environment, less product is wasted during data acquisition. Usually, algorithm building can happen in parallel with the building of the larger system.

If an engineer can easily predefine the characteristics, or rules, of the image inspection, it is best practice to begin a machine-vision application with a rule-based approach. Rule-based algorithms are highly accurate and repeatable, with most systems performing at or near 100% accuracy on the task they were designed for. This makes them the ideal approach whenever high precision is necessary, such as in metrology, gauging, and precision alignment tasks. However, their utility is by no means limited to these use cases.
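As a rough illustration of what a predefined rule can look like, the sketch below gauges the width of a bright part in a grayscale image and compares it to a tolerance. The function names, the threshold of 128, and the synthetic image are assumptions made for this example; a production system would rely on a calibrated vision library rather than hand-written loops.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, pixel values 0-255


def measure_width_px(image: Image, row: int, threshold: int = 128) -> int:
    """Count the pixels in one row that are brighter than the threshold."""
    return sum(1 for pixel in image[row] if pixel > threshold)


def inspect_width(image: Image, row: int, nominal_px: int,
                  tolerance_px: int, threshold: int = 128) -> bool:
    """Rule: pass only if the measured width is within tolerance of nominal."""
    width = measure_width_px(image, row, threshold)
    return abs(width - nominal_px) <= tolerance_px


def make_part_image(width_px: int, columns: int = 20, rows: int = 5) -> Image:
    """Build a synthetic image (illustration only): bright part, dark background."""
    start = (columns - width_px) // 2
    row = [200 if start <= c < start + width_px else 20 for c in range(columns)]
    return [list(row) for _ in range(rows)]


good_part = make_part_image(8)
wide_part = make_part_image(11)
print(inspect_width(good_part, row=2, nominal_px=8, tolerance_px=1))  # True
print(inspect_width(wide_part, row=2, nominal_px=8, tolerance_px=1))  # False
```

The rule is fully explicit: the same image always yields the same measurement and the same verdict, which is where the repeatability of this approach comes from.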

Rule-based algorithms work best with strict and controlled part handling or advanced software fixturing techniques. Very slight changes in how an object looks in an image can dramatically hinder the performance of an algorithm. For this reason, an engineer designing a machine-vision system using these algorithms must carefully consider how to present a part to a sensor for image acquisition to ensure the process is repeatably precise.
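One way to picture software fixturing is shown below: instead of checking a fixed pixel location, the inspection first locates the part's edge and then positions its region of interest (ROI) relative to that edge. This is a minimal sketch on a single image row; the helper names and the synthetic data are hypothetical and do not come from any particular vision package.

```python
from typing import List, Optional

Row = List[int]  # one grayscale image row, pixel values 0-255


def find_left_edge(row: Row, threshold: int = 128) -> Optional[int]:
    """Return the index of the first bright pixel, or None if there is none."""
    for index, pixel in enumerate(row):
        if pixel > threshold:
            return index
    return None


def feature_present(row: Row, roi_start: int, roi_len: int,
                    threshold: int = 128) -> bool:
    """Rule: every pixel inside the ROI must be bright."""
    roi = row[roi_start:roi_start + roi_len]
    return len(roi) == roi_len and all(pixel > threshold for pixel in roi)


def fixtured_feature_present(row: Row, offset_from_edge: int, roi_len: int,
                             threshold: int = 128) -> bool:
    """Same rule, but the ROI is placed relative to the located part edge."""
    edge = find_left_edge(row, threshold)
    if edge is None:
        return False
    return feature_present(row, edge + offset_from_edge, roi_len, threshold)


def make_row(part_start: int, part_len: int = 6, columns: int = 20) -> Row:
    """Synthetic row (illustration only): a bright part on a dark background."""
    return [200 if part_start <= c < part_start + part_len else 20
            for c in range(columns)]


taught = make_row(part_start=5)   # position used when the rule was set up
shifted = make_row(part_start=8)  # same part, presented 3 pixels to the right

print(feature_present(taught, roi_start=7, roi_len=3))    # True
print(feature_present(shifted, roi_start=7, roi_len=3))   # False: fixed ROI misses
print(fixtured_feature_present(shifted, offset_from_edge=2, roi_len=3))  # True
```

The fixed ROI fails on a perfectly good part that merely moved, while the edge-relative ROI tolerates the shift.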

In addition, the objects that the system is attempting to detect must be consistent, with little acceptable variation. This can make it difficult to integrate these machine-vision system types into production environments where careful part handling (or fixturing) cannot be guaranteed or when the part has unpredictable variations.

Whether you use rule-based or machine-learning algorithms depends on what you’re inspecting and the available data.

Machine-Learning Algorithms

There are two main types of machine-learning algorithms: supervised and unsupervised. Each type seeks to create models that learn patterns from input data to predict outcomes. The difference is that supervised learning requires labeled data and unsupervised learning does not. For supervised models to train properly, the data requires human input during the acquisition and labeling process, which can be time-consuming and laborious.
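The difference between the two types can be sketched with a single hypothetical feature, such as the measured area of a scratch on each part. In the example below, the supervised model (a nearest-centroid classifier) needs a human-supplied label for every sample, while the unsupervised one (a simple two-cluster k-means) groups the same numbers without labels. Both algorithms and the sample values were chosen for brevity and are assumptions of this sketch, not methods named in the article.

```python
from statistics import mean
from typing import Dict, List, Tuple


def fit_centroids(samples: List[float], labels: List[str]) -> Dict[str, float]:
    """Supervised: average the feature per human-provided label."""
    grouped: Dict[str, List[float]] = {}
    for value, label in zip(samples, labels):
        grouped.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in grouped.items()}


def predict(centroids: Dict[str, float], value: float) -> str:
    """Assign the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(samples: List[float], iterations: int = 10) -> Tuple[float, float]:
    """Unsupervised: split the samples into two clusters without any labels."""
    low, high = min(samples), max(samples)
    for _ in range(iterations):
        low_group = [s for s in samples if abs(s - low) <= abs(s - high)]
        high_group = [s for s in samples if abs(s - low) > abs(s - high)]
        if not high_group:  # all samples fell into one cluster; stop early
            break
        low, high = mean(low_group), mean(high_group)
    return low, high


# Hypothetical scratch areas (square millimeters) and their human labels.
scratch_area = [1.0, 2.0, 1.5, 9.0, 10.0, 8.5]
labels = ["good", "good", "good", "bad", "bad", "bad"]

centroids = fit_centroids(scratch_area, labels)
print(predict(centroids, 3.0))  # good
print(predict(centroids, 7.0))  # bad

# The clusters are found, but nothing says which one is "good" or "bad".
print(two_means(scratch_area))
```

The unsupervised result still needs interpretation by a person, whereas the supervised model carries the meaning of each class because someone labeled the data up front.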

Machine-learning algorithms typically require larger datasets than a rule-based algorithm during development. For the resulting model to be accurate and predictable, the data must represent as many of the input variations found in the production environment as possible. For a machine-vision inspection, this means collecting multiple samples of every form of inspection. The amount of data required varies with the chosen machine-learning method and the problem’s complexity.

Non-deep machine-learning methods often require less data than deep-learning methods but can be more limiting in application. Any machine-learning-based machine-vision application will usually need to start with building a way to acquire and properly label data. The algorithm is built after the physical system has been set up and has acquired data.

Machine-learning models thrive with applications where the data can greatly vary or may be subject to interpretation. A machine-vision system can use supervised machine-learning techniques to classify “good” or “bad” products that may otherwise require a trained human eye to sort. Human inspection is not ideal since opinions on tolerances will differ from one person to the next and may be inconsistent day to day. Machine learning removes this human error while still being able to classify data that traditional rule-based methods could not. A drawback of machine-learning algorithms, including deep learning, is that they are less accurate than conventional vision algorithms. Many applications struggle to perform at greater than 99% accuracy.
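To put an accuracy figure in production terms, the sketch below scores a "good"/"bad" classifier against a labeled validation set and then projects how many parts would be misjudged at a given accuracy. The label strings, the sample data, and the volume of 10,000 parts are hypothetical values chosen for illustration.

```python
from typing import Dict, List


def confusion_counts(truth: List[str], predicted: List[str]) -> Dict[str, int]:
    """Tally correct calls, false accepts (bad passed), false rejects (good failed)."""
    counts = {"true_good": 0, "true_bad": 0, "false_accept": 0, "false_reject": 0}
    for actual, guess in zip(truth, predicted):
        if actual == "good" and guess == "good":
            counts["true_good"] += 1
        elif actual == "bad" and guess == "bad":
            counts["true_bad"] += 1
        elif actual == "bad" and guess == "good":
            counts["false_accept"] += 1
        else:
            counts["false_reject"] += 1
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Fraction of all parts that were classified correctly."""
    total = sum(counts.values())
    return (counts["true_good"] + counts["true_bad"]) / total


def expected_errors(accuracy_rate: float, parts: int) -> float:
    """Expected number of misclassified parts at a given accuracy."""
    return (1.0 - accuracy_rate) * parts


# Hypothetical validation set: 8 good parts and 2 bad parts.
truth = ["good"] * 8 + ["bad"] * 2
predicted = ["good"] * 7 + ["bad"] + ["bad", "good"]

counts = confusion_counts(truth, predicted)
print(counts["false_accept"], counts["false_reject"])  # 1 1
print(accuracy(counts))                                # 0.8

# At 99% accuracy, a run of 10,000 parts still yields about 100 wrong calls.
print(round(expected_errors(0.99, 10_000)))            # 100
```

Separating false accepts from false rejects matters because the two errors usually carry different costs: one ships a defect, the other scraps a good part.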

Deep Learning

Deep learning is currently the most popular topic in the machine-learning field. A deep-learning vision system uses neural networks to extract relevant information from an image to generate a desired output. In machine vision, these algorithms are most often a form of supervised learning: during training, the network repeatedly adjusts its internal weights to reduce its error on labeled images. The nature of this training makes deep-learning algorithms act like a black box, where it is difficult to explain or predict why a given input produces a given output. An unexpected failure can therefore lead to additional data collection and retraining.

Common deep-learning applications in machine vision include classification (for example, optical character recognition), object detection, segmentation, pose estimation and, more recently, generative tasks. Deep learning excels in handling complex visual patterns and features. If the task involves intricate patterns or requires an understanding of high-level visual concepts, deep learning can outperform traditional vision algorithms.

Deep-learning models are highly scalable and can be trained to handle various tasks with minimal changes to the underlying architecture. Retraining is required only when new data is introduced. In situations where high variability is expected or in visual inspections where a data pattern may not be obvious, deep-learning algorithms become necessary.

Deep-learning models can require substantial data accumulation to perform at a level similar to other methods. This can sometimes be a limiting factor, as gathering the required data may be inconvenient. However, using data augmentation while training can reduce the data needed to train a model. It artificially increases the dataset size without gathering new data samples.

Augmentation methods include, but are not restricted to:

• adding noise to existing images
• stretching and rotating existing images
• randomly placing additional data within the image.
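The three augmentation methods listed above can be sketched in a few lines. The example works on small grayscale images stored as nested lists and uses only the Python standard library; the function names, the noise level, and the seeded random generator are assumptions for illustration. Training frameworks typically provide their own, more capable augmentation tools.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale rows, pixel values 0-255


def add_noise(image: Image, rng: random.Random, sigma: float = 10.0) -> Image:
    """Add Gaussian noise to every pixel, clamped to the 0-255 range."""
    return [[min(255, max(0, round(pixel + rng.gauss(0, sigma)))) for pixel in row]
            for row in image]


def rotate_90(image: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def stretch_horizontal(image: Image, factor: int = 2) -> Image:
    """Stretch the image horizontally by repeating each pixel `factor` times."""
    return [[pixel for pixel in row for _ in range(factor)] for row in image]


def paste_patch(image: Image, patch: Image, rng: random.Random) -> Image:
    """Copy a small patch (for example, a defect) to a random location.

    Assumes the patch is no larger than the image in either dimension.
    """
    result = [list(row) for row in image]
    top = rng.randint(0, len(image) - len(patch))
    left = rng.randint(0, len(image[0]) - len(patch[0]))
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            result[top + r][left + c] = value
    return result


rng = random.Random(42)  # seeded so the augmented set is reproducible
original = [[50] * 6 for _ in range(4)]
defect = [[255, 255]]

augmented = [
    add_noise(original, rng),
    rotate_90(original),
    stretch_horizontal(original),
    paste_patch(original, defect, rng),
]
print(len(augmented), "new samples from 1 original image")
```

Each call returns a new image and leaves the original untouched, so one acquired sample can be turned into several training samples.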

Also, many deep-learning resources today offer pre-trained models. A pre-trained neural network has already been trained on other, more general datasets, which can decrease the amount of data needed to train it on a new dataset. These developments have all helped make deep learning more effective in machine-vision applications.

While each approach has benefits and drawbacks, recent advancements in machine learning, including data augmentation and pre-trained models, have significantly enhanced the effectiveness of deep-learning methods in machine-vision applications. Understanding each method’s strengths and limitations is critical for selecting the best approach to meet a given machine vision task’s specific requirements.

Alexander Wirtz is a Software and Machine Vision Engineer for the Motion Ai Vision Excellence Group, based in Birmingham, AL. He develops and deploys vision systems for manufacturing in multiple industries.

