Hybrid classification: What do you do when machine learning hits some limitations?


When talking about data and its analysis, machine learning comes to mind very quickly, and indeed machine learning has provided excellent solutions in many of our projects. However, we have sometimes hit some limitations as well, for instance when certain data are rare or difficult to obtain, or when computing power is scarce, such as in embedded applications. A possible solution is to combine different classification models – let’s call this hybrid classification – which I would like to present below based on an abstracted example.

In the following example, the system should recognized whether a robot arm has gripped a tool or not based on some pictures from a camera. This classification is the basis of further automation of the system. To classify these images successfully and reliably in all possible cases, a machine learning model that can distinguish between the cases of ‘a tool is present’ and ‘no tool present’ needs a large amount of training data, i.e. labelled images.

I could select an artificial neuronal network as the machine learning model of choice. There is the possibility, however, that running such a model on an embedded system with limited computing power, would lead to slow performance. Using classical image processing approaches such as template matching offers some advantages here, as they incorporate known information about the classification into the model. However, template matching cannot deal with all images and leaves some images unclassified. I therefore decide to go for a hybrid approach: Classical image processing as the first step, which classifies the majority of the images in a reasonable time, and a neuronal net as a second step, which classifies only the remaining images.

In my example, this hybrid approach led to a correct classification of 99.9% of the images, while taking about 2 to 3 seconds per image to classify. Whether this is good enough of course depends on the individual requirements of the application.

This example demonstrates how classical data processing methods can be combined with machine learning, exploiting the advantages of both approaches: Complex neuronal networks can learn a great many things but need large amounts of annotated data. Template matching manages to incorporate known information about the classification into the model, using fewer computing resources. This way, we get the best of both worlds!

Author: Sara Hänzi