Why accuracy is key in Computer Vision-powered retail execution

Computer Vision helps automate many key in-store tasks, but for this to work, you must prioritize high image recognition accuracy as the first line of differentiation.

In today’s fast-paced, hyper-competitive retail market, most manufacturers and retailers lack accurate, timely on-shelf visibility and actionable insight into what is really happening in store. Manual audits, in which sales personnel document shelf conditions on a handheld device, are slow, expensive and inconsistent, and the data they produce is subjective.

Computer Vision (CV) solutions are changing the way brands and retailers see the shelves. This technology provides manufacturers with insight into conditions on the retail shelf in the form of a digital image and gives guidance for field personnel in near real time on how to drive greater value.

But for this to work, you must prioritize high image recognition accuracy as the first line of differentiation. In this article, we will go behind the scenes to see what it takes to deliver accurate SKU-level recognition at speed and scale.

Why Accuracy Matters

A typical grocery store can carry over 30,000 SKUs, and on average 30% of them can change over the course of a year through new product introductions and packaging redesigns. Beyond the sheer volume of products, an automated recognition platform must also overcome other hurdles to accuracy in the retail environment, such as:

  • Nearly identical products
  • Hard-to-read or reflective packaging, and poor camera angles
  • Poor visual conditions like low light
  • Partially obstructed products
  • Changes in the product lifecycle like new design variants

Overlooking these hurdles could result in inaccurate data. For example, partially obstructed products can lead to an incorrect calculation of a brand’s share of shelf. When this unreliable data is used to drive sales decisions, the monetary losses can mount quickly.
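To make the share-of-shelf risk concrete, here is a minimal sketch of how missed detections skew the number. It assumes share of shelf is measured by counting facings per brand (it can also be measured by shelf length), and the brand names and counts are purely hypothetical.

```python
from collections import Counter

def share_of_shelf(facings):
    """Return each brand's fraction of all facings.

    `facings` is a list of brand names, one entry per facing.
    Assumption: share of shelf is measured by facing count.
    """
    counts = Counter(facings)
    total = sum(counts.values())
    return {brand: n / total for brand, n in counts.items()}

# Hypothetical shelf: 40 facings of brand A, 60 of brand B.
actual = ["A"] * 40 + ["B"] * 60

# Hypothetical recognition result: 10 of brand A's facings are
# partially obstructed and go undetected.
detected = ["A"] * 30 + ["B"] * 60

true_share = share_of_shelf(actual)["A"]        # 40 / 100
reported_share = share_of_shelf(detected)["A"]  # 30 / 90

print(f"True share of shelf for A:     {true_share:.1%}")
print(f"Reported share of shelf for A: {reported_share:.1%}")
```

In this made-up case, missing ten facings understates brand A's share by more than six percentage points, which is enough to change a sales decision.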

A Four-Pronged Approach to Enforce Accuracy

At Trax, these hurdles are overcome through a rigorous data quality enforcement approach built on the following measures:

  1. Huge training data repository

Computer Vision uses advanced neural networks and deep learning techniques for tasks such as object detection. These networks are loosely inspired by the human visual cortex, which means that, much like the human brain, the more examples they see, the better they learn.

With a sufficiently large image repository, advanced deep learning algorithms can learn to cope with challenging store conditions like poor lighting and obstructions. The most accurate systems are those that sense shelf context and quickly detect not just products, but also empty spaces on shelves.


  2. High-quality images

During capture, a blur prevention tool ensures that the image produced is sharp, while a snapshot tool alerts the user if the orientation is different from a previous image. To compensate for lack of light, the camera flash is automatically activated, and a built-in leveler tool helps the user hold the device at the right angle and distance from the shelf.
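As an illustration of how a sharpness check can work, the sketch below uses a common heuristic: the variance of the Laplacian, which tends to be low for blurry images and high for sharp ones. This is an assumption for illustration only, not a description of the blur prevention tool mentioned above. The function names and the threshold of 100.0 are hypothetical, and a real threshold would need tuning per camera and resolution.

```python
def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over a grayscale image.

    `gray` is a 2D list of pixel intensities with rows of equal length.
    Border pixels are skipped. Returns 0.0 for images smaller than 3x3.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

def is_sharp(gray, threshold=100.0):
    """Hypothetical check: treat the image as sharp above the threshold."""
    return laplacian_variance(gray) >= threshold

# A flat image has no edges; a checkerboard has strong edges everywhere.
flat = [[128] * 5 for _ in range(5)]
checker = [[255 if (x + y) % 2 else 0 for x in range(5)] for y in range(5)]

print(is_sharp(flat))     # False
print(is_sharp(checker))  # True
```

A capture app could run a check like this on each frame and prompt the user to retake the photo when it fails.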

  3. Multi-layered validation

An active learning engine automatically recognizes products from images. Image samples in which products are harder to recognize or distinguish go through an additional layer of validation in a process known as voting. Here, qualified domain experts validate the collected data to check that it accurately matches what is really in the image. This helps the system maintain high levels of ‘confidence’ in its recognition accuracy, even in the case of nearly identical products or new designs.
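One way to picture the voting step is as a simple routing rule: accept the model's answer when it is confident, and otherwise take the majority opinion of the expert validators. The sketch below is a hypothetical illustration of that idea, not Trax's actual pipeline. The function name, the 0.9 confidence threshold, the majority rule and the SKU labels are all assumptions.

```python
from collections import Counter

def resolve_label(model_label, model_confidence, expert_votes,
                  confidence_threshold=0.9, min_agreement=0.5):
    """Decide the final label for one detected product.

    Returns (label, source), where source is "auto", "voted" or
    "needs_review". All thresholds here are hypothetical.
    """
    # Confident predictions are accepted without human validation.
    if model_confidence >= confidence_threshold:
        return model_label, "auto"
    # Hard samples need expert votes before they can be resolved.
    if not expert_votes:
        return None, "needs_review"
    label, count = Counter(expert_votes).most_common(1)[0]
    # Accept the top label only if a strict majority of experts agree.
    if count / len(expert_votes) > min_agreement:
        return label, "voted"
    return None, "needs_review"

# Hypothetical SKU labels for two nearly identical products.
print(resolve_label("cola_330ml", 0.97, []))
print(resolve_label("cola_330ml", 0.60,
                    ["cola_zero_330ml", "cola_zero_330ml", "cola_330ml"]))
```

In an active learning setup, the labels resolved by voting would typically be fed back into the training data so that similar samples become easier to recognize over time.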



  4. Recognition accuracy measurement

Precision and recall are at the core of measuring state-of-the-art deep learning systems. High precision means that an algorithm returns substantially more relevant results than irrelevant ones. High recall means that it also returns most of the relevant results. Say there are 100 products on a shelf. An algorithm may recognize 80 of them, of which 77 are recognized accurately. This means that the precision of recognition is 77/80, or about 96%, while the recall is 77/100, or 77%.
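The shelf example above can be worked through in a few lines. This is a minimal sketch using the standard definitions of precision (correct recognitions divided by all recognitions) and recall (correct recognitions divided by all products actually on the shelf). The function and parameter names are illustrative.

```python
def precision_recall(correct, recognized, actual):
    """Compute precision and recall from simple counts.

    correct:    products recognized accurately
    recognized: all products the algorithm recognized
    actual:     all products really on the shelf
    """
    precision = correct / recognized if recognized else 0.0
    recall = correct / actual if actual else 0.0
    return precision, recall

# Numbers from the example: 100 on the shelf, 80 recognized, 77 correct.
precision, recall = precision_recall(correct=77, recognized=80, actual=100)

print(f"Precision: {precision:.2%}")  # 96.25%
print(f"Recall:    {recall:.2%}")     # 77.00%
```

Looking at both numbers matters: a system can score high on precision while still missing nearly a quarter of the products on the shelf.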

Trax ensures data transparency by providing users with web dashboards that outline recognition accuracy, confidence, unidentified products and other metrics.

Check out this video of the Trax Computer Vision platform in action to see how we convert images to insight accurately.
