Cellr Vuforia - Inmost

Reliable and scalable web & mobile solutions

15 min read

Feb 9, 2023

Briefly about the application

Inmost and CELLR aspired to create a proprietary application for producers and consumers of wines where they could easily add new wines without relying on a competitor's licensed product. Consequently, the CELLR app was developed as a mobile and web version. The application was created for producers and consumers of wine who would participate as members of a more extensive wine community.

Inmost has developed both a mobile and web version of the CELLR app.

It allows you to get accurate information about the desired wine: Producer, Vintage, Country of release, Varietal, market pricing, and consumer reviews.

The app bonds a community where members can share feedback and trade. Inmost and CELLR have developed a trading system where users can make transactions conveniently and safely with wine authentication.

The user can create his collection, store it in a user's cellar, and keep their records, noting when the wine was bought or drunk.

You can also import your inventory from other services or export it as a backup. Among other features, we have implemented convenient filtering and searching by Producer, Varietal, Country, Region, Appellation, Vintage, and Price.

One of the advantageous features of the application is the search for wine by label photo.

Check this link for a more detailed description: https://inmost.pro/cases/peer-to-peer-trading-platform-that-directly-connects-wine-enthusiasts-and-cellar-owners.

Wine search by label photo

The user needs to press the search by photo button, point the camera at the label, and take a photo. The application will find the wine and show complete information about it. If there are multiple matching results, they will be sorted in descending order of relevance.

During the development of this functionality, we considered two possible concepts:

Concept 1. We could recognize the text in the photo of the label and perform a full-text search in a PostgreSQL database.
Concept 2. We could use one of the Machine Learning technologies like the Vuforia Engine.

How we evaluated recognition algorithms

To assess the quality of the recognition algorithm, we used a statistical method (check https://en.wikipedia.org/wiki/Type_I_and_type_II_errors).

We created a test set of 73 photos from the most popular manufacturers. We included an equal number of photos of labels with no text, medium, and a lot of text in the test set.

Next, we made up a JSON file that contained metadata for each label and a link to a photo in an AWS S3 bucket.

For automatic testing, we needed a script that would iterate through all the photos in a folder and make requests to our Backend server. The server will use two search algorithms. In the first case, PostgreSQL full-text search. In the second case, it will use the Vuforia Engine. After each server response, the script will complete the JSON file with search results. At the end of the script, the script will count the number of correct and incorrect search results for the label photo.

The server can return one of three search results:

the server replied that the wine was found, and its result matched the correct answer;
the error of the first type: the server replied that the wine was not found, but it was in our database;
the error of the second type: the server replied that the wine was found but made a mistake and specified another wine.

With the help of the script, we could test each of the two algorithms in just a few minutes and get the number of correct answers and the number of errors of the first and second types for each. By changing algorithm configuration parameters, such as weights (explanation below) for full-text search and photo quality for Vuforia Engine, we evaluated the algorithm again and again. In the end, we found the optimal configuration with the maximum number of correct answers and the minimum number of errors.

PostgreSQL Full Text Search Algorithm

We implemented this functionality on the server to test the first concept. The user takes a photo of the label in the mobile application and sends a request to the server. Next, the server will convert the image to Base64 encoding and access the Google Cloud Vision service (Cloud Vision API) to extract text from the photo. The response comes in the form of a string (line) of text that contains all the text that could be recognized on the label. The server uses this text for full-text searches in the PostgreSQL database. See details about full-text searches here:

https://www.postgresql.org/docs/current/textsearch-intro.html.

Let's imagine that information about wine is contained in a table, and each column contains the name of the wine, the name of the manufacturer, the grape variety, the year of release, the country, the region, and the capacity of the bottle. For full-text search in a PostgreSQL database, we will create a text vector and assign a weight to each column. Possible values of the weight coefficients are D=0.1, C=0.2, B=0.4, A=1.

Empirically, we have selected the best weights for our sample: wine_producer=1.0; wine_name=1.0; country, region, subregion=0.4; variable=0.2; bottle_size=0.1.

After evaluating this algorithm, we obtained the following statistics:

Table A.1 - Number of errors using PostgreSQL full-text search

Item	Total
Number of images	73
Correct results	51
Type 1 errors	16
Type 2 errors	6

If this algorithm is slightly complicated, it can be crucially improved. We noticed that the critical search parameter is the wine producer. If the manufacturer is found correctly, then the search by other parameters can be significantly narrowed, and the number of manufacturers is small. If the manufacturer were determined incorrectly, the search by other parameters would give a deliberately false result, which we can see by the number of type 2 errors.

The improved algorithm should search for two iterations. First, find the manufacturer, and only in the case when it was found, correctly perform the second iteration of the search among the list of wines from this manufacturer.

However, we did not complicate the search algorithm and decided to test the second concept using the Vuforia Engine machine learning model.

Algorithm using Vuforia Engine

Information about Vuforia Engine you can view here: https://library.vuforia.com/objects/image-targets.

We created a Vuforia account and used a ready-made test set of photos. Vuforia allows you to easily integrate your engine into the Unity platform using the SDK (check https://library.vuforia.com/getting-started/getting-started-vuforia-engine-unity).

We made up a Unity project, connected the SDK, and used the webcam. We loaded target photos of labels using varying quality into Vuforia Engine and showed actual bottles with labels in front of a webcam. We found that the target photo should have good contrast and should be taken in good lighting. The same photograph of the label in different quality gives very different results.

Vuforia assigns a conditional rating to the photo (in points from 0 to 5):

Pictures with a rating of 2 points and below could be recognized better. It takes a long time to show the label to the camera before it can be recognized. Photos with this rating are not suitable for the application;
Photos with a rating of 3 points were recognized well;
Photos with a rating of 4 and 5 points were recognized ideally as soon as the label hit the camera lens;
Contrast photos performed much better, and monochrome photos performed significantly worse;
Recognition did not work well if there were spots or dirt on the label, even small ones;
If the photo is bent in front of the camera and rotated at different angles, it is recognized well regardless.

We used the Vuforia REST API for automated testing.

(Check https://library.vuforia.com/articles/Solution/How-To-Use-the-Vuforia-Web-Services-API.html).

You need to send a POST request to the Vuforia service and add a Base64-encoded photo to the request body. Vuforia would return an answer if it was found among the target photos.

Here are some of the examples we’ve tested:

1.jpg => found: {'target_timestamp': 1614817446, 'name': 'Constant_Diamond_Mountain_Vineyard_Cabernet_Franc'}

2.jpg => found: {'target_timestamp': 1614817446, 'name': 'Constant_Diamond_Mountain_Vineyard_Cabernet_Franc}

3.jpg => found: {'target_timestamp': 1614817446, 'name': 'Constant_Diamond_Mountain_Vineyard_Cabernet_Franc}

4.jpg => found: {'target_timestamp': 1614817446, 'name': 'Constant_Diamond_Mountain_Vineyard_Cabernet_Franc}

Photo of a wine label without text:

1.jpg => found: {'target_timestamp': 1615511800, 'name': 'Orin_Swift_Blank_Stare}

2.jpg => not found

3.jpg => found: {'target_timestamp': 1615511800, 'name': 'Orin_Swift_Blank_Stare}

4.jpg => found: {'target_timestamp': 1615511800, 'name': 'Orin_Swift_Blank_Stare'}

5.jpg => found: {'target_timestamp': 1615511800, 'name': 'Orin_Swift_Blank_Stare'}

5.png => found: {'target_timestamp': 1615511800, 'name': 'Orin_Swift_Blank_Stare'}

6.jpg => found: {'target_timestamp': 1615511800, 'name': 'Orin_Swift_Blank_Stare'}

8.jpg => found: {'target_timestamp': 1615511800, 'name': 'Orin_Swift_Blank_Stare'}

1.jpg => not found

2.jpg => found: {'target_timestamp': 1615514547, 'name': 'Morlet_Family_Vineyards'}

3.jpg => found: {'target_timestamp': 1615514547, 'name': 'Morlet_Family_Vineyards'}

4.jpg => found: {'target_timestamp': 1615514547, 'name': 'Morlet_Family_Vineyards'}

5.jpg => found: {'target_timestamp': 1615514547, 'name': 'Morlet_Family_Vineyards'}

6.jpg => found: {'target_timestamp': 1615514547, 'name': 'Morlet_Family_Vineyards'}

7.jpg => found: {'target_timestamp': 1615514547, 'name': 'Morlet_Family_Vineyards'}

8.jpg => found: {'target_timestamp': 1615514547, 'name': 'Morlet_Family_Vineyards'}

1.jpg => found: {'target_timestamp': 1615516540, 'name': 'Chateau_Lascombes_Chevalier_de_lascombes'}

2.jpg => found: {'target_timestamp': 1615516540, 'name': 'Chateau_Lascombes_Chevalier_de_lascombes'}

3.jpg => found: {'target_timestamp': 1615516540, 'name': 'Chateau_Lascombes_Chevalier_de_lascombes'}

4.jpg => found: {'target_timestamp': 1615516540, 'name': 'Chateau_Lascombes_Chevalier_de_lascombes'}

5.jpg => found: {'target_timestamp': 1615516540, 'name': 'Chateau_Lascombes_Chevalier_de_lascombes'}

6.jpg => found: {'target_timestamp': 1615516540, 'name': 'Chateau_Lascombes_Chevalier_de_lascombes'}

7.jpg => found: {'target_timestamp': 1615516540, 'name': 'Chateau_Lascombes_Chevalier_de_lascombes'}

8.JPG => found: {'target_timestamp': 1615516540, 'name': 'Chateau_Lascombes_Chevalier_de_lascombes'}

We’ve evaluated the quality of the search algorithm for a test set of photos:

Table A.2 - Number of errors using Vuforia Engine

Item	Total
Number of images	73
Correct results	61
Type 1 errors	12
Type 2 errors	0

We got fewer errors compared to PostgreSQL full-text searches, and they were all type 1 errors, which means we need to improve the quality of the photos. We need photos that Vuforia Engine will assign a rating of 4 to 5. It won’t be easy to find 1-2 million photos of labels in good quality. In one of the following sections, we will describe how we solved this problem.

Integrating Vuforia Engine into a Mobile App

At the development stage, we only opened this functionality to beta users. Now our recognition algorithm looks like this:

We need to have a sufficient database of label photos before many of the search queries are found. In addition, the user can send photos that do not belong to the labels, shooting foreign objects on the camera. When a user makes a photo search request, the wine may or may not be found. We have created two buckets of AWS S3, where we will store separately found and not found photos. We also created two tables: vuforia_noreco_images and vuforia_reco_images. If a label photo is found, we store it in the reco bucket and add an entry to the vuforia_reco_images table. If the photo is not found, we save it to the noreco folder and add an entry to the vuforia_noreco_images table.

Before enabling this search functionality in the Production version of the application, we need to fill our database with label photos. We need about 1-2 million decent-quality photos. This was the next task before integrating it into the mobile app.

Obtaining a database of label photos

To obtain our database of label photos, we contacted the COLA service (check https://ttbonline.gov/colasonline/publicSearchColasBasic.do).

It is designed to certify products and contains over 2 million wine certificates. Each certificate contains the metadata we need: name, manufacturer, year of issue, the volume of the bottle, and most importantly, a good-quality photo of the label.

We have written a separate service that will download these certificates, parse their content, save the label photos to an AWS S3 bucket, and save the metadata to our database. The COLA service has its classification of product codes. The numbers we are interested in are in the range of 80-89C.

We created a folder in the AWS S3 bucket for each wine code.

After processing each certificate, we will create a separate folder with the name of the wine, in which we will save the JSON file with metadata and JPEG photos of the labels. We will need to adapt this data to our wine model before adding it to our database and then upload the photos to Vuforia as Target Images.

To avoid loading the COLA service with requests, we used the node-cron library (check https://www.npmjs.com/package/node-cron). With its help, our service started once an hour and made 300 requests. Thus, we uploaded 7200 new certificates per day with metadata and photos of labels without making a big load on the COLA service.

We launched two AWS EC2 instances, and each downloaded and parsed certificates. Since the COLA service products are divided by category codes, we quickly divided this work between two servers, dividing the categories equally.

Consequently, each of the two AWS EC2 instances downloaded 300 certificates per hour and parsed them after that. It took us 4.5 months to download all the certificates we needed with label photos - about 2.1 million.

Request high-level consultation in one click