Build a visual search app with TensorFlow in less than 24 hours
For a while now, I had wanted to work on a machine learning project, especially since Apple now lets you import trained models into an iOS app. Last September, I took part in a 24h hackathon for an e-commerce business; that was my chance to test it. The idea was simple: a visual search app, listing similar products based on a picture.
Dataset and model
To recommend products based on a picture, the first step is to train a model to classify those pictures and surface similar products. Best practice is to train the model against a dataset of products. Luckily, I was given a fashion catalog for the event, but if you need one, Zalando has already published one, which helped me a lot in structuring the training process.
Since it was a 24h challenge, instead of training (or retraining) a model, I used a pre-trained model, InceptionV3, which is well known for image classification. The rest of our stack was Python, with TensorFlow as the machine learning framework, Keras as the interface to TensorFlow, and AWS SageMaker to run our tests.
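For reference, getting the pre-trained model through Keras takes only a couple of lines (a sketch of the idea, not our exact hackathon code):

```python
# A sketch of pulling the pre-trained model through Keras;
# weights="imagenet" downloads the publicly available trained weights.
from tensorflow.keras.applications.inception_v3 import InceptionV3

model = InceptionV3(weights="imagenet")
```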
Prediction
To speed the process up even more for this hackathon, and because I already had a dataset of fashion products, my team and I preferred to store the prediction of each image in our dataset, ready to be compared. It's the equivalent of storing the hash of a string to compare strings: we would find the nearest predictions, and therefore the matching products too.
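Concretely, that can look like cutting off InceptionV3's classification head and storing the pooled feature vector of every catalog image. A minimal sketch, with a hypothetical catalog structure and file paths:

```python
# Minimal sketch: precompute a feature vector ("prediction") for every
# catalog image with InceptionV3 minus its classification head.
# The catalog structure and paths below are hypothetical.
import numpy as np
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image

# include_top=False + pooling="avg" yields one 2048-d vector per image
model = InceptionV3(weights="imagenet", include_top=False, pooling="avg")

def embed(path):
    img = image.load_img(path, target_size=(299, 299))  # InceptionV3 input size
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return model.predict(x)[0]

# One stored vector per product, ready to be compared later
catalog = {"product_001": "images/product_001.jpg"}  # hypothetical ids/paths
embeddings = {pid: embed(path) for pid, path in catalog.items()}
np.save("embeddings.npy", embeddings)  # reload later with allow_pickle=True
```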
Finally, for any image to analyse, we would run the prediction and find the closest stored one. Since our fashion dataset matched a catalog, we could show the matching product, ready to buy. We exposed that service as a Python API so it could be consumed from mobile and web apps.
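The matching step itself can be as small as a similarity ranking over those stored vectors. A sketch, with illustrative names (the actual API layer and storage are not shown):

```python
# Sketch of the matching step: rank the stored vectors by cosine
# similarity to the query image's vector and keep the top products.
# Function and variable names are illustrative.
import numpy as np

def nearest_products(query_vec, embeddings, k=5):
    """Return the ids of the k catalog products most similar to the query."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(embeddings, key=lambda pid: cosine(query_vec, embeddings[pid]),
                    reverse=True)
    return ranked[:k]

# Usage: with `embed` and `embeddings` from the previous sketch,
# nearest_products(embed("query.jpg"), embeddings) gives the product ids
# the API would return to the mobile and web apps.
```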
Conclusion
Starting at 11am, we built this prototype in only 14 hours of coding. That included the machine learning layer, an API to consume it, and an iOS app to upload a picture and list products from the result.
“Did it work?”
Yes. For any picture of shoes, we got more shoes back, although we got all kinds of tops for t-shirt pictures.
“Was it accurate?”
Not enough. The more time you invest in training the model and classifying the images, the more accuracy you'll get.
With the coding cut-off the next morning, we knew we couldn't improve the accuracy much overnight. With more time (weeks or months), we could have improved it; I think the best option would be to retrain the model on the whole product catalog.
We could push the idea even further and train it per brand to be even more accurate. We could also run recognition on each fashion item in a picture and process each object to get the whole look recommended.
On the other hand, because I'm a mobile software engineer, it makes sense to me to embed the model in the mobile app itself. It would be faster to compute a result, since there is no upload, but also more private: nobody could trace what is actually in the analysed image.
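As a sketch of that direction, coremltools can convert a Keras model into a Core ML package that ships inside the app; the call below follows the current coremltools API and was not something we did during the hackathon:

```python
# Hypothetical sketch: converting a Keras model to Core ML so it can run
# on-device. Uses coremltools (ct.convert is the coremltools 4+ API).
import coremltools as ct
from tensorflow.keras.applications.inception_v3 import InceptionV3

keras_model = InceptionV3(weights="imagenet", include_top=False, pooling="avg")
mlmodel = ct.convert(keras_model, convert_to="mlprogram")
mlmodel.save("VisualSearch.mlpackage")  # drop the package into the Xcode project
```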
We didn’t get the jury award, but we won the audience prize; everybody loved it, which made it worth it. For me, once again, the hackathon was a great experience to meet talented people and learn so much from them.
Thanks for reading