We shipped version 1 of our Product Catalog.
This was a feature that our product-engineering team did not want to do.
Managing vast amounts of data can be time-consuming, cumbersome, and expensive. And not only that, we then have to use that data in a way that makes our customers' lives better instead of putting more of a burden on them.
As a small team (11 at the time of ship), we really had to weigh the pros and cons of diving into a project like this.
In an effort to not do work, we asked ourselves: can we just throw this over the wall and let our POS partners handle it? (Hi friends!) We asked around and some said “Yeah, we’re dealing with this!” But others said “Nope, it’s not on our short-term roadmap”.
Ultimately, we decided we HAD to do it.
The market demanded it.
For anyone saying that IF cannabis goes fully legal, any SaaS company can just enter the market and use their industry-ambiguous tech solution to fulfill retailers’ needs: Good luck to you!
Cannabis is a distinct industry. Cannabis retailers have distinct, and sometimes intriguing, asks... such as: catalog every single cannabis brand and their products (with images and product data). We're talking *millions* of products.
The market demanded that we build this feature. Other players in the space have product catalogs that they've been working on for years, so customers expect that they won't need to upload images themselves; they can just plug into a data source for it. And we get it: sourcing and uploading images for thousands of products is a real pain point. Retailers are already operating on tight margins, so dedicating a resource to data management is out of the question for most.
It’s a hard sell for us to get someone to switch to a Dispense menu without this feature. Onboarding becomes a nightmare. Churn is inevitable.
Another thing the market seems to demand is that we, as the software provider, deal with and/or clean up the retailer's junk data coming from the POS. We’re not mad at these retailers. Oftentimes they are inheriting this data from other sources to begin with. It is what it is.
And honestly, we're happy to help where we can. But sometimes the data is not in a predictable format, so doing this automatically is impossible without some human intervention.
Things we've seen 🙈:
Anyway, let's get out of the weeds. No pun intended.
Version 1 of our product catalog contains over 100,000 unique products. It's a fraction of the 1M+ that others boast, but they have large teams who have been working on these for years. It's a start. And a solid one. We're scrappy and we'll get more data.
Originally, we wanted to do only automatic matching, mostly to keep scope in check and avoid building out manual matching. But cannabis naming conventions are super variable and can be wonky. There are funky names, brands that are similar, and lots of junk data (see above). So we set a high threshold for auto-matching and then added the ability to manually match.
Back into the weeds:
To power our auto-matching, we used MongoDB’s Atlas Search, as it was already what we used for product and order search. We quickly learned that using a text-based search to perform auto-matching was no easy feat.
When a product is added to our product catalog, we first run it through Atlas Search to get what we refer to as the “match score”. This match score represents the best-case scenario of a product match: an exact match on the product name, brand, and category. The match score is then saved to that product and is referenced any time a potential match is located.
Let’s take an example product from our catalog: 2:1 Grape Night Gummies - 10 Pack from the brand Ozone. For purposes of this example, we’ll say the match score returned by Atlas Search was 10.
Now, assume one of our retail partners carries this product, but with a different naming convention. When this product syncs from the retail partner’s POS, the name syncs as Ozone - 2:1 Grape Night Gummies. This is the same product, but the product name now includes the brand and excludes the product count.
When searching for a match, our matching algorithm takes the product name, brand, and category and searches existing products in our catalog. Atlas Search then returns a list of all potential matches, each with a score. This score represents how closely the potential match lines up with the search terms. In this example, we will say the score returned by Atlas is 8.
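For a rough idea of what that search could look like, here is a minimal sketch of an Atlas Search aggregation pipeline built in Python. The index name (`catalog_search`), the field names (`name`, `brand`, `category`, `matchScore`), the `Edibles` category value, and the helper `build_match_pipeline` are all assumptions for illustration; this is not our production query.

```python
from typing import Any


def build_match_pipeline(
    name: str, brand: str, category: str, limit: int = 5
) -> list[dict[str, Any]]:
    """Build a pipeline that searches the catalog and returns each hit with its score.

    Hypothetical sketch: index and field names are assumed, not a real schema.
    """
    return [
        {
            "$search": {
                "index": "catalog_search",  # assumed index name
                "compound": {
                    # Each clause that matches contributes to the relevance score.
                    "should": [
                        {"text": {"query": name, "path": "name"}},
                        {"text": {"query": brand, "path": "brand"}},
                        {"text": {"query": category, "path": "category"}},
                    ]
                },
            }
        },
        {"$limit": limit},
        {
            "$project": {
                "name": 1,
                "brand": 1,
                "category": 1,
                "matchScore": 1,  # assumed field holding the saved match score
                "score": {"$meta": "searchScore"},  # score Atlas Search assigned to this hit
            }
        },
    ]


# The POS-synced product from the example above ("Edibles" is an assumed category).
pipeline = build_match_pipeline("Ozone - 2:1 Grape Night Gummies", "Ozone", "Edibles")
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)`, and each returned document would carry both the search score and the saved match score.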
These scores are generated using a set of algorithms that analyze each word provided against the words in the search result. If you’re interested in learning more about how these score algorithms work, you can reference Apache Lucene’s scoring logic as it is what powers Atlas Search under the hood.
Once we have our search results, we compare the score returned from our search to the match score saved on the product from our catalog. In this example, we have a search score of 8 and a match score of 10. Currently, we require an accuracy of 85% to auto-match a product. The accuracy in this instance is only 80% (score of 8 divided by the match score of 10), so this product, while being a match, will not be automatically matched.
This is the limitation of relying on a text-based search alone: it doesn't have the intelligence needed to determine that these are in fact a match. We hope to continue improving our matching logic in future versions of our product catalog.
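The comparison itself is simple arithmetic. Here is a small sketch of it in Python; the function names are hypothetical, and treating exactly 85% as a pass is an assumption on our part for this example.

```python
AUTO_MATCH_THRESHOLD = 0.85  # the 85% accuracy requirement


def match_accuracy(search_score: float, match_score: float) -> float:
    """Return the search score as a fraction of the saved best-case match score."""
    if match_score <= 0:
        raise ValueError("match_score must be positive")
    return search_score / match_score


def should_auto_match(
    search_score: float,
    match_score: float,
    threshold: float = AUTO_MATCH_THRESHOLD,
) -> bool:
    """Auto-match only when accuracy meets the threshold (assumed inclusive)."""
    return match_accuracy(search_score, match_score) >= threshold


accuracy = match_accuracy(8, 10)
print(f"{accuracy:.0%}")         # 80%
print(should_auto_match(8, 10))  # False, so this one falls to manual matching
```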
Overall, we're really happy with the v1 of this feature and we've seen great results onboarding a few accounts with it. Menus are looking nice 🔥.
To give credit where it's due: Kerri Crawford (see above) did the bulk of the development work with Tim Officer jumping in on some UI. The team did a fantastic job working through issues and testing everything (shoutout Kevin Bowman). And as predicted, we did end up needing to do a bit of manual dirty work to wrangle the data (shoutout Kevin Bowman and Lindsay Breckheimer).
We're seeing how our auto-matching performs and gathering customer feedback. We hope to improve the matching logic in the future to make it 'smarter'.
We also know we'll need more data, so we're evaluating options. If you've got a plug, hit us up!