Behind the Feature: Product Catalog

We shipped version 1 of our Product Catalog.

This was a feature that our product-engineering team did not want to do.


Managing vast amounts of data can be time-consuming, cumbersome, and expensive. On top of that, we have to use that data in a way that makes our customers' lives better rather than adding to their burden.

As a small team (11 at the time of ship), we really had to weigh the pros and cons of diving into a project like this.

In an effort to not do work, we asked ourselves: can we just throw this over the wall and let our POS partners handle it? (Hi friends!) We asked around and some said “Yeah, we’re dealing with this!” But others said “Nope, it’s not on our short-term roadmap”. 

Ultimately, we decided we HAD to do it.


The market demanded it.

The Cannabis Industry Has Demands!

For anyone saying that IF cannabis goes fully legal, any SaaS company can just enter the market and use its industry-agnostic tech solution to fulfill retailers’ needs: Good luck to you!

Cannabis is a distinct industry. Cannabis retailers have distinct, and sometimes intriguing, asks, such as: catalog every single cannabis brand and all of its products (with images and product data). We're talking *millions* of products.

The market demanded that we build this feature. Other players in the space have product catalogs that they've been working on for years, so customers expect that they won't need to upload images themselves; they can just plug into a data source. And we get it: sourcing and uploading images for thousands of products is a real pain point. Retailers are already operating on tight margins, so dedicating a resource to data management is out of the question for most.

It’s a hard sell for us to get someone to switch to a Dispense menu without this feature. Onboarding becomes a nightmare. Churn is inevitable.

Menu Data Problems

Another thing the market seems to demand is that we, as the software provider, deal with and/or clean up the retailer's junk data coming from the POS. We’re not mad at these retailers; oftentimes they inherited this data from other sources to begin with. It is what it is.

And honestly, we're happy to help where we can. But sometimes the data is not in a predictable format, so cleaning it up automatically is impossible without some human intervention.

Things we've seen 🙈:

  • The Product Name field also contains the Category, or Weight, or Brand, or all of them plus other fun surprises (like Price?)
  • The Brand Name is in the Product Name and the Brand Name field is empty.
  • Inconsistent data within the same store: half of the products come in with the Brand in the Product Name field, but the other half don't 🤷‍♀️
  • No delimiters, like a pipe symbol (dashes are unreliable, as some Brand Names contain them), that would let us determine with 100% accuracy what the different fields should be. For example, if the Product Name begins with the Brand Name, we can't assume that the Brand Name is just the first word, since some brand names are multiple words long. We also can't assume the Brand Name in the Product Name field is exactly the same as what we see in the Brand Name field (if present). For example: "Cascadia Gardens" is coming through as Brand, but "Cascadia" is showing in the Product Name.
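To make the brand-in-the-name problem concrete, here is a minimal Python sketch. It is not our production matching code: the brand list, the helper names, and the 0.8 similarity cutoff are all assumptions made up for illustration. It fuzzy-compares the leading words of a Product Name against a list of known brands using the standard library's difflib, so multi-word brands are handled, but a shortened brand still slips through.

```python
from difflib import SequenceMatcher

# Hypothetical list of known brands (illustration only).
KNOWN_BRANDS = ["Cascadia Gardens", "Ozone"]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def split_brand(product_name, brands=KNOWN_BRANDS, cutoff=0.8):
    """Try to peel a known brand off the front of a product name.

    Returns (brand, remaining_name). brand is None when no leading
    run of words is similar enough to any known brand.
    The 0.8 cutoff is an assumed value, not a tuned one.
    """
    words = product_name.split()
    best_brand, best_rest, best_score = None, product_name, 0.0
    # Check every leading run of words, longest first, and keep the
    # brand with the highest similarity at or above the cutoff.
    for n in range(len(words), 0, -1):
        prefix = " ".join(words[:n])
        for brand in brands:
            score = similarity(prefix, brand)
            if score >= cutoff and score > best_score:
                # Drop separator characters left at the front of the name.
                rest = " ".join(words[n:]).lstrip("-| ").strip()
                best_brand, best_rest, best_score = brand, rest, score
    return best_brand, best_rest
```

With these assumptions, "Cascadia Gardens Blue Dream 3.5g" splits cleanly into the brand and "Blue Dream 3.5g", but "Cascadia Blue Dream" comes back with no brand at all, because "Cascadia" alone is not similar enough to "Cascadia Gardens". Lowering the cutoff would catch it, at the cost of false positives between similarly named brands, which is why a human still has to look at some of these.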

Anyway, let's get out of the weeds. No pun intended.

What did we ship?

Version 1 of our product catalog contains over 100,000 unique products. It's a fraction of the 1M+ that others boast, but they have large teams who have been working on these for years. It's a start. And a solid one. We're scrappy and we'll get more data.

Originally, we wanted to do only automatic matching, mostly to keep scope in check and avoid building out manual matching. But cannabis naming conventions are super variable and can be wonky. There are funky names, brands that are similar, and lots of junk data (see above). So we set a high threshold for auto-matching and then added the ability to match manually.

Back into the weeds:

Auto Matching

To power our auto-matching, we used MongoDB’s Atlas Search, as it was already what we used for our product and order searching. We quickly learned that using a text-based search to perform our auto-matching was no easy feat.

When a product is added to our product catalog, we first run it through Atlas Search to get what we refer to as the “match score”. This match score represents the best-case scenario of a product match: an exact match on the product name, brand, and category. The match score is then saved to that product and is referenced any time a potential match is located.

Let’s take an example product from our catalog: 2:1 Grape Night Gummies - 10 Pack from the brand Ozone. For purposes of this example, we’ll say the match score returned by Atlas Search was 10. 

Now, assume one of our retail partners carries this product, but with a different naming convention. When this product syncs from the retail partner’s POS, the name syncs as Ozone - 2:1 Grape Night Gummies. This is the same product, but the product name now includes the brand and excludes the product count.

When searching for a match, our matching algorithm takes the product name, brand, and category and searches the existing products in our catalog. Atlas Search then returns a list of all potential matches, each with a score. This score represents how closely the potential match fits the search. In this example, we will say the score returned by Atlas is 8.
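For readers who have not used Atlas Search, here is a rough sketch of what a search like this could look like as a MongoDB aggregation pipeline. The shape of the query is an assumption for illustration only: the index name, the field names (name, brand, category), the result limit, and the choice of a compound query with should clauses are not taken from our actual implementation. The function only builds the pipeline, so it runs without a database connection.

```python
def build_match_pipeline(name, brand, category, index="default", limit=5):
    """Build an Atlas Search pipeline for candidate catalog matches.

    Field names, index name, and limit are hypothetical placeholders.
    """
    return [
        {
            "$search": {
                "index": index,
                # "should" clauses are optional: each one that matches
                # adds to the document's relevance score.
                "compound": {
                    "should": [
                        {"text": {"query": name, "path": "name"}},
                        {"text": {"query": brand, "path": "brand"}},
                        {"text": {"query": category, "path": "category"}},
                    ]
                },
            }
        },
        {"$limit": limit},
        {
            "$project": {
                "name": 1,
                "brand": 1,
                "category": 1,
                # Expose the relevance score Atlas Search computed.
                "score": {"$meta": "searchScore"},
            }
        },
    ]


# "Edibles" is an assumed category for the example product.
pipeline = build_match_pipeline("Ozone - 2:1 Grape Night Gummies", "Ozone", "Edibles")
```

With PyMongo, a pipeline like this would be passed to a collection's aggregate() method, and each returned document would carry its score in the score field.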

These scores are generated using a set of algorithms that analyze each word provided against the words in the search result. If you’re interested in learning more about how these score algorithms work, you can reference Apache Lucene’s scoring logic as it is what powers Atlas Search under the hood.

Once we have our search results, we compare the score returned from our search to the match score saved on the product from our catalog. In this example, we have a search score of 8 and a match score of 10. Currently, we require an accuracy of 85% to auto-match a product. The accuracy in this instance is only 80% (the search score of 8 divided by the match score of 10), so this product, despite being a true match, will not be automatically matched.
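The comparison above boils down to a few lines of arithmetic. Here is a minimal Python sketch of it; the function names are made up for this example, and treating the 85% threshold as inclusive (exactly 85% passes) is an assumption.

```python
AUTO_MATCH_THRESHOLD = 0.85  # the 85% accuracy required to auto-match


def match_accuracy(search_score: float, match_score: float) -> float:
    """Search score as a fraction of the product's saved best-case match score."""
    if match_score <= 0:
        raise ValueError("match_score must be positive")
    return search_score / match_score


def should_auto_match(search_score, match_score, threshold=AUTO_MATCH_THRESHOLD):
    """True when the accuracy meets the threshold (inclusive, by assumption)."""
    return match_accuracy(search_score, match_score) >= threshold


# The example from the text: search score 8 against a match score of 10.
accuracy = match_accuracy(8, 10)         # 0.8, i.e. 80%
auto_matched = should_auto_match(8, 10)  # False, so it falls to manual matching
```

Dividing by the saved match score is what makes the threshold comparable across products, since raw search scores vary from one product to the next.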

This is the limitation of relying on text-based search alone: it lacks the intelligence needed to determine that these are in fact the same product. We hope to continue to improve our matching logic in future versions of our product catalog.

Closing Thoughts

Overall, we're really happy with the v1 of this feature and we've seen great results onboarding a few accounts with it. Menus are looking nice 🔥.

To give credit where it's due: Kerri Crawford (see above) did the bulk of the development work with Tim Officer jumping in on some UI. The team did a fantastic job working through issues and testing everything (shoutout Kevin Bowman). And as predicted, we did end up needing to do a bit of manual dirty work to wrangle the data (shoutout Kevin Bowman and Lindsay Breckheimer).

What's next?

We're seeing how our auto-matching performs and gathering customer feedback. We hope to improve the matching logic in the future to make it 'smarter'.

We also know we'll need more data, so we're evaluating options. If you've got a plug, hit us up!

Chelsea Officer

Chelsea is the Director of Product at Dispense and has been with the company since the beginning. With a background in both D2C and B2B SaaS startups, Chelsea excels at designing and building scalable, user-friendly products. Connect with Chelsea on LinkedIn.
