Machine learning meets HS tariff code classification
At Zonos, our team of industry and software experts is always working on ways to simplify cross-border trade. One part of this is ensuring that any business moving products across borders has those items properly classified. Nearly every country uses the same system to describe the products you sell: the Harmonized System (HS). Having your products properly classified with HS or HTS codes helps ensure consistency when your packages are examined at the border.
There are many complexities when it comes to shipping products cross-border, but accurate, automated HS classification can serve as a critical tool to grow your business. Accurate product classification has been a tedious, expensive process in the past, and we are fixing that. Zonos has an industry-changing tool called Classify that assigns a Harmonized Code to product descriptions.
As both an industry specialist and software engineer, I can explain how this works from a unique perspective. Let’s break down how HS classification for cross-border trade meets machine learning.
If you’re selling a product called “Men’s Gold Toe Windsor Wool (12 Pairs) Black,” our system might assign it the code 6115.94. The customs officer inspecting items shipped into the country can look at the code, understand what is in the package, and move the item along without delay.
The first two digits, 61, tell the officer the item is knitted or crocheted apparel; 62 would indicate non-knitted apparel. The next two digits, 15, indicate these are socks. The last two digits, 94, reveal these are full-length socks.
It may seem obvious that a product with a name like “Men’s Gold Toe Windsor Wool (12 Pairs) Black” would fit the 6115.94 description, but many similar codes could have been chosen depending on the thread thickness or other considerations. Our classification tool needs to find the best available match between a code and a product.
This gets more complicated once we know where the item is being shipped, because each country adds its own digits. For example, the additional 00 on the end of 6115.94.00 tells a Canadian customs officer that this is a wool sock, and 6115.94.0000 adds even more detail, letting them know that the taxes due on the item are quite a bit higher than for a mere wool sock. The 6115.94.0000 description is identical to the 6115.94.00 description, yet those two extra 0s can increase the duty rate, and I couldn’t find an explanation of why. This is why proper classification is so important: it is easy to make a mistake when classifying products. Our tools must not only find the most correct classification but also understand how our decisions affect our customers.
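To make the digit groups concrete, here is a minimal Python sketch that splits a code into its levels. The names `chapter`, `heading`, and `subheading` are the standard HS terms for the two-, four-, and six-digit levels; the `HSCode` class and both functions are hypothetical helpers written for this illustration, not part of Classify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSCode:
    chapter: str     # first two digits, e.g. "61"
    heading: str     # first four digits, e.g. "6115"
    subheading: str  # first six digits, shared internationally
    national: str    # country-specific digits after the first six


def parse_hs_code(raw: str) -> HSCode:
    """Split a dotted or undotted code into its HS levels."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) < 6:
        raise ValueError(f"expected at least 6 digits, got {raw!r}")
    return HSCode(
        chapter=digits[:2],
        heading=digits[:4],
        subheading=digits[:6],
        national=digits[6:],
    )


def same_international_code(a: str, b: str) -> bool:
    """True when two codes agree on the shared six-digit part."""
    return parse_hs_code(a).subheading == parse_hs_code(b).subheading


code = parse_hs_code("6115.94.0000")
print(code.chapter, code.heading, code.subheading, code.national)
# prints: 61 6115 611594 0000
```

Comparing only the first six digits shows that 6115.94.00 and 6115.94.0000 describe the same product internationally, while the national digits are where the duty rate can change.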
Solutions to classification
Because there are so many different products and the rules change frequently, classification can get complicated. One solution is to have people classify products by hand, but humans, even experts, only get the code right about 7 times out of 10, and manual classification adds to your workload.
Another solution is to write a computer program with all of the rules. Industry experts classify products and software engineers add the rules the experts follow. A system like that can get the correct code about 9 times out of 10. I know this because our team at Zonos has done this. The rules we create pay attention to the following:
- The way a product is described
- What image is used to advertise the product
- Who is selling the product
- The cost of the product
Paying attention to all of these factors together is what lets a rule-based system reach that level of accuracy.
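A rule-based approach like the one described above can be sketched in a few lines of Python. Everything here is hypothetical: the `Product` fields, the two rules, and the `TOY_PLACEHOLDER` value (which is not a real HS code) are assumptions made for illustration, and image-based rules are left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Product:
    description: str
    seller_category: str  # hypothetical field, e.g. "apparel" or "toys"
    price: float


# A rule returns a code when it matches, or None when it does not apply.
Rule = Callable[[Product], Optional[str]]

TOY_PLACEHOLDER = "TOY-PLACEHOLDER"  # stand-in label, not a real HS code


def wool_socks(p: Product) -> Optional[str]:
    text = p.description.lower()
    if "sock" in text and "wool" in text:
        return "6115.94"
    return None


def cheap_truck_is_a_toy(p: Product) -> Optional[str]:
    # Combines description, seller, and price, as in the list above.
    text = p.description.lower()
    if "truck" in text and p.price < 100 and p.seller_category == "toys":
        return TOY_PLACEHOLDER
    return None


RULES: List[Rule] = [wool_socks, cheap_truck_is_a_toy]


def classify(p: Product) -> Optional[str]:
    """Return the first matching rule's code, or None if no rule applies."""
    for rule in RULES:
        code = rule(p)
        if code is not None:
            return code
    return None


print(classify(Product("Men's wool dress socks, black", "apparel", 19.99)))
```

Each new edge case means another hand-written rule, which is the maintenance cost that grows as the product catalog does.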
Getting 9 out of 10 classifications right is impressive, but we’re not done yet. That one misclassified item is valuable to us because we use it as data to identify difficult patterns and improve our accuracy. For example, I recently saw a product in an online store described as socks that “look good in a tailored suit and would be the perfect complement to an expensive pair of dress shoes,” which could be hard to classify. Humans know the product is socks, but the description could be about suits or shoes. To prevent the computer from getting confused, the rules get more and more complicated. You can imagine this gets even harder when only a brand name is used, a word is misspelled, or the picture for a product includes several items.
With all of these complicating factors, we use machine learning to get the classification right at least 99 times out of 100. With machine learning, we stop telling the computer every little thing that might be important and ask it to learn what is important for itself. We still have experts and rules, but the computer has a much bigger job and can often get much better results than humans.
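As a rough illustration of learning from examples instead of hand-written rules, here is a tiny naive Bayes text classifier in plain Python. This is a textbook technique chosen for brevity, not a description of the model Zonos uses, and the six training descriptions and their labels are invented for the example.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> List[str]:
    return [w for w in text.lower().split() if w.isalpha()]


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.label_counts: Counter = Counter()
        self.vocab: Set[str] = set()

    def fit(self, examples: List[Tuple[str, str]]) -> None:
        for text, label in examples:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text: str) -> str:
        total = sum(self.label_counts.values())
        words = [w for w in tokenize(text) if w in self.vocab]  # skip unseen words
        best_label, best_score = "", -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data: (description, product category).
TRAINING = [
    ("wool dress socks black", "socks"),
    ("cotton ankle socks white", "socks"),
    ("tailored wool suit navy", "suits"),
    ("slim fit suit jacket", "suits"),
    ("leather dress shoes brown", "shoes"),
    ("running shoes lightweight", "shoes"),
]

model = NaiveBayes()
model.fit(TRAINING)
print(model.predict("black wool socks"))  # prints: socks
```

Nobody told the model that "wool" matters less than "socks"; it worked that out from the counts. With only six examples it is easy to fool, which is why the volume and quality of training data matter so much.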
Machine learning: Recognizing patterns
One way we ask the computer to solve this problem for us is to give it the product images and ask the computer to find patterns in those images. Once it’s ready, the computer can identify differences that humans might miss.
Here is an example using auto headlights. Given thousands of pictures of auto headlights, I asked the computer to organize them by how different the images are from each other. I didn’t tell the computer anything about the images. When it did this, the computer decided there were three or four different kinds of images commonly used to sell an auto headlight. Examples of each type of product image look like this:
In these images, there are two shapes of auto headlights and one that is on a car. If I look at other images in each group, I see that each group is filled with images that are similar. If auto part stores started selling auto headlights with pictures of the box, machine learning tools would notice and adapt automatically. I use this information for thousands of different product categories across millions of products.
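Grouping images without labels is a clustering problem. The sketch below uses k-means, a common clustering technique that I am assuming here for illustration; the article does not say which algorithm was actually used. The two-number points are made-up stand-ins for the feature vectors a real system would extract from each image, and unlike the experiment described above, this version needs the number of groups `k` up front.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points: List[Point], k: int) -> List[Point]:
    """Farthest-point start: deterministic, so results are repeatable."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids


def kmeans(points: List[Point], k: int, iterations: int = 20) -> List[int]:
    """Return a cluster index for each point."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda i: sq_dist(p, centroids[i])) for p in points]
        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


# Made-up "image features": three loose groups of three points each.
FEATURES: List[Point] = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
    (0.1, 5.0), (0.0, 4.8), (0.2, 5.1),
]

labels = kmeans(FEATURES, k=3)
print(labels)
```

Points that land in the same cluster play the role of the similar-looking headlight photos: the algorithm never sees a label, only how far apart the feature vectors are.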
Often the machine learning tools notice patterns humans miss. For example, a motorcycle headlight looks slightly different from an auto headlight, a beauty product is often sold with a woman holding the item near her face, and an expensive item of clothing is often placed on a model while a cheaper item is often placed on a mannequin or just folded on a table. These patterns seem obvious once I notice them, but they are actually quite easy to miss.
The machines can learn common sense as well. If a store focuses on health and beauty products, we know the flowers and grapes that are in some of the product images aren’t being sold; they’re just decorative. Or if a semi-truck is being sold for $24.99, it’s probably a toy truck or a t-shirt. These common sources of confusion are treated by the machine more or less the same as you or I would treat them. The more the system is used, the more it improves.
We can teach the machines to learn from their mistakes. After a classifier is confused by the grapes or the flowers in an image the first time, it can be taught to ignore these irrelevant, decorative items and focus on the tub of beauty cream in the center of the image. This comes up often when something is new. We can always bring in a human classifier to look at difficult or new problems.
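One common way to combine a model with human experts is to send low-confidence predictions to a review queue and keep the expert's answers. The sketch below assumes the model reports a confidence score between 0 and 1; the `ReviewQueue` class, the 0.9 threshold, and the product IDs are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReviewQueue:
    threshold: float = 0.9  # hypothetical cut-off; tune on real data
    pending: List[str] = field(default_factory=list)
    corrections: Dict[str, str] = field(default_factory=dict)

    def route(self, product_id: str, code: str, confidence: float) -> str:
        """Accept confident predictions; queue the rest for an expert."""
        if product_id in self.corrections:
            return self.corrections[product_id]  # an expert's answer wins
        if confidence >= self.threshold:
            return code
        if product_id not in self.pending:
            self.pending.append(product_id)
        return "NEEDS_REVIEW"

    def record_correction(self, product_id: str, code: str) -> None:
        """Store the expert's answer for reuse and later retraining."""
        self.corrections[product_id] = code
        if product_id in self.pending:
            self.pending.remove(product_id)


queue = ReviewQueue()
print(queue.route("sku-1", "6115.94", 0.97))  # prints: 6115.94
print(queue.route("sku-2", "6115.94", 0.55))  # prints: NEEDS_REVIEW
queue.record_correction("sku-2", "6115.94")
print(queue.route("sku-2", "6115.94", 0.55))  # prints: 6115.94
```

The stored corrections double as new training examples, which is one way a system can keep improving the more it is used.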
Having the computer look for patterns and an expert review those patterns is a great way to get more products shipped with less trouble. At the end of the day, this creates confidence. Sending a product across borders shouldn’t require you to know everything that goes into the effort. Knowing that the work has been done, you can trust our classification system as an important tool to grow your business.