Automated classification of product masters using generative AI: How to bring a new perspective to your analysis

In the distribution and retail industries, product masters that centrally manage product information play an important role. However, JAN codes were designed with logistics and inventory as the top priority, and their granularity is too fine for marketing and multifaceted analysis. As a result, essentially identical products can end up being treated as different data.
In this article, we explain how automatic classification with generative AI works, the benefits of introducing it, and the new perspectives it brings to product master management.

Analysis challenges in product master management

If traditional JAN codes, which were designed for logistics and inventory management, are used for analysis as is, the difference in granularity of product information and the fixed data structure become bottlenecks.

Inconsistency between the granularity of the JAN code and the axis you want to analyze

The rules for assigning JAN codes take into consideration product distribution channels and packaging changes. However, from a marketing perspective, products in the same category may have multiple codes, which can cause confusion when compiling data.

For example, new JAN codes may be assigned when making small changes to the package design or when switching production lots. When the number of codes increases with each small change, it can appear as if multiple identical products exist, making centralized management difficult.

While there are cases where items should be managed separately for inventory and distribution purposes, there are also many cases where they should be treated as the same item in marketing analysis. Items that are essentially the same may end up being tallied separately because they have different codes. In sales analysis and inventory management, it is easy to miss codes when summing up or to lack unified classification standards, resulting in scattered data.
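The roll-up problem described above can be illustrated with a minimal sketch. It assumes a hypothetical mapping from JAN codes to an analysis-level product key; the codes, keys, and quantities are made up for illustration and do not come from any real master.

```python
from collections import defaultdict

# Hypothetical mapping from JAN code to an analysis-level product key.
# The codes and keys below are made-up illustrations, not real products.
JAN_TO_PRODUCT = {
    "4900000000011": "green_tea_500ml",
    "4900000000028": "green_tea_500ml",  # same product, renewed package
    "4900000000035": "barley_tea_500ml",
}


def aggregate_sales(rows, jan_to_product):
    """Sum sales quantity per analysis-level product key.

    rows: iterable of (jan_code, quantity) tuples.
    Codes missing from the mapping are kept under their own JAN code
    so that nothing is silently dropped from the totals.
    """
    totals = defaultdict(int)
    for jan, qty in rows:
        totals[jan_to_product.get(jan, jan)] += qty
    return dict(totals)


sales = [
    ("4900000000011", 120),
    ("4900000000028", 80),
    ("4900000000035", 50),
    ("4900000000042", 10),  # not yet mapped
]
result = aggregate_sales(sales, JAN_TO_PRODUCT)
```

Keeping unmapped codes visible in the result, rather than discarding them, makes it easier to notice codes that still need to be assigned to a product group.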

The limitations of fixed master data

Product master data based on JAN codes tend to be difficult to change or expand once the codes are decided. Consumer tastes and market trends change daily, but updating the master data requires a huge amount of man-hours, which can result in cases where analysis cannot keep up with the times.

For example, to capture new trends such as "health consciousness" or "environmental awareness," flexibility in product classification itself is necessary. However, with a statically managed JAN code-based master, simply adding a new tag or category is time-consuming, and delays in updating information can result in missed business opportunities.

Even if the JAN code itself remains the same, classification rules must be revised when the product lineup expands or requirements change. However, because the task of changing the master data itself is highly dependent on personnel and costly, there are increasing cases where systems are unable to keep up with the speed of change. This means that the analysis environment cannot always keep up with the latest business requirements, and even if data is collected, it cannot be fully utilized.

Support for various analytical axes (dimensions)

When utilizing a product master, there is a need to freely set multiple analytical axes, not just simple category classification. For example, as the perspectives that are emphasized change with the times, such as "healthcare," "time-saving," and "organic," there is a need for the ability to flexibly modify product classifications and add tags.

As new product concepts and consumer behaviors emerge one after another, master data based solely on JAN codes cannot fully capture their characteristics. Without a system that allows you to flexibly tag products and expand your perspective, you will run into the problem of not being able to use the data you have worked so hard to obtain in your analysis.

Automating product classification with generative AI

Generative AI interprets the context of text information and assigns categories automatically without being bound by existing rules. This approach offers a way to integrate fragmented product information and view it from a new perspective.

Extracting meaning from unstructured data

Even for seemingly simple text information, such as product names and descriptions, generative AI can infer categories and tags by using the context and the relevance of adjacent keywords. By understanding the context in which a word is used, rather than simply the frequency of its appearance, it is possible to achieve more accurate classification than keyword-based methods.

Because generative AI has learned from a large amount of text, it tries to interpret the meaning of the entire sentence, rather than simply matching search keywords. For this reason, even if product texts use similar or abbreviated expressions, the AI will place them in the same category when it judges that they describe essentially the same product attributes.

Automatic tagging using general knowledge

Generative AI can refer to general knowledge and a wide range of data it has already learned, so it can infer and tag unknown products based on partial information. This is a major advantage as it can also handle information that is not yet registered in the in-house database, such as new products or seasonal limited items.

Even when unknown product data is added, the model's vast linguistic knowledge can be used to make consistent category predictions. This makes it possible to instantly register and classify a huge number of products while keeping costs down, enabling operations that do not impede time to market.

Flexible change of category definitions

With generative AI, you can update categories and tagging criteria simply by adjusting prompts, rather than rewriting the program. This means that even when you start a new marketing plan or want to completely change the way you categorize your products, there's no need for major system overhauls.

With traditional rule-based approaches, it was time-consuming to rebuild the system logic every time the classification rules changed significantly. With generative AI, on the other hand, the model output can be adjusted simply by replacing the instructions and sample examples, which significantly reduces the work required of operators.
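As a minimal sketch of this idea, the category definitions can be held as plain data and rendered into the prompt. The function name build_prompt, the prompt wording, and the sample categories are all assumptions made for illustration; the call to an actual AI model is deliberately left out.

```python
def build_prompt(categories, product_name, description=""):
    """Build a classification prompt from a category definition dict.

    categories: dict of {category name: short definition}.
    Changing the classification scheme means editing this dict,
    not the code that builds or sends the prompt.
    """
    lines = [
        "Classify the product into exactly one of the categories below.",
        "Answer with the category name only.",
        "",
        "Categories:",
    ]
    for name, definition in categories.items():
        lines.append(f"- {name}: {definition}")
    lines += ["", f"Product name: {product_name}"]
    if description:
        lines.append(f"Description: {description}")
    lines.append("Category:")
    return "\n".join(lines)


# Illustrative category definitions (hypothetical).
categories = {
    "beverage": "drinks sold ready to consume",
    "snack": "confectionery and light foods",
}
prompt_before = build_prompt(categories, "Oat cookies 120g")

# Adding a new analytical perspective is a data change only.
categories["organic"] = "products explicitly labeled as organic"
prompt_after = build_prompt(categories, "Oat cookies 120g")
```

Because the definitions live outside the logic, they could also be stored in a configuration file or table and maintained by the business side.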

Ensuring classification accuracy in practice

When using automated classification by generative AI in business, it is important to create a system that minimizes misclassification and ensures continuous operation with high accuracy. Even results that AI can confidently derive are not completely error-free. There is a risk of misclassification, especially with new or specialized products, and a way to adjust the AI's behavior is required to address this.
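One simple safeguard, sketched below under our own assumptions, is to accept an AI answer only when it matches a category on an allowed list and to send everything else to manual review. The function name validate_label and the cleanup rules are illustrative, not a prescribed implementation.

```python
def validate_label(raw_output, allowed):
    """Check a model answer against the allowed category list.

    Returns (label, needs_review). The label is the canonical spelling
    from `allowed`, or None when the answer is not a known category.
    """
    # Tolerate surrounding whitespace, quotes, a trailing period, and case.
    cleaned = raw_output.strip().strip(".\"'").lower()
    lookup = {name.lower(): name for name in allowed}
    if cleaned in lookup:
        return lookup[cleaned], False
    return None, True


allowed = ["beverage", "snack", "prepared food"]
ok = validate_label(" Beverage.\n", allowed)
unknown = validate_label("drinks", allowed)
```

Answers flagged for review can then be corrected by a person and, where useful, reused later as examples for the AI.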

The importance of "context design" that influences the accuracy of AI responses

Learning classification criteria by showing representative examples

By providing the generative AI with a few representative product examples and their correct classifications, it becomes easier for the model to understand the required classification criteria. This technique, known as "few-shot prompting," conveys a company's own classification policies and priorities to the AI, making it a useful auxiliary means of reducing misjudgments and inaccuracies.

By providing specific examples of product categorization, the generative AI can more easily distinguish between products with similar contexts. For example, if you clarify the definition of "prepared food" with examples, the AI will naturally classify products with similar descriptions as "prepared food."
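A few-shot prompt of this kind might be assembled as in the following sketch. The example products, labels, and prompt wording are hypothetical, and sending the prompt to a model is outside the scope of the sketch.

```python
# Hypothetical labeled examples that express a company's own criteria.
FEW_SHOT_EXAMPLES = [
    ("Potato salad 150g (ready to eat)", "prepared food"),
    ("Frozen fried rice 450g", "frozen food"),
]


def build_few_shot_prompt(examples, product_name):
    """Place labeled examples before the product to be classified."""
    parts = ["Classify each product using the same criteria as the examples.", ""]
    for name, label in examples:
        parts.append(f"Product: {name}")
        parts.append(f"Category: {label}")
        parts.append("")
    # The final entry is left unlabeled for the model to complete.
    parts.append(f"Product: {product_name}")
    parts.append("Category:")
    return "\n".join(parts)


prompt = build_few_shot_prompt(FEW_SHOT_EXAMPLES, "Simmered pumpkin 200g")
```

Choosing examples near the boundary between two categories tends to communicate the intended criteria more clearly than choosing only obvious cases.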

Use of evidence-based data (ingredients list, product description)

By taking into account not only the product name but also raw materials and specification information when classifying products, the AI's judgments can be made more objective. By including data such as ingredient lists and manufacturing methods in the input, it is possible to expect highly accurate automatic classification that does not rely solely on the appearance of keywords.

Information on ingredients and nutritional content is an important indicator, especially when classifying health-conscious or diet products. By utilizing this additional data, AI can capture subtle characteristics that cannot be classified by differences in labeling alone, enabling more accurate and reproducible classification.
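The sketch below shows one way such evidence could be packaged as structured input for the AI. The field names (ingredients, nutrition_per_100g) and the sample values are assumptions for illustration only.

```python
import json


def build_evidence_block(name, ingredients=None, nutrition=None):
    """Serialize the product facts that should back the classification.

    Only fields that are actually available are included, so the model
    is not shown empty placeholders.
    """
    record = {"name": name}
    if ingredients:
        record["ingredients"] = ingredients
    if nutrition:
        record["nutrition_per_100g"] = nutrition
    # ensure_ascii=False keeps Japanese product text readable in the prompt.
    return json.dumps(record, ensure_ascii=False, indent=2)


block = build_evidence_block(
    "Protein bar cocoa",
    ingredients=["soy protein", "cocoa", "erythritol"],
    nutrition={"protein_g": 30, "sugar_g": 2},
)
name_only = build_evidence_block("Protein bar cocoa")
```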

Combining with web search

This method involves collecting information about new products that are not included in the AI's learning data through web searches and reflecting it in the AI's classification process. This makes it easier to constantly incorporate the latest information, enabling quick responses to market changes.

Even when there are few precedents, such as for seasonal limited edition products or collaborative products, it is possible to understand characteristics from official information and reviews on the web and use them for classification. By expanding the range of data that AI can refer to, it can quickly assign the correct label to unknown products.
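As a rough sketch of this flow, classification can first be attempted on the product name alone and retried with search snippets only when no label is obtained. Both classify and search are hypothetical callables passed in by the caller; the stand-in functions below exist only so that the sketch runs without any external service.

```python
def classify_with_fallback(product_name, classify, search, max_snippets=3):
    """Classify directly; if that fails, retry with search snippets as context.

    classify(product_name, context) -> label or None  (hypothetical AI call)
    search(query) -> list of text snippets            (hypothetical search call)
    Returns (label, used_search).
    """
    label = classify(product_name, "")
    if label is not None:
        return label, False
    snippets = search(product_name)[:max_snippets]
    return classify(product_name, "\n".join(snippets)), True


# Stand-ins for illustration only; they are not real services.
def fake_classify(product_name, context):
    if "sparkling" in (product_name + " " + context).lower():
        return "beverage"
    return None


def fake_search(query):
    return ["Limited-edition sparkling yuzu drink released this summer."]


label, used_search = classify_with_fallback(
    "Yuzu Fizz 2024", fake_classify, fake_search
)
```

Recording whether search was used makes it possible to review those cases separately, since information found on the web can be less reliable than in-house data.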

Normalization of spelling variations

Even for the same product, it is not uncommon for product names and size notations to differ slightly. Generative AI's natural language processing capabilities can absorb variations in notation and automate name matching, making it easier to handle data as a single product.

For example, there are many variations in spelling in the field, such as brand names and specifications being written as abbreviations, or product descriptions being written in a mixture of katakana and English. By utilizing generative AI, it is possible to automatically execute a process to treat these items as the same product in a unified manner. This reduces the time required for manual matching and improves data quality.
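Some of this variation can be reduced with deterministic preprocessing before the AI is involved. The following is a minimal sketch, not a replacement for AI-based matching: it uses Unicode NFKC normalization from the Python standard library, and the unit list and sample names are assumptions for illustration.

```python
import re
import unicodedata


def normalize_name(name):
    """Fold common notation differences in a product name."""
    # NFKC converts full-width alphanumerics to half-width and
    # half-width katakana to full-width.
    text = unicodedata.normalize("NFKC", name)
    text = text.lower()
    # Collapse runs of whitespace into a single space.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove the space between a number and its unit ("500 ml" -> "500ml").
    text = re.sub(r"(\d)\s+(ml|l|g|kg)\b", r"\1\2", text)
    return text


a = normalize_name("ｺｰﾋｰ　５００ＭＬ")
b = normalize_name("コーヒー 500 ml")
```

After this step, the two spellings above compare as equal, which leaves the AI to handle only the variations that rules cannot resolve, such as abbreviations and mixed katakana and English names.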

Using the results with BI tools and building data marts

By utilizing product attributes automatically generated by generative AI in BI tools and data marts, you can create an environment where flexible analytical axes are easy to add. Product tags and category information added by AI provide analytical perspectives that the existing master did not offer. By obtaining hints for marketing and product development rather than relying solely on existing attribute information, you can make decisions from a more multifaceted perspective.

Use as a dynamic dimension

Setting the tags assigned by AI as the axis of analysis adds flexibility to previously fixed dimensions. For example, it becomes easier to add new perspectives that are in line with the times, such as "health-conscious," "instant meal needs," and "organic," allowing for deeper analysis of customer behavior and demand forecasts.

For example, you can easily extract only products with tags related to "health-consciousness" and visualize best-selling items and regional differences. The advantage of being able to instantly add or change such aspects is that users can immediately verify any hypotheses they come up with.
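The kind of extraction described here can be sketched in a few lines. The product IDs, regions, amounts, and tags below are made up, and in practice this aggregation would usually be done inside the BI tool or DWH rather than in application code.

```python
from collections import Counter


def sales_by_region_for_tag(sales, product_tags, tag):
    """Total sales per region for products that carry the given tag.

    sales: iterable of (product_id, region, amount) tuples.
    product_tags: dict of {product_id: set of AI-assigned tags}.
    """
    totals = Counter()
    for product_id, region, amount in sales:
        if tag in product_tags.get(product_id, set()):
            totals[region] += amount
    return dict(totals)


# Made-up data for illustration.
product_tags = {
    "P1": {"health-conscious", "organic"},
    "P2": {"instant meal needs"},
    "P3": {"health-conscious"},
}
sales = [
    ("P1", "Kanto", 100),
    ("P2", "Kanto", 300),
    ("P3", "Kansai", 50),
    ("P1", "Kansai", 70),
]
health = sales_by_region_for_tag(sales, product_tags, "health-conscious")
```

Switching the analysis to another perspective only requires passing a different tag, which is what makes the dimension dynamic.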

Streamlining data mart creation

By simply reflecting classified product attributes in a data mart, data sets for analysis can be easily constructed. By automating the work that previously required manual preparation of product attributes, AI reduces development and operational costs and speeds up preparation for analysis.

By directly importing the results of AI-powered automatic classification into a DWH, engineers and analysts can significantly reduce the amount of additional processing required. This also makes it easier to set up reports and visualization tools, enabling quick responses to requests from business departments.
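As a small illustration, the sketch below loads AI-assigned tags into a table next to sales data and exposes a view for analysis. It uses an in-memory SQLite database in place of a real DWH, and the table, view, and column names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for a real DWH
conn.executescript(
    """
    CREATE TABLE sales (jan_code TEXT, sold_on TEXT, amount INTEGER);
    CREATE TABLE product_tags (jan_code TEXT, tag TEXT);

    -- Data mart view: sales totals per AI-assigned tag.
    -- A product with several tags is counted once under each tag.
    CREATE VIEW mart_sales_by_tag AS
    SELECT t.tag, SUM(s.amount) AS total_amount
    FROM sales AS s
    JOIN product_tags AS t ON t.jan_code = s.jan_code
    GROUP BY t.tag;
    """
)

# Made-up rows for illustration.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "2024-04-01", 100), ("B", "2024-04-01", 200)],
)
conn.executemany(
    "INSERT INTO product_tags VALUES (?, ?)",
    [("A", "health-conscious"), ("A", "organic"), ("B", "health-conscious")],
)

rows = conn.execute(
    "SELECT tag, total_amount FROM mart_sales_by_tag ORDER BY tag"
).fetchall()
```

Because tags live in their own table, adding a new tag is an insert rather than a schema change. Note that totals across tags should not be summed, since multi-tagged products appear under each of their tags.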

Achieving a rapid hypothesis verification cycle

You can instantly perform aggregations and segment analysis from new perspectives, dramatically improving the speed of hypothesis testing. Because you no longer need to spend a lot of time on the traditional process of master maintenance and code modification, you can easily seize opportunities to improve your business by implementing the PDCA cycle in a short period of time.

Rapid product attribute assignment using AI eliminates the need for the previously cumbersome process of updating master data. As a result, data can be quickly switched over as soon as an idea is generated, and the data can be instantly reflected in visualization tools, enabling speedy ad-hoc analysis.

Realizing automatic classification through generative AI and data integration

The role of generative AI is not simply to replace manual work, but to give deeper meaning to corporate data and extract new added value. Beyond simply automating product registration, the tags and categories that generative AI assigns to products contain insights that reflect customer needs and the times. This gives the data a multifaceted evaluation axis that could not be captured by conventional rule-based systems.

Combining in-house data with generative AI

Core systems primarily store standardized information such as sales and inventory, but by combining them with generative AI, it is possible to give that data explainable context and meaning. For example, it may become easier to interpret the product characteristics and trends behind sales rankings.

Sales and inventory are factual information captured in numbers, but adding layers of language and context to them can redefine their business meaning. AI-based product classification acts as a bridge between data and the market environment, leading to the discovery of new business opportunities.

Giving data a new perspective

AI can pick up on latent product characteristics that would otherwise be overlooked with traditional master management, leading to the discovery of new segments and the generation of promotional ideas.

If the product has functionality or appeal that cannot be imagined from the product name alone, AI can pick up on these characteristics from the context of the text and form a new category. By shedding light on products that have previously been lumped together as "other," you may be able to find opportunities to develop unexpected demand or niche markets.

Conclusion

Static management that relies on JAN codes makes it difficult to respond to diversifying analysis needs. Automatic classification using generative AI not only improves efficiency but also brings flexibility to master management. An environment that allows you to restructure data in the optimal way when needed forms the foundation for trial-and-error analysis.

The key to utilizing this technology is linking generative AI with internal company data. By combining the general knowledge of the AI with detailed information held by the company, classification accuracy based on the company's own criteria can be ensured. To increase the usefulness of data, link pieces of information while complementing their context, rather than treating them as isolated fragments.

Incorporating generative AI into your data infrastructure is a process that gives new meaning to your existing data. By combining your own data with external intelligence, you can analyze it from a completely new perspective. Building this collaborative system is what keeps transforming your data into valuable information.

The person who wrote the article

Affiliation: Data Integration Consulting Department, Data & AI Evangelist

Shinnosuke Yamamoto

After joining the company, he worked as a data engineer, designing and developing data infrastructure, primarily for major manufacturing clients. He then became involved in business planning for the standardization of data integration and the introduction of generative AI environments. Since April 2023, he has been working as a pre-sales representative, proposing and planning services related to data infrastructure, while also giving lectures at seminars and acting as an evangelist in the "data x generative AI" field. His hobbies are traveling to remote islands and visiting open-air baths.
(Affiliations are as of the time of publication)
