Generative AI opens up "data democratization": How to break through the "two barriers" that prevent all employees from using data

  • Data Utilization
  • Generative AI

The use of data in business is now essential. However, two barriers stand in the way of achieving "data democratization," where all employees use data in their own work. In this article, we explore specific approaches to breaking through these two barriers, using a case study from our company in which we used generative AI to build a system that lets employees who tend to be overwhelmed with daily work use data smoothly.

Why data utilization is important in business

With the development of big data and cloud computing, companies are now using digital platforms to generate huge amounts of data. With the rapid changes in the market, leveraging data to make quick decisions and gain a competitive advantage has become an important trend in modern business.

There are several reasons why data utilization is important in business.

  • Improved decision-making accuracy:
    Data-driven decisions are more reliable than emotions or intuition, and developing strategies based on concrete information reduces risk.
  • Deeper customer understanding:
    By analyzing customer behavior data, it becomes possible to understand their needs and behavioral patterns, enabling the provision of personalized services and products, thereby contributing to improved customer satisfaction.
  • Increased efficiency:
    Through data analysis of business processes, waste and bottlenecks can be identified and measures to improve efficiency can be found.
  • Gain a competitive advantage:
    By effectively utilizing data, you can understand market trends and develop strategies ahead of your competitors.
  • Creating new business models:
    Through data analysis, it is possible to discover untapped markets and ideas for new services, which can lead to business growth.

Data utilization shifts from specialists to general employees

Traditionally, data analysis has been primarily performed by specialized data scientists and analysts.
These experts had advanced skills and knowledge and were responsible for collecting, analyzing, and visualizing data. As a result, access to and analysis of data was limited to a select few experts, and many employees and departments were unable to fully utilize data. Decision-making therefore often relied on experience and intuition rather than data.

Against this background, the idea of "democratizing data" began to spread.
When more people have access to data, they can make faster, more informed decisions, and using data from different perspectives will generate new ideas and solutions, stimulating innovation.
Furthermore, making data open leads to a deeper understanding of processes and results, and expanding data analysis skills improves individual capabilities, contributing to improved performance across the organization.

| Perspective | Traditional data analysis | Data democratization |
| --- | --- | --- |
| Analyst | A limited number of experts (data scientists, analysts) | Regular employees |
| Advantages | Highly specialized experts analyze data accurately and precisely; advanced analytical tools and techniques are used | Decisions are faster and more data-driven; using data from a variety of perspectives makes it easier to generate new ideas |
| Disadvantages | Only a limited number of people have access to the data, making it difficult to utilize across the organization; at the employee level, decision-making relies on experience and intuition | The accuracy of the analysis may be low (because the users are not experts); the level of data understanding may differ among employees; risk of inadequate data usage and security |
| Speed of data utilization | It takes time to get the analysis results | Rapid data utilization for immediate decision-making and action |
| Innovation | Expert analysis is the driving force, so improvement suggestions and new ideas are limited | As data is used by various departments and employees, the diversity of ideas and solutions increases |
| Skills within the organization | Data analysis skills are concentrated in a limited number of specialists | Data analysis skills and individual capabilities develop across the organization |
| Competitiveness | Efficient decision-making may be delayed, which may affect competitiveness | Performance improves across the organization because all employees can make decisions using data, leading to a competitive advantage |

As such, "data democratization" is not just a trend, but a key strategy for building efficient and competitive organizations. Companies that proactively utilize data and create an environment in which all employees can realize their potential will be successful in the future.

Data infrastructure supporting "data democratization"

To advance the democratization of data, we need an environment in which data can be accessed and used by all users.

A poor data foundation means data is spread across multiple systems and formats, making it difficult to find the information you need. This leads to inconsistent data and conflicting information from different sources, increasing the risk of poor decision-making.

In addition, analytical work becomes inefficient, making it difficult to make quick decisions. Furthermore, inadequate data management increases security risks such as information leaks and unauthorized access, and vague rules regarding data usage weaken governance and make compliance and quality control difficult. These issues have a significant impact on business growth and efficiency.

Although the system configuration of a data infrastructure varies with each company's purpose and characteristics, much of the know-how is likely to be useful to any company. Know-how on fostering an organizational culture that promotes effective data utilization is particularly important. Even when the aim is company-wide utilization, the completed data infrastructure often requires IT skills and data literacy, so many employees cannot benefit from it and the goal remains far from being achieved.

The goal is not to simply build a data infrastructure; rather, the real start is what happens after it is built. The true value of data is tested when employees utilize it.

Examples of companies that have worked to "democratize data"

Saison Technology Co., Ltd. has launched a company-wide data-driven project with the goal of "democratizing data," that is, creating an environment where all employees can voluntarily use data to improve their work. As part of the project, they built a "Data-Driven Platform" (hereinafter referred to as DDP), a platform equipped with all the functions needed to "collect, store, search, and utilize data."

This article introduces the process from project launch to planning, construction, utilization, and establishment, as well as points to note and the results of using DDP.

In the DDP project, we believed that the true requirements lie in the minds of people who understand the pain points of business, so we had people from the business department participate in the project from the early stages.

It is possible for engineers alone to build a data infrastructure, but in that case, the objective becomes simply to "collect" and "store," making it difficult to achieve the true goal. Providing a "truly usable" data infrastructure requires detailed interviews with business personnel and extracting the necessary and sufficient data. However, to gain the cooperation of business personnel who are busy with their daily work, ingenuity in project management is required, such as reaching out to the heads of the relevant departments and managers and demonstrating specific benefits that can be obtained.

To create a data infrastructure that all employees can use with confidence

Even if the aim is a platform that "anyone can view," full openness should be avoided: protecting personal information and data that cannot be disclosed under contract is extremely important. At the same time, imposing too many access restrictions for the sake of governance could dampen motivation to use the data, so the two must be balanced.

It is also important to create mechanisms to encourage effective use of the data infrastructure. In all of our initiatives, we are conscious of the idea that "data is a shared asset for employees, and its value is shared." Specifically, we have undertaken the following measures to increase motivation and encourage employees to try using data.

  • Setting up an information aggregation site so that users do not get lost
  • Providing development templates and tutorials for users
  • Providing education and support to employees who struggle with the technical aspects, helping them achieve their goals
  • Fostering a culture of sharing best practices and knowledge and encouraging users to praise each other

Two barriers to becoming data-driven

A year has passed since the Data Driven Platform (DDP) was rolled out company-wide, and the number of users has exceeded 30% of all employees, but it has still not reached the number originally expected. Understanding this situation as "there must have been an issue slowing down user adoption," the company conducted user interviews to investigate the cause, and the following issues came to light.

  • Being busy with your daily work and not having time to learn tools or SQL
  • The tool is over-specced. I want to automate and visualize it more easily.
  • I know the data exists, but I can't find the data I want
  • I want information and examples that lead to concrete actions.
  • I don't know how much I can trust the data

Broadly categorizing these five challenges, we can see that there are two barriers:

  • The barrier to data utilization skills
  • The barrier to understanding data

To overcome these two barriers, the DDP Project is launching a new initiative called "DDP 2.0." The goals of DDP 2.0 are as follows:

  • Making it possible to handle data without knowledge of tools or SQL (overcoming the barrier to data utilization skills)
  • Providing users with the information they want (metadata) (overcoming the barrier to understanding data)

Breaking through the two barriers that slow down company-wide data utilization with generative AI

To overcome the two barriers that slow down user adoption - the barrier of data utilization and the barrier of data understanding - the project focused on "generative AI."
Let's take a look at the approach Saison Technology took using generative AI.

Taking data utilization to the next level with "generative AI"

"Generative AI," represented by ChatGPT, Gemini, Claude, and others, has the ability to understand natural language and create sentences and conversations.

We thought that by utilizing this feature of "making inquiries in natural language and generating answers," we could overcome the two barriers that stand in the way of data-driven approaches: "the barrier of utilizing data" and "the barrier of understanding data."

However, because generative AI is trained on publicly available data on the internet, it cannot provide answers based on specific personal information not included in the training data, confidential company data, or the latest real-time data.

The approach used to overcome this challenge is called "RAG." RAG stands for "Retrieval-Augmented Generation," and is a method that combines information retrieval and answer generation. It works through the following process:

  1. Information retrieval: Searches external databases for relevant information based on a user question or request.
  2. Answer generation: Based on the searched information, the generative AI generates answer text in natural language.

This technique allows the AI to generate answers based on information in external databases that it would not otherwise have access to. The answers are more reliable because they draw on up-to-date, relevant information rather than on the model's knowledge alone.
Using RAG, we developed an application that overcomes these two barriers.
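The two RAG steps described above can be illustrated with a minimal Python sketch. This is not the implementation used in the DDP project. The sample documents, the word-overlap scoring, and the function names (`retrieve`, `build_prompt`, `echo_llm`) are all assumptions made for illustration, and `echo_llm` is only a stand-in for a call to a real generative AI service.

```python
import re
from collections import Counter

# Hypothetical "external database" of internal notes (illustrative data only).
DOCUMENTS = [
    "Q3 pipeline review: the sales pipeline grew after the spring campaign.",
    "Expense policy: travel expenses must be filed within 30 days.",
    "DDP onboarding: request access to the data platform through the portal.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, documents, top_k=1):
    """Step 1, information retrieval: rank documents by word overlap with the question."""
    question_words = Counter(tokenize(question))
    scored = []
    for doc in documents:
        doc_words = Counter(tokenize(doc))
        score = sum(min(count, doc_words[word]) for word, count in question_words.items())
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only documents that share at least one word with the question.
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {question}"


def echo_llm(prompt):
    """Stand-in for a generative AI call (assumption): it simply returns the prompt."""
    return "[model answer would be generated from]\n" + prompt


def answer(question, documents, llm=echo_llm):
    """Step 2, answer generation: pass the retrieved context to the model."""
    passages = retrieve(question, documents)
    return llm(build_prompt(question, passages))


if __name__ == "__main__":
    print(answer("When must travel expenses be filed?", DOCUMENTS))
```

A production system would typically replace the word-overlap ranking with a search index or vector search, but the division of labor is the same: retrieval selects the relevant information, and the model only writes the answer.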

An application that transcends two barriers

"ChatDDP" allows you to access data infrastructure using natural language

First, to overcome the "barrier to data utilization," it is necessary to be able to handle data without knowing tools or SQL. To achieve this, we developed a system called "ChatDDP" that allows access to the data infrastructure in natural language.

In ChatDDP, when a user makes an inquiry in natural language, the generative AI interprets the request and identifies the data required. It then generates an SQL statement that retrieves the information from the table where that data is stored. The information retrieval system executes this SQL against the database, and the resulting data is passed back to the generative AI, which generates an answer in natural language.

This allows users to get reliable answers based on the data stored in the database without using SQL or tools.
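The flow described above (question, generated SQL, execution, natural-language answer) might look roughly like the following Python sketch. It is an illustration under stated assumptions, not ChatDDP's actual code. The `sales` table, the sample rows, and the two `fake_llm_*` functions are hypothetical stand-ins for the real data infrastructure and the generative AI, and the read-only check is a precaution added here that the article does not describe.

```python
import sqlite3


def fake_llm_generate_sql(question, schema):
    """Stand-in for the generative AI (assumption): maps a question to SQL.

    A real system would send the question and the schema to a model instead.
    """
    if "region" in question.lower():
        return "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region"
    return "SELECT SUM(amount) AS total FROM sales"


def is_read_only(sql):
    """Accept only a single SELECT statement (a precaution assumed for this sketch)."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def fake_llm_summarize(question, columns, rows):
    """Stand-in for the answer-generation step: formats the rows as plain text."""
    lines = [", ".join(f"{c}={v}" for c, v in zip(columns, row)) for row in rows]
    return f"Result for '{question}': " + "; ".join(lines)


def ask(conn, question):
    """Run the full flow: question -> SQL -> query result -> answer text."""
    schema = "sales(region TEXT, amount INTEGER)"  # hypothetical table
    sql = fake_llm_generate_sql(question, schema)
    if not is_read_only(sql):
        raise ValueError("only single SELECT statements are executed")
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    return sql, fake_llm_summarize(question, columns, rows)


def demo_connection():
    """Create an in-memory database with illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("East", 100), ("West", 250), ("East", 50)],
    )
    return conn


if __name__ == "__main__":
    sql, answer = ask(demo_connection(), "What are total sales by region?")
    print(sql)
    print(answer)
```

Returning the generated SQL alongside the answer lets the user see which query produced the result, which helps with trusting and checking the output.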

Demo video: ChatDDP Demo

"DDP Catalog" for understanding data and utilizing DDP

Next, to address the "barrier to understanding data," we felt that a "system that supports data utilization by interacting with users and staying close to them from search to output" was necessary, and so we began developing our second system, "DDP Catalog," following on from ChatDDP.

With DDP Catalog, when you inquire about the location and meaning of data in natural language, you will be provided with information on related tables and items, as well as links to Tableau dashboards and case studies that use the data as a source, and URLs for materials.

For example, if a user asks, "Where can I find Tableau that shows pipelines?", DDP Catalog will provide a URL for Tableau Cloud that shows the pipeline, along with the data that Tableau is referencing and the SQL that it is executing. In addition, users can edit and re-run parts of the SQL that is shown, download the data locally, or instruct Tableau to generate graphs using that data.

This makes it easy for users to verify the thought process derived by the generative AI and modify the parameters based on that thought to test new hypotheses, without having to worry about technical terms such as where the data is stored or the names of items in the system.
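As a rough illustration of the kind of metadata lookup DDP Catalog performs, the sketch below searches a small in-memory catalog and returns the matching table, its dashboard link, and the SQL behind it. Everything here is assumed for the example: the `CatalogEntry` fields, the two entries, the `example.com` URLs, and the simple keyword matching. The real system relies on generative AI rather than keyword overlap.

```python
import re
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """One metadata record (hypothetical structure)."""
    table: str
    description: str
    columns: dict        # column name -> meaning
    dashboard_url: str   # placeholder URL, not a real dashboard
    source_sql: str      # SQL the dashboard is assumed to run


CATALOG = [
    CatalogEntry(
        table="sales_pipeline",
        description="Open sales pipeline by stage and owner",
        columns={"stage": "current deal stage", "amount": "expected deal value"},
        dashboard_url="https://example.com/dashboards/pipeline",
        source_sql="SELECT stage, SUM(amount) FROM sales_pipeline GROUP BY stage",
    ),
    CatalogEntry(
        table="support_tickets",
        description="Monthly support tickets by product",
        columns={"opened_at": "date the ticket was opened"},
        dashboard_url="https://example.com/dashboards/support",
        source_sql="SELECT product, COUNT(*) FROM support_tickets GROUP BY product",
    ),
]


def normalize(text):
    """Split into lowercase words and drop a trailing plural 's' from longer words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words}


def search_catalog(query, catalog):
    """Return entries whose metadata shares words with the query, best match first."""
    terms = normalize(query)
    hits = []
    for entry in catalog:
        searchable = " ".join(
            [entry.table, entry.description, *entry.columns, *entry.columns.values()]
        )
        score = len(terms & normalize(searchable))
        if score > 0:
            hits.append((score, entry))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in hits]


if __name__ == "__main__":
    for entry in search_catalog("Where can I find Tableau that shows pipelines?", CATALOG):
        print(entry.table, entry.dashboard_url)
        print(entry.source_sql)
```

Keeping the dashboard link and its source SQL in the same record is what allows the answer to show not only where the data is, but also how it was derived.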

Demo video: DDP Catalog Demo

Summary

In this article, we have introduced the things to consider for "data democratization," which enables all employees to utilize data, the steps involved, and how to overcome the obstacles that stand in the way of "data democratization," along with specific examples.

To achieve business growth through company-wide data utilization, the goal is not simply to build a data infrastructure; the core is how to utilize it after it is built.

At Saison Technology, we use our experience in DDP projects to organize issues, verify effectiveness, and accumulate knowledge, and we provide our customers with know-how based on that experience. Utilizing our experience and knowledge gained from in-house implementation, we are able to identify customer issues early and solve them effectively.

If you are interested not only in building a data infrastructure, but also in putting it to use, embedding it in your organization, and efficiently introducing generative AI, please feel free to contact us.

The person who wrote the article

Affiliation: Marketing Department

Seiji Hosomi

After working in systems development for around 10 years at a system integrator in Tokyo, he joined Appresso (now Saison Technology) in 2016. After working as a development engineer and then project manager for the data integration software DataSpider, he is currently in charge of data utilization in the marketing department. Drawing on the IT system utilization experience he gained during his time as a systems engineer, he supports customers in "data utilization" and "digital transformation."
(Affiliations are as of the time of publication)
