What is metadata? FAQ-style explanation from the basics to the latest trends
Metadata is an essential concept for defining, managing, and efficiently utilizing data.
For example, a book's metadata includes its title and author, while content such as photos and music also carries additional information, such as the date a photo was taken or the genre of a track. This metadata makes data easier to search and reuse, helping you quickly find the information you need.
This article summarizes information about metadata. We hope that it will help you establish a system for consistent metadata management across your organization and realize efficient information utilization!
Q1. What is metadata in the first place?
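Metadata is "data about data": additional information that describes the content and nature of data, such as a document's creation date and time, its author, and its file format.
Q2. What types of metadata are there?
Metadata is classified in many ways depending on its purpose and content. The categories change with the perspective taken, for example whether the metadata is defined from a business viewpoint or a technical one. It is generally divided into the following six types: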
- Descriptive metadata: Attribute information that describes data. Its purpose is to organize and manage basic information about the content, such as title, author, and summary. It is particularly useful for improving data searchability and visualization.
- Structural metadata: Shows the structure, hierarchy, and relationships of data. The development of structural metadata is particularly important if you want to use data across multiple platforms.
- Administrative metadata: Information such as storage format, access rights, and update history. This is an important element to consider in order to strengthen an organization's internal controls.
- Preservation metadata: Information added for archiving and backup purposes, such as the data retention period and migration destination. This is essential for overseeing the entire information lifecycle and ensuring proper maintenance and management.
- Technical metadata: Technical details about data files, such as file format, encoding information, storage location, etc. This has the advantage of making it easier to adapt to system changes and upgrades.
- Business metadata: Information that defines business operational policies and procedures, such as business rules and glossaries. For example, clarifying the definitions of terms like "customer" and "product" facilitates communication within an organization and prevents misunderstandings when using data.
Understanding the characteristics of each and determining which types of metadata are important to your organization is the first step to more efficient data management.
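As a rough illustration of the six categories above, the following Python sketch groups the metadata of a single hypothetical report file by category. Every key and value is invented for illustration and does not come from any standard or product.

```python
# Hypothetical metadata for one file, grouped by the six categories above.
# Every key and value here is an illustrative assumption.
report_metadata = {
    "descriptive": {"title": "Monthly Sales Report", "author": "Sales Planning Dept."},
    "structural": {"parent_dataset": "sales_reports", "sheets": ["summary", "detail"]},
    "administrative": {"access_rights": ["sales", "finance"], "last_updated_by": "j.doe"},
    "preservation": {"retention_years": 7, "archive_location": "cold-storage"},
    "technical": {"file_format": "xlsx", "encoding": "UTF-8"},
    "business": {"glossary": {"customer": "A party with at least one completed order"}},
}


def find_by_category(metadata: dict, category: str) -> dict:
    """Return the metadata of one category, or an empty dict if it is absent."""
    return metadata.get(category, {})


print(find_by_category(report_metadata, "technical"))
```

Keeping the categories separate like this makes it easier to see which ones your organization already records and which are missing.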
Q3. In what situations is metadata used?
Metadata has a wide range of uses, including the following areas:
- How search engines understand your pages
- Database design and management
- Corporate information governance and audit compliance
- Media archive (images and videos)
The reason metadata is gaining attention in companies and organizations is as follows.
To properly manage and effectively utilize massive amounts of data, it is necessary to prepare not only the data itself but also the metadata that represents it. For example, in a rapidly growing big data environment, without a clue to determine which data should be used and how, valuable assets cannot be utilized. Utilizing metadata makes it easier to search and reuse data, improving the speed and quality of business activities such as analysis and reporting.
Q4. Who creates metadata and when?
It depends on the purpose, but metadata generally comes from three sources:
- Information generated automatically (such as camera Exif information)
- Information added manually by the creator (author information for a paper, product information, etc.)
- Information maintained and managed by system administrators and data governance personnel
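The difference between automatically generated and manually added metadata can be sketched in Python as follows. The function name `collect_file_metadata` and the field names are assumptions made for this example. It reads only basic file system attributes, not Exif data.

```python
import os
import tempfile
from datetime import datetime, timezone


def collect_file_metadata(path: str, manual: dict) -> dict:
    """Merge automatically readable file attributes with manually supplied fields."""
    stat = os.stat(path)
    automatic = {
        "file_name": os.path.basename(path),
        "size_bytes": stat.st_size,
        # st_mtime is the last-modification time reported by the file system.
        "modified_at": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
        "extension": os.path.splitext(path)[1].lstrip("."),
    }
    # Manually entered values (author, description, ...) are kept separate
    # so it stays clear which fields a person is responsible for.
    return {"automatic": automatic, "manual": dict(manual)}


# Demo with a temporary file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello")
    demo_path = tmp.name

demo_metadata = collect_file_metadata(demo_path, {"author": "J. Doe"})
os.remove(demo_path)
print(demo_metadata)
```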
Q5. What are the problems caused by lack of metadata?
When metadata is insufficient, problems such as the following arise, and the value of the data is lost:
- The data you need cannot be found
- The meaning of the data cannot be understood
- The data cannot be used or reused
Q6. What is the difference between metadata and master data?
- Metadata is information about the data itself (attributes, structure, history)
- Master data is the core information of business (the body of data on customers, products, employees, etc.)
Metadata and master data have different roles and purposes; metadata can be said to help give meaning to master data.
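The distinction can be made concrete with a small Python sketch. The customer table, its columns, and the descriptions are all hypothetical. The point is only that master data holds the business records, while metadata describes the table that holds them.

```python
# Master data: the business records themselves (hypothetical customer table).
customer_master = [
    {"customer_id": "C001", "name": "Acme Corp.", "country": "JP"},
    {"customer_id": "C002", "name": "Globex Ltd.", "country": "NL"},
]

# Metadata: information about that table, not about any single customer.
customer_master_metadata = {
    "table_name": "customer_master",
    "owner": "Sales Planning Dept.",
    "columns": {
        "customer_id": "Unique ID assigned when the account is opened",
        "name": "Registered company name",
        "country": "ISO 3166-1 alpha-2 country code",
    },
}


def describe(record: dict, metadata: dict) -> list[str]:
    """Pair each value in a master record with the meaning given by the metadata."""
    return [f"{col}={val} ({metadata['columns'][col]})" for col, val in record.items()]


for line in describe(customer_master[0], customer_master_metadata):
    print(line)
```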
Q7. How can I streamline metadata management?
For efficient metadata management in an organization, the following are effective:
- Introducing dedicated metadata management tools (such as data catalogs)
- Defining clear naming rules and description formats
- Establishing a system of responsibility, such as data stewards
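A naming rule is easier to enforce when it can be checked mechanically. The sketch below assumes a purely hypothetical convention of `<domain>_<entity>_<yyyymm>` in lowercase; your organization's actual rule would replace the regular expression.

```python
import re

# Assumed convention: <domain>_<entity>_<yyyymm>, lowercase, e.g. "sales_orders_202401".
NAMING_RULE = re.compile(r"^[a-z]+_[a-z]+_\d{6}$")


def violates_naming_rule(dataset_names: list[str]) -> list[str]:
    """Return the dataset names that do not follow the assumed convention."""
    return [name for name in dataset_names if not NAMING_RULE.fullmatch(name)]


print(violates_naming_rule(["sales_orders_202401", "SalesOrders", "hr_staff_2024"]))
```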
Basic process of metadata management
With these seven FAQs in mind, we'll now introduce the step-by-step process you need to follow to manage your metadata.
Metadata management requires a step-by-step process, from strategic planning to daily operations. Systematically establishing these processes will help organizations maximize their data resources. It's important to understand the best practices at each step and to take an approach that allows for continuous improvement.
Developing a metadata strategy
This is the stage where you plan the overall picture and operation of metadata management, taking into account your organization's business goals and performance indicators. Important decisions are made here, such as which department will lead the effort and how much budget and personnel will be allocated. If the direction is not clear at this point, the scope and targets of management tend to become unclear in later steps.
Understanding metadata requirements
Next, clarify what information you will actually collect and manage as metadata. The key is to comprehensively identify the attributes required by data users and the perspectives required in business processes. Having an accurate understanding of this will make it easier to achieve just the right amount of metadata management.
When organizing metadata, clarifying the essential items will help you operate efficiently. The minimum items you should keep in mind when writing metadata are as follows:
- Data name
- Data type
- Unique ID
- Creation date
- Update date
- Author or contact information
- Data description
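The minimum items above could be captured in a simple record type, for example as the following Python sketch. The class name `MetadataRecord`, the field names, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class MetadataRecord:
    """The minimum items listed above; field names are illustrative."""
    data_name: str
    data_type: str
    unique_id: str
    created_on: date
    updated_on: date
    contact: str
    description: str


def missing_items(record: MetadataRecord) -> list[str]:
    """Return the names of items that were left empty."""
    return [f.name for f in fields(record) if getattr(record, f.name) in ("", None)]


record = MetadataRecord(
    data_name="Monthly sales",
    data_type="table",
    unique_id="ds-0001",
    created_on=date(2024, 1, 5),
    updated_on=date(2024, 2, 1),
    contact="",
    description="Sales totals per month and region",
)
print(missing_items(record))
```

Checking for empty items at the time of entry is one way to keep the essential items from being skipped.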
Creating Metadata
Based on the requirements, metadata is actually written and collected. It is common to use a data catalog or schema design tool to manage the data in a consistent format. The accuracy of the work done here will affect the entire subsequent operational flow, so it is desirable to have someone with specialized knowledge involved.
Metadata quality control
Regularly audit and review metadata to ensure it is up-to-date and accurate. Detecting and correcting issues such as missing updates, duplications, and typos early on can increase the reliability of data utilization. Creating a system for regularly inspecting metadata is especially important for large organizations.
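A periodic check of this kind might look like the following Python sketch. The record layout, the one-year staleness threshold, and the function name `audit` are assumptions. A real audit would also cover checks such as typos, which this sketch does not attempt.

```python
from collections import Counter
from datetime import date, timedelta


def audit(records: list[dict], today: date, max_age_days: int = 365) -> dict:
    """Flag duplicate IDs, stale records, and empty descriptions."""
    id_counts = Counter(r["unique_id"] for r in records)
    cutoff = today - timedelta(days=max_age_days)
    return {
        "duplicate_ids": sorted(i for i, n in id_counts.items() if n > 1),
        "stale": [r["unique_id"] for r in records if r["updated_on"] < cutoff],
        "no_description": [r["unique_id"] for r in records if not r["description"].strip()],
    }


catalog = [
    {"unique_id": "ds-1", "updated_on": date(2024, 5, 1), "description": "Orders"},
    {"unique_id": "ds-2", "updated_on": date(2021, 3, 9), "description": " "},
    {"unique_id": "ds-1", "updated_on": date(2024, 5, 2), "description": "Orders (copy)"},
]
print(audit(catalog, today=date(2024, 6, 1)))
```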
Ongoing operation and maintenance of metadata
The final phase involves updating metadata, managing access, and upgrading operational tools on a daily basis. Metadata changes as data increases and business operations change, so continuous review is required. If this phase is neglected, the management system that has been painstakingly built may no longer function.
Standards and Schemas in Metadata Management
By learning about international metadata standards and schemas, you can achieve more compatible operations.
When companies and organizations handle metadata, utilizing standardized schemas and international standards dramatically improves operational efficiency. In particular, in the fields of geographic information and library and information science, where there are a wide variety of existing standards, utilizing them is recommended to ensure interoperability. Incorporating these international standards will facilitate smooth data integration between different organizations and systems.
ISO 19115
It is an international standard for handling geospatial information, and it defines a common format for expressing the quality and content of geographic data. When GIS data must be shared and used by multiple parties, adopting ISO 19115 makes it easier to keep metadata consistent. The larger the geographic information project, the greater the benefits.
JMP (Japan Metadata Profile)
This profile is based on ISO 19115 and is customized to suit the usage conditions and culture in Japan. It is often used when introducing GIS in Japanese government agencies and public works projects, and is recognized as a domestic standard. It is compatible with ISO 19115, but is designed to be easier to operate.
Other international standards and schemas
Dublin Core is widely used for general documents, and DataCite is used to manage citation metadata for research data and academic publications. Adopting such international standards makes data integration with other systems easier and gives metadata a globally shared vocabulary. It is important to select a standard that suits the data characteristics of your company or organization.
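As a small example of what a standardized schema looks like in practice, the following Python sketch writes a few Dublin Core elements as XML using only the standard library. The namespace URI is the one defined for the Dublin Core element set. The wrapper element `metadata`, the function name, and the sample values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def to_dublin_core(values: dict) -> str:
    """Serialize simple key/value pairs as Dublin Core elements."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for element_name, text in values.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element_name}")
        child.text = text
    return ET.tostring(root, encoding="unicode")


dc_xml = to_dublin_core({
    "title": "Monthly Sales Report",
    "creator": "Sales Planning Dept.",
    "date": "2024-02-01",
    "format": "application/pdf",
})
print(dc_xml)
```

Because the element names come from a shared vocabulary, another system that understands Dublin Core can read the title or creator without a custom mapping.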
Benefits of using metadata
We will introduce the specific benefits that can be obtained by managing and utilizing metadata in an organized manner.
Systematically managing metadata not only improves the efficiency of data management, but also brings a wide range of benefits, including improved productivity across the entire business, enhanced security, and optimized system operations. Establishing the skills to utilize metadata early on as part of an organization's growth strategy and digital transformation promotion will enable a smooth transition to a data-driven business model.
Improving data utilization efficiency
Metadata organizes the location and meaning of data, allowing you to quickly reference the information you need. This allows you to quickly find the data that forms the basis of your analysis and reduces duplicated work. As a result, the data analysis cycle is shortened, increasing the speed of your organization.
Enhanced data security
Managing access rights and log information as metadata makes it easier to prevent accidental data leaks and unauthorized use. Utilizing audit logs also allows for a smooth response in the unlikely event of an incident. Security measures using metadata are essential, especially in industries that handle personal and confidential information.
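The idea of holding access rights and log information as metadata can be sketched in Python as follows. The dataset IDs, role names, and the structure of `ACCESS_METADATA` are hypothetical. A production system would rely on its platform's own access control rather than a dictionary like this.

```python
from datetime import datetime, timezone

# Hypothetical access-rights metadata keyed by dataset ID.
ACCESS_METADATA = {
    "ds-customers": {"allowed_roles": {"sales", "support"}, "contains_personal_data": True},
    "ds-weather": {"allowed_roles": {"everyone"}, "contains_personal_data": False},
}

audit_log: list[dict] = []


def can_read(dataset_id: str, role: str) -> bool:
    """Decide access from metadata and record the decision in the audit log."""
    allowed_roles = ACCESS_METADATA[dataset_id]["allowed_roles"]
    allowed = role in allowed_roles or "everyone" in allowed_roles
    audit_log.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "dataset_id": dataset_id,
        "role": role,
        "allowed": allowed,
    })
    return allowed


print(can_read("ds-customers", "marketing"))  # role not listed for this dataset
print(can_read("ds-weather", "marketing"))  # dataset open to everyone
```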
Optimization of system development and operation
The development of metadata is also extremely useful when developing new systems or expanding existing systems. It makes it easier to accurately understand the specifications and structure of the required data, reducing errors during the requirements definition and design phases and reducing implementation costs. It also enables quick and accurate responses to changes and problems during the operational phase.
Latest Trends in Metadata
With the advent of the AI era, metadata has evolved beyond its role as simply "data on data" to become "the driving force behind data utilization."
Finally, here are three trends that have been attracting attention in recent years.
Ingest-Native AI: Automatic metadata generation at the point of ingestion
Until now, metadata has generally been created manually after content creation or by post-processing using tools. However, in recent years, an approach called "Ingest-Native AI" has been attracting attention.
This is a technology that uses AI at the ingest stage of video and audio content to automatically generate metadata such as tags and descriptions.
For example, the European media company DPG Media reportedly uses AWS AI services such as Amazon Bedrock and Amazon Transcribe to automate the following:
- Extracting the names of people who appear and topic keywords from the audio of videos
- Categorizing genres, moods, etc.
- Making the generated metadata directly usable for archive searches
This initiative has reportedly streamlined video editing and delivery workflows considerably and greatly reduced the manual effort required.
"CAI/C2PA" guarantees the authenticity of content
As the distribution of fake images and videos created by generative AI becomes an issue, there is a need for a system that can use metadata to determine whether the content is trustworthy.
Against this background, the "CAI (Content Authenticity Initiative)" and the "C2PA (Coalition for Content Provenance and Authenticity)," which bring together Adobe, The New York Times, Microsoft, and others, are attracting attention. The CAI is an industry initiative, while the C2PA publishes an open technical standard for content provenance.
These efforts attach tamper-evident metadata, known as "Content Credentials," to content. The credentials cover items such as:
- Photographer/producer
- Editing history (what was edited and how)
- Date, time, and location information
- Tamper detection using digital signatures
This will allow anyone to check the "origin" and "editing traces" of images and videos.
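The general principle of tamper-evident metadata can be illustrated with a short Python sketch. This is not the C2PA format: real Content Credentials rely on certificate-based digital signatures, whereas this example substitutes a keyed hash (HMAC) from the standard library so that it runs without extra dependencies. The key, field names, and values are all assumptions.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # assumption: a shared demo key


def sign(content: bytes, metadata: dict) -> str:
    """Sign the content hash together with its metadata."""
    payload = json.dumps(
        {"content_sha256": hashlib.sha256(content).hexdigest(), "metadata": metadata},
        sort_keys=True,
    ).encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def is_untampered(content: bytes, metadata: dict, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign(content, metadata), signature)


image = b"raw image bytes"
credentials = {"producer": "J. Doe", "edits": ["crop"], "captured_at": "2024-02-01T09:00:00Z"}
signature = sign(image, credentials)

print(is_untampered(image, credentials, signature))
print(is_untampered(image, {**credentials, "producer": "Someone Else"}, signature))
```

Because the signature covers both the content hash and the metadata, changing either one causes verification to fail.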
Adobe Photoshop, Behance, Cloudflare, and others have already begun supporting this feature, and it is expected that news media and social media platforms will also adopt it in the future.
Attempts to ensure immutability of metadata using blockchain
Because metadata is at risk of being tampered with or lost, it is important to accurately record who added what information and when, and to manage that record reliably.
This is where metadata management using blockchain technology comes in.
Currently, some projects, including C2PA, are considering and implementing mechanisms that record metadata and content on a blockchain to guarantee authenticity and history. This is expected to ensure long-term reliability from the following perspectives:
- Tamper detection
- Copyright protection
- Access rights management
Although commercial use is still limited, there is growing activity aimed at applications such as NFTs, copyright management, and verifying the authenticity of public data.
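The core idea behind blockchain-based immutability, that each record's hash also covers the previous record, can be shown with a minimal Python hash chain. This is a single-process sketch with no network, consensus, or signatures, so it illustrates tamper detection only. The entry layout and function names are assumptions.

```python
import hashlib
import json


def add_entry(chain: list[dict], metadata: dict) -> None:
    """Append a metadata change whose hash also covers the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"prev_hash": prev_hash, "metadata": metadata}, sort_keys=True)
    chain.append({
        "prev_hash": prev_hash,
        "metadata": metadata,
        "hash": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    })


def is_intact(chain: list[dict]) -> bool:
    """Recompute every hash; an edited entry no longer matches its stored hash."""
    prev_hash = "0" * 64
    for entry in chain:
        body = json.dumps({"prev_hash": prev_hash, "metadata": entry["metadata"]}, sort_keys=True)
        expected = hashlib.sha256(body.encode("utf-8")).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


history: list[dict] = []
add_entry(history, {"editor": "alice", "change": "added title"})
add_entry(history, {"editor": "bob", "change": "updated license"})
print(is_intact(history))

history[0]["metadata"]["editor"] = "mallory"  # simulate tampering
print(is_intact(history))
```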
Saison Technology's metadata management solution "HULFT DataCatalog"
HULFT DataCatalog automatically collects and catalogs metadata about various data managed in a distributed manner within a company. By visualizing the location and history of data and sharing knowledge about the data, it helps to streamline data exploration and promote understanding of the data's contents.
Summary
Metadata is an essential element for organizing and visualizing data, and is the driving force behind the effective use of a company's data assets. Metadata is no longer simply "supplementary information"; it has evolved into a fundamental technology that guarantees the reliability of information and enables efficient use. By preparing the right types of metadata for each purpose and building appropriate management processes, you can strengthen the foundation for analysis and digital transformation. Optimizing your metadata strategy to meet your organization's needs and constantly seeking to integrate it with the latest technologies will determine your success in the data-driven society of the future.
