Even The Tale of Genji Has Something to Do with Data Analysis
This is Watanabe from the Marketing Department. This is a column where I casually write about various topics related to data and IT.
On "The Tale of Genji": apologies for bringing it up in 2025
This time I'd like to write a little about The Tale of Genji. The Tale of Genji (or rather, Murasaki Shikibu) was the subject of the 2024 Taiga drama, so I'm a little embarrassed to be bringing it up now, in the fall of 2025.
Even The Tale of Genji and Murasaki Shikibu, which seem to have nothing to do with data, turn out to have connections to data analysis. That is how many things in the world are related to data.
The Mystery of the Tale of Genji
Just in case, let me first briefly explain what The Tale of Genji is.
The Tale of Genji is a long story written in the Heian period and a work of classical Japanese literature. Its author is said to be Murasaki Shikibu. It is widely known not only in Japan: in the 20th century (the 1920s, before the war), an English translation by the British translator Arthur Waley won immense acclaim and made it a literary work known around the world.
Murasaki Shikibu seems to have been a "capable person" in a wide range of fields. Records from that time show that she was a master of waka poetry, with one of her poems included in the Hyakunin Isshu collection, and that she was also knowledgeable in history and in Chinese poetry and the Chinese classics, which few women of the time were versed in.
Unfortunately, the original manuscript of The Tale of Genji has been lost, and it is no longer possible to know what the original was like. The oldest surviving copy dates from the Kamakura period, and from it the work is believed to have been a long story consisting of 54 chapters (54 volumes) and over 1 million characters.
What is History?
Now, since The Tale of Genji and Murasaki Shikibu date back over 1,000 years, there are many things we do not know clearly about them, which places them in the realm of history.
When people talk about history, I think it is usually "historical romance." For example, it tends to be about drawing life lessons from great historical figures, or about theories of what the speaker imagines to be the hidden truth of history (e.g., the theory that Tenkai, the brain behind Tokugawa Ieyasu, was actually Akechi Mitsuhide).
There is nothing wrong with that, but storytelling of that kind does not work as an academic discipline, so the study of history instead pursues the facts of the time based on evidence.
Evidence includes the contents of ancient documents that survive to this day, local traditions, and finds from archaeological excavations. Rather than asserting that Oda Nobunaga must have thought this or that, the approach is to reason as follows: letters that Nobunaga wrote to his retainers have been preserved; in one passage he is unsure what to decide and consults them; this suggests that when he sent the letters, he had not yet made a decision and had not yet moved his troops from the castle.
However, it is first necessary to verify that the ancient documents are authentic. And even an authentic letter from Oda Nobunaga was written with a purpose and within the limits of his own understanding, so what it says cannot be taken as objective fact (indeed, I hear that ancient documents are often full of bias reflecting the positions and factions of their writers). Interpretation that takes these various factors into account is also necessary.
Current "data utilization"
There is currently a growing trend towards utilizing data. Companies are undertaking various initiatives, and the difference discussed above between "popular history storytelling" and "the study of history" may provide clues for thinking about what data utilization entails.
It is difficult to interpret data appropriately, and data can be read in a self-serving way, so insisting that something is absolutely correct just because it is based on data is also problematic. Even so, I think business, too, can have "discussions that infer facts based on evidence," and IT and data can be used for this purpose.
The Tale of Genji
What is accepted as fact about The Tale of Genji is an estimate of what happened at the time, inferred from the historical documents that survive today. However, since it was over 1,000 years ago, there are many things that are not clear.
First of all, the original manuscript of The Tale of Genji has been lost. At the time, there were no copy machines or printing presses, and books were reproduced by hand (by reading and transcribing them). After so many rounds of copying, it is sometimes difficult to know what the original text was like, or even what its structure was in the first place.
Furthermore, The Tale of Genji itself does not say "Author: Murasaki Shikibu"; she is believed to be the author on the basis of circumstantial evidence. The situation can be summarized roughly as follows:
- The original manuscript of The Tale of Genji has been lost, but the surviving copies are of a 54-chapter story (although it is unclear whether it was originally composed of 54 chapters).
- The Tale of Genji itself does not contain any description such as "Author: Murasaki Shikibu"
- Since there are many records from that time that show that there was a talented woman named Murasaki Shikibu (for example, there is a waka poem written by Murasaki Shikibu in the Hyakunin Isshu), we should assume that Murasaki Shikibu actually existed.
- There are various records that say that Murasaki Shikibu wrote The Tale of Genji, so it is assumed that she was the author. However, there is no evidence to prove that she actually wrote the entire story herself.
This feels a little different from the clear-cut answer we learn in school, that "Murasaki Shikibu is the author" (write that on a test and you get full marks).
Since it is not clear, it is possible to insist that "there is another true author of The Tale of Genji!" But if you ask whether that theory has better evidence behind it than the conventional one, the answer is "No, it doesn't." And if you ask whether it can be said with certainty that Murasaki Shikibu was the author, the answer is also no.
This tends to lead to vague statements such as "we presume that this is probably the case." Vague statements are not always well received in society, but I believe this attitude is in fact a cautious, evidence-based stance.
Negative Capability
Evidence-based logical discussion is desirable in business as well, but such careful thinking tends to produce vague, unsatisfying conclusions, such as "The author is probably Murasaki Shikibu, but other possibilities cannot be ruled out."
These days, "easy-to-understand, definitive conclusions" tend to be popular. In response to this trend, it has become common to hear that "negative capability" (the ability to leave things uncertain or unresolved) is important.
Conversely, suppose you leap from "it is not clear and cannot be completely ruled out" to insisting that "The Tale of Genji is a forgery fabricated after the Meiji Restoration." That makes for a sensational, perhaps even exciting story; the vague feeling is cleared up all at once (although there is no evidence), and it can feel as if the long-awaited truth has been revealed. I think this is the true nature of the "conspiracy theories" that have been causing problems recently.
In reality, however, that unsettled feeling remains even as we learn more (gather more evidence), and I think we have to accept that conclusions, including those derived from data, are often ambiguous.
The importance of "preserving history" and "preserving data"
It also makes me think about the importance of leaving history (evidence) for future generations. For example, the oldest surviving Japanese story is said to be "The Tale of the Bamboo Cutter." The story of Princess Kaguya is very well known throughout Japan, but sadly, it is no longer clear who wrote it or how. This may be a great loss for Japan.
In comparison, much more is known about the time of The Tale of Genji and Murasaki Shikibu. The reason is that many Heian aristocrats kept diaries, and their writings survive as ancient documents.
In particular, Fujiwara Sanesuke (the aristocrat played by Robert Akiyama in the historical drama) was so devoted to his diary that he wrote down a great many details about the Imperial Court, and thanks to his diary, "Shoyuki," we know much more about that time than about other periods. We must be grateful to Robert Akiyama.
"Records have been lost and we don't know what things were like back then" is not something that applies only to the distant past. For example, for the Showa era, when Japanese manufacturing was developing rapidly, there are cases where we no longer know what certain home appliances were like because neither the items themselves nor records of them remain. Even in the Internet era, personal websites that were all the rage in the 2000s have been lost as free homepage services were discontinued.
When it comes to corporate data utilization, we often hear advice such as "Store your data properly in a data lake or the like." Indeed, if records are not organized and kept, even important things from the recent past can be lost. For example, when you overhaul a product for the first time in years, you may no longer know how some puzzling but apparently deliberate specification of the previous version was decided, and so you cannot tell whether the function can be dropped or changed.
"Who is the author?"
As described above, there is still much that we fundamentally do not know. One mystery that is sometimes discussed concerns the final section of The Tale of Genji: "Was Murasaki Shikibu the author of the Uji Chapters?"
Mystery: Who is the author of the Uji Chapters?
It has long been pointed out that the writing style seems to change toward the end of The Tale of Genji, where the setting moves to Uji in Kyoto (the ten Uji Chapters), and that this part must have been written by someone else. Many readers have felt that the impression is different from this point on.
However, "I think so" is not an evidence-based opinion. There are also old documents that say "Maybe the author is different?" or "Perhaps the author was Murasaki Shikibu's own daughter, Daini Sanmi (who is also recorded as a talented woman of the time)." These can be called evidence that people in the past were suspicious, but they are not enough to determine who the author actually was.
This topic remained a hopeless mystery for many years, but in the 20th century, a new method was developed to investigate it. Using The Tale of Genji as text data, researchers began to quantitatively analyze the characteristics of the text and examine whether it could be considered to have been written by the same author.
Research on The Tale of Genji may seem unrelated to data or IT, but data analysis is being used as a new means of addressing questions that have existed for almost 1,000 years.
However, data analysis does not seem to have reached a clear conclusion. There are research results*1 showing statistically significant differences between the Uji Chapters and the earlier parts, but there are also research results*2 showing that the differences are not large enough to say that they were written by different people.
- ⇒Author estimation based on stylistic statistics - On the author of The Tale of Genji and the Uji Chapters (1958)
- ⇒The authorship of the Tale of Genji and the Uji Chapter: A quantitative linguistic approach (1997)
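To give a feel for what "comparing text characteristics statistically" can mean, here is a minimal sketch in Python. It is not the method of the studies listed above; it assumes a simple approach in which the counts of a few marker words in two parts of a work are compared with a chi-square test at the 5% level. The function names and the example counts are hypothetical, chosen only for illustration.

```python
from collections import Counter

# 5% critical values of the chi-square distribution, keyed by degrees of freedom.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def marker_counts(text, markers):
    """Count how often each marker word occurs in a whitespace-tokenized text."""
    counts = Counter(text.lower().split())
    return [counts[m] for m in markers]


def chi_square_statistic(row_a, row_b):
    """Chi-square statistic for a 2 x k table of observed counts."""
    total_a, total_b = sum(row_a), sum(row_b)
    grand_total = total_a + total_b
    statistic = 0.0
    for obs_a, obs_b in zip(row_a, row_b):
        column_total = obs_a + obs_b
        if column_total == 0:
            raise ValueError("each marker word must occur at least once overall")
        for observed, row_total in ((obs_a, total_a), (obs_b, total_b)):
            expected = row_total * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def styles_differ(row_a, row_b):
    """True if the two marker-word profiles differ at the 5% significance level."""
    degrees_of_freedom = len(row_a) - 1  # supports 2 to 5 marker words
    return chi_square_statistic(row_a, row_b) > CHI2_CRITICAL_5PCT[degrees_of_freedom]


# Hypothetical counts of three marker words in an earlier and a later part.
earlier_part = [120, 80, 40]
later_part = [60, 90, 90]
print(chi_square_statistic(earlier_part, later_part))
print(styles_differ(earlier_part, later_part))
```

Even when a test like this reports a significant difference, it only says that the two parts differ in the measured feature. It does not say why, which is exactly where interpretation comes in.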
If you think about it, the same person can write in different styles, and writing styles change over time. Conversely, a text by someone else can resemble your own in ideas or style. Just as some people today are good at "writing in the style of Haruki Murakami," a later writer with great respect for Murasaki Shikibu (or one who worked hard to make a sequel read naturally) might have produced text that closely resembles hers.
Still, it does seem to have been shown objectively that differences exist. What remains open to debate is whether those chapters were written by someone else (for example, Murasaki Shikibu's daughter) or whether her own writing style changed (for example, Murasaki Shikibu wrote the Uji Chapters separately in her later years).
Who is Shakespeare really?
There are other efforts like this one that use data to uncover the identity of the "mysterious author."
William Shakespeare left an enormous mark on British and world literature, and while his works are extremely famous, there is very little information about the man himself; not even a diary remains. In addition, the deep knowledge implied in his works does not seem to match his background (he did not attend university), which has led to suspicions that the works were written by another, famous person under a different name.
This is not a mainstream theory, and from an academic perspective it is somewhat dubious, but it is an interesting topic that is popular with the public, and many people around the world have put forward their own theories of who the real author was. (There are so many theories that the list is almost endless.) Research using data analysis is being conducted here as well.
As a result, recent papers have suggested that some of Shakespeare's works may have been co-written with other authors.
- ⇒ Shakespeare's "Henry VI" was co-written, big data proves | Reuters
- ⇒ Big data proves Shakespeare was a co-writer... (Full article) | Daily Shincho
One finding drew particular attention: a work is now thought to have been co-written with Christopher Marlowe, a playwright of the same era who has long been suspected of being Shakespeare's real identity (this is a claim of co-authorship, not a claim that Marlowe was Shakespeare). The paper was published only in 2016.
However, the claims made in this paper do not appear to have been widely accepted within the academic community, the matter is not settled, and the debate is likely to continue.
The true identity of Satoshi Nakamoto, the creator of Bitcoin
The creator of Bitcoin is an unknown individual who goes by the name "Satoshi Nakamoto." Many people have been put forward as "the real Satoshi Nakamoto."
Some candidates were widely reported in the mainstream media as having been identified (such as Dorian Satoshi Nakamoto), but many people felt it could not possibly be him; the reports were eventually treated as mistaken and faded away, leaving the identity unknown.
There have been rumors that the person is Japanese (naming, for example, Professor Shinichi Mochizuki of the Research Institute for Mathematical Sciences at Kyoto University, or Tatsuaki Okamoto, an authority on cryptography at NTT Basic Research Laboratories), but these are also hard to believe in the first place.
Two theories that I find plausible are that the real identity is the late Hal Finney, and that it is Nick Szabo, the creator of BitGold, an earlier attempt at a system similar to Bitcoin.
Hal Finney, who is now deceased, lived near Dorian Satoshi Nakamoto and developed RPOW (Reusable Proofs of Work), a system similar to but predating Bitcoin.
Nick Szabo wrote the original paper on BitGold. It is also said that the "text characteristics" of the famous document in which Satoshi Nakamoto outlined the basic idea of Bitcoin match those of Szabo's writing (the link below is not the original text, but a Japanese translation of it).
- ⇒Bitcoin: P2P electronic currency system Satoshi Nakamoto
- ⇒ Who is the real Satoshi Nakamoto? A researcher may have found the answer | TechCrunch
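As a rough illustration of how "text characteristics" can be matched against candidate authors, the sketch below builds a function-word frequency profile for each text and ranks candidates by cosine similarity to the disputed text. This is a simplified, assumed approach, not the method used in the research linked above; the word list, function names, and sample sentences are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical marker words; real studies select features with far more care.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "it", "is"]


def profile(text):
    """Relative frequency of each function word in a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty text
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(disputed_text, known_texts):
    """Return (name, score) pairs, most stylistically similar candidate first."""
    target = profile(disputed_text)
    scores = {
        name: cosine_similarity(target, profile(text))
        for name, text in known_texts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy samples, far too short for any real conclusion.
disputed = "the value of the coin is in the proof of work"
candidates = {
    "candidate_a": "the cost of the stamp is in the work of the sender",
    "candidate_b": "it is true that it is hard and that it is slow to mine",
}
for name, score in rank_candidates(disputed, candidates):
    print(name, round(score, 3))
```

A high score means only that the two texts use these words at similar rates. As with The Tale of Genji, a resemblance in style is evidence to be weighed, not proof of identity.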
The Data Era
These days, we often hear about the age of data. I hope to have shown that data utilization does not just mean "doing things that look like data analysis": data is used in a very wide range of fields, in ways that fit the practical needs of each field.
As examples, I introduced how data analysis plays an important role in the long-standing mystery (or debate) surrounding The Tale of Genji and in the debate over the true identity of Shakespeare.
Generally speaking, "The Tale of Genji" does not come across as technological at all. Most of the fields you deal with day to day, and the tasks you are responsible for, are probably closer to technology (or to practical matters) than research on The Tale of Genji is. If data is used even for The Tale of Genji, there must be plenty of room for data utilization in other fields.
What you need to "take on data utilization"
As you can see, we are now in an age where data utilization needs to be considered in every field, but it is unrealistic for every researcher of The Tale of Genji to master Python and become proficient with AWS. We are in an age where everyone uses data, but it is also unrealistic to expect everyone to acquire a full set of data utilization skills.
To put it another way, there are things you need besides a full set of data skills, and these should not be overlooked if you want to produce useful results.
Even without a full set of data analysis skills, the people who best understand what the data is needed for can take on data utilization themselves, provided they have self-service BI tools they can operate on their own and a way to prepare the data needed for analysis themselves (which is often harder than the analysis itself).
That's where our products come in. Our products (DataSpider and HULFT Square) allow you to prepare the data needed for analysis from a wide variety of sources without coding, and also allow for flexible pre-processing. By using these products, you can utilize data on your own or at the initiative of your workplace.
If "data analysis of The Tale of Genji" strikes you as surprising, then the business equivalent is a move your competitors will find unexpected and formidable. There are surely many ways to utilize data, and if competitors think, "I can't believe they're using data analysis there," you naturally have the advantage. Please consider our products.
