Deciphering Edo Period Japanese Scripture with Machine Learning

As human society and technology evolve, language grows and continuously adapts to reflect changes in culture and communication styles. Since the modernization of Japan in the 1900s, the Japanese have, by and large, lost the ability to read kuzushiji, the script used between the 9th and 20th centuries. Over 3 million books, works of literature, and drawings have been preserved to this day, many of them from the Edo period. Only a small fraction have been translated, since few scholars remain who can read these texts.

Because kuzushiji was never standardized, the same word was often written in many styles and formats. There are likely well over 5,000 unique characters in the script, which shows how difficult translation is. With rapid advances in machine learning, specifically in image recognition, a number of researchers and machine learning practitioners have built algorithms that help historians identify text and digitize content, unlocking the history hidden in these documents.

So, how are researchers able to digitize text with machine learning? Developing a machine learning solution starts with the data. The National Institute of Japanese Literature (NIJL) created and released a kuzushiji dataset containing 1 million images of cropped handwritten characters paired with their modern equivalents in kanji (one of the scripts used to write modern Japanese). The dataset was curated by the Center for Open Data in the Humanities (CODH).

I’ll walk through one method of achieving this, which is not unlike how people process images and text. The scope is limited to detecting and recording each character found on a page.

For a machine to translate a whole page of kuzushiji text into modern-day kanji, some preprocessing is needed first to decide whether each marking or object in the image is a kuzushiji character or not. This is similar to when we quickly scan an image with our eyes, figure out whether text is present and, if so, which language it is in. Machine learning systems detect the presence of text on a page by training on examples of the script and learning its visual characteristics. This process is referred to as object detection, and there are multiple ways to achieve it.

One way to do this is to have the model produce a heat map of the page, with areas of high and low intensity. The less intense (darker) areas have a lower probability of being a kuzushiji character and are ignored. The lighter, higher-probability areas are treated as character centers. They are separated out and passed on to the next step: character recognition.
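The idea of picking character centers out of a heat map can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method used by any particular kuzushiji model: the function name `find_centers`, the 0.5 threshold, and the toy 5x5 heat map are all made up for the example.

```python
from typing import List, Tuple


def find_centers(
    heatmap: List[List[float]], threshold: float = 0.5
) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for every cell that is at or above `threshold`
    and strictly greater than all of its neighbours (a local peak).

    Assumption: `heatmap` holds per-pixel probabilities in [0, 1].
    Flat plateaus of equal values are not reported by this simple rule.
    """
    rows = len(heatmap)
    cols = len(heatmap[0]) if rows else 0
    centers = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue  # darker, low-probability area: ignored
            # Collect the up-to-8 surrounding cells, staying inside the grid.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score > n for n in neighbours):
                centers.append((r, c, score))
    return centers


# Toy heat map with two bright spots (hypothetical values).
example_heatmap = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0, 0.0],
    [0.0, 0.2, 0.1, 0.3, 0.1],
    [0.0, 0.0, 0.3, 0.8, 0.2],
    [0.0, 0.0, 0.1, 0.2, 0.1],
]

for row, col, score in find_centers(example_heatmap):
    print(f"center at row={row}, col={col}, probability={score}")
```

In a real pipeline the region around each center would then be cut out of the page image and handed to the character recognition model.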

The last step is to stitch the outputs together: annotate the original historical image with each recognized character in its correct position, and record the characters digitally in reading order.
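Recording characters in reading order can be sketched as a small sorting problem. The example below is a rough illustration, assuming the page is laid out in vertical columns read top to bottom and from right to left, as is traditional for Japanese texts. The `Detection` class, the `reading_order` function, and the `column_width` parameter are hypothetical names introduced for the example, and real pages with irregular layouts would need something more robust.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    char: str  # recognized modern character
    x: float   # horizontal center of the character, in pixels
    y: float   # vertical center of the character, in pixels


def reading_order(detections: List[Detection], column_width: float) -> List[Detection]:
    """Group detections into vertical columns and return them in reading order.

    Assumption: columns run top to bottom and are read right to left.
    A detection joins the current column when its x position is within
    half a column width of the first (rightmost) character in that column.
    """
    columns: List[List[Detection]] = []
    # Walk the page from the right edge to the left edge.
    for det in sorted(detections, key=lambda d: -d.x):
        if columns and abs(columns[-1][0].x - det.x) <= column_width / 2:
            columns[-1].append(det)
        else:
            columns.append([det])
    ordered: List[Detection] = []
    for column in columns:
        # Within a column, read from top (small y) to bottom (large y).
        ordered.extend(sorted(column, key=lambda d: d.y))
    return ordered


# Hypothetical detections, deliberately out of order.
example_detections = [
    Detection("は", 98, 60),
    Detection("い", 100, 10),
    Detection("に", 52, 15),
    Detection("ろ", 102, 35),
    Detection("ほ", 50, 40),
]

text = "".join(d.char for d in reading_order(example_detections, column_width=40))
print(text)
```

The same ordered list can drive the annotation step, since each detection keeps the coordinates at which its label should be drawn on the original image.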

Machine learning image recognition methods are, of course, imperfect and error-prone. But the technology offers huge time savings when it comes to processing and digitizing image collections. Ultimately, machine learning is best used to assist and supplement human expertise and skills. This is especially true in specialty domains where few experts can confidently carry out work such as translating and deriving meaning from classical scripture. I expect this technology to benefit the community of Japanese historians and researchers most by helping them prioritize which works of art and literature should be explored for historical significance.
