Comments: MOJIKYO.

Thanks, L.H. and Small Dragon Lady. I only requested the Tangut stuff a day or two ago. I wish that everyone in my life were so quick.

Posted by Zizka at September 27, 2003 10:43 PM

This connects nicely with your post from a couple of days ago on Unicode. Cho-kanji has about 130,000 characters compared to Unicode's 100,000. A major complaint about Unicode goes back to its beginnings, when it decided not to encode "variant" characters on the grounds that fonts could handle them.
It turns out that approach doesn't work well for Asian languages. The original Unicode Consortium wasn't being terribly sensitive to Asian cultures, especially considering it was trying to impose a universal standard at least partly to make Asian encoding easier.
Unicode has recently come around and is adding variants and historical characters, but is still way behind the Cho-kanji people.
The Cho-kanji project, which is closely tied to the TRON project (http://tronweb.super-nova.co.jp/b-right-vr2intro.html), claims to have all the necessary variant characters that Unicode still lacks.

Posted by Mark S at September 27, 2003 11:52 PM

Here's the direct link to the TRON project, and here's a fascinating post from Mark's blog that shows screen shots of a program that converts from kanji/kana to foreign names, place names, &c. and back:
"Now here's where it gets interesting. I noticed that a lot of these conversions had a place name selection option. This is very useful because place names, especially archaic place names, are difficult in Japanese. It works, as expected, very well with archaic Japanese names and of course modern Japanese names. Then I tried a few foreign names. There are many foreign places which have archaic kanji characters that are not often used in modern Japanese. They generally converted well... Then I tried Seoul. It appears that most of Korea doesn't exist in the Microsoft Japanese IME. First, I had problems even typing the obsolete kanji for Seoul. I had to force it to do that combination. I tried several Korean names with similar results. The only exception I could find was for Pusan. But the kanji for Pusan get used regularly in modern Japanese, so I don't suppose they had a choice but to include it."
No wonder Koreans can be so touchy about their nationhood.

Posted by language hat at September 28, 2003 08:01 AM

This is so excellent... thank you!

Something that's bugged me for ages is that Nagai Kafu's Bokuto Kidan (A Strange Tale from East of the River) uses an obsolete kanji for the "boku" character. Amazon lists the book as "墨東綺譚" but the first character is a greatly simplified version of the original that appears on the cover and title page of Kafu's novel. Now it looks like I might be able to find the correct "boku" character.

When I read your post, I immediately wondered what use the Mojikyo fonts would be, given that the IME only supports a limited number of characters. But then I read the instructions for installing and using the Mojikyo Character Map:

http://www.mojikyo.org/html/download/cmap/jack/Mojikyo_EN.html

I can't wait to try this. Three cheers for LH and xiaolongnu!

Posted by Jonathon Delacour at September 28, 2003 08:15 AM

Yeah, you were first on my list of "several Languagehat readers"—I figured you'd show up with pleased exclamations! Let me know how it works once you've installed and used it.

Posted by language hat at September 28, 2003 11:23 AM