Comments: VANISHED SIMPLIFICATIONS.

Wow, so I'm like, one of the last people to have studied Mandarin when the simplification was in force. Damn shame I remember hardly three words of the language...

Posted by Dorothea Salo at January 25, 2005 10:06 PM

I still have to fake it on simplified characters. I studied in Taiwan. Glad to see that the last wave is being rescinded. I believe I have a C-E or E-C dictionary using the obsolete simplified forms; a collector's item, I guess. I hated using it and rarely did.

Posted by John Emerson at January 25, 2005 10:46 PM

Heh, I started learning Chinese toward the end of that period, and I definitely remember learning some of those (#1, 2, 4, 5, 8, 9, 10, 11, 24, 25, 26, 27, and 31). I've seen 14 and 28 as popular simplifications (grocery store shelf labels, that kind of thing).

I should read up more on the initial simplification process of the 1950s. My research is in a time period (5th and 6th centuries) whose inscriptions are notorious for being full of alternate (or wrong, depending on your perspective) characters. Many of these were actually used in the simplification process, and the upshot is that one of the best dictionaries of stele-alternative characters (碑别字) from the medieval period was actually compiled by one of the leaders of the simplification project.

Posted by xiaolongnu at January 26, 2005 03:18 PM

Wow Hat, thanks very much for this article. I've actually been looking for this very information for a couple of months since I heard of the aborted simplifications in an obscure scholarly Australian book on the Chinese language. I was looking for the exact documents which detailed the simplifications from the 1950s and 1960s or just the lists or traditional-to-simplified mappings. Even information on the non-aborted simplifications is unexpectedly hard to find on the internet. (Info on the less well-known Japanese simplifications was much easier to find.) I asked on the English and Chinese Wikipedias and possibly on the Qalam mailing list, but to no avail. Thanks again!

Posted by Andrew Dunbar at January 27, 2005 02:28 AM

Glad to be of service!

Posted by language hat at January 27, 2005 09:06 AM

A commenter going by "huixing" just stopped by the page and dropped off links to a Japanese gallery of photographs of those simplified characters "in the wild". This includes some of those tablets xiaolongnv mentioned, as well as popular simplifications in China, Japan, and other parts of Asia. Also provided are explanations and Unicode information (or lack of it). Thought they might be of interest to some of you here:
   減 画 略 字: Four pages of simplifications
   音符書換字: A page of partial phonetic substitutions
There's other stuff there, too: extra strokes, modified character forms. Fascinating.

Posted by zhwj at January 27, 2005 10:44 PM

Hello LH hisashiburi,
I find myself wincing a bit when I read "mostly in the form of nationalism on the part of Japan". I would acknowledge that there is a fair bit of nationalism, but Han unification has some problematic points outside of Japanese nationalism. There were a number of interesting URLs that I have long since lost, but this article notes some of them. Here is a second article
link

It's a really fascinating area where computer font architecture and cultural patterns clash. I have to express some sympathy for the Japanese position. My name has a simplified kanji as its first kanji, but I think the 'real' kanji is its non-simplified form. So I can understand the reticence that Japanese speakers must feel. I don't know if Cheong's article is correct, but having had to deal with student papers that change English fonts in the middle of the paper (or the middle of a word), I can say the effect is painful, so I don't think it is simply Japanese nationalism.

Posted by joe tomei at January 31, 2005 05:20 AM

Joe is referring to this sentence from the Blogchina article: "The sinograph section of Unicode has always been a hotbed of political controversy, mostly in the form of nationalism on the part of Japan and the traditional-simplified struggle among China and her outlying regions." I wondered when I read that what exactly was being referred to.

Posted by language hat at January 31, 2005 08:17 AM

Some more low-hanging fruit from Google searches. Many of the results of a search for Unicode for kanji or Japanese give ringing defenses of Unicode. One of the alternate proposals was TRON, and these links tell their side of the story.
here, here and here

This also encouraged me to check out Jonathan Delacour's The Heart of Things, because his mention of Mojikyo got me very interested in all of this, but as a Machead, I was stymied, so I was quite pleased to see he has moved over to the light (just kidding, some of my best friends use Windows).

Posted by joe tomei at January 31, 2005 09:53 AM

There was just a story on the Japanese news about counterfeit beer coupons that were printed in China. They were indistinguishable from the originals except that they used one kanji that is used in China but not in Japan. I haven't found a news story with an image, but this link has some images of the points that differentiate the fake from the real (in Japanese).

Posted by joe tomei at February 4, 2005 08:51 AM

Great story! And thanks for bringing this thread back to my attention, since I finally got around to checking the TRON links in your previous comment. Very interesting stuff. From the first link:

Unicode is meeting fierce opposition in East Asia. This is not just because the people of East Asia do not want American computer vendors deciding what characters they can and cannot use on their computer systems. That, of course, quite understandably exists. More importantly, it is because "Unicode does not answer the needs of the on-line East Asian societies of the future." For example, Unicode is inadequate for creating on-line digital libraries. Such libraries need unabridged character sets that include every character that has ever been used. Unicode does not provide that. Unicode-based systems also cannot be used for writing the personal and place names of many people and places in Japan, so they cannot be used for computerizing government offices in Japan. Because of its limited nature, Unicode likewise cannot be used for computerizing print shops and publishing offices in East Asia. So in the final analysis, Unicode is really of more benefit to computers themselves and their manufacturers, rather than to computer users in East Asia. However, considering its design goals, it is hardly surprising that it turned out to be such.

From the second:

Now the non-specialist reading this is probably saying to himself/herself that the above-mentioned surrogate mechanism has solved the problem of an insufficient number of character code points in Unicode, so it should be clear sailing from here on for Unicode. However, there is one very huge problem that has been created as a result of this surrogate pairs mechanism, which coincidentally seems to violate one of the basic tenets of programming, Occam's Razor: "never multiply entities unnecessarily." Since each new surrogate pair character code point is created by multiplying a two-byte code by a two-byte code, the result is a four-byte code, i.e., it's 32 bits long, which requires twice as much disk space to store as the 16-bit characters codes on the Unicode Basic Multilingual Plane. Accordingly, the new and improved Unicode has essentially become an inefficient 32-bit character encoding system, since 94 percent of the grand total of 1,114,112 character code points (1,048,576) are encoded with 32-bit encodings.

Posted by language hat at February 4, 2005 09:37 AM
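
For anyone who wants to check the surrogate-pair arithmetic in the second TRON quote above, here is a minimal Python sketch. The function name to_surrogates and the sample character U+20000 are my own choices for illustration, not anything from the TRON pages. It shows how UTF-16 splits a code point above U+FFFF into two 16-bit units, and it recomputes the "94 percent" figure, which counts code point slots rather than characters people actually type.

```python
# Sketch of UTF-16 surrogate pairs; to_surrogates is a hypothetical helper name.

def to_surrogates(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into a
    (high, low) pair of 16-bit UTF-16 code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP use surrogate pairs")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low

# U+20000 is the first code point of CJK Unified Ideographs Extension B.
high, low = to_surrogates(0x20000)
print(f"U+20000 -> {high:04X} {low:04X}")

# Storage cost in UTF-16: 2 bytes inside the BMP, 4 bytes outside it.
bmp_bytes = len("\u4e2d".encode("utf-16-be"))       # U+4E2D, a common BMP character
supp_bytes = len(chr(0x20000).encode("utf-16-be"))
print(bmp_bytes, supp_bytes)

# The figures from the quote: 1,114,112 code points in total, of which
# 1,048,576 lie outside the Basic Multilingual Plane.
total = 0x110000
supplementary = 0x100000
share = supplementary / total
print(total, supplementary, f"{share:.1%}")
```

So the numbers in the quote do add up, as far as the count of code point slots goes.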