VANISHED SIMPLIFICATIONS.

A Blogchina article discusses the 1977 round of simplified Chinese characters, which was rescinded in 1986. The details of the characters won’t mean much to non-readers of Chinese, but the Unicode situation might:

Scholars using Unicode will find themselves able to discuss the length and breadth of China’s Glorious Five-Thousand Years of history, and yet there is one period about which they must remain silent: the vast majority of the characters in the 1977 simplification draft are simply not present. The first sixteen characters in the quiz are all present in a full Unicode font, although 13-16 are in the Extension space. The remaining sixteen I pieced together with eudcedit.
The sinograph section of Unicode has always been a hotbed of political controversy, mostly in the form of nationalism on the part of Japan and the traditional-simplified struggle among China and her outlying regions. I suspect our situation here is much the same, whether through active efforts to exclude the characters, or a simple indifference. With electronic composition and transmission, scanning and indexing integral parts of current-day research, this decade-long orthographic experiment is as if it had never even existed.
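
For anyone who wants to poke at the encoding question themselves, here is a minimal sketch (in Python, my own choice of tool; the article itself gives no code) that asks whether a given sinograph has an assigned Unicode code point at all, and whether it sits in the Basic Multilingual Plane or out in the supplementary “Extension” space the article mentions; an unassigned character can only be faked privately, e.g. with eudcedit:

    # Minimal illustrative sketch (Python assumed; not from the article).
    # Recent Python 3 returns algorithmic names for CJK unified ideographs,
    # so unicodedata.name() only fails for code points Unicode has not assigned.
    import unicodedata

    def describe(ch: str) -> str:
        cp = ord(ch)
        try:
            name = unicodedata.name(ch)   # ValueError if the code point is unassigned
        except ValueError:
            return f"U+{cp:04X}: unassigned; such a glyph can only be drawn privately (e.g. with eudcedit)"
        plane = "the BMP" if cp <= 0xFFFF else "a supplementary plane (the Extension space)"
        return f"U+{cp:04X} {name}, in {plane}"

    print(describe("\u4e66"))        # 书, a first-round simplification, in the BMP
    print(describe("\U00020000"))    # first CJK Extension B ideograph, beyond the BMP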

Thanks go to Nelson (whose blog, now unfortunately on hiatus, inspired a lengthy LH post on the name Vietnam) for the link.

Comments

  1. Wow, so I’m like, one of the last people to have studied Mandarin when the simplification was in force. Damn shame I remember hardly three words of the language…

  2. I still have to fake it on simplified characters. I studied in Taiwan. Glad to see that the last wave is being rescinded. I believe I have a C-E or E-C dictionary using the obsolete simplified forms; a collectors’ item, I guess. I hated using it and rarely did.

  3. xiaolongnu says

    Heh, I started learning Chinese toward the end of that period, and I definitely remember learning some of those (#1, 2, 4, 5, 8, 9, 10, 11, 24, 25, 26, 27, and 31). I’ve seen 14 and 28 as popular simplifications (grocery store shelf labels, that kind of thing).
    I should read up more on the initial simplification process of the 1950s. My research is in a time period (5th and 6th centuries) whose inscriptions are notorious for being full of alternate (or wrong, depending on your perspective) characters. Many of these were actually used in the simplification process, and the upshot is that one of the best dictionaries of stele-alternative characters (碑别字) from the medieval period was actually compiled by one of the leaders of the simplification project.

  4. Wow Hat, thanks very much for this article. I’ve actually been looking for this very information for a couple of months, ever since I heard of the aborted simplifications in an obscure scholarly Australian book on the Chinese language. I was looking for the exact documents which detailed the simplifications from the 1950s and 1960s, or just the lists of traditional-to-simplified mappings. Even information on the non-aborted simplifications is unexpectedly hard to find on the internet. (Info on the less well-known Japanese simplifications was much easier to find.) I asked on the English and Chinese Wikipedias and possibly on the Qalam mailing list, but to no avail. Thanks again!

  5. Glad to be of service!

  6. A commenter going by “huixing” just stopped by the page and dropped off links to a Japanese gallery of photographs of those simplified characters “in the wild”. This includes some of those tablets xiaolongnv mentioned, as well as popular simplifications in China, Japan, and other parts of Asia. Also provided are explanations and Unicode information (or lack of it). Thought they might be of interest to some of you here:
       減画略字: Four pages of simplifications
       音符書換字: A page of partial phonetic substitutions
    There’s other stuff there, too: extra strokes, modified character forms. Fascinating.

  7. Hello LH, hisashiburi (it’s been a while),
    I find myself wincing a bit when I read “mostly in the form of nationalism on the part of Japan”. I would acknowledge that there is a fair bit of nationalism, but Han unification has some problematic points beyond Japanese nationalism. There were a number of interesting URLs that I have long since lost, but this article notes some of them. Here is a second article:
    link
    It’s a really fascinating area where computer font architecture and cultural patterns clash. I have to express some sympathy for the Japanese position. My name has a simplified kanji as its first kanji, but I think the ‘real’ kanji is its non-simplified form. So I can understand the reticence that Japanese speakers must feel. I don’t know if Cheong’s article is correct, but having had to deal with student papers that change English fonts in the middle of the paper (or in the middle of a word), I know the effect is painful, so I don’t think it is simply Japanese nationalism.

  8. Joe is referring to this sentence from the Blogchina article: “The sinograph section of Unicode has always been a hotbed of political controversy, mostly in the form of nationalism on the part of Japan and the traditional-simplified struggle among China and her outlying regions.” I wondered when I read that what exactly was being referred to.

  9. Some more low-hanging fruit from Google searches. Many of the results of a search for Unicode together with kanji or Japanese give ringing defenses of Unicode. One of the alternative proposals was TRON, and these links tell their side of the story:
    here, here and here
    This also encouraged me to check out Jonathan Delacour’s The Heart of Things, because his mention of Mojikyo got me very interested in all of this, but as a Machead I was stymied, so I was quite pleased to see he has moved over to the light (just kidding, some of my best friends use Windows).

  10. There was just a story on the Japanese news about counterfeit beer coupons that were printed in China. They were indistinguishable from the originals except that they used one kanji that is used in China but not in Japan. I haven’t found a news story with an image, but this link has some images of the points that differentiate the fake from the real (in Japanese).

  11. Great story! And thanks for bringing this thread back to my attention, since I finally got around to checking the TRON links in your previous comment. Very interesting stuff. From the first link:
    Unicode is meeting fierce opposition in East Asia. This is not just because the people of East Asia do not want American computer vendors deciding what characters they can and cannot use on their computer systems. That, of course, quite understandably exists. More importantly, it is because “Unicode does not answer the needs of the on-line East Asian societies of the future.” For example, Unicode is inadequate for creating on-line digital libraries. Such libraries need unabridged character sets that include every character that has ever been used. Unicode does not provide that. Unicode-based systems also cannot be used for writing the personal and place names of many people and places in Japan, so they cannot be used for computerizing government offices in Japan. Because of its limited nature, Unicode likewise cannot be used for computerizing print shops and publishing offices in East Asia. So in the final analysis, Unicode is really of more benefit to computers themselves and their manufacturers, rather than to computer users in East Asia. However, considering its design goals, it is hardly surprising that it turned out to be such.
    From the second:
    Now the non-specialist reading this is probably saying to himself/herself that the above-mentioned surrogate mechanism has solved the problem of an insufficient number of character code points in Unicode, so it should be clear sailing from here on for Unicode. However, there is one very huge problem that has been created as a result of this surrogate pairs mechanism, which coincidentally seems to violate one of the basic tenets of programming, Occam’s Razor: “never multiply entities unnecessarily.” Since each new surrogate pair character code point is created by multiplying a two-byte code by a two-byte code, the result is a four-byte code, i.e., it’s 32 bits long, which requires twice as much disk space to store as the 16-bit character codes on the Unicode Basic Multilingual Plane. Accordingly, the new and improved Unicode has essentially become an inefficient 32-bit character encoding system, since 94 percent of the grand total of 1,114,112 character code points (1,048,576) are encoded with 32-bit encodings.
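    To make that surrogate-pairs passage concrete, here is a quick sketch (in Python, which is just my choice; the TRON article gives no code, and the function name is mine) of how UTF-16 builds the two 16-bit units for a character beyond the Basic Multilingual Plane, such as the CJK Extension B ideographs mentioned earlier in the thread:

        # Minimal sketch (assumptions: Python 3; helper name is hypothetical).
        # A code point above U+FFFF is offset by 0x10000 and split into two
        # 10-bit halves, carried by a high (D800-DBFF) and a low (DC00-DFFF)
        # surrogate code unit.
        def utf16_surrogate_pair(code_point):
            if code_point <= 0xFFFF:
                raise ValueError("BMP code points need no surrogate pair")
            offset = code_point - 0x10000        # 20-bit value
            high = 0xD800 + (offset >> 10)       # top 10 bits
            low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
            return high, low

        ch = "\U00020000"                        # first CJK Extension B ideograph
        high, low = utf16_surrogate_pair(ord(ch))
        print(f"U+{ord(ch):X} -> {high:04X} {low:04X}")        # U+20000 -> D840 DC00

        # The storage point the quote is making: a BMP ideograph takes two bytes
        # in UTF-16, while a supplementary-plane ideograph takes four.
        print(len("漢".encode("utf-16-le")), len(ch.encode("utf-16-le")))   # 2 4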
