Annotator for big5 Chinese texts

This page provides access to a CGI script that annotates Chinese characters with JavaScript popups, textual translations, or phonetics. Paste some big5-encoded text into this form, select your options, and go. You will need a browser capable of displaying big5-encoded Chinese characters.

The script uses a dictionary file from the CEDICT online dictionary project (I'm currently using the 1 November 1998 update) for its pinyin lookup and translations, augmented by the PY input-method file from cxterm (a Chinese terminal emulator for Unix/X) as a pinyin reference for the increasingly few characters not in CEDICT. Audio links are to the Bell Labs text-to-speech CGI script, and dictionary links are to an index of Web dictionaries. A stream of characters is subdivided into "words" solely on the basis of the longest-possible dictionary lookup, so there's no reason the results should actually be correct... but I find them useful, anyway.
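To give a rough idea of what "longest-possible dictionary lookup" means, here is a minimal sketch in Python. It is not the Perl script itself: the function name `segment`, the tiny `TOY_DICT` table, and its glosses are all made up for illustration, and the sketch assumes the text has already been decoded from big5 into a Unicode string (for instance with `raw_bytes.decode("big5")`).

```python
# Hypothetical toy dictionary for illustration only; the real script
# reads its entries from the CEDICT file instead.
TOY_DICT = {
    "中": "zhong1: middle",
    "國": "guo2: country",
    "人": "ren2: person",
    "中國": "Zhong1 guo2: China",
    "中國人": "Zhong1 guo2 ren2: Chinese person",
}


def segment(text, dictionary):
    """Split text into "words" by greedy longest-match dictionary lookup.

    At each position, try the longest candidate first and shrink until a
    dictionary entry is found. A character with no multi-character match
    is emitted on its own, whether or not it is in the dictionary.
    """
    # No candidate longer than the longest dictionary key can match.
    max_len = max((len(key) for key in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        match = text[i]  # fallback: a single character
        for length in range(min(max_len, len(text) - i), 1, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        words.append(match)
        i += len(match)
    return words


if __name__ == "__main__":
    for word in segment("中國的人", TOY_DICT):
        # Look up each word; characters missing from the toy table get "?".
        print(word, "->", TOY_DICT.get(word, "?"))
```

Because the match is purely greedy, a long entry always wins over the shorter words it contains, even where the shorter reading is the right one in context. That is exactly why the results carry no guarantee of being correct.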

The Perl source for this script is available -- be warned, it's still very messy. If you want to run this script yourself, drop me a line and I'll clean it up a bit first, if I can find the time. If you want to develop something else based on it, please go ahead. For a description of the basics of the tooltip JavaScript code, see this article.

It would probably be reasonably straightforward to merge some of this code with James Marshall's CGI proxy script to make a Chinese-character-translating proxy. That would be very cool and I might do it some day, but I won't have the bandwidth to be able to run such a proxy at this site.

Chinese grammar notes

The original application for this CGI script, and motivation for developing it, was to give me translations for the characters in a set of grammar notes for the first-year Chinese student (that's me). These notes can be found here with a simple form interface, or here with the advanced form.

I got the grammar notes (and some useful revision at the same time) by reading through Hongchu Fu's online grammar notes and typing lots of the Book 1 notes into the NJStar word processor, nearly verbatim. Any errors will doubtless be mine.

Also included is a list of measure-words. I make no claims for its usefulness and it certainly isn't complete, but you can always just print it out and use it to amaze your friends.

Chris Cannam, 1998