The Unicode Project

Encoding-Aware Scamper

Scamper can display pages in 30 different encodings. Here you see a multilingual conference announcement from Unicode Inc. (

Scamper, showing a page encoded in UTF-8
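
To illustrate what encoding awareness involves, here is a minimal Python sketch, not Scamper's actual Squeak code. It decodes the raw bytes of a page using a declared charset label and falls back to UTF-8 with replacement characters when the label is unknown or the bytes do not fit. The function name `decode_page` and the fallback policy are assumptions made for this example.

```python
def decode_page(raw: bytes, charset: str = "utf-8") -> str:
    """Decode page bytes with the declared charset (hypothetical helper).

    Falls back to UTF-8 with replacement characters if the charset
    label is unknown or the bytes are invalid in that charset.
    """
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        return raw.decode("utf-8", errors="replace")


# The same decoder handles pages delivered in different encodings.
russian = "Привет".encode("koi8-r")
japanese = "日本語".encode("shift_jis")
print(decode_page(russian, "koi8-r"))
print(decode_page(japanese, "shift_jis"))
```

A browser would normally take the charset label from the HTTP headers or from the page itself; that step is left out here.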

The Unicode Glyph Browser

The Unicode Glyph Browser is an application that shows the character information for every Unicode code point. This information is read from files that are available from Unicode Inc. To use the Unicode Glyph Browser, you have to download these files and create memory-resident indices from them.

Here you see the information for the glyph at code point U+2295.

The Unicode Glyph Browser, showing information for code point U+2295
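
The idea of a memory-resident index over the Unicode data files can be sketched in Python as follows. This is an illustration only, not the browser's Squeak implementation. It assumes the semicolon-separated field layout of UnicodeData.txt (code point, name, general category, combining class, bidi class, ...); the names `CharRecord` and `build_index` are invented for this example, and the two sample lines stand in for the downloaded file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharRecord:
    """A few of the fields found on one UnicodeData.txt line."""
    code_point: int
    name: str
    general_category: str
    bidi_class: str


def build_index(lines):
    """Build a dict from code point to CharRecord (hypothetical helper)."""
    index = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(";")
        code_point = int(fields[0], 16)
        index[code_point] = CharRecord(
            code_point, fields[1], fields[2], fields[4]
        )
    return index


# Two sample lines in UnicodeData.txt format, used instead of the real file.
SAMPLE = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "2295;CIRCLED PLUS;Sm;0;ON;;;;;N;;;;;",
]

index = build_index(SAMPLE)
record = index[0x2295]
print(f"U+{record.code_point:04X} {record.name} ({record.general_category})")
```

With the real file, `build_index(open(path, encoding="ascii"))` would build the same structure, where `path` is wherever the downloaded file was saved.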

The OpenType Font File Reader

This is an MVC application that allows you to read the contents of TrueType and OpenType files (file extensions: ttf and ttc). Regrettably, the application is incomplete, and I cannot currently show some of the newer and more interesting features of OpenType fonts, such as glyph substitution tables (which are needed to implement really good support for scripts that require shaping).

Here you see the font viewer opened on the font Shruti.ttf. This is a fully Unicode-conformant font for displaying the Indic script Gujarati.

The OpenType font browser, showing the glyph at code point U+0AD0
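
As a rough illustration of what reading a ttf or ttc file involves, the following Python sketch parses the table directory at the start of a font and the header of a font collection. It is not the Squeak reader described above. It assumes the standard big-endian sfnt layout: a 12-byte header followed by 16-byte table records, and a `ttcf` header listing one offset per font. The function names are invented for this example, and the demo uses synthetic bytes rather than a real font file.

```python
import struct


def font_offsets(data: bytes):
    """Return the start offset of each font in the file (hypothetical helper)."""
    if data[:4] == b"ttcf":
        # TTC header: tag, major version, minor version, number of fonts.
        _major, _minor, num_fonts = struct.unpack_from(">HHI", data, 4)
        return list(struct.unpack_from(f">{num_fonts}I", data, 12))
    return [0]  # a plain ttf file holds a single font at offset 0


def read_table_directory(data: bytes, offset: int = 0):
    """Map each table tag to its (offset, length) pair."""
    _version, num_tables = struct.unpack_from(">4sH", data, offset)
    tables = {}
    first_record = offset + 12  # skip searchRange, entrySelector, rangeShift
    for i in range(num_tables):
        tag, _checksum, table_offset, length = struct.unpack_from(
            ">4sIII", data, first_record + 16 * i
        )
        tables[tag.decode("latin-1")] = (table_offset, length)
    return tables


# Synthetic font header with two table records (no real table data).
demo = struct.pack(">4sHHHH", b"\x00\x01\x00\x00", 2, 32, 1, 0)
demo += struct.pack(">4sIII", b"GSUB", 0, 44, 10)
demo += struct.pack(">4sIII", b"cmap", 0, 54, 20)

tables = read_table_directory(demo, font_offsets(demo)[0])
print(sorted(tables))
print("has glyph substitution table:", "GSUB" in tables)
```

A font that supports glyph substitution carries a `GSUB` table, so checking the directory for that tag is a cheap first test before attempting to parse the table itself.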

The Language Dictionaries

Many electronic dictionaries are available on the Internet, and the availability of these dictionaries is one of the most convincing reasons to implement multilingual software.

The Japanese Multilingual Dictionary

The Japanese Multilingual Dictionary from Monash University is one of the most ambitious projects to create electronic dictionaries. The dictionary was created by Jim Breen. It can be downloaded from
Here you see a reader for this dictionary. In the lower part of the window, you see a component that is used to enter the search word. Currently, KangXi input is selected. To the left of the "search" button you see the two ideographs of the search word.

Jim Breen's Japanese Dictionary

The Indic Dictionaries

The Indic Institute of Information Technology maintains a page where you can find some dictionaries (in ISCII encoding). Currently I have Squeak readers for two of these dictionaries: the English-Telugu Dictionary and the English-Hindi Dictionary.

This is the start page of the English-Telugu Dictionary from the Indic Institute of Information Technology:

English-Telugu Dictionary

and here you see the entry for the English word 'squeak':

English-Telugu Dictionary: The entry for 'squeak'

An English-Hindi Dictionary is also available. It is very similar to the English-Telugu Dictionary but, of course, uses a different data file.

English-Hindi Dictionary: The entry for 'squeak'