Charset

http://www.nue.ci.i.u-tokyo.ac.jp/~duc/vime/
http://vietime.sourceforge.net/
http://vietunicode.sourceforge.net/download/vietime/
http://www.ohloh.net/projects/tinymcevim
http://www.codeproject.com/KB/scripting/VietUni2.aspx
http://vietpad.sourceforge.net/
http://www.scheduleworld.com/i18n.html
http://www.microsoft.com/windows/ie/ie6/downloads/recommended/ime/default.mspx

web how to type vietnamese
how to type unicode character into a web form
support for unicode character in firefox.
browser form unicode

AVIM. An input method editor (IME) for Vietnamese. Type fully-accented Vietnamese directly into any web page or dialog box using standard keyboard.

Vietnamese Language Pack
Vietnamese Portable
Vietnamese Dictionary for spellchecking with a wordlist based on Ho Ngoc Duc Free Vietnamese Dictionary Project
Mudim

Investigate Yahoo mail
i18n + FCKEditor
i18n video on YouTube


A character set can be alphabetic, punctuation, arithmetic, or specific to a discipline or vertical market (e.g. proofreading symbols, business symbols).

Characters may be composed of other characters.
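
A minimal Python sketch of composition, using only the standard unicodedata module. The Vietnamese letter ế is chosen here purely as an illustration; the variable names are not from the original notes.

```python
import unicodedata

composed = "\u1ebf"  # LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE

# NFD splits the precomposed letter into a base letter plus combining marks.
decomposed = unicodedata.normalize("NFD", composed)
code_points = [f"U+{ord(ch):04X}" for ch in decomposed]

# NFC recombines the sequence into the single precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)

print(code_points)             # base letter followed by its combining marks
print(recomposed == composed)  # the round trip restores the original
```

Both forms display the same way, so comparing user input usually means normalizing both sides to the same form first.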

CCS = Coded Character Set
CEF = Character Encoding Form
CES = Character Encoding Scheme
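
A rough Python sketch of the three layers, assuming the usual Unicode reading of the terms: the CCS assigns a number (code point) to a character, the CEF turns that number into code units, and the CES serializes code units into bytes. The sample character is an arbitrary choice.

```python
ch = "\u1ebf"  # Vietnamese letter used only as sample data

# CCS: the character is assigned one integer, its code point.
code_point = ord(ch)

# CEF + CES for UTF-8: 8-bit code units, so units and bytes coincide.
utf8_bytes = ch.encode("utf-8")

# CEF for UTF-16: one 16-bit code unit for this character.
# CES: the same code unit serialized big-endian or little-endian.
utf16_be = ch.encode("utf-16-be")
utf16_le = ch.encode("utf-16-le")

print(hex(code_point))
print(utf8_bytes.hex(" "))
print(utf16_be.hex(" "), "|", utf16_le.hex(" "))
```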

  1. Many character sets exist.
  2. For one character set, there may be many encoding schemes.
  3. Given just bytes, the character set and the encoding scheme cannot be determined.
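
Point 3 can be illustrated with a short Python sketch: the same three bytes decode without error under several encodings, and nothing in the bytes themselves says which reading was intended. The byte string and the choice of encodings are illustrative assumptions.

```python
data = b"\xe1\xba\xbf"

# Each decode succeeds, but each yields different text.
as_utf8 = data.decode("utf-8")      # one Vietnamese letter
as_latin1 = data.decode("latin-1")  # three Western European characters
as_cp1251 = data.decode("cp1251")   # three Cyrillic characters

for label, text in [("utf-8", as_utf8), ("latin-1", as_latin1), ("cp1251", as_cp1251)]:
    print(label, len(text), [f"U+{ord(c):04X}" for c in text])
```

This is why the encoding has to be declared out of band, as in the declarations that follow.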

http://iana.org/assignments/character-sets

Content-Type: text/html; charset=UTF-8
<meta http-equiv="Content-type" content="text/html; charset=UTF-8" />
<?xml version="1.0" encoding="UTF-8" ?>
@charset "UTF-8"; /* Only used in the first line of external style sheets */
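
On the receiving side, the charset parameter has to be pulled out of the Content-Type value. A minimal Python sketch using the standard email.message module; the helper name charset_from_content_type is hypothetical.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the charset parameter of a Content-Type value, lowercased."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns `default` when no charset is declared.
    return msg.get_content_charset(default)


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html", default="unknown"))
```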

Declaring the character set of a linked document:

<link title="Arabic text" type='text/html' charset='ISO-8859-6' rel='alternate' href='arabic.html'>
<a href='...' charset='UTF-8'>Unicode</a>

The charset on a link can become incorrect if the document's encoding on the server changes. The encoding declared by

<meta charset=...>

is unknown until the statement is processed.
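
Because the declaration sits inside the document it describes, a reader has to scan the first bytes as ASCII-compatible data before it knows how to decode the rest. A simplified Python sketch of that idea follows; the function name, the regular expression, and the 1024-byte limit are assumptions, and this is not the full prescan algorithm that browsers implement.

```python
import re

# Matches both <meta charset=...> and the http-equiv form with charset= in content.
META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def sniff_meta_charset(raw, limit=1024):
    """Look for a meta charset declaration in the first `limit` raw bytes."""
    match = META_CHARSET.search(raw[:limit])
    return match.group(1).decode("ascii").lower() if match else None


page = b'<html><head><meta charset="UTF-8"><title>x</title></head></html>'
print(sniff_meta_charset(page))
```

Only once the label is found can the whole byte stream be decoded, which is why the declaration should appear as early in the document as possible.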
@font-face { font-family: "Tex"; src:url(ftp://.../path/file.ttf); unicode-range: U+??, U+900-97f; }
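
The unicode-range value above mixes a wildcard (U+??, meaning U+00 through U+FF) with an explicit interval. A small Python sketch of how such a value could be expanded and tested against a character; both function names are hypothetical and the parser handles only the forms shown here.

```python
def parse_unicode_range(value):
    """Expand a CSS unicode-range value into (low, high) code point pairs."""
    ranges = []
    for token in value.split(","):
        body = token.strip()[2:]  # drop the leading "U+"
        if "-" in body:
            low, high = body.split("-")
            ranges.append((int(low, 16), int(high, 16)))
        elif "?" in body:
            # Each "?" stands for any hex digit: 0 at the low end, F at the high end.
            ranges.append((int(body.replace("?", "0"), 16),
                           int(body.replace("?", "F"), 16)))
        else:
            ranges.append((int(body, 16), int(body, 16)))
    return ranges


def covers(ranges, ch):
    """Return True if the character's code point falls in any range."""
    return any(low <= ord(ch) <= high for low, high in ranges)


ranges = parse_unicode_range("U+??, U+900-97f")
print(ranges)
print(covers(ranges, "\u00e9"), covers(ranges, "\u0915"), covers(ranges, "\u1ebf"))
```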

CSS allows control of the type of quote to use according to language:

*[lang|=fr] { quotes: '\ab\a0' '\a0\bb' } /* guillemets with no-break spaces */
q:before { content: open-quote }
q:after { content: close-quote }

CSS2 text-transform: uppercase, lowercase, capitalize, none, inherit

list-style-type
writing-mode: tb-rl; /* top to bottom, right to left */

An annotation is a run of smaller characters placed above or below the base text. It is used in Japanese to give the pronunciation of Kanji characters (furigana). See the W3C Ruby module for details.
