GB2312 is the registered internet name for a key official character set of the People's Republic of China, used for simplified Chinese characters. GB abbreviates Guojia Biaozhun (国家标准), which means national standard in Chinese.
GB2312 (1980) has been superseded by GBK and GB18030, which include additional characters, but GB2312 is nonetheless still in widespread use.
While GB2312 covers 99.75% of the characters used for Chinese input, historical texts and many names remain out of scope. GB2312 includes 6,763 Chinese characters (on two levels: the first is arranged by reading, the second by radical then number of strokes), along with symbols and punctuation, Japanese kana, the Greek and Cyrillic alphabets, Zhuyin, and a double-byte set of Pinyin letters with tone marks.
A closely related character set, GB/T 12345, is the traditional-character analogue of GB2312: it replaces the simplified forms with traditional forms. GB-encoded fonts often come in pairs, one with the GB2312 (jianti) character set and the other with the GB/T 12345 (fanti) character set.
The rows (numbered from 1 to 94) contain characters as follows:
Rows 10-15 and 88-94 are unassigned.
For Chinese text, GB2312 is more storage efficient than UTF-8: each Chinese character takes exactly two bytes in GB2312, while UTF-8 needs at least three bytes for the same character.
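The size difference can be checked with a short Python sketch. It assumes Python's built-in "gb2312" codec; the helper name encoded_sizes is hypothetical and used only for illustration.

```python
def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (bytes in GB2312, bytes in UTF-8) for the given text.

    Hypothetical helper for illustration; raises UnicodeEncodeError if the
    text contains a character outside the GB2312 repertoire.
    """
    gb_bytes = text.encode("gb2312")
    utf8_bytes = text.encode("utf-8")
    return len(gb_bytes), len(utf8_bytes)


if __name__ == "__main__":
    # Two Chinese characters: 2 bytes each in GB2312, 3 bytes each in UTF-8.
    print(encoded_sizes("中文"))
    # ASCII characters take one byte in both encodings.
    print(encoded_sizes("abc"))
```

The saving applies only to the Chinese characters themselves; ASCII text is one byte per character in both encodings.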
A GB2312 code point is written as four decimal digits: the first two digits give the row (1 to 94) and the last two give the position within that row (1 to 94). To map a code point to bytes, add hexadecimal A0 to the row number to form the high byte, and add A0 to the position number to form the low byte.
For example, GB2312 code point 4566 (外, "foreign") has row 45 and position 66. Converting 45 to hexadecimal gives 2D, and adding A0 gives the high byte (2D+A0=CD). Converting 66 to hexadecimal gives 42, and adding A0 gives the low byte (42+A0=E2). The full encoding is therefore 0xCDE2. Because rows 88-94 are unassigned, the high byte falls in the range A1-F7, while the low byte falls in the range A1-FE.
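The mapping described above can be sketched in Python as follows. This is a minimal illustration, not a full codec: the function name code_point_to_bytes is hypothetical, and the only validation is that the row and position lie between 1 and 94.

```python
def code_point_to_bytes(code_point: int) -> bytes:
    """Convert a four-digit decimal GB2312 code point to its two-byte encoding.

    Hypothetical helper for illustration: the first two decimal digits are
    taken as the row, the last two as the position within the row.
    """
    row, position = divmod(code_point, 100)
    if not (1 <= row <= 94 and 1 <= position <= 94):
        raise ValueError(f"row and position must be in 1..94, got {row:02d}{position:02d}")
    # Add 0xA0 to each half to obtain the high and low bytes.
    return bytes([row + 0xA0, position + 0xA0])


if __name__ == "__main__":
    encoded = code_point_to_bytes(4566)
    print(encoded.hex().upper())      # CDE2, matching the worked example
    # Decode with Python's built-in "gb2312" codec to see the character.
    print(encoded.decode("gb2312"))
```

The sketch does not check whether a given row is actually assigned, so a code point in an unassigned row produces two bytes that a decoder may reject.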