BabelStone Fonts


BabelStone Han

Description

BabelStone Han v. 16.0.2 is a free Unicode CJK font with over 60,000 Han characters (hanzi, kanji, hanja), and 64,736 Unicode characters in total. It is a Song/Ming style (宋体/明體) font, with glyphs modelled on the official character forms used in the People's Republic of China, and is primarily intended for writing Modern Standard Chinese, Classical Chinese, and various Sinitic languages and dialects. However, it also includes a large number of characters used for Japanese, Korean, Vietnamese (Chữ Nôm), and Zhuang (Sawndip). For Chinese it covers both traditional form characters and simplified form characters (it has full coverage of all Chinese simplified characters included in Unicode, as well as all corresponding traditional form characters). The font also includes many rare or archaic characters that are not found in most CJK fonts, as well as many characters used for the scholarly transcription of Early Chinese texts written on bone, bronze, wood, bamboo, and silk. The current version of BabelStone Han supports Unicode 16.0, which was released on 10 September 2024.

* A plain text file listing all IDS sequences for CJK unified ideographs in Unicode version 16.0 is available as IDS.TXT.

BabelStone Han was created by merging the "AR PL Mingti2L Big5" font (13,068 CJK Unified Ideographs) and "AR PL SungtiL GB" font (6,763 CJK Unified Ideographs), which were both released by Arphic Technology (文鼎科技) in 2001 under the original Arphic Public License [see below]. The merged font, comprising 14,907 CJK Unified Ideographs, was then converted to Unicode mappings, and has since been expanded to provide full coverage of all characters in the main CJK block (URO), full coverage of CJK Extensions A, D, and I, and partial coverage of CJK Extensions B, C, E, F, G, and H. The original CJK glyphs have been modified where appropriate to match the G-source glyphs given in the Unicode code charts.

This font is under continuous development, as tens of thousands of additional CJK ideographs are scheduled for inclusion in Unicode over the next few years. I intend to add as many of these new characters to BabelStone Han as I can, but I will never be able to include anywhere near all the CJK ideographs that have been and will be added to Unicode, so I am concentrating on G-source additions (i.e. characters used in China). However, if there are any characters that you would like to have added to BabelStone Han urgently, just let me know (email) and I will try to fast-track their inclusion in the font. Please also let me know if you notice any glyph errors.


Summary of Coverage of CJK Unified Ideographs in BabelStone Han
Unicode Block  Total Coverage (Characters)  Total Coverage (Percentage)  G-Source Coverage (Characters)  G-Source Coverage (Percentage)
CJK Unified Ideographs  20,992 / 20,992  100%  20,933 / 20,933  100%
CJK Unified Ideographs Extension A  6,592 / 6,592  100%  6,197 / 6,197  100%
CJK Unified Ideographs Extension B  16,839 / 42,720  39.4%  13,355 / 30,550  43.7%
CJK Unified Ideographs Extension C  2,267 / 4,154  54.5%  1,130 / 1,130  100%
CJK Unified Ideographs Extension D  222 / 222  100%  78 / 78  100%
CJK Unified Ideographs Extension E  3,449 / 5,762  59.8%  2,369 / 2,822  83.9%
CJK Unified Ideographs Extension F  3,177 / 7,473  42.5%  1,309 / 1,309  100%
CJK Unified Ideographs Extension G  3,454 / 4,939  69.9%  1,331 / 2,082  63.9%
CJK Unified Ideographs Extension H  2,379 / 4,192  56.7%  829 / 829  100%
CJK Unified Ideographs Extension I  622 / 622  100%  622 / 622  100%
CJK Compatibility Ideographs  12 / 12  100%  12 / 12  100%
Total  60,005 / 97,680  61.4%  48,165 / 66,564  72.4%

BabelStone Han supports 48,165 out of the 66,564 CJK unified ideographs with a G-source (China) reference. A summary of the remaining 18,399 G-source characters that are not yet available in the font is given in the table below. The 294 GZ characters should be completed in the next major release of BabelStone Han.
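As an illustration of how per-block figures like those in the coverage table can be derived, here is a minimal Python sketch. It assumes you already have the set of code points mapped by a font (for example, read from its character map with a font library) and the set of code points Unicode assigns in each block. The function name `block_coverage` and the tiny input sets are hypothetical; only the block ranges come from this page.

```python
from typing import Iterable, Tuple

# Inclusive block ranges, copied from the repertoire table on this page.
BLOCKS = {
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
    "CJK Unified Ideographs Extension A": (0x3400, 0x4DBF),
    "CJK Unified Ideographs Extension D": (0x2B740, 0x2B81F),
}


def block_coverage(
    covered: Iterable[int], assigned: Iterable[int], block: str
) -> Tuple[int, int, float]:
    """Return (characters covered, characters assigned, percentage) for a block.

    `covered` is the set of code points the font maps (hypothetical input);
    `assigned` is the set of code points Unicode assigns, needed because a
    block may contain unassigned code points.
    """
    lo, hi = BLOCKS[block]
    assigned_in_block = {cp for cp in assigned if lo <= cp <= hi}
    have = assigned_in_block & set(covered)
    total = len(assigned_in_block)
    pct = 100.0 * len(have) / total if total else 0.0
    return len(have), total, pct


# Toy example: four assigned code points, two of them covered by the font.
toy_assigned = range(0x4E00, 0x4E04)
toy_covered = {0x4E00, 0x4E01, 0x3400}
print(block_coverage(toy_covered, toy_assigned, "CJK Unified Ideographs"))
```

The same function applied to the G-source subset of each block (rather than all assigned characters) would give the G-source columns.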


G-Source Characters Not Yet in BabelStone Han
Source Block Count
GHZ (漢語大字典) Ext. B 6,453
GHZR (汉语大字典(第二版)) Ext. G 457
GKX (康熙字典) Ext. B 10,742
GZ (古壮字字典) Ext. G 294
GZJW (殷周金文集成引得) Ext. E 453
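
The counts in this table can be cross-checked against the coverage summary above. The short Python sketch below uses only figures quoted on this page; the variable and function names are illustrative.

```python
from collections import defaultdict

# (source, block, count) rows from the table of missing G-source characters.
MISSING = [
    ("GHZ", "Ext. B", 6453),
    ("GHZR", "Ext. G", 457),
    ("GKX", "Ext. B", 10742),
    ("GZ", "Ext. G", 294),
    ("GZJW", "Ext. E", 453),
]

# (G-source characters in font, G-source characters in block) from the summary table.
G_SOURCE = {
    "Ext. B": (13355, 30550),
    "Ext. E": (2369, 2822),
    "Ext. G": (1331, 2082),
}


def missing_by_block(rows):
    """Sum the missing-character counts per block."""
    totals = defaultdict(int)
    for _source, block, count in rows:
        totals[block] += count
    return dict(totals)


per_block = missing_by_block(MISSING)
for block, (have, total) in G_SOURCE.items():
    # Each block's missing count should equal total minus covered.
    print(block, per_block[block], total - have)
print("Total missing:", sum(per_block.values()))
```

The per-block sums (17,195 for Ext. B, 751 for Ext. G, 453 for Ext. E) match the differences in the summary table and add up to the 18,399 stated above.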


Private Use Area (PUA)

As version 15.1.1 reached the maximum glyph limit of 65,535 glyphs, from version 15.1.3 onwards all PUA characters in the range E080 through F8DF have been removed from the main BabelStone Han font, and are only available in the BabelStone Han PUA font.
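
If you need to know whether a piece of text relies on the PUA characters that were moved out of the main font, a simple range test is enough. The following Python sketch is illustrative only; the helper name `needs_pua_font` is an assumption, not part of any BabelStone tool.

```python
# Code points removed from the main font: U+E080 through U+F8DF inclusive.
REMOVED_PUA = range(0xE080, 0xF8DF + 1)


def needs_pua_font(text: str) -> bool:
    """Return True if the text contains a code point in the removed PUA range."""
    return any(ord(ch) in REMOVED_PUA for ch in text)


# U+E061 is still in the main font; U+E080 is only in the PUA font.
print(needs_pua_font("\uE061"))
print(needs_pua_font("\uE080"))
```

Text for which this returns True would need the BabelStone Han PUA font installed alongside (or instead of) the main font to display those characters.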



License

BabelStone Han is released under the original Arphic Public License, and may be used and distributed freely for personal or commercial purposes. The license also allows you to modify the font in any way you like (e.g. add/remove/modify glyphs or change the character mappings) without seeking my permission, as long as the modified font retains the Arphic Public License. If you do intend to distribute a modified version of BabelStone Han, please ensure that the font name (and all relevant fields in the name table) is changed to something different, so as not to cause confusion between my font and your modified font. In particular, please do not use "BabelStone" in the font name.

In 2010 Arphic Technology released two new public fonts, "AR PLMingU20 Light" (文鼎PL明體U20-L) and "AR PLBaosong2GBK Light" (文鼎PL報宋2GBK), both under a new Arphic Public License. The old license (which BabelStone Han uses) explicitly allows distributors of the original or modified font to make a financial gain ("If you wish, you can charge for this service"), whereas the terms of the new license explicitly restrict distribution of the original or modified font to "only for non-profit purpose". The two new fonts both have complete coverage of the 20,902 original characters in the main CJK block, and it would have saved me hundreds of hours of work to use one of these fonts as the basis for my font, but I prefer to use the old fonts with the original, unrestricted license.



Samples

Traditional Chinese Text


Simplified Chinese Text



Character Repertoire

BabelStone Han includes a total of 64,736 Unicode characters (64,620 named Unicode characters and 116 Private Use Area characters), and has a total of 65,449 glyphs. The font includes coverage for a wide range of Han ideographs and Unicode symbols as listed in the table below. The font is a dual-width font, with a mixture of full-width characters and half-width characters, as indicated in the column headed "Character Width" (a few symbol characters are several half-widths wide).


Unicode Block Range Coverage Character Width Notes
Basic Latin 00000..0007F 97 out of 97 characters Half-width
Latin-1 Supplement 00080..000FF 91 out of 128 characters Half-width
Latin Extended-A 00100..0017F 16 out of 128 characters Half-width
Latin Extended-B 00180..0024F 18 out of 208 characters Half-width
Spacing Modifier Letters 002B0..002FF 25 out of 80 characters Half-width
Combining Diacritical Marks 00300..0036F 18 out of 112 characters Full-width
Tibetan 00F00..00FFF 7 out of 211 characters Full-width Religious and astrological symbols
Mongolian 01800..018AF 3 out of 158 characters Full-width Lozenge-shaped punctuation marks
Combining Diacritical Marks Extended 01AB0..01AFF 1 out of 31 characters Full-width Combining wavy line below for use with Han ideographs
Latin Extended Additional 01E00..01EFF 2 out of 256 characters Half-width
General Punctuation 02000..0206F 111 out of 111 characters Mixed
Superscripts and Subscripts 02070..0209F 27 out of 42 characters Half-width
Currency Symbols 020A0..020CF 3 out of 32 characters Full-width
Combining Diacritical Marks for Symbols 020D0..020FF 3 out of 33 characters Full-width
Number Forms 02150..0218F 49 out of 60 characters Full-width
Arrows 02190..021FF 53 out of 112 characters Full-width
Mathematical Operators 02200..022FF 11 out of 256 characters Full-width
Miscellaneous Technical 02300..023FF 27 out of 256 characters Full-width
Enclosed Alphanumerics 02460..024FF 160 out of 160 characters Full-width
Box Drawing 02500..0257F 128 out of 128 characters Full-width
Block Elements 02580..0259F 32 out of 32 characters Full-width
Geometric Shapes 025A0..025FF 96 out of 96 characters Full-width
Miscellaneous Symbols 02600..026FF 61 out of 256 characters Full-width
Dingbats 02700..027BF 11 out of 192 characters Full-width
Supplemental Arrows-A 027F0..027FF 2 out of 16 characters Full-width
Supplemental Arrows-B 02900..0297F 16 out of 128 characters Full-width
Miscellaneous Mathematical Symbols-B 02980..029FF 10 out of 128 characters Full-width
Miscellaneous Symbols and Arrows 02B00..02BFF 79 out of 253 characters Full-width
Supplemental Punctuation 02E00..02E7F 7 out of 94 characters Full-width
CJK Radicals Supplement 02E80..02EFF 115 out of 115 characters Full-width
Kangxi Radicals 02F00..02FDF 214 out of 214 characters Full-width
Ideographic Description Characters 02FF0..02FFF 16 out of 16 characters Full-width
CJK Symbols and Punctuation 03000..0303F 53 out of 64 characters Full-width
Hiragana 03040..0309F 93 out of 93 characters Full-width Decomposed (NFD) characters and named sequences also supported
Katakana 030A0..030FF 96 out of 96 characters Full-width Decomposed (NFD) characters and named sequences also supported
Bopomofo 03100..0312F 43 out of 43 characters Full-width Poor quality glyphs
Kanbun 03190..0319F 16 out of 16 characters Full-width
Bopomofo Extended 031A0..031BF 32 out of 32 characters Full-width Poor quality glyphs
CJK Strokes 031C0..031EF 39 out of 39 characters Full-width
Katakana Phonetic Extensions 031F0..031FF 16 out of 16 characters Full-width
Enclosed CJK Letters and Months 03200..032FF 192 out of 255 characters Full-width
CJK Compatibility 03300..033FF 73 out of 256 characters Full-width
CJK Unified Ideographs Extension A 03400..04DBF 6,592 out of 6,592 characters Full-width
Yijing Hexagram Symbols 04DC0..04DFF 64 out of 64 characters Full-width
CJK Unified Ideographs 04E00..09FFF 20,992 out of 20,992 characters Full-width
Modifier Tone Letters 0A700..0A71F 8 out of 32 characters Full-width
Latin Extended-D 0A720..0A7FF 1 out of 199 characters Half-width
Private Use Area 0E000..0F8FF 116 out of 6,400 characters Full-width Notes:
* E000..E01D: Song and Yuan dynasty flute notation symbols.
* E020..E04B: Tang and Song dynasty lute notation symbols.
* E050..E07F: Miscellaneous symbols, including the single and triple paluta marks (E05E..E05F), the symbol of chaos (E061), and Daoist Mountain symbols (E070..E074).
* F8E0..F8FD: Unencoded kana letters.
* F8FE..F8FF: Small-form ideographs.
* NB Unencoded ideographs and ideographic components in the range E080 through F8DF are now only available in the BabelStone Han PUA font.
CJK Compatibility Ideographs 0F900..0FAFF 472 out of 472 characters Full-width See Glyph Forms below for explanation of the glyph forms used in the font.
Variation Selectors 0FE00..0FE0F 16 out of 16 characters Full-width
Vertical Forms 0FE10..0FE1F 10 out of 10 characters Full-width
CJK Compatibility Forms 0FE30..0FE4F 32 out of 32 characters Full-width
Small Form Variants 0FE50..0FE6F 26 out of 26 characters Full-width
Halfwidth and Fullwidth Forms 0FF00..0FFEF 173 out of 225 characters Full-width
Specials 0FFF0..0FFFF 5 out of 5 characters Full-width
Ideographic Symbols and Punctuation 16FE0..16FFF 4 out of 7 characters Full-width
Kana Supplement 1B000..1B0FF 256 out of 256 characters Full-width Archaic Hiragana and Katakana letters added in Unicode 6.0; Hentaigana letters added in Unicode 10.0 (glyphs auto-traced from the Unicode code charts).
Kana Extended-A 1B100..1B12F 35 out of 35 characters Full-width Hentaigana letters added in Unicode 10.0. Glyphs auto-traced from the Unicode code charts.
Small Kana Extension 1B130..1B16F 9 out of 9 characters Full-width
Symbols for Legacy Computing Supplement 1CC00..1CEBF 36 out of 686 characters Full-width Outlined letters A-Z and outlined digits 0-9
Tai Xuan Jing Symbols 1D300..1D35F 87 out of 87 characters Full-width
Counting Rod Numerals 1D360..1D37F 25 out of 25 characters Full-width
Enclosed Alphanumeric Supplement 1F100..1F1FF 174 out of 200 characters Full-width
Enclosed Ideographic Supplement 1F200..1F2FF 64 out of 64 characters Full-width
Miscellaneous Symbols And Pictographs 1F300..1F5FF 64 out of 768 characters Full-width
Transport and Map Symbols 1F680..1F6FF 5 out of 118 characters Full-width
Geometric Shapes Extended 1F780..1F7FF 90 out of 103 characters Full-width
Supplemental Symbols and Pictographs 1F900..1F9FF 2 out of 256 characters Full-width
Chess Symbols 1FA00..1FA6F 14 out of 98 characters Full-width Xiangqi game symbols
Symbols and Pictographs Extended-A 1FA70..1FAFF 1 out of 114 characters Full-width
Symbols for Legacy Computing 1FB00..1FBFF 10 out of 249 characters Full-width Segmented digits 0-9
CJK Unified Ideographs Extension B 20000..2A6DF 16,839 out of 42,720 characters Full-width
CJK Unified Ideographs Extension C 2A700..2B73F 2,267 out of 4,154 characters Full-width Added in Unicode 5.2
CJK Unified Ideographs Extension D 2B740..2B81F 222 out of 222 characters Full-width "Urgent need characters", added in Unicode 6.0
CJK Unified Ideographs Extension E 2B820..2CEAF 3,449 out of 5,762 characters Full-width Added in Unicode 8.0
CJK Unified Ideographs Extension F 2CEB0..2EBEF 3,177 out of 7,473 characters Full-width Added in Unicode 10.0
CJK Unified Ideographs Extension I 2EBF0..2EE5F 622 out of 622 characters Full-width Added in Unicode 15.1
CJK Compatibility Ideographs Supplement 2F800..2FA1F 542 out of 542 characters Full-width See Glyph Forms below for explanation of the glyph forms used in the font.
CJK Unified Ideographs Extension G 30000..3134F 3,454 out of 4,939 characters Full-width Added in Unicode 13.0
CJK Unified Ideographs Extension H 31350..323AF 2,379 out of 4,192 characters Full-width Added in Unicode 15.0
Variation Selectors Supplement E0100..E01EF 240 out of 240 characters Full-width


Glyph Forms

The glyph forms for CJK unified ideographs in BabelStone Han are in principle based on the G-source glyph forms defined by the Standardization Administration of China that are given in the Unicode code charts.


If there is no G-source glyph for a particular unified ideograph, then I generally use a glyph form that is consistent with G-source glyph conventions, even if it may not exactly match the glyph form in the Unicode code chart. Where the G-source glyph form for a character in the SIP or TIP is inconsistent with G-source conventions, or seems to be incorrect, the font gives the expected correct glyph form, but the font also provides the G-source glyph form as an unofficial ideographic variation sequence (see below). NB BabelStone Han glyph forms for G-source characters in the BMP (URO and Ext. A) should always follow the code charts. All G-source characters for which the BabelStone Han glyph form deliberately does not match the code chart glyph form are listed below:


Code Point  BS Han Glyph (VS, Glyph)  G-Source Glyph (VS, Glyph, Source Ref.)
U+29B9A VS17 𩮚󠄀 VS18 𩮚󠄁 GKX-1455.47
U+2A931 VS17 𪤱󠄀 VS18 𪤱󠄁 GBK-1002.97
U+2A99D* VS17 𪦝󠄀 VS18 𪦝󠄁 GZJW-00855
U+2A9B3* VS17 𪦳󠄀 VS18 𪦳󠄁 GZJW-00879
U+2B058* VS17 𫁘󠄀 VS18 𫁘󠄁 GZJW-01380
U+2B37A* VS17 𫍺󠄀 VS18 𫍺󠄁 GZFY-00398
U+2B71C* VS17 𫜜󠄀 VS18 𫜜󠄁 GZJW-01907
U+2B71E* VS17 𫜞󠄀 VS18 𫜞󠄁 GZJW-01915
U+2B9FC* VS17 𫧼󠄀 VS18 𫧼󠄁 GZJW-00671
U+2BBA9* VS17 𫮩󠄀 VS18 𫮩󠄁 GZJW-00822
U+2BC71* VS17 𫱱󠄀 VS18 𫱱󠄁 GZJW-00864
U+2BDFC* VS17 𫷼󠄀 VS18 𫷼󠄁 GZJW-00987
U+2BE43* VS17 𫹃󠄀 VS18 𫹃󠄁 GZJW-01011
U+2BE64* VS17 𫹤󠄀 VS18 𫹤󠄁 GZJW-01027
U+2C2C9* VS17 𬋉󠄀 VS18 𬋉󠄁 GZJW-01253
U+2C2E3 VS17 𬋣󠄀 VS18 𬋣󠄁 GZJW-01806
U+2CE5C* VS17 𬹜󠄀 VS18 𬹜󠄁 GZJW-01910
U+2CE5E* VS17 𬹞󠄀 VS18 𬹞󠄁 GZJW-00846
U+2CE5F* VS17 𬹟󠄀 VS18 𬹟󠄁 GZJW-01911
U+2CE60* VS17 𬹠󠄀 VS18 𬹠󠄁 GZJW-01912
U+2CE61* VS17 𬹡󠄀 VS18 𬹡󠄁 GZJW-01913
U+2CE62* VS17 𬹢󠄀 VS18 𬹢󠄁 GZJW-01914
U+301A5 VS17 𰆥󠄀 VS18 𰆥󠄁 GHZR-10087.29
U+31F5C VS17 𱽜󠄀 VS18 𱽜󠄁 GXM-00285

For the characters marked with an asterisk in the table above, the BabelStone Han glyph form matches the glyph form given in GB 18030-2022 or GB 18030-2022 Amd. 1. The majority of these are characters sourced to GZJW (殷周金文集成引得) in Exts. C and E.
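
To illustrate how the sequences in the table above are formed, the Python sketch below appends VS17 (U+E0100) or VS18 (U+E0101) to a base ideograph. The function names are illustrative; which glyph is actually displayed depends on the font and on the application's support for variation sequences.

```python
def variation_selector(n: int) -> str:
    """Return the variation selector VSn as a one-character string.

    VS1..VS16 are U+FE00..U+FE0F; VS17..VS256 are U+E0100..U+E01EF.
    """
    if 1 <= n <= 16:
        return chr(0xFE00 + n - 1)
    if 17 <= n <= 256:
        return chr(0xE0100 + n - 17)
    raise ValueError("variation selector number must be in 1..256")


def glyph_variants(code_point: int):
    """Return (BS Han form, G-source form) sequences for a listed character.

    Per the table above, VS17 selects the BabelStone Han glyph form and
    VS18 selects the G-source glyph form.
    """
    base = chr(code_point)
    return base + variation_selector(17), base + variation_selector(18)


bs_han, g_source = glyph_variants(0x29B9A)
print([f"U+{ord(c):04X}" for c in bs_han])
print([f"U+{ord(c):04X}" for c in g_source])
```

Note that these sequences are unofficial, as explained above, so they should only be expected to work with BabelStone Han.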



CJK Compatibility Ideographs

For CJK compatibility ideographs that have a G-source, the BabelStone Han character follows the G-source glyph form. Where there is no G-source, but the compatibility ideograph and its corresponding unified ideograph have the same glyph form for the same source, the BabelStone Han character uses the same glyph as the corresponding unified ideograph in the font. In all other cases, the BabelStone Han character is based on the glyph form of the compatibility ideograph, but with G-source glyph conventions applied in most cases.



OpenType Features

BabelStone Han supports the following OpenType features.


Script Language Features Explanation
Han <hani> Default <dflt> Standard Ligatures <liga> Support for a limited set of vertically-combining Han ideographic ligatures, composed by putting U+200D ZERO WIDTH JOINER (ZWJ) between the two unified ideographs to be ligatured. These are listed in BSH_ZWJ.html.
Han <hani> Default <dflt> Standard Ligatures <liga> Support for a limited set of combining sequences of CJK unified ideographs and the Vietnamese reading marks encoded at U+16FF0 and U+16FF1. NB These do not work on most desktop applications running on Windows, but should work in recent web browsers (test page).
Han <hani> Default <dflt> Contextual Alternates <calt> Automatically rotates Suzhou numerals 1-3 when adjacent to each other (〡〢〣).
Latin <latn> Default <dflt> Standard Ligatures <liga> Sequences of two, three or four consecutive instances of U+1D377 TALLY MARK ONE 𝍷 should be ligatured into a single character space (𝍷𝍷, 𝍷𝍷𝍷, 𝍷𝍷𝍷𝍷).
Hiragana and Katakana <kana> Default <dflt> Standard Ligatures <liga> Used to compose decomposed Hiragana, Katakana and Hentaigana characters, including extended katakana sequences used for Ainu.
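
The input sequences that trigger these features are ordinary Unicode strings. The Python sketch below shows how they might be built: a ZWJ sequence of two ideographs, a run of tally marks, and a decomposed (NFD) kana sequence. The two ideographs used here are arbitrary placeholders and are not claimed to be in the supported ligature set (see BSH_ZWJ.html for that); the helper names are illustrative.

```python
import unicodedata

ZWJ = "\u200D"  # U+200D ZERO WIDTH JOINER
TALLY_MARK_ONE = "\U0001D377"  # U+1D377 TALLY MARK ONE


def zwj_sequence(first: str, second: str) -> str:
    """Join two ideographs with ZWJ, the input form for a Han ligature."""
    return first + ZWJ + second


def tally(n: int) -> str:
    """Return n consecutive tally marks (the font ligates runs of 2 to 4)."""
    return TALLY_MARK_ONE * n


def decompose_kana(text: str) -> str:
    """Return the canonically decomposed (NFD) form of kana text."""
    return unicodedata.normalize("NFD", text)


# Placeholder ideographs U+4E00 and U+4E8C, chosen only for illustration.
print([f"U+{ord(c):04X}" for c in zwj_sequence("\u4E00", "\u4E8C")])
print(len(tally(3)))
# U+304C HIRAGANA LETTER GA decomposes to U+304B plus U+3099.
print([f"U+{ord(c):04X}" for c in decompose_kana("\u304C")])
```

Whether the composed glyph is actually shown depends on the application enabling the relevant OpenType features.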


Standardized Variation Sequences

BabelStone Han supports 1,002 standardized variation sequences for 902 CJK Unified Ideographs corresponding to CJK Compatibility Ideographs (see StandardizedVariants.txt "CJK compatibility ideographs"). It also supports 16 standardized variation sequences for eight East Asian punctuation marks (see StandardizedVariants.txt "East Asian punctuation positional variants").
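
For readers who want to work with these sequences programmatically, here is a minimal Python sketch of a parser for lines in the StandardizedVariants.txt format. The two sample lines are included only to show the format, and the function name `parse_variants` is an assumption for illustration.

```python
# Sample lines in the StandardizedVariants.txt format (illustration only).
SAMPLE = """\
# CJK compatibility ideographs
8C48 FE00; CJK COMPATIBILITY IDEOGRAPH-F900;
66F4 FE00; CJK COMPATIBILITY IDEOGRAPH-F901;
"""


def parse_variants(text: str) -> dict:
    """Map each variation sequence (as a string) to its description field."""
    result = {}
    for line in text.splitlines():
        # Drop comments and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        # First field: space-separated hex code points (base, selector).
        sequence = "".join(chr(int(h, 16)) for h in fields[0].split())
        result[sequence] = fields[1]
    return result


for seq, description in parse_variants(SAMPLE).items():
    print(" ".join(f"U+{ord(c):04X}" for c in seq), "->", description)
```

Running the parser over the full data file and counting the entries per description type would be one way to check counts such as those quoted above.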



Ideographic Variation Sequences

BabelStone Han supports a provisional set of 2,452 ideographic variation sequences (IVS) for 1,196 characters. These are listed on this page (plain text listing at BSH_IVS.TXT). These are not yet registered in the Ideographic Variation Database (IVD), but it is intended to register a BabelStone IVS collection in the IVD at a future date. The IVS sequences currently in the font are not stable, and some may be modified or removed in response to user feedback and expert review.
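
Because these sequences are provisional, text that must remain stable may be safer stored without them. The Python sketch below removes any variation selector in the Variation Selectors Supplement range (U+E0100..U+E01EF), leaving the base characters. It is a generic fallback under that assumption, not a BabelStone tool, and the function name is illustrative.

```python
def strip_ivs_selectors(text: str) -> str:
    """Remove VS17..VS256 (U+E0100..U+E01EF), keeping the base characters."""
    return "".join(ch for ch in text if not 0xE0100 <= ord(ch) <= 0xE01EF)


# U+29B9A followed by VS18, then a plain ideograph.
sample = "\U00029B9A\U000E0101\u4E00"
print([f"U+{ord(c):04X}" for c in strip_ivs_selectors(sample)])
```

Standardized variation sequences using VS1..VS16 (U+FE00..U+FE0F) are left untouched by this function.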



Version History

BabelStone Han was created on 25 May 2009, and numerous versions have been released since then, with several releases in most years. See BabelStone Han Version History for details of all releases.



Download

BabelStone Han is free to download and use for personal or commercial purposes. For use on your computer or mobile device please download and install the TTF file (or the ZIP containing the TTF file). The WOFF and WOFF2 files are for web hosting only, and are not required by most users.


Font  Version  Download  Notes
BabelStone Han  16.0.2 (2024-10-14)  BabelStoneHan.zip [19.5 MB], BabelStoneHan.ttf [49.5 MB], BabelStoneHan.woff [19.2 MB], BabelStoneHan.woff2 [7.0 MB]  See Version History for details
BabelStone Han PUA  1.451 (2024-10-14)  BabelStoneHanPUA.ttf [4.75 MB]  Includes 5,892 Private Use Area (PUA) CJK ideographs and components in the range U+E080 through U+F8DF (for technical reasons these are no longer available in the main BabelStone Han font).
BabelStone Han Erjian 1  1.1.6 (2023-09-01)  BabelStoneErjian1.ttf [6.58 MB]  Table 1 2nd stage simplified glyph forms mapped to the corresponding standard simplified form of the character.
BabelStone Han Erjian 2  1.1.12 (2023-09-01)  BabelStoneErjian2.ttf [6.38 MB]  Table 1 and Table 2 2nd stage simplified glyph forms mapped to the corresponding standard simplified form of the character.

I do not maintain an archive of old versions of BabelStone Han, but version-identified zip files of the current and previous version of the font are available for download here:


