MIK (character set)
Bulgarian character code set used with DOS
MIK (МИК) is an 8-bit Cyrillic code page used with DOS. It is based on the character set used in the Bulgarian Pravetz 16[1] IBM PC compatible system. Kermit calls this character set "BULGARIA-PC" / "bulgaria-pc".[2][3][4] In Bulgaria, it was sometimes incorrectly referred to as code page 856, a number that clashes with IBM's definition of a Hebrew code page. Star printers and FreeDOS know this code page as code page 3021. FreeDOS earlier called it code page 30033, but renumbered it to match the Star printer code page; the number 30033 is now used for a code page 857 variant which contains the Crimean Tatar hryvnia sign.
MIK, rather than CP 808, CP 855, CP 866 or CP 872, is the most widespread DOS/OEM code page used in Bulgaria.
Almost every DOS program created in Bulgaria that contains Bulgarian strings used MIK as its encoding, and many such programs are still in use.
Character set
Only the second half of the table (code points 128–255) is shown; the first half (code points 0–127) is the same as ASCII. Each row label gives the high hexadecimal digit of the code point and each column heading gives the low digit.
   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
8x | А | Б | В | Г | Д | Е | Ж | З | И | Й | К | Л | М | Н | О | П |
9x | Р | С | Т | У | Ф | Х | Ц | Ч | Ш | Щ | Ъ | Ы | Ь | Э | Ю | Я |
Ax | а | б | в | г | д | е | ж | з | и | й | к | л | м | н | о | п |
Bx | р | с | т | у | ф | х | ц | ч | ш | щ | ъ | ы | ь | э | ю | я |
Cx | └ | ┴ | ┬ | ├ | ─ | ┼ | ╣ | ║ | ╚ | ╔ | ╩ | ╦ | ╠ | ═ | ╬ | ┐ |
Dx | ░ | ▒ | ▓ | │ | ┤ | № | § | ╗ | ╝ | ┘ | ┌ | █ | ▄ | ▌ | ▐ | ▀ |
Ex | α | ß[nb 1] | Γ | π | Σ[nb 2] | σ | µ[nb 3] | τ | Φ | Θ | Ω[nb 4] | δ | ∞ | φ | ε[nb 5] | ∩ |
Fx | ≡ | ± | ≥ | ≤ | ⌠ | ⌡ | ÷ | ≈ | ° | ∙ | · | √ | ⁿ | ² | ■ | NBSP |
Notes for implementors of mapping tables to Unicode
Implementors of mapping tables to Unicode should note that the MIK code page unifies some characters:
- 0xE4 is both the n-ary summation sign (U+2211, ∑) and the Greek uppercase sigma (U+03A3, Σ);
- 0xE6 is both the micro sign (U+00B5, µ) and the Greek lowercase mu (U+03BC, μ);
- 0xEE is both the element-of sign (U+2208, ∈) and the Greek lowercase epsilon (U+03B5, ε).
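One way to handle these unified characters is sketched below in Python. It is an illustration under stated assumptions, not a reference mapping: the upper half of the table is transcribed from the chart above, the Greek letter (or, for 0xE6, the micro sign printed in the chart) is chosen as the decoded form of each unified byte, and the second reading of each of the three bytes is accepted as an alias when encoding. The names `MIK_HIGH`, `ENCODE`, `mik_decode` and `mik_encode` are hypothetical.

```python
# Upper half of MIK (0x80-0xFF), transcribed row by row from the chart.
MIK_HIGH = (
    "АБВГДЕЖЗИЙКЛМНОП"        # 0x80-0x8F
    "РСТУФХЦЧШЩЪЫЬЭЮЯ"        # 0x90-0x9F
    "абвгдежзийклмноп"        # 0xA0-0xAF
    "рстуфхцчшщъыьэюя"        # 0xB0-0xBF
    "└┴┬├─┼╣║╚╔╩╦╠═╬┐"        # 0xC0-0xCF
    "░▒▓│┤№§╗╝┘┌█▄▌▐▀"        # 0xD0-0xDF
    # 0xE0-0xEF, written as escapes so the choice among look-alike
    # characters is explicit: alpha, sharp s, Gamma, pi, Sigma, sigma,
    # micro sign, tau, Phi, Theta, Omega, delta, infinity, phi,
    # epsilon, intersection.
    "\u03b1\u00df\u0393\u03c0\u03a3\u03c3\u00b5\u03c4"
    "\u03a6\u0398\u03a9\u03b4\u221e\u03c6\u03b5\u2229"
    "≡±≥≤⌠⌡÷≈°∙·√ⁿ²■\u00a0"   # 0xF0-0xFF (last cell is NBSP)
)
assert len(MIK_HIGH) == 128

# Reverse table, plus aliases for the second reading of each unified byte
# (assumption: both readings should encode to the same MIK byte).
ENCODE = {ch: 0x80 + i for i, ch in enumerate(MIK_HIGH)}
ENCODE.update({
    "\u2211": 0xE4,  # n-ary summation -> same byte as Greek capital sigma
    "\u03bc": 0xE6,  # Greek small mu  -> same byte as micro sign
    "\u2208": 0xEE,  # element-of sign -> same byte as Greek small epsilon
})


def mik_decode(data: bytes) -> str:
    """Decode MIK bytes; bytes below 0x80 are passed through as ASCII."""
    return "".join(chr(b) if b < 0x80 else MIK_HIGH[b - 0x80] for b in data)


def mik_encode(text: str) -> bytes:
    """Encode text as MIK; raises KeyError for characters MIK lacks."""
    return bytes(ord(ch) if ord(ch) < 0x80 else ENCODE[ch] for ch in text)
```

With this choice, decoding then re-encoding any byte string returns the original bytes, while encoding then decoding text may turn ∑ into Σ, μ into µ, and ∈ into ε.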
Binary character manipulations
The MIK code page keeps all Cyrillic letters in alphabetical order, which makes character manipulation in binary form very easy:
10xx xxxx - a Cyrillic letter
100x xxxx - an upper-case Cyrillic letter
101x xxxx - a lower-case Cyrillic letter
As a result, character testing and manipulation functions such as IsAlpha(), IsUpper(), IsLower(), ToUpper() and ToLower() reduce to bit operations for Cyrillic letters, and sorting is a simple comparison of character values.