General Punctuation
Unicode character block From Wikipedia, the free encyclopedia
General Punctuation is a Unicode block containing punctuation, spacing, and formatting characters for use with all scripts and writing systems. Included are the defined-width spaces, joining formats, directional formats, smart quotes, archaic and novel punctuation such as the interrobang, and invisible mathematical operators.
Additional punctuation characters are in the Supplemental Punctuation block and scattered across dozens of other Unicode blocks.
Block
General Punctuation[1][2][3] Official Unicode Consortium code chart (PDF)
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
U+200x | NQ SP | MQ SP | EN SP | EM SP | 3/M SP | 4/M SP | 6/M SP | F SP | P SP | TH SP | H SP | ZW SP | ZW NJ | ZW J | LRM | RLM |
U+201x | ‐ | NB ‑ | ‒ | – | — | ― | ‖ | ‗ | ‘ | ’ | ‚ | ‛ | “ | ” | „ | ‟ |
U+202x | † | ‡ | • | ‣ | ․ | ‥ | … | ‧ | L SEP | P SEP | LRE | RLE | PDF | LRO | RLO | NNB SP |
U+203x | ‰ | ‱ | ′ | ″ | ‴ | ‵ | ‶ | ‷ | ‸ | ‹ | › | ※ | ‼ | ‽ | ‾ | ‿ |
U+204x | ⁀ | ⁁ | ⁂ | ⁃ | ⁄ | ⁅ | ⁆ | ⁇ | ⁈ | ⁉ | ⁊ | ⁋ | ⁌ | ⁍ | ⁎ | ⁏ |
U+205x | ⁐ | ⁑ | ⁒ | ⁓ | ⁔ | ⁕ | ⁖ | ⁗ | ⁘ | ⁙ | ⁚ | ⁛ | ⁜ | ⁝ | ⁞ | MM SP |
U+206x | WJ | ƒ() | × | , | + |  | LRI | RLI | FSI | PDI | I SS | A SS | I AFS | A AFS | NA DS | NO DS |
Notes
Several characters in this block are usually not rendered with a directly visible glyph. Ten whitespace characters, U+2002 through U+200B (the fixed-width en or 1⁄2 em, em, 1⁄3 em, 1⁄4 em and 1⁄6 em spaces, the figure and punctuation spaces, the variable-width thin or 1⁄5 em space and hair space, and the fixed zero-width space), together with U+205F (medium mathematical or 2⁄9 em space), differ in horizontal width. U+2000 and U+2001 (en quad and em quad) are effectively aliases of U+2002 and U+2003, respectively. Two more, U+202F (narrow no-break space) and U+2060 (the ill-termed word joiner), are variants of U+2009 or U+2004 and of U+200B, respectively, that prohibit line breaks. Three zero-width characters, U+200B through U+200D (zero-width space, non-joiner and joiner), differ in how they affect ligation and shaping of adjacent letters, such as the contextual forms in Arabic. Eleven invisible characters control the directionality of text unless higher-level markup overrides them: U+200E and U+200F (left-to-right and right-to-left mark), U+202A through U+202E (embeddings, pop and overrides) and U+2066 through U+2069 (isolates). There are explicit line and paragraph separators at U+2028 and U+2029.
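The grouping described in the note can be checked against the Unicode Character Database. The following is a minimal sketch in Python, assuming the standard `unicodedata` module of a reasonably recent CPython; the function name `invisible_by_category` is made up for this illustration. It sorts the block's code points by general category, which only approximates the prose grouping above: U+200B, for instance, is classified as a format character (Cf) rather than a space separator (Zs).

```python
import unicodedata
from collections import defaultdict

# Range of the General Punctuation block.
BLOCK_START, BLOCK_END = 0x2000, 0x206F

# General categories that normally have no visible glyph:
# Zs = space separator, Zl = line separator,
# Zp = paragraph separator, Cf = format character.
INVISIBLE_CATEGORIES = ("Zs", "Zl", "Zp", "Cf")


def invisible_by_category():
    """Group the block's glyphless code points by general category.

    Hypothetical helper for illustration; results depend on the
    Unicode version bundled with the Python interpreter.
    """
    groups = defaultdict(list)
    for cp in range(BLOCK_START, BLOCK_END + 1):
        category = unicodedata.category(chr(cp))
        if category in INVISIBLE_CATEGORIES:
            groups[category].append(cp)
    return dict(groups)


if __name__ == "__main__":
    for category, code_points in invisible_by_category().items():
        print(category)
        for cp in code_points:
            name = unicodedata.name(chr(cp), "<unnamed>")
            print(f"  U+{cp:04X} {name}")
```

Running the script lists the spaces under Zs, the two explicit separators under Zl and Zp, and the joiners, directional controls and invisible operators under Cf.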
Variation selectors
Starting with Unicode 16 (2024), the block has variation sequences defined for East Asian punctuation positional variants of the curly quotation marks ‘...’ and “...”. They use U+FE00 VARIATION SELECTOR-1 (VS01) and U+FE01 VARIATION SELECTOR-2 (VS02):[3]
U+ | 2018 | 2019 | 201C | 201D | Description |
base code point | ‘ | ’ | “ | ” | |
base + VS01 | ‘︀ | ’︀ | “︀ | ”︀ | non-fullwidth form |
base + VS02 | ‘︁ | ’︁ | “︁ | ”︁ | justified fullwidth form |
The non-fullwidth forms are expected to be separated with a space on one side; the fullwidth forms are not.

In vertical text, the fullwidth forms should display somewhat differently, and even as regular CJK quotation marks 「...」 and 『...』 if the vertical orientation property is set to "Hans".
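At the code-point level, each variation sequence in the table above is simply the base quotation mark followed by one selector. The sketch below shows that composition in Python; `quote_variant` and `code_points` are hypothetical helper names, and whether the two forms actually render differently depends on font and renderer support.

```python
VS1 = "\uFE00"  # VARIATION SELECTOR-1: non-fullwidth form
VS2 = "\uFE01"  # VARIATION SELECTOR-2: justified fullwidth form

# The four base characters listed in the table:
# U+2018, U+2019, U+201C, U+201D.
QUOTE_BASES = ("\u2018", "\u2019", "\u201C", "\u201D")


def quote_variant(base: str, fullwidth: bool) -> str:
    """Return the base mark followed by VS2 (fullwidth) or VS1 (non-fullwidth)."""
    if base not in QUOTE_BASES:
        raise ValueError(f"no quotation-mark variation sequence for {base!r}")
    return base + (VS2 if fullwidth else VS1)


def code_points(text: str) -> str:
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    for base in QUOTE_BASES:
        narrow = quote_variant(base, fullwidth=False)
        wide = quote_variant(base, fullwidth=True)
        print(code_points(narrow), "|", code_points(wide))
```

Restricting the helper to the four listed base characters is a design choice for this sketch: a selector appended to any other character would not form one of these sequences.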

Emoji
The General Punctuation block contains two emoji: U+203C and U+2049.[4][5]
The block has four standardized variants defined to specify emoji-style (U+FE0F VS16) or text presentation (U+FE0E VS15) for the two emoji, both of which default to a text presentation.[6]
U+ | 203C | 2049 |
base code point | ‼ | ⁉ |
base+VS15 (text) | ‼︎ | ⁉︎ |
base+VS16 (emoji) | ‼️ | ⁉️ |
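The four presentation sequences can be built, and removed again, with plain string operations. The following is a small Python sketch; `present` and `strip_presentation` are names invented for this example, and the visual result still depends on the fonts and platform in use.

```python
TEXT_VS = "\uFE0E"   # VARIATION SELECTOR-15: text presentation
EMOJI_VS = "\uFE0F"  # VARIATION SELECTOR-16: emoji presentation

# The two emoji in the block: U+203C and U+2049.
BLOCK_EMOJI = ("\u203C", "\u2049")


def present(base: str, style: str) -> str:
    """Append VS15 for style "text" or VS16 for style "emoji"."""
    if base not in BLOCK_EMOJI:
        raise ValueError(f"{base!r} is not one of the block's two emoji")
    if style == "text":
        return base + TEXT_VS
    if style == "emoji":
        return base + EMOJI_VS
    raise ValueError("style must be 'text' or 'emoji'")


def strip_presentation(text: str) -> str:
    """Drop VS15/VS16 only where they directly follow U+203C or U+2049."""
    out = []
    for ch in text:
        if ch in (TEXT_VS, EMOJI_VS) and out and out[-1] in BLOCK_EMOJI:
            continue  # skip the selector, keep the base character
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for base in BLOCK_EMOJI:
        for style in ("text", "emoji"):
            sequence = present(base, style)
            print(style, [f"U+{ord(ch):04X}" for ch in sequence])
```

Stripping the selectors returns the bare base characters, which, as stated above, default to text presentation.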
History
The following Unicode-related documents record the purpose and process of defining specific characters in the General Punctuation block:
References