Summary of the byte ranges of common character encodings, written as regular expressions

Time:2021-11-24

The byte-range patterns below are especially useful for matching the letters, punctuation, and special symbols in Japanese character sets.
UTF8
[\x01-\x7f]|[\xc0-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}
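As a minimal sketch of how the UTF8 pattern can be applied, the Python snippet below compiles it as a bytes regex and splits an encoded string into per-character byte sequences. The names `UTF8_CHAR` and `split_utf8` are illustrative assumptions, not part of the original list. Note that the pattern only checks the lead-byte/continuation-byte structure; it is looser than a strict validator (for example, it accepts the lead bytes `\xc0` and `\xc1`, which well-formed UTF-8 never uses).

```python
import re

# The UTF8 pattern from the list, written as a bytes regex.
# (UTF8_CHAR and split_utf8 are hypothetical names for this sketch.)
UTF8_CHAR = re.compile(
    rb"[\x01-\x7f]"                    # 1-byte (ASCII, excluding NUL)
    rb"|[\xc0-\xdf][\x80-\xbf]"        # 2-byte sequence
    rb"|[\xe0-\xef][\x80-\xbf]{2}"     # 3-byte sequence
    rb"|[\xf0-\xff][\x80-\xbf]{3}"     # 4-byte sequence
)


def split_utf8(data: bytes) -> list[bytes]:
    """Return the bytes of each character the pattern matches in data."""
    return UTF8_CHAR.findall(data)


chars = split_utf8("aé日😀".encode("utf-8"))
print([c.decode("utf-8") for c in chars])
```

Because the pattern contains no capturing groups, `findall` returns the full match for each character.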
UTF16
[\x00-\xd7][\xe0-\xff]|[\xd8-\xdf][\x00-\xff]{2}
JIS
[\x20-\x7e]|[\x21-\x5f]|[\x21-\x7e]{2}
SJIS
[\x20-\x7e]|[\xa1-\xdf]|([\x81-\x9f]|[\xe0-\xef])([\x40-\x7e]|[\x80-\xfc])
EUC_JP        
[\x20-\x7e]|\x8e[\xa1-\xdf]|[\xa1-\xfe][\xa1-\xfe]|\x8f[\xa1-\xfe]{2}
EUC_JP punctuation and special characters
[\xa1-\xa2][\xa0-\xfe]
EUC_JP full-width digits
\xa3[\xb0-\xb9]
EUC_JP full-width uppercase letters
\xa3[\xc1-\xda]
EUC_JP full-width lowercase letters
\xa3[\xe1-\xfa]
EUC_JP full-width Hiragana
\xa4[\xa1-\xf3]
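A single-class pattern like the Hiragana one can be used to test whether a whole string belongs to that class. The following is a sketch under the assumption that the text is encoded with Python's built-in `euc_jp` codec; `HIRAGANA_EUC` and `is_all_hiragana` are hypothetical names chosen for illustration.

```python
import re

# One or more EUC_JP Hiragana characters (lead byte 0xA4, trail 0xA1-0xF3).
# HIRAGANA_EUC and is_all_hiragana are illustrative names, not from the source.
HIRAGANA_EUC = re.compile(rb"(?:\xa4[\xa1-\xf3])+")


def is_all_hiragana(text: str) -> bool:
    """Return True if text, encoded as EUC_JP, consists only of Hiragana."""
    data = text.encode("euc_jp")
    # fullmatch requires the pattern to cover every byte of the input.
    return HIRAGANA_EUC.fullmatch(data) is not None


print(is_all_hiragana("ひらがな"))
print(is_all_hiragana("ひらガナ"))
```

Using `fullmatch` rather than `search` avoids accepting strings that merely contain some Hiragana.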
EUC_JP full-width Katakana (updated 2007-03-12 15:00)
\xa3[\xb0-\xb9]|\xa3[\xc1-\xda]|\xa5[\xa1-\xf6]|[\xa3][\xb0-\xfa]|[\xa1][\xbc-\xbe]|[\xa1][\xdd]
EUC_JP full-width kanji (Chinese characters) (updated 2007-03-12 15:06)
[\xb0-\xcf][\xa0-\xd3]|[\xd0-\xf4][\xa0-\xfe]|[\xB0-\xF3][\xA1-\xFE]|[\xF4][\xA1-\xA6]|[\xA4][\xA1-\xF3]|[\xA5][\xA1-\xF6]|[\xA1][\xBC-\xBE]
Big5
[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])
GBK
[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]
GB2312 Chinese characters
[\xb0-\xf7][\xa0-\xfe]
GB2312 half-width punctuation marks and special symbols
\xa1[\xa2-\xfe]
GB2312 Roman numerals and list item numbers
\xa2([\xa1-\xaa]|[\xb1-\xbf]|[\xc0-\xdf]|[\xe0-\xe2]|[\xe5-\xee]|[\xf1-\xfc])
GB2312 full-width punctuation and letters
\xa3[\xa1-\xfe]
GB2312 Japanese Hiragana
\xa4[\xa1-\xf3]
GB2312 Japanese katakana
\xa5[\xa1-\xf6]
Supplement:  
GB18030
[\x00-\x7f]|[\x81-\xfe][\x40-\xfe]|[\x81-\xfe][\x30-\x39][\x81-\xfe][\x30-\x39]
Supplement added 2007-03-12 21:35:
Japanese half-width space
\x20
SJIS full-width space
(?:\x81\x40)
SJIS full-width digits
(?:\x82[\x4f-\x58])
SJIS full-width uppercase letters
(?:\x82[\x60-\x79])
SJIS full-width lowercase letters
(?:\x82[\x81-\x9a])
SJIS full-width Hiragana
(?:\x82[\x9f-\xf1])
SJIS full-width Hiragana, extended (with voicing and iteration marks)
(?:\x82[\x9f-\xf1]|\x81[\x4a\x4b\x54\x55])
SJIS full-width Katakana
(?:\x83[\x40-\x96])
SJIS full-width Katakana, extended (with middle dot, long-vowel mark, and iteration marks)
(?:\x83[\x40-\x96]|\x81[\x45\x5b\x52\x53])
EUC_JP full-width space
(?:\xa1\xa1)
EUC_JP half-width Katakana
(?:\x8e[\xa6-\xdf]) 
