Python regex: keeping only Chinese characters, digits, and upper/lowercase English letters


Remove special characters and keep only Chinese characters, English letters, and digits

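import re

string = "123I 123456abcdefgabcdff? / ,。 ,.:;:''';'''[]{}()()《》"
print(string)
# 123I 123456abcdefgabcdff? / ,。 ,.:;:''';'''[]{}()()《》

# Delete every character that is not a Chinese character, a digit, or an English letter.
# Spaces are not in the allowed set, so they are removed too.
sub_str = re.sub(r"[^\u4e00-\u9fa5\u0030-\u0039\u0041-\u005a\u0061-\u007a]", "", string)
print(sub_str)
# 123I123456abcdefgabcdff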
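Explanation
re.sub(pattern, repl, string) Replaces every match of pattern in string with repl
[^...] A negated character set: matches any single character that is not in the set
\u4e00-\u9fa5 The commonly used Unicode range for Chinese characters (the CJK Unified Ideographs block itself runs to \u9fff)
\u0030-\u0039 The Unicode range of the digits 0-9
\u0041-\u005a The Unicode range of the uppercase letters A-Z
\u0061-\u007a The Unicode range of the lowercase letters a-z
\uAC00-\uD7AF The Unicode range of Korean (Hangul syllables)
\u3040-\u31FF The Unicode range of Japanese kana (hiragana and katakana; the range also covers a few neighboring blocks such as Bopomofo)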
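The ranges listed above can be combined freely inside one negated character set. The sketch below builds such a set from named ranges. The `RANGES` table and the `build_filter` and `keep_only` helpers are hypothetical names introduced for illustration.

```python
import re

# Unicode ranges taken from the list above, keyed by a descriptive name.
RANGES = {
    "chinese": r"\u4e00-\u9fa5",
    "digits": r"\u0030-\u0039",
    "upper": r"\u0041-\u005a",
    "lower": r"\u0061-\u007a",
    "korean": r"\uac00-\ud7af",
    "japanese": r"\u3040-\u31ff",
}


def build_filter(*names):
    """Hypothetical helper: compile a pattern that matches any character
    outside the named ranges."""
    allowed = "".join(RANGES[name] for name in names)
    return re.compile("[^" + allowed + "]")


def keep_only(text, *names):
    """Remove every character that does not fall in one of the named ranges."""
    return build_filter(*names).sub("", text)


sample = "中文 한국어 ひらがな abc 123!"
print(keep_only(sample, "chinese", "korean"))   # 中文한국어
print(keep_only(sample, "japanese", "digits"))  # ひらがな123
```

Note that Japanese text written in kanji falls in the Chinese range rather than in \u3040-\u31ff, so keeping full Japanese sentences would need both "chinese" and "japanese".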
