Shell regular expression learning notes

Time:2021-10-18

A regular expression (often shortened to "regex") is a way to search for, replace, or delete one or more lines of text by arranging special characters into a pattern. In short, a regular expression is a notation used in string processing. It is not a tool program but a standard basis for string processing: to process strings with regular expressions, you need a tool that supports them. There are many such tools, for example vi, sed, and awk.

1、 What is a regular expression?

A regular expression is a syntax for describing how characters are arranged and matched. It is mainly used to split, match, search, and replace strings.

2、 Regular expressions and wildcards

1. Regular expression

Regular expressions are used to match strings inside a file. A regular expression is a “contains” match: a line matches if any part of it fits the pattern. Commands such as grep, awk, and sed support regular expressions.

2. Regular expression metacharacter

Regular expressions match strings through metacharacters. For details, please refer to: http://www.cnblogs.com/refine1017/p/5011522.html

3. Wildcard

Wildcards are used to match file names. A wildcard is an “exact” match: the pattern has to match the whole file name. Commands such as ls, find, and cp take file names rather than regular expressions as arguments, so you can only use the shell’s own wildcards with them.

4. Wildcards include

* matches any string of zero or more characters

? matches any single character

[] matches any single character listed in the brackets
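
As a rough illustration of the three wildcards and of how they differ from a regular expression, here is a minimal sketch. The file names (cat, coat, cot, dog) and the temporary directory are made up for this example and are not part of the original notes.

```shell
# Work in a throwaway directory so the example does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch cat coat cot dog   # hypothetical file names

# Wildcards are expanded by the shell and must match the whole file name.
star_matches=$(echo c*)          # * : any string, including an empty one
question_matches=$(echo c?t)     # ? : exactly one character
bracket_matches=$(echo [cd]o*)   # []: exactly one character from the set

echo "$star_matches"
echo "$question_matches"
echo "$bracket_matches"

# A regular expression matches if the pattern occurs anywhere in the line.
regex_matches=$(printf '%s\n' * | grep 'at' | paste -sd ' ' -)
echo "$regex_matches"

# Clean up the throwaway directory.
cd - >/dev/null && rm -rf "$demo_dir"
```

Note that the wildcard c?t does not match coat, while the regular expression at matches both cat and coat because it only needs to occur somewhere in the name.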

3、 Cut command

The cut command cuts bytes, characters, and fields from each line of the file and writes them to standard output.

1. Common parameters

-b: Select by byte position. Byte positions ignore multibyte character boundaries unless the -n flag is also given.
-c: Select by character position.
-d: Set a custom field delimiter; the default is tab.
-f: Select the listed fields; usually used together with -d.
-n: Do not split multibyte characters. Used only with the -b flag.

2. Example 1: print one column of a tab-delimited file

[root@localhost shell]# cat student.txt
ID   Name   Gender  Mark
1    ming   F       85
2    zhang  F       70
3    wang   M       75
4    li     M       90
[root@localhost shell]# cut -f 4 student.txt
Mark
85
70
75
90

3. Example 2: print one column of a CSV file

[root@localhost shell]# cat student.csv
ID,Name,Gender,Mark
1,ming,F,85
2,zhang,F,70
3,wang,M,75
4,li,M,90
[root@localhost shell]# cut -d "," -f 4 student.csv
Mark
85
70
75
90
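
The -f option also accepts lists and ranges of fields. The sketch below reuses a shortened copy of the student.csv data; the variable names are only illustrative.

```shell
# Sample data mirroring the first rows of student.csv.
csv_data='ID,Name,Gender,Mark
1,ming,F,85
2,zhang,F,70'

# A comma-separated list selects several fields; the output keeps the delimiter.
name_and_mark=$(printf '%s\n' "$csv_data" | cut -d ',' -f 2,4)
echo "$name_and_mark"

# A range such as 1-3 selects the first three fields.
first_three=$(printf '%s\n' "$csv_data" | cut -d ',' -f 1-3)
echo "$first_three"
```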

4. Example 3: print the third character of a string

[root@localhost shell]# echo "abcdef" | cut -c 3
c
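
The -c option takes ranges and lists in the same way as -f. A small sketch on the same string (variable names are illustrative):

```shell
prefix=$(echo "abcdef" | cut -c 1-3)     # characters 1 through 3
suffix=$(echo "abcdef" | cut -c 4-)      # character 4 to the end of the line
picked=$(echo "abcdef" | cut -c 1,3,5)   # individual positions

echo "$prefix $suffix $picked"
```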

5. Example 4: intercept a text of Chinese characters

[ [email protected] Shell]# echo "shell programming" | cut - NB 1
S
[ [email protected] Shell]# echo "shell programming" | cut - NB 2
h
[ [email protected] Shell]# echo "shell programming" | cut - NB 3
e
[ [email protected] Shell]# echo "shell programming" | cut - NB 4
l
[ [email protected] Shell]# echo "shell programming" | cut - NB 5
l
[ [email protected] Shell]# echo "shell programming" | cut - NB 8
Compile
[ [email protected] Shell]# echo "shell programming" | cut - NB 11
Course

4、 Printf command

1. Command format

printf 'output type and output format' output content

2. Output type

%ns: output a string. n is the minimum number of characters to output (the field width); if n is omitted, the whole string is output as it is

%ni: output an integer. n is the minimum number of characters to output; if n is omitted, the whole number is output as it is

%m.nf: output a floating-point number. m and n are numbers: m is the total width of the output and n is the number of decimal places. For example, %8.2f means that at least 8 characters are output, of which 2 are decimal places; the decimal point takes one more, leaving 5 for the integer part.

3. Output format

\a: Output an alert (bell) sound

\b: Output a backspace

\f: Form feed

\n: Line feed (newline)

\r: Carriage return

\t: Horizontal tab

\v: Vertical tab

4. Examples


[root@localhost ~]# printf '%i %s %i %s %i\n' 1 "+" 2 "=" 3
1 + 2 = 3
[root@localhost ~]# printf '%i-%i-%i %i:%i:%i\n' 2015 12 3 21 56 30
2015-12-3 21:56:30
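
The width and precision parts of the output types can be sketched as follows. The values are made up, and the example assumes a locale that uses "." as the decimal separator.

```shell
# %-8s  : left-align a string in a field of 8 characters
# %8.2f : right-align a float in 8 characters with 2 decimal places
# %5i   : right-align an integer in 5 characters
row=$(printf '%-8s|%8.2f|%5i' "ming" 85.5 42)
echo "$row"

# A precision on %s truncates instead of padding.
short=$(printf '%.3s' "zhang")
echo "$short"
```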

5、 Awk command

1. Command format

awk 'condition1 {action1} condition2 {action2} …' filename

Condition: usually a relational expression, such as x > 10

Action: formatted output or flow-control statements

2. Example 1: extract columns from a tab-delimited file

[root@localhost shell]# cat student.txt
ID   Name   Gender  Mark
1    ming   F       85
2    zhang  F       70
3    wang   M       75
4    li     M       90
[root@localhost shell]# awk '{print $1 "\t" $4}' student.txt
ID   Mark
1    85
2    70
3    75
4    90
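
The example above uses an action without a condition. A sketch of the condition part, using the same student data rebuilt inline (the thresholds and variable names are my own choices):

```shell
# Rebuild the tab-delimited student data inline.
student_data=$(printf 'ID\tName\tGender\tMark\n1\tming\tF\t85\n2\tzhang\tF\t70\n3\twang\tM\t75\n4\tli\tM\t90\n')

# Condition: skip the header (NR > 1) and keep rows whose mark is at least 80.
high_scorers=$(printf '%s\n' "$student_data" | awk 'NR > 1 && $4 >= 80 {print $2}')
echo "$high_scorers"

# Two condition/action pairs: accumulate on data rows, report in END.
average=$(printf '%s\n' "$student_data" | awk 'NR > 1 {sum += $4; n++} END {printf "%.1f\n", sum / n}')
echo "$average"
```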

3. Example 2: get disk utilization

[root@localhost shell]# df -h
Filesystem      Size  Used  Avail  Use%  Mounted on
/dev/sda2       18G   2.4G  14G    15%   /
/dev/sda1       289M  16M   258M   6%    /boot
tmpfs           411M  0     411M   0%    /dev/shm
[root@localhost shell]# df -h | grep "sda1" | awk '{print $5}'
6%
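
The same result can be obtained with awk alone by using a condition instead of grep. The sketch below feeds awk a copy of the df output shown above rather than calling df, so it runs the same way on any machine.

```shell
# Copy of the df -h output from the example, used as fixed input.
df_sample='Filesystem      Size Used Avail Use% Mounted on
/dev/sda2       18G 2.4G  14G 15% /
/dev/sda1       289M  16M 258M  6% /boot
tmpfs         411M   0 411M  0% /dev/shm'

# Select the row by its first field, strip the % sign, and print the number.
usage=$(printf '%s\n' "$df_sample" | awk '$1 == "/dev/sda1" {sub(/%/, "", $5); print $5}')
echo "$usage"
```

Stripping the % sign makes the value usable in a numeric comparison, for example in a script that warns when usage passes a threshold.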

6、 Sed command

Sed is a lightweight stream editor that is included in almost all UNIX platforms, including Linux. Sed is mainly used to select, replace, delete and add data.

1. Command format

sed [options] '[action]' filename

2. Options

-n: By default, sed prints every line to the screen. With this option, only the lines explicitly printed by a sed command (for example with p) are output.

-e: Allows multiple sed editing commands to be applied to the input data.

-i: Write the changes made by sed directly back to the file being read instead of printing the result to the screen.

3. Action

a: Append. Adds one or more lines after the current line

c: Change. Replaces the original line with the string after c

i: Insert. Inserts one or more lines before the current line

d: Delete. Deletes the specified lines

p: Print. Outputs the specified lines

s: Substitute. Replaces one string with another. The format is “[line range]s/old string/new string/g” (similar to the substitution format in Vim)

4. Examples

[root@localhost shell]# cat student.txt
ID   Name   Gender  Mark
1    ming   F       85
2    zhang  F       70
3    wang   M       75
4    li     M       90

# test the -n option
[root@localhost shell]# sed -n '2p' student.txt
1    ming   F       85

# test line deletion
[root@localhost shell]# sed '2d' student.txt
ID   Name   Gender  Mark
2    zhang  F       70
3    wang   M       75
4    li     M       90

# test multi-line deletion
[root@localhost shell]# sed '2,4d' student.txt
ID   Name   Gender  Mark
4    li     M       90

# test append
[root@localhost shell]# sed '2a test append' student.txt
ID   Name   Gender  Mark
1    ming   F       85
test append
2    zhang  F       70
3    wang   M       75
4    li     M       90

# test insert
[root@localhost shell]# sed '2i test insert' student.txt
ID   Name   Gender  Mark
test insert
1    ming   F       85
2    zhang  F       70
3    wang   M       75
4    li     M       90

# test line replacement
[root@localhost shell]# sed '2c test replace' student.txt
ID   Name   Gender  Mark
test replace
2    zhang  F       70
3    wang   M       75
4    li     M       90

# test string replacement
[root@localhost shell]# sed '2s/ming/replace/g' student.txt
ID   Name   Gender  Mark
1    replace F      85
2    zhang  F       70
3    wang   M       75
4    li     M       90
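
The examples above select lines by number. As a sketch of two things they do not show, the block below selects lines with a regular expression and chains several commands with -e. It uses a space-separated copy of the student data, and the replacement strings are my own.

```shell
student_data='ID Name Gender Mark
1 ming F 85
2 zhang F 70
3 wang M 75
4 li M 90'

# -n with a regex address: print only the lines that contain "ang".
ang_rows=$(printf '%s\n' "$student_data" | sed -n '/ang/p')
echo "$ang_rows"

# -e applies several commands in order: drop the header, then expand F and M.
edited=$(printf '%s\n' "$student_data" | sed -e '1d' -e 's/ F / Female /' -e 's/ M / Male /')
echo "$edited"
```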

Below are some simple regular expression matching examples. Working through them should make the basic syntax familiar:

HelloWorld      Matches the 10 letters HelloWorld anywhere on any line
^HelloWorld     Matches the 10 letters HelloWorld at the beginning of a line
HelloWorld$     Matches the 10 letters HelloWorld at the end of a line
^HelloWorld$    Matches a line that consists of exactly these 10 letters: HelloWorld
[Hh]elloWorld   Matches HelloWorld or helloWorld
Hello.World     Matches the five letters Hello, plus any single character, plus World
Hello*World     Matches the four letters Hell, plus zero or more o, plus World
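
The patterns in this table can be tried out with grep. The sketch below counts matching lines in a small sample of my own; count_matches is a hypothetical helper defined only for this example.

```shell
sample_lines='HelloWorld
say HelloWorld
helloWorld
Hello World
HellWorld
HellooWorld'

# Hypothetical helper: count the sample lines that match a pattern.
count_matches() { printf '%s\n' "$sample_lines" | grep -c -- "$1"; }

anywhere=$(count_matches 'HelloWorld')        # lines 1 and 2
at_start=$(count_matches '^HelloWorld')       # line 1 only
whole_line=$(count_matches '^HelloWorld$')    # line 1 only
either_case=$(count_matches '[Hh]elloWorld')  # lines 1, 2 and 3
any_char=$(count_matches 'Hello.World')       # lines 4 and 6
star=$(count_matches 'Hello*World')           # lines 1, 2, 5 and 6

echo "$anywhere $at_start $whole_line $either_case $any_char $star"
```

The last two counts show the difference between "." and "*": Hello.World needs exactly one extra character between Hello and World, while Hello*World accepts any number of o after Hell, including none.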

In the examples above, “.” matches exactly one character and “*” repeats the preceding character zero or more times. To match a character repeated a specific number of times, or a range of times, use “{}”. In basic regular expressions, which grep uses by default, the braces must be escaped with the escape character “\”, for example:
[root@localhost kouyang]# grep -n 'o\{2\}' hello.txt
Finds the lines in hello.txt that contain two consecutive “o” characters

[root@localhost kouyang]# grep -n 'go\{2,5\}g' hello.txt
Finds the lines in hello.txt that contain a “g”, followed by 2 to 5 “o” characters, followed by another “g”
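
A minimal sketch of the interval syntax on a few made-up words, comparing the escaped form used by plain grep with the unescaped form accepted by grep -E (extended regular expressions):

```shell
words='gog
goog
gooooog
goooooog'

# Basic regular expression: braces are escaped.
bre_matches=$(printf '%s\n' "$words" | grep 'go\{2,5\}g')
echo "$bre_matches"

# Extended regular expression: the same interval without backslashes.
ere_matches=$(printf '%s\n' "$words" | grep -E 'go{2,5}g')
echo "$ere_matches"
```

Both commands select goog (two o) and gooooog (five o); gog has too few and goooooog has too many.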
