In short, this technique addresses the following scenario.
Suppose we have the following text:
aaaa
bbbb
dddd
bbbb
cccc
aaaa
Now the duplicate lines need to be removed. That is simple: sort -u can handle it. But suppose I want to keep the original order of the text. For example, aaaa appears twice here and I only want to remove the second one; the first aaaa comes before bbbb, so it must still come before bbbb after deduplication. My expected output is therefore:
aaaa
bbbb
dddd
cccc
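To see concretely why sort -u alone is not enough, here is a minimal sketch using the sample lines above. The variable names input and sorted_unique are only illustrative and do not come from the original article.

```shell
# Sample input from the text above, one entry per line.
input='aaaa
bbbb
dddd
bbbb
cccc
aaaa'

# sort -u removes the duplicates, but it also reorders the lines.
sorted_unique=$(printf '%s\n' "$input" | sort -u)
printf '%s\n' "$sorted_unique"
```

This prints aaaa, bbbb, cccc, dddd. The duplicates are gone, but dddd no longer comes before cccc as it did in the input.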
Of course, the problem itself is not hard. It is easy to solve in C++ or Python, but there is no need to use a sledgehammer to crack a nut: when a problem can be solved with shell commands, that is always our first choice. The answer is given at the end. First, here is how I ran into the problem.
Sometimes, when we want to add our own directory to the PATH environment variable, we put a line in the ~/.bashrc file that prepends the directory to PATH. For example, say the directory to add is $HOME/bin.
In this way, $HOME/bin is added to PATH and is searched first. However, every time we run source ~/.bashrc, $HOME/bin is prepended to PATH again. Suppose that next time we add another directory in the same way.
After we run source ~/.bashrc again, PATH actually contains two entries for $HOME/bin. This does no harm in practice, but it is unbearable for anyone with obsessive-compulsive tendencies. So the problem becomes: remove the repeated entries from $PATH while keeping the original order, that is, whichever entry came first must still come first after deduplication. The shell searches the directories in PATH starting from the first one when it runs a command, so order matters.
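The duplication described above can be reproduced with a small sketch. It works on a stand-in variable, demo_path, rather than the real PATH, and the function simulate_source_bashrc is a hypothetical substitute for running source ~/.bashrc on a file that prepends $HOME/bin; neither name appears in the original article.

```shell
# Stand-in for PATH so the real environment is left untouched (assumption).
demo_path=/usr/bin:/bin

# Hypothetical substitute for "source ~/.bashrc" when ~/.bashrc
# contains a line that prepends $HOME/bin to PATH.
simulate_source_bashrc() {
    demo_path="$HOME/bin:$demo_path"
}

simulate_source_bashrc   # first source: $HOME/bin is added once
simulate_source_bashrc   # second source: $HOME/bin is added again

# Print one entry per line to make the repeated entry easy to spot.
printf '%s\n' "$demo_path" | tr ':' '\n'
```

The first two lines of the output are both $HOME/bin (expanded), followed by /usr/bin and /bin.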
Well, enough preamble; here is the final result. Taking the data at the beginning of the article as an example, and assuming the input file is in.txt, the command is as follows
cat -n in.txt | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2-
These are all very simple shell commands, which are explained below
cat -n: prefix each line with its line number, separated from the text by a tab
sort -k2,2 -k1,1n: sort the input. The primary key is the second field; the secondary key is the first field, compared numerically, so identical lines end up adjacent with the earliest occurrence first
uniq -f1: ignore the first field when comparing lines and drop the duplicates; the first field is still included in the output
sort -k1,1n: sort the input numerically by the first field, which restores the original order
cut -f2-: output the second field and everything after it. The default delimiter is \t (tab)
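The steps above can be sketched as one reusable function. The name dedup_keep_order is made up for this illustration, and the sketch assumes the input lines contain no spaces or tabs: since sort -k2,2 only looks at the first whitespace-delimited word of the text, lines with embedded blanks may not be grouped correctly.

```shell
# Hypothetical helper: remove duplicate lines from stdin, keeping the
# first occurrence of each line in its original position.
# Assumes the lines contain no spaces or tabs.
dedup_keep_order() {
    cat -n |                 # prefix every line with "<number><TAB>"
        sort -k2,2 -k1,1n |  # group equal lines, earliest number first
        uniq -f1 |           # skip the number field, keep first of each group
        sort -k1,1n |        # restore the original order by line number
        cut -f2-             # strip the number field again
}

result=$(printf 'aaaa\nbbbb\ndddd\nbbbb\ncccc\naaaa\n' | dedup_keep_order)
printf '%s\n' "$result"
```

For the sample data this prints aaaa, bbbb, dddd, cccc. Truncating the pipeline after any stage shows what that stage contributes.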
You can start with the first command and add the others one at a time to see the actual output at each stage, which makes it easier to understand. So how do we deal with the repeated entries in $PATH? Returning to the earlier example, just use tr to convert before and after
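# The final sed strips the trailing ':' left by tr; an empty PATH entry would mean the current directory
export PATH=$(echo "$PATH" | tr ':' '\n' | cat -n | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2- | tr '\n' ':' | sed 's/:$//')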
In fact, managing PATH this way has a problem. For example, if we want to remove $HOME/bin from PATH after running the command above, it is not enough to edit ~/.bashrc so that it no longer adds $HOME/bin and then source it again with the following line still in place
export PATH=$(echo "$PATH" | tr ':' '\n' | cat -n | sort -k2,2 -k1,1n | uniq -f1 | sort -k1,1n | cut -f2- | tr '\n' ':' | sed 's/:$//')
Because $HOME/bin is already in $PATH, this does not delete it. Perhaps the best approach is to know exactly which paths you need and set PATH explicitly instead of adding to it.
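As a rough sketch of that last suggestion, the value can be assigned in full rather than built from the previous one. The directory list here is purely an assumption for illustration, and demo_path and set_explicit_path are hypothetical names; in a real ~/.bashrc the assignment would be to PATH itself, with whatever directories your system needs.

```shell
# Hypothetical stand-in for a ~/.bashrc that sets the search path
# explicitly; the directory list is only an example.
set_explicit_path() {
    demo_path="$HOME/bin:/usr/local/bin:/usr/bin:/bin"
}

set_explicit_path   # first source
set_explicit_path   # second source: the value is replaced, not extended

printf '%s\n' "$demo_path"
```

Because the assignment does not refer to the old value, sourcing the file repeatedly cannot create duplicates, and removing a directory only requires deleting it from the list.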