Ripgrep, the fastest text search tool on Linux

Time: 2020-11-27

Preface

When it comes to text search tools, you will know grep, one of the most useful and commonly used tools on Linux.
But searching for a keyword across a large project with grep is time-consuming.
So a number of alternatives have appeared. The best known so far are ack and Ag.
A more recent alternative is ripgrep (rg). Like Ag, it searches with multiple threads, but rg is faster than both ack and Ag.

Brief introduction

Ripgrep is a line-oriented search tool. It recursively searches the specified directory for the given pattern. It is written in Rust and is generally faster than other tools of its kind.
Its main features are as follows:

  • Automatic recursive search (grep requires -R)
  • Automatically ignores files listed in .gitignore, as well as binary files
  • Can restrict the search by file type (rg -tpy foo searches only Python files, rg -Tjs foo excludes JS files)
  • Supports most of the commonly used grep features
  • Supports various file encodings (UTF-8, UTF-16, Latin-1, GBK, EUC-JP, Shift_JIS, etc.)
  • Supports searching common compressed files (gzip, xz, lzma, bzip2, lz4)
  • Automatically highlights matching results
  • Shorter command name: rg (grep is four characters)
  • Multi-line search and fancy regex features (such as look-around and backreferences) are not supported in the version described here
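
The file type filters from the list above can be tried on a throwaway directory. This is only a minimal sketch, assuming rg is already on PATH; the scratch directory and the file names a.py, b.js and c.txt are made up for the illustration.

```shell
# Scratch directory with one file per language; all names here are hypothetical.
demo=$(mktemp -d)
printf 'foo = 1\n'      > "$demo/a.py"
printf 'var foo = 1;\n' > "$demo/b.js"
printf 'foo\n'          > "$demo/c.txt"

# -tpy: search Python files only. -l prints just the paths of matching files.
py_only=$(rg -l -tpy foo "$demo")

# -Tjs: search everything except JavaScript files.
not_js=$(rg -l -Tjs foo "$demo" | sort)

printf 'Python only:\n%s\n' "$py_only"
printf 'Everything but JS:\n%s\n' "$not_js"

rm -rf "$demo"
```

The path is passed explicitly because rg may read standard input instead of the current directory when it is run from a script or a pipe.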

Ripgrep installation

Install Rust first


curl https://sh.rustup.rs -sSf | sh

Then press Enter at each prompt to accept the defaults

Build ripgrep with Rust


git clone https://github.com/BurntSushi/ripgrep
cd ripgrep
cargo build --release
cp ./target/release/rg /usr/local/bin/

The last step copies the binary into a directory on your PATH; adjust the destination to suit your setup
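
A quick way to check the result of the copy step is to ask the shell where it finds rg. This is a small sketch rather than part of the official install procedure; the variable name rg_path is only illustrative.

```shell
# Locate the binary through PATH; an empty result means the copy step put it
# somewhere that PATH does not cover.
rg_path=$(command -v rg || true)

if [ -n "$rg_path" ]; then
    echo "rg found at: $rg_path"
    rg --version
else
    echo "rg is not on PATH; current PATH is: $PATH" >&2
fi
```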

Usage

Overall usage format


USAGE:
  rg [OPTIONS] PATTERN [PATH ...]
  rg [OPTIONS] [-e PATTERN ...] [-f PATTERNFILE ...] [PATH ...]
  rg [OPTIONS] --files [PATH ...]
  rg [OPTIONS] --type-list
  command | rg [OPTIONS] PATTERN

Arguments


ARGS:
  <PATTERN>
      A regular expression used for searching. To match a pattern beginning with a
      dash, use the -e/--regexp flag.

      For example, to search for the literal '-foo', you can use this flag:

        rg -e -foo

      You can also use the special '--' delimiter to indicate that no more flags
      will be provided. Namely, the following is equivalent to the above:

        rg -- -foo

  <PATH>...
      A file or directory to search. Directories are searched recursively. Paths specified on
      the command line override glob and ignore rules.

OPTIONS:
  -A, --after-context <NUM>
      Show NUM lines after each match. Overrides --context.
  -B, --before-context <NUM>
      Show NUM lines before each match. Overrides --context.
  -b, --byte-offset
      Print the byte offset of each output line within the file. Combined with
      -o, print the offset of the matching part itself.
  -s, --case-sensitive
      Search case sensitively. Overrides -i (--ignore-case) and -S (--smart-case).
  --color <WHEN>
      When to use color. The default is auto; with --vimgrep the default is never.
      Possible values: never, auto, always, ansi.
  --colors <COLOR_SPEC>...
      Set output colors. The format is {type}:{attribute}:{value}.
        {type}:      path, line, column, match
        {attribute}: fg, bg, style
        {value}:     a color (for fg and bg) or a text style (for style)
      Colors: red, blue, green, cyan, magenta, yellow, white, black.
      Styles: nobold, bold, nointense, intense, nounderline, underline.
      {type}:none clears the color settings of {type}.
      Example:
        rg --colors 'match:fg:magenta' --colors 'line:bg:yellow' foo
      Extended colors can be used for {value} if the terminal supports ANSI
      color sequences. They are written as 'x' (256 colors) or 'x,x,x' (24-bit
      true color), where x is a number between 0 and 255. Values are decimal by
      default; the 0x prefix marks hexadecimal. For example:
        rg --colors 'match:bg:0,128,255'
      or, equivalently:
        rg --colors 'match:bg:0x0,0x80,0xFF'
      intense and nointense have no effect with extended colors.
  --column
      Show the column number (starting from 1) of the first match on each line.
      Disable with --no-column.
  -C, --context <NUM>
      Show NUM lines before and after each match. Overrides -B and -A.
  --context-separator <SEPARATOR>
      The string used to separate non-contiguous context lines in the output.
      Escape sequences such as \x7F or \t may be used. The default is --.
  -c, --count
      Show only the number of matching lines for each file. If only one file is
      given to ripgrep, only the count is printed; use --with-filename to force
      the file name to be printed. Overrides --count-matches.
  --count-matches
      Show only the number of individual matches for each file. With a single
      file, use --with-filename to force the file name to be printed.
  --debug
      Show debug information.
  --dfa-size-limit <NUM+SUFFIX?>
      The upper size limit of the regex DFA. The default is 10M.
  -E, --encoding <ENCODING>
      The text encoding of the files to search. The default is auto. Supported
      labels: https://encoding.spec.whatwg.org/#concept-encoding-get
  -f, --file <PATTERNFILE>...
      Read patterns from a file, one pattern per line. Can be given multiple
      times and combined with -e; a line is printed if it matches any of the
      patterns.
  --files
      Print every file that would be searched, without searching:
        rg <options> --files [PATH...]
      No pattern may be given.
  -l, --files-with-matches
      Print only the paths of files with at least one match. Overrides
      --files-without-match.
  --files-without-match
      Print only the paths of files with no match. Overrides
      --files-with-matches.
  -F, --fixed-strings
      Treat the pattern as a literal string instead of a regex. Disable with
      --no-fixed-strings.
  -L, --follow
      Follow symbolic links while searching recursively. Off by default; disable
      with --no-follow.
  -g, --glob <GLOB>...
      Include or exclude files matching the given glob; prefix the glob with !
      to exclude. Can be given multiple times and follows .gitignore glob rules.
  -h, --help
      Print the help information.
  --heading
      Print the file name above its matches instead of on every matching line.
      This is the default when printing to a terminal and can be turned off
      with --no-heading.
  --hidden
      Search hidden files and directories, which are skipped by default. Disable
      with --no-hidden.
  --iglob <GLOB>...
      Same as --glob, but matches case insensitively.
  -i, --ignore-case
      Match the pattern case insensitively. Overridden by -s/--case-sensitive
      and -S/--smart-case.
  --ignore-file <PATH>...
      Path to a file of ignore rules in .gitignore format. Can be given multiple
      times; later files take precedence over earlier ones. To include or
      exclude files directly on the command line, use -g instead.
  -v, --invert-match
      Invert matching: show the lines that do not match.
  -n, --line-number
      Show line numbers. On by default when printing to a terminal.
  -x, --line-regexp
      Only show matches that span an entire line. Overrides --word-regexp.
  -M, --max-columns <NUM>
      Do not print matching lines longer than NUM columns (measured in bytes).
  -m, --max-count <NUM>
      Limit the number of matching lines per file to NUM.
  --max-depth <NUM>
      Limit the depth of recursive directory traversal. With
      rg --max-depth 0 dir/ no search is performed.
  --max-filesize <NUM+SUFFIX?>
      Ignore files larger than NUM bytes. The suffix can be K, M or G; without
      a suffix the value is in bytes.
  --mmap
      Search using memory maps when possible; ripgrep already does this by
      default where it expects a benefit. Memory maps do not currently support
      all options. Disable with --no-mmap.
  --no-config
      Do not read configuration files; RIPGREP_CONFIG_PATH is ignored.
  --no-filename
      Do not print the file name with matches.
  --no-heading
      Print the file name at the start of every matching line instead of as a
      heading.
  --no-ignore
      Do not respect ignore files such as .gitignore and .ignore. Disable with
      --ignore.
  --no-ignore-global
      Do not respect global ignore files, such as $HOME/.config/git/ignore.
  --no-ignore-messages
      Suppress error messages from parsing .ignore and .gitignore files.
      Disable with --ignore-messages.
  --no-ignore-parent
      Do not respect .gitignore and .ignore files in parent directories.
      Disable with --ignore-parent.
  --no-ignore-vcs
      Do not respect version control ignore files such as .gitignore; .ignore
      files are still respected. Disable with --ignore-vcs.
  -N, --no-line-number
      Do not print line numbers.
  --no-messages
      Suppress error messages about opening and reading files.
  -0, --null
      Follow each printed file path with a NUL byte. Very useful with xargs.
  -o, --only-matching
      Print only the matched parts of a line, not the entire line.
  --passthru
      Print both matching and non-matching lines.
  --path-separator <SEPARATOR>
      The path separator to use when printing paths. The default on Linux is /.
  --pre <COMMAND>
      For each file, run COMMAND on it and search the command's output instead
      of the file itself. This can carry a huge performance penalty. For
      example, COMMAND could be a script like:
        #!/bin/sh
        # The file path arrives as $1 and the file contents on stdin.
        case "$1" in
        *.pdf)
            # Convert PDFs to plain text on stdout.
            exec pdftotext "$1" -
            ;;
        *)
            case $(file "$1") in
            *Zstandard*)
                # Decompress Zstandard data read from stdin.
                exec pzstd -cdq
                ;;
            *)
                # Pass everything else through unchanged.
                exec cat
                ;;
            esac
            ;;
        esac
  -p, --pretty
      Alias for --color always --heading --line-number.
  -q, --quiet
      Print nothing to stdout and stop searching as soon as a match is found.
      Very useful when only the exit code matters.
  --regex-size-limit <NUM+SUFFIX?>
      The upper size limit of the compiled regex.
  -e, --regexp <PATTERN>...
      A pattern to search for. Can be given multiple times; lines matching any
      of the patterns are printed. Also useful for patterns that start with a
      dash:
        rg -e -foo
  -r, --replace <REPLACEMENT_TEXT>
      Print REPLACEMENT_TEXT in place of each match in the output. Capture
      group references such as $5 can be used.
  -z, --search-zip
      Search in compressed files (gz, bz2, xz, lzma, lz4). Disable with
      --no-search-zip.
  -S, --smart-case
      Search case insensitively if the pattern is all lowercase, and case
      sensitively otherwise. Overridden by -s/--case-sensitive and
      -i/--ignore-case.
  --sort-files
      Sort the output by file path. This turns off parallel searching.
  --stats
      Print statistics about the search.
  -a, --text
      Search binary files as if they were text. Disable with --no-text.
  -j, --threads <NUM>
      The approximate number of threads to use.
  -t, --type <TYPE>...
      Only search files of type TYPE. Can be given multiple times. The
      supported types are listed by --type-list.
  --type-add <TYPE_SPEC>...
      Add a new glob for a file type, for example:
        rg --type-add 'foo:*.foo' -tfoo PATTERN
      It can also create a type that includes the rules of other types:
        --type-add 'src:include:cpp,py,md'
  --type-clear <TYPE>...
      Clear the globs defined by default for TYPE.
  --type-list
      List all built-in file types.
  -T, --type-not <TYPE>...
      Do not search files of type TYPE.
  -u, --unrestricted
      -u also searches files listed in .gitignore, -uu also searches hidden
      files, and -uuu also searches binary files.
  -V, --version
      Print version information.
  --vimgrep
      Print every match on its own line; a line with several matches is printed
      once per match.
  -H, --with-filename
      Print the file path with each match. This is the default; disable with
      --no-filename.
  -w, --word-regexp
      Only show matches surrounded by word boundaries. Roughly equivalent to
      \b<pattern>\b.
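
A few of the options above are easier to understand on a concrete input. The following is a minimal sketch, assuming rg is on PATH; the file server.log and its contents are invented for the example.

```shell
# Small sample file in a scratch directory (hypothetical data).
demo=$(mktemp -d)
cat > "$demo/server.log" <<'EOF'
port=8080 status=ok
port=9090 status=error
port=7070 status=ok
EOF

# -c: print the number of matching lines instead of the lines themselves.
ok_count=$(rg -c 'status=ok' "$demo/server.log")

# -o with -r: print only the matched text, rewritten through capture group 1.
# The file on disk is not modified; only the output changes.
ports=$(rg -o -N -r '$1' 'port=([0-9]+)' "$demo/server.log")

# -q: print nothing and rely on the exit status (0 means a match was found).
if rg -q 'status=error' "$demo/server.log"; then
    has_error=yes
else
    has_error=no
fi

echo "ok lines:  $ok_count"
echo "ports:     $(echo $ports)"
echo "has error: $has_error"

rm -rf "$demo"
```

The single quotes around '$1' keep the shell from expanding it, so rg receives the capture group reference unchanged.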

Case study

Example 1

Search recursively for name under the current directory


$ rg 'name' ./

Example 2

Search for lines where name appears as a whole word (-w), roughly equivalent to \b<pattern>\b


$ rg -w 'name' ./

Example 3

Print only the names of files that contain matches (-l)


$ rg -w 'name' ./ -l
src/cpp/epoll_server.cpp
src/cpp/uart_xtor.cpp

Example 4

Search only C++ files (-t cpp); use -T instead to exclude a file type


$ rg -w 'name' ./ -tcpp

Example 5

Regular expression search (-e)


$ rg -e "sa.*port" ./ -tcpp

Example 6

Display each match with two lines of context above and below (-C), similar to -A / -B


$ rg -e "sa.*port" ./ -tcpp -C2
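
To see exactly which lines the context options add, a five-line file is enough. This is a small sketch under the assumption that rg is on PATH; lines.txt is a made-up file.

```shell
demo=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$demo/lines.txt"

# -C1: one line of context on both sides of the match.
# With -n, matching lines are marked "N:" and context lines "N-".
both=$(rg -n -C1 three "$demo/lines.txt")

# -B1 / -A1: context on one side only (-N hides the line numbers).
before=$(rg -N -B1 three "$demo/lines.txt")
after=$(rg -N -A1 three "$demo/lines.txt")

printf '%s\n---\n%s\n---\n%s\n' "$both" "$before" "$after"

rm -rf "$demo"
```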

Example 7

Show lines that do not contain debug (-v)


$ rg -v "debug" -tcpp ./

Example 8

Show only the matching part of each line (-o)


$ rg -e "if.*debug" ./ -tcpp -o

Example 9

Ignore case (-i)


$ rg -ie "if.*debug" ./ -tcpp -o

Example 10

Treat the pattern as a literal string (-F), so that characters such as . ( ) { } * + need no escaping. If the string you are searching for starts with -, use -- as a separator, or use rg -e "-foo"


$ rg -F "i++)" ./ -tcpp
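
Both points of this example, literal matching and patterns that start with a dash, can be checked on a tiny file. The sketch below assumes rg is on PATH; loop.cpp and its contents are invented.

```shell
demo=$(mktemp -d)
cat > "$demo/loop.cpp" <<'EOF'
for (int i = 0; i < n; i++) {
    total += -foo;
}
EOF

# -F: "i++)" is taken literally, so +, ( and ) need no escaping.
literal=$(rg -N -F 'i++)' "$demo/loop.cpp")

# A pattern starting with "-" would be parsed as a flag; pass it with -e ...
dash_e=$(rg -N -F -e '-foo' "$demo/loop.cpp")

# ... or end option parsing with "--" first.
dash_sep=$(rg -N -F -- '-foo' "$demo/loop.cpp")

printf '%s\n%s\n%s\n' "$literal" "$dash_e" "$dash_sep"

rm -rf "$demo"
```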

Example 11

Print all files that would be searched (--files)


$ rg --files
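
Since --files applies the same filtering as a real search, it is a convenient way to preview the effect of -g globs. This is a minimal sketch, assuming rg is on PATH; the directory layout is hypothetical.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/docs"
: > "$demo/src/main.cpp"
: > "$demo/src/util.h"
: > "$demo/docs/notes.md"

# Every file rg would search under the directory.
all_files=$(rg --files "$demo" | sort)

# -g '*.cpp': keep only paths matching the glob.
cpp_only=$(rg --files -g '*.cpp' "$demo")

# -g '!*.md': a leading ! excludes paths matching the glob.
no_md=$(rg --files -g '!*.md' "$demo" | sort)

printf 'all:\n%s\ncpp only:\n%s\nwithout md:\n%s\n' "$all_files" "$cpp_only" "$no_md"

rm -rf "$demo"
```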

Example 12

List the built-in file types


$ rg --type-list
agda: *.agda, *.lagda
aidl: *.aidl
amake: *.bp, *.mk
asciidoc: *.adoc, *.asc, *.asciidoc
asm: *.S, *.asm, *.s
ats: *.ats, *.dats, *.hats, *.sats
avro: *.avdl, *.avpr, *.avsc
awk: *.awk
bazel: *.bzl, BUILD, WORKSPACE
bitbake: *.bb, *.bbappend, *.bbclass, *.conf, *.inc
bzip2: *.bz2
c: *.H, *.c, *.cats, *.h
cabal: *.cabal
cbor: *.cbor
ceylon: *.ceylon
clojure: *.clj, *.cljc, *.cljs, *.cljx
cmake: *.cmake, CMakeLists.txt
coffeescript: *.coffee
config: *.cfg, *.conf, *.config, *.ini
cpp: *.C, *.H, *.cc, *.cpp, *.cxx, *.h, *.hh, *.hpp, *.hxx, *.inl
creole: *.creole
crystal: *.cr, Projectfile
cs: *.cs
csharp: *.cs
cshtml: *.cshtml
css: *.css, *.scss
csv: *.csv
cython: *.pyx
d: *.d
dart: *.dart
dhall: *.dhall
docker: *Dockerfile*
elisp: *.el
elixir: *.eex, *.ex, *.exs
elm: *.elm
erlang: *.erl, *.hrl
fidl: *.fidl
fish: *.fish
fortran: *.F, *.F77, *.F90, *.F95, *.f, *.f77, *.f90, *.f95, *.pfo
fsharp: *.fs, *.fsi, *.fsx
gn: *.gn, *.gni
go: *.go
groovy: *.gradle, *.groovy
gzip: *.gz
h: *.h, *.hpp
haskell: *.c2hs, *.cpphs, *.hs, *.hsc, *.lhs
hbs: *.hbs
hs: *.hs, *.lhs
html: *.ejs, *.htm, *.html
idris: *.idr, *.lidr
java: *.java, *.jsp
jinja: *.j2, *.jinja, *.jinja2
jl: *.jl
js: *.js, *.jsx, *.vue
json: *.json, composer.lock
jsonl: *.jsonl
julia: *.jl
jupyter: *.ipynb, *.jpynb
kotlin: *.kt, *.kts
less: *.less
license: *[.-]LICEN[CS]E*, AGPL-*[0-9]*, APACHE-*[0-9]*, BSD-*[0-9]*, CC-BY-*, COPYING, COPYING[.-]*, COPYRIGHT, COPYRIGHT[.-]*, EULA, EULA[.-]*, GFDL-*[0-9]*, GNU-*[0-9]*, GPL-*[0-9]*, LGPL-*[0-9]*, LICEN[CS]E, LICEN[CS]E[.-]*, MIT-*[0-9]*, MPL-*[0-9]*, NOTICE, NOTICE[.-]*, OFL-*[0-9]*, PATENTS, PATENTS[.-]*, UNLICEN[CS]E, UNLICEN[CS]E[.-]*, agpl[.-]*, gpl[.-]*, lgpl[.-]*, licen[cs]e, licen[cs]e.*
lisp: *.el, *.jl, *.lisp, *.lsp, *.sc, *.scm
log: *.log
lua: *.lua
lz4: *.lz4
lzma: *.lzma
m4: *.ac, *.m4
make: *.mak, *.mk, GNUmakefile, Gnumakefile, Makefile, gnumakefile, makefile
man: *.[0-9][cEFMmpSx], *.[0-9lnpx]
markdown: *.markdown, *.md, *.mdown, *.mkdn
matlab: *.m
md: *.markdown, *.md, *.mdown, *.mkdn
mk: mkfile
ml: *.ml
msbuild: *.csproj, *.fsproj, *.proj, *.props, *.targets, *.vcxproj
nim: *.nim
nix: *.nix
objc: *.h, *.m
objcpp: *.h, *.mm
ocaml: *.ml, *.mli, *.mll, *.mly
org: *.org
pdf: *.pdf
perl: *.PL, *.perl, *.pl, *.plh, *.plx, *.pm, *.t
php: *.php, *.php3, *.php4, *.php5, *.phtml
pod: *.pod
protobuf: *.proto
ps: *.cdxml, *.ps1, *.ps1xml, *.psd1, *.psm1
puppet: *.erb, *.pp, *.rb
purs: *.purs
py: *.py
qmake: *.prf, *.pri, *.pro
r: *.R, *.Rmd, *.Rnw, *.r
rdoc: *.rdoc
readme: *README, README*
rst: *.rst
ruby: *.gemspec, *.rb, .irbrc, Gemfile, Rakefile
rust: *.rs
sass: *.sass, *.scss
scala: *.sbt, *.scala
sh: *.bash, *.bashrc, *.csh, *.cshrc, *.ksh, *.kshrc, *.sh, *.tcsh, *.zsh, .bash_login, .bash_logout, .bash_profile, .bashrc, .cshrc, .kshrc, .login, .logout, .profile, .tcshrc, .zlogin, .zlogout, .zprofile, .zshenv, .zshrc, bash_login, bash_logout, bash_profile, bashrc, profile, zlogin, zlogout, zprofile, zshenv, zshrc
smarty: *.tpl
sml: *.sig, *.sml
soy: *.soy
spark: *.spark
sql: *.psql, *.sql
stylus: *.styl
sv: *.h, *.sv, *.svh, *.v, *.vg
svg: *.svg
swift: *.swift
swig: *.def, *.i
systemd: *.automount, *.conf, *.device, *.link, *.mount, *.path, *.scope, *.service, *.slice, *.socket, *.swap, *.target, *.timer
taskpaper: *.taskpaper
tcl: *.tcl
tex: *.bib, *.cls, *.ltx, *.sty, *.tex
textile: *.textile
tf: *.tf
toml: *.toml, Cargo.lock
ts: *.ts, *.tsx
twig: *.twig
txt: *.txt
vala: *.vala
vb: *.vb
verilog: *.sv, *.svh, *.v, *.vh
vhdl: *.vhd, *.vhdl
vim: *.vim
vimscript: *.vim
webidl: *.idl, *.webidl, *.widl
wiki: *.mediawiki, *.wiki
xml: *.xml, *.xml.dist
xz: *.xz
yacc: *.y
yaml: *.yaml, *.yml
zsh: *.zsh, .zlogin, .zlogout, .zprofile, .zshenv, .zshrc, zlogin, zlogout, zprofile, zshenv, zshrc
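
When an extension is missing from this list, --type-add can define a type for a single invocation. The following is a rough sketch, assuming rg is on PATH; the type name foo and the files a.foo and b.txt are made up.

```shell
demo=$(mktemp -d)
printf 'needle\n' > "$demo/a.foo"
printf 'needle\n' > "$demo/b.txt"

# Define a type named "foo" for this run only, then restrict the search to it.
custom=$(rg -l --type-add 'foo:*.foo' -tfoo needle "$demo")

# Look up a single built-in type by filtering the --type-list output with rg.
cpp_entry=$(rg --type-list | rg '^cpp:')

printf 'files of type foo:\n%s\n' "$custom"
printf 'built-in entry:\n%s\n' "$cpp_entry"

rm -rf "$demo"
```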

Summary

Ripgrep's search speed is really fast, and it helps me a lot when browsing code. I believe it is of great value to every programmer, especially when combined with fzf.
The only weakness is its limited regex support, but this is a trade-off: using a library like PCRE would greatly affect the speed.
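
In many cases the missing regex features can be worked around with the options already described. The sketch below is one such workaround rather than a general replacement for look-ahead; it assumes rg is on PATH with no configuration file that changes the regex engine, and words.txt is a made-up file.

```shell
demo=$(mktemp -d)
printf 'foobar\nfoobaz\n' > "$demo/words.txt"

# Look-ahead syntax is rejected by the default regex engine, so rg reports
# an error and exits with a non-zero status.
if rg -q 'foo(?=bar)' "$demo/words.txt" 2>/dev/null; then
    lookahead=accepted
else
    lookahead=rejected
fi

# Workaround: match the full text, capture the wanted part, and print only
# the capture with -o and -r.
emulated=$(rg -o -N -r '$1' '(foo)bar' "$demo/words.txt")

echo "look-ahead: $lookahead"
echo "emulated:   $emulated"

rm -rf "$demo"
```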

I hope all of the above is helpful to you.