Using sift

sift tries to be compatible with the basic options and output formats of the original grep, so in many cases sift can be used as a drop-in replacement for existing scripts and you do not have to learn everything from scratch.

One important difference is that sift searches recursively in the current directory by default if no target is given.

Example usages:

Command                                 Result
sift pattern                            search for pattern recursively in all files in the current directory
sift pattern .                          the same as above
sift pattern file1 file2 dir1           search for pattern in file1 and file2 and recursively in the directory dir1
cat file | sift pattern                 search for pattern in STDIN
sift -e pattern1 -e pattern2 [TARGETS]  search for pattern1 and pattern2 in the given targets
sift -f patternfile [TARGETS]           search for patterns contained in patternfile (one pattern per line) in the given targets

sift supports various options to specify which files should be searched - please see the file selection section on the samples page for details.

Configuration

Config file locations and format

sift allows customizations through config files. The global config is expected at $HOME/.sift.conf and a local config is used if the file .sift.conf exists in the current directory.

A config file does not have to list every available option; it may contain only a subset. For every option set in the local config, the local value overrules the global one.

The configuration is stored in JSON format - it can also be viewed with
sift --print-config
(This command also prints the expected locations of the configuration files)


An example config file:

{
 "BinarySkip": true,
 "Git": true,
 "GroupByFile": true,
 "IgnoreCase": true
}

Changing the configuration

sift supports changing the configuration for many options (without editing the config file) via the parameter --write-config.

This command

  1. parses the global config file (if it exists)
  2. parses the local config file (if it exists)
  3. parses the command line parameters

and then writes the resulting configuration to the local config file if it exists, or to the global config file otherwise.
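
The layering described above can be sketched in Python. This is only an illustration of the precedence order, not sift's actual code: the function names effective_config and write_target, and the idea of passing the command line options as a dictionary, are assumptions made for this example.

```python
import json
from pathlib import Path


def effective_config(global_path, local_path, cli_options):
    """Merge global config, local config and command line options, in that order.

    Later sources overrule earlier ones, option by option.
    """
    merged = {}
    for path in (global_path, local_path):
        path = Path(path)
        if path.is_file():
            # A config file may set only a subset of the options.
            merged.update(json.loads(path.read_text()))
    # Command line options are applied last (hypothetical dict representation).
    merged.update(cli_options)
    return merged


def write_target(global_path, local_path):
    """Return the file that receives the result: the local config if it exists, else the global one."""
    local_path = Path(local_path)
    return local_path if local_path.is_file() else Path(global_path)
```

With a global config of {"Git": true} and no local config, effective_config(..., {"IgnoreCase": True}) would yield both options, and write_target would point at the global file.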

Example: To enable the case-insensitive option as default, you can use
sift -i --write-config

Gitignore Support

sift understands .gitignore files: you can customize it to get only results for files you care about when searching in code repositories.

The implementation should support the full .gitignore pattern syntax documented by the git project here: https://git-scm.com/docs/gitignore

Use the option --git to enable git support when searching with sift.

Alternatively, you can use sift --git --write-config to always enable support for .gitignore files.


Implementation details

The current implementation does the following for each searched target directory:

  1. search for a .gitignore file in the current directory
  2. search for .gitignore files in all parent directories
  3. parse the files in the correct order

Each file is then checked against all patterns found to determine whether it should be ignored.
Directories named '.git' and files named '.gitignore' are not searched.
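
The three lookup steps can be sketched as follows. This is a minimal illustration, not sift's implementation: the function name gitignore_files is hypothetical, and "correct order" is assumed to mean outermost directory first, so that rules from files closer to the target are applied last, in line with the precedence described in the git documentation.

```python
from pathlib import Path


def gitignore_files(target_dir):
    """Return the .gitignore files that apply to target_dir, outermost first."""
    directory = Path(target_dir).resolve()
    found = []
    # Steps 1 and 2: look in the target directory, then in every parent directory.
    for candidate_dir in (directory, *directory.parents):
        candidate = candidate_dir / ".gitignore"
        if candidate.is_file():
            found.append(candidate)
    # Step 3 (assumed order): parse outer files first, inner files last.
    found.reverse()
    return found
```

The patterns themselves would then be parsed from each returned file in turn; that part is omitted here.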

The implementation currently does not parse the patterns in $HOME/.config/git/ignore and $GIT_DIR/info/exclude. This may change in a future version.

Conditions

Conditions allow more complex queries - take a look at the samples to see them in action.

The supported conditions are split into two groups: "File Conditions" work on the complete file, i.e. matches in the file are only shown if all file conditions are fulfilled, while "Match Conditions" are evaluated for every single match.

All conditions can be inverted by putting "not-" in front of the option.


File Conditions

Option                        Description
--file-matches=PATTERN        only show matches for the searched file if the file content also matches PATTERN
--line-matches=NUM:PATTERN    only show matches for the searched file if the given line of the file matches PATTERN
--range-matches=X:Y:PATTERN   only show matches for the searched file if any of the given lines (from X to Y inclusive) matches PATTERN
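
To make the semantics of the file conditions concrete, here is a small Python sketch. It is not taken from sift: the function names are hypothetical, line numbers are assumed to be 1-based, the file content is checked line by line as a simplification, and Python's re module stands in for sift's regular expression engine.

```python
import re


def file_matches(lines, pattern):
    """--file-matches: some line of the file matches pattern (simplified, line by line)."""
    return any(re.search(pattern, line) for line in lines)


def line_matches(lines, num, pattern):
    """--line-matches: line NUM (assumed 1-based) matches pattern."""
    return 1 <= num <= len(lines) and re.search(pattern, lines[num - 1]) is not None


def range_matches(lines, x, y, pattern):
    """--range-matches: any line from X to Y inclusive (assumed 1-based) matches pattern."""
    return any(re.search(pattern, line) for line in lines[max(x, 1) - 1:y])
```

A file's matches would only be shown if every file condition given on the command line evaluates to true for that file.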

Match Conditions

Option                            Description
--preceded-by=PATTERN             only show a match if it is preceded by a line matching PATTERN
--followed-by=PATTERN             only show a match if it is followed by a line matching PATTERN
--surrounded-by=PATTERN           only show a match if it is preceded or followed by a line matching PATTERN
--preceded-within=NUM:PATTERN     only show a match if it is preceded by a line matching PATTERN within NUM lines
--followed-within=NUM:PATTERN     only show a match if it is followed by a line matching PATTERN within NUM lines
--surrounded-within=NUM:PATTERN   only show a match if it is preceded or followed by a line matching PATTERN within NUM lines
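
The "within NUM lines" conditions can be illustrated with a short Python sketch. It is an assumption-laden model rather than sift's code: the function names are hypothetical, matches are treated as whole lines identified by a 0-based index, and "within NUM lines" is assumed to mean the NUM lines directly before or after the matching line.

```python
import re


def preceded_within(lines, match_index, num, pattern):
    """True if one of the num lines before lines[match_index] matches pattern."""
    start = max(0, match_index - num)
    return any(re.search(pattern, line) for line in lines[start:match_index])


def followed_within(lines, match_index, num, pattern):
    """True if one of the num lines after lines[match_index] matches pattern."""
    window = lines[match_index + 1:match_index + 1 + num]
    return any(re.search(pattern, line) for line in window)


def filter_matches(lines, search, condition):
    """Return the indexes of lines matching search for which condition holds."""
    return [
        i for i, line in enumerate(lines)
        if re.search(search, line) and condition(lines, i)
    ]
```

For example, filter_matches(lines, "return", lambda ls, i: preceded_within(ls, i, 2, "func")) keeps only those "return" lines that have a line containing "func" among the two lines before them.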

Options

All options are described in the output of sift --help.

Visit the samples page to see some of these options in practice.

Usage:
		sift [OPTIONS] PATTERN [FILE|PATH|tcp://HOST:PORT]...
		sift [OPTIONS] [-e PATTERN | -f FILE] [FILE|PATH|tcp://HOST:PORT]...
		sift [OPTIONS] --targets [FILE|PATH]...
OPTIONS
 --binary-skip
 skip files that seem to be binary
 -a, --binary-text
 process files that seem to be binary as text
 --blocksize=
 blocksize in bytes (with optional suffix K|M)
 --color
 enable colored output (default: auto)
 --no-color
 disable colored output
 -C, --context=NUM
 show NUM context lines
 -A, --context-after=NUM
 show NUM context lines after match
 -B, --context-before=NUM
 show NUM context lines before match
 -j, --cores=
 limit used CPU Cores (default: 0 = all)
 -c, --count
 print count of matches per file
 --dirs=GLOB
 recurse only into directories whose name matches GLOB
 --err-show-line-length
 show all line length errors
 --err-skip-line-length
 skip line length errors
 --exclude-dirs=GLOB
 do not recurse into directories whose name matches GLOB
 -x, --ext=
 limit search to specific file extensions (comma-separated)
 -X, --exclude-ext=
 exclude specific file extensions (comma-separated)
 --files=GLOB
 search only files whose name matches GLOB
 --exclude-files=GLOB
 do not select files whose name matches GLOB while recursing
 --path=PATTERN
 search only files whose path matches PATTERN
 --ipath=PATTERN
 search only files whose path matches PATTERN (case insensitive)
 --exclude-path=PATTERN
 do not search files whose path matches PATTERN
 --exclude-ipath=PATTERN
 do not search files whose path matches PATTERN (case insensitive)
 -t, --type=
 limit search to specific file types (comma-separated, see
 --list-types)
 -T, --no-type=
 exclude specific file types (comma-separated, --list-types)
 -l, --files-with-matches
 list files containing matches
 -L, --files-without-match
 list files containing no match
 --follow
 follow symlinks
 --git 
 respect .gitignore files and skip .git directories
 --group
 group output by file (default: off)
 --no-group
 do not group output by file
 -i, --ignore-case
 case insensitive (default: off)
 -I, --no-ignore-case
 disable case insensitive
 -s, --smart-case
 case insensitive unless pattern contains uppercase characters
 (default: off)
 -S, --no-smart-case
 disable smart case
 --no-conf
 do not load config files
 -v, --invert-match
 select non-matching lines
 --limit=NUM
 only show first NUM matches per file
 -Q, --literal
 treat pattern as literal, quote meta characters
 -m, --multiline
 multiline parsing (default: off)
 -M, --no-multiline
 disable multiline parsing
 --only-matching
 only show the matching part of a line
 -o, --output=FILE|tcp://HOST:PORT
 write output to the specified file or network connection
 --output-limit=
 limit output length per found match
 --output-sep=
 output separator (default: "\n")
 --output-unixpath
 output file paths in unix format ('/' as path separator)
 -e, --regexp=PATTERN
 add pattern PATTERN to the search
 -f, --regexp-file=FILE
 search for patterns contained in FILE (one per line)
 --print-config
 print config for loaded configs + given command line arguments
 -q, --quiet
 suppress output, exit with return code zero if any match is
 found
 -r, --recursive
 recurse into directories (default: on)
 -R, --no-recursive
 do not recurse into directories
 --replace=
 replace numbered or named (?P<name>pattern) capture groups. Use
 ${1}, ${2}, $name, ... for captured submatches
 --filename
 enforce printing the filename before results (default: auto)
 --no-filename
 disable printing the filename before results
 -n, --line-number
 show line numbers (default: off)
 -N, --no-line-number
 do not show line numbers
 --column
 show column numbers
 --no-column
 do not show column numbers
 --stats
 show statistics
 --targets
 only list selected files, do not search
 --list-types
 list available file types
 -V, --version
 show version and license information
 -w, --word-regexp
 only match on ASCII word boundaries
 --write-config
 save config for loaded configs + given command line arguments
 -z, --zip
 search content of compressed .gz files (default: off)
 -Z, --no-zip
 do not search content of compressed .gz files
 File Condition options:
 --file-matches=PATTERN
 only show matches if file also matches PATTERN
 --line-matches=NUM:PATTERN
 only show matches if line NUM matches PATTERN
 --range-matches=X:Y:PATTERN
 only show matches if lines X-Y match PATTERN
 --not-file-matches=PATTERN
 only show matches if file does not match PATTERN
 --not-line-matches=NUM:PATTERN
 only show matches if line NUM does not match PATTERN
 --not-range-matches=X:Y:PATTERN
 only show matches if lines X-Y do not match PATTERN
 Match Condition options:
 --preceded-by=PATTERN
 only show matches preceded by PATTERN
 --followed-by=PATTERN
 only show matches followed by PATTERN
 --surrounded-by=PATTERN
 only show matches surrounded by PATTERN
 --preceded-within=NUM:PATTERN
 only show matches preceded by PATTERN within NUM lines
 --followed-within=NUM:PATTERN
 only show matches followed by PATTERN within NUM lines
 --surrounded-within=NUM:PATTERN
 only show matches surrounded by PATTERN within NUM lines
 --not-preceded-by=PATTERN
 only show matches not preceded by PATTERN
 --not-followed-by=PATTERN
 only show matches not followed by PATTERN
 --not-surrounded-by=PATTERN
 only show matches not surrounded by PATTERN
 --not-preceded-within=NUM:PATTERN
 only show matches not preceded by PATTERN within NUM lines
 --not-followed-within=NUM:PATTERN
 only show matches not followed by PATTERN within NUM lines
 --not-surrounded-within=NUM:PATTERN
 only show matches not surrounded by PATTERN within NUM lines
 Help Options:
 -h, --help
 Show this help message
 

Limitations/restrictions

This section documents limitations/restrictions that currently apply to sift, especially regarding its functionality.

  • Multiline matching uses a sliding window (currently 32 KB); matches larger than this window will not be found reliably.
  • When sift starts showing matches depends on various conditions:
    1. Under normal circumstances, sift waits until a file has been fully processed before showing its matches, because files are processed in parallel.
    2. If only a single file is processed, matches are shown immediately. This also applies if sift only reads from STDIN. See item 4 for an exception.
    3. If sift finds more than 2^16 (65536) matches in a file, it starts printing them, and the output of matches from other files is blocked until the processing of the blocking file is finished.
    4. If conditions are used, sift will not show matches before the full file is processed.
  • If multiple matches exist on one single line, only one match gets highlighted. A matched line counts as one match, regardless of how many sub-matches exist on that line.
  • Currently matches are not highlighted on Windows. This might change in the future.
  • The option --invert-match (-v) only supports very basic usage, mostly to be compatible with grep in some typical use cases.
    • Multiline matching is not supported.
    • Network connections are not supported.
    • Performance is poor.
  • Multiline matching using multiple patterns (options -e or -f) currently has undefined behavior in cases where potential matches overlap. The current implementation rejects a match if it does not start after the end of the previous match.
  • The context options are not compatible with some other options.
    • Context options cannot be used with a non-standard output separator.
    • Context options cannot be used when reading from STDIN or a network connection.
    • Context options cannot be used with the --zip option enabled.
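
The rule for overlapping matches from multiple patterns, mentioned in the list above, can be pictured with this Python sketch. It is a guess at the behavior, not sift's code: the function name drop_overlapping is hypothetical, matches are assumed to be (start, end) offsets with an exclusive end, and "the last match" is assumed to mean the last accepted match.

```python
def drop_overlapping(matches):
    """Keep a match only if it starts at or after the end of the last accepted match."""
    accepted = []
    last_end = 0
    for start, end in sorted(matches):
        if start >= last_end:
            accepted.append((start, end))
            last_end = end
        # Otherwise the match overlaps the previous one and is rejected.
    return accepted
```

Under these assumptions, a match from one pattern that begins inside a match from another pattern is silently dropped, which is why the combined result should not be relied on when patterns can overlap.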
