Linux Classes
LINUX CLASSES - DATA MANIPULATION

How Do I Select Certain Records From a File?

The grep command selects and prints lines from a file (or a bunch of files) that match a pattern. Let's say your friend Bill sent you an email recently with his phone number, and you want to call him ASAP to order some books. Instead of launching your email program and sifting through all the messages, you can scan your in-box file, like this:

grep 'number' /var/mail/hermie
call No Starch Press at this number: 800/420-7240.
noted that recently, an alarming number of alien spacecrafts
among colleagues at a number of different organizations

Here, grep has pulled out just the lines that contain the word number. The first line is obviously what you were after, while the others just happened to match the pattern. The general form of the grep command is this:

grep <flags> <pattern> <files>

The most useful grep flags are shown here:

-i      Ignore uppercase and lowercase when comparing.
-v      Print only lines that do not match the pattern.
-c      Print only a count of the matching lines.
-n      Display the line number before each matching line.
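
The flags can be combined in a single command. The following is a minimal sketch, assuming a throwaway sample file whose name and contents are invented purely for illustration; it shows -n on its own, -i together with -n, and -v together with -c.

```shell
# Hypothetical sample file, created only for this sketch.
sample=$(mktemp)
printf 'alpha\nBeta\ngamma\nALPHA beta\n' > "$sample"

# -n prefixes each matching line with its line number.
numbered=$(grep -n 'beta' "$sample")
echo "$numbered"            # 4:ALPHA beta

# -i and -n together: ignore case and still show line numbers.
numbered_nocase=$(grep -in 'beta' "$sample")
echo "$numbered_nocase"     # 2:Beta and 4:ALPHA beta

# -v and -c together: count the lines that do NOT contain "alpha".
count_nonmatching=$(grep -vc 'alpha' "$sample")
echo "$count_nonmatching"   # 3

rm -f "$sample"
```

Single-letter flags can be written separately (-i -n) or bundled (-in); both forms behave the same way.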

When grep performs its pattern matching, it expects you to provide a regular expression for the pattern. Regular expressions can be very simple or quite complex, so we won't get into a lot of details here. Here are the most common types of regular expressions:

abc     Match lines containing the string "abc" anywhere.
^abc    Match lines starting with "abc".
abc$    Match lines ending with "abc".
a..c    Match lines containing "a" and "c" separated by any two characters (the dot matches any single character).
a.*c    Match lines containing "a" and "c" separated by any number of characters (the dot-asterisk means match zero or more of any character).
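
To see how these pattern types differ, here is a small sketch that feeds the same five lines to grep on standard input. The input words are made up for this illustration and are not from the lesson's sample files.

```shell
# Hypothetical input lines, one word per line.
words='abc
axxc
xabcx
cab
ac'

# ^abc: only the line that starts with "abc".
starts=$(printf '%s\n' "$words" | grep '^abc')
echo "$starts"        # abc

# abc$: only the line that ends with "abc".
ends=$(printf '%s\n' "$words" | grep 'abc$')
echo "$ends"          # abc

# a..c: exactly two characters between "a" and "c".
two_between=$(printf '%s\n' "$words" | grep 'a..c')
echo "$two_between"   # axxc

# a.*c: any number of characters (including none) between "a" and a later "c".
any_between=$(printf '%s\n' "$words" | grep 'a.*c')
echo "$any_between"   # abc, axxc, xabcx, ac (one per line)
```

Note that "cab" matches none of these patterns, because no "c" follows its "a", while "ac" matches a.*c because the dot-asterisk is allowed to match zero characters.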



Regular expressions also come into play when using vi, sed, awk, and other Unix commands. If you want to master Unix, take time to understand regular expressions. Here is a sample poem.txt file and some grep commands to demonstrate regular-expression pattern matching:

Mary had a little lamb
Mary fried a lot of spam
Jack ate a Spam sandwich
Jill had a lamb spamwich

To print all lines containing spam (respecting uppercase and lowercase), enter

grep 'spam' poem.txt
Mary fried a lot of spam
Jill had a lamb spamwich

To print all lines containing spam (ignoring uppercase and lowercase), enter

grep -i 'spam' poem.txt
Mary fried a lot of spam
Jack ate a Spam sandwich
Jill had a lamb spamwich

To print just the number of lines containing the word spam (ignoring uppercase and lowercase), enter

grep -ic 'spam' poem.txt
3

To print all lines not containing spam (ignoring uppercase and lowercase), enter

grep -i -v 'spam' poem.txt
Mary had a little lamb

To print all lines starting with Mary, enter

grep '^Mary' poem.txt
Mary had a little lamb
Mary fried a lot of spam

To print all lines ending with ich, enter

grep 'ich$' poem.txt
Jack ate a Spam sandwich
Jill had a lamb spamwich

To print all lines containing had followed by lamb, enter

grep 'had.*lamb' poem.txt
Mary had a little lamb
Jill had a lamb spamwich
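
Flags and patterns can also be mixed in one command. As a rough sketch, the script below recreates poem.txt in a temporary directory (the directory and variable names are assumptions made for this example) and then combines -i with -n, uses both anchors in one pattern, and counts lines with -c.

```shell
# Recreate the sample poem in a scratch directory (hypothetical location).
workdir=$(mktemp -d)
cat > "$workdir/poem.txt" <<'EOF'
Mary had a little lamb
Mary fried a lot of spam
Jack ate a Spam sandwich
Jill had a lamb spamwich
EOF

# Line numbers of every line mentioning spam, in any mix of case.
spam_numbered=$(grep -in 'spam' "$workdir/poem.txt")
echo "$spam_numbered"

# Both anchors at once: lines that start with Mary and end with spam.
mary_spam=$(grep '^Mary.*spam$' "$workdir/poem.txt")
echo "$mary_spam"

# Count of the lines that start with a capital J.
j_count=$(grep -c '^J' "$workdir/poem.txt")
echo "$j_count"

rm -rf "$workdir"
```

The second command prints only "Mary fried a lot of spam", since the other Mary line does not end with spam.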

If you want to learn more about regular expressions, start with the man 7 regex command (some other Unix systems name the page regexp instead). There's also a good book called Mastering Regular Expressions, by Jeffrey Friedl, published by O'Reilly & Associates.

For more information on the grep command, see the grep manual.




Comments - most recent first
(Please feel free to answer questions posted by others!)

John Hsu (04 Oct 2013, 19:45)
Remove tons of foo* files
ll | grep foo | awk '{print "rm "$8" "}' | csh

Note:
"$8" -- column 8 of the ll command



Thanks
Sincerely
John Hsu
johnthsu@hotmail.com
rajan (07 Nov 2012, 05:14)
Find the occurrence of three consecutive and identical word characters using 1. grep 2. sed
DIVYA K (14 Mar 2012, 23:06)
I have a doubt. I want code for searching two different fields from a record and displaying those fields in full, in Perl, with correct examples. Please take care of this and send it to my mail ID.
giridhar (07 Mar 2012, 05:52)
Hi Bob. Can we use multiple commands on a single grep command?
Bob Rankin (20 Jul 2011, 12:49)
Correct, that's all the cut command does. I'll give hints, but I won't do your homework. :-)
Francis (20 Jul 2011, 12:19)
Thanks. I have tried it, and it lists all the values from the 807th position, but it did not give me the total count of records where the 807th position has the value "E".
Bob Rankin (20 Jul 2011, 09:45)
@Francis - You can use the command
cut -c807
to isolate the 807th character.
Francis (20 Jul 2011, 00:44)
I need to get a total count of the records in a file whose 807th position holds the letter 'E'. Please help. Thanks
Andre (05 Jul 2011, 22:51)
Hi Bob,

Thanks for the tutorial. It's helping me as I read another book on Linux.

Though very good (it covers all flavours of Linux), I find myself having to refer to your website to expound and understand particular commands, like 'grep'.

Thanks, and have just recommended your site to a friend dipping his toes into Linux!

Keep up the good work.
Arturo (06 Oct 2010, 13:12)
Hi everyone, I have a question, I hope someone can help me.

I have a lot of text files, each containing records. I need to count how many records have a particular value at a fixed position. If I use the grep command, it counts the records that contain the search value regardless of its position. Do you know of a command that can help me?
mahesh (03 Sep 2010, 09:41)
What is the command to list all lines containing a space in the /etc/passwd file?
Parash (11 Aug 2010, 10:14)
Hi Bob, I have a requirement which is a bit tricky and which I am unable to resolve. I have a list of table names in a directory. Some of the tables in the list contain "$" in their names (system tables). There are also some other files in the same directory, and some of their names contain names from my table list. Now when I run egrep (I am also using some other check conditions, such as files beginning with ^table|^p_) with each name from the table list in a loop over the directory, I get the error "egrep: $ anchor not at end of pattern." for those table names which have $ as part of their name. I am unable to escape it. I am also including the directory structure and the logic I am applying. Please understand that my requirement does not include replacing the "$" in the file name. Looking forward to your help. Thanks.

My table list
$ cat table.list
ACTIVE_LOGINS
AB$_BC$_MEM_MC_G
ATTACHMENT

also in the same directory there are files such as:
$
$ ls -1 *AB*
index_AB$_TABLE1_I.sql.TEMPLATE
table_AB$_BC$_MEM_MC_G.sql.TEMPLATE
view_AB$BC$_MEM_MC_S.sql


My Requirement logic:
for table in `cat table.list`
do
echo table=$table
ORIG_TABLE_FILE=`ls | egrep "(^p_|^table_)"${table}".sql.TEMPLATE"`
echo ORIG_TABLE_FILE=$ORIG_TABLE_FILE
for file in $(egrep "(ON ${table} \()|(TABLE ${table}\$)" `ls . | grep -v ^table_ | grep -v ^p_` | cut -f1 -d: | uniq)
do
echo file=$file
done
done
sapna (05 Jul 2010, 03:23)
How can I select a few files with similar names (differing only by a number at the end) and put the selected ones in another folder?
umar ayaz (16 Jun 2010, 05:20)
Good for basic understanding
Tom (12 May 2010, 00:00)
Hi all... I have a doubt. Can we give a directory name instead of a filename in the grep command?
Bob Rankin (20 Apr 2010, 17:58)
@agh - Fixed now, thanks!
agh (16 Apr 2010, 07:22)
Thank you for the tutorial. I was wondering why the first line in the example got picked up by grep. I guess that maybe it is a small mistake.

grep 'number' /var/mail/hermie
can call No Starch Press at 800/420-7240. Office hours are

No number on this line :).

therealenki (09 Apr 2010, 07:15)
Hi Bob,
A useful addition to this page would be recursive use of grep. Lots of folks search and ask about this, and its use is NOT obvious cos of shell wildcard expansion.
Example:
Search all 'c' source files in this and sub directories for the text 'link(something) list' starting at the current directory.

Don't use: grep -r 'link.* list' *.c (doesn't work - file not found)

Use: grep -r 'link.* list' . --include "*.c"
(of course this presumes that ur copy of grep actually has the recurse functionality built in otherwise u will need to combine find & grep)

Hope this helps others.
Arlino (22 Feb 2010, 07:11)
Hi Bob, thanks for the hint. Although the grep help text seems clear, I was wondering what "-i, --ignore-case ignore case distinction" means (extracted from grep --help). I'm not a native English speaker, but reading your post cleared everything up. It seems silly, I know. Again, thanks a lot.

I welcome your comments. However... I am puzzled by many people who say "Please send me the Linux tutorial." This website *is* your Linux Tutorial! Read everything here, learn all you can, ask questions if you like. But don't ask me to send what you already have. :-)

Copyright © by - Privacy Policy
All rights reserved - Redistribution is allowed only with permission.
