Thursday, March 17, 2011

UNIX Text Processing Commands


Cut command.
The cut command selects a list of columns or fields from each line of one or more files.
Option -c selects columns (character positions) and -f selects fields. It is entered as
cut options [files]
For example, if a file named testfile contains

this is firstline
this is secondline
this is thirdline
Examples:
cut -c1,4 testfile will print this to standard output (the screen):
ts
ts
ts
It prints columns 1 and 4 of each line of the file, which contain t and s (both part of the word "this").
    Options:
  • -c list cuts the column positions identified in list.
  • -f list cuts the fields identified in list (fields are separated by a tab by default).
  • -s can be used with -f to suppress lines without delimiters.
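
As a rough sketch of the options above, the following recreates testfile in a scratch directory and tries -c with a range, then -f together with -s. The -d option, which sets the field delimiter, is not in the list above and is brought in here as an assumption so that the space-separated sample lines can be split into fields.

```shell
# Recreate the sample file in a scratch directory.
tmpdir=$(mktemp -d)
printf 'this is firstline\nthis is secondline\nthis is thirdline\n' > "$tmpdir/testfile"

# -c with a range: columns 1 through 4 of every line.
cols=$(cut -c1-4 "$tmpdir/testfile")
echo "$cols"

# -f with -d: the third space-separated field of every line.
fields=$(cut -d' ' -f3 "$tmpdir/testfile")
echo "$fields"

# -s drops lines that do not contain the delimiter at all.
kept=$(printf 'a b\nnodelimiter\nc d\n' | cut -d' ' -f1 -s)
echo "$kept"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

Without -s, the line "nodelimiter" would be passed through unchanged.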

Paste Command.
The paste command merges the corresponding lines of one or more files into vertical columns separated by a tab.
For example, if a file named testfile contains
this is firstline
and a file named testfile2 contains
this is testfile2
then running this command
paste testfile testfile2 > outputfile
will put this into outputfile
this is firstline       this is testfile2 
It contains the contents of both files side by side in columns.
who | paste - - will list users in two columns.
    Options:
  • -d'char' separates columns with char instead of a tab.
  • -s merges subsequent lines from one file into a single line.
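
A small sketch of these options, assuming the same two one-line files as in the example above. The sample input fed through printf is made up for illustration.

```shell
# Recreate the two sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'this is firstline\n' > "$tmpdir/testfile"
printf 'this is testfile2\n' > "$tmpdir/testfile2"

# -d: join the files with a comma instead of a tab.
merged=$(paste -d',' "$tmpdir/testfile" "$tmpdir/testfile2")
echo "$merged"

# -s: merge all lines of one input into a single line.
serial=$(printf 'a\nb\nc\n' | paste -s -d',' -)
echo "$serial"

# "- -" reads standard input twice per output line, giving two columns.
pairs=$(printf '1\n2\n3\n4\n' | paste - -)
echo "$pairs"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

The last command is the same trick as who | paste - -, only with fixed input so the result does not depend on who is logged in.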

Sort command.
The sort command sorts the lines of a file or files in alphabetical order. For example, if you have a file named testfile with these contents
zzz
aaa
1234
yuer
wer
qww
wwe
Then running
sort testfile
will give us output of
1234
aaa
qww
wer
wwe
yuer
zzz
    Options:
  • -b ignores leading spaces and tabs.
  • -c checks whether files are already sorted.
  • -d ignores punctuation.
  • -i ignores non-printing characters.
  • -n sorts in arithmetic order.
  • -ofile puts the output in file.
  • +n [-m] skips n fields before sorting, and sorts up to field position m. This is the historical syntax; POSIX sort expresses the same thing with -k.
  • -r reverses the order of the sort.
  • -u identical lines in the input file appear only one time in the output.
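
The following sketch tries a few of these options on the testfile contents shown earlier and on some made-up input. It uses -k for the field sort, on the assumption that your sort follows POSIX and may no longer accept the +n form.

```shell
# Recreate the sample file in a scratch directory.
tmpdir=$(mktemp -d)
printf 'zzz\naaa\n1234\nyuer\nwer\nqww\nwwe\n' > "$tmpdir/testfile"

# -r: reverse the order of the sort.
reversed=$(sort -r "$tmpdir/testfile")
echo "$reversed"

# -o: write the sorted result to a file instead of the screen.
sort -o "$tmpdir/sorted" "$tmpdir/testfile"
first=$(head -n 1 "$tmpdir/sorted")
echo "$first"

# -n: arithmetic order, so 9 comes before 10 and 100.
numeric=$(printf '10\n9\n100\n' | sort -n)
echo "$numeric"

# -u: identical lines appear only once.
unique=$(printf 'b\na\nb\n' | sort -u)
echo "$unique"

# -k2,2n: sort numerically on the second field only.
byfield=$(printf 'x 3\ny 1\nz 2\n' | sort -k2,2n)
echo "$byfield"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```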

Uniq command.
The uniq command removes duplicate adjacent lines from a sorted file, sending one copy of each line to a second file (or to standard output).
Examples

sort names | uniq -d will show which lines appear more than once in the names file.
    Options:
  • -c print each line once, counting instances of each.
  • -d print duplicate lines once, but no unique lines.
  • -u print only unique lines.
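
A minimal sketch of the three options, assuming a names file with the made-up contents below. The awk step after uniq -c is only there to normalize the leading spaces in front of each count, which differ between systems.

```shell
# Build a hypothetical names file in a scratch directory.
tmpdir=$(mktemp -d)
printf 'bob\nalice\nbob\ncarol\nalice\nbob\n' > "$tmpdir/names"

# -d: lines that appear more than once, printed once each.
dups=$(sort "$tmpdir/names" | uniq -d)
echo "$dups"

# -u: lines that appear exactly once.
singles=$(sort "$tmpdir/names" | uniq -u)
echo "$singles"

# -c: every distinct line once, prefixed with its count.
counts=$(sort "$tmpdir/names" | uniq -c | awk '{print $1, $2}')
echo "$counts"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

The sort in front of uniq matters: uniq only compares adjacent lines, so unsorted duplicates would not be detected.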

Awk and Nawk command.
awk is more of a scripting language, available on all Unix systems, and is mostly used for text processing.
Here is an example in which it is connected with another command.
Examples:
df -t | awk 'BEGIN {tot=0} $2 == "total" {tot=tot+$1} END {print (tot*512)/1000000}'
will give the total space on your system in megabytes.
Here the output of the command df -t is passed into awk, which adds up field 1 on every line whose field 2 is "total". The sum is a count of 512-byte blocks, so it is multiplied by 512 and divided by 1000000 to get megabytes. In the same way, if you change $1 to $4 it will accumulate and display the sum of field 4, which is used space. The layout of df -t output differs between systems, so the field numbers may need adjusting.
For more information about the awk and nawk commands on your system, enter man awk or man nawk.
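
Because df -t prints different things on different systems, the same awk program can be tried on fixed input first. The three input lines below are invented for illustration and are not real df output; they only follow the layout the program expects, with a block count in field 1 and a label in field 2.

```shell
# Feed made-up "count label" lines to the same awk program.
# Only lines whose second field is "total" are added up.
megabytes=$(printf '1000000 total\n12345 other\n953125 total\n' |
  awk 'BEGIN {tot=0} $2 == "total" {tot=tot+$1} END {print (tot*512)/1000000}')
echo "$megabytes"
```

The two matching lines add up to 1953125 blocks of 512 bytes, which is 1000000000 bytes, so the program prints 1000.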

Sed command.
The sed command launches a stream editor which you can use at the command line.
You can enter your sed commands in a file and then apply them to your text file using the -f option. It works as
sed [options] files 
    options:
  • -e 'instruction' Apply the editing instruction to the files.
  • -f script Apply the set of instructions from the editing script.
  • -n suppress default output.
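
A short sketch of the three options, assuming the testfile contents from the cut section. The script name edits.sed and the editing instructions themselves are made up for illustration.

```shell
# Recreate the sample file in a scratch directory.
tmpdir=$(mktemp -d)
printf 'this is firstline\nthis is secondline\nthis is thirdline\n' > "$tmpdir/testfile"

# -e: apply one editing instruction (replace the first "this" on each line).
replaced=$(sed -e 's/this/that/' "$tmpdir/testfile")
echo "$replaced"

# -n: suppress the default output, so only the line printed by "2p" appears.
second=$(sed -n '2p' "$tmpdir/testfile")
echo "$second"

# -f: keep the instructions in a script file and apply them all.
printf '/thirdline/d\ns/line/LINE/\n' > "$tmpdir/edits.sed"
scripted=$(sed -f "$tmpdir/edits.sed" "$tmpdir/testfile")
echo "$scripted"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

None of these commands change testfile itself; sed writes the edited text to standard output.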

For more information about sed, enter man sed at the command line on your system.

Vi editor.
The vi command launches a visual editor. To edit a file, type
vi filename
The vi editor is the default editor on Unix systems. It has several modes. In order to write characters you will need to hit i to be in insert mode and then start typing. Make sure that your terminal has the correct settings; vt100 emulation works well if you are logged in from a PC.
Once you are done typing, hit <escape> to return to command mode, where you can write, search, and so on. Hit :w filename to write the file,
and in case you are done writing and want to exit,
:wq will write and exit; :w! forces the write but does not exit.

    options:
  • i for insert mode.
    • i inserts text at the cursor.
    • A appends text at the end of the line.
    • a appends text after the cursor.
    • O opens a new line of text above the cursor.
    • o opens a new line of text below the cursor.
  • <escape> for command mode.
    • <escape> to invoke command mode from insert mode.
    • :!sh to run unix commands.
    • x to delete a single character.
    • dd to delete an entire line.
    • ndd to delete n number of lines.
    • d$ to delete from the cursor to the end of the line.
    • yy to copy a line to the buffer.
    • P to paste text from the buffer.
    • nyy to copy n number of lines to the buffer.
    • :%s/stringA/stringB/g to replace stringA with stringB in the whole file.
    • G to go to the last line in the file.
    • 1G to go to the first line in the file.
    • w to move forward to the next word.
    • b to move backwards to the previous word.
    • $ to move to the end of the line.
    • J to join a line with the one below it.
    • /string to search for string in the file.
    • n to search for the next occurrence of string.
