Awk is my favorite CLI program for parsing input. In most cases I just use it to cut fields out of some output, where it is superior to the cut utility: it lets me set a custom, multi-character field separator instead of the single character that cut allows.
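A minimal sketch of the multi-character separator in action. The sample string and the variable name second are made up for illustration; a separator longer than one character is treated by awk as a regular expression.

```shell
# Split on the two-character separator "::" and keep the second field.
# (Sample input and the variable name "second" are illustrative only.)
second=$(printf 'alpha::beta::gamma\n' | awk -F '::' '{ print $2 }')
echo "$second"   # prints: beta
```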


{} – block executed once for each line of the input.

BEGIN{} – block executed once at the start. Useful for setting variables such as a custom field separator FS. You could also set FS in the normal block {}, but then the assignment is repeated for every line of input, which is slower, and it only takes effect from the next line onward.

END{} – block executed once at the end. Useful for printing statistics if you count something.
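The three kinds of block can be combined in one program. The following is only a sketch on made-up input: BEGIN sets the separator, the plain block adds up the second field of every line, and END prints the totals. The function name sum_second is hypothetical.

```shell
# BEGIN runs once before any input, the plain block runs for every line,
# END runs once after the last line. (sum_second is an illustrative name.)
sum_second() {
    awk 'BEGIN { FS = ":" }
         { sum += $2 }
         END { print "lines=" NR " total=" sum }'
}

result=$(printf 'a:1\nb:2\nc:3\n' | sum_second)
echo "$result"   # prints: lines=3 total=6
```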


FS – input field separator. default is any amount of whitespace – one or more spaces or tabs.

OFS – output field separator, defaults to space

RS – row or record separator. default is newline.
ORS – output record separator. Set this to " " to replace the newlines in the output with spaces.
NR – line/record number of the currently parsed input
NF – number of fields/columns on the current line/record
$1, $2, $3 … $NF – reference columns 1, 2, 3 and so on up to the last one, which can always be referenced as $NF regardless of whether it is the 10th or the 100th.
$0 – references entire line/record, all fields/columns and separators.
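A short sketch of these variables on made-up input. The first command prints the record number, the field count and the last field joined by a custom OFS; the second uses ORS to put all records on one line. The variable names out and joined are only for illustration.

```shell
# NR = record number, NF = number of fields, $NF = last field.
# OFS is placed between the comma-separated arguments of print.
out=$(printf 'a b c\nd e\n' | awk 'BEGIN { OFS = "-" } { print NR, NF, $NF }')
echo "$out"
# prints:
# 1-3-c
# 2-2-e

# ORS replaces the newline that print normally appends to each record.
# Note that the result ends with a trailing space.
joined=$(printf 'a\nb\nc\n' | awk 'BEGIN { ORS = " " } { print $0 }')
echo "$joined"
```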

# awk print out last field using separator “.”

# echo "one.two.three" |awk 'BEGIN{FS="."}{print $NF;}'

# awk print out last but one field using separator “.”. This is also useful to remove file extension from list of files.

$ echo "one.two.three" |awk 'BEGIN{FS="."}{X=NF-1; print $X;}'
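# remove file extension - quick and dirty solution. For a name with more than one . it prints only the piece before the last dot (archive.tar.gz gives tar), and a name starting with a . such as .bashrc gives an empty line. A name without any . is printed unchanged, because $0 is the whole line.
$ find . -type f -printf '%f\n' |awk 'BEGIN{FS="."}{X=NF-1; print $X;}'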
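A slightly more careful sketch, in case the quick version is not enough. Instead of splitting on dots it uses match() to find the last extension and cuts it off, so names with several dots lose only the final extension, and names without an extension (or dotfiles) are left alone. The function name strip_ext and the sample file names are made up.

```shell
# Strip only the last extension. The regex requires one character before
# the dot, so a leading dot (as in .bashrc) is never treated as an extension.
strip_ext() {
    awk '{
        name = $0
        if (match(name, /.\.[^.]*$/))
            name = substr(name, 1, RSTART)
        print name
    }'
}

stripped=$(printf 'notes.txt\narchive.tar.gz\nREADME\n.bashrc\n' | strip_ext)
echo "$stripped"
# prints:
# notes
# archive.tar
# README
# .bashrc
```

It can be fed from find the same way, for example: find . -type f -printf '%f\n' | strip_ext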

# awk – print fields 1-3 only from matching line using separator “:”. Simplified version using grep below.

# awk 'BEGIN{FS=":";}/root/{print $1, $2, $3;}' < /etc/passwd
# grep root /etc/passwd| awk 'BEGIN{FS=":";}{print $1, $2, $3;}'

# find all unique file extensions

# find /path/to/files/ -type f |awk 'BEGIN{FS=".";}{print $NF;}' |sort |uniq
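If you also want to know how many files there are per extension, awk can do the counting itself with an associative array. This is only a sketch: count_ext is a made-up name, the input is assumed to be bare file names (one per line, no directory part), and names without a dot are skipped. The order of a for-in loop is not defined, so sort puts the output in order.

```shell
# Count files per extension; NF > 1 skips names that contain no dot.
count_ext() {
    awk -F '.' 'NF > 1 { count[$NF]++ }
                END { for (ext in count) print count[ext], ext }' | sort -k2
}

counts=$(printf 'a.txt\nb.txt\nc.log\nMakefile\n' | count_ext)
echo "$counts"
# prints:
# 1 log
# 2 txt
```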

Arrays, loops, conditional expressions

The following example queries a NetBackup master server for information about client backups and parses it using if-else statements, for loops, arrays and array sorting.

hostlist should contain the hostnames of the NetBackup clients. The for loop runs the bpclimagelist command for each client and parses the output. Empty BEGIN blocks could be omitted, but it is my habit to keep them.

for h in `cat hostlist`;do

/usr/openv/netbackup/bin/bpclimagelist -client ${h} -server |awk -v hostname="${h}" 'BEGIN{}{if (lastincremental == "" && $7 == "Incr" && $8 == "Backup"){lastincremental=$1;}; if (lastfull == "" && $7 == "Full" && $8 == "Backup"){lastfull=$1;}; if (lastfull != "" && lastincremental != ""){exit}; }END{if (lastincremental == ""){lastincremental="n/a"}; if (lastfull == ""){lastfull="n/a"}; print hostname," ",lastincremental," ",lastfull;}' >> datafile.txt

done


Parse datafile.txt with another awk and produce a report of missing backups. asort() is a gawk extension, so this one needs gawk. It renumbers the sorted array starting from 1, which is why the print loops run from 1 up to the element count.

grep "n/a" datafile.txt |awk 'BEGIN{}{if($2 == "n/a" && $3 == "n/a" ){both[bothlen++]=$1} else if($2 == "n/a"){incr[incrlen++]=$1} else if( $3 == "n/a" ){full[fulllen++]=$1};}END{if (bothlen > 0){asort(both);print "Servers without full or incremental successful backups"; for (b = 1; b <= bothlen; b++){print both[b];}; print "";}; if ( incrlen > 0){ asort(incr); print "Servers without incremental successful backups"; for (i = 1; i <= incrlen; i++){print incr[i];}; print "";}; if ( fulllen > 0 ){ asort(full); print "Servers without full successful backups"; for (f = 1; f <= fulllen; f++){print full[f];};}; }'
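The classification logic is easier to follow when spread over several lines. Below is a rough sketch of the same if / else-if chain that runs on any awk: instead of collecting names in arrays and calling asort(), it prints a category next to each hostname and lets sort do the ordering. The function name classify, the hostnames and the dates are all made up, and the input is assumed to have the same three columns as datafile.txt (hostname, last incremental, last full).

```shell
# Tag each host with what is missing: both, incr or full.
# Hosts that have both backups match no branch and are not printed.
classify() {
    awk '{
        if ($2 == "n/a" && $3 == "n/a")  print "both", $1
        else if ($2 == "n/a")            print "incr", $1
        else if ($3 == "n/a")            print "full", $1
    }' | sort
}

report=$(printf '%s\n' \
    'web2 n/a n/a' \
    'db1 01/02/2024 n/a' \
    'web1 n/a n/a' \
    'app1 n/a 01/01/2024' \
    'ok1 01/02/2024 01/01/2024' | classify)
echo "$report"
# prints:
# both web1
# both web2
# full db1
# incr app1
```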

to be continued…
