Linux awk command with 10 examples

by Nitish.S
Computers have long been used to analyze valuable data stored in plain text files, and the systems themselves are managed largely through log files. What the two situations have in common is a large amount of data that usually needs to be filtered before anyone can read it comfortably; otherwise, it is just confusing.

For example, when you read data arranged in tabular form, you often want some columns but not others.

This need was pressing even back in the day, and consequently Unix has long had a powerful tool that helps users filter and extract data. That tool is AWK, and the GNU Project maintains one of its most widely used implementations.

History

AWK is a programming language dedicated to text processing, used to extract data that matches specific patterns. It was developed at Bell Labs in the 1970s by Alfred Aho, Peter Weinberger, and Brian Kernighan (the initials of their surnames gave the language its name). Development didn't stop there. A new version was introduced in 1985, adding the ability to handle multiple input streams, computed regular expressions, user-defined functions, and much more. In short, the update made awk a more powerful programming language.

Awk has several implementations. One of them is gawk, the GNU implementation, which Paul Rubin wrote in 1986 and which is compatible with the newer awk.

Apart from that, the awk shipped with System V Release 4 in 1989 added new features and cleaned up some of the language's dark corners, making it easier for programmers and users to work with. Around 1997, gawk gained network access, letting awk programs communicate with remote machines.

The latest major rewrite came in 2011, when John Haque rewrote the gawk internals.

Things you can do with AWK

By now, it should be somewhat clear what AWK is capable of. It is a general-purpose scripting language built for text processing. Advanced users can also use it for analysis and reporting.

AWK differs from most programming languages in that it is data-driven rather than procedural: you describe the data you are interested in and the action to take when it is found. In short, you can use it to read input, transform it, and send the result to standard output.

You can use AWK for simple operations such as scanning a file line by line, splitting each input line into fields, and performing actions on matching lines. The awk command is also useful for producing formatted reports and transforming data files. Moreover, you get programming constructs such as conditionals and loops, string and arithmetic operations, and formatted output.

How awk works

In this section, we are going to learn how awk works. Keep in mind that awk has several implementations. To ensure that we are on the same page, we will discuss and use the GNU implementation, popularly known as gawk. On many Linux distributions, the awk command is a symlink to gawk.

To get a good understanding, we first need to understand records and fields.

Awk can process text streams and data files. To process the data, the input is divided into records, and each record is divided into fields. Awk processes one record at a time until the end of the input is reached. Records are delimited by the record separator, which is a newline character by default. This means that each line is a record.

You can choose to set a new record separator using the RS variable.

Next comes the field separator. Each record consists of fields, and the fields are separated by the field separator. By default, the field separator is any run of whitespace, such as spaces and tabs. Each field is referenced using the $ symbol, and field numbers start at 1. This means that the first field is $1, the second field is $2, and the nth field is $n. The built-in variable NF holds the number of fields in the current record, so $NF refers to the last field.
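
To make records and fields concrete, here is a minimal sketch. The sample names and cities are made up for illustration, and the shell variable names (sample, fields, records) are our own. It prints the first field, the last field, and the field count of each line, and then sets RS so that records are split on semicolons instead of newlines.

```shell
# Hypothetical sample input: two records with whitespace-separated fields.
sample='alice 30 london
bob 25 paris'

# $1 is the first field, $NF is the last field, NF is the number of fields.
fields=$(printf '%s\n' "$sample" | awk '{ print $1, $NF, NF }')
echo "$fields"

# Setting RS to ";" makes awk split records on semicolons instead of newlines.
# NR is the number of the current record.
records=$(printf 'one;two;three' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
echo "$records"
```

Each line of the first output should show a name, a city, and the number 3, while the second output should number the three semicolon-separated records.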

The awk program

The second aspect of awk is the awk program. To work with awk, you write a program that tells the command how to process the text. An awk program is made up of rules and user-defined functions. Each rule consists of a pattern, an action, or both, and rules are separated by semicolons or newlines.

In case you are wondering, an awk program looks like this.

pattern { action }
pattern { action }
....

In short, awk works by matching records against patterns. If a record matches a rule's pattern, awk performs that rule's action on it. If a rule has no pattern, its action is performed on every record. If a rule has no action, every matching record is printed as it is.
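
The three kinds of rules can be sketched as follows. The log-like sample lines are invented for illustration, and the shell variable names are our own.

```shell
# Hypothetical sample input resembling a small log file.
sample='error disk full
info backup done
error network down'

# Pattern with action: the action runs only for records matching /error/.
with_action=$(printf '%s\n' "$sample" | awk '/error/ { print $2 }')
echo "$with_action"

# Pattern without action: matching records are printed whole.
pattern_only=$(printf '%s\n' "$sample" | awk '/info/')
echo "$pattern_only"

# Action without pattern: the action runs for every record.
action_only=$(printf '%s\n' "$sample" | awk '{ print NR }')
echo "$action_only"
```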

awk Command Examples

Now that we have a good understanding of the awk command and how it works, it is now time for us to check out some of the awk command examples.

If you have never used awk before, you may want to know that awk can be used with options like below:

awk options program file

The options that you can use with awk include the following:

  • -f file: It is used to specify the file that contains the awk script.
  • -F fs: It is used to specify the field separator.
  • -v var=value: It is used to declare a variable.
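
As a quick sketch of the -F and -v options working together, the command below splits a colon-separated line and prints two of its fields around a variable. The sample line only imitates the shape of an /etc/passwd entry, and the variable name label is made up for this example.

```shell
# Hypothetical colon-separated input, similar in shape to an /etc/passwd entry.
line='alice:x:1001:1001::/home/alice:/bin/bash'

# -F: sets the field separator to a colon.
# -v label=home defines the awk variable "label" before the program starts.
result=$(printf '%s\n' "$line" | awk -F: -v label="home" '{ print $1, label, $6 }')
echo "$result"
```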

Example 1: Read AWK Scripts

One of the most common ways of using awk is to pass it a short script directly on the command line. As a Linux user, you write the script between single quotation marks.

To do so, you need to type the following command in the terminal.

$ awk '{print "Welcome to Hello, World -- AWK tutorial"}'

In the above example, no input file is given, so awk reads from standard input. Each time you type a line and press Enter, awk prints the message to the screen. The command keeps running until you end it by pressing CTRL + D.

Example 2: Using Multiple Commands

Another common use of awk is to run multiple commands in one program. To combine commands, separate them with semicolons. In this example, we replace the second word of the input string and then print the result.

$ echo "Hello World" | awk '{$2 = "Universe"; print $0}'

In the above example, we first echoed "Hello World" and piped it to awk. The awk program then replaced the second field with Universe and finally printed the whole record, which is now Hello Universe.

Example 3: Using Variables

Variables let you store information and access it later. If you have used programming languages before, you surely know about them. Because awk is used to process text files, it provides field variables that give you access to individual data fields within a file, as shown below.

For this purpose, we created a new text file, mynewfile, containing a few lines of sample text.

Next, you need to run the command, as shown below.

$ awk '{print $1}' mynewfile

As you can see, the command prints $1, which is the first field of every line in the file.

Example 4: AWK preprocessing

With the awk command, you can add preprocessing, which is a step that runs before any input is read. To do so, you need to use the BEGIN keyword.

Recall that we created a new file above. Let's use awk preprocessing to print a heading before the content of the file.

The command for it is as below.

$ awk 'BEGIN {print "The content of the file:"}
> {print $0}' mynewfile

Note that the keyword must be written in uppercase. If you type "Begin" instead of "BEGIN," awk treats it as an ordinary, unset variable, so the print statement never executes. I leave this for you to try out and see how your result goes!

Example 5: Reading Script From File

This one is a little trickier. Here, awk reads its script from a file instead of from the command line.

We create a new script that contains the following.

{print $1 " universe starts at " $6}

We saved the file as newscript.

Now, run the following command at the terminal.

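$ awk -F: -f newscript /etc/passwd

With the colon as the field separator, $1 is the user name and $6 is the home directory of each entry in /etc/passwd. Fascinating, right?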
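
A script file can also set the field separator itself, which makes the -F option unnecessary. The sketch below assumes a temporary script file created with mktemp and a made-up passwd-style line, so it does not depend on the real /etc/passwd.

```shell
# Write a small awk program to a temporary file (the file name is arbitrary).
script=$(mktemp)
cat > "$script" <<'EOF'
# Set the field separator inside the script instead of using -F.
BEGIN { FS = ":" }
# Print the user name and home directory of each record.
{ print $1 " lives in " $6 }
EOF

# Hypothetical passwd-style input line.
out=$(printf 'alice:x:1001:1001::/home/alice:/bin/bash\n' | awk -f "$script")
echo "$out"
rm -f "$script"
```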

Example 6: AWK Post-processing

Next, we take a look at AWK post-processing. It works like preprocessing, but this time the END keyword is used, and its action runs after all input has been processed.

$ awk 'BEGIN {print "The file content starts now:"}
>
> {print $0}
>
> END {print "The File ends"}' mynewfile
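
BEGIN and END are often combined with a regular rule to build a small report. Here is a minimal sketch, assuming made-up item names and prices, that prints a heading, lists each item, and prints a total at the end.

```shell
# Hypothetical data: an item name and a price on each line.
data='apple 3
pear 5
plum 4'

report=$(printf '%s\n' "$data" | awk '
BEGIN { print "Item report" }       # runs once, before any input is read
      { total += $2; print $1 }     # runs for every record
END   { print "Total:", total }     # runs once, after the last record
')
echo "$report"
```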

Example 7: User-defined variables

You can also define your own variables within an awk program. Their names do not use a dollar sign, and they cannot start with a digit.

Below is an example.

$ awk '
BEGIN{
test = "Welcome to FossLinux Awesome Linux Family"
print test
}
'
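
User-defined variables do not need to be declared before use. As a small sketch with made-up input, the variable hits below counts matching lines, and msg holds a string that is joined with the count when printing. Both names are our own.

```shell
counts=$(printf 'a\nb\na\n' | awk '
$1 == "a" { hits++ }      # an unset variable starts out as 0 (or an empty string)
END {
    msg = "lines matching a: "
    print msg hits        # placing two values side by side concatenates them
}')
echo "$counts"
```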

Example 8: Built-in Functions

The awk command also comes with handy built-in functions. For example, you can use mathematical functions as well as string functions.

$ awk 'BEGIN {x = "fossLinux"; print toupper(x)}'
$ awk 'BEGIN {x=exp(35); print x}'
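
A few more built-in functions are sketched below. All of them are standard awk functions; the variable names s and out are just for illustration.

```shell
out=$(awk 'BEGIN {
    s = "fossLinux"
    print length(s)            # number of characters in the string
    print substr(s, 1, 4)      # four characters starting at position 1
    print toupper(s)           # uppercase copy of the string
    print index(s, "Linux")    # 1-based position of the first match
    print int(7.9), sqrt(16)   # truncate to an integer, square root
}')
echo "$out"
```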

Example 9: Formatted Printing

You can also format output with the printf function that comes with awk. There are many format modifiers you can use. For instance, c prints a single character, s prints a string, d prints an integer value, e prints a number in scientific notation, and so on.

$ awk 'BEGIN {
x = 200 * 200
printf "The result is: %e\n", x
}'
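
To compare several modifiers side by side, here is a short sketch that reuses the same value of x. The widths and sample strings are arbitrary choices for illustration.

```shell
out=$(awk 'BEGIN {
    x = 200 * 200
    printf "%d\n", x                            # integer
    printf "%e\n", x                            # scientific notation
    printf "%.2f\n", x / 3                      # fixed-point with two decimals
    printf "%s|%5s|%-5s|\n", "ab", "ab", "ab"   # plain, right-padded, left-padded
    printf "%c\n", "hello"                      # first character of a string
}')
echo "$out"
```

Unlike print, printf does not add a newline on its own, which is why each format string ends with \n.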

Example 10: Structured Commands

You can also use structured commands such as if, else, while, and for loops. Let's see an example of the if statement below. It prints the second field of every line whose first field is greater than 20.

$ awk '{if ($1 > 20) print $2}' mynewfile
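
The else branch and the for loop work the same way as in other languages. The sketch below assumes made-up temperature data; the threshold of 20 mirrors the example above.

```shell
# Hypothetical data: a number followed by a word on each line.
data='25 warm
10 cold
30 hot'

# if/else: label each record by comparing the first field with 20.
labels=$(printf '%s\n' "$data" | awk '{
    if ($1 > 20) print $2, "above"
    else         print $2, "below"
}')
echo "$labels"

# for loop: add up the integers from 1 to 5.
sum=$(awk 'BEGIN { for (i = 1; i <= 5; i++) s += i; print s }')
echo "$sum"
```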

Conclusion

This brings us to the end of our awk command tutorial. Did you find it useful, and are you going to use it in your work? Comment below and let us know.


©2016-2023 FOSS LINUX
