How to use AWK command in Linux? (With Examples)

Maaz Bin Asad
Last updated on September 21, 2024

    AWK is a powerful tool for manipulating text and searching for information inside files on Linux. An AWK program, written in AWK's own scripting language, pairs patterns (typically regular expressions) with actions: the command reads a file or a set of files, finds the lines that match each pattern, and performs the corresponding action on them.

    You can use AWK to extract useful data from large text files such as logs, big data sheets, or a stream of data collected over a long period of time. AWK comes pre-installed on virtually all Linux/Unix systems. The name AWK is derived from the initials of the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. Depending on your requirements, you can treat AWK as a text manipulation tool or as a scripting language.

    The Syntax of AWK

    The syntax that the AWK command follows is:

    $ awk [options] 'program' [files]

    The program is a set of search patterns (usually regular expressions) paired with the actions that you want the command to perform on the files supplied after it. The AWK command also accepts several options. These are:

    OPTIONS            FUNCTIONS
    -f program-file    Instead of the CLI, the text for the program is read from a file.
    -F value           It is used to set the separator of fields.
    -v var=value       It is used to set variables.
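
As a minimal sketch of these three options working together, the commands below build a small colon-separated file and filter it. The file names users.txt and filter.awk, their contents, and the variable name want are all made up for illustration.

```shell
# Hypothetical sample data: name:shell pairs separated by colons.
tmpdir=$(mktemp -d)
printf 'alice:bash\nbob:zsh\ncarol:bash\n' > "$tmpdir/users.txt"

# -F sets the field separator; -v passes a value in as an awk variable.
opt_out=$(awk -F: -v want=bash '$2 == want { print $1 }' "$tmpdir/users.txt")
echo "$opt_out"

# -f reads the program text from a file instead of the command line.
printf '$2 == want { print $1 }\n' > "$tmpdir/filter.awk"
file_out=$(awk -F: -v want=zsh -f "$tmpdir/filter.awk" "$tmpdir/users.txt")
echo "$file_out"

rm -rf "$tmpdir"
```

The first command prints alice and carol, and the second prints bob.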

    We can use the manual of the AWK command for a better insight into all the implementation options of this command.

    $ man awk

    Actions, Records, Fields, and Variables

    Based on the program that you define along with the AWK command, the command decides what action to perform on the files. A typical format of an AWK program is:

    CONDITION {ACTION}
    CONDITION {ACTION}
    .
    .

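    In the above format, the CONDITION field specifies the text pattern that you want to match, and the ACTION field defines the action to perform on the records that match. If you leave out the CONDITION, the action runs for every record; if you leave out the {ACTION}, AWK prints each matching record.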
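
The following sketch shows a program with several rules, including one rule with only a condition and one with only an action. The input lines are made up for illustration.

```shell
# Hypothetical input: three lines of made-up text.
rules_out=$(printf 'apple 3\nbanana 12\ncherry 7\n' | awk '
    /^b/   { print "starts with b: " $1 }   # condition and action
    $2 > 5                                   # condition only: prints the whole record
           { count++ }                       # action only: runs for every record
    END    { print "records seen: " count }
')
echo "$rules_out"
```

Every record is tested against every rule in order, so the banana line is reported by the first rule and then printed again by the second.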

    Actions

    Actions are the statements that AWK runs for each matching record. They can perform calculations, assign variables, and call in-built as well as user-defined functions.
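
As a rough illustration, the action below does arithmetic on a field, calls the in-built toupper function, and calls a user-defined function. The price list and the with_tax function are hypothetical.

```shell
# Hypothetical input: item names with prices.
action_out=$(printf 'pen 1.50\nbook 12.00\nbag 30.00\n' | awk '
    # User-defined function: add a flat 10 percent to a price.
    function with_tax(price) { return price * 1.10 }

    {
        total += $2                                    # arithmetic on a field
        printf "%s %.2f\n", toupper($1), with_tax($2)  # in-built and user-defined functions
    }
    END { printf "total %.2f\n", total }
')
echo "$action_out"
```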

    Records

    By default, the AWK command treats each line of the input as a separate record. You can change this default by setting the record separator variable, RS.
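
For instance, here is a small sketch that sets RS to a semicolon so that a single line of input is split into three records. The input is made up for illustration.

```shell
# Hypothetical input: three values separated by semicolons on one line.
rs_out=$(printf 'red;green;blue' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
echo "$rs_out"
```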

    Fields

    By default, the AWK command uses runs of spaces and tabs to split each record into fields. You can choose a different separator with the -F option or the FS variable.
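
A short sketch of both behaviours, using made-up input: the first command relies on the default whitespace splitting, and the second uses -F to split on commas.

```shell
# Default splitting: runs of spaces and tabs act as one separator.
ws_out=$(printf 'one   two\tthree\n' | awk '{ print NF, $2 }')
echo "$ws_out"

# Hypothetical comma-separated input, split with -F.
csv_out=$(printf 'one,two,three\n' | awk -F, '{ print NF, $2 }')
echo "$csv_out"
```

Both commands report three fields and print the second one.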

    Variables

    There are a ton of in-built variables that AWK has already defined for your use. Let’s check some of these variables below.

    VARIABLE        USES
    $0              The whole of the current record.
    $1, $2, etc.    The individual fields of the current record.
    NR              Number of Records: the total number of records (lines) read so far across all the input files.
    FNR             File Number of Records: the number of records read so far from the file currently being read.
    NF              Number of Fields: the count of fields in the record currently being read. The last field is denoted by $NF and the second-to-last by $(NF-1).
    FILENAME        The name of the file currently being read.
    FS              Field Separator: the separator used to split a record into fields; by default, it's runs of spaces and tabs.
    RS              Record Separator: the separator between records in a file; by default, it is the newline character.
    OFS             Output Field Separator: the string placed between fields in the output; by default, it is a single space.
    ORS             Output Record Separator: the string appended after each output record; by default, it is a newline.
    OFMT            Output Format: the format used when printing numerical values; the default is %.6g.
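
To see several of these variables side by side, the sketch below reads two small files and prints FILENAME, NR, FNR, NF, and $NF for every record, joined by a custom OFS. The files first.txt and second.txt and their contents are made up for illustration.

```shell
# Two hypothetical input files with different numbers of fields per line.
tmpdir=$(mktemp -d)
printf 'a b c\nd e\n' > "$tmpdir/first.txt"
printf 'f g h i\n' > "$tmpdir/second.txt"

# OFS is placed between the comma-separated arguments of print.
vars_out=$(cd "$tmpdir" && awk 'BEGIN { OFS = "|" } { print FILENAME, NR, FNR, NF, $NF }' first.txt second.txt)
echo "$vars_out"

rm -rf "$tmpdir"
```

NR keeps counting across both files, while FNR restarts at 1 when the second file begins.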

    The rest of the examples we will walk you through use a sample text file named actors.txt.

    Printing the Contents

    You can print the entire contents of a file to the terminal using the following AWK command.

    $ awk '{print}' actors.txt

    Displaying the total number of Records

    To print the total number of lines or records in a given file, you can use the NR variable inside an END block, which runs after all the input has been read.

    $ awk 'END { print NR }' actors.txt

    Find a Match

    You can use a regular expression to define the pattern that you want to match. For example,

    $ awk '/Geller/' actors.txt

    Note that this prints the whole records in which a match is found, not just the matching fields. As another example, we can print all the records starting with the letter ‘R’.

    $ awk '/^R/' actors.txt

    Playing With Field Variables

    You can use field variables to output specific fields of all the records. For instance, if we want to display the first field of all records that start with the letter R, we can utilize this command.

    $ awk '/^R/ {print $1;}' actors.txt

    Using Pipe with AWK

    You can use the output of other commands by piping it to the AWK command. In this example, we use the ls command to list the contents of the current directory and pipe the output to AWK to display the month in which each file was last modified, which typically sits in the 6th field of each line of ls -l output.

    $ ls -l | awk '{print $6}'
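
Along the same lines, here is a hedged sketch that adds up file sizes from ls -l output. It assumes the size is in the 5th field and that the listing starts with a "total" line, which is the usual layout but can vary between ls implementations. The two files are made up for illustration.

```shell
# Create two hypothetical files of known size in a scratch directory.
tmpdir=$(mktemp -d)
printf 'abc' > "$tmpdir/a.txt"      # 3 bytes
printf 'hello' > "$tmpdir/b.txt"    # 5 bytes

# Skip the leading "total" line, then add up the size column.
size_out=$(ls -l "$tmpdir" | awk 'NR > 1 { sum += $5 } END { print sum }')
echo "$size_out"

rm -rf "$tmpdir"
```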

    Using the in-built Variables

    We can also use the in-built variables we discussed above in our article, along with the awk command to format our output. For example, here, we will output the current record number and the first field of each record separated by a dash.

    $ awk '{print NR "-" $1}' actors.txt

    Merging Actions

    Using the double-ampersand (&&) operator, we can combine multiple conditions so that an action runs only when all of them are true. For example, the following command displays all the records where the first field has more than 4 characters and the second field starts with the letter G.

    $ awk '$2 ~ /^G/ && length($(NF-1)) > 4 { print }' actors.txt

    Here, we have used $(NF-1), the second-to-last field, to access the first field. It is the first field only when a record has exactly two fields, in which case $1 would work just as well. The length function returns the number of characters in a field.
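
The difference between $1 and $(NF-1) shows up as soon as a record has more than two fields. The names in this sketch are made up for illustration.

```shell
# Hypothetical input: one record with two fields, one with three.
nf_out=$(printf 'Ann Green\nMary Jane Grant\n' | awk '{ print $1, $(NF-1) }')
echo "$nf_out"
```

For the two-field record both expressions give Ann, but for the three-field record $1 is Mary while $(NF-1) is Jane.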

    Wrapping Up!

    To conclude, in this detailed article, we discussed a powerful tool and command called AWK, which can be used to search for a pattern in a large number of text files or files with huge sizes very efficiently. We discussed the AWK command's syntax along with important terms such as actions, records, fields, and variables. We also discussed a few in-built variables that the AWK command provides.

    Next, we skimmed through a few practical examples to get hands-on with the AWK command, such as printing contents, finding a match, using field variables, and in-built field variables. Finally, we saw how to combine several conditions with actions to generate the desired filtered output. We certainly hope that through this comprehensive guide, you will understand this powerful tool and be able to use it with ease.
