main.append(addParagraph("We have already seen and used several built-in variables including FS, RS, OFS and ORS. Earlier in the course we also saw NF which is the number of fields in a record."))
main.append(addParagraph("A program whose pattern is NF == 6 and whose action prints NF followed by $0 specifies that the number of fields should be 6. It then outputs the number of fields in each record followed by the record itself, but only where the number of fields in that record is 6."))
main.append(addParagraph("Notice that in this example, we are using NF in both the pattern and the action."))
main.append(addParagraph("Related to NF is NR, which represents the number of the record. Like fields, records are numbered starting from 1. For example"))
main.append(addParagraph("will do the same thing, but only where the record number is 6 so it will output the number six and the record number followed by the record."))
main.append(addParagraph("Notice that when we talk about what will happen, for example 'this command will output', we are talking about the action. That makes sense of course because the action is what the AWK program actually does."))
main.append(addParagraph("When we talk about the conditions under which an action will take place such as 'only when the record number is 6', we are referring to the pattern."))
main.append(addParagraph("An important point to remember is that NR does not mean the record number within the file, although this will often be the same thing, as is the case in the above examples."))
main.append(addParagraph("To illustrate this, consider that the first line of the file dukeofyork.txt is"))
main.append(addSyntax("The grand old Duke of York"))
main.append(addParagraph("and if we output this with"))
main.append(addParagraph("we will see that record shown as record 1. However, you might recall that we can provide more than one file as input, in which case AWK will read through the records of each of the files in turn. For example"))
main.append(addParagraph("will concatenate the two files so it is outputting all of the records from both of the files. The first file is dukeofyork.txt and its records are numbered from 1 to 8. The records from the second file, names.txt, are numbered from 9 so the record"))
main.append(addSyntax("Gretchen Galloway"))
main.append(addParagraph("is now the ninth record."))
main.append(addParagraph("In this example, we can easily see which lines in the output came from dukeofyork.txt and which came from names.txt, but in many cases where we are reading our input from multiple files, that might not be so obvious."))
main.append(addParagraph("Let's say we want to do the same thing but, as well as the record number, we also want our output to show the name of the file a record was read from and the record number within that file. We can do that with the command"))
main.append(addParagraph("This shows that the record is the ninth record that was read in, it was read from the file names.txt and it was the first record in that file."))
main.append(addSyntax("FILENAME = the file that is currently being processed when the record is read in."))
main.append(addSyntax("FNR = the number of the record in the file."))
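/*
Sketch: the difference between NR, FILENAME and FNR described above can be
tried with two small sample files. The file names a.txt and b.txt and their
contents are assumptions made for this illustration; they are not the course
files.

```shell
# Hypothetical sample files standing in for dukeofyork.txt and names.txt.
dir=$(mktemp -d)
printf 'The grand old Duke of York\nHe had ten thousand men\n' > "$dir/a.txt"
printf 'Gretchen Galloway\n' > "$dir/b.txt"

# NR keeps counting across all input files; FNR restarts at 1 for each file.
out=$(cd "$dir" && awk '{print NR, FILENAME, FNR, $0}' a.txt b.txt)
echo "$out"

rm -r "$dir"
```

With these two files, the last line printed should show the overall record
number, then the name of the second file, then 1 as the record number within
that file.
*/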
main.append(addParagraph("The built-in variables available to you may depend on the version of AWK you are using so you might want to check the documentation for your version."))
main.append(addParagraph("Some built-in variables refer to fields. An example of this is $0 which refers to the record (that is, all the fields in a record). We have also seen something like $n which refers to the nth field in a record. We can use $ with any number to refer to a specific field. In addition, we can use a variable or an expression. Consider this example."))
main.append(addParagraph("For any given line, this will output the nth field in each record where n is equal to the total number of fields in that record. For example, if the record contains 6 fields, the output will be field 6 so the effect is that this will output the last field in each record."))
main.append(addParagraph("If we want to use an expression here, we will put it in parentheses. For example, we can output the penultimate field in each record with"))
main.append(addParagraph("If we omit the parentheses and write $NF-1 instead, you might be surprised by the result. For each input record we see -1 in the output."))
main.append(addParagraph("To illustrate this, recall that the last field in the first record of dukeofyork.txt is York. Although NF represents the number of fields and that is an integer, $NF refers to the field with that integer as its number, so in this case $NF refers to York and we then subtract 1 from that value."))
main.append(addParagraph("It does have a numerical value and that is 0 (as it is for any word), so subtracting 1 from that gives you -1 as seen in the output."))
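/*
Sketch: the three forms discussed above, run side by side on the first line of
dukeofyork.txt as quoted earlier. The shell variable names are only for this
illustration.

```shell
line='The grand old Duke of York'

last=$(echo "$line" | awk '{print $NF}')         # the last field
penult=$(echo "$line" | awk '{print $(NF-1)}')   # the field numbered NF-1
minus=$(echo "$line" | awk '{print $NF-1}')      # value of the last field, minus 1

echo "$last $penult $minus"
```

The parentheses decide whether the subtraction applies to the field number or
to the field's value.
*/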
main.append(addParagraph("It is also possible to use the value of a field to reference a field in the record. For example"))
main.append(addSyntax("awk '{print $($1)}'"))
main.append(addParagraph("Let's look at this starting with the expression so that is"))
main.append(addSyntax("($1)"))
main.append(addParagraph("This returns the value in field 1 and if that value isn't a number, it's probably going to return 0. To keep things simple, let's assume field 1 holds the number of a field that exists in the record, as in the input"))
main.append(addSyntax("2 one two three"))
main.append(addParagraph("so the expression returns 2 so"))
main.append(addSyntax("awk '{print $($1)}'"))
main.append(addParagraph("becomes"))
main.append(addSyntax("awk '{print $2}'"))
main.append(addParagraph("and so field number two is output, in this case"))
main.append(addSyntax("one"))
main.append(addParagraph("It is also possible to amend the value in a specified field by using an expression that sets the value of a field to something. For example if we have"))
main.append(addSyntax("awk '{$2 = \"TWO\"; print}' dukeofyork.txt"))
main.append(addParagraph("The second statement, print, will output the record that has been read in, but the first statement has already set the value of the second field to \"TWO\". The output will therefore look pretty similar to what we have seen earlier except the second word in each line is replaced by the word \"TWO\"."))
main.append(addParagraph("As we saw in previous examples, the changes that AWK makes are only on the file in memory, the saved file is not changed unless you redirect output to overwrite the file."))
main.append(addParagraph("We can also create new fields by assigning a value to a field that didn't previously exist. For example, let's say we set the value of field 11 to \"ELEVEN\" in the file, dukeofyork.txt"))
main.append(addParagraph("You might recall that the records in that file all have between 6 and 10 fields."))
main.append(addParagraph("We've used an exclamation mark as a field separator and we are outputting the number of fields in each record followed by the record itself. The output looks something like this."))
main.append(addParagraph("Now, each record has 11 fields because we set the value of $11 and, for any record that had fewer than 10 fields, empty fields have been created. Take the first line as an example: it has 6 fields before we run this program. The program sets the value of the 11th field, which in this case has the effect of creating that 11th field but also creates empty fields for $7, $8, $9 and $10."))
main.append(addParagraph("By creating new fields in this way, we are also changing the value of $0 and NF as we saw in this example because adding fields is changing the overall record and obviously it changes the number of fields in each record."))
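/*
Sketch: one way to reproduce the effect described above on a single line. It
assumes the exclamation mark is set as the output field separator (OFS) in a
BEGIN block; the exact command used in the course may differ.

```shell
# Assigning to $11 creates fields 7 to 10 as empty fields and rebuilds $0 using OFS.
out=$(echo 'The grand old Duke of York' |
  awk 'BEGIN { OFS = "!" } { $11 = "ELEVEN"; print NF, $0 }')
echo "$out"
```

The run of exclamation marks after York marks the empty fields that were
created between the sixth field and the new eleventh field.
*/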
main.append(addParagraph("We can also assign a completely new record by changing the value of $0 as in this example"))
main.append(addParagraph("In this case, for each record in the file, we are changing the value of the record to"))
main.append(addSyntax("one two three"))
main.append(addParagraph("and then outputting the number of fields and the value of the second field. The result is that each record now has three fields and the value of $2 is two."))
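/*
Sketch: replacing the whole record by assigning to $0, as described above. The
input line is only a stand-in; any record would give the same result.

```shell
# After $0 is assigned, AWK splits the new record into fields again.
out=$(echo 'The grand old Duke of York' |
  awk '{ $0 = "one two three"; print NF, $2 }')
echo "$out"
```
*/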
main.append(addParagraph("As with many programming languages, variable names are made up of letters, numbers and the underscore character. The name can start with a letter or the underscore character, but not a number."))
main.append(addParagraph("Variables are not declared in AWK, they are just created as soon as you use one. This can be a problem because it means, for example, if you mistype a variable name, rather than give you an error AWK will create a new variable which will almost certainly mean that your code will not do what you expect it to do."))
main.append(addParagraph("Consider this example"))
main.append(addParagraph("If we provide the input"))
main.append(addSyntax("one two"))
main.append(addParagraph("we get the output"))
main.append(addSyntax("two"))
main.append(addParagraph("Here, we have two variables, hello and goodbye which held the values of $1 and $2 respectively and we wanted to output both values, but the first variable, hello, was mistyped as ehllo in the output."))
main.append(addParagraph("AWK then creates a new variable called ehllo and gives it the default initial value, which is an empty string, so what we are seeing in the output is an empty string followed by the value of goodbye, which is $2."))
main.append(addParagraph("Variable names are case-sensitive. All variables (this is also true for expressions) are treated either as numbers or as strings depending on the context, so AWK is using duck typing to determine the data type."))
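/*
Sketch: a program of the kind described above, with the variable name
mistyped in the print statement. The exact program used in the course is not
shown here, so this form (which concatenates the two values) is an assumption
chosen to match the output quoted above.

```shell
# ehllo is a typo for hello: AWK silently creates it as an empty string.
out=$(echo 'one two' |
  awk '{ hello = $1; goodbye = $2; print ehllo goodbye }')
echo "$out"
```
*/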
main.append(addSyntax("awk '{a=1;b=3; print a + b}'"))
main.append(addParagraph("Notice that this program does not use any fields from its input, but the action still runs once for each record that is read, so when you press return (which supplies an empty record) you get a blank line followed by"))
main.append(addSyntax("4"))
main.append(addParagraph("In this example, because we used the addition operator, the variables are treated as numbers. If we omit it"))
main.append(addSyntax("awk '{a=1;b=3; print a b }'"))
main.append(addParagraph("the output is"))
main.append(addSyntax("13"))
main.append(addParagraph("This is something we have seen in other examples. If we put the variables (or field identifiers/literal values) next to each other in the output, their values are concatenated. In this case, the variables have been treated as strings, those string values being \"1\" and \"3\" and so we get the string \"13\" as our output."))
main.append(addParagraph("In AWK, a string that does not look like a number has a numeric value of 0, so using one in a context where it is treated as a number, for example"))
main.append(addSyntax("awk '{a=1;b=\"Bob\"; print a + b}'"))
main.append(addParagraph("will give us the output"))
main.append(addSyntax("1"))
main.append(addParagraph("AWK will treat numeric values as either integers or floats depending on context and will convert between them as necessary."))
main.append(addSyntax("awk '{a=1;b=3 ; print a / b }'"))
main.append(addParagraph("will give us the output"))
main.append(addSyntax("0.333333"))
main.append(addParagraph("If you want a string to be treated as a number, you can add 0 to it."))
main.append(addSyntax("awk '{print \"one\"}'"))
main.append(addParagraph("outputs"))
main.append(addSyntax("one"))
main.append(addParagraph("but if we add 0 to it, for example with print \"one\" + 0, the output is 0 because the string is converted to a number."))
main.append(addParagraph("Things can get quite complicated if you mix these types of operations. What do you think the output would be from"))
main.append(addSyntax("awk '{a=1;b=2;c=3; print a b * c}'"))
main.append(addParagraph("Here, we have a concatenation operation which will treat its operands as strings and a multiplication operation. The key here is that the multiplication will be done first just as it would if we replaced the concatenation operator (which is just a space) with the addition operator."))
main.append(addParagraph("That will return 6 so then we would concatenate that with the value of a, which is 1, to give the output"))
main.append(addSyntax("16"))
main.append(addParagraph("Of course, we can use parentheses to force awk to perform the concatenation first"))
main.append(addSyntax("awk '{a=1;b=2;c=3; print (a b) * c}'"))
main.append(addParagraph("which will give you the string 12 and that is converted to a number for the multiplication to give you"))
main.append(addSyntax("36"))
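/*
Sketch: both versions of the expression above, run from a BEGIN block so that
no input is needed. The use of BEGIN and the shell variable names are choices
made for this illustration.

```shell
# Multiplication binds more tightly than concatenation: a followed by (b * c).
first=$(awk 'BEGIN { a = 1; b = 2; c = 3; print a b * c }')

# Parentheses force the concatenation first: "12" is then multiplied by 3.
second=$(awk 'BEGIN { a = 1; b = 2; c = 3; print (a b) * c }')

echo "$first $second"
```
*/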
main.append(addParagraph("Let's look at another example"))
main.append(addParagraph("So this is going to take the first field of its input and print it in double quotes, it will add 0 to that to convert the field to a number and output that as well so for the input"))
main.append(addSyntax("123"))
main.append(addParagraph("we will get the output"))
main.append(addParagraph("Let's see some more input/output pairs"))
main.append(addParagraph("AWK will try to interpret a string as a number if the context calls for it, as you can see in these examples. This raises some interesting points."))
main.append(addParagraph("Note that it recognises integers, floats, numbers in scientific format and negative numbers, but you might see some changes when converting to a number, so 4e3 becomes 4000 for example, and where the string representation includes leading zeroes, these are ignored."))
main.append(addParagraph("In some cases, where the string is a mixture of numbers and letters, the letters are ignored so that's why we see the output 66 when the input is 66booboo. This doesn't work if the string starts with a letter: awk won't extract the number, so \"booboo66\" returns 0 because this is interpreted only as a string."))
main.append(addParagraph("So with strings that include both letters and numbers, as long as the first character is a number, the letters will be silently ignored and the numeric characters will be extracted as a number."))
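/*
Sketch: a few input/output pairs of the kind discussed above. The program
prints the first field in double quotes, then the same field with 0 added to
it. The exact program and inputs used in the course are not shown here, so
both are assumptions based on the description above.

```shell
out=$(printf '123\n4e3\n007\n66booboo\nbooboo66\n' |
  awk '{ print "\"" $1 "\"", $1 + 0 }')
echo "$out"
```

Each output line shows the original string on the left and its numeric value
on the right, so the effect of the conversion can be compared directly.
*/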
main.append(addParagraph("Aside from the parameters of user-defined functions, which are not covered in this course, all awk variables have global scope."))
main.append(addHeader("WORKING WITH OPERATORS AND ARRAYS"))
main.append(addParagraph("AWK uses pretty much the same arithmetic and logical operators as languages such as C or Java. This includes the increment and decrement operators, which both have pre and post versions, so they work differently depending on whether the operator is placed before or after the variable."))
main.append(addSyntax("awk '{a=3;b = ++a; print a, b}'"))
main.append(addParagraph("will output"))
main.append(addSyntax("4 4"))
main.append(addParagraph("because the value of a is incremented before being assigned to b."))
main.append(addSyntax("awk '{a=3; b = a++; print a, b}'"))
main.append(addParagraph("will output"))
main.append(addSyntax("4 3"))
main.append(addParagraph("because this time, the value of a is assigned to b before it is incremented."))
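/*
Sketch: the pre-increment and post-increment forms side by side, run from a
BEGIN block so that no input is needed. The shell variable names are only for
this illustration.

```shell
# Pre-increment: a becomes 4 first, then b receives 4.
pre=$(awk 'BEGIN { a = 3; b = ++a; print a, b }')

# Post-increment: b receives the old value 3, then a becomes 4.
post=$(awk 'BEGIN { a = 3; b = a++; print a, b }')

echo "$pre"
echo "$post"
```

In both cases a ends up as 4; only the value assigned to b differs.
*/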
main.append(addParagraph("AWK also uses assignment operators that are similar to those found in C. A single equals sign performs a straightforward assignment but you can use C-like shorthand for statements such as"))
main.append(addSyntax("a = a + 5;"))
main.append(addParagraph("which can be written as"))
main.append(addSyntax("a += 5;"))
main.append(addParagraph("and similarly there are other versions of the assignment operator including %= and ^= (those relate to the arithmetic operators modulo, %, and exponent, ^)."))
main.append(addParagraph("The comparison operators are also similar to those found in C and you can use these with strings. The equality operator, ==, if you use it to compare two strings, will return true if the strings are identical. If you want to use an operator such as <= or >=, the strings will be compared in alphabetical order."))
main.append(addParagraph("AWK also has three string operators, one of which is concatenation which, as we have already seen, doesn't have a symbol. The other two are the regular expression operators: ~, which returns true if the string matches the regex, and !~, which returns true if it doesn't match."))
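/*
Sketch: string comparison and the two regular expression operators in one
small program. The strings and the variable names are assumptions chosen for
this illustration. AWK prints 1 for true and 0 for false.

```shell
out=$(awk 'BEGIN {
  same  = ("york" == "york")          # identical strings
  order = ("duke" <= "york")          # duke sorts before york
  hit   = ("dukeofyork" ~ /york$/)    # the regex matches
  miss  = ("dukeofyork" !~ /york$/)   # true only when there is no match
  print same, order, hit, miss
}')
echo "$out"
```

The comparisons are wrapped in parentheses and stored in variables before
printing, which avoids any confusion with the redirection symbols that print
also accepts.
*/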
main.append(addParagraph("AWK also has arrays and these also work in a way that is similar to C so if you want to define a variable as an array, you put a subscript after it with the index value. For example, if we run the code"))
main.append(addParagraph("and give it the input"))
main.append(addSyntax("one two three"))
main.append(addParagraph("we will get the output"))
main.append(addSyntax("one two three"))
main.append(addParagraph("This is just creating three array elements and assigning the values of the fields from the input to those elements, so a[1] holds the value of the first field and so on. We are then outputting these by referencing those array elements rather than directly referencing the fields."))
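/*
Sketch: a program matching the description above, which copies the first three
fields into array elements and prints them back. The exact program used in the
course is not shown here, so this form is an assumption based on that
description.

```shell
# a[1], a[2] and a[3] are created simply by assigning to them.
out=$(echo 'one two three' |
  awk '{ a[1] = $1; a[2] = $2; a[3] = $3; print a[1], a[2], a[3] }')
echo "$out"
```
*/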
main.append(addParagraph("Note that if you use a variable to create an array, you can no longer use that variable as a scalar."))
main.append(addParagraph("Arrays in AWK can be very flexible, as we will see later, but AWK only supports one-dimensional arrays."))