<!doctype html>

<html>

<head>

<meta charset="utf-8">

<meta http-equiv="X-UA-Compatible" content="IE=edge">

<meta name="viewport" content="width=device-width, initial-scale=1">

<title>My Learning Website</title>

<link href="/styles/styles.css" rel="stylesheet" type="text/css">

<link href="/linux/styles/styles.css" rel="stylesheet" type="text/css">

<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->

<!-- WARNING: Respond.js doesn't work if you view the page via file:// -->

<!--[if lt IE 9]>

<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>

<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>

<![endif]-->

</head>
|
<body>

<div class="banner">

<h1 class="courselink">Learning Linux Command Line</h1>

<h2 class="lecturer">LinkedIn Learning : Scott Simpson</h2>

<h2 class="episodetitle">Common Command-Line Tasks and Tools</h2>

</div>

<article>
|
<h2 class="sectiontitle">The Unix Philosophy</h2>

<p>Much of the design of a Linux system is based on a set of principles often referred to as the Unix Philosophy. This can be summed up in a quote from Doug McIlroy:</p>

<blockquote>

<p>Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.</p>

<cite>Doug McIlroy</cite>

</blockquote>

<p>More information on this can be found in the online book chapter <a href="http://www.catb.org/esr/writings/taoup/html/ch01s06.html">Basics of the Unix Philosophy</a> or in the YouTube video <a href="https://www.youtube.com/watch?v=aY7OzGPaP5I">The Unix Philosophy</a> by Philip Bohun.</p>

<p>The course video distils this philosophy into three maxims:</p>

<pre class="inset">
• Programs should do <strong>one</strong> thing.
• They should use text interfaces (this does not refer to the interface that the program presents to the user, but to the fact that the program should take in text and output text).
• Many modular tools are preferable to one big one.</pre>

<p>Contrast this with a Swiss army knife, which does many things but none of them particularly well. The Unix Philosophy is better thought of as an array of specialised tools, in the same way that a kitchen is (or should be) stocked with tools for specialised tasks such as a zester, a garlic press, a bread knife and so on.</p>

<p>It is therefore important that these tools can work together: one program does something, then passes its output to a second program for further processing, and so on.</p>

<p>This philosophy applies when working at the command line, where one program may read the text from a file, send it to a program that filters certain text, then to one that removes duplicate lines and finally to a program that writes the results back to a file (or more precisely, whose output is redirected to a file).</p>

<p>At the command line, modularity and flexibility should be seen as features, not as limitations.</p>
|
<h2 class="sectiontitle">Use Pipes to Connect Commands Together</h2>

<p>A pipe takes the output of one command and sends it to another. In this context, a command can be seen as a processing node and a pipe as a connection between two of these nodes. Note that in text, a space is often placed between the command and the pipe for readability, but this is not required.</p>

<p>To illustrate this, let's say we execute the following command</p>

<pre class="inset">$echo "Hello World"</pre>

<p>This simply echoes the string "Hello World" to the screen. Now, let's say that rather than display the string, we want to count the number of words. We can do that with</p>

<pre class="inset">$echo "Hello World" | wc</pre>

<p>The format of the output is lines, words, characters, so this command will output</p>

<pre class="inset">1 2 12</pre>

<p>The character count is 12 because it includes the space character and the newline character that echo adds to the end of the string. In principle, you can pipe the output of one command into any other command, and this will normally do what you expect!</p>
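<p>Pipelines really shine when several small tools are chained. As a sketch (the sentence here is just illustrative), we can extend the echo example to count the unique words in a sentence:</p>

```shell
# Split a sentence into one word per line, sort so duplicates are
# adjacent, remove the duplicates, then count what is left.
echo "the quick brown fox jumps over the lazy dog" | tr ' ' '\n' | sort | uniq | wc -l
```

<p>Nine words go in, but "the" appears twice, so the count printed is 8. Each command does one small job, and the pipe carries plain text between them.</p>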
|
<h2 class="sectiontitle">View Text Files with cat, head, tail and less</h2>

<p>The cat command is short for concatenate. For example, let's say we have a file called hello.txt containing the text 'Hello' and a file called world.txt containing the text 'World'. The command</p>

<pre class="inset">$cat hello.txt world.txt</pre>

<p>will concatenate these two files and display the output on the screen as</p>

<pre class="inset">Hello
World</pre>

<p>Note that there is a line break between the files because each file ends with a newline. In this case, we have provided cat with two files as input, but it accepts any number of files. However, cat is most commonly used with a single file, which it displays on the screen, so it can easily be used to view the contents of a text file.</p>

<p>In addition, a useful application of cat is to pipe the contents of a text file into another command for processing.</p>

<p>A typical use of cat is to view log files, which can be quite long, so we can also use head and tail to view a specific number of lines at the start or end of the file respectively.</p>

<p>If we want to view the first 10 lines of a file, we can use the command</p>

<pre class="inset">$head -10 poems.txt</pre>

<p>to do that. Similarly, the command</p>

<pre class="inset">$tail -10 poems.txt</pre>

<p>will display the last 10 lines. In both cases we can replace 10 with a different number, say 5, to display 5 lines of text, and you may see the number preceded by an n to indicate that the -n option is being used to specify the number of lines to display. If we do not specify a number, 10 lines are displayed by default.</p>
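<p>As a quick, self-contained illustration (using seq to generate numbered lines rather than the course's poems.txt):</p>

```shell
# Generate the numbers 1 to 12, one per line, and peek at each end.
seq 12 | head -n 3   # first three lines: 1, 2, 3
seq 12 | tail -n 3   # last three lines: 10, 11, 12
```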
|
<p>Bearing in mind that these commands are often used to view log files, consider an example where we are viewing a security log file in order to monitor suspicious activity. It can be very useful to be able to view additions to the log file as they happen. We can do this with the tail command by adding the -f (or --follow) option. In this example (where we are monitoring a file called security.txt)</p>

<pre class="inset">$tail -f security.txt</pre>

<p>the command will display the last 10 lines of the file when executed but will continue to run, and as additional lines are added to the file, they will be displayed on the screen.</p>

<p>We can also use the less command to display the contents of a text file with some additional flexibility, such as the ability to display data a page at a time. In fact, the man command, used to display documentation for a specific command, uses less to do this.</p>

<p>Working with text files like this can help to give a sense of how the order of commands affects the output. Consider this piece of code (using a file in the Exercise Files folder).</p>

<pre class="inset">$cat poems.txt | cat -n | tail -n5</pre>

<p>The output produced is shown in figure 11.</p>

<figure>

<img src="images/figure11.png" alt="Figure 11 - the output from the sequence of commands, cat poems.txt | cat -n | tail -n5">

<figcaption>Figure 11 - the output from the sequence of commands, cat poems.txt | cat -n | tail -n5</figcaption>

</figure>

<p>If we analyse the sequence of commands here, the first (cat poems.txt) outputs the contents of the file and pipes them to a second cat command, which numbers the lines and passes this output to the tail command, which displays the last 5 lines (numbered from 51 to 55).</p>

<p>Now, if we switch the last two commands around to give</p>

<pre class="inset">$cat poems.txt | tail -n5 | cat -n</pre>

<p>the output is as shown in figure 12.</p>

<figure>

<img src="images/figure12.png" alt="Figure 12 - the output from the sequence of commands, cat poems.txt | tail -n5 | cat -n">

<figcaption>Figure 12 - the output from the sequence of commands, cat poems.txt | tail -n5 | cat -n</figcaption>

</figure>

<p>The output is the same five lines, but now numbered from 1 to 5. This is because we have taken the output from the first command and passed it to the tail command, which outputs the last 5 lines, and it is these 5 lines that are sent to the second cat command for numbering.</p>

<p>Since the second cat command only ever sees the 5 lines output by tail, the numbering sequence starts at 1.</p>
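<p>The same effect can be reproduced without the exercise file, using printf to supply a few lines directly (the words here are arbitrary):</p>

```shell
# Number the whole stream first, then keep the last two lines:
printf 'alpha\nbeta\ngamma\ndelta\n' | cat -n | tail -n 2   # lines numbered 3 and 4

# Keep the last two lines first, then number them:
printf 'alpha\nbeta\ngamma\ndelta\n' | tail -n 2 | cat -n   # lines numbered 1 and 2
```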
|
<h2 class="sectiontitle">Search for Text in Files and Streams with grep</h2>

<p>In essence, the purpose of grep is to display lines of text that match a pattern (or more specifically, that contain some text matching a pattern). While a pattern can be quite complex, it can be as simple as a specific word, as in the command:</p>

<pre class="inset">$grep "the" poems.txt</pre>

<p>This displays all lines from the file poems.txt containing the string "the". We can also add the -n option so that the lines are numbered as they are output, but note that we are not piping the output to a command that numbers the lines, so the line numbers are those from the original file. Hence, if we execute the command:</p>

<pre class="inset">$grep -n 'the' poems.txt</pre>

<p>we will see the same output as with the previous command but with the lines numbered 7, 11, 12, 13, 16 and so on, and these are the line numbers from the original file. In other words, the first time the string 'the' is encountered is on line 7 of the original file, which is why the first line number is 7 here.</p>

<p>I can also use the command</p>

<pre class="inset">$grep 'The' poems.txt</pre>

<p>to search for instances of 'The', and I will see that the results are different. This is because grep is case-sensitive by default. However, we can instruct grep to ignore case by using the -i option, so</p>

<pre class="inset">$grep -i 'The' poems.txt</pre>

<p>will find any instance of the letters t, h and e together. I would add here that this type of search is quite literal: it will find the letters wherever they occur, provided they are consecutive and in the order specified, so it will find 'the' in 'them' and so on. Also, because the last command is case-insensitive, it does not matter what case we use to specify the pattern:</p>

<pre class="inset">$grep -i 'tHe' poems.txt</pre>

<p>would produce exactly the same results.</p>

<p>Another option we can use with grep is -v, which inverts the result we would otherwise obtain. Hence</p>

<pre class="inset">$grep -vi 'the' poems.txt</pre>

<p>will display all lines of text that do <strong>not</strong> contain a match for the pattern 'the', and again we are ignoring case. If we didn't use the -i option here, we would see lines containing the string 'The', but no lines containing the string 'the'.</p>

<p>One potentially very useful application of the -v option is where you are looking through logs and want to filter out a program that produces a lot of log entries. You can also combine two grep commands so that you omit all lines matching one pattern and then search the remaining lines for a second pattern.</p>

<p>To give a somewhat contrived example (I am working on a CentOS VM created for this course and am not particularly experienced with security logs, so it is difficult to perform a meaningful search of them), let's say that I want to search through the security log file.</p>

<p>In CentOS, the log files are in the /var/log folder and the security file is called secure (I am assuming from the name that it is the security log)! Let's assume that I want to disregard any line in this file containing the text 'polkitd' and search the remaining lines for the text 'philip'. I can do that with</p>

<pre class="inset">$sudo grep -v 'polkitd' secure | grep 'philip'</pre>

<p>Note that the second grep command is not operating on the file secure; it is operating on a stream, and that stream is formed by the output of the first command. If I tried the commands</p>

<pre class="inset">$sudo grep -v 'polkitd' secure | grep 'philip' secure</pre>

<p>the first point to note is that I would see a permission error, since root access is required to view the secure file, so I would need to add sudo before the second grep command as well. Secondly (and this is an assumption on my part, but I think a sensible one), this would filter out the lines containing 'polkitd' but then perform a completely new search of the file, one that does not filter this text out.</p>

<p>To put it another way, we are sending the output of one grep command to another grep command, which then ignores it and searches the original file directly. To test this theory, I will run the commands again, but this time filtering out the text 'philip'. The command</p>

<pre class="inset">$sudo grep -v 'philip' secure | grep 'philip'</pre>

<p>should return nothing, since we are searching for the string 'philip' only in lines that do not contain the string 'philip'.</p>

<p>The command</p>

<pre class="inset">$sudo grep -v 'philip' secure | sudo grep 'philip' secure</pre>

<p>should return all lines containing the text 'philip', since we are ignoring the stream that filters out lines with this text.</p>

<p>I have also added >> log1.txt and >> log.txt respectively to the ends of these two commands when executing them, so that the output goes into text files in my own home directory and I can view them without needing sudo access. In theory, log1.txt should be empty and log.txt should contain all the lines containing the text 'philip'.</p>

<p>In practice, log1.txt is empty and log.txt contains 92 lines of text, so I think this confirms my theory!</p>
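<p>The same behaviour can be verified with a small throwaway file instead of the real security log (the log lines below are invented for the demonstration):</p>

```shell
# Build a miniature 'secure' log in a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
polkitd: session opened
sshd: accepted password for philip
polkitd: philip authorized
cron: job started
sshd: session closed for philip
EOF

# Drop the polkitd lines, then search the remainder for 'philip';
# only the two sshd lines mentioning philip survive.
grep -v 'polkitd' "$log" | grep 'philip'

# Filtering out 'philip' and then searching for it returns nothing
# (|| true because grep exits non-zero when there is no match).
grep -v 'philip' "$log" | grep 'philip' || true

rm -f "$log"
```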
|
<h2 class="sectiontitle">Manipulate Text with awk, sed and sort</h2>

<p>Both awk and sed can be used to programmatically manipulate text in streams or files.</p>

<h3>awk</h3>

<p>In the Exercise Files folder, we have a file called simple_data.txt which we can examine with cat. This file contains lines of tab-separated data (except for the first line, which contains column titles). The contents of this file are as follows:</p>

<pre class="inset">
Name    ID      Team
Scott   314     Purple
Ananti  991     Orange
Jian    3127    Purple
Miguel  671     Green
Wes     1337    Orange
Anne    556     Green</pre>

<p>Both awk and sed are commonly used in system administration and systems programming, and LinkedIn Learning has courses on both which I will take a look at when I have more time. These are <a href="https://www.linkedin.com/learning/awk-essential-training/welcome">AWK Essential Training</a> and <a href="https://www.linkedin.com/learning/sed-essential-training/welcome">SED Essential Training</a>.</p>

<p>Going back to the data file, awk is great for programmatically pulling out specific fields from structured data, and we can do that by writing an awk program in a separate file or by entering awk commands at the command line.</p>

<p>Note that the fields in a file like this are determined by delimiters, or field separators (each line in this example has three fields, and what makes data such as 314 part of the second field is simply that there is a tab character before and after it – for instance, we might read the first line of data as Scott [tab] 314 [tab] Purple).</p>

<p>Consider the code</p>

<pre class="inset">$awk '{print $2}' simple_data.txt</pre>

<p>This simply outputs the second column to the display. As we can see here, the awk program is enclosed in single quotes, $2 is a reference to the second field (or second column), and the print command prints data taken from the specified file, extracting only the second column.</p>

<p>If we want to display two columns, we can modify the print statement in the curly braces so that we have:</p>

<pre class="inset">$awk '{print $2 "\t" $1}' simple_data.txt</pre>

<p>Note that we have used a string literal in the above command to print a tab between the columns, and we have switched the order round so that the second column is displayed before the first.</p>

<p>The output of this command (which is displayed on the screen) is plain text, so we can pipe it to another command such as sort. So, if we wanted to sort the entries numerically, we could do that with:</p>

<pre class="inset">$awk '{print $2 "\t" $1}' simple_data.txt | sort -n</pre>

<p>Notice that the order is taken from the first column, but I don't know if that is because it is a column of numerical data or because it is simply displayed first. I will try the command:</p>

<pre class="inset">$awk '{print $1 "\t" $2}' simple_data.txt | sort -n</pre>

<p>and check the output. If it is sorting the first column, it should either attempt to sort the names (perhaps using the ASCII values of the first character) or not sort them at all.</p>

<p>If it is sorting the numerical data, we should see the output sorted by ID number.</p>

<p>The result is that it has sorted the first column alphabetically. As an experiment, I changed the first character of Jian to lower case and found that it still appeared between Anne and Miguel, which shows that the sorting is not based on raw ASCII values (if it were, jian would sort last, since lower-case letters have higher ASCII values than upper-case letters).</p>
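<p>To make the extraction step reproducible here, the following sketch recreates a cut-down simple_data.txt in a temporary file; NR &gt; 1 is a small addition of mine that skips the header row so only data lines are printed:</p>

```shell
# Recreate a three-row version of the course's data file.
data=$(mktemp)
printf 'Name\tID\tTeam\nScott\t314\tPurple\nAnanti\t991\tOrange\nJian\t3127\tPurple\n' > "$data"

# Print ID then Name for the data rows, sorted numerically on the ID.
awk 'NR > 1 {print $2 "\t" $1}' "$data" | sort -n

rm -f "$data"
```

<p>The output here is 314 Scott, then 991 Ananti, then 3127 Jian: numeric order, not first-character order.</p>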
|
<p>As we have seen, awk is particularly useful for extracting data from a file. In comparison, sed, which stands for stream editor, is particularly useful for changing data in a file.</p>

<p>In its simplest form, sed can be used simply to exchange one string for another. For example, the command</p>

<pre class="inset">$sed "s/Orange/Red/" simple_data.txt</pre>

<p>will replace the instances of Orange in the simple_data.txt file with Red (strictly, the first instance on each line, which is all we have here). The s in this command stands for substitute, and it is followed by the string you want to replace and then the new string, with these parts separated by / characters.</p>
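<p>One detail worth knowing, easily checked on a stream with no file needed: without a trailing g flag, the s command only replaces the first match on each line.</p>

```shell
echo "Orange beats Orange" | sed 's/Orange/Red/'    # first match only: Red beats Orange
echo "Orange beats Orange" | sed 's/Orange/Red/g'   # every match:      Red beats Red
```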
|
<p>We have already seen that sort can be used to sort data numerically and that, when used with string data such as the names in the simple_data.txt file, the data is sorted alphabetically. We can also use sort on its own, so we could do something like</p>

<pre class="inset">$sort simple_data.txt</pre>

<p>which is the most basic type of sort. This sorts on the first column, alphabetically and without regard to case. We can also add the -n flag to sort numerically, but if we don't specify a column (or key) to sort on, it will sort by the first column and the result will be the same.</p>

<p>If we execute the command</p>

<pre class="inset">$sort -k2 simple_data.txt</pre>

<p>we are now specifying a key, and the result is a sort based on the ID numbers in the file. However, in this case the sort is based simply on the first character (or more, if needed to determine the order), so our first three IDs are 1337, then 3127 and then 314.</p>

<p>We can add the -n option here as well, so</p>

<pre class="inset">$sort -nk2 simple_data.txt</pre>

<p>will result in the data being sorted in numerical order based on the second column.</p>

<p>Sort can also be used to do some useful things like removing duplicate lines from a file. If we look at the file dupes.txt, which has several lines of text that are not all unique, we can run the command</p>

<pre class="inset">$sort -u dupes.txt</pre>

<p>The result is the lines from the file sorted alphabetically with no duplicates, so the sort command outputs 3 lines of text compared to the 6 lines in the original file.</p>
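<p>We can reproduce this without dupes.txt; sort -u gives the same result as piping sort into uniq for whole-line duplicates (the fruit names are arbitrary):</p>

```shell
# Six lines in, three unique lines out, in alphabetical order.
printf 'pear\napple\npear\nplum\napple\npear\n' | sort -u
```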
|
<p>In fact, sort is a very useful command and worthy of further investigation. There are some other text manipulation commands which are probably also worth investigating:</p>

<pre class="inset">
• rev – reverses the characters on each line
• tac – prints the lines in reverse order (tac is cat backwards)
• tr – the translate command, which substitutes or deletes individual characters (this brief description doesn't give much of a clue as to what it can do, so it is probably worth investigating for that reason alone)</pre>

<h2 class="sectiontitle">Edit Text with Vim</h2>

<p>Note that vim (Vi IMproved) is, as the name suggests, a more advanced version of vi. On most modern distributions the name vi survives for legacy reasons only: if you invoke the editor with the vi command, you are actually invoking vim, so this is exactly the same as invoking it with the vim command.</p>

<p>Since I am already familiar with vim, I will simply note that there are numerous online tutorials and cheat sheets for using vi (for instance, there are 9 <a href="https://linuxhandbook.com/vim-cheat-sheet/">here</a>) and there are LinkedIn Learning tutorials specifically for <a href="https://www.linkedin.com/learning/learning-vi">vi</a>, <a href="https://www.linkedin.com/learning/learning-vim/vim-for-text-editing">vim</a> and <a href="https://www.linkedin.com/learning/learning-nano">nano</a>.</p>
|
<h2 class="sectiontitle">Working with Tar and Zip Archives</h2>

<p>A tar file (tar stands for tape archive) is a common way to package a number of files into one single file, perhaps for easier distribution. Note that a tar file is not itself compressed. That is, if you package files into a tar file, they are not automatically compressed, but you can incorporate compression in several formats, giving extensions such as .tar.gz or .tgz (a tar file with gzip compression) and .tar.bz2 (bzip2 compression).</p>

<p>To demonstrate this, we will take all of the files in the Exercise Files folder and place them inside an archive file. To do this, we will go to the Documents folder, because this is the folder that contains the Exercise Files folder (actually, on my installation it is also inside a folder called commandlinebasics, so this is the folder I will navigate to before creating the archive).</p>

<p>The command to create an archive is</p>

<pre class="inset">$tar cvf myfiles.tar Exercise\ Files/</pre>

<p>We can break this down as follows:</p>

<pre class="inset">
• tar is, of course, the command to create an archive.
• The c option indicates that an archive is to be created.
• The v option indicates verbose mode and results in every file being listed as it is added to the archive.
• The f option indicates that the output is to go to a file. Without this, the output would simply go to the screen unless you piped it to another command.
• Following the options is the name of the archive to be created.
• We then have the names of the files or folders to be included in the archive.</pre>

<p>As was mentioned earlier, a tar file is not automatically compressed; in fact, we can view this archive with cat, for example, and see any plain text from the files in the archive.</p>

<p>If we want to create a compressed archive, we need to add an a option, so this gives us:</p>

<pre class="inset">$tar caf myfiles.tar.gz Exercise\ Files</pre>

<p>Note that the v was omitted this time, since we didn't really need to see the files listed again, and the a option has been added. In this case, the compression format is gzip, and this is determined by the fact that the output file has a .gz extension. Indeed, if you look in the man page for tar, you will see that the a option means auto-compress: the archive is compressed automatically based on the file extension. It is also possible to indicate explicitly which compression format should be used.</p>

<p>We could also have used the extension .bz2, in which case the compression format used would be bzip2.</p>

<p>We will do that, and then take a look at the directory listing (shown in figure 13), where we see that the two compressed files are slightly different in size.</p>

<figure>

<img src="images/figure13.png" alt="Figure 13 - the directory listing showing the Exercise Files folder as well as the three archives we created.">

<figcaption>Figure 13 - the directory listing showing the Exercise Files folder as well as the three archives we created.</figcaption>

</figure>

<p>Note that the uncompressed file is significantly larger than the compressed archives, and if we try to view a compressed archive, say with cat, we can see that it does not contain very much in the way of legible text!</p>

<p>So that's how we create an archive; now we will look at the opposite process, extracting files from an archive. To do this, we will first create a folder called unpack and move one of our archive files into it before extracting the contents.</p>

<p>Now, we can cd to the unpack folder and extract the files with the command</p>

<pre class="inset">$tar xf myfiles.tar.bz2</pre>

<p>Notice how we have replaced the c, which allowed us to create the archive, with an x, which allows us to extract from the archive. The other option is f, which indicates that we are reading from a file, the archive named next on the command line. In this case, we created a new folder for the archive before extracting its contents, partly because we already had an Exercise Files folder in the directory we were working in.</p>

<p>What would have happened if we had tried to extract the files into a directory that already contains them? Nothing seems to happen, which suggests that the process simply overwrites the files that are already there. To determine if this is the case, let's make a change to a file in the Exercise Files folder with a command such as</p>

<pre class="inset">$cat simple_data.txt > poems.txt</pre>

<p>which takes the contents of simple_data.txt and redirects it to the file poems.txt, overwriting its previous contents. Now, if we extract the files again and the command is overwriting existing files, we should see that poems.txt is restored. This is in fact what we see, so we can conclude that the extraction process overwrites files. Note also that it does not report this (at least with the options used in the command we saw earlier), so it is worth bearing in mind that this process can be dangerous.</p>

<p>This is another reason why we might want to put the archive into a separate folder before extracting the contents: it makes it more difficult to accidentally overwrite a file you have changed since the archive was created.</p>

<p>Note that we had previously navigated to the unpack folder before extracting files from the compressed archive we had placed there. We will now go back one step, to the commandlinebasics folder, which contains our original Exercise Files folder and the other archives we have created as well as, of course, the unpack folder.</p>

<p>We will create another folder called unpack2 and decompress the other compressed archive into it, but this time without navigating to it or placing the compressed archive inside it.</p>

<p>So, the compressed archive we are working with this time is myfiles.tar.gz, and we will extract the files inside it with the command:</p>

<pre class="inset">$tar xvf myfiles.tar.gz -C unpack2</pre>

<p>Note the -C option, placed after the archive name but before the target folder, which specifies the folder to extract the files to. We can inspect the folder's contents with the command</p>

<pre class="inset">$ls -lah unpack2</pre>

<p>and we can see that it does indeed contain the extracted Exercise Files folder.</p>
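<p>The whole create/extract cycle can be sketched in a scratch directory (the folder and file names below are made up for the demonstration):</p>

```shell
work=$(mktemp -d)                     # throwaway working directory
mkdir "$work/docs"
echo "hello" > "$work/docs/a.txt"

# c = create, a = auto-compress from the .gz extension, f = to this file;
# -C tells tar which directory the member paths are relative to.
tar caf "$work/docs.tar.gz" -C "$work" docs

# x = extract, f = from this file; -C picks the destination folder.
mkdir "$work/unpack"
tar xf "$work/docs.tar.gz" -C "$work/unpack"

cat "$work/unpack/docs/a.txt"         # prints: hello
rm -rf "$work"
```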
|
<p>The tar format is very common in Linux environments, but you may also see the zip format being widely used because it is more cross-platform friendly. That is, it works well on Linux, Windows and macOS.</p>

<p>To create a zip file, we would use the zip command like this:</p>

<pre class="inset">$zip exfiles.zip Exercise\ Files/</pre>

<p>Here, we have the command itself, followed by the name of the zip file we want to create and then the names of the files or folders (in this case, just the single folder, Exercise Files) to be added to the zip file.</p>

<p>This works pretty much as you might expect, but with a surprising twist. We have specified that the folder should be added to the zip file, but not its contents. The result is that if we unzip the exfiles.zip file, we will find that it contains an empty folder called Exercise Files.</p>

<p>Assuming we want to zip the folder and its contents, we need to add the -r (recursive) option:</p>

<pre class="inset">$zip -r exfiles.zip Exercise\ Files/</pre>

<p>When the files are being zipped, each file is listed (note that we did not need to specify verbose mode with a -v option), and we can see how much space is being saved on each file. When complete, we can check the size of the zip file with a directory listing, and we can see that it is similar in size to our compressed archive files, but slightly larger.</p>

<p>To unzip the file, we will again move it to a folder of its own, which we will call unzip, and from that folder we can then unzip the file with the command:</p>

<pre class="inset">$unzip exfiles.zip</pre>

<p>Again, we will see a list of files as they are extracted, and when done, we can see the unzipped directory, Exercise Files, and we can get a directory listing to confirm that it does contain the expected files.</p>
|
<h2 class="sectiontitle">Output Redirection</h2> |
|
<p>Working at the command line, most of the time any output from a directory listing, search, file display command and so is displayed on the screen. However, it is possible to save this output into a file, for example.</p> |
|
<p>There are three streams we can make use of when working with text. These are:</p> |
|
<pre class="inset"> |
|
<strong>STREAM NUMBER USAGE</strong> |
|
Standard input (stdin) 0 Text input |
|
Standard output (stdout) 1 Normal text output |
|
Standard error (stderr) 2 Error text</pre> |
|
<p>To demonstrate this, we’ll go to the Exercise Files folder and list it’s contents with ls. By default, this is sent to the screen but we can also redirect it to a text file with:</p> |
|
<pre class="inset">$ls 1>filelist.txt</pre> |
|
<p>This redirects the output which would normally go to the screen into a file called filelist.txt and it will create the file if it does not already exist.</p> |
|
<p>However, it is important to remember that if the file does already exist, redirecting output to it in this way will overwrite anything that was already in the file. Let’s say we have the file, flielist.txt, with the directory listing from Exercise Files already written to it and we want to add to that, the contents of the home directory.</p> |
|
<p>If we do something like</p> |
|
<pre class="inset">$ls ~ 1> filelist.txt</pre> |
|
<p>this will result in filelist.txt containing only the directory listing from the home directory. If we use the following command</p> |
|
<pre class="inset">$ls ~ 1>> filelist.txt</pre> |
|
<p>this will append the contents of the home directory to the existing file and the result is that the file will contain listings of both directories.</p> |
|
<p>It is also worth noting that if we use either > or >> without specifying a stream (that is, if we omit the 1), it will be assumed that we mean to redirect the text to the standard output so the result will be the same.</p> |
|
<p>Now, if we try to list the contents of a directory that does not exist such as with the command</p> |
|
<pre class="inset">$ls notreal</pre> |
|
<p>this will generate an error message. It is quite common to want to redirect error messages into a file for later study but if we use the standard output as in</p> |
|
<pre class="inset">$ls notreal 1> error.txt</pre> |
|
<p>the result will be that we still see the error on the screen. Checking the directory listing again shows that the file, error.txt, has been created but is empty. Bear in mind that if the file already contained some error text, this command would delete it: the command produces no standard output here, so we are essentially writing an empty stream to error.txt and overwriting its contents.</p>
|
<p>For this reason, the overwriting redirection operator, >, is quite dangerous and should be used with care.</p>
|
<p>To put the error message into the error.txt file, we would use the command:</p> |
|
<pre class="inset">$ls notreal 2> error.txt</pre> |
|
<p>or better, unless we definitely want to overwrite the existing file</p> |
|
<pre class="inset">$ls notreal 2>> error.txt</pre>
|
<p>This will have the effect of adding the error text to the end of the error.txt file.</p> |
|
<p>It is also worth noting that we can, if required, combine two redirections. We might, for example, redirect the standard output to one file and the errors to another, or we could redirect both to the same file.</p>
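<p>A minimal sketch of both variants (the directory and file names are made up):</p>

```shell
cd "$(mktemp -d)"
mkdir real

# Separate destinations: the listing goes to one file, the error to another.
ls real notreal 1> output.txt 2> errors.txt

# Same destination: send stdout to a file, then point stderr (2) at
# wherever stdout (1) currently goes, using 2>&1.
ls real notreal > combined.txt 2>&1
```

<p>After the first command, errors.txt holds the complaint about notreal while output.txt holds the listing of real; after the second, combined.txt holds both.</p>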
|
<h2 class="sectiontitle">Exploring Environment Variables and PATH</h2> |
|
<p>There are a number of environment variables which relate to the shell and how it operates (amongst other things) and, where these are specific to a user, they are stored in the .profile file in the user’s home folder.</p>
|
<p>When new software is installed, it is usually placed in a folder already pointed to by the path, but sometimes it may be necessary for a user to add folders to the path.</p> |
|
<p>To add a new folder to the path, it is important to note that PATH is an environment variable that refers to the current path so if we wanted to add /path/to/mytools to the current path, we would add the line</p> |
|
<pre class="inset">PATH="$PATH:/path/to/mytools"</pre>
|
<p>to the profile. Note that not all distros provide this file by default, but you can create it if required. Note also that there must be no spaces around the = in a shell variable assignment.</p>
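<p>As a sketch of the effect, using a temporary directory in place of /path/to/mytools and a made-up script named hello:</p>

```shell
# Stand-in for /path/to/mytools; in practice the PATH line below would
# go in ~/.profile so that it persists across log-ins.
tools="$(mktemp -d)"
printf '#!/bin/sh\necho hello\n' > "$tools/hello"
chmod +x "$tools/hello"

PATH="$PATH:$tools"   # same form as the line added to the profile
export PATH
hello                 # now found via the extended PATH; prints "hello"
```

<p>Appending to $PATH rather than replacing it keeps all the existing search locations intact.</p>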
|
<p>If you want to check the path for an existing file such as ls, you can use the which command as in</p> |
|
<pre class="inset">$which ls</pre> |
|
<p>and this shows that ls is located in the /bin folder.</p> |
|
<h3 class="inset">Additional Shell Scripting Courses</h3> |
|
<p>Further information relating to bash scripting in general can be found in the following courses:</p> |
|
<pre class="inset"> |
|
• <a href="https://www.linkedin.com/learning/linux-bash-shell-and-scripts/welcome">Linux: Bash Shell and Scripts</a> |
|
	• <a href="https://www.linkedin.com/learning/learning-bash-scripting/welcome">Learning Bash Scripting</a></pre>
|
<h2 class="sectiontitle">Challenge: Extract Information from a Text File</h2> |
|
<p>For this challenge, we have a file called log.tar.gz which has been archived and compressed. This contains a text file consisting of log entries relating to attempts to log in to some system. Some of these attempts failed, and the challenge is to investigate failures relating to attempts to log in by guessing the password. We want to create a new text file containing all the usernames used in such failed attempts, listed alphabetically.</p>
|
<p>To begin with, I have created a folder in the Exercise Files called logs and moved the file, log.tar.gz into this new folder.</p> |
|
<p>The next step is to extract the file, auth.log, from the compressed archive and we can do that with</p> |
|
<pre class="inset">$tar xf log.tar.gz</pre> |
|
<p>Next, we want to examine the file in order to determine what our search term is going to be</p> |
|
<pre class="inset">$less auth.log</pre> |
|
<p>Looking at the file, we can see that where a failed log-in like this occurs, the log entry includes the words ‘Invalid user’ or ‘invalid user’, and these come in groups, so we can use the search term ‘invalid user’ to pick out the lines that we want.</p>
|
<p>We can sort these using the -u option to remove duplicate entries and we can write this to a file called invalid_users. To do this, we will use a cat command to output the text and pipe this to grep in order to filter out the lines we don’t want. The output from grep will then be piped to awk so that we can extract the username from each of the relevant lines and then we can pipe this to sort and we will redirect this to the new file so this gives us:</p> |
|
<pre class="inset">$cat auth.log | grep 'invalid user' | awk '{print $9}' | sort -u > invalid_users.txt</pre>
|
<p>Note that the awk command takes advantage of the fact that, in the lines we are interested in, the username is the 9<sup>th</sup> whitespace-separated field of the line (the delimiter in this case is a space). The print command outputs that field for each line, so only the usernames are piped on to the sort command.</p>
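<p>To see how field 9 falls out, here is the extraction applied to a single fabricated line in the same shape as the relevant auth.log entries:</p>

```shell
# A made-up log line in the same shape as the relevant auth.log entries.
line='Jan 10 10:45:31 host sshd[1234]: input_userauth_request: invalid user oracle [preauth]'

# awk splits on whitespace; counting fields, the username is field 9.
echo "$line" | awk '{print $9}'   # prints "oracle"
```

<p>Counting from the left: month, day, time, host and the sshd tag are fields 1 to 5, the request tag is field 6, ‘invalid’ and ‘user’ are 7 and 8, and the username follows as field 9.</p>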
|
<p>We can now inspect the invalid_users.txt file with less and we can see that it does indeed consist of various usernames sorted alphabetically and if we go back to the original file we can confirm that these are associated with failed log in attempts.</p> |
|
</article> |
|
|
|
<div class="btngroup"> |
|
<button class="button" onclick="window.location.href='files.html';"> |
|
Previous Chapter - Files, Folders and Permissions |
|
</button> |
|
<button class="button" onclick="window.location.href='advanced.html';"> |
|
Next Chapter - A Peek at Some More Advanced Topics |
|
</button> |
|
<button class="button" onclick="window.location.href='learninglinuxcommandline.html'"> |
|
Course Contents |
|
</button> |
|
<button class="button" onclick="window.location.href='/linux/linux.html'"> |
|
Linux Page |
|
</button> |
|
<button class="button" onclick="window.location.href='/index.html'"> |
|
Home |
|
</button> |
|
</div> |
|
</body> |
|
</html>
|
|
|