Computer Science 111, Assignment 5
Tutorial on for loops, command-line arguments, and more about characters and strings.
- Preliminaries.
If you have not done so already, create a directory named hw05 inside your homework directory. Then change your present working directory to hw05 and copy into it the example files for Assignment 5, as follows:
    cp ~nixon/cs111/hw05/* .
- Intro to for loops.
Loops of the following form are common:
    int count = someNumber;
    while ( count < someOtherNumber )
    {
        body_statement_1;
        body_statement_2;
        . . .
        body_statement_n;
        count++;
    }
Or, similarly, counting down instead of up:
    int count = someNumber;
    while ( count > someOtherNumber )
    {
        body_statement_1;
        body_statement_2;
        . . .
        body_statement_n;
        count--;
    }
Observe that the above two code segments each contain at least three expressions involving the variable count:
- An initialization of the variable count, before the while loop itself.
- The while loop's condition, which tests the value of the variable count, comparing it to some value which has been determined before the loop begins.
- A statement at the end of the loop which either increments or decrements the value of count.
An example of this pattern is the Assignment 4 example program average1G.cpp.
The above pattern is a form of count-controlled repetition, i.e. a loop controlled by a variable counting from one pre-determined value to another. In order to have count-controlled repetition, it is not necessary that the count variable be incremented/decremented at the end of the loop body. It could be incremented/decremented anywhere in the loop body. But it's most common to increment/decrement it at the end.
Because the count variable (whose name isn't necessarily count) is used to control the loop, it is sometimes referred to as the loop control variable.
Because this pattern is so common, C++ has a special loop, known as a for loop, whose purpose is to make this pattern more readable. A for loop is exactly like a while loop except that the three key expressions involving the loop control variable are all gathered onto one line, so that the reader can tell at a glance what is happening with that variable.
Using a for loop, the counting-up version of our pattern can be re-written:
    for ( int count = someNumber; count < someOtherNumber; count++ )
    {
        body_statement_1;
        body_statement_2;
        . . .
        body_statement_n;
    }
And the counting-down version can be re-written:
    for ( int count = someNumber; count > someOtherNumber; count-- )
    {
        body_statement_1;
        body_statement_2;
        . . .
        body_statement_n;
    }
A for loop consists of the reserved word for, followed by a pair of parentheses containing three expressions separated by semicolons, followed by a dependent statement or block after the right parenthesis. The dependent statement or block is known as the body of the loop.
The three expressions inside the parentheses of a for loop heading must be: (1) a statement, (2) a condition, and (3) another statement. The various parts of the for loop are executed in the following order:
1. First statement between the parentheses. (Executed only once.)
2. Testing of condition. If false, then exit loop. If true, continue as follows:
3. Loop body.
4. Last statement between the parentheses.
5. Testing of condition again. If false, then exit loop. If true, repeat steps 3, 4, and 5 until condition is false.
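The order of execution listed above can be made concrete by writing the same count-controlled loop twice, once as a for loop and once as the while loop it abbreviates. The following is only a minimal sketch, not one of the course's example files. The function names sumWithFor and sumWithWhile are made up for this illustration, and the comments in the while version mark where each part of the for loop heading is executed.

```cpp
// Adds up the integers from first up to (but not including) limit,
// using a for loop.
int sumWithFor(int first, int limit)
{
    int sum = 0;
    for ( int count = first; count < limit; count++ )
    {
        sum += count;
    }
    return sum;
}

// The same computation written as a while loop.
int sumWithWhile(int first, int limit)
{
    int sum = 0;
    int count = first;          // first statement in the heading (runs once)
    while ( count < limit )     // condition, tested before every iteration
    {
        sum += count;           // loop body
        count++;                // last statement in the heading
    }
    return sum;
}
```

Both functions return the same value for the same arguments. When first is not less than limit, the condition is false the first time it is tested, so the loop body never runs and the sum stays 0.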
The first expression between the parentheses of the for loop heading is a statement which is executed only once, before the loop body is executed the first time, and before the condition is tested. Syntactically, the first expression in the parentheses is allowed to be any kind of program statement whatsoever. But, customarily, this statement initializes a variable which will then be used to count how many times the loop body is executed. Such a variable is known as a count variable or a loop control variable.
The third expression in the parentheses is a statement which is executed not just once, but every time the loop body is executed, after the loop body is executed. Syntactically, this statement too may be any kind of program statement whatsoever. But it is usually used to increase the loop control variable by one, thereby counting repetitions of the loop body. A statement which increases a variable by one is known as an increment.
The second expression in the parentheses is a condition, an expression of type bool. It is tested immediately AFTER the first statement in the parentheses has been executed, but BEFORE the loop body is executed the first time. If the condition is false, the loop body is not executed at all. If the condition is true, the loop body is executed. Then the last expression in the parentheses is executed, and then the condition, the second expression in the parentheses, is tested again. If the condition is false this time, execution of the loop quits, and program execution resumes after the end of the loop body. If the condition is true, the loop body is executed again, and then the third expression in the parentheses is executed again, and then the condition is tested again. As before, if the condition is false, execution of the loop quits, and if the condition is true, the loop body is repeated again. And so on, until the condition eventually becomes false.
Syntactically, the condition may be any expression of type bool (which actually means it may be any expression with a numeric value, because numeric values can convert automatically to type bool without need for a cast). But it is customarily used to test the value of the loop control variable.
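As a small illustration of a numeric condition, the sketch below uses the loop control variable itself as the condition, so the loop stops when the variable reaches 0 (which converts to false). The function name countIterations is made up for this example, and the sketch assumes that start is not negative.

```cpp
// Returns how many times the loop body runs when counting down from start.
// Assumes start >= 0.
int countIterations(int start)
{
    int iterations = 0;
    // The condition is just n: any non-zero value counts as true,
    // and 0 counts as false.
    for ( int n = start; n; n-- )
    {
        iterations++;
    }
    return iterations;
}
```

Most programmers would still write the condition explicitly, as n != 0 or n > 0, because that is easier to read.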
Whatever the condition is, the programmer must take care to write the program so that the condition does eventually become false. Otherwise, the loop will repeat forever. For example, the condition in the following for loop heading will never become false:
    for ( int i = 0; i >= 0; i = i + 1 )
A loop whose condition is always true is known as an infinite loop.
So that the for loop heading will all fit on one line, it is common to use a variable with a single-letter name (e.g. i in the example above) as the loop control variable. Usually, single-letter variable names are considered bad programming practice. We want most variables to have names which clearly indicate the variable's purpose. However, loop control variables are an exception to this rule. Many programmers do consider it OK to use single-letter variable names for loop control variables. The letters most often used for this purpose are i, j, k, m, and n.
Consider again the Assignment 4 example program average1G.cpp, which averages a sequence of integers entered by the user. Below is a for loop equivalent to the while loop in average1G.cpp:
    for ( int i = 0; i < numberOfNumbers; i = i + 1 )
    {
        int number;
        cin >> number;
        sum += number;
    }
For a complete program containing this for loop, see average6.cpp.
In this particular example, observe that the loop control variable i is NOT used anywhere inside the body of the loop. Here, the loop control variable's sole purpose is to count iterations (repetitions) of the loop.
Though in this example it isn't, the loop control variable IS allowed to be used inside the loop body as well. But, if you do use the loop control variable inside the body of a for loop, you should be careful that its value does not get CHANGED inside the loop body.
Thus the following for loop is perfectly fine programming practice:
    for ( int i = someNumber; i < someOtherNumber; i++ )
    {
        anotherVariable += i;
        yetAnotherVariable *= i;
    }
On the other hand, the following for loop would be very poor programming practice:
    for ( int i = someNumber; i < someOtherNumber; i++ )
    {
        i += anotherVariable;
        yetAnotherVariable *= i--;
    }
In this latest example, it would have been much better to use a while loop instead of a for loop. The for loop heading does NOT enhance the readability of this loop. On the contrary, the for loop heading is very misleading, making it appear at first glance that i counts from someNumber up to one less than someOtherNumber, when in fact the way that the value of i is changed during each iteration is quite different.
- Code traces with for loops.
On your forbin account, make sure that your present working directory (pwd) is your hw05 directory, inside your homework directory. (Change directories as necessary.) Then copy into it LineNumberer.class, as follows:
    cp ~nixon/java/LineNumberer.class .
Then use it to generate a line-numbered copy of average6.cpp, as follows:
    java LineNumberer average6.cpp 2
thereby generating the file average6.cpp.txt, a line-numbered copy of average6.cpp.
Download the file average6.cpp.txt, print it out, and compare it with these sample code traces.
Recall that average6.cpp exemplifies count-controlled repetition. However, it is not a very good use of count-controlled repetition. It requires the user to enter the number of numbers before entering the numbers to be averaged. As we saw in the Assignment 4 Tutorial on kinds of repetition, text files, input error states, and break statements, a much better version of this program would NOT require the user to enter the number of numbers. Thus, a much better version is average4A.cpp, which does NOT use count-controlled repetition, but instead uses a form of event-controlled repetition testing the error state of the input stream. Note that average4A.cpp uses a while loop, and is indeed most appropriately written using a while loop rather than a for loop.
The remaining examples on this page will all involve more appropriate uses of for loops.
- Characters and the increment and decrement operators.
Compile alphabetFor.cpp and run it. This program displays the alphabet in lower-case letters.
Consider now what it means to add 1 to a character. For example, in the for loop heading in alphabetFor.cpp:
    for ( char letter = 'a'; letter <= 'z'; letter++ )
Here, the postincrement operator (++) is used to add 1 to letter. Adding 1 to a character means adding 1 to its ASCII code value, resulting in the ASCII code of the next character in the ASCII chart -- which, in the case of a letter, means the next letter in the alphabet. For example, adding 1 to the ASCII code of 'a' yields the ASCII code of 'b'.
Similarly, subtracting 1 from a letter character yields the ASCII code of the previous letter in the alphabet, as exemplified in the program reverseAlphabetFor.cpp.
Note that these programs are examples of count-controlled repetition, because, in each case, the loop counts (up or down) from one pre-determined value to another.
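The two loops described in this section can be sketched as follows. This is not the actual source of alphabetFor.cpp or reverseAlphabetFor.cpp. The function names are made up for illustration, the letters are collected into a string instead of being displayed, and the sketch assumes a system that uses ASCII, where the lower-case letters have consecutive codes.

```cpp
#include <string>

// Returns the lower-case alphabet, built one letter at a time by counting up.
std::string alphabetForward()
{
    std::string result;
    for ( char letter = 'a'; letter <= 'z'; letter++ )
    {
        result += letter;
    }
    return result;
}

// Returns the lower-case alphabet in reverse, built by counting down.
std::string alphabetBackward()
{
    std::string result;
    for ( char letter = 'z'; letter >= 'a'; letter-- )
    {
        result += letter;
    }
    return result;
}
```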
- The digit characters.
Compile digitASCII.cpp and run it. This program displays a table of the digit characters and their ASCII values, i.e. the numeric values of their ASCII codes. Note that the numeric value of a digit's ASCII code is not the same as the intended numeric value of the digit itself. For example, the ASCII value of the digit '1' is 49, not 1. Fortunately, consecutive digit characters do have consecutive ASCII codes.
Look now at the source code of digitASCII.cpp. It has the following for loop:
    for ( char digitChar = '0'; digitChar <= '9'; digitChar++ )
        cout << " " << digitChar << " " << (int) digitChar << endl;
Each iteration of the loop displays both the character digitChar and the numeric value of its ASCII code. The numeric value of its ASCII code is obtained by casting digitChar to type int.
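Because consecutive digit characters have consecutive codes, the number a digit stands for can be found by subtracting the code of '0' from the code of the digit. The sketch below shows this next to the cast described above. The function names asciiValue and digitValue are made up for this example and do not appear in digitASCII.cpp.

```cpp
// Returns the numeric value of a character's code,
// e.g. 49 for '1' on a system that uses ASCII.
int asciiValue(char c)
{
    return (int) c;
}

// Returns the number that a digit character stands for, e.g. 1 for '1'.
// Assumes digitChar is one of '0' through '9'.
int digitValue(char digitChar)
{
    return digitChar - '0';
}
```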
- Accessing all the characters in a string.
Compile charsInString1.cpp and then run it. The program will prompt you to enter a string containing between 1 and 15 characters. Enter the requested string. It may contain capital letters, lower-case letters, digits, punctuation marks, but not spaces or tabs. (See what happens if you try to enter a string containing spaces or tabs.)
The program will then display a table listing all the characters in the string you typed, together with their ASCII values. Following is a sample output:
    Enter a string between 1 and 15 characters long:
    >012xyzXYZ~;#"&/
    Position   Character   ASCII value
    --------   ---------   -----------
        0          0            48
        1          1            49
        2          2            50
        3          x           120
        4          y           121
        5          z           122
        6          X            88
        7          Y            89
        8          Z            90
        9          ~           126
       10          ;            59
       11          #            35
       12          "            34
       13          &            38
       14          /            47
In the column to the left of the list of typed characters, there is a number indicating each character's position within the string. Observe that the first character is at position 0, the second character is at position 1, until, finally, if you entered 15 characters, the last character is at position 14.
The program then error-checks the number of characters in the string and quits if the string was too long.
    if ( word.length() > 15 )
    {
        cout << "You entered a string of length " << word.length()
             << ", should be between 1 and 15." << endl;
        return 1;
    } // if
Recall that the length function of class string returns the number of characters in the string object for which the length function is called.
If the user did enter a string as instructed, the table is displayed. The body of the table (except for headings) is generated by the following for loop:
    for ( int i = 0; i < word.length(); i++ )
    {
        // Display character's position in string word:
        cout << setw(5) << i;
        // Display the character itself:
        cout << " " << word[i];
        // Display the character's ASCII value
        cout << setw(14) << (int) word[i] << endl;
    } // for i
In the sample output, observe that both the leftmost column and the rightmost column contain integers with varying widths, i.e. varying numbers of digits. So that these numbers will be neatly lined up in right-justified columns, we use the setw manipulator.
Each row of the table deals with a different individual character within the string object word. The expression word[i] denotes the character at position i within the string object word.
We have seen previously that the characters in a string can be accessed using square bracket notation, e.g. word[0], word[1], and word[2]. The expression between the square brackets can be any expression whose value is of an integer data type. Thus, for example, it can be an int variable, as it is in this program.
After printing out the character itself, our program then prints out the numeric value of the character's ASCII code. To obtain the numeric value of the ASCII code for character word[i], we cast word[i] to type int, as we do in the following statement:
    cout << setw(14) << (int) word[i] << endl;
We need the loop to iterate (repeat) once for each character in the string. To determine how many characters there are in the string, we use the length function of class string. In this example, the for loop is controlled as follows:
    for ( int i = 0; i < word.length(); i++ )
Just before the for loop begins its first iteration, the int variable i is initialized to 0. The for loop will iterate (repeat) for as long as i is less than word.length(). Thus, the last value of i for which an iteration will occur is one less than the length of the string, which is what we need, because the positions within the string are numbered from 0 up to one less than the length of the string.
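The same loop heading can be used for any task that examines every character of a string. As a minimal sketch (the function name countCapitals is made up, and this code is not part of charsInString1.cpp), the following counts the capital letters in a string. It declares i as std::size_t, an unsigned integer type, because some compilers warn when an int is compared with the unsigned value returned by length(). Declaring i as an int, as in the tutorial's loop, behaves the same way for strings of ordinary length.

```cpp
#include <cstddef>
#include <string>

// Counts the capital letters in word by examining word[0], word[1], and so on,
// up to the last character, word[word.length() - 1].
// Assumes an ASCII system, where 'A' through 'Z' have consecutive codes.
int countCapitals(std::string word)
{
    int capitals = 0;
    for ( std::size_t i = 0; i < word.length(); i++ )
    {
        if ( word[i] >= 'A' && word[i] <= 'Z' )
        {
            capitals++;
        }
    }
    return capitals;
}
```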
- The getline function.
Now compile charsInString2.cpp and then run it. This program is very similar to charsInString1.cpp except that, in charsInString2.cpp, the string CAN include spaces and tabs.
Look now at the source code of both programs.
In charsInString1.cpp, one of the variables is a string class object named word, which is used to hold the string typed by the user. The string is input as follows:
    cin >> word;
In charsInString2.cpp, a string variable named line plays exactly the same role as the variable word in charsInString1.cpp, except that, in charsInString2.cpp, the string is input using a call to the getline function, rather than cin's extraction operator (">>"):
    getline( cin, line );
so that, if the user types any spaces or tabs, the string line will contain them. Otherwise, if the extraction operator were used, cin would extract only the portion of the string up to the first space or tab.
The getline function reads a line of text, including the end-of-line marker ('\n' on a Unix system) which was generated when the user pressed the Enter key at the end. However, the end-of-line marker is NOT part of the string whose contents have been set by the getline function. The getline function consumes the end-of-line marker from the input but does not store it in the string.
Earlier, we saw a version of the getline function used with C-strings. We are now looking at another version of getline used with string class objects.
The version of getline used with C-strings was a function of class istream (the class of cin, and the superclass of class ifstream). It was called as follows (where line1 and line2 are C-strings):
    cin.getline(line1, 81);
    infile.getline(line2, 41);    // infile is an ifstream object
On the other hand, the version of getline used with string class objects is not a function of either class string or class istream, i.e. it is not called using a dot after either line or cin. Rather, getline is just an ordinary function, declared in the header file <string>. It takes two arguments, one of type istream and one of type string. Thus it can be called as follows, where line3 and line4 are string class objects:
    getline(cin, line3);
    getline(infile, line4);    // infile is an ifstream object
In charsInString2.cpp, if the user presses the Enter key without typing any characters first, the program's response will differ depending on whether the getline function or the extraction operator is being used for string input. If the getline function is used, the program will read a line of text consisting of nothing but the end-of-line marker. The getline function then removes the end-of-line marker, resulting in an empty string, a string of length 0, containing no characters. On the other hand, if the extraction operator is used, the program will not respond at all to the Enter key, and will continue doing nothing until the user finally does enter a non-whitespace character. Thus, if the extraction operator is used, it is not possible to enter an empty string.
So, when the program error-checks the number of characters in the string, there are two error conditions to be checked in charsInString2.cpp, in which the string line was input via the getline function:
    if ( line.length() == 0 || line.length() > 15 )
An error message is printed out if either (1) the user entered an empty string, i.e. the user did not type any characters but just pressed [Enter], or (2) the entered string contained more than 15 characters.
On the other hand, in charsInString1.cpp, only one error condition needs to be checked, because the string word was input via cin, and thus an empty string is not possible:
    if ( word.length() > 15 )
Note that the getline function changes the contents of a string variable which has been passed to it as an argument. In this regard, it is similar to the get function of cin, which changes the value of a char variable that has been passed to it as an argument. In this respect, both these functions are different from the <cmath> library functions, which do NOT change the values that are passed to them as arguments.
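The difference between the two ways of reading a string can be checked without typing anything. The sketch below reads from an istringstream (declared in <sstream>), which behaves like an input stream but takes its characters from a string. That class is not covered in this tutorial and is used here only as a stand-in for cin. All three function names are made up for this example.

```cpp
#include <sstream>
#include <string>

// Reads from typedText with the extraction operator, as cin >> word would.
std::string readWithExtraction(std::string typedText)
{
    std::istringstream input(typedText);   // stands in for cin
    std::string word;
    input >> word;                         // stops at the first space or tab
    return word;
}

// Reads from typedText with getline, as getline(cin, line) would.
std::string readWithGetline(std::string typedText)
{
    std::istringstream input(typedText);   // stands in for cin
    std::string line;
    std::getline(input, line);             // keeps spaces, drops the '\n'
    return line;
}

// The length check needed when the string was read with getline:
// the string must be neither empty nor longer than 15 characters.
bool hasValidLength(std::string line)
{
    return !( line.length() == 0 || line.length() > 15 );
}
```

Given the text "two words" followed by an end-of-line marker, the first function returns only "two", while the second returns "two words" without the end-of-line marker.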
- for loops with C-strings.
As we have seen, for loops are usually used with count-controlled repetition. However, they can appropriately be used with some other kinds of loops for which they enhance readability. For example, one could use a for loop to access all the characters in a C-string text, as follows:
    for ( int i = 0; text[i] != '\0'; i++ )
    {
        // do whatever with text[i],
        // WITHOUT changing the value of i
        // anywhere in the body of this loop.
    }
This is sentinel-controlled repetition, not count-controlled repetition.
However, note that all three items in the for loop heading -- the initialization, the condition, and the increment -- involve the variable i. Thus, using a for loop does enhance the readability of this loop, provided that the value of i is not changed anywhere in the loop except in the heading.
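As one possible way of filling in the body of such a loop, the sketch below counts how many times a given character occurs in a C-string. The function name countChar and its parameter names are made up for this illustration.

```cpp
// Counts how many times target occurs in the C-string text.
// The loop stops when it reaches the '\0' sentinel that ends every C-string.
int countChar(const char text[], char target)
{
    int occurrences = 0;
    for ( int i = 0; text[i] != '\0'; i++ )
    {
        if ( text[i] == target )   // i is read here, but never changed
        {
            occurrences++;
        }
    }
    return occurrences;
}
```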
- Command-line arguments.
So far, you've learned about two kinds of input: Interactive input (using cin) and text file input. You will now learn about another kind of input: command line arguments. This topic is NOT covered in the textbook, so a complete tutorial will be provided here.
Compile and run commandLineDemo1.cpp. When you run it, don't just type a.out, but, instead, type:
    a.out firstname middlename lastname
using your actual first, middle, and last names in place of "firstname," "middlename," and "lastname." The output will look something like this:
    Hello, Dorothy L. Nixon. Thank you for using a.out.
Then try changing the name of the file a.out to demoargv as follows:
    mv a.out demoargv
where mv means "move." (The Unix mv command is used to move files from one directory to another, and also to change the name of a file, as you have done here.) Then run the program again, typing the file's new name demoargv instead of a.out:
    demoargv firstname middlename lastname
again using your actual first, middle, and last names in place of "firstname," "middlename," and "lastname." The output will then look something like this:
    Hello, Dorothy L. Nixon. Thank you for using demoargv.
Observe that the program knows the name of its own executable file, as well as being able to repeat your name.
Look now at the source code of commandLineDemo1.cpp, and examine the following statements:
    cout << "Hello, " << argv[1] << " " << argv[2] << " " << argv[3] << ". ";
    cout << "Thank you for using " << argv[0] << "." << endl;
The expressions argv[0], argv[1], argv[2], and argv[3] are all variables whose type is C-string (char[] or char*). They are a special sequence of C-string variables that are automatically given their values when the program is run. They are known as the command-line arguments, and their values are the strings you typed when you ran the program, including both the filename of the executable file itself and anything else you typed on the same line after the filename, where argv[0] is the filename, argv[1] is the first string you typed after the filename (separated from the filename by a space), argv[2] is the second string you typed after the filename, and so on, for however many command-line arguments you typed. When you type command-line arguments, they must be all on one line, separated from each other by spaces.
A numbered sequence of variables, such as the argv sequence, is known as an array. The number inside the square brackets, which distinguishes an individual variable inside the array, is known as an index. The individual numbered (indexed) variables inside an array are known as array elements. The array name itself without the square brackets, in this case argv, is also a variable; it refers to the entire array rather than to the individual elements within the array. You will learn more about arrays later in this course and in your next programming course, Computer Science 211. (For now, don't worry about how the array of command-line arguments was created, or how to use arrays other than C-strings and the array of command-line arguments.) Note that the command-line argument array is an array of C-strings, which are themselves arrays of characters. Thus, the command-line argument array is an array of arrays of characters.
Note also that, when you use command-line arguments, the main function now must have a different heading from what we used in earlier examples:
    int main(int argc, char** argv)
Don't worry for now about what this heading means. You'll learn its significance later. For now, simply note that the main function must have the above heading when you use command-line arguments.
Now list the files in your hw05 directory by typing:
    ls -l
to see the detailed (long) listing. Observe that demoargv is a huge file compared to all the others. Why is it so huge? Because it is an executable (machine code) file which contains not only a machine code translation of your own source code file, but also a lot of other relevant code from the C++ library, e.g. the definition of class ostream (of which cout is an object). To avoid exceeding your disk quota by accumulating too many machine code files in your directory, you should remove all such files when you are finished using them. Remove demoargv by typing:
    rm demoargv
- Using command-line arguments to input values of type int.
The command-line arguments can be used to input C-strings only. You cannot use them to input values of numeric data types directly. For example, the compiler will NOT allow the following:
    int someNumber = argv[1];
The C-string representation of an int value is stored very differently from the way the int value itself is stored. For example, the C-string "12345" and the int value 12345 are stored in different ways.
An int value is stored as a single binary number, taking up 32 bits on most Unix systems, whereas a C-string consists of a sequence of individual characters, each of which is stored as an 8-bit binary numeric code. When the C-string happens to be a string representing an integer, e.g. "12345", it is still stored as a sequence of the characters representing the digits. So, to input integers via command-line arguments, we must take the C-strings and somehow convert them to values of integer data types, e.g. of type int.
Converting strings to numeric values is something you didn't have to worry about when using interactive input (cin) or text data file input. That's because class istream (of which cin is an object) and class ifstream (objects of which manage input from a text data file) are both defined, in the C++ library, to take care of that detail for you. However, when using command-line arguments, you will have to take care of it yourself.
Converting strings to numeric data type values CANNOT be done by casting. The following will NOT convert a C-string representation of an integer to the represented integer:
    int someNumber = (int) argv[1];
Casting can be used to convert a simple data type value to a value of another simple data type, but CANNOT be used to convert an array or object to a simple data type value.
Instead, you must call a function to generate the int value represented by a C-string. There is a standard C++ library function which does this, atoi, declared in library header file <cstdlib>. It has the following prototype:
    int atoi(char a[])
For an example of a program which inputs integers using command-line arguments, compile and run average7.cpp.
When you run the program, be sure to type a sequence of integers after a.out on the same line. (Otherwise, you will get a runtime error, and probably a core dump file, which you will have to remove.) Just enter the numbers to be averaged. It is not necessary to enter the number of numbers too, because, as we will see soon, the program can easily find out how many command-line arguments there are, and thus can easily determine the number of numbers without the user telling it that number explicitly.
Look now at the source code of average7.cpp. First, look at the main function heading:
    int main(int argc, char** argv)
Variables declared between the parentheses of a function heading are called parameters. A function's parameters are automatically given values as the first thing that happens when a function is called, before the body of the function is executed. In the case of the main function, the parameters argc and argv are given their values automatically as the first thing that happens when the program is run.
As we have seen, argv is the array of command-line arguments. The other parameter, argc, is the number of command-line arguments.
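To see how argc and argv work together, the sketch below walks through the whole array of command-line arguments with a for loop, using argc as the limit. The function name listArguments is made up for this example. A main function with the heading shown above could pass its own argc and argv straight to it and display the result with cout.

```cpp
#include <string>

// Builds one line of text per command-line argument, in the form
// "index: argument", for every index from 0 up to argc - 1.
std::string listArguments(int argc, char** argv)
{
    std::string listing;
    for ( int i = 0; i < argc; i++ )
    {
        listing += std::to_string(i);   // the index, converted to text
        listing += ": ";
        listing += argv[i];             // the argument itself, a C-string
        listing += "\n";
    }
    return listing;
}
```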
As we have seen, argv[0] holds the filename of the executable file. Hence the strings representing the integers to be averaged are in argv[1], argv[2], and so on, up to argv[argc - 1]. Thus the total number of command-line arguments, argc, is one greater than the number of integers to be averaged. So, our program determines the number of integers to be averaged as follows:
    int numberOfNumbers = argc - 1;
Then the int value represented by each of the command-line arguments from argv[1] to argv[numberOfNumbers] is obtained as follows:
    int number = atoi(argv[i]);
where i is the loop control variable of a for loop with the following heading:
    for ( int i = 1; i <= numberOfNumbers; i++ )
We have seen that the atoi function gives us the value of the integer represented by a C-string. For floating-point numbers, we can use the following function to give us the double value represented by a C-string:
    double atof(char a[])
Also, for very large integers, we can use the following function to give us the long value represented by a C-string:
    long atol(char a[])
Alas, these functions do NOT provide a means of detecting errors. If you give them a string which does not represent a number of the appropriate kind, these functions will give you a nonsensical result.
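The pieces of this section can be put together as in the following sketch. It is not the actual source of average7.cpp. The function name averageOfArguments is made up, and the decision to return 0.0 when no numbers were typed is an assumption made here to avoid dividing by zero.

```cpp
#include <cstdlib>

// Averages the integers given as command-line arguments
// argv[1] through argv[argc - 1].
// Returns 0.0 when no numbers were typed (an assumption of this sketch).
double averageOfArguments(int argc, char** argv)
{
    int numberOfNumbers = argc - 1;     // argv[0] is the program's filename
    if ( numberOfNumbers < 1 )
    {
        return 0.0;
    }
    int sum = 0;
    for ( int i = 1; i <= numberOfNumbers; i++ )
    {
        int number = std::atoi(argv[i]);   // C-string converted to int
        sum += number;
    }
    return (double) sum / numberOfNumbers;
}
```

A main function with the heading int main(int argc, char** argv) could simply pass its two parameters to this function and display the value returned.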
There is another library function which can give us the numbers represented by C-strings and which also does detect non-numeric strings. But that function (sscanf, whose prototype is in library header file <cstdio>) is more complicated to use, so we will not teach it to you now.
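For readers who are curious, here is a sketch of one way to detect non-numeric strings. It does not use sscanf. It uses a different standard library function, strtol (declared in <cstdlib>), which reports where in the string the conversion stopped. The function name convertToInt is made up for this example, and the sketch does not check whether the number is too large to fit in an int.

```cpp
#include <cstdlib>

// Tries to convert the C-string text to an int value.
// Returns true and stores the value in result if the whole string
// represents an integer; returns false and leaves result unchanged otherwise.
bool convertToInt(const char text[], int& result)
{
    char* end;                                  // strtol sets this to where it stopped
    long value = std::strtol(text, &end, 10);   // 10 means decimal notation
    if ( end == text || *end != '\0' )          // no digits at all, or junk after them
    {
        return false;
    }
    result = (int) value;
    return true;
}
```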
- Filenames as command-line arguments.
Compile average8.cpp. Then run it by typing:
    a.out numbers.txt
thereby telling the program to take its input data from the file numbers.txt. You could also run it as follows:
    a.out numbers2.txt
or
    a.out FileWithAVeryLongName.txt
and it would take its input data from the specified file. Thus, the name of an input data file can be obtained as a command-line argument. Within the program, the file is opened as follows:
    ifstream inputFile;
    inputFile.open(argv[1]);
In the real world, it is common for filenames of a program's data files to be specified as command-line arguments. Examples include the Unix commands mv, cp, and rm. These Unix commands are all programs which take filenames as command-line arguments.
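The following sketch shows how a filename received in this way might be used. It is not the actual source of average8.cpp. The function name averageFromFile is made up, and reporting failure by returning false is an assumption of this example. In a complete program, main would pass argv[1] as the filename.

```cpp
#include <fstream>

// Opens the named text file and averages the integers in it.
// Returns true and stores the result in average on success;
// returns false if the file cannot be opened or contains no integers.
bool averageFromFile(const char filename[], double& average)
{
    std::ifstream inputFile;
    inputFile.open(filename);        // filename would be argv[1] in main
    if ( !inputFile )
    {
        return false;
    }
    int sum = 0;
    int count = 0;
    int number;
    while ( inputFile >> number )    // event-controlled: stops when input fails
    {
        sum += number;
        count++;
    }
    if ( count == 0 )
    {
        return false;
    }
    average = (double) sum / count;
    return true;
}
```

The loop that reads the file is a while loop rather than a for loop, because the program does not know in advance how many numbers the file contains.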