.. _more-files:

Reading strings from a file
###########################

.. include:: /references.inc

Well, the previous lecture implied that we might be able to read in an entire
line in one shot with something like:

.. literalinclude:: code/main.cpp

The problem is that the input works like it did for numbers, one "chunk"
(everything up to the next white space) at a time, not the entire line. Here is
a sample data file:

.. literalinclude:: code/test.data

And here is the output we get:

.. literalinclude:: code/sample.out

Clearly, this will help in your lab project, but what if we really do want to
read the entire line in? Doing this involves using a different method:

.. literalinclude:: code/main1.cpp

And running this gives us this output:

.. literalinclude:: code/sample2.out

Now we have a full line of text, and processing this one is more difficult.
(Guess which approach I would recommend that you use for your lab project!)

Writing small test programs like this is a very good way to test code snippets
and make sure your logic is working correctly, before adding something to your
project.

Let's take the first version of this code and do something we probably should
have been doing all along. When we write a new function, we really ought to
bolt that new function into a `test fixture` and feed it a bunch of data to
make sure we are happy with the result. Since we are studying files, I will
show how to build up a testing routine that can exercise a function we have
constructed and want to add to our project. To make this more interesting, we
will build the test fixture code in a separate file and have the linker hook in
our test function.

Testing a new function
**********************

Well, to do this, we need a function to test. I will keep this moderately
simple!

Suppose we are processing a bunch of text, maybe something like my course note
files. My files are a set of lines, each of which may have one or more
`acronyms` in it somewhere. Each acronym will be marked off by surrounding it
with vertical bars.
The test line might look like this:

.. code-block:: text

    I am currently working at |ACC| as a Professor of |CS|.

My function is going to help me expand the acronyms I have marked into the set
of words associated with each one. In this example, what I really want to have
in my notes is this:

.. code-block:: text

    I am currently working at Austin Community College as a Professor of Computer Science.

All I have done here is expand the acronym using the text I really want.

I need some test data for this exercise, so I `Googled` "computer acronyms" and
came up with this set of terms and phrases that could be substituted whenever
the marked up term is found. This is a sample extract from that search:

.. literalinclude:: code/acronyms.data

.. note::

    OK, so I added a few entries that were not really in this list!

The substitution function
*************************

Here is the prototype for my new function:

.. code-block:: cpp

    string term_expander(string term);

The function takes in an acronym (with or without the vertical bars, to make it
more friendly) and returns a string with the expanded words we should replace
this acronym with! To make this function more useful, we will allow the acronym
to be given in any case the user wants, so our term could have been any of
these:

* ``|ACC|``
* ``|acc|``
* ``|Acc|``
* ``|AcC|``

If the acronym is unknown, the function should return the acronym without the
vertical bar markers in all capital letters.

Phew, sounds hard!

Testing the function
====================

Here is an odd idea, but one that is a huge part of the programming landscape
out there now! Let's write a test program that will exercise our new function
and tell us if it is working correctly for a number of test cases. The test
cases will include both correct and incorrect data, and we will define exactly
what we want the function to report for each test we set up.
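
Two of the rules given for ``term_expander`` (the bars are optional, and the
case of the letters does not matter) are easy to pin down on their own before
any testing starts. Here is a minimal sketch of one way to boil all of those
spellings down to a single form. The helper name ``normalize_term`` is made up
for this illustration; it is not part of the course code.

.. code-block:: cpp

    #include <cctype>
    #include <string>

    // Hypothetical helper (not from the course files): reduce a term such as
    // "|Acc|" or "acc" to one canonical spelling such as "ACC".
    std::string normalize_term(const std::string& term) {
        std::string key;
        for (std::string::size_type i = 0; i < term.size(); i++) {
            if (term[i] == '|') {
                continue;  // drop the vertical bar markers
            }
            // toupper expects an unsigned char value, so convert first
            unsigned char ch = static_cast<unsigned char>(term[i]);
            key += static_cast<char>(std::toupper(ch));
        }
        return key;
    }

With a helper like this, ``|ACC|``, ``|acc|`` and ``AcC`` all come out as
``ACC``, so the rest of the function only has to deal with one spelling.
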
The test program is not supposed to solve the problem; it is supposed to
exercise the test function and make sure it is doing the right thing. It is
also supposed to test what the program does with bad input data. For this
example, if the term we hand our function is not in the list of expansions, we
will simply strip off the vertical bar characters and return the term exactly
as it was given by the caller.

.. note::

    The model for this problem is actually the tool I use in writing my lecture
    notes. It has this feature, and it saves me a bunch of typing!

Step 1 - reading a test file
============================

We start this exercise off by creating a program that reads a set of tests from
a file. This pattern should look familiar:

.. literalinclude:: code/test1.cpp

A sample test set is stored in the file ``test_set.data``. The pattern in that
file is simple: each line contains a term, marked up as needed, followed by the
text that the function should return.

For our first version of this program, the test code simply reads the file and
prints it out, "baby step" style!

.. literalinclude:: code/test1.out

Now, we need to add code to hook in the actual function we want to test. In
actual software development, we would write a ``header file`` for our function
that looks like this:

.. literalinclude:: code/term_expander.h

This file is named ``term_expander.h``, and the actual function will be stored
in a file named ``term_expander.cpp``.

That funny notation that surrounds the actual prototype is a pattern (called an
include guard) that C++ programmers use everywhere. It keeps the compiler from
accidentally trying to include the same file more than once. Include files can
include other include files and sometimes you get into a loop. The
``conditional`` lines stop this from happening. You will learn more about all
this in another C++ course. For now, just use the pattern.

Note that the name you type in those lines is the header file name in all
capital letters, with dots replaced by underscores (so ``term_expander.h``
becomes ``TERM_EXPANDER_H``).
(This is what programmers do, so we will too!)

Here is our start on the function. It is clearly wrong, but it will allow our
test program to work!

.. literalinclude:: code/term_expander.cpp

Of course, our function is far from complete!

At this point, I need to make sure Dev-C++ has all the files added in the
project. The :term:`IDE` will then compile each file and link them together
into the test program.

Step 2 - Making the test work
=============================

Our test program is reading our test set file, but not really processing it. We
need to break each line up into two parts: the term and the set of words we
want to see when the function works. The term is the first thing on the line,
and ends when we see white space. The set of words is everything from the next
non-white space character after the term to the end of the line.

Here is code that breaks up each line into these two parts:

.. literalinclude:: code/test3.cpp

WOW! That ``split_line`` function I added here looks very complicated.
Actually, it is not that bad. All we are doing is using another simple C++
library function called ``isalpha`` which returns ``true`` if the character you
hand it is a letter (alphabetic character). It returns ``false`` otherwise.

I want to copy all of the letters and the vertical bar characters into the
``term`` variable until I find a character that is not a letter or a bar. At
that point, I have found the term, and I need to skip the rest of the white
space until I find another letter. From that point to the end of the line, I
just need to copy the text into the ``substitution`` variable.

What looks strange in this code is the fact that I left off the initialization
code in the second and third ``for`` loops in this routine. Why? Well, I am
using the variable ``i`` to count my way across the entire line. After the
first loop, ``i`` is pointing to the first white space character in the line. I
do not need to change that value, so I simply do nothing as the second loop
starts up.
I will let it run until I find a non-white space character, then use ``break``
to stop this loop. At that point, ``i`` is pointing to the first letter of the
substitution, which runs to the end of the line. The last ``for`` loop just
copies everything left in the line into the ``substitution`` variable!

See, that was not so bad.

.. note::

    Once again, as you see and study code fragments like this, you will store
    this pattern away in your mind for later use in other programs. The more
    code you examine, the better you will become. What you are really doing is
    training your brain to think things through at a fine level of detail, and
    thinking hard to make sure this all works.

OMG! I am testing my test code!

The output now shows that I am breaking up my test set lines as needed:

.. literalinclude:: code/test3.out

Step 3 - Making the test report what it finds
=============================================

Now, I am going to make this test code report success or failure! The idea is
to call the function being tested and get its return value. We want to compare
what we get to the substitution string our ``test_set.data`` file says we
should get. We will report failures, and just count successes for a final
report:

.. literalinclude:: code/test4.cpp

Look closely at how this works! The term we extracted from each test set line
is handed to the ``term_expander`` function, and the result we get back is
stored in ``result``. We then compare that ``result`` with the string we found
on the test set line, which is stored in ``substitution``. If they match, we
add one to the counter indicating tests passed. If they do not match, we report
the error, and add one to the tests failed counter. At the end of the program
we report the final results. Here is what I saw:

.. literalinclude:: code/test4.out

Well, that is no good: all the tests failed. (No surprise, since our
``term_expander`` is still far from complete.)
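
The compare-and-count logic described above can be sketched in a few lines.
This is only an illustration, not the ``test4.cpp`` listing. It keeps the test
lines in memory instead of reading ``test_set.data``, it splits each line with
``std::istringstream`` and ``std::getline`` rather than with ``split_line``,
and it takes the function being tested as a parameter. The names
``TestTotals``, ``run_tests`` and ``stub_expander`` are invented for this
sketch.

.. code-block:: cpp

    #include <iostream>
    #include <sstream>
    #include <string>
    #include <vector>

    // Invented for this sketch: running totals for the final report.
    struct TestTotals {
        int passed;
        int failed;
    };

    // Stand-in for an unfinished function: it ignores its input entirely.
    std::string stub_expander(std::string) {
        return "";
    }

    // Feed every "term expected words" line to the function being tested,
    // report each mismatch, and count passes and failures.
    TestTotals run_tests(const std::vector<std::string>& lines,
                         std::string (*expander)(std::string),
                         std::ostream& out) {
        TestTotals totals = {0, 0};
        for (std::vector<std::string>::size_type i = 0; i < lines.size(); i++) {
            std::istringstream in(lines[i]);
            std::string term;
            std::string expected;
            in >> term;                  // first chunk: the marked-up term
            in >> std::ws;               // skip the white space after it
            std::getline(in, expected);  // rest of the line: expected text

            std::string result = expander(term);
            if (result == expected) {
                totals.passed++;
            } else {
                totals.failed++;
                out << "FAILED: " << term << " gave \"" << result
                    << "\", expected \"" << expected << "\"\n";
            }
        }
        out << totals.passed << " passed, " << totals.failed << " failed\n";
        return totals;
    }

Passing the function in as a parameter is a design choice for the sketch: the
same loop can be run against the stub today and against the real
``term_expander`` later, without touching the test code.
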
Test Driven Development
***********************

What we just did was to build a test fixture we can use to test our
``term_expander`` function as we work on it. Our goal is to make all the tests
pass. Every time we add something to the function, we hope we are moving closer
to a finished product. Our tests will tell us if that is so. We can always add
more tests to the ``test_set.data`` file as we think them up!

This is the real way software is being developed in huge projects today.
Obviously, this is beyond what you could have done earlier in the course, but
you could start using this approach now that you are starting to learn what all
this programming stuff is about.

Have fun. This actually makes programming a lot more satisfying. Add it to my
"Baby Step" approach, and you will find programming can be a lot of fun! (Even
if it is hard work at the same time!)
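
As one possible baby step toward turning those failures into passes, here is a
minimal sketch of ``term_expander``. It is not the finished function for the
course. The table holds only the two expansions from the example sentence
earlier in these notes, and a fuller version would presumably load its entries
from a data file. These notes describe two slightly different results for an
unknown term; this sketch assumes the version that returns the term in capital
letters with the bars removed.

.. code-block:: cpp

    #include <cctype>
    #include <map>
    #include <string>

    // A sketch only: expand a marked-up acronym using a tiny built-in table.
    std::string term_expander(std::string term) {
        // The two entries come from the example sentence in these notes.
        std::map<std::string, std::string> table;
        table["ACC"] = "Austin Community College";
        table["CS"] = "Computer Science";

        // Build the lookup key: drop the vertical bars, upper-case the rest.
        std::string key;
        for (std::string::size_type i = 0; i < term.size(); i++) {
            if (term[i] != '|') {
                unsigned char ch = static_cast<unsigned char>(term[i]);
                key += static_cast<char>(std::toupper(ch));
            }
        }

        std::map<std::string, std::string>::const_iterator found =
            table.find(key);
        if (found != table.end()) {
            return found->second;  // known acronym: return its expansion
        }
        // ASSUMPTION: an unknown term comes back upper-cased, without bars.
        return key;
    }

Each new entry or rule added to a function like this should move a few more
lines of the test report from failed to passed.
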