File Examples
Now that we have the basic idea about how to access files, let’s use them to store and retrieve numerical data.
Converting strings to numbers
As you have seen, Python accepts just about anything a user types in through the input function as a string of characters, then expects you to convert that string into whatever type you need. If the user does not enter the right thing, we will have a problem. Here is an example:
# test program to see what happens with bad input
val = input("Enter an integer number: ")
print("You entered this text:",val)
ival = int(val)
print("The integer number you entered was:",ival)
Here is my bad input:
python badinput.py
Enter an integer number: NO!
You entered this text: NO!
Traceback (most recent call last):
  File "badinput.py", line 5, in <module>
    ival = int(val)
ValueError: invalid literal for int() with base 10: 'NO!'
As you can see, Python threw a fit (actually, it threw an exception whose name is ValueError). We can keep control of this situation by doing this:
# test program that keeps asking until the input is a valid integer
ival = None
while ival is None:
    val = input("Enter an integer number: ")
    print("You entered this text:",val)
    try:
        ival = int(val)
        print("The integer number you entered was:",ival)
    except ValueError:
        # int() could not convert the text, so ival is still None
        print("I cannot process your input:",val,"try again!")
Do you see how this works? We are using a magical value (None) to indicate that we do not have a valid input yet. In other languages doing this is pretty hard! We then set up a loop that will keep asking the user to enter an integer until they do it right.
Note
Who was it that said insanity is doing the same thing over and over and expecting different results? (It is often attributed to Einstein!)
This is better code, and it handles cases where the user was just sloppy in typing in the number.
Here is the output:
python3 badinput2.py
Enter an integer number: no!
You entered this text: no!
I cannot process your input: no! try again!
Enter an integer number: 10
You entered this text: 10
The integer number you entered was: 10
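The conversion step can also be pulled out into a small helper so the loop stays short. This is only a sketch and not part of the original program: the names parse_int and ask_for_int are made up for illustration, and the ask parameter exists only so the loop can be tried without a keyboard.

```python
def parse_int(text):
    """Return text converted to an int, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


def ask_for_int(prompt, ask=input):
    """Keep asking until the reply is a valid integer, then return it.

    ask is a hypothetical hook: it defaults to input, but any function
    that takes a prompt and returns a string will do.
    """
    value = None
    while value is None:
        value = parse_int(ask(prompt))
    return value
```

Calling ask_for_int("Enter an integer number: ") behaves like the loop above, minus the extra print statements.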
Now that we have a way to check for good input from the user, let’s write some data out to a file.
Data files
We are not going to get into the details of setting up really nice data files, just show a simple example to get you started. Since we are doing things with just text files, this will be pretty easy.
Creating a data file
As a start, let’s write a short program that generates a series of simple numbers. Nothing fancy needed, just a simple loop:
def generate_data(filename):
    fout = open(filename,"w")
    for num in range(0,15):
        fout.write(str(num) + '\n')
    fout.close()

def main():
    print("Generating a set of data")
    datafile = 'mydata.dat'
    generate_data(datafile)

main()
Look closely at how we wrote the data. We need to convert the number into a string, then add (+) a newline character to that string. The fout.write method takes care of writing the number to the file, one number per line. You can verify that it worked by opening the mydata.dat file using gVim.
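Python can also close the file for us. The sketch below writes the same kind of data using a with block, which closes the file when the block ends, even if an error happens partway through. The function name generate_data_with and its count parameter are hypothetical, chosen only for this example.

```python
def generate_data_with(filename, count=15):
    """Write the integers 0 .. count-1 to filename, one per line."""
    with open(filename, "w") as fout:
        for num in range(count):
            fout.write(str(num) + "\n")
    # the file is already closed here, so no fout.close() is needed
```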
Reading a data file
Now that we have a data file to process, let’s write a short program to add the numbers up. It would be silly to write this program so it only works with exactly the number of numbers in the data file (we know that number, we set it up!), so we will write the code in a general way:
def sum_data(filename):
    fin = open(filename,"r")
    total = 0    # named total so we do not hide Python's built-in sum()
    for line in fin:
        num = int(line)    # int() ignores the newline at the end of the line
        total += num
    fin.close()
    return total

def main():
    print("Summing a set of data")
    datafile = 'mydata.dat'
    val = sum_data(datafile)
    print("The total of all the numbers in the file is",val)

main()
Here is the output:
python basicsummer.py
Summing a set of data
The total of all the numbers in the file is 105
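One thing the loop above does not allow for is a blank line, for example one left at the end of the file by an editor. int() raises ValueError on a line with no digits in it. A hedged variation that skips such lines might look like this (the name sum_data_skipping_blanks is invented here):

```python
def sum_data_skipping_blanks(filename):
    """Add up one integer per line, ignoring lines that are empty or only spaces."""
    total = 0
    with open(filename, "r") as fin:
        for line in fin:
            text = line.strip()
            if text == "":
                continue          # nothing on this line, move on
            total += int(text)    # still raises ValueError on non-numeric text
    return total
```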
Appending data to a file
This next example is a bit silly. Let’s add some more numbers to the data file, then re-sum the new result:
def add_data(filename):
    fout = open(filename,"a")
    for num in range(15,25):
        fout.write(str(num) + '\n')
    fout.close()

def main():
    print("Adding to a set of data")
    datafile = 'mydata.dat'
    add_data(datafile)

main()
Run this example, then rerun the summer and we get this:
python dataadder.py
Adding to a set of data
python basicsummer.py
Summing a set of data
The total of all the numbers in the file is 300
Looks like it worked. The append option (a) opens the file, then positions it so we can write immediately after the last line currently in the file. If you run the dataadder.py file multiple times, you will add more and more lines to the data file. This is useful in programs that create a log of actions they have taken during a run. Every time you run the program, you open the log file in append mode and add new entries to the log as the program runs. At the end of the run, closing the file locks it all in and you can see what happened.
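The logging idea can be sketched in a few lines. The function name log_action and the file name run.log are assumptions made for this example only. Append mode creates the file if it does not exist yet, so the first run needs no special handling.

```python
def log_action(logname, message):
    """Append one line to the log file, creating the file if it does not exist."""
    with open(logname, "a") as log:
        log.write(message + "\n")
```

A program would call log_action("run.log", "program started") at the points it wants to record. Each call adds one line after whatever earlier runs left in the file.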
Working with records
In many programs, data are generated that are not simple numbers, but some combination of numbers and text. We need to be able to generate data files from this kind of data, and read those data back in later. The problem is that sometimes we do not know exactly how many chunks of data to read until after we read something in and check it. For this example, and for your lab, we will work on another graphics program, but this time, we will generate the picture using data in a file.
Here is the idea. I want to process a file that looks like this:
circle
blue
40
40
25
rectangle
red
50
20
100
60
triangle
yellow
60
10
100
10
80
20
circle
orange
60
60
50
Now, how in the world am I going to read that in? Well, it is easier than you might think at first.
First, there are four records in this data set. The first element in each is the name of the object we want to draw; next comes the fill color for the object. Finally, there is a series of numbers that define the rest of the data needed for each object:
circle - x,y,radius
rectangle - x1,y1,x2,y2
triangle - x1,y1,x2,y2,x3,y3
Once we know what kind of object we are supposed to draw, the rest is easy.
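The list above can be written down as a small table in code. This is only a sketch: the dictionary name NUMBER_COUNT and the function lines_in_record are made up for illustration and are not used by the programs that follow.

```python
# how many numbers follow the name and color lines for each kind of object
NUMBER_COUNT = {
    "circle": 3,      # x, y, radius
    "rectangle": 4,   # x1, y1, x2, y2
    "triangle": 6,    # x1, y1, x2, y2, x3, y3
}


def lines_in_record(name):
    """Return how many lines one record takes: name + color + its numbers."""
    return 2 + NUMBER_COUNT[name]
```

For the sample file, 5 + 6 + 8 + 5 gives 24 lines, which matches the listing.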
Setting up the circle code
We know how to read the basic stuff in the file, but we need to know what each line contains. In some cases it is a string, in others a number. We had better do this right!
def drawCircle(fileobj):
    # the name line is already read; the next four lines belong to this circle
    color = fileobj.readline().strip()
    x = int(fileobj.readline())
    y = int(fileobj.readline())
    radius = int(fileobj.readline())
    print("Circle with color:",color,"at:",x,",",y,"with radius:",radius)

def draw_picture(filename):
    fin = open(filename,'r')
    for line in fin:
        line = line.strip()
        print("|%s|" % line)
        if line == 'circle':
            drawCircle(fin)
    fin.close()

def main():
    filename = "picture.dat"
    draw_picture(filename)

main()
Here is a section of the output from this program. Notice that I printed each line I read with a vertical bar on either side of the text. This helps me see exactly what sequence of characters I have. When I test for “circle”, I had better not have a newline on the end of the string, or extra white space anywhere!
python drawing1.py
|circle|
Circle with color: blue at: 40 , 40 with radius: 25
|rectangle|
|red|
|50|
|20|
|100|
|60|
You should be able to write the routines to process a rectangle and triangle from this example.
The key to all this is to identify the data item that tells us exactly what other data to expect. I set up the first chunk of data as a string, but a number would do just as well. Once you have that, you test that key data item and select the correct code to process it:
import sys

def drawCircle(fileobj):
    color = fileobj.readline().strip()
    x = int(fileobj.readline())
    y = int(fileobj.readline())
    radius = int(fileobj.readline())
    print("Circle with color:",color,"at:",x,",",y,"with radius:",radius)

def drawRectangle(fileobj):
    # placeholder: does not read any of the rectangle's data yet
    print("I saw a rectangle there")

def drawTriangle(fileobj):
    # placeholder: does not read any of the triangle's data yet
    print("I saw a triangle there")

def draw_picture(filename):
    fin = open(filename,'r')
    for line in fin:
        line = line.strip()
        print("|%s|" % line)
        if line == 'circle':
            drawCircle(fin)
        elif line == 'rectangle':
            drawRectangle(fin)
        elif line == 'triangle':
            drawTriangle(fin)
        else:
            print("unknown object - aborting!")
            sys.exit()
    fin.close()

def main():
    filename = "picture.dat"
    draw_picture(filename)

main()
This is what I got with this code:
python3 drawing2.py
|circle|
Circle with color: blue at: 40 , 40 with radius: 25
|rectangle|
I saw a rectangle there
|red|
unknown object - aborting!
Why did I get this? It looked like I was processing all possible objects in the program. Wait, I failed to read in all the data needed to fully define the additional objects, so the code ran into the “red” line (for the rectangle) and thought it was supposed to be another object name. As a result, it choked!
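One possible way out, sketched here rather than taken from the original program, is to make every routine consume all of the lines that belong to its record, even if it only prints a message for now. The helper read_ints is hypothetical. The rectangle layout (color, then x1, y1, x2, y2) is the one described earlier, and the return value is an addition that just makes the routine easy to check.

```python
def read_ints(fileobj, count):
    """Read the next count lines from fileobj and return them as a list of ints."""
    return [int(fileobj.readline()) for _ in range(count)]


def drawRectangle(fileobj):
    """Consume one rectangle record (color plus four numbers) and report it."""
    color = fileobj.readline().strip()
    x1, y1, x2, y2 = read_ints(fileobj, 4)
    print("Rectangle with color:", color, "from:", x1, ",", y1, "to:", x2, ",", y2)
    return color, x1, y1, x2, y2
```

With the rectangle's five data lines used up, the next line the main loop sees is the name of the following object instead of “red”. A triangle routine would follow the same pattern with six numbers.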