What is Data anyway?
Programs are all about manipulating data of all kinds. As we start learning C++, we need to examine how it handles data. C++ is not like the Python world most of you came from, so pay attention!
What is data?
We need to start off by making one thing clear: data is anything you want to manipulate: words, numbers, colors, phases of the moon, or anything at all. However, we must get those things into the computer, so ultimately, data is just zeros and ones. How does that work?
Encoding things
Suppose we want to manipulate colors for a game program. How are we going to set things up so our program can do this?
As humans, we tend to try to figure out how our world works. That is how we
become scientists, or engineers (that is how I did it anyway). Long ago,
scientists figured out that colors are just different frequencies of
something called electromagnetic energy (same thing as radio, by the way -
only much higher frequency). Now, we do not care about all that science here,
but it occurred to someone that we could record a number somewhere (maybe the
frequency) and equate that number to a specific color. When we line up a list
of numbers next to a list of colors, we are encoding colors as numbers.
We can go back and forth between the two lists as we want. This is the basic
idea of representing colors in a computer. Once we know the encoding,
we can take a number and figure out the color, or take a color and figure out
the number.
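To make the two-lists idea concrete, here is a minimal sketch in C++. The palette and the names `kColorNames`, `same_text`, `encode_color`, and `decode_color` are all made up for illustration; the position of a name in the list is the number we use for that color.

```cpp
#include <cstddef>

// A hypothetical palette: the position of each name is its numeric code.
constexpr const char* kColorNames[] = {"black", "red", "green", "blue", "white"};
constexpr std::size_t kColorCount = sizeof(kColorNames) / sizeof(kColorNames[0]);

// Compare two C strings one character at a time.
// (constexpr lets the compiler evaluate this too; it also works as a normal function.)
constexpr bool same_text(const char* a, const char* b) {
    while (*a != '\0' && *a == *b) {
        ++a;
        ++b;
    }
    return *a == *b;
}

// Decode: take a number and figure out the color.
// Codes outside the table come back as "unknown".
constexpr const char* decode_color(std::size_t code) {
    return code < kColorCount ? kColorNames[code] : "unknown";
}

// Encode: take a color and figure out the number.
// Returns kColorCount when the name is not in the table.
constexpr std::size_t encode_color(const char* name) {
    for (std::size_t i = 0; i < kColorCount; ++i) {
        if (same_text(kColorNames[i], name)) {
            return i;
        }
    }
    return kColorCount;
}
```

With this table, `encode_color("green")` is 2 and `decode_color(2)` is "green", so we can go back and forth between the two lists.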
How will we set up the numbers?
Based on our previous discussions about how memory in our computer is set up, we know that we are going to record the number in memory, and we can choose any number of bits to store that number. The computer likes to group bits in eight-bit chunks (bytes), so we usually choose multiples of eight bits when we decide how big a piece of memory we want to use to record the number.
So, we can choose some number of bits in memory (or some number of bytes), place a binary number in those bits and call it a number. The number of bits controls how many different colors we can deal with in our program. Eight bits gives us 256 colors, 16 bits will give us 65536 colors and so on.
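The counts above come from a simple rule: every extra bit doubles the number of patterns, so n bits give 2 to the power n different values. A small sketch (the function name `values_for_bits` is made up for this example):

```cpp
#include <cstdint>

// Number of different bit patterns that fit in `bits` bits: 2 to the power `bits`.
// Only meaningful for 0 <= bits <= 63, so the answer fits in 64 bits.
constexpr std::uint64_t values_for_bits(unsigned bits) {
    return std::uint64_t{1} << bits;  // shifting 1 left by n places computes 2^n
}
```

Here `values_for_bits(8)` is 256 and `values_for_bits(16)` is 65536, matching the numbers in the text.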
Rules for manipulating numbers
Here comes an important concept! When we manipulate our numbers, we need to set
up the rules that control what happens. You already know all about one set of
rules - you learned them in elementary school and use them every day. We even
gave those rules an ugly name - math (ugh!).
The rules are simple: if you take one or more numbers and perform some
operation on them, you get some other number as a result. When we add two
numbers, we certainly hope the result is the sum of those two numbers - since
that is what the rules say we should get. Fortunately, the designers of the
computer set the machine up so those rules work as expected.
But suppose we add two numbers that we are using to encode colors. What should the answer be?
Good question! The answer depends on what we humans think it should be, and (you know what?) adding two colors may not make any sense! But we might think of adding as mixing two colors together (like cans of paint), and then we have an idea of what the result should be. Hmmm!
Fortunately, the scientists who studied colors figured out how combining colors
works at a deeper level, and we finally came up with an encoding scheme that
makes more sense - and even let us build machines that can display those
colors. (We call those machines monitors.)
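These notes do not name the scheme, but one widely used encoding stores a color as three intensities (red, green, and blue), eight bits each. Here is a sketch under that assumption. The function names are hypothetical, and averaging each part is only one possible rule for "mixing".

```cpp
#include <cstdint>

// Assumed layout: 24 bits arranged as 0xRRGGBB (red, green, blue; 8 bits each).
constexpr std::uint32_t pack_rgb(std::uint32_t r, std::uint32_t g, std::uint32_t b) {
    return ((r & 0xFF) << 16) | ((g & 0xFF) << 8) | (b & 0xFF);
}

// Pull the three intensities back out of the packed number.
constexpr std::uint32_t red_of(std::uint32_t c)   { return (c >> 16) & 0xFF; }
constexpr std::uint32_t green_of(std::uint32_t c) { return (c >> 8) & 0xFF; }
constexpr std::uint32_t blue_of(std::uint32_t c)  { return c & 0xFF; }

// One possible rule for "adding" two colors: average each intensity separately.
constexpr std::uint32_t mix(std::uint32_t a, std::uint32_t b) {
    return pack_rgb((red_of(a) + red_of(b)) / 2,
                    (green_of(a) + green_of(b)) / 2,
                    (blue_of(a) + blue_of(b)) / 2);
}
```

With this layout, `mix(0xFF0000, 0x0000FF)` (pure red and pure blue) gives `0x7F007F`, a medium purple. Plain addition follows the ordinary math rules instead: `0x0000FF + 0x000001` is `0x000100`, so adding a tiny bit of blue to pure blue lands on a very dark green. The rules have to match the encoding.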
Recording our data - er - numbers
Once we have figured out our encoding scheme, and have defined our
rules, all we need to do is save away the number we want into some place in
memory. We will probably want more than one color. We might need millions of
these memory containers to represent the color of each dot on the monitor
screen. (Boy, is that a lot of bits!) We apply the rules we set up to the
colors to change them as we want using our computer program - that is where
we actually make the rules work!
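To get a feel for "a lot of bits", here is a small sketch. The screen size (1920 by 1080 dots) and the 24 bits per dot are assumptions picked for illustration, and the function names are made up.

```cpp
#include <cstdint>

// How many dots are on a screen that is `width` dots wide and `height` dots tall.
constexpr std::uint64_t dot_count(std::uint64_t width, std::uint64_t height) {
    return width * height;
}

// Total bits needed when every dot stores one color of `bits_per_dot` bits.
constexpr std::uint64_t bits_for_screen(std::uint64_t width, std::uint64_t height,
                                        std::uint64_t bits_per_dot) {
    return dot_count(width, height) * bits_per_dot;
}
```

For the assumed screen, `dot_count(1920, 1080)` is 2,073,600 dots, and `bits_for_screen(1920, 1080, 24)` is 49,766,400 bits (about 6.2 million bytes) for a single picture.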
Manipulating the bits
Programs are all about manipulating a pile of bits! What makes things interesting is what we see when we watch the program go! BOOM! Another space alien bytes the dust (bad pun, there!).
When we actually record a number representing a color in memory, we can
manipulate that number to turn it into a different color.
Note
It can really cause you to stop and think when you realize that most everything we ask computers to do is just fooling around with a pile of bits!
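Here is a tiny sketch of turning one color into another by changing its number. It assumes a made-up encoding for shades of gray: one byte, where 0 is black and 255 is white. The function names `brighten` and `invert` are hypothetical.

```cpp
#include <cstdint>

// Assumed encoding: 0 is black, 255 is white, values in between are grays.

// Make a shade lighter by `amount`, stopping at white instead of wrapping around.
constexpr std::uint8_t brighten(std::uint8_t shade, std::uint8_t amount) {
    return static_cast<std::uint8_t>((255 - shade < amount) ? 255 : shade + amount);
}

// Swap dark for light: black becomes white, white becomes black.
constexpr std::uint8_t invert(std::uint8_t shade) {
    return static_cast<std::uint8_t>(255 - shade);
}
```

So `brighten(100, 50)` is 150 (a lighter gray), `brighten(250, 50)` stops at 255 (white), and `invert(0)` is 255. Each one is just arithmetic on a pile of bits.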
Another example
We will not go too deep into this one, but let’s consider how we can get a computer to manipulate text.
Once again, we need to break text down into fundamental chunks and think how we will encode those chunks as bits in the computer. Fortunately, this is pretty easy (at least for us English-speaking folks).
We have an alphabet of 26 letters we use to form words, sentences, paragraphs and so forth. We all know that. So, if we come up with an encoding for each letter, we can teach a computer to play with text. Cool.
But, thinking about this a bit more, we realize that we have other things to consider as well. For example, we distinguish between upper and lower case letters, we have punctuation symbols, and (oh yeah) those digit things we use to write down numbers!
Note
Yikes! Is a number on a piece of paper the same thing as a number in the computer? NO!
The encoding of a number (meaning a quantity of things we can apply math rules
to) is simple - we express the number in binary and record the result. But
the display of a number on a piece of paper is just the series of symbols
we want to print on paper. Each digit has its own encoding and the number on
paper is just a sequence of those digit encodings - as many as it takes to
display the number the way we want to see it. Clear?
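A short sketch may help here. The text "42" is two symbols, '4' and '2', each with its own code; the quantity 42 is a single binary number. The hypothetical function `digits_to_number` below walks through the symbols and builds the quantity. It relies on one thing C++ guarantees: the codes for '0' through '9' are consecutive, so `c - '0'` is the value of the digit `c`. It assumes the text contains only digit characters.

```cpp
// Turn a sequence of digit symbols into the quantity they display.
// Assumes `text` holds only the characters '0' through '9'.
constexpr int digits_to_number(const char* text) {
    int value = 0;
    for (; *text != '\0'; ++text) {
        // Shift what we have one decimal place left, then add the new digit.
        value = value * 10 + (*text - '0');
    }
    return value;
}
```

Here `digits_to_number("42")` is the quantity 42, which we can do math on. The symbol '4' by itself is not the quantity 4; it is just the code for a mark on the page.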
A Fundamental problem with computers
Suppose you want to build a program that manipulates both colors and text. How do you know whether a place in memory holds a color or a piece of text? The answer is you cannot - they are both sitting in memory as a sequence of zeros and ones. They look the same. It is up to our program to remember which is which and apply the right rules for each kind of data.
Note
The formal term we will use in identifying what set of rules to apply is
data type. We will explore this next.
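As a small preview, here is a sketch of one byte being read under two different sets of rules. The names are made up, and the letter reading assumes the ASCII encoding (where the code 0x41 stands for 'A'), which these notes have not introduced yet.

```cpp
#include <cstdint>

// One byte of memory holding the bit pattern 01000001.
constexpr std::uint8_t kBits = 0x41;

// Rule set 1: treat the bits as a quantity.
constexpr int as_number(std::uint8_t bits) {
    return bits;
}

// Rule set 2: treat the same bits as the code for a character.
// Assumes ASCII, where 0x41 is the code for 'A'.
constexpr char as_letter(std::uint8_t bits) {
    return static_cast<char>(bits);
}
```

The bits never change: `as_number(kBits)` is 65 and, under ASCII, `as_letter(kBits)` is 'A'. Only the rules we apply are different, and in C++ the declared type (`int`, `char`, and so on) is how the program remembers which rules go with which piece of memory.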
As a final reminder, we still need rules for dealing with text. What does “A” x “<” mean? Absolutely nothing, so using math on letters or symbols is a meaningless activity. We need our programs to be smarter!
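C++ itself will not stop you here. In the sketch below (the function name `multiply_letters` is made up, and the result assumes the ASCII encoding), the compiler quietly converts each character to its numeric code and multiplies the codes.

```cpp
// C++ converts both characters to integers before multiplying,
// so this compiles even though the result means nothing as text.
constexpr int multiply_letters(char a, char b) {
    return a * b;
}
```

Under ASCII, 'A' is 65 and '<' is 60, so `multiply_letters('A', '<')` is 3900: a perfectly good number that says nothing at all about letters. The machine did what the math rules say; it is our job to apply only the rules that make sense for the data.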