Moving Data


We want to build a simple simulator for a simple machine. Specifically, we will model a machine that uses the Harvard architecture. This machine will run exactly as Von Neumann described it, but we will separate data from code in the memory. Internally, we will set up the classic four-step “dance” to get this machine to run some code.

Ready to see how that gets done? Well, since we are going to use C++ for this experiment, we need to review a bit.

C++ Classes

A class is a blueprint to be used to manufacture one or more objects at run-time. Theoretically, classes describe real gadgets from our human world, but they may merely describe some concept instead. Each object is a gadget we can use at runtime, to do something useful.

The basic idea of a class is simple. We describe all of the attributes possessed by our object. These are the object’s internal variables. We describe all of the actions an object created from this blueprint can perform. Actions are captured in methods, which contain code that can do the real work.

The most interesting new idea in all of this is simple. Objects manipulate their own internal attributes (variables) using their internal methods (functions). Other objects can ask an object to do this work, but they cannot (unless specifically allowed to) reach inside another object and mess around. This layer of protection eliminates a lot of common mistakes in programming.

Note

C++ provides ways to violate these ideas, and let other objects reach into an object and modify things. That kind of programming is risky, and should not be done until you are sure things are working correctly.
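One such mechanism is the friend declaration. The sketch below is only an illustration of the kind of access the note warns about. The class Vault and the function peek are hypothetical names invented for this example and are not part of the simulator.

```cpp
// Hypothetical example: Vault and peek() are made-up names.
class Vault {
    public:
        Vault(int s) : secret(s) {}
        // friend grants this one outside function access to private members
        friend int peek(const Vault &v);
    private:
        int secret;
};

// Not a method of Vault, yet it can read the private attribute
int peek(const Vault &v) {
    return v.secret;
}
```

Calling peek on a Vault returns the private value, even though no public method exposes it. That is exactly the kind of back door to use sparingly.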

This “object orientation” is an interesting idea. In traditional programming, you set up variables to track something. Then you use code, usually wrapped up in a function, to mess around with those variables. We call that messing around processing.

You do have some control over what code can access a variable, based on where in your code you declare that variable. This is handy, but not enough. In traditional programming, it was far too easy to make mistakes, and not know where the errors were located.

Object Wrappers

Objects work in a different way. Basically, an object is a set of variables and functions surrounded by a protective wrapper.

Note

Remember, we use different terminology when we talk about these object oriented things. Variables become attributes and functions become methods.

Attributes

By placing these attributes inside of that protective wrapper, we control access to each attribute. You can individually decide to make an attribute visible to other parts of your program, or lock it up and let no code outside of the wrapper play with that attribute. Inside the wrapper you are free to use attributes as you wish.

Methods

You can do the same thing with methods inside the wrapper. C++ is really good at protecting names in your code from being used in ways you do not like. We control access to any name: attributes, or methods! No more worrying about what code might have messed with something, and messed your data up by accident (or by evil design!)

A common code design tactic is to create public methods that will let other parts of your program modify a private attribute. Those special methods can make sure such modifications do no harm to your object. We can provide other methods to return the value stored in some attribute as well. This extra layer of code protects the attribute (and the rest of your class) from misuse. We call those methods that change an attribute mutator methods, and those methods that simply report the current value accessor methods.
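Here is a minimal sketch of that tactic. The Thermostat class, its method names, and the 10 to 30 limit are all assumptions made up for illustration. The point is only that the mutator checks a value before it touches the private attribute.

```cpp
// Hypothetical example: Thermostat and its limits are invented here.
class Thermostat {
    public:
        Thermostat(int start) : setting(start) {}

        // accessor: reports the current value, changes nothing
        int get_setting() const { return setting; }

        // mutator: refuses values that would harm the object
        bool set_setting(int value) {
            if (value < 10 || value > 30) {
                return false;       // reject, keep the old value
            }
            setting = value;
            return true;
        }
    private:
        int setting;    // only reachable through the methods above
};
```

A caller that asks for a setting of 99 gets false back, and the stored value stays where it was.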

attiny85sim

We will be building our simple simulator in several stages. You already got started on this project in the previous lecture.

For the next part of this exercise, I am going to show you a few example C++ classes and a main function (application) file that can exercise those classes. This will serve as a C++ review. We will discuss what is wrong with each example as we build toward a structure we can really work with.

The goal of this project is to construct enough of a simulator that we can actually run a simple program on it. The target program will be a variation of the code we demolished in the C Program Tear-Down lecture, only we will be using something close to AVR assembly language, not Pentium assembly!

Note

Your next several lab projects will require that you follow these examples, and get each one running in your project repository. Details are in the lectures and assignments for this week.

Ready? Let’s get to coding!

Step02: Moving Data

Let’s start off by examining a simple C++ statement:

int X = 5;
int Y;

Y = X;

Note

The statement we are interested in studying is that last line. The two declarations set up the environment in which that statement will execute.

What will happen when the assignment statement runs?

We should all know what will happen when the processor hits line 4:

  • A copy of the current value stored in the container named X will be retrieved
  • That value will be stored in the container named Y
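The two steps in that list can be spelled out in code. This is only a sketch: copy_value is a hypothetical helper written for this note, and the temporary variable stands in for whatever carries the value between the two containers.

```cpp
// Hypothetical helper: spells out the two steps behind "Y = X;"
int copy_value(const int &source, int &destination) {
    int fetched = source;      // step 1: retrieve a copy of the source value
    destination = fetched;     // step 2: store that copy in the destination
    return fetched;            // report what moved
}
```

After calling copy_value(X, Y), Y holds the value of X, and X itself is unchanged because only a copy moved.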

Simple! But how was that done inside of the machine?

Modeling the Data Transfer

Here is an application file that sets up two “containers”, each one modeling a chunk of memory where a data item (integer in this example) can be stored. At this point, we do not see the definition of the class defining these containers, but we can see that they must support a few basic methods:

src/main.cpp
#include <iostream>
#include "Component.h"

int main( int argc, char *argv[] ) {
    std::cout << "attiny85sim (v0.2.0)" << std::endl;
    std::cout << "running ..." << std::endl;

    // create required components
    Component X(5), Y(0);

    // make the data move
    int data = Y.write(X.read());

    std::cout << "Component X returned " << X.read() << std::endl;
    std::cout << "Component Y stored " << data << std::endl;

    std::cout << "done!" << std::endl;
}

We have not seen the Component class definition either, but we can figure out a few things from how the objects seem to be working.

It appears that our Component class sets up objects that can be initialized using the constructor. Those objects support read and write methods. If you look closely, both methods return a value, and we are displaying what “moved” as part of the application output. A little thinking shows that we are modeling the data transfer we are interested in studying.

Here is the header file:

include/Component.h
#pragma once

class Component {
    public:
        Component(int val);     // constructor
        int read(void);         // report the stored value
        int write(int);         // store a new value and report it
    private:
        int data;
};


Notice that we are placing the header file in a special directory named include. That will make it easy to find the header files later.

Note

In your previous C++ class, you should have learned about header guards. These typically use the preprocessor’s #ifndef, #define, and #endif directives to prevent the compiler from seeing the same declarations twice. Most modern compilers also support a non-standard but widely available alternative. That #pragma once line accomplishes the same thing with less fuss.
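For comparison, here is roughly what the same header would look like with a traditional guard. The macro name COMPONENT_H is an assumption. Any name that is unique within the project would do.

```cpp
// Classic include guard; COMPONENT_H is a conventional, made-up macro name
#ifndef COMPONENT_H
#define COMPONENT_H

class Component {
    public:
        Component(int val);
        int read(void);
        int write(int);
    private:
        int data;
};

#endif // COMPONENT_H
```

The first time the preprocessor sees this file, the macro is undefined, so the class declaration is kept and the macro gets defined. On any later include, the #ifndef test fails and everything up to the #endif is skipped.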

Now we are ready to see the actual class implementation:

lib/Component.cpp
#include <iostream>
#include "Component.h"

Component::Component(int val) {
    data = val;
}

int Component::write(int val) {
    data = val;
    return data;
}

int Component::read(void) {
    return data;
}

All of our component parts will end up in the lib directory. These “components” may be useful in other projects, and we will build a C++ library from them in a later step.
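The constructor above assigns data inside its body. Many C++ programmers would write the same thing with a member initializer list. The sketch below is an alternative spelling with the same behavior, shown in a single block with the methods defined inline for brevity. It is not the form this project uses.

```cpp
// Sketch only: same behavior as the lecture's constructor, different syntax
class Component {
    public:
        // the initializer list sets data directly as the object is created
        Component(int val) : data(val) {}
        int read(void) { return data; }
        int write(int val) { data = val; return data; }
    private:
        int data;
};
```

For a plain int the two forms behave identically. The initializer list starts to matter for const attributes and for attributes that are themselves objects.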

Modifying the Makefile

Since we are placing parts of our program in different directories, we need to teach make where to look for files. Here is the new Makefile we need for this step:

Makefile
# Makefile for version2
TARGET = version2

# files
CSRCS	:= 	$(wildcard src/*.cpp) $(wildcard lib/*.cpp)
COBJS	:=	$(CSRCS:.cpp=.o)
INCLUDE	:= include

# tools
CXX		:= g++
RM		:= rm -f

.PHONY: all
all:	$(TARGET)

$(TARGET):	$(COBJS)
	$(CXX) -o $@ $^

%.o:	%.cpp
	$(CXX) -c -I$(INCLUDE) -o $@ $<

.PHONY: run
run:	$(TARGET)
	./$(TARGET)

.PHONY: clean
clean:
	$(RM) $(TARGET) $(COBJS)

Here, we are using several cool features of Make.

The first one is that line defining the name TARGET. This will be the name of the final executable we want to build.

Warning

This Makefile runs on my MacBook. If you use a Windows PC, you need to add the .exe extension to the TARGET name.

Look at the next definition, where we set up another name, CSRCS. In that line we are asking Make to look into the directories where we have placed source code files, and build a list of all files with names ending in .cpp. The result is a name for a pattern we can use in this Makefile. That list is a space-separated list of names, exactly like the list you would supply to the compiler on the command line.

The next line is nice as well. Here we are asking Make to take that string of file names it found on the last line, and substitute the substring .cpp with another string, .o in this case. It will store this new list using the name COBJS. We are asking Make to create a list of “object files” that matches the list of “source files”. We will use that list later.

When you want to refer to a pattern (by the name you created) in your Makefile, you surround the name with parentheses, and stick a dollar sign in front of it. Make will substitute the pattern it set up for that name at that spot in the command it generates.

Note

The line where we defined COBJS shows other ways you can use names as well. Make can transform lists in a number of ways. We will see that in action in later examples of Makefiles we set up. Remember, Make just manipulates text strings and issues commands for you. It has no idea why it is doing this! That is up to you! Use “make -n” to check your work before running make!

This Makefile is much more intelligent than the simple one we showed earlier.

Makefile Target Lines

Formally, a line specifying a “target” has two parts:

  • The target name itself (followed by a colon)
  • A space separated list of “dependencies” needed to run the target commands

Basically what a target rule says is this: If you want to “build” this target, make sure these “dependencies” exist and are up to date (building them first if necessary). Then run the following commands.

This version compiles each source file as a separate step, creating an associated object file (with a .o extension). The target rule that does this work is the one that begins with %.o: %.cpp (line 19 of the Makefile). This rule is a template for real commands Make will issue, when it decides it needs to. In this example, the % is a place-holder for a real name. The target rule tells Make that if it needs to build a file named “something.o” it can do so if it finds a file named “something.cpp”. The actual commands use ugly Make notation. Here are the common ones we will use.

  • $< - replace this with the first (or only) chunk of the dependency list
  • $@ - replace this with the name of the target
  • $^ - replace this with the entire dependency list

I know this notation looks odd, but it will save tons of typing!

The Compiling Process

This version of our Makefile is typical of what you will see in most C/C++ projects. Basically, it does this:

  • Compiles all source files (ending in .cpp here) into object files (ending in .o)
  • Links all object files (with the C++ runtime library) to build an executable file

The neat thing about this setup is that now Make is smart enough to compile only the files you change, not everything. It will then relink the object files to rebuild your program. Much more efficient!

How Make pulls this off is tied to time and date stamps on all files involved. You should read up on Make to learn more! Make can manage a lot of the work you do as a developer, not just compile code for you. (I use Make to manage processing my lecture notes, and push them to multiple web servers.)

Finding Header Files

Normally, when you ask the compiler (actually the preprocessor) to “include” one of your header files, you provide the path to the file between double quotes. By adding the -I option to the compiler command, we can tell the preprocessor to search in the named directory for any header files we try to include. That makes our include lines much cleaner.

Reusing Makefiles

If you look at this Makefile closely, it has nothing in it that locks it down to this particular project. Well, maybe that TARGET line needs tweaking. In fact, we can use this Makefile in other C++ projects. We will be doing that, but working it into a more general form as our project gets a bit more involved.

Compiling the Project

Let’s try to compile this example:

$ make clean
rm -f version2 src/main.o lib/Component.o

Note

I am only doing this so my lecture note build system will actually compile the example code. Normally, you only do this when needed, such as before running the Git commands to push your changes to the “remote” server (GitHub).

$ make
g++ -c -Iinclude -o src/main.o src/main.cpp
g++ -c -Iinclude -o lib/Component.o lib/Component.cpp
g++ -o version2 src/main.o lib/Component.o

And run it:

$ ./version2
attiny85sim (v0.2.0)
running ...
Component X returned 5
Component Y stored 5
done!

Reviewing Step02

So, how good is this model for our simulator?

Not good at all!

The methods we used were public, meaning any code can potentially access them. Each component supports both read and write methods to access the private variable inside, which effectively turns the private data into “global” variables, with little protection from harm. Clearly, this is not good.
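To make that weakness concrete, here is a small sketch. The Component class is a compact stand-in for the Step02 class, and careless is a hypothetical function invented to play the part of “any code”.

```cpp
// Compact stand-in for the Step02 class, plus a hypothetical careless()
class Component {
    public:
        Component(int val) : data(val) {}
        int read(void) { return data; }
        int write(int val) { data = val; return data; }
    private:
        int data;
};

// Any function anywhere can do this; "private" did not stop it
void careless(Component &c) {
    c.write(-999);
}
```

The data attribute is private, yet one call to careless replaces whatever the component was holding. The public write method gives every caller the same power that direct access would.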

Another thing to note is that the two components are being directed to do specific actions by the main function, which is acting like a controller here. Although this does not seem like a bad thing, this is not how we want our controller to behave. The controller will not tell the component what to do, it will tell the component to do whatever it was trained to do. It is up to the component to make that happen. Viewed this way, the controller should not really direct each action, it should just yell “action”. All the components act on their own, just like you did in the “dance”. We are not there yet, but we have a start.
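One possible shape for that idea is sketched below. This is a guess at the general pattern, not the design the later steps arrive at. The names Part, tick, and action are all hypothetical.

```cpp
#include <vector>

// Hypothetical sketch: Part, tick() and action() are invented names
class Part {
    public:
        Part(int start) : count(start) {}
        // the part itself decides what "action" means; here it just counts
        void tick() { count = count + 1; }
        int value() const { return count; }
    private:
        int count;
};

// The controller sends the same signal to every part,
// with no per-part instructions
void action(std::vector<Part> &parts) {
    for (Part &p : parts) {
        p.tick();
    }
}
```

The controller code never says what a part should do. It only says when, and each part carries its own behavior inside tick.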

Note

You should commit this version of the code now. Tag it as version v0.2.0

Step03: Naming Components

When our simulator is running, we are going to want to generate some kind of output, so we can see what is happening inside of the machine. Although we have not figured out what to look at yet, it will be handy for us to be able to name each component. Let’s see what we can do to add that feature.

Obviously, we need to add an attribute to the Component class, and this will be a string:

include/Component.h
#pragma once
#include <string>

class Component {
    public:
        Component(std::string n, int val);
        int read(void);
        int write(int val);
        std::string get_name();
    private:
        std::string name;
        int data;
};


Notice we have added a public method to retrieve the component name. Since no code should be able to modify the component name, we added a parameter to the constructor to initialize it.
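If we want the compiler to enforce the rule that the name never changes, one option is to mark the attribute const. The sketch below is a variation on the class above, written in one block with inline methods. It is an assumption about how you might tighten things, not the code used in this project.

```cpp
#include <string>

// Sketch: a variation on the lecture's class, not the project's actual code
class Component {
    public:
        // a const attribute must be set in the initializer list
        Component(std::string n, int val) : name(n), data(val) {}
        int read(void) const { return data; }
        int write(int val) { data = val; return data; }
        // const method: promises not to change the object
        std::string get_name() const { return name; }
    private:
        const std::string name;   // set once in the constructor, never again
        int data;
};
```

With this version, even a method inside the class cannot assign to name. One side effect to be aware of: objects with a const attribute cannot be assigned to one another with the = operator.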

Here is the new implementation code:

lib/Component.cpp
#include <iostream>
#include "Component.h"

Component::Component( std::string n, int val ) {
    data = val;
    name = n;
}

int Component::write(int val) {
    data = val;
    return data;
}

int Component::read(void) {
    return data;
}

std::string Component::get_name( void ) {
    return name;
}

The changes to our main code should be pretty obvious.

src/main.cpp
#include <iostream>
#include "Component.h"

int main( int argc, char *argv[] ) {
    std::cout << "attiny85sim v(0.3.0)" << std::endl;
    std::cout << "running ..." << std::endl;

    // create required components
    Component X( "X", 5 ), Y( "Y", 0 );

    // make the data move
    int data = Y.write(X.read());

    std::cout << X.get_name() <<
        " returned " << X.read() << std::endl;
    std::cout << Y.get_name() <<
        " stored " << data << std::endl;

    std::cout << "done!" << std::endl;
}

Let’s see this Makefile in action:

$ make clean
rm -f cosc2325 src/main.o lib/Component.o

I do this to make sure I am compiling a fresh version!

$ make
g++ -c -Iinclude -o src/main.o src/main.cpp
g++ -c -Iinclude -o lib/Component.o lib/Component.cpp
g++ -o cosc2325 src/main.o lib/Component.o

And run it:

$ make run
./cosc2325
attiny85sim v(0.3.0)
running ...
X returned 5
Y stored 5
done!

Reviewing Step03

Well, our controller logic is still referring to the components by name, but the names you see on the output come from the components themselves. We could freely change those names in the controller without harming the operation of this version.

Let’s make that change, and streamline the output a bit for our next step.

Note

Commit this version. Tag it as v0.3.0

Step04: Generic Component Names

The controller is not in the business of tracking the names of the components, but we will need to build specific kinds of components eventually, to model Von Neumann’s basic components. For now, let’s just go back to naming them something like c1 and c2 in main, and let’s make the output a little less wordy!

This change is only in main.cpp:

src/main.cpp
#include <iostream>
#include "Component.h"

int main( int argc, char *argv[] ) {
    std::cout << "attiny85sim v(0.4.0)" << std::endl;
    std::cout << "running ..." << std::endl;

    // create required components
    Component c1( "X", 5 ), c2( "Y", 0 );

    // make the data move
    int data = c2.write(c1.read());

    std::cout << c1.get_name() << "->(" << data <<
        ")->" << c2.get_name() << std::endl;

    std::cout << "done!" << std::endl;
}

Let’s try to compile and run this example:

$ make clean
rm -f cosc2325 src/main.o lib/Component.o
$ make run
g++ -c -Iinclude -o src/main.o src/main.cpp
g++ -c -Iinclude -o lib/Component.o lib/Component.cpp
g++ -o cosc2325 src/main.o lib/Component.o
./cosc2325
attiny85sim v(0.4.0)
running ...
X->(5)->Y
done!

Since my Makefile says that we must have the executable file in the project directory before we can run it, Make takes care of doing the compile steps as part of us trying to run the code! Cool!

Reviewing Step04

The output looks much cleaner. In fact, it looks like something processor designers use all the time: Register Transfer Language. We will discuss what that means later. For now, you should be able to see what moved where!

The notation is showing our two components exchanging a specific value over some kind of communications path. Hey, that sounds like a wire! (Well, actually, several wires. More on that later.)
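The move and its description can be bundled into one helper. In the sketch below, the Component class is a compact stand-in for the Step03 class, and transfer is a hypothetical function that is not part of the project code.

```cpp
#include <string>

// Compact stand-in for the lecture's Component class
class Component {
    public:
        Component(std::string n, int val) : name(n), data(val) {}
        int read(void) { return data; }
        int write(int val) { data = val; return data; }
        std::string get_name() { return name; }
    private:
        std::string name;
        int data;
};

// Hypothetical helper: moves one value and describes the move as text
std::string transfer(Component &src, Component &dst) {
    int moved = dst.write(src.read());
    return src.get_name() + "->(" + std::to_string(moved) + ")->" + dst.get_name();
}
```

For a component named X holding 5 and a component named Y, transfer produces the same X->(5)->Y text that the Step04 program prints, and Y ends up holding 5.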

Note

Commit this version. Tag it as v0.4.0

At this point in the development, we can see how data is actually moved between two points in our simulated system. We have developed enough code to feel confident that we can move to the next level in our design.

Note

I am going to close this part of the development and call this version lab2. I will grade the project through tag v0.4.0. Since we are going to add new functionality, we will bump the major version number to one for our next step. Stay tuned, this will be interesting!