A Better Linked List Class

Read time: 27 minutes (6755 words)

Note

This lecture covers advanced concepts in C++, not really seen in an introductory C++ class like ours, but it they worth taking a quick look at.

Let’s build the simple Linked List class again, and this time set up some new tricks that C++ provides. This list will be a little different. We will assign an ID number to each node, in addition to the real value we want to store in the list. The reason for this will become apparent later.

Here is a diagram of what we have been building so far:

This setup is not exactly memory efficient. Eash integer takes four bytes, and each pointer takes 8 bytes. So managing 1000 integers will take 12,000 bytes, not the 4000 bytes really needed for the data. Arrays, on the other hand take only as much storage as the data, so we could store all those integers in just 4000 bytes. Maybe we can do better!

Let’s work through our list design strategy.

The Node Class

We need a simple node to hold our data items (and the ID).

Node.h
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
#pragma once

class Node {
 public:
    // constructors
    Node();
    Node(int val);

    // accessors
    int getID(void);
    int getItem(void);
    Node * getNext(void);

    // mutators
    void setItem(int val);
    void setNext(Node * ptr);
 private:
    int item;
    int ID;
    Node * next;

    // class variable
    static int nextID;
};

Not much new here, except for the ID attribute.

The ID is just a number that needs to be unique to each node we create. The obvious place to set up this number is in the constructor. However, we want the system to manage the unique ID for us. We can do that if we set up a class variable, like this:

See that static thing at the bottom of this class? That keyword tells the compiler to build exactly one of these attributes. All instances of the class can access this attribute, but when they do, they will all access the same variable.

We will teach our constructor to handle this static variable and use it to assign a unique ID to each object that gets created!

Since we know we should be testing code as we write it, here is a test file that will exercise this Node class:

tests/test_node.cpp
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
#include <catch.hpp>
#include "Node.h"

TEST_CASE( "test Node constructor" ) {
    Node * tmp = new Node();
    REQUIRE(tmp != nullptr);
    REQUIRE(tmp->getItem() == 0);
    REQUIRE(tmp->getNext() == nullptr);
}

TEST_CASE( "test Node ID generation" ) {
    Node * tmp = new Node();
    int id1 = tmp->getID();
    tmp = new Node();
    int id2 = tmp->getID();
    REQUIRE(id2 == id1 + 1);
}

The first test in this code just makes sure that when we allocate a new Node, we get a valid address for it. Not much of a test, but it is start.

The second test is supposed to make sure we get unique IDs for each Node object we create. This test is harder than you might suspect.

We will eventually have lots of tests that create Node objects, and we cannot be sure in what order those tests will run. For that reason, it will not be possible to create a node in this test and know exactly what ID it will have. What we can do it create two nodes and make sure the ID for each increases by one.

Here is our first implementation of the Node class:

lib/Node.h
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
// Copyright 2018 Roie R. Black
#include "Node.h"

// initialize class variable here
int Node::nextID = 0;

// constructors
Node::Node() {
    ID = ++nextID;
    next = nullptr;
}

Node::Node(int val) {
    ID = ++nextID;
    item = val;
    next = nullptr;
}

// accessors

int Node::getID(void) {
    return ID;
}

int Node::getItem(void) {
    return item;
}

Node * Node::getNext(void) {
    return next;
}

// mutators
void Node::setItem(int val) {
    item = val;
}

void Node::setNext(Node * p) {
    next = p;
}

Do you see how we manage our ID here?

Note

The tests run at this point all passed, so I will not show you that output.

The LinkedList Class

Now that we have a `Node we can use, lets build our first LinkedList setup:

include/LinkedList.h
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// Copyright 2018 Roie R. Black
#pragma once
#include "Node.h"

class LinkedList {
 public:
    // constructor
    LinkedList();

    // destructor
    ~LinkedList();

    // accessors
    void print(void);
    int getItemByID(int id);
    int getSize(void);
    int getHeadID();

    // mutators
    void insert(int val);

 private:
    Node * head;
    Node * tail;
    int size;
};

There are a few things to note here.

First, we are managing pointers to the front and the end of the list. Tracking these places in the list will prove handy later, when we use this class.

We also have provided new accessor methods:

  • getItemByID - searches for an item using the node ID.
  • getHeadID - this returns the ID of the node currently at the front of the list. This is use din testing.

For this version of our class, the insert method will always add a new item at the front of the list.

Here is a test set for this list:

tests/test_linked_list.cpp
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
#include <catch.hpp>
#include "LinkedList.h"

TEST_CASE( "test LinkedList constructor" ) {
    LinkedList ll;
    REQUIRE(ll.getSize() == 0);
}

TEST_CASE( "test LinkedList insert" ) {
    LinkedList ll;
    ll.insert(1);
    REQUIRE(ll.getSize() == 1);
    int id1 = ll.getHeadID();
    ll.insert(2);
    REQUIRE(ll.getSize() == 2);
    int id2 = ll.getHeadID();
    REQUIRE(id2 == id1 + 1);
    REQUIRE(ll.getItemByID(id1) == 1);
    REQUIRE(ll.getItemByID(id2) == 2);
}

These tests should be easy to follow, after looking at the tests for the Node class previously.

Here is the implementation of the LinkedList class:

lib/LinkedList.cpp
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#include "LinkedList.h"

LinkedList::LinkedList() {
    head = nullptr;
    tail = nullptr;
    size = 0;
}

void LinkedList::insert(int val) {
    Node *tmp = new Node(val);

    if (head == nullptr) { // list is empty
        head = tmp;
        tail = tmp;
    } else { // list is non-empty
        tmp->setNext(head);
        head = tmp;
    }
    size++;
}

LinkedList::~LinkedList() {
}
int LinkedList::getItemByID(int ID) {
    Node * tmp = head;
    while (tmp != nullptr) {
        if (ID == tmp->getID())
            return tmp->getItem();
        tmp = tmp->getNext();
    }
    return -1;
}

int LinkedList::getHeadID(void) {
    if (head == nullptr) return -1;
    return head->getID();
}

int LinkedList::getSize(void) {
        return size;
}

Not much is really new here.

Time to change that!

Reviewing the Node Class

The only reason for building a Node class is to set up a container that will hold the data we need in the list. All of that code is really something the builder of the LinkedList class should own. No other part of the program should ness around with Nodes.

This is especially true is the container we are building (the list) is to be a general purpose one, with features designed to make it useful in our project.

We have already seen that we can build our I35 simulator using an array of cars, and we could use a linked list of cars as well. Does the user of our code even need to know how we are managing a pile of cars at all?

The answer is a bit NO. If we want to start off with arrays, that is a design decision, and has nothing to do with the rest of the code.

Let’s set up a new container that can manage a bunch of cars!

Cars Class

We need a home for a car, and a way to access an individual car. We want to be able to add cars to the container, and fetch then so we can access their attributes when needed. How the cars are managed is not relevant to our bigger project.

Here is a new Car class:

include/Car.h
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
#pragma once

class Car {
 public:
    int getX(void);
    void setID(int id);
    void setSpeed(int x);
    void setXpos(int s);

 private:
    int ID;
    int xPos;
    int speed;
};

As a developer charged with building this part of the project, your first job is to fully understand what you are supposed to build.

The class header file tells you what your class needs to do, but how you do that is up to you.

Now, your first idea may well be to set up an array to manage these cars. You also might decide to create a Car class to manage the data you need to store for each car. That is fine, and may do the job nicely.

We have a problem here, though.

How many cars are we supposed to manage.

I know! 500!

Tilt!

As soon as you pick some random number, you have made a serious error. The ONLY time it makes sense to pick a number like this is after you and the users of your code agree on that number! You are not in charge of a decision like this!

The Sky is the Limit

Cars will be added (and deleted) as the program runs! You really have no idea how many cars to handle.

Can an array still do the job?

Well, if we think a bit, maybe we can deal with this.

Why does an array need to be in just one big chunk of memory? What if we created a bunch of small chunks, and linked those chunks together? The result would be more memory efficient, but we have a complication to deal with. Simple indexes no longer work.

Fortunately, C++ can help here.

The Subscript Operator

Remember those brackets we use for subscripting an array? It turns out those are just “operators” line addition and subtraction. C++ lets you set up code that will be fired off when you add brackets after an object’s name!

class MyInts {
 private:
    int m_list[10];

 public:
    int& operator[] (const int index);
};

int& MyInts::operator[] (const int index) {
    return m_list[index];
}

Notice that the new operator returns a reference. That is important, since we might want to assign a value into the array using this kind of code:

MyInts data;
data[2] = 42;

C++ requires that the thing on the left-hand side of an assignment statement is actually referring to a chunk of read memory. (In C++, they call the left-hand side an lvalue). That means it has a real address, The object obviously has one, but we are using a subscript after that. So data[2] actually runs a method in our class, and that function must return an address that can be used to place the value in memory!

Once we know we can do things like this, we can add subscripts to any class, and set up the code to locate the required piece of data, no matter where we decide to hide it. The user of our class never needs to know anything about the trickery we are setting up inside.

Vectors

C++ actually has such a storage container in the Standard Template Library. It is called a vector. We will look at that later.