Sorting Things

In programming, it is often necessary to sort things before we display them, or even process them. How do we do this?

Well, there are quite sophisticated ways to sort a list of things, some of which will make your head hurt trying to understand how they get the job done. But we have enough background to go over a simple way to sort items in an array, and try that out in a simple piece of code.

Card Deck Example

Let’s pretend you have a deck of cards with only the numbers on them, none of that “suit” nonsense. You dropped the deck and want to sort it back into proper “order”. How would you (the human you, not the programmer you) get that job done?

Well, you could fan the cards out, look for the lowest card, place it face down on a table, then repeat the process, putting each new card on top, until the deck was sorted. Hey, we might be able to do that in code. What we need is two things: a way to find the “smallest” thing in the list, and a second place to store the numbers.

Finding the smallest item in a list

Let’s set up a simple list:

const int NUM_ITEMS = 7;
int test_nums[] = {5,3,7,1,3,9,8};

I am using a feature of C++ that lets you set up an array with a list of values easily. This gets hard if the list is big, but this will do for our example.
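If counting the items by hand gets tedious, one option is to let the compiler do the counting. This is only a sketch of that idea and not something the program in this lecture does. The function name count_test_items is made up for the illustration, and it assumes the array is declared in the same place where sizeof is used.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the lecture's program): returns the
// number of items in a locally declared array, computed by the compiler
// instead of being typed in by hand.
std::size_t count_test_items() {
    int test_nums[] = {5, 3, 7, 1, 3, 9, 8};
    // Total bytes in the array divided by the bytes in one item.
    return sizeof(test_nums) / sizeof(test_nums[0]);
}
```

With this list the function returns 7, which matches the NUM_ITEMS constant we typed in above. The trick only works on the array itself, not on an array that has been passed into another function.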

We search the list by setting up a loop that will examine each item in the array. Here is a chunk of code that will do this.

int min = test_nums[0];
for( int i=0; i< NUM_ITEMS; i++ ) {
    if( test_nums[i] < min ) min = test_nums[i];
}
cout << "The smallest number was " << min << endl;

In this chunk of code, we start off by thinking about the process. If we want to find the smallest item in the array, what do we start with? Probably the first item. If the list only has one item, that will be the smallest for sure (also the largest, but that is another problem). However, if the next item is smaller than this first one, we will toss the first one and replace it with the new item. We keep doing that, looking at all the items in the array, until we are done. Since we looked at them all, we will find the smallest.

Notice that the loop starts at zero, but we have already assumed that the first item (at index zero) was the smallest. What will happen if we look at it again? Well, since the test checks if the new item is “less than” our current minimum value, this test will fail, so we will not replace our minimum with itself. So, this will work, but it would be more “efficient” if we skipped looking at the first item again, and just started our loop at index one.
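Here is a minimal sketch of that more “efficient” version, with the loop starting at index one. I wrapped it in a function so it can be tried on any array. The name find_smallest and its parameters are my own invention for this illustration, and the code assumes the list has at least one item.

```cpp
// Hypothetical helper: returns the smallest of the first `count` items.
// Assumes count is at least 1.
int find_smallest(const int nums[], int count) {
    int min = nums[0];                  // assume the first item is the smallest
    for (int i = 1; i < count; i++) {   // start at 1: item 0 is already in min
        if (nums[i] < min) min = nums[i];
    }
    return min;
}
```

Calling find_smallest(test_nums, NUM_ITEMS) on our test list gives 1, the same answer as the loop that starts at zero, with one less comparison.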

Setting up a new array for the sort

To model our idea for sorting our array, we need a second array to store the sorted items. Here is code to do that:

int sort_nums[NUM_ITEMS];

for(int pass=0; pass< NUM_ITEMS; pass++) {
    int min = test_nums[0];
    for( int i=0; i<NUM_ITEMS; i++) {
        if( test_nums[i] < min ) min = test_nums[i];
    }
    sort_nums[pass] = min;
}

for(int i=0;i<NUM_ITEMS;i++) {
cout << sort_nums[i] << " ";
}
cout << endl;

This is a lot of code, but the idea is simple. We set up an empty array to receive our minimum value. We need to do this a bunch of times to sort the entire array, so we wrap up our logic in another loop that spins as many times as we have items in our array. Every time we spin through the loop, we find the smallest item, and place it in the new array. Finally, we print it out:

1 1 1 1 1 1 1

Whoa! That is not right. What happened? Well, think about what this code does. It searches the list for the smallest item and puts it into the new array, but it does not remove that item from the original array, so we keep finding the same minimum item. Could we fix this problem? Sure! We could replace that minimum value with some insanely big value so we never find it again. There is a problem with this idea, though: we do not know where that minimum value was stored, just what value it had.

Shoot!

We could tweak our code a bit and track where the current minimum value is stored. Let’s try that!

for(int pass=0; pass< NUM_ITEMS; pass++) {
    int min = test_nums[0];
    int loc = 0;
    for( int i=0; i<NUM_ITEMS; i++) {
        if( test_nums[i] < min ) {
            min = test_nums[i];
            loc = i;
        }
    }
    test_nums[loc] = 999;
    sort_nums[pass] = min;
}

This keeps track of the place where our current minimum is stored on each pass, and updates that place each time we find a new minimum. Finally, it replaces that minimum value in the original data array with our “insane” big number. (Is 999 going to be big enough for this to work all the time? Probably not!)

Here is the result of this code, and it looks like it worked, even for the duplicate value we stuck into that data set:

1 3 3 5 7 8 9
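The 999 marker stops working as soon as the data holds a value of 999 or more. One possible way around that, sketched below, is to use the largest value an int can hold as the marker, which std::numeric_limits<int>::max() from the <limits> header provides. The function name sort_into and its parameters are made up for this illustration, and the sketch still assumes that no real data item equals that largest value.

```cpp
#include <limits>

// Hypothetical helper: copies the items of `source` into `sorted` in
// ascending order. Assumes no item equals the largest possible int.
// Like the lecture's version, it overwrites every item of `source`.
void sort_into(int source[], int sorted[], int count) {
    const int USED = std::numeric_limits<int>::max();  // "insanely big" marker
    for (int pass = 0; pass < count; pass++) {
        int min = source[0];
        int loc = 0;
        for (int i = 1; i < count; i++) {
            if (source[i] < min) {
                min = source[i];
                loc = i;
            }
        }
        source[loc] = USED;    // "remove" the item we just found
        sorted[pass] = min;
    }
}
```

With this marker, a list such as {5,3,7,1,3,9,1000} sorts correctly, where the 999 version would put a 999 ahead of the 1000. The original array is still destroyed along the way.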

We can stop now, we are done. Right? Wrong!

Can we do better?

This is not a good solution at all. We needed a completely separate array to do this job. What if we came up with a better scheme that did not need this?

Can we sort the array “in place”, by just rearranging the data? Well that sounds like an idea. But first we have to think about how we can swap two values in a computer.

Swapping Data Values

Here is a simple piece of code. What will happen?

int x = 5;
int y = 7;

x = y;
y = x;

I hope by now you can figure out that this will not work at all. We will lose our value in x by writing over it in the first assignment statement, then we will copy that same value back into y and end up with two copies of what was in y. That does not swap at all.

We need another variable to do this, and this new variable will temporarily keep track of what we are about to lose in that first assignment. Here is the correct solution. Commit this to memory!

int x = 5;
int y = 7;
int temp;

temp = x;
x = y;
y = temp;

Here, we kept a copy of what was in x before we zapped it with its new value, then we put what was in x originally into y. This time we have a good swap!
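For comparison, the C++ standard library already has a function that does this same three-step dance: std::swap, found in the <utility> header. Here is a small sketch that uses it to swap two items in an array. The wrapper name swap_items is hypothetical, and it assumes both positions are valid indexes into the array.

```cpp
#include <utility>  // std::swap

// Hypothetical helper: swaps the items at positions a and b of nums.
// Assumes a and b are both valid indexes into the array.
void swap_items(int nums[], int a, int b) {
    std::swap(nums[a], nums[b]);   // same effect as the temp-variable version
}
```

It is still worth knowing the temp-variable version by heart, since that is what the rest of this lecture uses.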

Sorting In Place

Now that we know how to swap two data items, can we modify our scheme to use this idea to make our data move around until it is sorted? Let’s try and see.

Think about how that set of nested loops is working. The outer loop is trying to find the smallest value to put into the array slot whose index is the pass value. Inside that loop, we look at the entire array to see where the smallest item is found. If we find that item, and swap it into the right spot, would this work? Let’s try it and see:

int test_nums1[] = {5,3,7,1,3,9,8};
int temp;
for(int pass=0; pass< NUM_ITEMS; pass++) {
    int min = test_nums1[0];
    int loc = 0;
    for( int i=0; i<NUM_ITEMS; i++) {
        if( test_nums1[i] < min ) {
            min = test_nums1[i];
            loc = i;
        }
    }
    temp = test_nums1[pass];
    test_nums1[pass] = min;
    test_nums1[loc] = temp;
}

Note

I changed the name of the array because I built this lecture all in one file, and in that last example, I zapped the original test array. So, I constructed a second one and had to give it a new name. I will show the entire program at the end of this lecture.

3 7 5 3 9 8 1

Well, that did not work. Why not?

The problem is simple to see if you think about it. We swapped the smallest item into position and moved what was there into that item’s old spot. But we keep on looking for our smallest item starting at the beginning of the array, and that means we look at items we know are in the right spot.

On every pass through the outer loop, we are not looking for the smallest item in the entire list, just in the list that is left to search. This is how we did things manually back when we were sorting cards manually. (Remember, we took the smallest item out of our deck and put it on the table!)

So that inner loop needs to look only at the items left to be sorted. We can make a simple change to make this happen. Start the inner loop at “pass”, not at zero, and see what happens:

for( int i=pass; i<NUM_ITEMS; i++) {

That is all I changed. Now see how it works:

9 1 3 3 5 7 8

So close, but so far away!

Well durn! This looks almost right, but it failed. Here we go again, debugging our ideas!

On each pass through the outer loop, we started off by setting min to the first item in the list. That is just not what we should be doing. If we “removed” a card from our original card deck, we only looked at the cards remaining in the deck. That means we need to look at item “pass” for our first minimum value, not item zero! One more change:

int min = test_nums1[pass];
int loc = pass;

Here is our output now:

1 3 3 5 7 8 9

Houston, we have lift off! (Sorry, I am an aerospace engineer, and I did watch a lot of space launches back when! I even shared an office for several years with a Shuttle Astronaut! See American Heroes)

We finally have a sorting scheme that seems useful. I bet your head is hurting about now, so we will let it cool off a bit until next time.
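To pull the final changes together in one place, here is a sketch of the working scheme wrapped in a function. This approach is commonly known as a selection sort. The name sort_in_place and its parameters are my own for this illustration. The sketch also starts the inner loop at pass + 1 instead of pass, which is a small variation on the code above: item “pass” is already our starting minimum, so there is no need to look at it again.

```cpp
// Hypothetical helper: sorts the first `count` items of nums into
// ascending order, in place, using the scheme built up in this lecture.
void sort_in_place(int nums[], int count) {
    for (int pass = 0; pass < count; pass++) {
        int min = nums[pass];   // start with the first unsorted item
        int loc = pass;
        // Search only the items that have not been sorted yet.
        for (int i = pass + 1; i < count; i++) {
            if (nums[i] < min) {
                min = nums[i];
                loc = i;
            }
        }
        // Swap the smallest remaining item into slot `pass`.
        int temp = nums[pass];
        nums[pass] = min;
        nums[loc] = temp;
    }
}
```

Calling sort_in_place(test_nums1, NUM_ITEMS) on the list {5,3,7,1,3,9,8} leaves the array holding 1 3 3 5 7 8 9.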

Just in case you want to run this entire experiment, here is the full program I set up for this example thinking process!

#include <iostream>
using namespace std;

int main( int argc, char *argv[] ) {
    const int NUM_ITEMS = 7;
    int test_nums[] = {5,3,7,1,3,9,8};

    // Find the smallest item in the list
    int min = test_nums[0];
    for( int i=0; i< NUM_ITEMS; i++ ) {
        if( test_nums[i] < min ) min = test_nums[i];
    }
    cout << "The smallest number was " << min << endl;

    // First try: copy the minimum into a second array on every pass
    // (finds the same minimum each time)
    int sort_nums[NUM_ITEMS];

    for(int pass=0; pass< NUM_ITEMS; pass++) {
        int min = test_nums[0];
        for( int i=0; i<NUM_ITEMS; i++) {
            if( test_nums[i] < min ) min = test_nums[i];
        }
        sort_nums[pass] = min;
    }

    for(int i=0;i<NUM_ITEMS;i++) {
        cout << sort_nums[i] << " ";
    }
    cout << endl;

    // Second try: track where the minimum was and overwrite it with 999
    for(int pass=0; pass< NUM_ITEMS; pass++) {
        int min = test_nums[0];
        int loc = 0;
        for( int i=0; i<NUM_ITEMS; i++) {
            if( test_nums[i] < min ) {
                min = test_nums[i];
                loc = i;
            }
        }
        test_nums[loc] = 999;
        sort_nums[pass] = min;
    }

    for(int i=0;i<NUM_ITEMS;i++) {
        cout << sort_nums[i] << " ";
    }
    cout << endl;

    // In-place sort, first try: always searches the entire array
    int test_nums1[] = {5,3,7,1,3,9,8};
    int temp;
    for(int pass=0; pass< NUM_ITEMS; pass++) {
        int min = test_nums1[0];
        int loc = 0;
        for( int i=0; i<NUM_ITEMS; i++) {
            if( test_nums1[i] < min ) {
                min = test_nums1[i];
                loc = i;
            }
        }
        temp = test_nums1[pass];
        test_nums1[pass] = min;
        test_nums1[loc] = temp;
    }

    for(int i=0;i<NUM_ITEMS;i++) {
        cout << test_nums1[i] << " ";
    }
    cout << endl;

    // In-place sort, second try: inner loop starts at pass,
    // but min still starts out as item zero
    for(int pass=0; pass< NUM_ITEMS; pass++) {
        int min = test_nums1[0];
        int loc = 0;
        for( int i=pass; i<NUM_ITEMS; i++) {
            if( test_nums1[i] < min ) {
                min = test_nums1[i];
                loc = i;
            }
        }
        temp = test_nums1[pass];
        test_nums1[pass] = min;
        test_nums1[loc] = temp;
    }

    for(int i=0;i<NUM_ITEMS;i++) {
        cout << test_nums1[i] << " ";
    }
    cout << endl;

    // In-place sort, final version: min and loc both start at pass
    for(int pass=0; pass< NUM_ITEMS; pass++) {
        int min = test_nums1[pass];
        int loc = pass;
        for( int i=pass; i<NUM_ITEMS; i++) {
            if( test_nums1[i] < min ) {
                min = test_nums1[i];
                loc = i;
            }
        }
        temp = test_nums1[pass];
        test_nums1[pass] = min;
        test_nums1[loc] = temp;
    }

    for(int i=0;i<NUM_ITEMS;i++) {
        cout << test_nums1[i] << " ";
    }
    cout << endl;
    return 0;
}