Functions That Call Themselves

This idea is a bit shocking to beginners.

Suppose you iare writing the code for a function, and it occurs to you that using the same funciton you are writing to solve the problem you are trying to solve would help. How would that work?

Let’s consider the classic first example of this.

Factorial

You should have learned about this at some point in your math classes:

5! = 5 * 4 * 4 * 2 * 1

n! = n * (n-1) * (n-2) ... * 1

But wait!

n! = n * (n-1)!

This looks interesting. However, there is a catch to this formula. It does not work for any value of n (only positive numbers), and we need a way to stop it.

That way is found by noting that 1! is simply 1. That is called a base case. The factorial of any other (positive) value of n can be figured out using that simple “recursive” formula.

But what does it look like in code?

ex1.py
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
def fact(n):
    if n == 1: return 1
    return n * fact(n-1)

def main():
    nums = [1,2,3,4,5,6]
    for n in nums:
        print("%d! = " % n,fact(n))

main()

That is about as simple as it gets! We have produced a short chunk of code that should work. It is hard to mess up a piece of code this short. But does it work?

Let’s run it and see:

$ python ex1.py
1! =  1
2! =  2
3! =  6
4! =  24
5! =  120
6! =  720

Here is another classic recursive problem:

The formula for a Fibonacci Number for some parameter “n” is defined as:

  • F(0) = 0
  • F(1) = 1
  • F(n) = F(n-1) + F(n-2)

In this case, there are two `base cases, one for 1 and one for 0. Here is the code:

ex2.py
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12

def fibo(n):
    if n == 0: return 0
    if n == 1: return 1
    return fibo(n-1) + fibo(n-2)

def main():
    nums = [0,1,2,3,4,5]
    for n in range(21):
        print("F(%d) = " % n, fibo(n))

main()

here is this code in action:

$ python ex2.py
F(0) =  0
F(1) =  1
F(2) =  1
F(3) =  2
F(4) =  3
F(5) =  5
F(6) =  8
F(7) =  13
F(8) =  21
F(9) =  34
F(10) =  55
F(11) =  89
F(12) =  144
F(13) =  233
F(14) =  377
F(15) =  610
F(16) =  987
F(17) =  1597
F(18) =  2584
F(19) =  4181
F(20) =  6765

Recursion in detail

Is there anything wrong with this?

Let’s modify that last program, and see what is really happening:

ex2.py
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11

def fibo(n):
    print(" f(",n,") called")
    if n == 0: return 0
    if n == 1: return 1
    return fibo(n-1) + fibo(n-2)

def main():
    print(fibo(5))

main()

Running this gives this output:

$ python ex3.py
 f( 5 ) called
 f( 4 ) called
 f( 3 ) called
 f( 2 ) called
 f( 1 ) called
 f( 0 ) called
 f( 1 ) called
 f( 2 ) called
 f( 1 ) called
 f( 0 ) called
 f( 3 ) called
 f( 2 ) called
 f( 1 ) called
 f( 0 ) called
 f( 1 ) called
5

Now we can see the issue. We are actually calling the function over and over, actually solving a problem we may have already worked out. But, we did not keep that old solution around, so we need to figure it out again. This wastes processing time.

For a small number like five, this is no big deal, but what about 10,000. How many wasted calls do we make then.

Can we do better?

To answer that, we need to think of a way to save our old answers!

Since we will want to check for old answers based on a value of n, it looks like a Python dictionary will do the job. Here is a modified program that should be faster:

ex4.py
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17

cache = {
    0:0,
    1:1
}

def fibo(n):
    print(" f(",n,") called")
    if n in cache: return cache[n]
    newvalue = fibo(n-1) + fibo(n-2)
    cache[n] = newvalue
    return newvalue

def main():
    print(fibo(20))

main()

Running this gives this output:

$ python ex4.py
 f( 20 ) called
 f( 19 ) called
 f( 18 ) called
 f( 17 ) called
 f( 16 ) called
 f( 15 ) called
 f( 14 ) called
 f( 13 ) called
 f( 12 ) called
 f( 11 ) called
 f( 10 ) called
 f( 9 ) called
 f( 8 ) called
 f( 7 ) called
 f( 6 ) called
 f( 5 ) called
 f( 4 ) called
 f( 3 ) called
 f( 2 ) called
 f( 1 ) called
 f( 0 ) called
 f( 1 ) called
 f( 2 ) called
 f( 3 ) called
 f( 4 ) called
 f( 5 ) called
 f( 6 ) called
 f( 7 ) called
 f( 8 ) called
 f( 9 ) called
 f( 10 ) called
 f( 11 ) called
 f( 12 ) called
 f( 13 ) called
 f( 14 ) called
 f( 15 ) called
 f( 16 ) called
 f( 17 ) called
 f( 18 ) called
6765

It looks like this still works, but have we actually improved anything. There is a simple way to see. Let’s time the code:

Using a standard Python package called timeit we can create a stopwatch:

ex5.py
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
from timeit import timeit

cache = {
    0:0,
    1:1
}

def fibo(n):
    print(" f(",n,") called")
    if n in cache: return cache[n]
    newvalue = fibo(n-1) + fibo(n-2)
    cache[n] = newvalue
    return newvalue

def main():
    t1 = timeit()
    print(fibo(20))
    t2 = timeit()
    print(t2-t1)

main()

Now, we should see some time data:

$ python ex5.py
 f( 20 ) called
 f( 19 ) called
 f( 18 ) called
 f( 17 ) called
 f( 16 ) called
 f( 15 ) called
 f( 14 ) called
 f( 13 ) called
 f( 12 ) called
 f( 11 ) called
 f( 10 ) called
 f( 9 ) called
 f( 8 ) called
 f( 7 ) called
 f( 6 ) called
 f( 5 ) called
 f( 4 ) called
 f( 3 ) called
 f( 2 ) called
 f( 1 ) called
 f( 0 ) called
 f( 1 ) called
 f( 2 ) called
 f( 3 ) called
 f( 4 ) called
 f( 5 ) called
 f( 6 ) called
 f( 7 ) called
 f( 8 ) called
 f( 9 ) called
 f( 10 ) called
 f( 11 ) called
 f( 12 ) called
 f( 13 ) called
 f( 14 ) called
 f( 15 ) called
 f( 16 ) called
 f( 17 ) called
 f( 18 ) called
6765
8.001300739124417e-05

To see our improvement, we need to add timing code to the simple recursive version:

ex6.py
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
from timeit import timeit

def fibo(n):
    if n == 0: return 0
    if n == 1: return 1
    return fibo(n-1) + fibo(n-2)

def main():
    t1 = timeit()
    print(fibo(20))
    t2 = timeit()
    print(t2-t1)

main()

Running this gives this output:

$ python ex6.py
6765
-0.0011871590104419738

That is a pretty significant improvement.

Using Recursion on Lists

If you think about it, a linked list is a recursive thing as well. We start off with an empty list, then add a node. If we are using a simple node object, with a number and a pointer to the next node in the list, that “next node” pointer is NULL, indicating that it is the last one in the list. If we add one more node to the end, we have yet another linked list, this one consisting of a head object, and what can be considered as another linked list (with one item in it). By making use of this fact, we can write routines that access our list using recursion.

As a refresher, here are the original routines we used to build a simple linked list of names:

So Which is better?

For every recursive routine you write, there is an equivalent non-recursive method that will do the same thing. In general, the non-recursive routine will be harder to write, and may be easier to write wrongly. So, most of the time, it is probably better to start off writing recursive routines if you can, then rewrite them if you need more speed, and need to use less memory to get your job done.