Decoding AVR Instructions
#########################

..  include:: /header.inc
..  vim:filetype=rst spell:

The AVR instruction set summary details how individual instructions are
encoded. From that information and the ATtiny85 datasheet, I put together a
file containing all of the instructions available in this processor and the
encoding of each. A bit of Python analysis turned out a summary of the
instruction set, ordered by the bit patterns we will see in the final hex file.

..	include::	opcodes.inc

..  warning::

    The data file used to produce this table is still being proof-read. You
    will be able to play with it later in this lecture.


Some things may strike you as odd. For instance, in that first part of the
table, the ``ADD`` and ``LSL`` instructions have the same basic encoding.
Adding a register to itself shifts all of its bits one place to the left, and
for ``LSL`` the two register fields, where we normally decode ``Rd`` and ``Rr``
for a two-operand instruction, hold the same register number. That means these
two are the same instruction; the chip designers simply gave you a more
familiar mnemonic to use in assembly coding.
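
A minimal Python sketch makes the point concrete. The helper names
``encode_add`` and ``encode_lsl`` are made up for this illustration; the bit
layout ``0000 11rd dddd rrrr`` is the ``ADD`` encoding from the instruction
encoding table.

..  code-block:: python

    def encode_add(d, r):
        """Build the 16-bit opcode for ADD Rd, Rr (0000 11rd dddd rrrr)."""
        assert 0 <= d <= 31 and 0 <= r <= 31
        # bit 4 of r moves up to bit 9, d fills bits 8..4, low 4 bits of r stay put
        return 0x0C00 | ((r & 0x10) << 5) | (d << 4) | (r & 0x0F)

    def encode_lsl(d):
        """LSL Rd is ADD Rd, Rd under another name."""
        return encode_add(d, d)

    # Adding an 8-bit value to itself matches a one-bit left shift
    value = 0b00101101
    doubled = (value + value) & 0xFF
    shifted = (value << 1) & 0xFF
    print(doubled == shifted)

    # The LSL opcode carries the same register number in both fields
    print(f"{encode_lsl(24):016b}")

Both register fields in the printed opcode hold ``11000``, which is why a
decoder that only looks at bit patterns cannot tell ``LSL R24`` from
``ADD R24, R24``.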

Decode Output Signals
*********************

From the table above, we can figure out what signals the decode unit needs to produce:

    +--+-------+-------------+
    |Rd|5-bits |0<=d<=31     |   
    +--+-------+-------------+
    |Rr|5-bits |0<=r<=31     |
    +--+-------+-------------+
    |K |6-bits |0<=K<=63     |
    +--+-------+-------------+
    |A |6-bits |0<=A<=63     |
    +--+-------+-------------+
    |s |3-bits |0<=s<=7      |
    +--+-------+-------------+
    |k |16-bits|0<=k<=65535  |
    +--+-------+-------------+
    |k |7-bits |-64<=k<=63   |
    +--+-------+-------------+
    |q |6-bits |0<=q<=63     |
    +--+-------+-------------+
    |b |3-bits |0<=b<=7      |
    +--+-------+-------------+

The legal values allowed on each signal are shown, deduced from the AVR instruction documentation.
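
The ranges follow directly from the field widths. As a quick check, here is a
small Python sketch (the helper names ``unsigned_range`` and ``signed_range``
are made up for this example) that computes the legal range for an unsigned
field and for a signed two's-complement field such as the 7-bit branch offset:

..  code-block:: python

    def unsigned_range(bits):
        """Smallest and largest value of an unsigned field of this width."""
        return 0, (1 << bits) - 1

    def signed_range(bits):
        """Smallest and largest value of a two's-complement field."""
        half = 1 << (bits - 1)
        return -half, half - 1

    for name, bits in [("Rd", 5), ("K", 6), ("b", 3), ("k", 16)]:
        low, high = unsigned_range(bits)
        print(f"{name}: {bits} bits, {low} <= {name} <= {high}")

    low, high = signed_range(7)
    print(f"k: 7 bits (signed), {low} <= k <= {high}")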

Basically, the job of decoding instructions amounts to a tedious exercise in
checking patterns. The most common way to do this involves the C++
``switch`` statement. We can use the ``bitset`` data type to play around with individual bits.

..  literalinclude::    code3/decode.cpp
    :linenos:
    :caption: decode.cpp

This code (which is incomplete) shows how to extract the bits that make up a
register number. Let's compile it first:

..  command-output::    g++ -o demo decode.cpp
    :cwd: code3

..  note::

    The code shown here can convert a string of binary digits to a ``bitset``,
    and from a ``bitset`` to an ``unsigned long``. You should already know
    how to convert from an integer to a bitset.

Let's see this code in action:

.. command-output::    ./demo
    :cwd: code3

The logic here figures out the register numbers, but I will leave it up to you to verify that things are correct.
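
If you want something to check your answers against, here is a rough Python
sketch of the same pattern-checking idea. It is not a translation of
``decode.cpp``: the ``PATTERNS`` table and the ``decode`` function are invented
for this example, and only a handful of two-operand instructions sharing the
``rd dddd rrrr`` operand layout are listed.

..  code-block:: python

    # (mask, match, mnemonic): the top six bits select the instruction
    PATTERNS = [
        (0xFC00, 0x0C00, "ADD"),   # 0000 11rd dddd rrrr
        (0xFC00, 0x1C00, "ADC"),   # 0001 11rd dddd rrrr
        (0xFC00, 0x1800, "SUB"),   # 0001 10rd dddd rrrr
        (0xFC00, 0x2000, "AND"),   # 0010 00rd dddd rrrr
        (0xFC00, 0x2C00, "MOV"),   # 0010 11rd dddd rrrr
    ]

    def decode(opcode):
        """Return (mnemonic, Rd, Rr), or None if no pattern matches."""
        for mask, match, name in PATTERNS:
            if (opcode & mask) == match:
                rd = (opcode >> 4) & 0x1F                       # bits 8..4
                rr = ((opcode >> 5) & 0x10) | (opcode & 0x0F)   # bit 9, bits 3..0
                return name, rd, rr
        return None

    print(decode(0x0F82))
    print(decode(0x2C01))

A table like this plays the role of the ``switch`` statement: each entry says
which bits to look at and what they must equal.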

Endians
*******

Not Indians, "Endians"!

Computer systems are funny beasts. We want to store data types of any size in
our memory, but often that memory is just a bunch of bytes. How do we pack
16 bits into 8-bit containers?

Simple, we use two of those bytes to hold the 16 bits. But as soon as we decide
to do that, we have a question to answer. Which byte comes first?

Little Endian
=============

The most common scheme is "little endian". In this scheme, the low byte is
placed at the lower address, and the upper byte is placed one byte above that
one. This scheme also handles bigger data types: just keep placing successive
bytes on top of the lower bytes until you are done.

The Pentium (and the AVR) are "little endian" systems.

Big Endian
==========

As you might suspect, "big endian" uses the opposite scheme. High bytes in the
data type end up at low addresses. The only system I am familiar with that uses
this scheme is built by Sun Microsystems, now part of Oracle. I have not seen
one of their systems in years.
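
Python can show both byte orders side by side. This is just a sketch; the
16-bit value ``0x0F82`` is an arbitrary example, and ``int.to_bytes`` and
``int.from_bytes`` are standard Python 3 methods.

..  code-block:: python

    value = 0x0F82

    # Lowest address first in each case
    little = value.to_bytes(2, "little")
    big = value.to_bytes(2, "big")

    print("little endian:", little.hex())
    print("big endian:   ", big.hex())

    # Reading the bytes back with the matching order recovers the value
    print(int.from_bytes(little, "little") == value)

Reading the bytes back with the wrong byte order gives a different number,
which is exactly the trap we need to avoid.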

What does this mean for our simulator?

I put together a short chunk of code and assembled it to see what the compiler produced:

..  code-block:: text

    2c: 82 0f   add r24, r18
    2e:

This shows that instruction memory is really byte addressed in this machine,
something we do not need to worry about as long as we get the right bits to
decode.

Converting this to binary, and splitting the bits up into nibbles, we see this:

..  code-block:: text

    2c: 1000 0010 0000 1111 add r24, r18

Now, according to the instruction encoding table for this instruction, we
should see this:

..  code-block:: text

    add -> 0000 11rd dddd rrrr

It looks like our "little endian" machine has swapped the bytes around.

..  code-block:: text

    1000 0010 0000 1111
    dddd rrrr 0000 11rd

Which gives us these registers:

..  code-block:: text

    Rd -> 11000 -> R24
    Rr -> 10010 -> R18

Which is just what we want to find! Looks like the encoding table matches the
bits in the code!
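
The byte swap and the field extraction can be checked in a few lines of
Python. This is only a sketch: the names ``raw`` and ``opcode`` are invented
here, and the two bytes are the ones from the listing.

..  code-block:: python

    raw = bytes.fromhex("820f")             # bytes as they appear in the listing

    # Low byte comes first, so read the pair as a little endian 16-bit value
    opcode = int.from_bytes(raw, "little")

    # ADD is 0000 11rd dddd rrrr
    rd = (opcode >> 4) & 0x1F                       # d bits: 8..4
    rr = ((opcode >> 5) & 0x10) | (opcode & 0x0F)   # r bits: 9 and 3..0

    print(f"{opcode:04x}: Rd = R{rd}, Rr = R{rr}")  # 0f82: Rd = R24, Rr = R18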

AVR Instruction Memory
**********************

The AVR instruction memory is actually a big array of bytes, but that memory is
designed to deliver 16 bits in one operation. We will model this memory as a
16-bit data array, just to make things simpler.

When we fetch data from that memory, we need to check the order of the bytes,
to make sure our instruction codes are in the right order.

That means we need careful testing to make sure that when we load a program from
the bytes in a hex file, we get the right results when we decode things in our
simulator.

Similar decoding code will handle the instructions we are going to include in
our system.
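
One way to build that 16-bit array from a run of program bytes is sketched
below. The function name ``bytes_to_words`` and the four sample bytes are
assumptions made for this example (the second pair, ``00 00``, is a ``NOP``);
reading the hex file itself is not shown.

..  code-block:: python

    import struct

    def bytes_to_words(data):
        """Pack a byte sequence into 16-bit words, low byte first."""
        if len(data) % 2:
            raise ValueError("instruction memory needs an even number of bytes")
        # "<" selects little endian, "H" is an unsigned 16-bit value
        return list(struct.unpack(f"<{len(data) // 2}H", data))

    program = bytes.fromhex("820f0000")
    words = bytes_to_words(program)
    print([f"{w:04x}" for w in words])

A test that loads a known instruction and checks the resulting word is a cheap
way to catch a byte-order mistake early.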

Using Python to Explore Code
****************************

I used Python to take apart the instruction set, and to extract real code from
the listing file generated by the ``avr-gcc`` compiler.

Here is part of the code I used:

    * :download:`decoder.py`

    * :download:`ATTiny85.json`

The ``json`` data file is something I extracted from the AVR documentation files I
showed earlier. I am actually setting up this system to generate an example of
every instruction in the chip in an assembly language file. When that file is
processed, I will be able to check the encoding tables for every instruction.
See, Python is durned handy!