Reading Intel Hex Files

Reference: Intel HEX Record Structure


For simplicity, we will set up the simulator to read instruction data from the standard hex file created by the avr-gcc tools. In order to read these files, we need to peek at the file format.

Note

Read the reference for details beyond what we need for our simulator project. This file format has been around for a long time, and is used in many different applications.

Line Structure

The file is composed of a series of lines of data. Each line contains the following data items:

Start Code

Each data line starts with one ASCII character, the “:” character, indicating the start of a data line.

Byte Count

Next, two hex digits indicate the number of data bytes in the line.
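
As a small illustration, two hex digits can be turned into a number with std::stoi and a base of 16. This is only a sketch: hex_byte is a hypothetical helper name that does not appear in the sample loader later on this page, and it assumes the line is long enough and contains valid hex digits at the requested position.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert the two hex characters starting at
// position pos into a value in the range 0..255.
// Assumes pos + 2 <= line.size() and that both characters are hex digits;
// std::stoi throws an exception if no digits can be converted.
int hex_byte(const std::string &line, std::size_t pos) {
    return std::stoi(line.substr(pos, 2), nullptr, 16);
}
```

For the record :02004000089521, hex_byte(line, 1) reads the characters "02" and returns 2.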

Address

The next four hex characters indicate the starting address where this line of data will be stored. This value is an offset added to the base address of the data file. The initial base address is assumed to be zero, so this address will translate into the actual memory address where this line of data should be loaded.
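
The address calculation described above might look like the following sketch. The function name load_address and the base parameter are assumptions made for illustration. With the base left at zero, the result is simply the offset from the record.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: compute where a record's data should be loaded.
// The four hex characters at positions 3..6 hold the offset, which is
// added to the current base address (assumed to be zero here).
// Assumes the line holds at least 7 characters and valid hex digits.
std::uint32_t load_address(const std::string &line, std::uint32_t base = 0) {
    std::uint32_t offset =
        static_cast<std::uint32_t>(std::stoul(line.substr(3, 4), nullptr, 16));
    return base + offset;
}
```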

Record Types

The next two hex digits indicate the type of record on this line. There are several codes available, but for our purposes, “00”, meaning “Normal data”, is the code we will see for data lines we need to load. A code of “01” indicates the last line in the file (see below).

Data Bytes

Next we will see a stream of hex digits representing the actual data bytes. The number of bytes is defined by the byte count field above.
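
One way to turn that stream of hex digits into actual byte values is sketched below. The name data_bytes is hypothetical, and returning an empty vector for a line that is too short is just a choice made for this sketch; a real loader would probably report the error instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode the data field of one record into bytes.
// The data starts at position 9 and its length comes from the byte count.
// Assumes the line contains only valid hex digits after the ':'.
std::vector<std::uint8_t> data_bytes(const std::string &line) {
    std::vector<std::uint8_t> bytes;
    if (line.size() < 11) {
        return bytes;               // too short to be a record
    }
    std::size_t count = std::stoul(line.substr(1, 2), nullptr, 16);
    if (line.size() < 11 + 2 * count) {
        return bytes;               // byte count does not fit the line
    }
    for (std::size_t i = 0; i < count; ++i) {
        unsigned long value = std::stoul(line.substr(9 + 2 * i, 2), nullptr, 16);
        bytes.push_back(static_cast<std::uint8_t>(value));
    }
    return bytes;
}
```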

Checksum

The final byte (two hex characters) is a checksum calculated by summing all of the bytes in the record after the start code: the byte count, the address bytes, the record type, and the data bytes (excluding the checksum byte itself). The two’s complement of the low byte of this sum is written as the checksum byte. To check that a record is valid, simply add the sum you calculated to the checksum in the record. The low byte of the resulting value should be zero.
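
The check can be sketched as follows: add up every byte on the line, including the checksum byte, and test whether the low byte of the total is zero. The name checksum_ok is hypothetical, and the sketch assumes every character after the ":" is a valid hex digit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: return true if the record's checksum is valid.
// Sums every byte after the ':' (byte count, address, record type,
// data, and the checksum itself); the low byte must come out as zero.
bool checksum_ok(const std::string &line) {
    // Need ':' plus at least 5 bytes, and an even number of hex digits.
    if (line.size() < 11 || line[0] != ':' || (line.size() - 1) % 2 != 0) {
        return false;
    }
    unsigned long sum = 0;
    for (std::size_t i = 1; i + 1 < line.size(); i += 2) {
        sum += std::stoul(line.substr(i, 2), nullptr, 16);
    }
    return (sum & 0xFF) == 0;
}
```

For example, in the record :02004000089521 the bytes before the checksum add up to 0xDF, and 0xDF + 0x21 = 0x100, whose low byte is zero.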

End File Indicator

The last line in the file has a byte count of zero, and an address field of zero as well. The Record Type code is “01”. This record contains nothing to be loaded.
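
A loader can use this record to decide when to stop. A minimal sketch of the test, using the hypothetical name is_end_record, only looks at the record type field:

```cpp
#include <string>

// Hypothetical helper: true if this line is the end-of-file record.
// The record type is the two characters at positions 7 and 8.
bool is_end_record(const std::string &line) {
    return line.size() >= 11 && line[0] == ':' && line.substr(7, 2) == "01";
}
```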

Reading the File

The file itself is a plain text file, so reading it is straightforward.

Here is a sample program that will read a hex file and break out each line into the parts defined above. This program can form the basis for a loader program we will need to initialize the instruction memory for the simulator.

hex-loader.cpp
#include <fstream>
#include <iostream>
#include <string>

// Split one record into its fields and print each one.
void parse_line(const std::string &n) {
    // Shortest valid record: ':' + count (2) + address (4) + type (2) + checksum (2)
    if (n.size() < 11 || n[0] != ':') {
        std::cout << "bad data line" << std::endl;
        return;
    }
    std::size_t len = n.size();

    std::string bytes = n.substr(1, 2);
    std::cout << "Byte count: " << bytes << std::endl;

    std::string offset = n.substr(3, 4);
    std::cout << "Offset: " << offset << std::endl;

    std::string record = n.substr(7, 2);
    std::cout << "Record Type: " << record << std::endl;

    // everything between the record type and the checksum
    std::string data = n.substr(9, len - 11);
    std::cout << "Record data: " << data << std::endl;

    std::string check = n.substr(len - 2, 2);
    std::cout << "Checksum: " << check << std::endl;
}

int main(int argc, char *argv[]) {
    // make sure a file name was given on the command line
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <hex file>" << std::endl;
        return 1;
    }

    std::ifstream fin(argv[1]);
    if (!fin.is_open()) {
        std::cout << "error reading file" << std::endl;
        return 1;
    }

    std::string line;
    // the read itself is the loop test, so the loop ends at end of file
    while (fin >> line) {
        std::cout << line << std::endl;
        parse_line(line);
    }
    return 0;
}

To test this code, I used the blink example project, modified for the attiny85 processor, and generated a hex file.

Note

The only change needed in the file involved replacing all call instructions with rcall instructions, since the attiny85 does not support the call instruction.

Here is the Makefile I used to generate the hex file and a listing file for reference:

Makefile
# source files
TARGET	:= $(shell basename $(PWD))
CSRCS	:= $(wildcard *.c)
COBJS	:= $(CSRCS:.c=.o)
SSRCS	:= $(wildcard *.S)
SOBJS	:= $(SSRCS:.S=.o)
OBJS	:= $(COBJS) $(SOBJS)

LST	:= $(TARGET).lst

# define the processor here
MCU		:= atmega328p
FREQ	:= 16000000L

# define the USB port on your system (this works on Linux)
PORT	:= /dev/ttyACM0
PGMR	:= arduino

# tools
GCC		:= avr-gcc
OBJDUMP	:= avr-objdump
OBJCOPY	:= avr-objcopy
DUDE	:= avrdude

UFLAGS	:=  -v -D -p$(MCU) -c$(PGMR)
UFLAGS		+= -P$(PORT)
UFLAGS		+= -b115200

CFLAGS	:=  -c -Os -mmcu=$(MCU)
CFLAGS		+= -DF_CPU=$(FREQ)

LFLAGS	:= -mmcu=$(MCU)
LFLAGS	+= -nostartfiles

.PHONY: all
all:	$(TARGET).hex $(LST)

# implicit build rules
%.hex:	%.elf
	$(OBJCOPY) -O ihex -R .eeprom $< $@

%.elf:	$(OBJS)
	$(GCC) $(LFLAGS) -o $@ $^ 

%.o:	%.c
	$(GCC) -c $(CFLAGS) -o $@ $^

%.o:	%.S
	$(GCC) -c $(CFLAGS) -o $@ $<

%.lst:	%.elf
	$(OBJDUMP) -C -d $< > $@

# utility targets
.PHONY:	load
load:
	$(DUDE) $(DUDECONF) $(UFLAGS) -Uflash:w:$(TARGET).hex:i

.PHONY:	clean
clean:
	$(RM) *.hex *.lst *.o *.elf

Notice the -nostartfiles addition to the LFLAGS lines. This option tells the linker to leave out the standard startup code, including the initial jump table (the interrupt vectors), which we will not need. If you are using interrupts, comment out this line.

For this demonstration (and for our simulator project) we will not need interrupts.

Here is the hex file I produced:

:10000000CFE5D2E0DEBFCDBF03D00CD010D0FDCF06
:1000100011241FBE80E88093460010924600BD9ACE
:10002000C598089588B390E2892788BB089508E2AF
:100030001FEF2FEF2A95F1F71A95D9F70A95C1F717
:02004000089521
:00000001FF

And, here is the output from this example code:

$ make clean
rm -f hex-reader *.o
$ make
g++ -o hex-reader main.cpp
$ make run
./hex-reader blink.hex
:10000000CFE5D2E0DEBFCDBF03D00CD010D0FDCF06
Byte count: 10
Offset: 0000
Record Type: 00
Record data: CFE5D2E0DEBFCDBF03D00CD010D0FDCF
Checksum: 06
:1000100011241FBE80E88093460010924600BD9ACE
Byte count: 10
Offset: 0010
Record Type: 00
Record data: 11241FBE80E88093460010924600BD9A
Checksum: CE
:10002000C598089588B390E2892788BB089508E2AF
Byte count: 10
Offset: 0020
Record Type: 00
Record data: C598089588B390E2892788BB089508E2
Checksum: AF
:100030001FEF2FEF2A95F1F71A95D9F70A95C1F717
Byte count: 10
Offset: 0030
Record Type: 00
Record data: 1FEF2FEF2A95F1F71A95D9F70A95C1F7
Checksum: 17
:02004000089521
Byte count: 02
Offset: 0040
Record Type: 00
Record data: 0895
Checksum: 21
:00000001FF
Byte count: 00
Offset: 0000
Record Type: 01
Record data: 
Checksum: FF

Completing this code so it can load our memory module is left as an exercise.