Intel HEX
Intel hexadecimal object file format, Intel hex format or Intellec Hex is a file format that conveys binary information in ASCII text form.[6] It is commonly used for programming microcontrollers, EPROMs, and other types of programmable logic devices. In a typical application, a compiler or assembler converts a program's source code (such as in C or assembly language) to machine code and outputs it into a HEX file. Common file extensions used for the resulting files are .HEX[1] or .H86.[2][3] The HEX file is then read by a programmer to write the machine code into a PROM or is transferred to the target system for loading and execution.[7][8]
History
The Intel hex format was originally designed in 1973 for Intel's Intellec Microcomputer Development Systems (MDS) to load and execute programs from paper tape, replacing the "paper-intensive" BNPF/BPNF format.[9] It also eased data transmission from customers to Intel for ROM production.[10] The format was used to program (E)PROMs via paper tapes (in Intellec Hex Paper Tape Format) or to control punched card-controlled EPROM programmers (through the Intellec Hex Computer Punched Card Format).[10]
From 1975, it was also used by the MCS Series II floppy-disk-based ISIS-II systems, with the file extension HEX.[9]
Format
Intel HEX consists of lines of ASCII text that are separated by line feed or carriage return characters or both. Each text line contains hexadecimal characters that encode multiple binary numbers. The binary numbers may represent data, memory addresses, or other values, depending on their position in the line and the type and length of the line. Each text line is called a record.
Record structure
A record (line of text) consists of six fields (parts) that appear in order from left to right:[7]
- Start code, one character, an ASCII colon ':'.
- Byte count, two hex digits (one hex digit pair), indicating the number of bytes (hex digit pairs) in the data field. The maximum byte count is 255 (0xFF). 16 (0x10) and 32 (0x20) are commonly used byte counts.
- Address, four hex digits, representing the 16-bit beginning memory address offset of the data. The physical address of the data is computed by adding this offset to a previously established base address, thus allowing memory addressing beyond the 64 kilobyte limit of 16-bit addresses. The base address, which defaults to zero, can be changed by various types of records. Base addresses and address offsets are always expressed as big endian values.
- Record type (see record types below), two hex digits, 00 to 05, defining the meaning of the data field.
- Data, a sequence of n bytes of data, represented by 2n hex digits. Some records omit this field (n equals zero). The meaning and interpretation of data bytes depends on the application.
- Checksum, two hex digits, a computed value that can be used to verify the record has no errors.
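The six fields listed above can be separated with a few lines of code. The following is a minimal Python sketch, not a reference implementation; the names `Record` and `parse_record` are illustrative and do not come from any specification. It only splits the fields and does not verify the checksum.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """Decoded fields of one record (the ':' start code is implied)."""
    byte_count: int
    address: int
    record_type: int
    data: bytes
    checksum: int


def parse_record(line: str) -> Record:
    """Split one Intel HEX text line into its fields (no checksum check)."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must begin with the ':' start code")
    raw = bytes.fromhex(line[1:])  # decode the hex digit pairs into bytes
    if len(raw) < 5:
        raise ValueError("record is too short")
    byte_count = raw[0]
    # count + address (2) + type + data (byte_count) + checksum
    if len(raw) != byte_count + 5:
        raise ValueError("byte count does not match the data field length")
    address = int.from_bytes(raw[1:3], "big")  # 16-bit big-endian offset
    return Record(byte_count, address, raw[3], raw[4:4 + byte_count], raw[-1])


record = parse_record(":0B0010006164647265737320676170A7")
print(record)
```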
Checksum calculation
A record's checksum byte is the two's complement of the least significant byte (LSB) of the sum of all decoded byte values in the record preceding the checksum. It is computed by summing the decoded byte values and extracting the LSB of the sum (i.e., the data checksum), and then calculating the two's complement of the LSB (e.g., by inverting its bits and adding one).
For example, in the case of the record :0300300002337A1E, the sum of the decoded byte values is 03 + 00 + 30 + 00 + 02 + 33 + 7A = E2, which has LSB value E2. The two's complement of E2 is 1E, which is the checksum byte appearing at the end of the record.
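The calculation can be sketched in Python as follows. The function name `compute_checksum` is an illustrative choice; its input is the decoded bytes of a record without the trailing checksum byte.

```python
def compute_checksum(record_bytes: bytes) -> int:
    """Return the two's complement of the LSB of the byte sum."""
    lsb = sum(record_bytes) & 0xFF   # keep only the least significant byte
    return (-lsb) & 0xFF             # two's complement, truncated to 8 bits


# Bytes of the record :0300300002337A1E without its checksum byte.
body = bytes.fromhex("0300300002337A")
print(f"{compute_checksum(body):02X}")  # prints 1E
```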
The validity of a record can be checked by computing its checksum and verifying that the computed checksum equals the checksum appearing in the record; an error is indicated if the checksums differ. Since the record's checksum byte is the two's complement (and therefore the additive inverse) of the data checksum, this process can be reduced to summing all decoded byte values, including the record's checksum, and verifying that the LSB of the sum is zero. When applied to the preceding example, this method produces the following result: 03 + 00 + 30 + 00 + 02 + 33 + 7A + 1E = 100, which has LSB value 00.
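The reduced check, in which all bytes including the checksum are summed and the LSB is compared with zero, might look like this in Python. The name `record_is_valid` is hypothetical, and the sketch assumes the line is otherwise well formed.

```python
def record_is_valid(line: str) -> bool:
    """True if the LSB of the sum of all decoded bytes is zero."""
    raw = bytes.fromhex(line.strip()[1:])  # drop ':' and decode hex pairs
    return (sum(raw) & 0xFF) == 0


print(record_is_valid(":0300300002337A1E"))  # True
print(record_is_valid(":0300300002337A1F"))  # False (corrupted checksum)
```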
Text line terminators
Intel HEX records are separated by one or more ASCII line termination characters so that each record appears alone on a text line. This enhances legibility by visually delimiting the records and it also provides padding between records that can be used to improve machine parsing efficiency.
Programs that create HEX records typically use line termination characters that conform to the conventions of their operating systems. For example, Linux programs use a single LF (line feed, hex value 0A) character to terminate lines, whereas Windows programs use a CR (carriage return, hex value 0D) followed by an LF.
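A reader that should accept files from either kind of system can treat any of these terminators as a record separator. The sketch below relies on Python's built-in `str.splitlines`, which splits on LF, CR, and CR LF; the helper name `split_records` is an assumption made for illustration.

```python
def split_records(text: str) -> list:
    """Return the non-empty records of a HEX file, whatever the terminator."""
    return [line.strip() for line in text.splitlines() if line.strip()]


unix_text = ":0300300002337A1E\n:00000001FF\n"
windows_text = ":0300300002337A1E\r\n:00000001FF\r\n"
print(split_records(unix_text) == split_records(windows_text))  # True
```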
Record types
Intel HEX has six standard record types:[7]
Hex code | Record type | Description | Example |
---|---|---|---|
00 | Data | Contains data and a 16-bit starting address for the data. The byte count specifies number of data bytes in the record. The example shown to the right has 0B (eleven) data bytes (61, 64, 64, 72, 65, 73, 73, 20, 67, 61, 70) located at consecutive addresses beginning at address 0010. | :0B0010006164647265737320676170A7 |
01 | End Of File | Must occur exactly once per file in the last line of the file. The data field is empty (thus byte count is 00) and the address field is typically 0000. | :00000001FF |
02 | Extended Segment Address | The data field contains a 16-bit segment base address (thus byte count is always 02) compatible with 80x86 real mode addressing. The address field (typically 0000) is ignored. The segment address from the most recent 02 record is multiplied by 16 and added to each subsequent data record address to form the physical starting address for the data. This allows addressing up to one megabyte of address space. | :020000021200EA |
03 | Start Segment Address | For 80x86 processors, specifies the initial content of the CS:IP registers (i.e., the starting execution address). The address field is 0000, the byte count is always 04, the first two data bytes are the CS value, the latter two are the IP value. | :0400000300003800C1 |
04 | Extended Linear Address | Allows for 32-bit addressing (up to 4 GiB). The record's address field is ignored (typically 0000) and its byte count is always 02. The two data bytes (big endian) specify the upper 16 bits of the 32-bit absolute address for all subsequent type 00 records; these upper address bits apply until the next 04 record. The absolute address for a type 00 record is formed by combining the upper 16 address bits of the most recent 04 record with the low 16 address bits of the 00 record. If a type 00 record is not preceded by any type 04 records then its upper 16 address bits default to 0000. | :02000004FFFFFC |
05 | Start Linear Address | The address field is 0000 (not used) and the byte count is always 04. The four data bytes represent a 32-bit address value (big-endian). In the case of 80386 and higher CPUs, this address is loaded into the EIP register. | :04000005000000CD2A |
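The way record types 02 and 04 set the base address for later data records can be illustrated with a short Python sketch. The function name `iter_data` is hypothetical. As simplifying assumptions, checksums are not verified, start-address records (03 and 05) are skipped, and an offset that runs past the end of a 64 KiB segment is not wrapped.

```python
from typing import Iterable, Iterator, Tuple


def iter_data(lines: Iterable[str]) -> Iterator[Tuple[int, bytes]]:
    """Yield (absolute_address, data) for each data record."""
    base = 0  # the base address defaults to zero
    for line in lines:
        raw = bytes.fromhex(line.strip()[1:])
        count, rtype = raw[0], raw[3]
        offset = int.from_bytes(raw[1:3], "big")
        data = raw[4:4 + count]
        if rtype == 0x00:                      # Data
            yield base + offset, data
        elif rtype == 0x01:                    # End Of File
            return
        elif rtype == 0x02:                    # Extended Segment Address
            base = int.from_bytes(data, "big") * 16
        elif rtype == 0x04:                    # Extended Linear Address
            base = int.from_bytes(data, "big") << 16
        # Types 03 and 05 carry start addresses and do not move the base.


records = [
    ":020000021200EA",
    ":0B0010006164647265737320676170A7",
    ":02000004FFFFFC",
    ":0B0010006164647265737320676170A7",
    ":00000001FF",
]
for address, data in iter_data(records):
    print(f"{address:08X} {data!r}")
```

With the example records from the table, the same data record lands at 00012010 after the 02 record and at FFFF0010 after the 04 record.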
Named formats
The original 4-bit/8-bit Intellec Hex Paper Tape Format and Intellec Hex Computer Punched Card Format supported only record types 00 and 01.[10]
The Extended Intellec Hex Format additionally supports record type 02.
Special names are sometimes used to denote the formats of HEX files that employ specific subsets of record types. For example:
- I8HEX files use only record types 00 and 01 (16-bit addresses)
- I16HEX files use only record types 00 through 03 (20-bit addresses)[6]
- I32HEX files use only record types 00, 01, 04, and 05 (32-bit addresses)
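Because these names are defined purely by the set of record types in use, a file can be classified with a subset test. The following Python sketch uses illustrative function names and treats any other combination as "mixed or non-standard", which is a label chosen here rather than an established term.

```python
from typing import Iterable, Set


def record_types_of(lines: Iterable[str]) -> Set[int]:
    """Collect the record type field (hex digits 7-8 after ':') of each line."""
    return {int(line.strip()[7:9], 16) for line in lines if line.strip()}


def named_format(record_types: Set[int]) -> str:
    """Map a set of record types to the narrowest matching named format."""
    if record_types <= {0x00, 0x01}:
        return "I8HEX"
    if record_types <= {0x00, 0x01, 0x02, 0x03}:
        return "I16HEX"
    if record_types <= {0x00, 0x01, 0x04, 0x05}:
        return "I32HEX"
    return "mixed or non-standard"


lines = [":020000021200EA", ":0B0010006164647265737320676170A7", ":00000001FF"]
print(named_format(record_types_of(lines)))  # I16HEX
```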
File example
This example shows a file that has four data records followed by an end-of-file record:
:10010000214601360121470136007EFE09D2190140
:100110002146017E17C20001FF5F16002148011928
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
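As a rough illustration, the example file can be decoded into a memory image with the Python sketch below. The names `EXAMPLE` and `load_image` are hypothetical, and the loader handles only record types 00 and 01, which is all this file uses.

```python
EXAMPLE = """\
:10010000214601360121470136007EFE09D2190140
:100110002146017E17C20001FF5F16002148011928
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
"""


def load_image(text: str) -> dict:
    """Return {address: byte value} for a file using record types 00 and 01."""
    memory = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        raw = bytes.fromhex(line[1:])
        if sum(raw) & 0xFF:                    # LSB of the sum must be zero
            raise ValueError(f"bad checksum in {line!r}")
        count, rtype = raw[0], raw[3]
        offset = int.from_bytes(raw[1:3], "big")
        if rtype == 0x01:                      # End Of File
            break
        if rtype == 0x00:                      # Data
            for i, value in enumerate(raw[4:4 + count]):
                memory[offset + i] = value
    return memory


image = load_image(EXAMPLE)
print(f"{len(image)} bytes, {min(image):#06x}..{max(image):#06x}")
# prints: 64 bytes, 0x0100..0x013f
```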
Variants
Besides Intel's own extension, several third parties have also defined variants and extensions of the Intel hex format, including Digital Research (as in the so-called "Digital Research hex format"),[3] Zilog, Texas Instruments, Microchip, and c't. These can have information on program entry points and register contents, a swapped byte order in the data fields, and other differences.
The Digital Research hex format for 8086 processors supports segment information by adding record types to distinguish between code, data, stack, and extra segments.[2][3]
Most assemblers for CP/M-80 (and also XASM09 for the Motorola 6809) do not use record type 01h to indicate the end of a file, but use a zero-length data record of type 00h instead. This eases the concatenation of multiple hex files.[11][12][1]
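A reader that should accept both conventions could test for the end of a file along the following lines. This is a sketch in Python; the function name `is_end_record` and the `cpm_style` flag are assumptions made for illustration, not part of any tool mentioned here.

```python
def is_end_record(line: str, cpm_style: bool = False) -> bool:
    """Detect the end of a file, optionally accepting the CP/M convention."""
    raw = bytes.fromhex(line.strip()[1:])
    byte_count, record_type = raw[0], raw[3]
    if record_type == 0x01:                    # standard End Of File record
        return True
    # CP/M-style terminator: a type 00 record with an empty data field
    return cpm_style and record_type == 0x00 and byte_count == 0


print(is_end_record(":00000001FF"))                  # True
print(is_end_record(":0000000000"))                  # False
print(is_end_record(":0000000000", cpm_style=True))  # True
```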
Texas Instruments defines a variant where addresses are based on the bit-width of a processor's registers, not bytes.
Microchip defines variants INHX8S[13] (INHX8L,[1] INHX8H[1]), INHX8M,[13][1][14] INHX16[13] (INHX16M[1]) and INHX32[15] for their PIC microcontrollers.
Alfred Arnold's cross-macro-assembler AS,[1] Werner Hennig-Roleff's 8051-emulator SIM51, and Matthias R. Paul's cross-converter BINTEL are also known to define extensions to the Intel hex format.
See also
- Binary-to-text encoding, a survey and comparison of encoding algorithms
- MOS Technology file format
- Motorola S-record hex format
- Tektronix hex format
References
- Arnold, Alfred (2020) [1996, 1989]. "6.3. P2HEX". Macro Assembler AS - User's Manual. V1.42. Translated by Arnold, Alfred; Hilse, Stefan; Kanthak, Stephan; Sellke, Oliver; De Tomasi, Vittorio. Archived from the original on 2020-02-28. Retrieved 2020-02-28.
[…] For the PIC microcontrollers, the switch -m <0..3> allows to generate the three different variants of the Intel Hex format. Format 0 is INHX8M which contains all bytes in a Lo-Hi-Order. Addresses become double as large because the PICs have a word-oriented address space that increments addresses only by one per word. […] With Format 1 (INHX16M), bytes are stored in their natural order. This is the format Microchip uses for its own programming devices. Format 2 (INHX8L) resp. 3 (INHX8H) split words into their lower resp. upper bytes. […] Unfortunately, one finds different statements about the last line of an Intel-Hex file in literature. Therefore, P2HEX knows three different variants that may be selected […] :00000001FF […] :00000001 […] :0000000000 […] By default, variant 0 is used which seems to be the most common one. […] If the target file name does not have an extension, an extension of HEX is supposed. […]
- "3.1. Intel 8086 Hex File Format". CP/M-86 Operating System - System Guide (PDF) (2nd printing, 1st ed.). Pacific Grove, California, USA: Digital Research. June 1981. pp. 15–16. Archived (PDF) from the original on 2020-02-28. Retrieved 2020-02-28. (17 pages)
- "Appendix C. ASM-86 Hexadecimal Output Format". CP/M-86 - Operating System - Programmer's Guide (PDF) (3 ed.). Pacific Grove, California, USA: Digital Research. January 1983 [1981]. pp. 97–100. Archived (PDF) from the original on 2020-02-27. Retrieved 2020-02-27.
[…] The Intel format is identical to the format defined by Intel for the 8086. The Digital Research format is nearly identical to the Intel format, but adds segment information to hexadecimal records. Output of either format can be input to GENCMD, but the Digital Research format automatically provides segment identification. A segment is the smallest unit of a program that can be relocated. […] It is in the definition of record types 00 and 02 that Digital Research's hexadecimal format differs from Intel's. Intel defines one value each for the data record type and the segment address type. Digital Research identifies each record with the segment that contains it. […] 00H for data belonging to all 8086 segments […] 81H for data belonging to the CODE segment […] 82H for data belonging to the DATA segment […] 83H for data belonging to the STACK segment […] 84H for data belonging to the EXTRA segment […] 02H for all segment address records […] 85H for a CODE absolute segment address […] 86H for a DATA segment address […] 87H for a STACK segment address […] 88H for a EXTRA segment address […]
(1+viii+122+2 pages)
- "The Interactive Disassembler - Hexadecimal fileformats". Hex-Rays. 2006. Archived from the original on 2020-03-01. Retrieved 2020-03-01.
- "AR#476 PROMGen - Description of PROM/EEPROM file formats: MCS, EXO, HEX, and others". Xilinx. 2010-03-08. Intel MCS-86 Hexadecimal Object - File Format Code 88. Archived from the original on 2020-03-03. Retrieved 2020-03-03.
- "Appendix D. MCS-86 Absolute Object File Formats: Hexadecimal Object File Form". 8086 Family Utilities - User's Guide for 8080/8085-Based Development Systems (PDF). Revision E (A620/5821 6K DD ed.). Santa Clara, California, USA: Intel Corporation. May 1982 [1980, 1978]. pp. D-8–D-13. Order Number: 9800639-04. Archived (PDF) from the original on 2020-02-29. Retrieved 2020-02-29.
- Hexadecimal Object File Format Specification. Revision A. Intel. 1998 [1988-01-06]. Retrieved 2019-07-23.
- "General: Intel Hex File Format". ARM Germany GmbH. Archived from the original on 2020-02-27. Retrieved 2017-09-06.
- Feichtinger, Herwig (1987). "1.8.5. Lochstreifen-Datenformate: Das Intel-Hex-Format" [1.8.5. Paper tape data formats]. Arbeitsbuch Mikrocomputer [Microcomputer work book] (in German) (2 ed.). Munich, Germany: Franzis-Verlag GmbH. pp. 240–243 [243]. ISBN 3-7723-8022-0. (NB. The book also describes a BNPF, a Motorola S and a MOS 6502 hex format.)
- "Chapter 6. Microcomputer System Component Data Sheet - EPROMs and ROM: I. PROM and ROM Programming Instructions - B1. Intellec Hex Paper Tape Format / C1. Intellec Hex Computer Punched Card Format". MCS-80 User's Manual (With Introduction to MCS-85). Intel Corporation. October 1977 [1975]. pp. 6-75–6-78. 98-153D. Retrieved 2020-02-27. (NB. This manual also describes a "BPNF Paper Tape Format", a "Non-Intellec Hex Paper Tape Format" and a "PN Computer Punched Card Format".)
- Zschocke, Jörg (November 1987). "Nicht nur Entwicklungshilfe - Down-Loading für Einplatinencomputer am Beispiel des EPAC-09: Intel-Hex-Format". c't - magazin für computertechnik (in German). Vol. 1987 no. 11. Verlag Heinz Heise GmbH & Co. KG. pp. 198, 200, 202–203, [200]. ISSN 0724-8679.
[…] The header ends with a byte whose value indicates the type of the block: 0 = data block, 1 = end block. This distinction can be dispensed with, however, if an end block can also be identified unambiguously by a block length of zero. (This is how most assemblers under CP/M proceed, including XASM09; the type byte is then always zero.) […] (translated from German)
(NB. XASM09 is a Motorola 6809 assembler.)
- Prior, James E. (1989-02-24). "Re: Intel hex (*.HEX) format questions". Newsgroup: comp.os.cpm. Retrieved 2020-02-27.
- "PIC16C5X Programming Specification 5.0 - PIC16C5X Hex Data Formats: 5.1. 8-Bit Split Intellec Hex Format (INHX8S) / 5.2. 8-Bit Merged Intellec Hex Format (INHX8M) / 5.3. 16-Bit Hex Format / 5.4. 8-Bit Word Format / 5.5. 16-Bit Word Format". Microchip Databook (1994 ed.). Microchip Technology Inc. April 1994. pp. 3-10–3-11, 9-10, 9-15, 9-17, 9-21, 9-23, 9-27. DS00018G. Retrieved 2020-02-28.
[…] Assemblers for the PIC16C5X can produce PIC16C5X object files in various formats. A PIC16C5X programmer must be able to accept and send data in at least one of following formats. The 8-bit merged (INHX8M) format is preferred. […] format […] INHX8S […] produces two 8-bit Hex files. One file will contain the address / data pairs for the high order 8-bits and the other file will contain the low order 8-bits. File extensions for the object code will be '.obl' and '.obh' for low and high order files […] format […] INHX8M […] produces one 8-bit Hex file with a low byte / high byte combination. Since each address can only contain 8 bits in this format, all addresses will be doubled. File extensions for the object code will be '.obj' […] format […] INHX16 […] produces one 16-bit Hex file. File extension for the object code will be '.obj'. […]
- Beard, Brian (2016) [2010]. "Microchip INHX8M HEX-record Format". Lucid Technologies. Archived from the original on 2020-02-28. Retrieved 2020-02-28.
- Beard, Brian (2016) [2013]. "Microchip INHX32 HEX-record Format". Lucid Technologies. Archived from the original on 2020-02-28. Retrieved 2020-02-28.
Further reading
- "2.8. Microprocessor Formats 2.8.1. Input Requirements: Intel Intellec 8/MDS Format. Select Code 83". Operator Guide To Serial I/O Capabilities of Data I/O Programmers - Translation Format Package (PDF). Revision C. Data I/O Corporation. October 1980. p. 2-10. 055-1901. Archived (PDF) from the original on 2020-03-01. Retrieved 2020-03-01.
- Translation File Formats. Data I/O Corporation. 1987-09-03. Archived from the original on 2020-03-01. Retrieved 2020-03-01. (56 pages)
- "How Do I Interpret Motorola S & Intel HEX Formatted Data? Intel Hex-32, Code 99". Home > Hardware > … > In-circuit Test Systems > Automated Test Equipment [Discontinued] > Details. Keysight Technologies. Archived from the original on 2020-03-01. Retrieved 2020-03-01.
- Bergmans, San (2019-06-02) [2001]. "Intel HEX Format". SB-Projects. Archived from the original on 2020-03-01. Retrieved 2020-03-01.
- Beard, Brian (2016) [2007]. "Intel HEX-record Format". Lucid Technologies. Archived from the original on 2020-02-28. Retrieved 2020-02-28.
External links
- binex, a converter between Intel HEX and binary for Windows.
- SRecord, a converter between Intel HEX and binary for Linux, with C++ source code.
- kk_ihex, an open source C library for reading and writing Intel HEX.
- libgis, an open source C library that converts Intel HEX, Motorola S-Record, and Atmel Generic files.
- bincopy, a Python package for manipulating Intel HEX files.