The "Go" tools
     The GoAsm manual

Understand registers

by Jeremy Gordon

This file is intended for those interested in 32 bit assembler programming, in particular for Windows.

The "registers" are small areas of memory contained within the processor itself. The processor is designed to manipulate the data in the registers and transfer data into and out of them at great speed. As processors have become more powerful, the number and size of the registers they contain have grown, but all processors from the 386 upwards have eight "general" 32 bit registers for general programming use. Intel named these after their traditional uses, and the names are shortened for all purposes to EAX, EBX, ECX, EDX, EDI, ESI, EBP, and ESP.

There is also an EIP register (the instruction pointer), which holds the address of the next instruction to be executed. You cannot read from or write to the EIP register directly; it changes only as instructions are executed and through instructions such as JMP, CALL and RET, so it is not available for general programming use. However its value will be visible to you when you are single-step debugging.

EBP (base pointer) and ESP (stack pointer) have certain restrictions in use which I shall come to later.

Each of the registers can be accessed using 32 bits. That is to say, if you use a 32 bit assembler instruction, then all 32 bits of the register will be changed, read or written to (depending on the instruction). Each of the registers can also be accessed using only its lowest 16 bits (bits 0 to 15) if this is allowed by the instruction concerned. This is achieved by using the register names AX, BX, CX, DX, DI, SI, BP and SP.
In addition to this, the registers AX, BX, CX and DX have their 16 bits divided to make two separate 8 bit registers, available for access using AL and AH, BL and BH, CL and CH, and DL and DH. The "L" stands for low and the "H" stands for high.

The following descriptions illustrate this:

The first 16 bits of the EAX register (that is bits 0 to 15) can be accessed as the AX register. The first 8 bits of the EAX register (that is bits 0 to 7) can be accessed as the AL register. The second 8 bits of the EAX register (that is bits 8 to 15) can be accessed as the AH register. EBX, ECX, and EDX work in the same way.

The first 16 bits of the EDI register can be accessed as the DI register. ESI, EBP and ESP work in the same way, but note that there is no further subdivision of the 16 bit register in the case of these particular registers.

To make their programs run as fast as possible assembler programmers use the registers as much as they can. For example, if a routine needs to access a particular 32 bit number frequently, the routine will run much faster if the number is kept in a register rather than in memory. Keeping numbers on the stack (using the PUSH and POP instructions) is also very quick but not as quick as using the registers.
In 32 bit programming there is no advantage in using the 16 bit registers instead of the 32 bit ones. The main reason they have been kept is to ensure backward compatibility with 16-bit code, although there are some instructions which only use the 16 bit registers. If you do use the 16 bit registers in 32 bit code you will usually add one more byte to the instruction size (the 66h operand-size prefix), and your programs will probably run more slowly because the processor is designed to expect 32 bit transactions.
The 8 bit registers are useful in 32 bit assembler programming if they can be used to hold data which otherwise would have to be held in memory. For example you might keep two separate numbers in AL and AH if the numbers are small enough to fit in 8 bits each. Remember however that you could not then use EAX (or AX) for anything else at the same time. This is because AL and AH are part of EAX.
There are also eight 80 bit registers which are used for floating point arithmetic; their lower 64 bits are also used as the registers for the 64 bit MMX instructions.
Later processors also have eight 128 bit XMM registers, which are used by the SSE and SSE2 instructions.

Traditional use of registers

Some instructions use particular registers to perform certain tasks; some instructions are faster if certain registers are used; and in early processors not all the registers could do everything that they can now. These three facts, together with the traditional use of registers by assembler programmers over the years, have established expectations about how the registers should be used. If you stick to these it will help with the readability of your code.
  • Use EAX to pass data to a procedure and to return data from a procedure to the calling code. The Windows APIs themselves use EAX to return a value to the caller. AL, AX and EAX should also be used as far as possible to receive data from memory and to store data to memory since they may work slightly quicker than other registers. For example use MOV AL,[ESI] in preference to MOV DL,[ESI]. Also if you need to use ADD, ADC, AND, CMP, OR, SUB, TEST or XOR with an immediate value (ie. a number, as in ADD AL,23h) use AL, AX or EAX if you can, since the instruction can often be encoded in fewer bytes than if you use another register. MOV to or from a directly addressed memory location, and XCHG with another register, also have shorter forms when AL, AX or EAX is used.
  • Use the EDX register as a backup for EAX if that is already in use.
  • Use the ECX register as a counter. JECXZ is a special instruction that jumps if ECX is zero, and the LOOP instruction and the REP-prefixed string instructions (the SCAS and MOVS series, for example) all use ECX as a counter.
  • Use EBX to hold data generally or to address memory, for example MOV EAX,[EBX] or MOV [EBX],EDX.
  • Use ESI where you need to read from memory, eg. MOV EAX,[ESI] and EDI where you need to write to memory eg. MOV [EDI],EAX. This is consistent with the LODSD, STOSD and MOVSD instructions.
  • Use any of the registers as base or index registers in complex memory instructions eg. MOV EAX,[MemPtr+ESI*4+ECX]. The one exception is that ESP cannot be used as an index register.
  • Never use ESP for anything other than a pointer to the stack, unless you have a routine which has no stack activity at all. Then you might save the value of ESP in memory and restore it before returning to the caller of the routine.
  • Traditionally EBP is used to address local data on the stack in callback routines. EBP and its 16-bit component BP can be used as a general register in Windows programming, but you will have to be very careful if you are using stack frames (FRAME in GoAsm) or local data (LOCAL in GoAsm). This is because stack frame parameters and local data are addressed using EBP plus or minus an offset. After EBP is changed the parameters and local data cannot be accessed until EBP is restored to its previous value. Note that in 16-bit code BP addressed the stack segment unless used with a segment override, but in 32 bit Windows EBP can be used to address any part of the 4GB "flat" memory area.
  • The segment registers CS, DS and SS are still used by Windows even in 32 bit code, so you should not use these for your own purposes. And these days you can't use ES, FS or GS either. In Windows 98 and upwards this causes an exception.
  • If you need to use the ordinary registers to hold 64 bit information use EDX:EAX, where EDX holds the most significant bits. This accords with the 64 bit shift instructions SHLD and SHRD and also with CDQ.


Copyright © Jeremy Gordon 2002-2003