EXE Jump Tables

dedndave · May 29, 2009, 05:51:54 PM

do any of you experienced programmers have a simple method of eliminating the need for jump tables at the end of an EXE ?
i am guessing they are inserted by the assembler ?

jj2007 · May 29, 2009, 06:22:44 PM

Good question indeed. They don't look very efficient. Maybe it's a noob question: how does the assembler "know" the memory location of GetStdHandle, i.e. 7C812F3A? Is it guaranteed to be the same in all versions of Windows??

Code Select

0040106A              .  6A F5                       push -0B  ; StdHandle = STD_OUTPUT_HANDLE
0040106C              .  E8 A7000000                 call <jmp.&kernel32.GetStdHandle>
...
00401118               $ FF25 74114000              jmp near dword ptr [<&kernel32.GetStdHandle>]
...
GetStdHandle            8BFF                         mov edi, edi  ; HANDLE kernel32.GetStdHandle(StdHandle)
7C812F3B                55                           push ebp
7C812F3C                8BEC                         mov ebp, esp

PBrennick · May 29, 2009, 06:31:06 PM

GetStdHandle, for example, has its address stored in a lookup table (the export table) in kernel32.dll, so even though there are differing versions of that DLL, the address, which can vary, will always be found.

Paul

BogdanOntanu · May 29, 2009, 06:31:52 PM

Quote from: dedndave on May 29, 2009, 05:51:54 PM
do any of you experienced programmers have a simple method of eliminating the need for jump tables at the end of an EXE ?

Simple? Well ...NO. Possible? Yes.

See here for a possible solution: http://www.masm32.com/board/index.php?topic=6519.0;topicseen

The MS libs contain two kinds of API "glue" code: one that will generate a jump table and one that will call "directly".

The jump table is more efficient for many reasons.

AFAIK there are other older threads about this issue also.

Quote
i am guessing they are inserted by the assembler ?

No. Usually this jump table is generated / inserted by the linker when it links external API references from LIBs.

However my Sol_Asm assembler does generate the jump tables :D but this is an exception to confirm the above rule.

dedndave · May 29, 2009, 06:38:33 PM

well - i read many of the other related threads - i did not see anything specific about eliminating the tables
much ado about why they are there, however
i seem to recall something someplace about eliminating them
and, yes, i can see how a program may load faster with the tables
once it has loaded, however, i would have to think it would be faster without them
as always, the best solution is probably a hybrid, where functions that are speed-critical are referenced without tables
and functions that are used several times, but are not speed-critical use the tables (similar to procs vs macros argument)

EDIT
@JJ - i don't think it matters if they are the same or not
perhaps running under one OS, they are one value and under a different OS, a different value
from what i gather, they are externals that are unresolved until run-time

surely, if you reference a function 100 different places in the code, the table may be a good way to go
that way, the OS only has to set the value one time when the program is loaded

BogdanOntanu · May 29, 2009, 06:48:50 PM

Quote from: jj2007 on May 29, 2009, 06:22:44 PM
Good question indeed.

In fact a relatively irrelevant question unless you are writing a compiler.

Quote
They don't look very efficient.

It depends on the "angle" of view.

- Each "direct" call is still "indirect" in fact from the CPU's point of view.
- Each "direct" call is 1 byte longer than a call through the jump table (6 bytes instead of 5) and this adds up when you use a lot of API calls in your code.
- Each "direct" call needs a relocation and more resolving by the linker, and thus makes assembly / linking slower and DLLs bigger.
- Indirect calls allow you an extra central "hooking" location that can be useful with portable applications and other OSes (GOT/PLT-like / ready).
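A rough sketch of the size arithmetic behind these points, assuming 32-bit x86 encodings (E8 rel32 is 5 bytes, FF25 and FF15 with a 32-bit address are 6 bytes each) and ignoring alignment padding and relocation data. The function names are made up for illustration.

```python
# Instruction sizes on 32-bit x86 (opcode bytes + 4-byte operand)
CALL_REL32 = 5  # E8 rel32: call to a jump-table stub
JMP_IND = 6     # FF25 [addr32]: one stub per imported function
CALL_IND = 6    # FF15 [addr32]: "direct" call through the IAT


def jump_table_bytes(call_sites: int, functions: int) -> int:
    """Code bytes when every call goes through a jump-table stub."""
    return call_sites * CALL_REL32 + functions * JMP_IND


def direct_call_bytes(call_sites: int) -> int:
    """Code bytes when every call site uses call [IAT slot]."""
    return call_sites * CALL_IND


if __name__ == "__main__":
    # (total call sites, distinct imported functions)
    for call_sites, functions in [(1, 1), (6, 1), (100, 1), (200, 20)]:
        print(call_sites, functions,
              jump_table_bytes(call_sites, functions),
              direct_call_bytes(call_sites))
```

Under these assumptions the table only wins on code size once a function is called more than six times on average. A function called once costs 11 bytes through the table and 6 bytes directly.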

Quote
Maybe it's a noob question: how does the assembler "know" the memory location of GetStdHandle, i.e. 7C812F3A? Is it guaranteed to be the same in all versions of Windows??

It does not. IF you generate an OBJ (most common) then the assembler leaves this task to the linker. The linker uses the information in the LIBs to glue together a jump table or "direct" indirect calls. Both methods reference the IAT structure in the PE specification, and further fixing is deferred to the OS loader.

The OS loader KNOWS where the API address is in the current OS version/ layout. The OS loader loads the PE executable and patches the IAT with the correct address and since the jump table or the "direct" calls are both referencing the IAT this "magically" and finally solves the problem.
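A toy model of that patching step, in Python. This is only a sketch: a real loader walks the PE import directory and the DLL's export table. The slot address is the operand of the jmp in the listing quoted earlier in the thread; the two API addresses and all function names are invented for illustration.

```python
# Toy model only: a real loader walks the PE import directory and the DLL's
# export table. The API addresses and helper names below are hypothetical.

IAT_SLOT = 0x00401174  # IAT entry referenced by the jump stub in the listing


def load(cold_iat: dict, exports: dict) -> dict:
    """Return a patched IAT: each imported name is replaced by its address."""
    return {slot: exports[name] for slot, name in cold_iat.items()}


def call_through_iat(iat: dict, slot: int) -> int:
    """jmp [slot] and call [slot] both land on whatever the slot holds."""
    return iat[slot]


cold_iat = {IAT_SLOT: "GetStdHandle"}  # what the file on disk refers to
os_a = {"GetStdHandle": 0x7C810000}    # invented address, OS build A
os_b = {"GetStdHandle": 0x76D20000}    # invented address, OS build B

for exports in (os_a, os_b):
    iat = load(cold_iat, exports)
    print(f"{call_through_iat(iat, IAT_SLOT):08X}")
```

The same "cold" image works on both builds because the code only ever refers to the slot, never to the API address itself.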

DLLs need further relocation fixing, but everything else is the same.

Quote
Code Select

0040106A . 6A F5 push -0B ; StdHandle = STD_OUTPUT_HANDLE
0040106C . E8 A7000000 call <jmp.&kernel32.GetStdHandle>
...

This "jmp.&kernel32.GetStdHandle" is in fact "jmp [iat.dll_01.function_01]" UNTIL the OS loader fixes the correct address, and Olly is kind enough to show you friendly names. However, Olly does this after the OS loader has performed its job.

Check with a hex editor / disassembler to see that in the "cold" executable the values are different.
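The raw bytes from the listing can also be decoded by hand. A small Python sketch, with made-up helper names, that uses only the addresses and bytes shown in the listing:

```python
import struct


def call_target(address: int, code: bytes) -> int:
    """Target of an E8 rel32 call located at `address`."""
    assert code[0] == 0xE8 and len(code) == 5
    (rel32,) = struct.unpack("<i", code[1:5])
    # rel32 is relative to the address of the NEXT instruction
    return (address + 5 + rel32) & 0xFFFFFFFF


def jmp_slot(code: bytes) -> int:
    """Address of the memory slot read by an FF25 [addr32] jump."""
    assert code[:2] == b"\xFF\x25" and len(code) == 6
    (slot,) = struct.unpack("<I", code[2:6])
    return slot


stub = call_target(0x0040106C, bytes.fromhex("E8A7000000"))
slot = jmp_slot(bytes.fromhex("FF2574114000"))
print(f"{stub:08X}")  # 00401118, the jmp stub in the listing
print(f"{slot:08X}")  # 00401174, the slot the stub jumps through
```

So the call is an ordinary relative call to the stub at 00401118, and the stub jumps through the dword stored at 00401174. Only that dword changes between the file on disk and the loaded image.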

dedndave · May 29, 2009, 06:59:23 PM

i knew i saw it someplace - it was in Hutch's scrap-book of source code

the method was devised by EliCZ - and it does not look all that simple - lol
i may have a go, just as a learning experience

the first download on the page....
http://movsd.com/source.htm

BogdanOntanu · May 29, 2009, 07:05:22 PM

Quote from: dedndave on May 29, 2009, 06:38:33 PM
well - i read many of the other related threads - i did not see anything specific about eliminating the tables

The info is there somewhere... I do not recall the more detailed threads exactly but this kind of subject pops in and out periodically in the advanced sections.

Basically IF you import your API functions with specific names like this: __imp__ExitProcess@4 THEN the linker selects glue code that does not generate a jump table (if the LIB does provide such glue code). The exact details might be slightly different but this is the idea.

Quote
and, yes, i can see how a program may load faster with the tables

There are fewer places to fix at linking (only the jump table) and fewer relocations at DLL loading time, and with a lot of API calls (the common case) the jump table version is slightly smaller than direct calls.

Quote
once it has loaded, however, i would have to think it would be faster without them

Oh well... debatable but anyway when you call an API "speed" is no longer of the essence.

Quote
as always, the best solution is probably a hybrid, where functions that are speed-critical are referenced without tables
and functions that are used several times, but are not speed-critical use the tables (similar to procs vs macros argument)

The jump table is generated ONLY for the API and not for your own functions. None of the API calls should be considered speed critical.

Quote
surely, if you reference a function 100 different places in the code, the table may be a good way to go
that way, the OS only has to set the value one time when the program is loaded

The OS loader only fixes one value anyway... and this value is the function address in the PE's IAT (not the jump table).

For DLL's indeed there are more relocations to fix (only if DLL is relocated). One relocation fix is needed for each direct call in code.
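A minimal sketch of that relocation count, assuming a 32-bit image that carries base relocations: E8 rel32 calls are position independent, while every FF15 / FF25 instruction embeds an absolute IAT address that needs one fixup if the image is moved. The function name is hypothetical.

```python
def base_relocations(call_sites: int, functions: int, direct: bool) -> int:
    """Fixups needed for API calls if the image is loaded at another base."""
    if direct:
        # one absolute IAT address inside every call [slot]
        return call_sites
    # relative calls need none; only each jmp [slot] stub embeds an address
    return functions


if __name__ == "__main__":
    print(base_relocations(200, 20, direct=True))   # fixups with direct calls
    print(base_relocations(200, 20, direct=False))  # fixups with a jump table
```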

jj2007 · May 29, 2009, 07:08:07 PM

Very nicely explained, thanxalot, Bogdan :U

Vortex · May 29, 2009, 07:18:15 PM

The trick to generate direct calls is based on the declaration of external symbols :

Code Select

EXTERNDEF _imp__ExitProcess@4:PTR pr1
ExitProcess EQU <_imp__ExitProcess@4>

EXTERNDEF _imp__GetCommandLineA@0:PTR pr0
GetCommandLine EQU <_imp__GetCommandLineA@0>

EXTERNDEF _imp__GetModuleHandleA@4:PTR pr1
GetModuleHandle EQU <_imp__GetModuleHandleA@4>

EXTERNDEF _imp__CreateWindowExA@48:PTR pr12
CreateWindowEx EQU <_imp__CreateWindowExA@48>

pr0, pr1, pr2, pr3 etc. are defined in windows.inc

The creation of the EXTERNDEFs above is automated with Scan.exe

[attachment deleted by admin]

dedndave · May 29, 2009, 07:23:40 PM

ahhhh - cool
let me play with that one, too, Vortex - thank you

jj2007 · May 29, 2009, 09:05:48 PM

Cool indeed, Vortex - thanks. So that is how the crt_ imp stuff was created.

If I understand correctly, placing the call in the jmp table makes sense for calls that are used more than a few times. So is there a good reason to place GetCommandLine and ExitProcess there?

dedndave · May 29, 2009, 09:22:13 PM

i think i will use a selective approach
there are very few instances where i want to reduce overhead as much as possible
these are primarily timing and thread synchronization related functions
i understand that system calls are inherently long-winded, but there is no reason i can't try to get the most out of it
other than that, the tables are probably a much better deal
i am even open to using both methods for a function if i think it is the best approach
for example, if i have a function in one critical spot - i want no table branch
i use the same function several other places - use the table for those

now, if i can just get access to the function-name strings for an error routine, i will be a happy camper - lol
i think i can figure that one out for myself

yes - thank you Bogdan and Vortex both
that is exactly what i was looking for Vortex - very simple
i like the growing window thingy too - lol

mitchi · May 29, 2009, 09:30:02 PM

Very interesting read Bogdan. I've learned a few things here! :bg
What exactly are the relocations you talk about?

Vortex :

Nice tool, nice explanation. So it's really all about the symbols declared in the obj :)

dedndave · May 30, 2009, 02:48:13 AM

Jochen -

Quote
If I understand correctly, placing the call in the jmp table makes sense for calls that are used more than a few times. So is there a good reason to place GetCommandLine and ExitProcess there?

the answer is here, i suspect...

Quote
The creation of the EXTERNDEFs above is automated with Scan.exe

it is fairly simple to create them manually - or let the Scan.exe make the IMP file and remove the unwanted ones

btw - it seems imperative to use PoLink
