Re: Macros

From: John Carson (donaldquixote_at_datafast.net.au)
Date: 02/03/04


Date: Tue, 3 Feb 2004 12:10:26 +1100


"Bonj" <a@b.com> wrote in message
news:u%23PYYMd6DHA.2404@TK2MSFTNGP11.phx.gbl
> Consider the two following possible implementations of a program:
>
> one, written in C++. With classes (let's say, 2, of which there is
> probably only about 5 instances instantiated at any one time). Lots
> of local variables and stack usage. Class isn't padded with dummy
> variables, although byte alignment is set to 16 bytes (the maximum).
> The class has mainly 4 byte variables although one is a 16 byte one
> (a RECT).
>
> the second, written in plain old C. It is the same program as the
> first, and in a lot of places is pretty much similar code. But
> instead of having a class (as C can't), it has a struct, all
> variables being 4 bytes (there is no RECT in the struct). The struct
> is padded manually to 128 bytes. Instead of class functions, there
> are macros which operate on the struct. Instead of having local
> variables, it has globals - so there is no usage of the stack so to
> speak for each function - it just does a job. Every function that
> returns a void (about 80% of them) is then converted into a macro,
> so that the only functions that the program actually needs to
> generate function overhead for are the system functions, such as
> WinMain and the WndProc, and a couple of other main controlling
> functions - the rest is totally inline.
>
> Understandably, the first will obviously be a lot neater code. But
> I've decided to consider investigating the possibility of the second
> - to try to gain as near to machine code performance as possible
> (even though I can't begin to understand machine code). So far it
> seems alright, and not too ugly (to me, anyway).
> ASIDE from the benefits of having cleaner code - just speaking in
> terms of performance, how much better is the second approach likely
> to be? In the region of 0.000001% ? or maybe 2% - 10% ? Or maybe
> about 0.1% - 1%?
>
> Those are the options.
>
> Comments?

I wouldn't assume that the second will be faster; it could be slower.
Inlining does not always improve performance: it increases code size, which
raises the risk of instruction-cache misses and the like. In C++, you can use
inline functions as a (superior) alternative to macros: they are type-checked
and evaluate each argument exactly once, and the compiler will then inline at
its discretion. Global variables are also more susceptible to cache misses
than stack variables, since the top of the stack is usually in cache already.

-- 
John Carson
1. To reply to email address, remove donald
2. Don't reply to email address (post here instead)

