Re: Macros

From: Bonj (a_at_b.com)
Date: 02/05/04


Date: Thu, 5 Feb 2004 23:05:45 -0000

inline
>
> > When using stack variables, isn't there some overhead involved in
> > creating the variables in the first place (pushing/popping on/off the
> > stack?)
> Yes, there is some overhead, but it is almost always negligible: what you
> must understand is that you do not push and pop local variables per se,
> you push and pop *space* for them. The locals are then created and used
> "in place" in the space that has been reserved. When using locals in a
> function, you get (on x86):
>
> - 3 assembler instructions on function prolog:
> push EBP
> mov EBP, ESP
> sub ESP, <size of local variables>
> At least in debug mode, the first two instructions (setting up an "empty"
> stack frame) are always generated by VC (7.1) even if there are no
> variables. I suspect this is for debugging and stack walking reasons.
>
> - and one instruction on function epilog:
> pop EBP
> Likewise, this is always generated in debug mode, even if there are no
> locals.
>
> Now, if you analyse this standard prolog and epilog, you will notice that
> they do not depend on the number of local variables or their individual
> sizes (just their total size). It means that if you have even one local
> variable in your clever C/macroish code (and you probably have at least a
> local "i" index in a "for" loop, haven't you?), you will get ZERO benefit.
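
To make the comparison concrete, here is a minimal sketch (the function
names and the g_ globals are made up for illustration, they are not from
anyone's code in this thread). Both versions compute the same sum. In the
local version, space for `total` and `i` is reserved once on function entry,
however many locals there are; with optimisation on, a compiler will often
keep them in registers, though that is worth checking in the generated
assembly rather than assuming.

```cpp
// Hypothetical global scratch variables, in the style being debated.
int g_i;
int g_total;

// Loop index and accumulator live in global memory.
int sum_with_globals(const int* data, int n) {
    g_total = 0;
    for (g_i = 0; g_i < n; ++g_i)
        g_total += data[g_i];
    return g_total;
}

// Loop index and accumulator are locals: their stack space (if any is
// needed at all) is reserved in one step when the function is entered.
int sum_with_locals(const int* data, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += data[i];
    return total;
}
```

Calling either one on the same array gives the same answer, so any difference
is purely in the generated code.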
>
> > Is this overhead far less than that of cache misses?
> "Cache misses" are a complete beast, and the expression itself is not very
> clear.
> Basically, you get several types of memory in your system, from "closest"
> to "farthest" from your CPU:
> - L1 processor cache
> - L2 processor cache
> - RAM (main memory)
> - hard disk (paging file).
>
> The first is the smallest and fastest memory, the last is the biggest and
> slowest. The CPU is most efficient when it uses the closest L1 cache
> (because it does not need to wait for the memory to give it the value -
> today's processors are much faster than memories). It can use its L1 cache
> if both the program code and data are in it. Because of the principle of
> "locality of space and time", if you use memory at address X for one
> instruction (either for code or data) in your program, you will likely:
> - reuse address X shortly after.
> - use addresses near X shortly after.
>
> Here is the basic idea of cache: when the OS/hardware sees that your
> program uses ("hits") address X, it fetches a whole "page" containing
> address X (a contiguous block of memory - the proper name is "line" when
> speaking about the processor's internal cache) into the L1 processor cache.
> This way, if, shortly after, your program hits address X or addresses near
> X, the data will be in the L1 cache and will be accessed very fast.
> Of course, the L1 cache is small, so the fetched-in line is not very big:
> that's why there is a second level of cache (L2) that is bigger but
> slower. When you hit an address X that is not in the L2 cache (nor in the
> L1 cache), a big page is fetched into L2, and a smaller sub-part of it is
> fetched into L1.
>
> You can reproduce this scheme for main memory and the pagefile: each time
> you want to access data in some slow, big memory, you prefetch a "page" of
> addresses near the requested data into a smaller, faster memory. With the
> locality principle, it is likely that your program will next hit some
> memory that has been prefetched into the fastest memory.
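
A small sketch of what locality means in practice, assuming a matrix stored
row by row in one std::vector (the helper names are hypothetical). Both
functions return the same sum; the first touches consecutive addresses, the
second jumps `cols` elements at each step, so on a matrix much larger than
the cache the second will usually miss far more often. No timings are
claimed here; measure on your own machine.

```cpp
#include <cstddef>
#include <vector>

// Walks the matrix in storage order: each access is next to the previous
// one, so most of them should land in a cache line already fetched.
long long sum_row_major(const std::vector<int>& m,
                        std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Walks the same matrix column by column: consecutive accesses are
// `cols` elements apart, which defeats spatial locality on big inputs.
long long sum_column_major(const std::vector<int>& m,
                           std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

The only difference between the two is the order of the loops, which is why
this is a common first thing to check when a loop over a big array is slow.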
>
> PS: This is an over-simplified view of the problem, probably false in some
> details, but you get the basic idea...

yes, I see. Thanks.

>
> PPS: If you want some quantitative answers, reading from main memory is
> likely to take several tens of CPU cycles. Fetching data from hard disk
> takes far longer still: a seek costs milliseconds, i.e. millions of CPU
> cycles. Compare this with the four (at most) instructions for local
> variable space reservation on the stack.

OK. Excellent.

>
> > Does having NO global variables at all in a project reduce cache
> > misses and make it 'easy for the compiler'?
> It reduces cache misses, and it also allows the compiler to do some
> optimisations that are otherwise impossible (see tom_usenet's answer on
> the other thread)
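
The other thread is not reproduced here, so this is an assumption about the
kind of optimisation meant: one common case is that a store through a
pointer might change a global, so the compiler may have to reload the global
on every iteration, whereas a local whose address is never taken cannot be
touched that way. A minimal sketch with made-up names:

```cpp
// Hypothetical global setting read inside a loop.
int g_count;

// `out[i]` could in principle point at g_count, so the compiler may need
// to re-read g_count from memory after every store.
void add_all_global(int* out, int n) {
    for (int i = 0; i < n; ++i)
        out[i] += g_count;
}

// The local copy cannot be modified by the stores to out[i], so it can
// stay in a register for the whole loop.
void add_all_local(int* out, int n) {
    const int count = g_count;
    for (int i = 0; i < n; ++i)
        out[i] += count;
}
```

Whether a given compiler actually generates different code for the two is
something to confirm by looking at the assembly listing.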

I will

>
> > Is this a good idea? Or
> > is it OK to have some, but not a lot? If so, what is the limit before
> > it starts to affect performance?
> The problem with a lot of globals is NOT performance, it is readability
> and maintainability.

see below

>
> >
> > And don't take offence but PLEASE stop going on at me to write
> > 'clean' code. If I wanted clean code, I'd use VB or MFC.
>
> Gee!!! I am sorry to have to tell you so, but I bet that the majority of
> contributors to this newsgroup consider VB (up to version 6 at least) and
> MFC as quite bad, "unclean" code!

yes, MFC is quite ugly now I think about it. It's good for some things
though such as ActiveX controls.

>
> > But I DON'T.
> > I've got the current stage of the development of the game completely
> > in my head, right down to every single important variable and what
> > its value should be at any particular time.
> Well, great, ok! Your intellectual capacities must be quite astonishing
> then if you are able to think about the (tens? hundreds? thousands?)
> variables in your program at once! Now, are you sure that you will have
> this perfect "view of mind" of your program in 6 months?

That isn't your problem to worry about, or comment on for that matter. It's
mine. And it's not what I asked. In fact I did clearly say "ASIDE from the
benefits of having cleaner code - just speaking in terms of performance, how
much better is the second approach likely to be?"
so... you decided to tell me about it anyway? Why?

>
> > FWIW, FYI, I don't even
> > use comments because in all probability and hope, I'm the only one
> > that's ever going to see the source code and I find comments ugly,
> > get in the way and they don't tell me anything I don't already know
> > when reading them.
> Wrong: they don't tell you anything you don't already know when you are
> *writing* them. And even when writing them, putting your thoughts on paper -
> well, on screen - can be quite useful sometimes.
>
> > I only use comments to 'comment out' some code if
> > I'm not sure if I want to delete it yet. I don't want to know how to
> > write clean, maintainable code - like I say, I already know how to do
> > that -
> Until now, you haven't proved that in your posts.
>
> > If anything I don't WANT people to
> > understand it as it's a game and I don't want to flatter myself but
> > if it ever gets successful I don't want people writing cracks.
> Writing "cracks" means modifying binaries. What the hell do comments have
> to do with it? Do you think that the binaries generated by the compiler
> are different (less "readable") if you don't put comments in your source
> code? Or do you want to distribute your source code with your program but
> make it unreadable enough so that no one can understand it?

Well, I don't know. I was thinking if it got leaked.

>
> > I
> > understand that most people who ask questions here are likely to be
> > doing team development, probably using MFC, and probably working for
> > a company. But I'm just a nerd in his bedroom who learns how to do
> > things
> If by "nerd" you mean a self-taught user who does programming for fun and
> intellectual interest,

yes I do.

> then I encourage you gladly to continue and to learn
> some good practices by reading whatever help experienced programmers can

What makes you think I'm not as experienced a programmer as you? I'm only
after help on the constructs, syntax and workings of the C++ language, not
how to design my algorithms.

> give you here. If you mean anything else by "nerd", I fear this newsgroup
> is not the right place for you.
>
> Arnaud
> MVP - VC
>
>


