Re: Is the calculation time same? for 1234 x 1234 and 12345 x 12345.

From: Joseph M. Newcomer (newcomer_at_flounder.com)
Date: 07/28/04


Date: Wed, 28 Jul 2004 14:44:40 -0400

You can assume that multiply takes place in constant time for any pair of 32-bit numbers.
One of the advantages of massive amounts of chip space and tens of millions of transistors
is that you can build huge combinatoric networks that can do a 32x32 multiply in a single
clock cycle. Which is what has been done. Worrying about this sort of issue is a waste of
time. The amount of CPU cycles expended just in posting the question and reading this
reply is probably several orders of magnitude more than the number of CPU cycles your
program will expend in multiply instructions in running the program, unless this is in the
inner loop of a gigabit data computation loop (then it may be break-even). Actually, a
Pentium 4 could do two floating-point multiplies and an integer multiply in a single clock
cycle, so the cost of doing a multiply could quite possibly be zero apparent clock cycles.

OTOH, if you are doing some sort of image or DSP algorithm, data layout (cache hit issues)
could cost you an order of magnitude performance hit if done wrong. Optimization in modern
machines is rarely at the instruction level, never at the sub-instruction level (which is
where you are asking), and always at the data architecture and algorithm level.

To quote an earlier post I made:

If the data you want is in the L1 cache it costs you 0 CPU cycles to access it
If it is in the L2 cache, it costs 1 [Xeon] or 2 [Celeron] CPU clock cycles
If it is not in the L2 cache but in memory, it costs 20 CPU clock cycles
If the page is paged out, it costs you 20,000,000 CPU clock cycles.

Actually, this data is somewhat old, and the L2 miss might be higher, and the page fault
cost similarly higher. Key here is that an L2 cache miss costs you an order of magnitude
and a page fault costs you six orders of magnitude over an L2 cache miss. If you are
worried about making your program run fast, this is where to put your effort. Not worrying
about the time it takes to do a single instruction.
                                        joe

On Wed, 28 Jul 2004 08:54:02 -0700, "Frank E Rogers" <syang@pelco.com> wrote:

>Both numbers are stored as 32bits integer.
>My question is, if the data type is the same, would the actual number
>influcence the computing time?
>If I have two intigers, one only use 12bits, another one uses 16bits, would
>the latter operates faster?
>Thanks.
>

Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm



Relevant Pages

  • Re: Need _gcvt() for Unicode
    ... wasting a lot of their own time solving non-problems. ... The optimizations you are trying to ... if you get a hit in the L1 cache it costs you zero clock cycles. ...
    (microsoft.public.vc.mfc)
  • Newbie question on formatting a field in table design
    ... distribute utility costs to 10 departments ... Access help menu multiplies the number entered by 100 and adds the "%" ... Since we are using 10 departments for each branch office to allocate ... I have to assure that each branch office has 100% of its costs ...
    (comp.databases.ms-access)