Re: Is there a maximum contiguous memory allocation?





"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote in
message news:io33j5tengqp4pfn23i05nsd8pljh9a85t@xxxxxxxxxx
See below...
On Tue, 22 Dec 2009 07:39:14 -0600, "Peter Olcott"
<NoSpam@xxxxxxxxxxxxx> wrote:


"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote in
message news:qjb0j55qf7r19bdd2dcm1dciotd7g79jq3@xxxxxxxxxx
Followup: with my trivial MFC app, the largest allocation I could get with VirtualAlloc or
malloc was 1194*1024*1024 bytes. 1195 failed.

Yes, but are you testing this on Win64 with 8 GB+ of RAM?
I need to know that Win64 with 32 GB of RAM could provide me
with a 30 GB std::vector that maps to that much physical
memory.
****
No, I'm testing it in the context of Win32. If you have
Win64, then try the experiment
yourself!

I am trying to determine the benefits of this $800 upgrade
before I do it.


Note that if I had my Win64 system up, with 4GB of RAM, I
could STILL allocate 30GB of
vector. Most of it would be paged out most of the time,
but I could ALLOCATE it! I could
allocate it if I had 2GB of physical memory! I see no
reason I would need 8GB+ of RAM to
do a simple allocation. In fact, if the allocation failed
I would seriously start
worrying about Windows. Note, of course, that I also
need, in order to allocate 30GB of
VM, probably on the order of 60GB of paging file, which
would consume pretty much my
entire 68GB C: drive (which won't happen because I have
Office and VS installed on it, so

You can upgrade to a 500 GB WD SATA for $50

I don't have enough space left for that massive
pagefile.sys), so I'd have to install a
second hard drive just to hold the paging file. So I have
to meet the minimum system
requirements. But with a 60GB paging file I can allocate
30GB of vector independent of
the amount of physical memory I have installed.

PLEASE STOP USING THE PHRASE "PHYSICAL MEMORY". IT IS
NEVER GOING TO HAPPEN! Your
continued use of this phrase in consistently erroneous
ways is not helping make your case.
It keeps screaming "I have no idea what I'm talking about,
but answer my question anyway".

So then what does virtual memory map to? The VM pages are
swapped in and out of something, right?


If you allocate a 30GB std::vector, you get 30GB of virtual memory. The chances that this
will all be resident in a 32GB physical memory configuration are vanishingly small; let's
just call it "no chance whatsoever" to simplify the discussion. If you want 30GB to be
largely resident, you must
- make sure the user has the privileges to modify the working set quota
- set the working set quota to be large enough to encompass your
  code, your data, AND your 30GB vector

Note that this merely IMPROVES the likelihood that your
vector will not be paged out when
you go to access part of it. In NO WAY does it make any
guarantees! And "working set" is
a "request", not a "non-negotiable demand". The system is
completely free to ignore
anyone's working set request if it needs pages for any
purpose it deems suitable.

If you want a 100% guarantee that the vector will be resident, and not get paged out, you
need to
- make sure the user has the privileges to use VirtualLock, and to be able
  to lock at least 30GB down
- use VirtualLock to lock the memory [probably will fail on a system with
  32GB, so plan on 64GB as your minimum config]
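
As an illustration of the two steps just listed, here is a minimal sketch, not taken from the thread. It assumes that raising the minimum working set by the buffer size is enough for VirtualLock to succeed, which depends on quotas and privileges this sketch does not check. The names LockedBuffer, allocate_locked and release_locked are hypothetical, and the non-Windows branch (malloc plus mlock) is only there so the sketch compiles elsewhere.

```cpp
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#else
#include <cstdlib>
#include <sys/mman.h>
#endif

// Hypothetical helper type: a block that is locked into RAM, or {nullptr, 0}.
struct LockedBuffer {
    void*       ptr;
    std::size_t bytes;
};

// Try to allocate `bytes` of virtual memory and lock it so it cannot be paged out.
// Returns {nullptr, 0} if any step fails; failure is the expected outcome for
// requests that approach the size of installed RAM.
LockedBuffer allocate_locked(std::size_t bytes) {
#ifdef _WIN32
    HANDLE self = GetCurrentProcess();
    SIZE_T min_ws = 0, max_ws = 0;
    if (!GetProcessWorkingSetSize(self, &min_ws, &max_ws)) return {nullptr, 0};
    // Assumption: growing both limits by `bytes` leaves enough room to lock the block.
    if (!SetProcessWorkingSetSize(self, min_ws + bytes, max_ws + bytes)) return {nullptr, 0};
    void* p = VirtualAlloc(nullptr, bytes, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (p == nullptr) return {nullptr, 0};
    if (!VirtualLock(p, bytes)) {
        VirtualFree(p, 0, MEM_RELEASE);
        return {nullptr, 0};
    }
    return {p, bytes};
#else
    // Stand-in for platforms without VirtualLock.
    void* p = std::malloc(bytes);
    if (p == nullptr) return {nullptr, 0};
    if (mlock(p, bytes) != 0) {
        std::free(p);
        return {nullptr, 0};
    }
    return {p, bytes};
#endif
}

// Unlock and free a block obtained from allocate_locked.
void release_locked(LockedBuffer& b) {
    if (b.ptr == nullptr) return;
#ifdef _WIN32
    VirtualUnlock(b.ptr, b.bytes);
    VirtualFree(b.ptr, 0, MEM_RELEASE);
#else
    munlock(b.ptr, b.bytes);
    std::free(b.ptr);
#endif
    b = {nullptr, 0};
}
```

A caller would check `ptr` for null and fall back to ordinary pageable memory when locking is refused, since the system is free to say no.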

Yes, but, I see no good reason why this last step should be
necessary in my case. I also see no reason why a 32 GB
system could not easily keep an entire 30 GB memory
allocation resident in RAM. Would Windows arbitrarily
allocate more than 2GB of memory to itself to do useless
things? Why can't Windows keep itself in the extra 2 GB? If
it does this then a 30 GB resident allocation should be a
given.


NOW you have locked down 30GB of VIRTUAL memory. It won't
page out. But on a machine of
< 64GB, it is unlikely this can happen. It may not even
be possible because there might
be no way to allow a user to set a working set large
enough, or do a VirtualLock large
enough; I don't know what the administrative limits are.
But these are the minimum steps
you need to take.

Note that both of these require administrative controls be
exercised. Note that virtually
none of your customers would have a clue as to how to set
any of these parameters, so you
will have to give them a script to follow (and I have
never worked with these parameters,
and I have no idea how to access them). The script itself
will require someone with
administrative privileges to set these parameters.

Note that a system that can survive with only 2GB to run
the kernel and the application
you are working with is extremely unlikely. So if you want
30GB, you had better have a LOT
of spare memory around! So don't even think of trying to
do this on a system < 64GB.

Win XP 32 has run fine for years with far less than 1.0 GB.
Does Windows 7 insist on wasting much more than this to do
useless things? Can we turn those useless things off?


You will never, ever, under any circumstances imaginable
be able to allocate physical
memory. You can only allocate virtual memory.
joe

It seems to me that this is merely a mathematical mapping,
with one corresponding to the other, the two being equivalent
except for speed. When two things are equivalent, they
can be readily substituted, one for the other. As it
actually turns out, I would allocate physical memory; that
is what malloc() does. The OS re-interprets this to mean the
allocation of virtual memory on some, but not all, systems.

****


For a real app, one with lots of DLLs, threading, etc., the value will typically be
smaller. Perhaps much smaller.

Your Mileage May Vary.
joe


On Mon, 21 Dec 2009 12:35:49 -0500, Joseph M. Newcomer
<newcomer@xxxxxxxxxxxx> wrote:

See below...
On Mon, 21 Dec 2009 07:17:18 -0600, "Peter Olcott"
<NoSpam@xxxxxxxxxxxxx> wrote:


"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote in
message
news:0dssi5thoj8m18oc3nintm5bdqdfhc4op7@xxxxxxxxxx
See below...
On Sat, 19 Dec 2009 09:15:38 -0600, "Peter Olcott"
<NoSpam@xxxxxxxxxxxxx> wrote:


"Bo Persson" <bop@xxxxxx> wrote in message
news:7p47djFhkhU1@xxxxxxxxxxxxxxxxxxxxx
Peter Olcott wrote:
My application needs to create a std::vector > 5GB, is that
possible in x64?

Yes, if you have enough virtual memory.

The limitation is not in the amount of RAM, but in the
address space.
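
The address-space point can be checked at the library level. The following is a small sketch, not from the thread: the function name vector_size_is_addressable is hypothetical, and a true result says only that the size is representable for std::vector on this build, not that the allocation will succeed at run time.

```cpp
#include <cstdint>
#include <vector>

// True if a std::vector<unsigned char> holding `bytes` elements is representable
// on this build. A 32-bit build rejects 5GB here; a 64-bit build accepts it.
// Whether the allocation then succeeds depends on available virtual memory.
inline bool vector_size_is_addressable(std::uint64_t bytes) {
    return bytes <= std::vector<unsigned char>().max_size();
}
```

For the 5GB case in the question, the call would be `vector_size_is_addressable(5ull << 30)`.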


Bo Persson



I have read that Microsoft has placed an artificial 2GB
limit on the size of an array. I also read that this same
limit applies to 64-bit .NET applications: a maximum object
size of 2GB.

My application requires a single contiguous block of
physical memory, is this possible?

****
I'd missed this, and I'm only coming back to it based on another reply.

Unless you are writing a device driver for a piece of hardware designed by an amateur
designer, the chances that you will require contiguous physical memory are zero.

My DFA recognizer needs contiguous physical memory,
****
Nonsense. Complete and utter nonsense, beyond any shadow of a doubt. Why do you keep
talking about "contiguous physical memory"?
(a) it doesn't matter if the physical memory is or is not contiguous
(b) from an application, you cannot control physical memory
(c) even if you could control physical memory, you can't allocate large blocks of it
(d) what part of "virtual memory" are you failing to understand?
****
or disk
swap time would make this process infeasibly slow.
****
In the trade, we call this "life is hard". Meaning, there's nothing you can do about it.
You are making an impossible request, which
(a) has no meaning
(b) makes no sense because it is impossible to achieve
(c) requires something that makes no sense in a virtual memory world
(d) is impossible to achieve even for a kernel programmer working with physical memory
(MmAllocateContiguousMemory)
(e) even if it was possible, it would not change anything, since you can't address more
than 2GB
(f) that 2GB has to include space for all your application, other structures you use,
DLLs, all the storage they use, and the OS interface, so you are reduced to something less
than 2GB
(g) since those various pieces I just alluded to can fragment memory, in practice you
cannot get arbitrarily large contiguous blocks any time you feel like it; there is a
practical limit to the maximum block size, which varies from moment to moment in your
program; the longer your program runs, the smaller this size becomes.
****
There would be a possible disk read for every pixel on
the screen.
Current whole screen response time <= 100 ms.
****
This is called "need to redesign the algorithm". Typically, in VM systems, you have to
consider things that repack FSM models to maximize locality of reference. This is a
problem that has been known and understood since at least 1961, and was well-understood
when I started using virtual storage in 1968 (that's 41 years ago). In 1969, we were
spending hours analyzing our algorithms and repacking data to minimize paging; in fact, we
were even using features of our linker to pack code adjacent. In 1971, I wrote a
diagnostic program that measured code page transitions during execution of an application
so we could understand how to pack our code to minimize page faults by studying its actual
behavior. The first LISP machines (in the 1980s) did not store lists as lists but as
contiguous arrays to minimize page faults (the extra cost and complexity of handling
complex array/list structures including automatic repacking of lists into arrays more than
paid off in terms of performance gains achieved by avoiding page faults). Once we got
machines with caches, we started redesigning algorithms to maximize cache hits even for
pages that were resident. Cache hit performance can improve your program performance by a
factor of 10; paging optimization can improve your program performance by a factor of
100,000 to 1,000,000. Or more.

Note that you can use raw VirtualAlloc to improve your chances of getting storage (malloc
already guarantees fragmentation most of the time). But you are still going to hit limits
far smaller than 2GB. I just tried an experiment; I ran a program that tried to allocate
storage. If it succeeded, I would exit the program and try again.

Using either VirtualAlloc or malloc, the largest size I could allocate was 1100MB; 1200MB
failed. I did not try values between these two. And that was in a trivial MFC program
that did no other allocation, had no user DLLs loaded (just what MFC loads), and allocated
essentially immediately upon startup. Your Mileage May Vary, but it shows that hopes of
getting larger allocations are very unlikely. Note that it took about 6 seconds to do the
allocation.
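
A rough way to repeat this kind of probe is sketched below; it is not the program used in the post. It assumes plain malloc, runs every probe inside one process (the experiment above restarted the program between attempts, which avoids fragmentation from earlier probes), and uses a binary search that treats "can allocate N" as monotonic in N, which is only approximately true. The names can_allocate and largest_allocation_mib are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Try one allocation of `bytes` and release it again.
// The volatile write keeps an optimizing compiler from removing the malloc/free pair.
bool can_allocate(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p == nullptr) return false;
    static_cast<volatile char*>(p)[0] = 0;
    std::free(p);
    return true;
}

// Largest size in MiB, up to max_mib, for which a single allocation succeeded.
// Returns 0 if even 1 MiB could not be allocated.
std::size_t largest_allocation_mib(std::size_t max_mib) {
    const std::size_t kMiB = 1024u * 1024u;
    assert(max_mib <= SIZE_MAX / kMiB);  // avoid overflow in mid * kMiB
    std::size_t lo = 0;                  // largest size known to succeed (0 = none yet)
    std::size_t hi = max_mib;            // largest size still worth trying
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo + 1) / 2;
        if (can_allocate(mid * kMiB)) {
            lo = mid;
        } else {
            hi = mid - 1;
        }
    }
    return lo;
}
```

In a 32-bit process, `largest_allocation_mib(2048)` would be the natural call. On a 64-bit system the result mostly reflects commit limits rather than address-space fragmentation.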

An assumption of uniform time to access large data arrays is not a valid assumption and
has NEVER been a valid assumption in virtual memory systems. If you created an algorithm
whose success depends on a performance that is in practice impossible to achieve, then you
need to rethink your design. It can be as simple as repacking your FSM so adjacent states
are packed adjacent. Or it simply may be that it is impossible to achieve the performance
you thought was possible.
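
One possible reading of "repacking your FSM so adjacent states are packed adjacent" is sketched here. It is an illustration only, not the design discussed in the thread: it assumes the DFA is stored as a flat row-major transition table and renumbers states in breadth-first order from the start state, so states reached from one another tend to land in nearby rows. The names Dfa and repack_bfs are hypothetical, and breadth-first order is just one heuristic; a real repacking would be driven by measured access patterns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Flat transition table: next[state * alphabet + symbol] is the target state.
struct Dfa {
    std::size_t alphabet;
    std::vector<std::uint32_t> next;
    std::size_t states() const { return next.size() / alphabet; }
};

// Renumber states in breadth-first order starting at `start` (which becomes state 0).
// Unreachable states keep their relative order and are placed at the end.
Dfa repack_bfs(const Dfa& in, std::uint32_t start) {
    assert(in.alphabet > 0);
    const std::size_t n = in.states();
    assert(start < n);
    const std::uint32_t kUnset = UINT32_MAX;
    std::vector<std::uint32_t> new_id(n, kUnset);  // old state -> new state
    std::vector<std::uint32_t> order;              // new state -> old state
    order.reserve(n);
    std::queue<std::uint32_t> pending;

    new_id[start] = 0;
    order.push_back(start);
    pending.push(start);
    while (!pending.empty()) {
        const std::uint32_t s = pending.front();
        pending.pop();
        for (std::size_t c = 0; c < in.alphabet; ++c) {
            const std::uint32_t t = in.next[s * in.alphabet + c];
            assert(t < n);
            if (new_id[t] == kUnset) {
                new_id[t] = static_cast<std::uint32_t>(order.size());
                order.push_back(t);
                pending.push(t);
            }
        }
    }
    for (std::uint32_t s = 0; s < n; ++s) {
        if (new_id[s] == kUnset) {
            new_id[s] = static_cast<std::uint32_t>(order.size());
            order.push_back(s);
        }
    }

    // Rewrite the table: rows move to their new position, targets are renamed.
    Dfa out{in.alphabet, std::vector<std::uint32_t>(in.next.size())};
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t c = 0; c < in.alphabet; ++c) {
            out.next[i * in.alphabet + c] = new_id[in.next[order[i] * in.alphabet + c]];
        }
    }
    return out;
}
```

The recognizer's behavior is unchanged by the renumbering; only the memory layout of the table differs, which is what matters for page faults and cache misses.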

Note that the issues of working set and VM do not go away in Win64. Paging does not go
away. Physical memory still has no meaning.
****

If there were some disk-equivalent technology that was
comparable in speed to RAM, then this limitation would not
have the same degree of impact. Conventional disk seek time
would kill my performance. The alternatives that I have
examined are solid state drives and various types (and
redesigns) of RAID arrays.
****
Yes, those help. Sometimes the only solution is faster technology. Raw hardware can
solve problems. So can algorithm redesign. Those of us who grew up on machines with slow
swapping files and small address spaces learned these lessons. The current generation
thinks that memory is uniform, and it always comes as a surprise when they discover it is
not.
****

It would be really great if this problem did not exist
because I would then be able to process Chinese glyphs
efficiently. The current process is estimated to require
about 2.0 TB RAM. I am working on redesigning the process
to eliminate this restriction.
****
Sounds like Win64 to me (8TB limit). But note that you will still be limited by how many
pages are available in physical memory, and that's not going to change a lot in the
foreseeable future because of memory costs. Memory costs are not only the cost of the
chips, but the cost of the space made available on the motherboards to plug the memory
into (sockets cost money; printed circuit board space costs money, and there are physical
limits to how many sockets you can place on a motherboard). For example, a 2GB chip costs
about $80. So a 2TB RAM system requires 1000 chips and would cost $80,000. But note that
this means you would need 1000 sockets on your motherboard! Not going to happen. So you
are going to be paging. Take that as a given. It is not negotiable, it is not avoidable,
it is going to be part of what you live with and you cannot change that fact. So your
algorithms have to change to account for that.
joe
****
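
The chip arithmetic above uses round numbers. A small sketch with binary units, assuming the $80 per 2GB module figure quoted in the post, gives 1024 modules and $81,920, so the conclusion is unchanged. The function names are hypothetical.

```cpp
#include <cstdint>

constexpr std::uint64_t kGiB = 1ull << 30;
constexpr std::uint64_t kTiB = 1ull << 40;

// Number of modules needed to reach total_bytes, rounded up.
constexpr std::uint64_t modules_needed(std::uint64_t total_bytes, std::uint64_t module_bytes) {
    return (total_bytes + module_bytes - 1) / module_bytes;
}

// Total cost in US dollars for a given number of modules.
constexpr std::uint64_t total_cost_usd(std::uint64_t modules, std::uint64_t usd_per_module) {
    return modules * usd_per_module;
}

// 2TB of RAM built from 2GB modules at $80 each (price taken from the post).
static_assert(modules_needed(2 * kTiB, 2 * kGiB) == 1024, "module count");
static_assert(total_cost_usd(1024, 80) == 81920, "total cost in USD");
```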

(Professional hardware designers as a matter of course specify what are called "infinite
scatter-gather DMA controllers". "Infinite" is a bit of a misnomer (you are usually
limited to 4GB of descriptors), but each descriptor specifies a 32-bit address and a
32-bit length, allowing a single DMA transfer to transfer as many discontiguous blocks of
data as are needed to complete the I/O, in a single operation.)

You may require contiguous *virtual* memory, which is a different question, and when you
start looking at objects the size you are describing, either you have to assume that you
will be working with discontiguous memory or you have to go to a 64-bit native platform.
There are no other solutions.
joe
****
Joseph M. Newcomer [MVP]
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm


