Re: 32-bit programs on Windows x64
- From: "Peter Olcott" <NoSpam@xxxxxxxxxxxxx>
- Date: Mon, 7 Jul 2008 06:40:56 -0500
"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote in
message news:11r274dksj2ao6ksfcs0tvqaedq81gv0jv@xxxxxxxxxx
I don't come up with any level of abstraction in which aA DFA lexical analyzer does not compare its input string to
single-threaded sequential
machine has "parallelism"
joe
the set of valid strings one-at-a-time, it compares its
input string with the entire set of valid strings
all-at-once, simultaneously, and at the-same-time, thus
concurrently.
On Sun, 6 Jul 2008 19:08:08 -0500, "Peter Olcott"
<NoSpam@xxxxxxxxxxxxx> wrote:
Joseph M. Newcomer [MVP]
"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote in
message news:tul274tvva6eful7vfvupff52lgig717qf@xxxxxxxxxx
See below...
On Sun, 6 Jul 2008 12:49:16 -0500, "Peter Olcott"
<NoSpam@xxxxxxxxxxxxx> wrote:
****
"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote in
message
news:2fu174pnif6e48hff9ss2p86ht06uug60p@xxxxxxxxxx
See below...
On Sun, 6 Jul 2008 09:50:32 -0500, "Peter Olcott"
<NoSpam@xxxxxxxxxxxxx> wrote:
****
"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote in
message
news:46ev645h0n75e8rgc04am76tbfefd592c3@xxxxxxxxxx
See below...
On Sat, 5 Jul 2008 08:35:33 -0500, "Peter Olcott"
<NoSpam@xxxxxxxxxxxxx> wrote:
****
"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote
in
message
news:f6nt64937d35g2tii20qh5sskmvqccp1pp@xxxxxxxxxx
See below...I think that if enough RAM is available you are
On Fri, 4 Jul 2008 11:36:34 -0500, "Peter Olcott"
<NoSpam@xxxxxxxxxxxxx> wrote:
****
"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote
in
message
news:ntvq64djji32o6smjpgqj7soo397b09ilk@xxxxxxxxxx
There is no way to discover this. There are so
many
processes running that if you really
need 4GB of physical memory available to hold
your
entire
process, you will need at least
an 8GB machine, AND EVEN THEN THERE IS NO
GUARANTEE
THAT
PAGING WILL NOT OCCUR. It is
essentially not under your control. You can try
::VirtualLock, but it doesn't guarantee
that you can lock things down (and you need
special
admin
configuration parameters to
allow it to work at all, and you have to set
your
working
set quota).
Any belief that Windows can sustain real-time
behavior
is
based on various delusional
systems. It is not real-time and any illusion
that
it
is
real-time is just that, an
illusion.
A swapping operation takes 10-50ms depending on
a
variety
of parameters. But it is
impossible to process 4.2GB of data in 1/10 of a
second
using ANY algorithm, even if there
is nothing going on. Therefore the only thing
that
matters is the actual working set.
My system requires 4 GB to store its DFA pixel
pattern
recognizer. To match any particular pixel pattern
only
requires traversing a tiny subset of this 4 GB of
data.
The
4 GB of data is required so that millions of
different
pixel
patterns can be concurrently matched.
First, note that you are never going to get 4GB,
and
in
fact never going to get 2GB, on a
standard Win32 installation. If you are lucky,
you
*might* get 1.5GB on a good day. No
promises.
If you run a Win32 app on a Win64 system, you will
not
get
4GB. You might get, under
ideal conditions, about 3.5, but that might be
optimistic.
I don't know how the basic
allocator works near that level of stress.
supposed
to
get the whole 4.0GB.
YOU ARE STILL NOT GETTING IT!!!! IF THERE IS 512K ON
THE
MACHINE, YOU STILL GET THE 4GB!!!
THE AMOUNT OF RAM ON THE MACHINE IS COMPLETELY,
TOTALLY
UNRELATED TO THE AVAILABLE ADDRESS
SPACE!!!! HOW MANY TIMES DO I HAVE TO SAY THIS????
DO
NOT CONSIDER PHYSICAL RAM AS
IMPOSING ANY LIMITATIONS ON VIRTUAL ADDRESS SPACE!!!
All
physical RAM does is give you
more phyiscal address space. This can impact
performance,
but it has absolutely NOTHING
to do with how much address space is available to
you
in
a
process!!! Why do you insist
on thinking about physical memory as being an
operational
parameter in any of this?
What DOES matter here is that you have to load, in
your
4GB of address space, all the
32-bit DLLs that you are using. These will be
scattered
around throughout your address
space, cluttering it up, fragmenting it, and
generally
interfering with your ability to
get data space. Then there has to be room for your
program code, constants, and static
data, and that gets deducted. The only difference
between
Win32 and Win64 is the system
DLLS. In Win32, you also have KERNEL32.DLL,
USER32.DLL,
GDI32.DLL, NTDLL.DLL and
ADVAPI32.DLL), these interfaces are not present when
you
are running 32-bit apps in Win64
(I haven't verified the docs are not lying about
this).
Surprise, surprise, your
executable and all these DLLs TAKE UP ADDRESS SPACE
which
makes it unavailable to you for
data. So, on a good day, you might be able to use
1.5GB
of the 2GB (Win32), 2.5GB of the
3GB (server boot), or 3.5GB of the 4GB (Win32 on
Win64).
If you're lucky. I may be
optimistic in these estimates.
joe
****
3.5 GB may be enough, otherwise I can always go with a
Win64
on Win64.
****
Concurrently matched sounds suspicious. How many
threads
running on how many processors
is the first question about concurrency. Every
thread
>
total processors slows a
compute-bound computation down.
http://en.wikipedia.org/wiki/Deterministic_finite_state_machine
A DFA inherently does concurrent matching on all of
its
patterns, as opposed to and contrast with making one
comparison for each pattern to be matched.
No. A DFA/DFSM does *sequential* matching of its
input
stream to the states; I saw
nothing in that article that implies any
concurrency.
If
you have several DFSMs, and have
created the union of them (a technique I learned
over
40
years ago, and used to assign to
beginning programming students), you first get a
NDFSM,
but you can always reduce an NDFSM
to a DFSM, so you end up with a single DFSM that
does
sequential matching of input tokens
to states.
****
When a DFA is used as a lexical analyzer, it does not
need
to compare its input string against every possible
valid
token string, one-at-a-time, the input string is
simultaneously compared to all valid token strings in
parallel. It does this with no actual comparisons at
all,
its all a matter of indexes into the state transition
matrix.
Because of this the extra time required when
additional
strings are added to the set of possible valid token
strings
is zero. It takes the same amount of time (using a
DFA)
to
compare one string against one string, as it does to
compare
one string against 10,000 strings, because the strings
are
compared in parallel rather than sequentially.
It compares the string, sequentially, one token at a
time,
to the input string. The fact
that the implementation is a state transition table
does
not make it concurrent.
Note that state transition tables are usually sparse
matrices, and as such, are subject to
some really interesting optimizations, such as "comb
vector" layout, which substantially
reduces the space required while maintaining the
cosnstant-time lookup.
joe
There are two different levels of abstraction here,
physical
implementation, and functional end-result.
Even though the process is physically implemented as a
sequential set of steps, the functional end-result of
this
sequential set of steps is that one string is compared
against many strings at-the-same-time, simultaneously,
concurrently, and in parallel.
I find the use of "parellel" in single-threaded code to
be
a very suspicious use of the
word. If there is no concurrency, there is no
parallelism.
****
****
This is proven by the fact that adding strings to be
compared against does not add any time at all to the
matching process.
No, that's just an FSM. No parallelism is involved
****
****
In other words you are right at one level of
abstraction,
and I am right at another different level of
abstraction.
I don't see any concurrency at all. It's just an FSM.
joe
You are still stuck in a single level of abstraction.
****
Joseph M. Newcomer [MVP]
Joseph M. Newcomer [MVP]
****
So it then boils down to how many patterns/sec you
can
match. Note that the L2 cache is
probably going to cost you an order of magnitude
performance already.
joe
Why should it cost that much? That seems
unreasonably
high.
800 mhz RAM should (at least in theory) be no more
than
300%
slower than a 2400 mhz processor, and the purpose of
cache
is to reduce this penalty, not increase it.
It may seem high, but in fact if you end up
thrashing
the
L2 cache, you will lose 10x-20x
performance. This has been documented in a variety
of
experiments over the years. An L2
cache miss will cost you 10x-20x over an L2 cache
hit.
The lower bound is if all the data
in the cache is unmodified; the higher one is if
there
is
a dirty cache line and it has to
be written back first.
joe
****
Joseph M. Newcomer [MVP]
*****
Joseph M. Newcomer [MVP]
http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=7046848.PN.&OS=PN/7046848&RS=PN/7046848
Perhaps you really have an infeasible project.
joe
On Thu, 3 Jul 2008 14:03:01 -0700 (PDT),
PeteOlcott
<PeteOlcott@xxxxxxxxx> wrote:
Joseph M. Newcomer [MVP]What part of "virtual memory is not physical
memory"
are
you missing. I've said this
eight or ten or a dozen times, VIRTUAL MEMORY
IS
NOT
PHYSICAL MEMORY.
Not only does my process require 4.0 GB of RAM,
it
is
also
time
critical. If there is any swapping to and from
VM,
then
the current
project becomes infeasible. I must know how much
actual
RAM is
available to my process, even though this actual
RAM
will
be mapped to
VM. If three is any swapping to and from disk,
the
current
project
becomes infeasible, my maximum response time is
1/10
second.
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
.
- Follow-Ups:
- Re: 32-bit programs on Windows x64
- From: David Lowndes
- Re: 32-bit programs on Windows x64
- References:
- Re: 32-bit programs on Windows x64
- From: PeteOlcott
- Re: 32-bit programs on Windows x64
- From: Joseph M . Newcomer
- Re: 32-bit programs on Windows x64
- From: Peter Olcott
- Re: 32-bit programs on Windows x64
- From: Joseph M . Newcomer
- Re: 32-bit programs on Windows x64
- From: Peter Olcott
- Re: 32-bit programs on Windows x64
- From: Joseph M . Newcomer
- Re: 32-bit programs on Windows x64
- From: Peter Olcott
- Re: 32-bit programs on Windows x64
- From: Joseph M . Newcomer
- Re: 32-bit programs on Windows x64
- From: Peter Olcott
- Re: 32-bit programs on Windows x64
- From: Joseph M . Newcomer
- Re: 32-bit programs on Windows x64
- From: Peter Olcott
- Re: 32-bit programs on Windows x64
- From: Joseph M . Newcomer
- Re: 32-bit programs on Windows x64
- Prev by Date: Re: Date time picker control cannot set time from CTime mapping
- Next by Date: Re: Date time picker control cannot set time from CTime mapping
- Previous by thread: Re: 32-bit programs on Windows x64
- Next by thread: Re: 32-bit programs on Windows x64
- Index(es):