Re: pushing the envelope with sockets



Thanks for the reply (comments below):

"Helge Jensen" wrote:



Dan Ritchie wrote:

I'm using asynchronous I/O, and in the receive handler I issue another
BeginReceiveFrom immediately in order to have an I/O ready to receive as
quickly as possible. This works very well, but I find that my CPU usage will

Async IO can potentially avoid spending threads in massively parallel
applications, but why would it reduce CPU-load?

It wouldn't, but CPU is not the ONLY issue. With UDP, if an I/O has not
been issued when a packet is received, the packet is dropped. Using async I/O
and issuing subsequent reads immediately in the handler is the only way I was
able to keep up with the amount of data coming in (~30 Mbps in ~1 KB packets).
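
For anyone following along, here is a minimal sketch of the re-issue pattern described above, assuming a plain UDP socket bound to loopback. The class name, buffer size and the little self-send demo are made up for illustration; only BeginReceiveFrom/EndReceiveFrom come from this thread.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Hypothetical names throughout; this is a sketch, not the code discussed in the thread.
static class ReceiveLoop
{
    static Socket sock;
    static int received;

    // Post one asynchronous read with its own buffer.
    static void PostReceive()
    {
        byte[] buf = new byte[2048];
        EndPoint from = new IPEndPoint(IPAddress.Any, 0);
        sock.BeginReceiveFrom(buf, 0, buf.Length, SocketFlags.None,
                              ref from, OnReceive, buf);
    }

    static void OnReceive(IAsyncResult ar)
    {
        try
        {
            EndPoint from = new IPEndPoint(IPAddress.Any, 0);
            int n = sock.EndReceiveFrom(ar, ref from);
            PostReceive();                        // re-issue first, process afterwards
            byte[] buf = (byte[])ar.AsyncState;   // first n bytes of buf hold the datagram
            Interlocked.Increment(ref received);  // stand-in for real processing
        }
        catch (ObjectDisposedException) { }       // socket closed during shutdown
        catch (SocketException) { }               // receive aborted or failed
    }

    // Loopback demo: send 'count' datagrams to ourselves, report how many arrived.
    public static int Run(int count)
    {
        received = 0;
        sock = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
        sock.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        PostReceive();

        using (var sender = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp))
        {
            byte[] payload = new byte[1024];
            for (int i = 0; i < count; i++)
                sender.SendTo(payload, sock.LocalEndPoint);
        }

        // Wait up to about two seconds for the handlers to run.
        for (int i = 0; i < 200 && Volatile.Read(ref received) < count; i++)
            Thread.Sleep(10);

        int result = Volatile.Read(ref received);
        sock.Close();
        return result;
    }

    static void Main()
    {
        Console.WriteLine("received " + Run(10) + " datagrams");
    }
}
```

The handler posts the next read before touching the data, which is the whole point of the pattern. The cost is a fresh buffer per read.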


suddenly (and apparently randomly) increase dramatically (i.e., from < 30% to
about 60% on average). I've added performance counters, and identified that

During IO, or changing from blocking to non-blocking IO?

This is overall CPU usage. All I/O is asynchronous (non-blocking). The
point here is that without any apparent change in either the number of UDP
packets received or the number of reads being completed, CPU usage suddenly
jumps after holding steady for some time.


30 vs. 60% of how long? Comparing relative CPU usage doesn't make much
sense if it's not known what it's relative to.

30% or 60% of available CPU (i.e., out of 100% for a given time slice in
perfmon). CPU usage holds steady at under 30% on average for, say about 30
seconds, then jumps to 60% on average and stays that way. Again, the point
is there is nothing that changes in the load to explain the additional usage.


Try timing:

- receiving the way you do now
- receiving into a large buffer, but less often -- while still not
exceeding the OS-buffer-size, something like:

    byte[] large_buf = ...;
    while ( true ) {
        Receive(large_buf, ...);
        Thread.Sleep(TimeSpan.FromSeconds(0.1));
    }

Which spends more CPU? (It's not the same as which completes first!)
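
One possible way to run that comparison is to record wall-clock time and process CPU time side by side. A rough sketch follows. CpuTimer and Measure are hypothetical names, and the sleep and spin workloads are stand-ins for the two receive strategies. Process.TotalProcessorTime covers every thread in the process, so the figures are coarse.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not part of any library.
static class CpuTimer
{
    // Returns the wall-clock time and the process CPU time consumed while 'work' runs.
    public static (TimeSpan Wall, TimeSpan Cpu) Measure(Action work)
    {
        Process self = Process.GetCurrentProcess();
        self.Refresh();                               // drop cached process info
        TimeSpan cpuBefore = self.TotalProcessorTime;
        Stopwatch wall = Stopwatch.StartNew();

        work();

        wall.Stop();
        self.Refresh();
        TimeSpan cpuAfter = self.TotalProcessorTime;
        return (wall.Elapsed, cpuAfter - cpuBefore);
    }

    static void Main()
    {
        // Stand-in workloads: one mostly idle, one busy for about the same duration.
        var idle = Measure(() => Thread.Sleep(200));
        var busy = Measure(() =>
        {
            Stopwatch sw = Stopwatch.StartNew();
            while (sw.ElapsedMilliseconds < 200) { }
        });

        Console.WriteLine($"sleep: wall={idle.Wall.TotalMilliseconds:F0} ms, cpu={idle.Cpu.TotalMilliseconds:F0} ms");
        Console.WriteLine($"spin:  wall={busy.Wall.TotalMilliseconds:F0} ms, cpu={busy.Cpu.TotalMilliseconds:F0} ms");
    }
}
```

Both workloads take about the same wall-clock time, but only the spin should show significant CPU time. That is the distinction between "which completes first" and "which spends more CPU".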

the majority of the increase in CPU time is spent in the BeginReceiveFrom
call. This is surprising to me for a couple of reasons. First, this is a
non-blocking call, so I would expect it to return quickly whether data is
received or not. I also don't see any corresponding increase/decrease in

BeginReceiveFrom may complete synchronously, or it may need to queue in
an IO-completion-port for the next receive. The latter will probably
spend more or less CPU (although I haven't measured it).

Either way, the call itself should not block. The calls always complete
asynchronously (i.e., the handler is invoked), although it is possible for the
handler to be invoked on the calling thread.


Have you tried reading synchronously with a very large buffer, just to
see how that measures up to your async-IO?

Yes. Async I/O does better. It's the only way I can keep up with the send
rate.


If you re-issue BeginReceive immediately you must be allocating a new
receive-buffer for each receive. Perhaps a less CPU-intensive approach
would be to only have one buffer and process that before re-issuing receive?

See above, I must re-issue I/Os quickly in order to avoid UDP packet loss.
And again, the CPU usage is not consistent throughout the run (in fact it
changes suddenly after remaining steady) despite the fact that the I/O load
is consistent. Through measurement, I've determined that the vast majority
of the additional CPU time is spent inside BeginReceiveFrom, not in
allocating buffers, etc. And if you're wondering, the amount of time it
spends in garbage collection remains consistent as well. Before the jump in
CPU usage, the amount of time spent in BeginReceiveFrom is a small percentage
of the total CPU usage. When the total CPU usage jumps, the amount of time
spent in BeginReceiveFrom becomes a significant portion (about half) of the
total CPU usage.
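
To illustrate the kind of measurement described above, here is a rough sketch that accumulates the time spent inside a single call. It is not the instrumentation actually used in this thread, and CallTimer and its members are hypothetical names. Note that Stopwatch measures elapsed time, not CPU time, so the two only match if the call never waits.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not part of any library.
static class CallTimer
{
    static long totalTicks;   // Stopwatch ticks spent inside timed calls
    static long callCount;    // number of timed calls

    // Run 'call' and add its elapsed time to the running totals (thread-safe).
    public static void Time(Action call)
    {
        long start = Stopwatch.GetTimestamp();
        try
        {
            call();
        }
        finally
        {
            Interlocked.Add(ref totalTicks, Stopwatch.GetTimestamp() - start);
            Interlocked.Increment(ref callCount);
        }
    }

    public static long Calls
    {
        get { return Interlocked.Read(ref callCount); }
    }

    // Average elapsed time per call, in microseconds.
    public static double AverageMicroseconds()
    {
        long n = Interlocked.Read(ref callCount);
        if (n == 0) return 0.0;
        return Interlocked.Read(ref totalTicks) * 1e6 / Stopwatch.Frequency / n;
    }

    static void Main()
    {
        for (int i = 0; i < 1000; i++)
            Time(() => Thread.SpinWait(100));   // stand-in for the BeginReceiveFrom call
        Console.WriteLine($"{Calls} calls, {AverageMicroseconds():F2} us average");
    }
}
```

Sampling the average once a second alongside the packet rate would show whether the per-call cost changes while the call rate stays flat.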


either the number of packets received per second or the number of reads I
complete per second (which tracks very closely to UDP packets/second), so I
don't believe the number of BeginReceiveFrom calls I'm making is changing
commensurately.

You may be seeing switch-overhead. The faster you BeginReceive, the
fewer packets the OS will have queued up for you and the more calls to
Begin/End-Receive will be executed.

There's no queueing here beyond issuing the next I/O. That's because the
next I/O isn't issued until the previous one completes (the handler issues
the next I/O).


So my question is, is there anything in the Socket implementation that might
be causing this unexpected increase in CPU time? I've looked at
BeginReceiveFrom with .Net Reflector, and I'm wondering if perhaps the call
to ThreadPool.RegisterWaitForSingleObject might be stalling. I'm issuing on
the order of about 3500 I/Os per second, but I don't see my ThreadPool
availability being impinged. Any ideas? Thanks.
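
As a sanity check on the figures in this thread: ~30 Mbps in ~1 KB packets works out to about 3,660 packets per second if "1 KB" is taken as 1024 bytes, which is in line with the roughly 3500 I/Os per second mentioned above. That leaves about 273 microseconds per packet. The helper name below is made up.

```csharp
using System;

// Hypothetical helper for the arithmetic only.
static class PacketRate
{
    // Packets per second for a link rate in bits/s and a packet size in bytes.
    public static double PacketsPerSecond(double bitsPerSecond, int packetBytes)
    {
        return bitsPerSecond / (packetBytes * 8.0);
    }

    static void Main()
    {
        double pps = PacketsPerSecond(30e6, 1024);   // ~30 Mbps, ~1 KB packets
        double budgetMicros = 1e6 / pps;             // time available per packet
        Console.WriteLine($"{pps:F0} packets/s, {budgetMicros:F0} us per packet");
    }
}
```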

I would try reading in large chunks instead of small ones if low
cpu-usage is the goal, at least just to see if there's a difference.

Whether you need low latency or efficient use of threads is a different
matter from whether you are trying to minimize CPU usage.

--
Helge
