Re: how to minimize interrupt latency using interrupt affinity in



Could it delay it by 1.5ms though? All the interrupt does is store the
processor context on the stack and call the interrupt vector, no matter how
many threads are running. Even if the system idle thread is running, the
result should be the same as when there are many high-priority threads
scheduled to run...

Philip

"Scott Noone" wrote:

You're correct, your interrupt will interrupt any thread executing on the
processor. However, the overhead of scheduling and dispatching on the
processor could delay your interrupt from being delivered.


-scott

--
Scott Noone
Software Engineer
OSR Open Systems Resources, Inc.
http://www.osronline.com


"pgruebele" <pgruebele@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:68AE9531-67DD-47F4-8E0A-0B41B02DB98C@xxxxxxxxxxxxxxxx
This is Vista with the latest bits.

It should not be necessary to change thread/process affinity since
interrupts pre-empt all threads, including realtime-priority threads. It is
my understanding that no matter what threads are running on the system,
interrupts are serviced at a level higher than any scheduler-managed
thread...

Philip

"Scott Noone" wrote:

As long as threads can still be scheduled on the processor, there can be
activity on that processor. Are you also changing the affinity of all
threads so that nothing is scheduled on that proc?
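(As an aside, a minimal sketch of keeping one process's threads off the
interrupt core with a process affinity mask; the 0x0B mask below is just an
example for a 4-core machine with the interrupt on core 2, not a value from
this thread:)

    #include <windows.h>

    int main(void)
    {
        /* Example only: allow this process's threads on cores 0, 1 and 3
           (mask 0x0B) and keep them off core 2, where the device interrupts.
           This affects only this process; every other process would need the
           same treatment to truly empty the core. */
        DWORD_PTR processMask = 0x0B;

        if (!SetProcessAffinityMask(GetCurrentProcess(), processMask)) {
            return 1;   /* GetLastError() has the reason */
        }
        return 0;
    }

This of course only moves one process's threads; kernel threads and other
processes can still run on that core.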

Out of curiosity, which O/S is this?



--
Scott Noone
Software Engineer
OSR Open Systems Resources, Inc.
http://www.osronline.com


"pgruebele" <pgruebele@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:5F8A1548-16E3-4824-A247-D35716477489@xxxxxxxxxxxxxxxx
Thanks

The thing is that:

1. Doing disk activity such as searching for files greatly increases this
interrupt jitter, but this would not cause increases in SMIs, right? So SMIs
don't appear to be the culprit. Other things like TCP traffic seem to have a
similar effect as disk activity.

2. CLI/IF is processor/core-specific, right? If all ISRs except for mine run
on core 0, then they can only disable interrupts for that core (see the small
sketch after this list)... So other drivers can't really be causing this (I
have verified 100% that only my driver raises interrupts on core 2).

3. The jitter can reach >1.6ms, which seems like an awfully long time. Since
it happens mainly with system load (not just CPU load, since interrupts have
priority over all threads...), this means that device driver ISR/DPC activity
must somehow be causing this indirectly. The question is why?
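
(A minimal kernel-mode sketch of the point in 2 above: CLI/STI, reached
through the compiler intrinsics, only mask interrupts on the processor the
code happens to be running on. For illustration only, not something a driver
should normally do:)

    #include <ntddk.h>

    VOID BriefInterruptsOffSection(VOID)
    {
        /* _disable()/_enable() emit CLI/STI. They clear/set the interrupt
           flag only on the current processor, so an ISR affinitized to
           another core is not blocked by this window. */
        _disable();

        /* ...a few instructions of work with interrupts masked... */

        _enable();
    }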

My application is actually soft-realtime, so it can cope with these timing
errors OK. The problem is that when the system gets loaded, this interrupt
jitter becomes so large and frequent that it causes me to have to re-measure
too much data. I don't expect hard realtime performance. I just want to
understand why things are behaving as badly as they are given the
configuration I created...

Thanks

Philip

"Scott Noone" wrote:

"I can therefore say with some certainty that the problem must lie
with
either the kernel or interrupt hardware "

Or with some other piece of software. Don't forget that there is the
CLI
instruction that will disable maskable interrupts on the processor.
This
is
used in various parts of the kernel and in some exported APIs (the
ExInterlockedXxx package comes to mind). I'd find it slightly unusual
for
a
driver to disable interrupts on the processor with any kind of regular
frequency, though it wouldn't surprise me.
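
(For illustration - a minimal sketch, not code from any driver in this
thread - of the ExInterlockedXxx point: on x86 the legacy ExInterlocked list
routines briefly disable interrupts on the calling processor while the insert
is in progress:)

    #include <ntddk.h>

    typedef struct _WORK_ITEM {
        LIST_ENTRY Link;
        ULONG      Data;
    } WORK_ITEM, *PWORK_ITEM;

    static LIST_ENTRY g_Queue;
    static KSPIN_LOCK g_QueueLock;

    VOID InitQueue(VOID)
    {
        InitializeListHead(&g_Queue);
        KeInitializeSpinLock(&g_QueueLock);
    }

    VOID QueueWorkItem(PWORK_ITEM Item)
    {
        /* On x86 this briefly disables interrupts on the current processor
           while the list is updated - exactly the kind of short CLI window
           being discussed. */
        ExInterlockedInsertTailList(&g_Queue, &Item->Link, &g_QueueLock);
    }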

Also, the xperf output shown doesn't seem to take into account system
management interrupts (e.g. clock). These would also delay your ISR from
running.

This doesn't really help get you a solution, of course. But I guess the
moral is that Windows is not real time and, even though you've affinitized
all your interrupts to a particular processor, you still don't own that
processor. Other things can (and will) thwart your attempts to get
consistent results.

-scott


--
Scott Noone
Software Engineer
OSR Open Systems Resources, Inc.
http://www.osronline.com


"pgruebele" <pgruebele@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:5712D646-43D3-49FB-8E75-3B140FABCD29@xxxxxxxxxxxxxxxx
I have some further data which shows a concrete example of unexplainable
incidents of excessive interrupt jitter.

Running Xperf gives me the exact sequence of all system interrupts. With it
and Excel, I am able to look at the time delta between each of my interrupts
(nirlpk driver, the only one interrupting on CPU core 2). I ran a lot of
disk-intensive file searches and managed to make my interrupt get skipped
altogether. The table below shows this. nirlpk is my driver and it is the
ONLY one generating interrupts on core 2. Most of the nirlpk interrupts (not
shown in the table) are almost exactly 1.25ms apart, as they should be (this
is the period at which the hardware generates interrupts). However, the 2
nirlpk ISR calls in this part of the xperf table are 2.86ms apart. Note that
there is no ISR activity between 28409.61188 and 28411.48736, so there is
absolutely no reason why my interrupt should have been called so late.

DRIVER         core  ISREnterTime   ISRExitTime
_________________________________________________________
nirlpk.sys      2    28408.6198     28408.63592
USBPORT.SYS     0    28408.66316    28408.66808
HDAudBus.sys    0    28408.68044    28408.69992
USBPORT.SYS     0    28409.1872     28409.1902
HDAudBus.sys    0    28409.19404    28409.2018
ubohci.sys      0    28409.21444    28409.21864
ubohci.sys      0    28409.22256    28409.22848
ubohci.sys      0    28409.2426     28409.24764
USBPORT.SYS     0    28409.32928    28409.33412
HDAudBus.sys    0    28409.33704    28409.34732
ubohci.sys      0    28409.59304    28409.59644
ubohci.sys      0    28409.5994     28409.6024
ubohci.sys      0    28409.605      28409.61188
nirlpk.sys      2    28411.48736    28411.50836
ubohci.sys      0    28411.48976    28411.50056

How is this explained? These skipped or delayed ISR calls happen much more
frequently with system activity. Yet the table above shows that there was no
ISR activity before my delayed ISR call at 28411.48736. So, even if my ISR
were being called on core 0 like the other ISRs, there would be no reasonable
explanation of why my ISR is called so late...

Why is the kernel taking so long to call my ISR? Once again, the ISR does
not do any serious processing. All it does is acknowledge the interrupt and
check for interrupt overrun (which it got in the example above). I can
therefore say with some certainty that the problem must lie with either the
kernel or the interrupt hardware (modern Q6600 680Sli ACPI machine).
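
(For reference, this is roughly all the ISR does - a simplified sketch with
placeholder register and flag names, not the actual nirlpk source:)

    #include <ntddk.h>

    /* Placeholder device extension and status bits - illustrative only. */
    typedef struct _DEVICE_EXTENSION {
        PULONG        IntStatusReg;   /* memory-mapped status register */
        volatile LONG OverrunCount;
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

    #define INT_PENDING 0x1           /* assumed status bits */
    #define INT_OVERRUN 0x2

    BOOLEAN DeviceIsr(PKINTERRUPT Interrupt, PVOID ServiceContext)
    {
        PDEVICE_EXTENSION devExt = (PDEVICE_EXTENSION)ServiceContext;
        ULONG status;

        UNREFERENCED_PARAMETER(Interrupt);

        status = READ_REGISTER_ULONG(devExt->IntStatusReg);
        if ((status & INT_PENDING) == 0) {
            return FALSE;             /* not our interrupt */
        }

        /* Acknowledge the interrupt (write-one-to-clear is assumed). */
        WRITE_REGISTER_ULONG(devExt->IntStatusReg, status);

        /* Note an overrun - the condition hit in the trace above. */
        if (status & INT_OVERRUN) {
            InterlockedIncrement(&devExt->OverrunCount);
        }

        return TRUE;
    }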

Regards

Philip Gruebele

"pgruebele" wrote:

Hi Eliyas.

I just want to confirm some assumptions in relation to this interrupt
latency issue I am having (a small diagnostic sketch to check them follows
the list):

1. Each processor/core handles interrupts independently of the others.

2. If my device is the only device generating interrupts on core 2 (other
devices are on core 0), then no matter what my device's IRQL is set to, its
interrupt should always be serviced immediately, since no other interrupts
of higher IRQL should ever be running on core 2.

3. If a device driver ISR or DPC temporarily disables interrupts or performs
other similar actions, it will only disable these for the core that it is
running on (core 0), and should have no effect on my ISR running on core 2.

4. After my device generates an interrupt but before the kernel actually
calls my ISR, does the kernel try to acquire any spin locks which could
explain this latency?
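
(The diagnostic sketch mentioned above - a couple of real DDK calls that can
be dropped into the ISR to check assumptions 1-3; the hard-coded core number
is just this system's configuration:)

    #include <ntddk.h>

    static ULONG g_LastIsrCpu;
    static KIRQL g_LastIsrIrql;

    /* Call from the top of the ISR: records which core and IRQL it actually
       ran at. If assumptions 2 and 3 hold, the core is always 2 and the IRQL
       is the device IRQL, above DISPATCH_LEVEL. */
    VOID RecordIsrContext(VOID)
    {
        g_LastIsrCpu  = KeGetCurrentProcessorNumber();
        g_LastIsrIrql = KeGetCurrentIrql();

        ASSERT(g_LastIsrIrql > DISPATCH_LEVEL);
        ASSERT(g_LastIsrCpu == 2);    /* only core 2 should service it */
    }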

Thanks in advance

Philip Gruebele




"Eliyas Yakub [MSFT]" wrote:

I will summarize the discussion that I witnessed between two
engineers.

-----------

We are not able to clearly explain the reason for this jitter. If your
device is the only interrupting device on processor 3, it should have very
low interrupt latency. Setting affinity to 0x3 allows the hardware to select
between core 0 and core 1 at the time of each interrupt. The hardware is
supposed to choose the "lowest priority" processor, but the algorithms it
uses differ from machine to machine. In many cases, core 0 is the
tie-breaker and lots of interrupts end up on core 0.
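
(For reference, a minimal sketch of pinning a line-based interrupt to a
single core through the Vista IoConnectInterruptEx fully-specified path; the
vector, IRQL, and mask shown are placeholders that would normally come from
the translated resources or an interrupt affinity policy, not values from
this thread:)

    #include <wdm.h>

    NTSTATUS ConnectInterruptOnCore2(
        PDEVICE_OBJECT    Pdo,
        PKINTERRUPT      *InterruptObject,
        PKSERVICE_ROUTINE ServiceRoutine,
        PVOID             ServiceContext,
        ULONG             Vector,   /* from the translated resource descriptor */
        KIRQL             Irql)     /* likewise from the translated resources */
    {
        IO_CONNECT_INTERRUPT_PARAMETERS params;

        RtlZeroMemory(&params, sizeof(params));
        params.Version                             = CONNECT_FULLY_SPECIFIED;
        params.FullySpecified.PhysicalDeviceObject = Pdo;
        params.FullySpecified.InterruptObject      = InterruptObject;
        params.FullySpecified.ServiceRoutine       = ServiceRoutine;
        params.FullySpecified.ServiceContext       = ServiceContext;
        params.FullySpecified.SynchronizeIrql      = Irql;
        params.FullySpecified.FloatingSave         = FALSE;
        params.FullySpecified.ShareVector          = FALSE;
        params.FullySpecified.Vector               = Vector;
        params.FullySpecified.Irql                 = Irql;
        params.FullySpecified.InterruptMode        = LevelSensitive;
        params.FullySpecified.ProcessorEnableMask  = 0x4;  /* core 2 only */

        return IoConnectInterruptEx(&params);
    }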

The best answer we can come up with for why this is happening is that there
is some other kernel activity going on - kernel-mode workers that are
disabling interrupts, significant IPI traffic, etc. Maybe from some HD
software? Xperf traces are a good place for you to start. If you are running
on x86, the latency impact of this kind of activity is probably .