Re: How to check if a spin lock has been freed or released?



Stephan Wolf [MVP] <stewo68@xxxxxxxxxxx> wrote:
> Bill Paul wrote:
>> Unfortunately, people _do_ make assumptions. Developers assume that
>> they're writing Windows drivers, not NDIS drivers.
>
> Hmm, well, it would seem it is not the people but *you* who is making
> assumptions. Both on NDIS as well as on if or how other people make
> assumptions.

No, if anything I've been forced to make fewer assumptions than most. :)
I think you assumed that I'm a Windows driver developer. I'm not.
Remember, I said that I wrote the FreeBSD NDIS emulator (or "NDIS
wrapper" if you will). You can see the code at:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/compat/ndis/
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/if_ndis
http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.sbin/ndiscvt

It's totally different code from the Linux one, but performs the same
function. I carefully read the NDIS documentation and tried to make
my implementation do exactly what it said, and if I'd been aiming strictly
for source compatibility, that would have been the end of it. But like I
said earlier, the goal was binary compatibility with Windows 2K/XP/2K3,
and that caused a lot more headaches.

FreeBSD has a spinlock API that looks something like this:

struct mtx myMtx;

mtx_init(&myMtx, ...);      /* initialize the mutex (arguments elided) */
mtx_lock_spin(&myMtx);      /* acquire it as a spin mutex */
mtx_unlock_spin(&myMtx);    /* release it */
mtx_destroy(&myMtx);        /* tear it down when finished */

"So Bill," I hear you say, "why didn't you just map NdisAllocateSpinLock()
to mtx_init(), NdisFreeSpinLock() to mtx_destroy(), and so on? We designed
the NDIS API to accommodate you, right?"

Well, no. I can't do that. The NDIS API assumes that the underlying OS
supports a spinlock model like the one in Windows. This conflicts with
FreeBSD both in terms of semantics and API. Semantically, FreeBSD doesn't
have the notion of IRQLs. And in terms of API, FreeBSD's struct mtx
is a little bit larger than struct NDIS_SPIN_LOCK.

The structure size difference means you can't just pretend an NDIS_SPIN_LOCK
is a FreeBSD mtx structure with clever casting.
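
To make the size argument concrete, here is a minimal sketch in C. Both
structures are hypothetical stand-ins invented for illustration (they are
not the real NDIS_SPIN_LOCK or struct mtx layouts), and can_overlay() is
an assumed helper name. It only shows why a larger host lock cannot be
overlaid on the storage the driver reserved, while a single pointer can.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver-visible lock: one pointer-sized
 * lock word plus a saved IRQL byte. Not the real Windows definition. */
struct fake_ndis_spin_lock {
    uintptr_t     lock_word;
    unsigned char old_irql;
};

/* Hypothetical stand-in for a host mutex that carries more state than
 * the driver-visible lock has room for. Not the real struct mtx. */
struct fake_mtx {
    const char *name;
    void       *owner;
    uintptr_t   state;
    uintptr_t   flags;
};

/* Returns nonzero if a host object of host_size bytes fits inside
 * driver-owned storage of driver_size bytes, i.e. if casting the
 * driver's storage to the host type could not overrun it. */
static int can_overlay(size_t driver_size, size_t host_size)
{
    return host_size <= driver_size;
}
```

With these stand-ins, can_overlay(sizeof(struct fake_ndis_spin_lock),
sizeof(struct fake_mtx)) is 0, so casting would write past the end of
the driver's storage, whereas a lone pointer does fit.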

"But Bill," I hear you ask again, "NDIS_SPIN_LOCK is large enough to
contain a pointer. Why not just embed a pointer to a struct mtx inside
it, and have NdisAllocateSpinLock() just allocate a struct mtx off the
heap? It's a little gross, but the driver won't know the difference,
right?"

Again, no, I can't do that. Why? Because like I've been trying to say,
hardware manufacturers have released broken drivers for Windows 2K/XP/2K3
where the breakage happens to have gone unnoticed solely because the
NDIS implementation on those OSes let them get away with it. You can
create an NDIS driver for Windows that will appear to work correctly
even though it has errors like calling NdisFreeSpinLock() at the
wrong moment, or forgetting to call it at all. If I had implemented
NDIS spinlocks in terms of FreeBSD spinlocks, both of those errors could
easily crash FreeBSD even while having no effect on Windows. Calling
NdisFreeSpinLock() would destroy the FreeBSD mutex and release its
memory, so referencing it again could corrupt memory or cause a page
fault. Forgetting to call NdisFreeSpinLock() at all would leak memory.
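
As a rough illustration of how an emulator can tolerate both mistakes,
the sketch below keeps all lock state in a word that lives inside the
driver's own storage, so "freeing" the lock has nothing to release. This
is an assumption-laden userland model using C11 atomics, with made-up
emu_* names. It ignores IRQLs entirely and is not the actual FreeBSD
code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical driver-visible lock: the whole state is one word stored
 * in memory the driver owns. Nothing is allocated behind it. */
struct emu_spin_lock {
    atomic_int held;    /* 0 = free, 1 = held */
};

/* "Allocate": just initialize the word in place. */
static void emu_alloc_spin_lock(struct emu_spin_lock *l)
{
    atomic_init(&l->held, 0);
}

/* "Free": deliberately a no-op, so a driver that frees too early or
 * never frees at all neither corrupts nor leaks anything. */
static void emu_free_spin_lock(struct emu_spin_lock *l)
{
    (void)l;
}

/* Spin until the word changes from 0 to 1. */
static void emu_acquire_spin_lock(struct emu_spin_lock *l)
{
    int expected = 0;

    while (!atomic_compare_exchange_weak(&l->held, &expected, 1))
        expected = 0;   /* a failed exchange overwrites expected */
}

static void emu_release_spin_lock(struct emu_spin_lock *l)
{
    atomic_store(&l->held, 0);
}

static int emu_is_held(struct emu_spin_lock *l)
{
    return atomic_load(&l->held);
}
```

In this model a driver can call emu_free_spin_lock() and then acquire
the same lock again without touching freed memory, which mirrors the
kind of misuse described above.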

Another good example: NdisAllocateBufferPool() and NdisFreeBufferPool().
In Windows 2K/XP/2K3, an NDIS_BUFFER is really an MDL. This being the
case, NdisAllocateBufferPool() and NdisFreeBufferPool() don't really
need to do very much, since NdisAllocateBuffer() and NdisFreeBuffer()
end up doing IoAllocateMdl() and IoFreeMdl(), so no special pool
allocation or setup is really needed. The NDIS documentation warns
you that calling NdisFreeBuffer() _after_ the buffer pool has been
released with NdisFreeBufferPool() will cause a memory leak. However,
this did not stop RealTek from releasing a driver for their RTL8180
wireless chipset that did exactly this in its MiniportHalt() method.
This caused no problems on Windows since NdisFreeBufferPool() is a no-op,
but it broke on my NDIS implementation since my NdisFreeBufferPool()
was not a no-op and actually did release the pool.

Luckily, in this case, I had contacts at RealTek and I reported the
problem to them and they fixed it. RealTek had run their code with
the driver verifier, but it never flagged this as a problem. Eventually,
I rewrote my code so that I emulated IoAllocateMdl() and IoFreeMdl(),
which allowed me to duplicate the Windows NDIS semantics and avoid
the problem entirely.
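
The buffer pool behaviour can be modelled the same way. In the sketch
below, every emu_* name and the live_buffers counter are hypothetical,
and malloc()/free() stand in for IoAllocateMdl()/IoFreeMdl(). Each
buffer is allocated on its own, so the pool calls have nothing to set up
or tear down, and freeing a buffer after the pool is "released" stays
harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an NDIS_BUFFER backed by an MDL. */
struct emu_buffer {
    void  *va;
    size_t len;
};

static int live_buffers;    /* outstanding buffers, for illustration */

/* Pool setup reserves nothing because buffers are allocated one by one. */
static void emu_alloc_buffer_pool(void **pool)
{
    *pool = NULL;
}

/* Pool teardown is a no-op for the same reason. */
static void emu_free_buffer_pool(void *pool)
{
    (void)pool;
}

/* Each buffer is its own allocation. The pool handle is ignored. */
static struct emu_buffer *emu_alloc_buffer(void *pool, void *va, size_t len)
{
    struct emu_buffer *b;

    (void)pool;
    b = malloc(sizeof(*b));
    if (b == NULL)
        return NULL;
    b->va = va;
    b->len = len;
    live_buffers++;
    return b;
}

/* Safe even after emu_free_buffer_pool(), since the pool owns nothing. */
static void emu_free_buffer(struct emu_buffer *b)
{
    if (b == NULL)
        return;
    free(b);
    live_buffers--;
}
```

Releasing the pool first and the buffer second, which is the order
described above for the halt path, leaves live_buffers at zero in this
model.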

There are other gotchas too. A big one is that in Windows 2K/XP/2K3,
you're allowed to allocate from the heap with ExAllocatePoolWithTag()
while at DISPATCH_LEVEL (provided of course you allocate from the
NonPagedPool). In FreeBSD, this would translate to being able to
call malloc() while holding a spinlock. Except you can't do that.

FreeBSD has both spinlocks and sleep locks. You are strongly
discouraged from using spinlocks unless you really really know what
you're doing, and encouraged to use sleep locks instead. In the FreeBSD
kernel, malloc() acquires lots of sleep locks, even in the case where
you call it with M_NOWAIT. The code in kern_malloc.c acquires sleep
locks. The UMA memory manager acquires sleep locks. The machine
dependent pmap code acquires sleep locks. You can't swing a dead cat
in FreeBSD without acquiring a sleep lock. (The fact that FreeBSD
uses interrupt threads is one reason why you can get away with it
being designed this way.)

Obviously, I can't invoke a routine that will acquire a sleep lock
if I'm already holding a spinlock (i.e. at 'DISPATCH_LEVEL'). However
there are NDIS drivers that will do NdisAllocateMemoryWithTag() after
doing an NdisAcquireSpinLock(). This was yet another reason why I
couldn't just map the NDIS spinlock API to the FreeBSD spinlock API.
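
The rule being violated can be stated as a tiny model. The sketch below
is purely illustrative: the model_* names are invented, the counter
stands in for "this thread holds a spinlock", and nothing here is
FreeBSD kernel API. It only captures the constraint that anything which
may take a sleep lock is illegal while a spinlock is held.

```c
#include <assert.h>

static int spin_depth;  /* spinlocks currently held (model only) */

/* Model of taking a spinlock, e.g. what NdisAcquireSpinLock() implies. */
static void model_spin_enter(void)
{
    spin_depth++;
}

/* Model of dropping a spinlock. */
static void model_spin_exit(void)
{
    assert(spin_depth > 0);
    spin_depth--;
}

/* Returns 1 if an operation that may acquire a sleep lock (such as a
 * kernel malloc()) would be legal right now, 0 otherwise. */
static int model_may_take_sleep_lock(void)
{
    return spin_depth == 0;
}
```

A driver that allocates memory between model_spin_enter() and
model_spin_exit() is exactly the case where model_may_take_sleep_lock()
returns 0.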

It took me quite a while to come up with a spinlock implementation that
a) worked, b) preserved all the necessary Windows semantics and API
design, and c) played well with FreeBSD. It was a hassle, but it was
worth it.

Now, I could have just said "the hell with it" and implemented the
NDIS API just the way the documentation describes it, and broken drivers
be damned. But then I'd have to answer to those FreeBSD users with
their sad puppy dog eyes who want to use their new NIC that only
has Windows drivers. I can explain to them how the vendor supplied
NDIS driver violates the NDIS spec until I'm blue in the face, but it
won't matter to them. "But... but... it works with Windows," they will
whine. Naturally, they _assume_ this means everything's ok.

And sadly, they aren't the only ones who think like this.

> One thing a chief developer I worked for taught me was "programming
> errors are *always* the result of false assumptions." Think of it a
> little. It has turned out to be true for me during the last 15 years.

Unfortunately, this has not stopped many developers (not to mention
managers and marketers) from making the grandest assumption of all,
namely: "If it works with Windows, it must be ok."

The implication here is that it's perfectly acceptable to disregard published
specs in favor of Windows compatibility. I'm not saying that people
necessarily set out to ignore the specs, but they don't put what I
would say is enough effort into verifying compliance with them, at
least not once they see that their code works with Windows. And Microsoft
certainly doesn't go out of its way to discourage this behavior. (I
mean, why should they?)

The greatest example of this is ACPI. There are countless motherboards
out there with AML code which will not even compile cleanly with the
Intel "iasl" ASL compiler, which is part of Intel's reference ACPI
implementation that supposedly tracks the spec very closely. But that
doesn't matter to them, because even though the AML may be thoroughly
broken according to the spec and might not work with other ACPI
implementations, it does work with Microsoft's implementation, ergo
it must be ok.

"But surely since Windows has such a huge market share, it only makes
sense that they should devote their resources to Windows compliance,
right?"

Well, if you're going to make that argument, you're basically saying
that we should just forget about standards bodies and let Microsoft
dictate how things should work. This tosses the whole notion of 'open
standards' out the window, pardon the pun. I mean, what's the point
of having specifications anyway if you're not going to give them
due consideration?

> Still I would not make any assumptions on how NDIS functions are
> actually implemented. AFAIK, there even exists a NDIS Wrapper for x86
> Linux, which enables Windows NDIS miniport drivers (binary file) to run
> under x86 Linux. Your assumptions on spinlocks probably fail for this
> scenario.

I know exactly how the Linux ndiswrapper works. In fact, it now implements
spinlocks much in the same way I do. Remember, the goal is to provide
binary compatibility for Windows 2K/XP/2K3 drivers. (And not for
Win95/98/ME, just in case that's not clear. I mean, if they do work that's
great, but I'm only supporting NDIS 5.0 and later.) I'm not trying to
write my own drivers: I'm trying to make other people's drivers work, and
unfortunately that means I do have to know the exact implementation
details, because as much as I'm sure Microsoft would like it to be
otherwise, the drivers do depend on them. The people who wrote the
drivers _assumed_ their code was ok because "it worked with Windows."
And now I have to clean up after these assumptions.

> More, it was common to have binary compatible NDIS miniport drivers for
> all of W95/W98/WME/NT4/W2K until MS added some incompatible features
> (see also http://www.wd-3.com/031504/NDISCompile.htm).

Yes, I know about this too. (I've read Walter Oney's book on WDM.) If
a driver is binary compatible with W95/W98/WME/NT4/W2K, then it should
also work with my NDIS emulator. Building a cross-platform driver like
this would imply someone using the same code base for all platforms, so
bugs like NdisFreeSpinLock() misuse would show up immediately during
testing on the 95/98/ME platforms (assuming they, like CE, have an
NdisFreeSpinLock() that's also not a no-op). It also implies that the
developer has a greater familiarity and experience with Windows than
most, and is therefore less likely to make such silly coding mistakes
in the first place. Unfortunately, few vendors are liable to produce
such cross-platform binaries anymore, and as long as the 2K/XP/2K3
NDIS implementation works the way it does, I think it's fairly safe
to assume that I'll continue to see broken drivers being released. :)

-Bill

--
=============================================================================
-Bill Paul (510) 749-2329 | Senior Engineer, Master of Unix-Fu
wpaul@wind<(/>nospam**!!river.com | Wind River Systems
=============================================================================
<adamw> you're just BEGGING to face the moose
=============================================================================