Re: Explain this about threads


Jon Slaughter wrote:
So blocking is something done on the thread side? I don't really get it. If I've created a thread, how can I "block" while I wait for a resource to become available?

You call a method that will block on the resource. Which method you need to call depends on the resource.

For what it's worth, you seem to already have the basic understanding of what has to happen. That is, you've already deduced the requirements for being able to have something that allows a thread to block. So it seems the main thing lacking is practical experience with, or knowledge of, the concrete mechanisms involved. So I'll try to focus more on that.

As a simple example, consider Console.ReadLine(). This is a blocking call. The thread that calls it will block until the user has entered a full line of data. In reality, behind the scenes there's some work to deal with checking for the end-of-line, but the basic idea is that as long as the user isn't entering any data, that thread isn't doing any work. It's not runnable. It's blocked.

You're essentially saying it's better to block than

while (resource_unavailable)
    Thread.Sleep(10);

or something like that... but then how can one actually block? Is it special API routines?

Well, Sleep() is also a blocking method. In your example above, your thread will be unrunnable for (roughly) 10 milliseconds. This is your thread blocking.

But, no...I wouldn't say that blocking happens via "special API routines". It is true that blocking is a consequence of calling specific methods, but the number and variety of those methods is so great, I'm not sure the term "special" really applies. Blocking is almost always more a natural consequence of a specific design than it is something you do specifically to block.

If _all_ you really want to do specifically is block, then I'd say that Sleep() would be the function that does that. So in that respect, sure...that's the "special API routine" that blocks. But there are many, many other ways for a thread to block.

It seems that blocking would have to occur by the code, such as a driver, that actually can tell when the resource is ready. So there really couldn't be self blocking? (seems like it could never wake up)

See above. But yes, generally speaking, when a thread calls a method that blocks, it has to be using some sort of resource that the OS knows about so that the OS can wake the thread back up.

In some cases, this resource is some kind of synchronization object, like a WaitHandle. This would be something one thread would use to communicate with another, where one thread waits on the object, and another thread sets the object when it's done whatever the first thread was waiting for.
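
A minimal sketch of that wait/set pattern, assuming a ManualResetEvent (one of the concrete WaitHandle classes in System.Threading). The names SignalDemo, sharedResult and seen are made up for illustration, and the Sleep(50) is only a stand-in for whatever real work the signaling thread would do.

```csharp
using System;
using System.Threading;

static class SignalDemo
{
    public static string Run()
    {
        string sharedResult = null;   // hypothetical "resource" produced by the main thread
        string seen = null;

        using (var ready = new ManualResetEvent(false))
        {
            var waiter = new Thread(() =>
            {
                ready.WaitOne();      // blocks: this thread is not runnable while it waits
                seen = sharedResult;  // runs only after Set() has been called
            });
            waiter.Start();

            Thread.Sleep(50);         // stand-in for the real work
            sharedResult = "data";
            ready.Set();              // signals the event so the waiter can be scheduled again

            waiter.Join();            // Join() is itself a blocking call
        }
        return seen;
    }

    static void Main()
    {
        Console.WriteLine(Run());     // prints "data"
    }
}
```

Note that the waiting thread never checks anything in a loop. It simply does not run until the event is set.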

Another example would be the Monitor class, or even just the lock() statement. Again, these are synchronization mechanisms, but this time rather than having one thread explicitly waiting and another explicitly signaling, they are ways of having both threads indicate to the OS "hey, I want this resource" and then allowing the OS to do the hard work of ensuring that only one thread is runnable at a time while they are using that resource.
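
A minimal sketch of the lock() case, assuming two threads incrementing one shared counter. LockDemo, gate and counter are hypothetical names. Neither thread signals the other explicitly; each just asks for the lock and waits if it is taken.

```csharp
using System;
using System.Threading;

static class LockDemo
{
    public static int Run()
    {
        object gate = new object();   // the object both threads lock on
        int counter = 0;

        ThreadStart work = () =>
        {
            for (int i = 0; i < 100000; i++)
            {
                lock (gate)           // a thread that finds the lock taken waits here
                {
                    counter++;        // only one thread at a time executes this line
                }
            }
        };

        var a = new Thread(work);
        var b = new Thread(work);
        a.Start();
        b.Start();
        a.Join();
        b.Join();
        return counter;
    }

    static void Main()
    {
        Console.WriteLine(Run());     // prints 200000
    }
}
```

Without the lock the final count would usually come out lower, because the two threads would overwrite each other's increments.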

Another broad class of blocking methods is i/o methods, such as the Console.ReadLine() call I mentioned above. Pretty much any form of i/o you can do involves at some level a call to a low-level OS function that uses some internal signaling method to allow the OS to make your thread unrunnable until the i/o completes.

IMHO, this is probably the most applicable example in your own case. You seem to be doing i/o, and so this is a natural scenario in which to take advantage of blocking behavior.
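
A minimal sketch of a thread blocking on i/o, assuming an in-process anonymous pipe (System.IO.Pipes) as a stand-in for a real device. PipeDemo is a hypothetical name, and the Sleep(100) just simulates a device that has nothing to deliver yet.

```csharp
using System;
using System.IO;
using System.IO.Pipes;
using System.Threading;

static class PipeDemo
{
    public static string Run()
    {
        using (var server = new AnonymousPipeServerStream(PipeDirection.Out))
        using (var client = new AnonymousPipeClientStream(PipeDirection.In, server.ClientSafePipeHandle))
        {
            string line = null;
            var reader = new Thread(() =>
            {
                var input = new StreamReader(client);
                line = input.ReadLine();   // blocks until a full line has arrived
            });
            reader.Start();

            Thread.Sleep(100);             // the "device" has no data yet; reader uses no CPU

            var output = new StreamWriter(server);
            output.WriteLine("hello");
            output.Flush();                // data arrives; the OS makes the reader runnable

            reader.Join();
            return line;
        }
    }

    static void Main()
    {
        Console.WriteLine(Run());          // prints "hello"
    }
}
```

The reading thread contains no loop and no timing code. The synchronous ReadLine() call is all it takes to be blocked until the data is there.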

It sounds as if it's like a lock that the scheduler uses to know when to put a thread back into a queue? The thread that controls the resource is the one that controls the lock? In this way, say, if I'm some driver with resource X and thread Q requests X but X is not ready, I can set a lock on X w.r.t. Q and the scheduler will block Q.

Yes, to some extent that's a fair description of how it works. The exact mechanism varies, but it always comes down to some means of the thread registering itself (in practically all cases, this "registering" is implicit in whatever blocking technique is being used) with the OS so that the OS itself knows under what condition the thread will become runnable again. Then the OS manages all aspects, including moving the thread back to the runnable state when appropriate so that the scheduler can allow the thread to start executing again.

Basically you are using the term blocking in a self-referential way like "I can either block myself or not"... but I don't understand how that could work.

Well, you are correct when you observe that if a thread is blocked, it cannot unblock itself. A blocked thread _always_ depends on something outside itself to unblock it. Sometimes that's as simple as the time period given in the Sleep() call expiring, sometimes it's something else, like some i/o completing, or another thread releasing a shared resource.
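
A minimal sketch that makes the blocked state visible, assuming an AutoResetEvent and the Thread.ThreadState property. BlockedStateDemo is a hypothetical name, and the 200 ms pause is only an assumption about how long the worker needs to reach WaitOne().

```csharp
using System;
using System.Threading;

static class BlockedStateDemo
{
    public static bool Run()
    {
        using (var wake = new AutoResetEvent(false))
        {
            var worker = new Thread(() => wake.WaitOne());
            worker.Start();

            Thread.Sleep(200);   // give the worker time to reach WaitOne()

            // While blocked, the worker reports WaitSleepJoin and executes no code.
            bool wasBlocked = (worker.ThreadState & ThreadState.WaitSleepJoin) != 0;

            wake.Set();          // only code outside the worker can make this call
            worker.Join();
            return wasBlocked;
        }
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

ThreadState is meant for diagnostics rather than for synchronization, so this is only a way to observe the behavior, not something to build logic on.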

Saying that a thread can "block itself" is only self-referential in the way that any reflexive verb is. That is, I suppose it technically is in fact self-referential, but I don't see that as a problem.

[...]
How could thread B block itself and know when to "unblock" since it has no idea when the resource will be available?

It can't. It relies on cooperation between the OS and whatever has the resource in order to signal a state that will allow the OS to know that the thread can be unblocked.

The only thing that knows this is the code that sits on top of the resource and it would seem that it would have to do the blocking and unblocking.

That's right. Because of the way the OS is designed, the holder of the resource doesn't need to know who else wants it though. All it needs to do is signal to the OS that it's done with the resource, and the OS will take care of the rest.

Keep in mind that synchronizing a shared resource isn't the _only_ way to block, it's just one way to block. But in any other way, there is a similar mechanism involved that allows the OS to know all of the details necessary in order to unblock the blocked thread when appropriate.

Now if there is a way to set a block on a specific resource like

In Thread B:

Ask for resource,
Block until resource available.

But internally it would seem that the blocking really occurs by the controller of the resource and the scheduler actually deals with that?

Define "controller of the resource". In some respects, the OS is the "controller". That is, ultimately it is the OS managing who gets the resource at any given time. On the other hand, you could say that whatever thread currently has acquired the resource is the "controller". That is, until that thread releases the resource, the resource is controlled by that thread and unavailable to any other thread.

This means that as long as thread B is blocked, thread A and thread C can share the CPU without having thread B using any CPU time. Assuming you just have the three threads, then basically thread A and thread C get their usual share of the CPU, plus they both get to use the time thread B otherwise would have used.

Yeah, I understand this. I don't understand how B can actually do any blocking, because the only way it could do this is to poll the resource until it's ready or hook some interrupt. The first case then isn't blocking and the second isn't available to normal Windows applications.

Ah, but it is. The second case, that is. Normal Windows applications don't have direct access to the interrupts, no. But they do have access to methods that allow the OS itself to use the interrupts, which implicitly provides a mechanism for the application itself to use the interrupts.

This is, in fact, how a lot of the various i/o methods work.

The way I see it, Thread B would have to ask to be blocked until the resource is available (well, essentially this would be implicit if it's a synchronous call).

Yes, that's exactly what happens. By making the synchronous call, the thread is implicitly telling the OS "block my thread until the resource is available".

Like I said, it seems to me you already know how it works. You just don't realize it. :) You've deduced what must happen; the only missing part is that you don't seem to realize that these mechanisms do in fact already exist in the OS.

So the flow might go like

In Thread B,

Ask for resource,
Please Block me until resource is ready

So far, so good.

----

Thread A,
Thread B asked to be blocked until resource is ready,
Tell kernel to block B,
Wait until resource is ready (either through polling or interrupt)
Tell kernel to unblock B
return

Not quite right here. Thread A doesn't tell the OS to block B. The OS already knows, by the semantics of whatever synchronization mechanism is at work, that B needs to be blocked.

The only involvement thread A might have is in managing the resource that the OS already knows thread B needs to be blocked on. This might mean that thread A is holding the resource at the moment thread B requests it. Or thread A might be a device driver thread managing i/o, and by asking for some specific i/o on that device, thread B implicitly tells the OS to block itself until thread A has completed whatever i/o task is required to provide the result thread B wants.

Thread A might have some implicit relationship to thread B, but it's the OS that manages which threads block and which ones get to run.

[...]
The bottom line here is that polling is bad. Really bad. It takes the one thread that actually has no work to do, and causes it to use the most CPU time out of any thread running on the system. Polling is almost always counter-productive. It's almost never the right way to solve a problem.


Right, I realize that because for my application I am doing something like polling.

I was wondering if somehow I could get around it by blocking but I am unfamiliar with this.

I have seen your comments in other threads. I have yet to see anything in your comments that suggest that you really need polling.

I can't rule it out, but because of the way Windows works, polling is usually not going to solve a problem in the way you might hope it would.

For practically any application, using some form of blocking i/o is the appropriate solution. If you have an application that has such time-critical needs that some sort of polling mechanism might be required, it's almost always the case that that application just will never work properly on Windows, because of its lack of real-time features.

In fact though I don't think I can get around polling the way I am doing it because I'm working directly with hardware. Basically just using a proxy to do the hardware. Not only do I have to poll but I also have to introduce spin waits to slow down the process. This is bad but I think I have no choice.

Without knowing the full details of your project, I can't really offer much specific advice. While I have taken note of some of what you've written in the other threads, I admit that I haven't been following the conversation closely. If you've posted all of the gory details, I didn't happen to catch that.

If you have no managed access to the i/o from the hardware, and no integrated unmanaged access to the i/o (that is, via one of the higher-level i/o APIs in Windows), then I suppose it's possible polling is your only option. However, even there what you should do is use a call to Sleep(), with a 1 millisecond timeout, any time you poll and don't have any work to do, to ensure that your thread immediately yields its timeslice.
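
A minimal sketch of that kind of polite polling loop, assuming a volatile flag (dataReady, a hypothetical name) standing in for a hardware status bit that offers no blocking API. The "device" thread here is simulated.

```csharp
using System;
using System.Threading;

static class PollDemo
{
    // Hypothetical stand-in for a hardware status flag that cannot be waited on.
    static volatile bool dataReady;

    public static int Run()
    {
        dataReady = false;
        var device = new Thread(() =>
        {
            Thread.Sleep(50);      // simulated device takes a while
            dataReady = true;
        });
        device.Start();

        int polls = 0;
        while (!dataReady)
        {
            polls++;
            Thread.Sleep(1);       // no work to do: give up the rest of the timeslice
        }

        device.Join();
        return polls;              // number of checks made before data showed up
    }

    static void Main()
    {
        Console.WriteLine("polls: " + Run());
    }
}
```

Without the Sleep(1) the loop would check the flag millions of times and keep a CPU busy for the whole wait. With it, the count stays small, at the cost of reacting up to one sleep interval late.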

If you do have managed access to the i/o, then it will be much better to take advantage of the blocking behavior of the synchronous API, or in many cases even better would be to use the asynchronous API (typically, this would involve using the methods with the words "Begin..." and "End..." at the start of the name).
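
A minimal sketch of the Begin.../End... pattern, assuming Stream.BeginRead() and Stream.EndRead() on a MemoryStream. The MemoryStream is only a stand-in so the example is self-contained; with a real device stream the read would actually be pending while the caller does other things. AsyncReadDemo is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

static class AsyncReadDemo
{
    public static string Run()
    {
        byte[] source = Encoding.ASCII.GetBytes("hello");
        byte[] buffer = new byte[16];
        string result = null;

        using (var done = new ManualResetEvent(false))
        using (Stream stream = new MemoryStream(source))
        {
            // Start the read; the callback runs when the read has completed.
            stream.BeginRead(buffer, 0, buffer.Length, ar =>
            {
                int count = stream.EndRead(ar);   // collects the result of the read
                result = Encoding.ASCII.GetString(buffer, 0, count);
                done.Set();
            }, null);

            // The calling thread is free to do other work here.

            done.WaitOne();                       // block only when the result is needed
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(Run());                 // prints "hello"
    }
}
```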

I just don't understand how one can do orderly communications in Windows. In DOS, say, if you wanted to communicate at some desired rate you could hook the timer interrupt and it would be called every tick of the timer... You could configure the timer for several frequencies. So if I wanted to communicate with the parallel port in a timely fashion I could do this quite easily by hooking the timer.

Well, yes and no. If you've done DOS programming, then you know that you can get problems when multiple pieces of code all try to hook the same timer. Either the hooks get properly chained, in which case you have the potential for performance issues, or they don't, in which case you simply wind up with some code not getting timer notifications.

So it's not a panacea. :) But yes, DOS provides a low-level way to do this sort of thing.

On the other hand, DOS isn't really a multi-tasking OS. Yes, there are DOS programs that use various techniques to implement a form of multi-tasking, but these are always fragile and rely on cooperation between the various pieces of code all trying to run at once.

Windows, on the other hand, is a true multi-tasking OS. Code running on Windows does not get to explicitly decide if and when it will run, but Windows does provide other mechanisms to deal with efficiently allowing multiple pieces of code all to share the same CPU resources.

As is the case in all consumer-level multi-tasking OS's though, Windows doesn't provide any sort of real-time management. So the price of being able to use this higher-level, more efficient multi-tasking API is that control over exactly when your thread will execute is lost.

This doesn't prevent orderly communications. But it does prevent you from knowing _exactly when_ communications will take place, yes.

Fortunately, for practically all i/o that a Windows application might be asked to do, the OS switches between threads quickly enough that the end-user never will notice any difference.

I could also use interrupts to get input when a resource is available (such as data on the parallel port).

But both of these methods are impossible to do in user mode code.... I'm trying to read about kernel mode drivers and see if this is possible to do in a driver (I know I can hook interrupts but I'm not sure about the timing, so I still might have to introduce spin waits).

A kernel mode driver does have access to the same or similar mechanisms you're familiar with from DOS, including interrupts. And in fact, if for some reason you need this low-level access to the hardware, writing a driver is often the best solution, especially if you can use interrupts (even in a driver, polling has similar problems).

But, it's important to keep in mind that even if you put that sort of logic into a driver, there's no way for that driver to interact with a user-mode piece of code in a way that takes the thread-scheduling issues out of the picture. The driver itself can be less-affected (though it's subject to the same thread-scheduling rules, so... :) ), but in the end if you're writing this code on Windows, presumably the data is eventually presented to the user, or written to a file, or whatever, and at that point it still has to go through user-mode code and will suffer the same timing idiosyncrasies that user-mode code always has.

But if I could block for a very precise amount of time in a user mode program I could simulate the timer interrupt. Essentially having some Thread.Sleep but for high resolution. I know this is impossible on the PC though, but it would be nice.

Sure, it would be nice. You can in fact use other timer mechanisms. For example, the multimedia timers in WinMM provide higher-resolution timing. I don't know if there are similar high-resolution timers in .NET, but there might be.

But even with higher-precision timers, your thread will be subject to the thread-scheduling rules. There will always be limits to just how much precision you get under Windows.
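
A minimal sketch for seeing those limits on a given machine, assuming System.Diagnostics.Stopwatch as the measuring tool. It asks for 1 ms sleeps and reports how long each one took on average. SleepResolutionDemo is a hypothetical name, and the result depends on the OS version and timer configuration, so no particular number should be expected.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class SleepResolutionDemo
{
    public static double Run()
    {
        const int rounds = 20;
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < rounds; i++)
        {
            Thread.Sleep(1);   // request 1 ms; the scheduler decides when we actually resume
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / rounds;
    }

    static void Main()
    {
        Console.WriteLine("average Sleep(1): " + Run() + " ms");
    }
}
```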

(Actually what would be nice is to have a small separate CPU that is specifically designed for timed communications, so one could load some code there and it would always run at some specified rate, independent of the main CPU and OS)

Well, IMHO that's known as "DMA". :)

Ok, So I guess I was right in that blocking ultimately is part of the OS. I still don't know what the blocking techniques are though but maybe they are the synchronous calls?

Synchronous i/o calls are a form of blocking, yes.

I think the problem is, though, that in some cases spinning is the only way to do something. For example, I have a kernel mode driver that I use to communicate with the port. It emulates in and out. So if I want to send a sequence of bits to the port I might do

out a;
out b;
out c;
etc..

but then this runs as fast as possible.

I guess part of the question is why do you need this to run "as fast as possible"? Or is it more a matter if you _don't_ want it to run "as fast as possible"? Is there something about the i/o where you need to control the exact timing of the use of the port?

Typically, there is some buffering available for i/o ports. The buffers aren't huge, but they are large enough for the driver managing that i/o port. Then the driver itself has larger buffers that it uses to manage data on a timing frequency appropriate to user-mode thread scheduling. Of course, typically there's also some sort of handshaking on the i/o port so that the driver can signal that the buffers are full (forcing the appropriate end of the i/o connection to stop sending, whether that's the hardware or the user-mode code).

But it's true that not all i/o devices support this sort of handshaking (in particular, in the hardware-to-computer direction...software should always be able to stop sending when a buffer is full, though no doubt there are implementations out there that don't), and in those cases it's actually possible to lose data if it's not read fast enough.

But I'm not clear on how that applies here.

It's actually not that fast, as it takes about 7us, at least on my computer, to send just one out. So as you can see, excluding task switch interruptions, this code runs at about 150kHz or so. It's not all that slow actually, but it would be nice to have it run as fast as possible (there is an upper limit on the speed of the port but I forgot what it is... I think it's around 100kHz or so but it depends on the chip used). But let's suppose it's too fast and I want to slow it down?

If what is too fast? Like I said, most i/o devices deal with this inherently. They use buffering and handshaking to ensure that data is not transmitted too quickly in either direction. If data is being sent by the hardware so fast that the software can't keep up, i/o is simply stopped until the software catches up. Likewise the other direction.

For what it's worth, it's hard for me to imagine hardware so fast that the software can't keep up. The memory controller is one of the fastest i/o devices in a computer, if not THE fastest, and software still winds up waiting on memory on a regular basis. So _on average_, all code can execute plenty fast enough to keep up with any i/o device attached to a computer.

You seem to be asking about the other direction; software being too fast for the hardware. But that raises the question: why is the hardware driver not taking care of this already? I seem to recall this was a regular parallel port; am I misremembering? A standard parallel port driver should handle all of the buffering you need for it to work correctly with whatever parallel device is attached to the hardware.

The only way is to introduce delays. Since the only way to delay for microseconds is to introduce a spin wait, I have no choice. Now the good thing is that 90% of the time I only have to send a small number of bits (< 100) and the delay is probably pretty short (< 100us).

I guess I'm still not clear on why the timing is so critical. Perhaps this is a consequence of me not paying close enough attention to the other threads.

This means that, at most, I would have about a 10ms delay from start to finish in sending one command. If it's just interrupted once then that might be OK. (Actually it's more like 2.5ms on average, I think.)

I don't recall the exact length of a timeslice on Windows (and if I remember correctly, it can vary according to the specific version and configuration of Windows you're using). However, I think that typically 10ms is less than the timeslice.

So, assuming your code starts sending immediately at the beginning of its timeslice, it should be able to send all of the data within a single timeslice. One way to manage this would be to have the thread blocked until you are ready to send, and then unblock it. For example, use a WaitHandle, where the thread uses the WaitHandle.WaitOne() method. Some other thread would set the WaitHandle, and the thread waiting on the WaitHandle would begin sending immediately after returning from WaitOne().

Assuming you've made sure the handle is not set before calling WaitOne(), this will ensure that the sending starts right at the beginning of a timeslice, and as long as 10ms is less than a timeslice (which it generally should be), you can finish all of the work within a timeslice, ensuring that your thread is not interrupted while sending the data.

Now, as far as inserting delays into the data sending code, that's outside the scope of this thread IMHO. You asked about how threads block, and that's what I've been trying to explain.

But none of the blocking mechanisms allow you to block a thread and then start executing it again with the sort of resolution you want. Even if you raise the priority of your thread, just making it runnable won't cause it to start executing again until there's a free CPU unit. This won't happen until any threads currently executing either yield their timeslice, or use up their timeslice entirely (and so are preempted by the OS).

In any case I've switched to the idea that it might be best to learn about kernel mode programming, so I'm reading up on that.

If you are dealing with some sort of custom hardware that has very specific timing needs with regards to its i/o, then yes...it's possible you'll need to write a driver, and it's possible that driver might need to be a kernel-mode driver. I don't have enough details to answer that question.

But even drivers are subject to the thread-scheduling rules. There are mechanisms provided to drivers to help manage timing issues, but at the end of the day, Windows is simply not a real-time OS and so getting precise timing of code execution is not always possible.

You can insert delays into your code to try to ensure a specific _minimum_ delay between operations, but you have much less control over the maximum delay, and using the built-in thread-scheduling mechanisms, including those that block threads, isn't likely to be a good way to implement the minimum delay aspect.
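
A minimal sketch of a minimum-delay spin wait, assuming Stopwatch.GetTimestamp() as the clock. SpinDelayDemo and SpinMicroseconds are hypothetical names, and the place where the real port write would go is marked with a comment. It uses the figures mentioned earlier in the thread: 100 operations with a 100us gap each.

```csharp
using System;
using System.Diagnostics;

static class SpinDelayDemo
{
    // Busy-waits until at least the requested number of microseconds has passed.
    // The thread stays runnable the whole time and burns CPU while it waits.
    public static void SpinMicroseconds(long microseconds)
    {
        // Round up so the wait is never shorter than requested.
        long ticks = (microseconds * Stopwatch.Frequency + 999999) / 1000000;
        long start = Stopwatch.GetTimestamp();
        while (Stopwatch.GetTimestamp() - start < ticks)
        {
            // intentionally empty
        }
    }

    public static double Run()
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < 100; i++)
        {
            // The actual write to the port would go here (not shown).
            SpinMicroseconds(100);
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        Console.WriteLine("total: " + Run() + " ms");
    }
}
```

The total is never much less than 10 ms, which is the minimum the loop guarantees. It can be considerably more if the thread is preempted partway through, and nothing in this code can prevent that.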

Pete