Sunday, March 21, 2010

Hyperthreading, Win 7 Scheduling and Core Parking, A Simple Tale

A pretty savvy engineer friend asked me if I could explain Hyperthreading to a ten year old. I said, "Sure. If I can't explain something in a way a bright ten year old can understand, I probably don't understand it well enough myself." I faced similar problems in my engineering days, when I had to present to the startup money guys, who were not technical people.

So, here's my go at it.

Think of it this way. You have a bank with four tellers (cores) and many customers (threads) waiting to get serviced (CPU time on a core). These tellers are pretty bright: each can multi-task (hyperthreading) between two customers (the threads on that core's two logical cores). Each teller can have two customers (threads) at their window but can only work with one customer at a time (there is, after all, only one physical core per two logical cores of the CPU). If that customer has to fill out a form (the thread stalls, or sleeps, or has used its fair share of CPU time), the teller can work on the next customer (thread) in their line (the two hyperthreads per core). They simply make a note of what they are doing with the first customer (the thread's CPU state), put it aside (each core in the CPU has two areas to track state), and start working on the second customer (thread). The tellers (cores) can work very efficiently with these two customers (threads), and can do so for as long as the bank manager (operating system) lets them, because they have everything they need to know about those two customers (threads) right at their fingertips (the two areas of each core that hold this information).
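For the curious, the teller story above can be played out as a toy simulation. This is only a sketch of the analogy, not of how real hardware works: the function name `cycles_to_finish` and the step names "work" and "form" are made up for illustration, and each step is assumed to take one tick of the clock.

```python
def cycles_to_finish(customers, windows):
    """Count the time steps one teller (core) needs to finish every customer (thread).

    Each customer is a list of steps: "work" needs the teller for one step,
    "form" is a stall the customer sits through alone, leaving the teller free.
    `windows` is how many customers may stand at the teller at once
    (1 = no hyperthreading, 2 = hyperthreading). All names are hypothetical.
    """
    waiting = [list(c) for c in customers]  # the line of customers not yet at a window
    at_window = []                          # customers currently at the teller
    steps = 0
    while waiting or at_window:
        # The manager fills any empty window from the line.
        while waiting and len(at_window) < windows:
            at_window.append(waiting.pop(0))
        steps += 1
        teller_busy = False
        for customer in at_window:
            if customer[0] == "form":
                customer.pop(0)             # the stall passes without the teller
            elif not teller_busy:
                customer.pop(0)             # the teller serves one customer per step
                teller_busy = True
            # otherwise the customer is ready but the teller is busy, so they wait
        at_window = [c for c in at_window if c]  # finished customers leave
    return steps


customers = [["work", "form", "form", "work"],
             ["work", "form", "form", "work"]]
print(cycles_to_finish(customers, windows=1))  # 8
print(cycles_to_finish(customers, windows=2))  # 5
```

With one window the teller sits idle while each customer fills out forms. With two windows the teller fills that idle time with the other customer, so the same work finishes in fewer steps.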

Above all of this is the bank manager (the OS). The manager decides which two customers (threads) are at each teller (core) at any given time, and can swap one of those customers (threads) with another waiting customer (thread) in the bigger line of waiting customers (the whole thread pool for the OS).

Now it so happens that this swapping slows the servicing of the customers (threads), so the manager (OS) avoids it as much as possible. In addition, the manager (OS) knows that the fewer tellers (cores) he needs to keep the work (threads) moving, the better. In fact, if the manager (OS) can, he'll put a teller (core, real or logical) on break (core parking) and not have to pay them during this time (energy savings for the CPU). Even more interesting, the manager (actually a combination of OS and CPU features) knows that if he pushes as much work as he can onto a few tellers (cores), those tellers drink a big cup of coffee and work even faster than normal (turbo-boost). The manager (OS/CPU features) knows that there's not enough coffee to go around, so the tellers (cores) can't all work at the faster rate at once. He therefore tries to keep as few tellers (cores) active as possible, so long as he thinks it won't affect the overall servicing of the customers (thread pool).
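The manager's "as few tellers as possible" habit can be sketched in a few lines. This is a toy illustration of the policy as the story describes it, not the actual Windows 7 scheduler. The function name `place_threads` and its defaults (four cores, two threads per core) are assumptions taken from the bank example.

```python
def place_threads(num_threads, num_cores=4, threads_per_core=2):
    """Pack runnable threads onto as few cores as possible (hypothetical policy).

    Returns (placement, parked, still_waiting):
      placement     - threads assigned to each core, by core index
      parked        - indices of cores left with no work (tellers on break)
      still_waiting - threads left in the big line because every window is full
    """
    placement = []
    remaining = num_threads
    for _ in range(num_cores):
        on_core = min(remaining, threads_per_core)  # fill this teller before opening another
        remaining -= on_core
        placement.append(on_core)
    parked = [core for core, count in enumerate(placement) if count == 0]
    return placement, parked, remaining


print(place_threads(3))   # ([2, 1, 0, 0], [2, 3], 0)
print(place_threads(10))  # ([2, 2, 2, 2], [], 2)
```

With three customers, two tellers handle everything and the other two go on break. With ten, every window is full and two customers wait in the big line.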

Overriding the manager's decisions (thread scheduling), for example by playing with affinity, can force all of the tellers (cores) to do work, even when the manager (OS/CPU features) would judge that this is not the most efficient way to do things. It will likely have no effect on the rate of servicing of the whole customer collection (thread pool), and may in fact make all of the tellers (cores) work at normal speed (no turbo-boost), which in reality slows things down.
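For anyone wondering what "playing with affinity" looks like in practice, here is a minimal sketch. It assumes a platform where Python exposes `os.sched_getaffinity` and `os.sched_setaffinity`, which is the case on Linux but not on Windows; on Windows the same override is done through Task Manager's "Set affinity" option or the Win32 `SetProcessAffinityMask` call. The helper name `pin_to_first_cpu` is made up for illustration.

```python
import os


def pin_to_first_cpu():
    """Briefly restrict this process to one logical CPU, then undo it.

    Returns the set of CPUs in effect while pinned, or None where the
    platform does not expose os.sched_setaffinity (e.g. Windows, macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    original = os.sched_getaffinity(0)   # 0 means "the current process"
    target = {min(original)}             # overrule the manager: one teller only
    try:
        os.sched_setaffinity(0, target)
        return os.sched_getaffinity(0)
    finally:
        os.sched_setaffinity(0, original)  # hand the decision back to the manager


print(pin_to_first_cpu())
```

The `finally` clause restores the original set, so the override lasts only for the duration of the call.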

The manager (OS) knows best; that's why you hired him. Unless he is drunk (a bug in the OS or CPU scheduling logic), he'll usually make better decisions than you.
