As multi-core architectures begin to emerge in every area of computing, operating system scheduling that takes the peculiarities of such architectures into account will become mandatory. Because multi-core processors differ architecturally from traditional multiprocessors, for example through shared caches and memory controllers and smaller caches per computational unit, it does not suffice to schedule tasks on them in the same way as on SMP systems. Furthermore, current research motivates architectural changes in CPU design, such as multi-core processors with asymmetric core performance and so-called many-core architectures that integrate up to 100 cores in one package.
Such architectures will exhibit fundamentally different behaviour from current CPUs with regard to shared resource utilization and the performance of non-parallelizable code. It will be the responsibility of the operating system to spare the programmer as much platform-specific knowledge as possible and to optimize overall performance by employing intelligent and configurable scheduling mechanisms.
What’s so different about multi-core scheduling?
One could assume that scheduling on such multi-core processors would not differ much from conventional scheduling. Intuitively, the single run-queue would just have to be replaced by n run-queues, where n is the number of cores, and each process would simply be placed on the currently shortest run-queue (perhaps with some additional treatment of process priorities). While that might seem reasonable, some properties of current multi-core architectures speak strongly against such a naïve approach. First, in many multi-core architectures, each core manages its own level 1 cache (Figure 1). If an interrupted process is naïvely rescheduled to a shorter queue that belongs to another core (task migration), parts of the process's cache working set may be lost unnecessarily, and overall performance may degrade. This effect becomes even worse if the underlying architecture is not a multi-core processor but a NUMA system, where memory access can become very costly if the process is scheduled on the “wrong” node.
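The naïve policy described above can be illustrated with a minimal Python sketch. It is not taken from any real scheduler; the names `pick_shortest_queue`, `enqueue_naive` and `last_core` are hypothetical and exist only for illustration. The sketch places each task on the shortest run-queue and reports whether that placement moves the task away from the core it last ran on, which is the situation in which its level 1 cache contents would be lost.

```python
from collections import deque


def pick_shortest_queue(run_queues):
    """Return the index of the run-queue holding the fewest tasks.

    Ties are resolved in favour of the lowest core index.
    """
    return min(range(len(run_queues)), key=lambda core: len(run_queues[core]))


def enqueue_naive(run_queues, task, last_core):
    """Place `task` on the shortest queue, ignoring cache affinity.

    `last_core` (hypothetical bookkeeping) maps a task to the core it
    last ran on. Returns (chosen_core, migrated), where `migrated` is
    True if the task ran before and is now placed on a different core.
    """
    core = pick_shortest_queue(run_queues)
    run_queues[core].append(task)
    previous = last_core.get(task)
    migrated = previous is not None and previous != core
    last_core[task] = core
    return core, migrated


# Two cores: core 0 is busy, core 1 is idle. Task "C" last ran on core 0.
run_queues = [deque(["A", "B"]), deque()]
last_core = {"C": 0}
core, migrated = enqueue_naive(run_queues, "C", last_core)
```

In this example the task is moved to core 1 purely because that queue is shorter, so `migrated` is true even though its working set may still reside in the cache of core 0. A cache-aware policy would weigh the queue-length difference against the expected cost of such a migration.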