From: Ingo Molnar <mingo@elte.hu>
To: Ryan Cumming <bodnar42@phalynx.dhs.org>
Subject: Re: [patch] sched_[set|get]_affinity() syscall, 2.4.15-pre9
Date: Fri, 23 Nov 2001 12:36:27 +0100 (CET)
Cc: Robert Love <rml@tech9.net>, <linux-kernel@vger.kernel.org>, <linux-smp@vger.kernel.org>

On Thu, 22 Nov 2001, Ryan Cumming wrote:

> [...] a /proc interface would allow me to change the CPU affinity of
> processes that aren't {get, set}_affinity aware (i.e., all Linux
> applications written up to this point). [...]

had you read my patch then you'd perhaps have noticed how easy it
actually is. I've attached a simple utility called 'chaff' (change
affinity) that changes the affinity of unaware processes:

  mars:~> ./chaff 714 0xf0
  pid 714's old affinity: 000000ff.
  pid 714's new affinity: 000000f0.

> And one final thing... what sort of benefit does CPU affinity have if
> we have the scheduler take into account CPU migration costs correctly?
> [...]

while you are right that the scheduler can and should guess lots of
things, it cannot guess some things. Eg. it has no idea whether a
particular process' workload is related to any IRQ source or not. And if
we bind IRQ sources for performance reasons, then the scheduler has no
chance of finding the right CPU for the process. (I attempted to
implement such a generic mechanism a few months ago but quickly realized
that nothing like that will ever be accepted in the mainline kernel -
there is simply no way to establish any reliable link between IRQ load
and process activities.)

So i implemented the smp_affinity and ->cpus_allowed mechanisms to allow
specific applications (which know the kind of load they generate) to
bind to specific CPUs, and to bind IRQs to CPUs. Obviously we still want
the scheduler to make good decisions - but linking IRQ load and
scheduling activity is too expensive.
(i have a scheduler improvement patch that does some of this work at
wakeup time, and which benefits Apache, but this is still not enough to
get the 'best' affinity.)

	Ingo

/*
 * chaff: change the CPU affinity of a running process, using the
 * sched_[set|get]_affinity() syscalls added by this patch.
 */

#include <errno.h>	/* the _syscall3() stubs store failures in errno */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <linux/unistd.h>

#define __NR_sched_set_affinity 226
_syscall3 (int, sched_set_affinity, pid_t, pid, unsigned int, mask_len, unsigned long *, mask)

#define __NR_sched_get_affinity 227
_syscall3 (int, sched_get_affinity, pid_t, pid, unsigned int *, mask_len, unsigned long *, mask)

int main (int argc, char **argv)
{
	int pid, ret;
	unsigned int mask_len = sizeof(unsigned long);	/* mask size in bytes */
	unsigned long mask, new_mask;

	if (argc != 3) {
		printf("usage: chaff <pid> <hex_mask>\n");
		exit(-1);
	}

	pid = atoi(argv[1]);
	/* reject a mask argument that is not valid hex */
	if (sscanf(argv[2], "%lx", &new_mask) != 1) {
		printf("invalid hex mask: %s\n", argv[2]);
		exit(-1);
	}
	printf("pid: %d. new_mask: (%s) %08lx.\n", pid, argv[2], new_mask);

	/* read and print the current affinity first */
	ret = sched_get_affinity(pid, &mask_len, &mask);
	if (ret) {
		printf("could not get pid %d's affinity.\n", pid);
		return -1;
	}
	printf("pid %d's old affinity: %08lx.\n", pid, mask);

	ret = sched_set_affinity(pid, sizeof(new_mask), &new_mask);
	if (ret) {
		printf("could not set pid %d's affinity.\n", pid);
		return -1;
	}

	/* read the mask back to confirm what the kernel accepted */
	ret = sched_get_affinity(pid, &mask_len, &mask);
	if (ret) {
		printf("sched_get_affinity returned %d, exiting.\n", ret);
		return -1;
	}
	printf("pid %d's new affinity: %08lx.\n", pid, mask);

	return 0;
}
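
A note for readers trying this on a current system: the syscall numbers 226 and 227 above are specific to this patch against 2.4.15-pre9, and the _syscall3() macros are no longer provided by user-space headers. The sketch below shows the same get/set/get sequence as chaff. It assumes a kernel that exposes the later sched_setaffinity()/sched_getaffinity() syscalls as SYS_sched_setaffinity and SYS_sched_getaffinity in <sys/syscall.h>. The helper names parse_mask(), get_affinity() and set_affinity() are made up for illustration. A single unsigned long is assumed to be large enough for the CPU mask; on machines where it is not, the raw get call is expected to fail.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() under strict -std= modes */
#endif
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: parse a hex mask such as "0xf0" or "f0".
 * Returns 0 on success, -1 for garbage or an empty (all-zero) mask.
 */
int parse_mask(const char *s, unsigned long *mask)
{
	char *end;

	errno = 0;
	*mask = strtoul(s, &end, 16);
	if (errno || end == s || *end != '\0' || *mask == 0)
		return -1;
	return 0;
}

/*
 * Hypothetical helper: read the affinity of pid (0 means the caller)
 * into one unsigned long. The raw syscall returns the number of bytes
 * it wrote on success, so only a negative value is treated as failure.
 */
int get_affinity(pid_t pid, unsigned long *mask)
{
	*mask = 0;
	if (syscall(SYS_sched_getaffinity, pid, sizeof(*mask), mask) < 0)
		return -1;
	return 0;
}

/* Hypothetical helper: bind pid to the CPUs whose bits are set in mask. */
int set_affinity(pid_t pid, unsigned long mask)
{
	if (syscall(SYS_sched_setaffinity, pid, sizeof(mask), &mask) < 0)
		return -1;
	return 0;
}
```

Calling get_affinity(714, &old), set_affinity(714, 0xf0) and get_affinity(714, &new) in that order mirrors what chaff does. Bit n of the mask stands for CPU n, so 0xf0 selects CPUs 4 to 7. Changing the affinity of another user's process normally requires the appropriate privileges.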