From:	 Ingo Molnar <mingo@elte.hu>
To:	 Ryan Cumming <bodnar42@phalynx.dhs.org>
Subject: Re: [patch] sched_[set|get]_affinity() syscall, 2.4.15-pre9
Date:	 Fri, 23 Nov 2001 12:36:27 +0100 (CET)
Cc:	 Robert Love <rml@tech9.net>, <linux-kernel@vger.kernel.org>,
	 <linux-smp@vger.kernel.org>


On Thu, 22 Nov 2001, Ryan Cumming wrote:

> [...] a /proc interface would allow me to change the CPU affinity of
> processes that aren't {get, set}_affinity aware (i.e., all Linux
> applications written up to this point). [...]

Had you read my patch, you would perhaps have noticed how easy this
actually is. I've attached a simple utility called 'chaff' (change affinity)
that changes the affinity of unaware processes:

 mars:~> ./chaff 714 0xf0
 pid 714's old affinity: 000000ff.
 pid 714's new affinity: 000000f0.
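
A minimal sketch of how to read such a mask, assuming (as the output above
suggests) that bit N of the mask stands for CPU N. The helper names
cpu_in_mask() and mask_weight() are made up for illustration and are not
part of the patch:

```c
#include <limits.h>

/*
 * Hypothetical helpers, not part of the patch.
 * Assumption: bit N of an affinity mask corresponds to CPU N.
 */

/* Return 1 if 'cpu' is allowed by 'mask', 0 otherwise. */
int cpu_in_mask(unsigned long mask, unsigned int cpu)
{
	/* CPUs beyond the width of the mask are never allowed. */
	if (cpu >= sizeof(mask) * CHAR_BIT)
		return 0;
	return (int)((mask >> cpu) & 1UL);
}

/* Count how many CPUs 'mask' allows. */
unsigned int mask_weight(unsigned long mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

Under that reading, 0xff allows CPUs 0 to 7 and 0xf0 restricts the process
to CPUs 4 to 7.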

>  And one final thing... what sort of benefit does CPU affinity have if
> we have the scheduler take into account CPU migration costs correctly?
> [...]

While you are right that the scheduler can and should guess lots of
things, it cannot guess everything. E.g. it has no idea whether a
particular process' workload is related to any IRQ source or not. And if
we bind IRQ sources for performance reasons, then the scheduler has no
chance of finding the right CPU for the process. (I attempted to
implement such a generic mechanism a few months ago but quickly realized
that nothing like that will ever be accepted in the mainline kernel -
there is simply no way to establish any reliable link between IRQ load and
process activities.)

So I implemented the smp_affinity and ->cpus_allowed mechanisms to allow
specific applications (which know the kind of load they generate) to bind to
specific CPUs, and to bind IRQs to CPUs. Obviously we still want the
scheduler to make good decisions - but linking IRQ load and scheduling
activity is too expensive. (I have a scheduler improvement patch that does
some of this work at wakeup time, and which benefits Apache, but
this is still not enough to get the 'best' affinity.)
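
For the IRQ side, a rough sketch of writing a hex mask to an
smp_affinity-style file. The function names are hypothetical, the path is
left to the caller, and the read helper assumes the file holds a single hex
word:

```c
#include <stdio.h>

/*
 * Hypothetical helpers, not part of the patch.
 * Write 'mask' as a hex string to the file at 'path', e.g. an IRQ's
 * smp_affinity file under /proc/irq/. Returns 0 on success, -1 on failure.
 */
int write_affinity_mask(const char *path, unsigned long mask)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%lx\n", mask) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Read a mask back from 'path'. Assumption: the file contains one hex
 * word. Returns 0 on success, -1 on failure.
 */
int read_affinity_mask(const char *path, unsigned long *mask)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%lx", mask) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

Called as root with the smp_affinity file of a given IRQ and the same mask
handed to chaff, this would put the IRQ and the process on the same set of
CPUs.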

	Ingo


/*
 * chaff (change affinity): change the CPU affinity of a running process
 * via the sched_[set|get]_affinity() syscalls.
 */
#include <errno.h>		/* the _syscall3() stubs report failures via errno */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>		/* pid_t */
#include <linux/unistd.h>

#define __NR_sched_set_affinity 226
_syscall3 (int, sched_set_affinity, pid_t, pid, unsigned int, mask_len, unsigned long *, mask)

#define __NR_sched_get_affinity 227
_syscall3 (int, sched_get_affinity, pid_t, pid, unsigned int *, mask_len, unsigned long *, mask)

int main (int argc, char **argv)
{
	int pid, ret;
	unsigned int mask_len;
	unsigned long mask, new_mask;

	if (argc != 3) {
		printf("usage: chaff <pid> <hex_mask>\n");
		exit(-1);
	}
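	pid = atoi(argv[1]);
	/* reject a mask argument that is not valid hex */
	if (sscanf(argv[2], "%lx", &new_mask) != 1) {
		printf("invalid hex mask: %s\n", argv[2]);
		return -1;
	}

	printf("pid: %d. new_mask: (%s) %08lx.\n", pid, argv[2], new_mask);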

	ret = sched_get_affinity(pid, &mask_len, &mask);
	if (ret) {
		printf("could not get pid %d's affinity.\n", pid);
		return -1;
	}
	printf("pid %d's old affinity: %08lx.\n", pid, mask);

	ret = sched_set_affinity(pid, sizeof(new_mask), &new_mask);
	if (ret) {
		printf("could not set pid %d's affinity.\n", pid);
		return -1;
	}
	ret = sched_get_affinity(pid, &mask_len, &mask);
	if (ret) {
		printf("sched_get_affinity returned %d, exiting.\n", ret);
		return -1;
	}
	printf("pid %d's new affinity: %08lx.\n", pid, mask);
	return 0;
}