Age | Commit message | Author |
|
been fully initialized. Otherwise the scheduler and other things can
accidentally stumble into semi-initialized processes and strange things
can happen. This also requires us to do systrace attachment a bit later.
Debugging help from fgs@
|
|
First we check for running out of processes (nprocs variable) before we
continue with the fork, then we do various calls that might sleep (and
allow other forks to start and pass that check), then we increase that
variable. This could allow processes to be created past the limit.
Second is that we don't decrease the process count for this uid
if the stack allocation fails. So a user could run out of processes
he's allowed to run without actually having them.
miod@ ok
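
The two fixes described above can be sketched as follows. This is a minimal illustration, not the actual kernel code: `fork_sketch`, `alloc_stack`, `uid_procs`, `maxprocperuid` and the limit values are hypothetical stand-ins, and only `nprocs` is named in the message. The idea is to take the process slot in the same step as the limit check, before anything can sleep, and to give back both counts if a later step fails.

```c
#include <errno.h>

/* Hypothetical stand-ins for kernel state; only nprocs is named in the log. */
int nprocs = 0;          /* processes currently accounted for, system-wide */
int maxproc = 4;         /* assumed system-wide limit */
int uid_procs = 0;       /* processes charged to the forking uid */
int maxprocperuid = 4;   /* assumed per-uid limit */

/* Stand-in for a step that may sleep and may fail, e.g. stack allocation. */
int alloc_stack(int should_fail)
{
    return should_fail ? ENOMEM : 0;
}

int fork_sketch(int should_fail)
{
    int error;

    /* Check and reserve together, so no other fork can pass the
     * check while this one sleeps further down. */
    if (nprocs >= maxproc || uid_procs >= maxprocperuid)
        return EAGAIN;
    nprocs++;
    uid_procs++;

    error = alloc_stack(should_fail);
    if (error != 0) {
        /* Roll back both reservations on failure. */
        nprocs--;
        uid_procs--;
        return error;
    }
    return 0;
}
```

In a real kernel the check-and-increment would also need whatever locking the platform requires; the sketch only shows the ordering and the rollback.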
|
|
okay deraadt@
|
|
Don't assume that the stack is on the top of user address space.
And don't assume that the stack grows down.
|
|
(Look ma, I might have broken the tree)
|
|
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.
This commit breaks vax because it doesn't look like any other arch; someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.
Idea and uvm parts from NetBSD.
|
|
- Use malloc/free instead of MALLOC/FREE for variable sized allocations.
- Move the memory inheritance code to sys/mman.h and rename from VM_* to MAP_*
- various cleanups and simplifications.
|
|
deraadt@ ok.
|
|
on NetBSD's code, as well as some faked Posix RT extensions by me. This makes
at least simple linuxthreads tests work.
|
|
okay art@, millert@
|
|
Add a new flag to fork1 - FORK_VMNOSTACK that shares all of the vmspace
except the stack and use it for rfork.
|
|
show them with -k. Do not try to show RSS based values for them as they
mess up column alignment. vmstat -f now shows kernel threads separately
from rforks too.
|
|
The function and the argument never change.
|
|
argument. Let sys_rfork build the arguments to fork1() and do the
sanity checks itself.
|
|
to, at the bottom or the top, depending on your architecture's stack growth
direction. This is in preparation for Linux' clone(2) emulation.
port maintainers, please check that I did the work right.
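
As a rough illustration of the "bottom or the top" choice, here is a minimal sketch, assuming the caller supplies a stack region as a base address and a size. The function name `initial_sp` and the `grows_down` flag are hypothetical and not taken from the commit.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the starting stack pointer for a
 * caller-supplied stack region [base, base + size).
 * On a grows-down architecture the stack starts at the top of the
 * region; on a grows-up architecture it starts at the bottom.
 */
uintptr_t initial_sp(void *base, size_t size, int grows_down)
{
    uintptr_t lo = (uintptr_t)base;

    return grows_down ? lo + size : lo;
}
```

Real ports would also apply their own alignment rules to the result, which the sketch leaves out.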
|
|
commit messages:
Scheduler bug fixes and reorganization
* fix the ancient nice(1) bug, where nice +20 processes incorrectly
steal 10 - 20% of the CPU, (or even more depending on load average)
* provide a new schedclock() mechanism at a new clock at schedhz, so high
platform hz values don't cause nice +0 processes to look like they are
niced
* change the algorithm slightly, and reorganize the code a lot
* fix percent-CPU calculation bugs, and eliminate some no-op code
=== nice bug === Correctly divide the scheduler queues between niced and
compute-bound processes. The current nice weight of two (sort of, see
`algorithm change' below) neatly divides the USRPRI queues in half; this
should have been used to clip p_estcpu, instead of UCHAR_MAX. Besides
being the wrong amount, clipping an unsigned char to UCHAR_MAX is a no-op,
and it was done after decay_cpu() which can only _reduce_ the value. It
has to be kept <= NICE_WEIGHT * PRIO_MAX - PPQ or processes can
scheduler-penalize themselves onto the same queue as nice +20 processes.
(Or even a higher one.)
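
A small sketch of the clipping rule, under stated assumptions: NICE_WEIGHT = 2 comes from the text above, while PRIO_MAX = 20 and PPQ = 4 are assumed typical BSD values, and the helper names `clip_estcpu` and `run_queue` are hypothetical. The queue index is computed relative to the base user priority as (estcpu + NICE_WEIGHT * nice) / PPQ, which is also an assumption about how the pieces combine.

```c
/* NICE_WEIGHT is from the commit message; PRIO_MAX and PPQ are assumed. */
#define NICE_WEIGHT  2
#define PRIO_MAX     20
#define PPQ          4
#define ESTCPU_LIMIT (NICE_WEIGHT * PRIO_MAX - PPQ)   /* 2 * 20 - 4 = 36 */

/* Clip the CPU-usage estimate to the limit named in the message. */
unsigned clip_estcpu(unsigned estcpu)
{
    return estcpu > ESTCPU_LIMIT ? ESTCPU_LIMIT : estcpu;
}

/* Hypothetical: run queue index relative to the base user priority. */
unsigned run_queue(unsigned estcpu, int nice)
{
    return (estcpu + NICE_WEIGHT * (unsigned)nice) / PPQ;
}
```

With these values a fully penalized nice +0 process lands on relative queue 9 after clipping, one below the queue of an idle nice +20 process (10). Left at 255, the same process would land on relative queue 63, far past the nice +20 one.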
=== New schedclock() mechanism === Some platforms should be cutting down
stathz before hitting the scheduler, since the scheduler algorithm only
works right in the vicinity of 64 Hz. Rather than prescale hz, then scale
back and forth by 4 every time p_estcpu is touched (each occurrence an
abstraction violation), use p_estcpu without scaling and require schedhz
to be generated directly at the right frequency. Use a default stathz (well,
actually, profhz) / 4, so nothing changes unless a platform defines schedhz
and a new clock.
[ To do: Define these for alpha, where hz==1024, and nice was totally broken. ]
=== Algorithm change === The nice value used to be added to the
exponentially-decayed scheduler history value p_estcpu, in _addition_ to
being incorporated directly (with greater weight) into the priority calculation.
At first glance, it appears to be a pointless increase of 1/8 the nice
effect (pri = p_estcpu/4 + nice*2), but it's actually at least 3x that
because it will ramp up linearly but be decayed only exponentially, thus
converging to an additional .75 nice for a loadaverage of one. I killed
this: it makes the behavior hard to control, almost impossible to analyze,
and the effect (~~nothing at all for the first second, then somewhat increased
niceness after three seconds or more, depending on load average) pointless.
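
The ".75 nice" figure can be checked with a short steady-state calculation. This is a sketch under assumptions not stated in the message: that the old code updated the history value once per second as $x \leftarrow d\,x + n$, with the classic 4.4BSD decay factor $d = 2L/(2L+1)$, where $x$ is p_estcpu, $n$ is the nice value and $L$ is the load average.

```latex
% Fixed point of x <- d x + n, with d = 2L / (2L + 1):
x^{*} = d\,x^{*} + n
\quad\Longrightarrow\quad
x^{*} = \frac{n}{1 - d} = (2L + 1)\,n

% For L = 1 the history term settles at 3n, so the p_estcpu/4 term
% of the priority contributes an extra 0.75 n:
\left.\frac{x^{*}}{4}\right|_{L=1} = \frac{3n}{4} = 0.75\,n
```

Under these assumptions the extra term is three times the naive nice/4, which is consistent with the "at least 3x" remark for a load average of one.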
=== Other bugs === hz -> profhz in the p_pctcpu = f(p_cpticks) calculation.
Collect scheduler functionality. Try to put each abstraction in just one
place.
|
|
way than that the parent wait call will never get the status of this child,
says Rob
|
|
argument)
|
|
This gives root a bigger chance to fix any problem that caused the limit
to be reached.
|
|
That was a memory leak.
|
|
Now we can return ENOMEM instead of doing a panic when we run out of memory.
|
|