Age | Commit message | Author |
|
|
|
'go for it' deraadt@
|
|
from ray lai;
|
|
is in the midst of exiting. This solves a race condition that causes freed
memory to be left referenced in the master kernel timeout worklist, leading to
a uvm_fault (observed on an i386 MP system). tedu@, deraadt@, miod@ ok
|
|
Update ticks in timeout_hardclock_update to avoid errors in hardclock (this
is the third time we mess up here). ticks is only used for timeouts anyway.
At the same time, protect updates to ticks with timeout_mutex and be slightly
more paranoid in timeout_hardclock_update.
ok tdeval@ miod@
|
|
sends SIGVTALRM and SIGPROF to the process if they had. There is a big
problem with calling psignal from hardclock on MULTIPROCESSOR machines
though. It means we need to protect all signal state in the process
with a lock because hardclock doesn't obtain KERNEL_LOCK. Trying to
track down all the tentacles of this quickly becomes very messy. What
saves us at the moment is that SCHED_LOCK (which is used to protect
parts of the signal state, but not all) happens to be recursive and
forgives small and big errors. That's about to change.
So instead of trying to hunt down all the locking problems here, just
make hardclock not send signals. Instead hardclock schedules a timeout
that will send the signal later. There are many reasons why this works
just as well as the previous code, all explained in a comment written
in big, friendly letters in kern_clock.
miod@ ok; no one else dared to ok this, but no one screamed in agony either.
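The deferral described in this message can be sketched in user space. This is only an illustration under stated assumptions: struct defer_proc, defer_hardclock() and defer_run_timeouts() are hypothetical names, a flag stands in for scheduling a real kernel timeout, and a counter stands in for the psignal call. It is not the code in kern_clock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-process state involved. */
struct defer_proc {
	int virt_ticks_left;   /* ticks until the virtual timer expires */
	int vtalrm_scheduled;  /* stands in for a scheduled timeout */
	int signals_delivered; /* stands in for the effect of psignal */
};

/*
 * Clock-interrupt context: no signal state is touched here.
 * When the timer runs out, only note that delivery is wanted.
 */
void
defer_hardclock(struct defer_proc *p)
{
	assert(p != NULL);
	if (p->virt_ticks_left > 0 && --p->virt_ticks_left == 0)
		p->vtalrm_scheduled = 1;
}

/*
 * Timeout context: runs later, where the locks protecting
 * signal state may be taken safely.
 */
void
defer_run_timeouts(struct defer_proc *p)
{
	assert(p != NULL);
	if (p->vtalrm_scheduled) {
		p->vtalrm_scheduled = 0;
		p->signals_delivered++;
	}
}
```

The point of the split is that the interrupt path only records that delivery is wanted; everything that needs locking happens in the later context.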
|
|
cpus calling hardclock and the statclock emulation. Move some ifdef
__HAVE_TIMECOUNTER code.
|
|
code is all conditionalized on __HAVE_TIMECOUNTER, and not
enabled on any platforms.
adjtime(2) support exists, courtesy of nordin@, as do sysctl(2) support
and a concept of quality for each attached time source.
High quality time sources exist for the PIIX4 ACPI timer as well as
some AMD power management chips. This will have to be redone
once we actually add ACPI support (at that time we need to use
the ACPI interfaces to get at these clocks).
ok art@ ken@ miod@ jmc@ and many more
|
|
encapsulating all such access into well-defined functions
that make sure locking is done as needed.
It also cleans up some uses of wall time vs. uptime in some
places, but more such cleanups are sure to be needed,
particularly in MD code. Also, many current calls
to microtime() should probably be changed to getmicrotime(),
or to the {,get}microuptime() versions.
ok art@ deraadt@ aaron@ matthieu@ beck@ sturm@ millert@ others
"Oh, that is not your problem!" from miod@
|
|
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_sec or mono_time.tv_sec, now gets
this from separate time_t globals time_second and time_uptime.
ok art@ niklas@ nordin@
|
|
Introduce the cpu_info structure, the p_cpu field in struct proc, and a global
scheduling context, and change various code to deal with this. At the
moment no architecture uses this stuff, but it will allow us a slow and
controlled migration to the new APIs.
All new code is ifdef:ed out.
ok deraadt@ niklas@
|
|
rescinded 22 July 1999. Proofed by myself and Theo.
|
|
This is at least necessary for the sparc microtime() function, and was
only working before by goat luck. The recent commons removal triggered it.
__attribute__ syntax borrowed from NetBSD.
|
|
declarations (extern int foo), and compensate in the appropriate locations.
|
|
info. We only use it to profile processes in user mode, and there
is no way to get back to user mode without going past the AST, which
writes out the profiling info in a context where copyout works.
Sitting in my tree for ages.
Reviewed and with some suggestions from nordin@
|
|
infrastructure. ok art@ and miod@
|
|
type characteristics.
internal_types.h will contain only settings invisible from standard C, e.g.,
in the __* or _[A-Z]* namespace, and be reused by files like limits.h.
This allows us to shorten machine/limits.h greatly, as all the common defines
are now in sys/limits.h, plus a small stub in internal_types.h.
Tested on all arches as far as I know.
Approved after discussion with art, millert, deraadt, and others.
|
|
(Look ma, I might have broken the tree)
|
|
From NetBSD.
|
|
The older code actually ensured that no timeout would be too early, but
it violated the principle of least surprise by making it seem (when you
looked at the time variable) that every timeout was one tick late.
Also, periodic timeouts (those that re-add themselves in the timeout
function) will now happen with the frequency you expect.
|
|
From FreeBSD: eventually, we should replace hzto() with something
like tvtohz() as well.
|
|
makes it the caller's responsibility to allocate resources for the
timeouts.
This is a KISS implementation and does _not_ solve the problems of slow
handling of a large number of pending timeouts (this will be solved in
future work), although hardclock is now guaranteed to take constant time
for handling of timeouts.
Old timeout() and untimeout() are implemented as wrappers around the new
API and kept for compatibility. They will be removed as soon as all
subsystems are converted to use the new API.
|
|
commit messages:
Scheduler bug fixes and reorganization
* fix the ancient nice(1) bug, where nice +20 processes incorrectly
steal 10 - 20% of the CPU (or even more, depending on load average)
* provide a new schedclock() mechanism at a new clock at schedhz, so high
platform hz values don't cause nice +0 processes to look like they are
niced
* change the algorithm slightly, and reorganize the code a lot
* fix percent-CPU calculation bugs, and eliminate some no-op code
=== nice bug === Correctly divide the scheduler queues between niced and
compute-bound processes. The current nice weight of two (sort of, see
`algorithm change' below) neatly divides the USRPRI queues in half; this
should have been used to clip p_estcpu, instead of UCHAR_MAX. Besides
being the wrong amount, clipping an unsigned char to UCHAR_MAX is a no-op,
and it was done after decay_cpu() which can only _reduce_ the value. It
has to be kept <= NICE_WEIGHT * PRIO_MAX - PPQ or processes can
scheduler-penalize themselves onto the same queue as nice +20 processes.
(Or even a higher one.)
=== New schedclock() mechanism === Some platforms should be cutting down
stathz before hitting the scheduler, since the scheduler algorithm only
works right in the vicinity of 64 Hz. Rather than prescale hz, then scale
back and forth by 4 every time p_estcpu is touched (each occurrence an
abstraction violation), use p_estcpu without scaling and require schedhz
to be generated directly at the right frequency. Use a default stathz (well,
actually, profhz) / 4, so nothing changes unless a platform defines schedhz
and a new clock.
[ To do: Define these for alpha, where hz==1024, and nice was totally broken. ]
=== Algorithm change === The nice value used to be added to the
exponentially-decayed scheduler history value p_estcpu, in _addition_ to
being incorporated directly (with greater weight) into the priority calculation.
At first glance, it appears to be a pointless increase of 1/8 the nice
effect (pri = p_estcpu/4 + nice*2), but it's actually at least 3x that
because it will ramp up linearly but be decayed only exponentially, thus
converging to an additional .75 nice for a loadaverage of one. I killed
this: it makes the behavior hard to control, almost impossible to analyze,
and the effect (~~nothing at all for the first second, then somewhat increased
niceness after three seconds or more, depending on load average) pointless.
=== Other bugs === hz -> profhz in the p_pctcpu = f(p_cpticks) calculation.
Collect scheduler functionality. Try to put each abstraction in just one
place.
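The p_estcpu clipping rule from the "nice bug" section can be checked with a little arithmetic. The sketch below is an assumption-laden illustration, not the scheduler code: the values PUSER = 50, PPQ = 4 and PRIO_MAX = 20 are assumed, NICE_WEIGHT = 2 follows the "nice weight of two" in the message, and the priority formula (unscaled p_estcpu plus the weighted nice value) is a guess at the post-change form. The function names are hypothetical.

```c
#include <assert.h>

/* Assumed values for illustration only. */
#define PUSER		50	/* base user priority (assumed) */
#define PPQ		4	/* priorities per run queue (assumed) */
#define PRIO_MAX	20	/* largest nice value */
#define NICE_WEIGHT	2	/* "nice weight of two" from the message */

/* Limit stated in the message: NICE_WEIGHT * PRIO_MAX - PPQ. */
#define ESTCPU_LIMIT	(NICE_WEIGHT * PRIO_MAX - PPQ)

/* Clip the CPU usage estimate so it cannot exceed the limit. */
unsigned int
clip_estcpu(unsigned int estcpu)
{
	return estcpu > ESTCPU_LIMIT ? ESTCPU_LIMIT : estcpu;
}

/* Assumed priority formula; nice is taken relative to zero here. */
int
user_priority(unsigned int estcpu, int nice)
{
	assert(nice >= 0 && nice <= PRIO_MAX);
	return PUSER + (int)clip_estcpu(estcpu) + NICE_WEIGHT * nice;
}

/* Map a priority onto its run queue index. */
int
run_queue(int pri)
{
	return pri / PPQ;
}
```

Under these assumptions, a compute-bound nice +0 process tops out at priority 50 + 36 = 86, while an idle nice +20 process starts at 50 + 40 = 90, so the two land on different queues (21 and 22). Without the clip, a large p_estcpu would push the nice +0 process onto the same queue as the nice +20 one, or past it.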
|
|
Dennis Ferguson (NetBSD PR #2788)
|
|