|
- Move softintr_establish(), which allocates memory, into dtalloc().
- Use M_DEVBUF consistently for all memory chunks in the softc.
- Do not check for NULL before calling free(9).
Reviewed by Christian Ludwig, ok miod@
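A minimal sketch of the free(9) pattern above, with illustrative struct and
member names (not the committed dt_dev.c code):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>

struct sc_sketch {
	char	*sc_buf;	/* allocated from dtalloc() with M_DEVBUF */
	size_t	 sc_buflen;
};

static void
sc_free_sketch(struct sc_sketch *sc)
{
	/* No NULL check: free(9) accepts a NULL address. */
	free(sc->sc_buf, M_DEVBUF, sc->sc_buflen);
	free(sc, M_DEVBUF, sizeof(*sc));
}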
|
|
Also talk about thread rather than proc, which might be confusing.
|
|
Get rid of the per-ringbuffer mutex. Use a variable to protect against
recursion. Allow more events to be processed in the same timeframe.
From Christian Ludwig.
|
|
It is currently not safe to call wakeup(9) in interrupt handlers at a priority
higher than IPL_SCHED. As long as dt(4) relies on generic kernel primitives
we have to play tricks to be able to inspect more parts of the kernel. In this
case defer the wakeup(9) to a custom soft-interrupt. This will be good enough
as long as we don't add tracepoints to the soft-interrupt machinery. A more
complex and viable solution would be to not rely on the kernel's generic IPC,
to avoid recursion.
From visa@ and Christian Ludwig, ok claudio@
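A minimal sketch of the deferral, assuming the usual softintr_establish()/
softintr_schedule() interface; the struct and function names are illustrative
and this is not the committed dt_dev.c code:

#include <sys/param.h>
#include <sys/systm.h>
#include <machine/intr.h>

struct dt_sc_sketch {
	void	*ds_si;		/* soft-interrupt cookie */
	int	 ds_chan;	/* wakeup(9) channel */
};

static void
dt_sc_softwakeup(void *arg)
{
	struct dt_sc_sketch *ds = arg;

	/* Runs at IPL_SOFTCLOCK, where wakeup(9) is safe to call. */
	wakeup(&ds->ds_chan);
}

static void
dt_sc_init_sketch(struct dt_sc_sketch *ds)
{
	/* Done at allocation time: softintr_establish() allocates memory. */
	ds->ds_si = softintr_establish(IPL_SOFTCLOCK, dt_sc_softwakeup, ds);
}

static void
dt_sc_event_sketch(struct dt_sc_sketch *ds)
{
	/* Called from the probe context at high IPL: defer the wakeup. */
	softintr_schedule(ds->ds_si);
}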
|
|
KERN_SECURELVL remains locked until the existing `securelevel' checks are
moved out of the kernel lock.
Make sysctl_securelevel_int() mp-safe by using atomic_load_int(9) for
unlocked read-only access to `securelevel'.
Unlock KERN_ALLOWDT. `allowdt' is the atomically accessed integer used
only once in dtopen().
ok mpi
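A minimal sketch of the unlocked `allowdt' read, assuming atomic_load_int(9)
takes a pointer to unsigned int; the variable and function here are stand-ins,
not the committed dtopen() code:

#include <sys/types.h>
#include <sys/atomic.h>
#include <sys/errno.h>

static unsigned int allowdt_sketch;	/* stand-in for the real `allowdt' */

static int
dt_open_check_sketch(void)
{
	/* Single unlocked read; the sysctl path stores to it atomically. */
	if (atomic_load_int(&allowdt_sketch) == 0)
		return (EPERM);
	return (0);
}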
|
|
From Christian Ludwig.
|
|
From Christian Ludwig with some tweaks.
|
|
All events are currently exported to userland in order to support complex
filters. If this becomes a bottleneck it should be possible to translate
(some) userland filters to in-kernel filters.
Prodded by a diff from Christian Ludwig to also trace the tracing program.
ok claudio@
|
|
When initializing the profiling probes, check that the probe was
successfully allocated before registering it. This avoids a NULL
pointer dereference when probe allocation has failed.
from Christian Ludwig
|
|
For the interval and profile providers, schedule the first clock
interrupt to occur dp_nsecs nanoseconds after the start of recording.
This makes the interval between the start of recording and the first
event consistent across runs.
With input from claudio@. Simplified by claudio@.
Thread: https://marc.info/?l=openbsd-tech&m=170879058205043&w=2
ok mpi@ claudio@
|
|
Clock interrupt staggering makes profiling more expensive on average.
Remove it.
Thread: https://marc.info/?l=openbsd-tech&m=170751016121770&w=2
ok mpi@
|
|
To improve the utility of dt(4)'s interval and profile probes we need
to move the probe entry points from the fixed-frequency hardclock() to
a dedicated clock interrupt callback so that the probes can fire at
arbitrary frequencies.
- Remove entry points for interval/profile probes from hardclock().
- Merge dt_prov_profile_enter(), dt_prov_interval_enter(), and
dt_prov_profile_fire() into one function, dt_clock(). This is
the now-unified callback for interval/profile probes. dt_clock()
will consume multiple events during a single execution if it is
delayed, but on platforms with high quality interrupt clocks this
should be rare.
- Each struct dt_pcb gets its own clockintr handle, dp_clockintr.
- In struct dt_pcb, replace dp_maxtick/dp_nticks with dp_nsecs,
the PCB's sampling period. Asynchronous probes must initialize
dp_nsecs to a non-zero value during dtpv_alloc().
- In struct dt_pcb, replace dp_cpuid with dp_cpu so that
dt_ioctl_record_start() knows where to bind the PCB's
dp_clockintr.
- dt_ioctl_record_start() binds, staggers, and starts all
interval/profile PCBs on the given dt_softc. Each dp_clockintr
is given a reference to its enclosing PCB so that dt_clock()
doesn't need to search for it. The staggering sort-of simulates
the current behavior under hardclock().
- dt_ioctl_record_stop() unbinds all interval/profile PCBs. The
CL_BARRIER ensures that dp_clockintr's PCB reference is not in
use by dt_clock() so that the PCB may be safely freed upon
return from dt_ioctl_record_stop(). Blocking while holding
dt_lock is not ideal, but in practice blocking in this spot is
rare and dt_clock() completes quickly on all but the oldest
hardware. An extremely unlucky thread could block for every
interval/profile PCB on the softc, but this is implausible.
DT_FA_PROFILE values are up-to-date for amd64, i386, and macppc.
Somebody with the right hardware needs to check-and-maybe-fix the
values on octeon, powerpc64, and sparc64.
Joint effort with mpi@.
Thread: https://marc.info/?l=openbsd-tech&m=170629371821879&w=2
ok mpi@
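A minimal sketch of the dt_clock() idea, with hypothetical helpers standing in
for the clockintr/clockrequest primitives and the ring-buffer recorder; this is
not the committed code:

#include <sys/types.h>

struct pcb_sketch {
	uint64_t	dp_nsecs;	/* sampling period, in nanoseconds */
};

/* Hypothetical stand-ins: "advance" returns how many whole dp_nsecs
 * periods expired since the callback last ran. */
uint64_t	advance_periods_sketch(uint64_t);
void		record_event_sketch(struct pcb_sketch *);

static void
dt_clock_sketch(struct pcb_sketch *dp)
{
	uint64_t count, i;

	/* A delayed interrupt may cover several periods: record them all. */
	count = advance_periods_sketch(dp->dp_nsecs);
	for (i = 0; i < count; i++)
		record_event_sketch(dp);
}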
|
|
syzkaller has hit the assertion "dtlookup(unit) == NULL" by opening the
dt(4) device in two parallel threads. Convert the KASSERT() into an if
condition. Move the check that the device is not already in use to after
the sleep points in malloc(9). The dtdev_list list is protected by the
kernel lock, which is released during sleep.
Reported-by: syzbot+6d66c21f796c817948f0@syzkaller.appspotmail.com
OK miod@
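A minimal sketch of the fix, with stand-in names; lookup_unit_sketch() plays
the role of dtlookup() and this is not the committed dtopen():

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>
#include <sys/errno.h>

struct softc_sketch { int sc_unit; };			/* stand-in softc */
struct softc_sketch *lookup_unit_sketch(int);		/* hypothetical */

static int
dt_open_sketch(int unit)
{
	struct softc_sketch *sc;

	/* M_WAITOK may sleep and the kernel lock is released meanwhile. */
	sc = malloc(sizeof(*sc), M_DEVBUF, M_WAITOK | M_ZERO);

	/* Re-check for a concurrent open of the same unit after the sleep. */
	if (lookup_unit_sketch(unit) != NULL) {
		free(sc, M_DEVBUF, sizeof(*sc));
		return (EBUSY);
	}
	sc->sc_unit = unit;
	/* ... insert sc into the kernel-locked device list here ... */
	return (0);
}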
|
|
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately
it has a race and sometimes panics with "pool_do_get: syncache free
list modified". Add a reference counter for the timeout and for the
list of syn cache entries. Currently the list refcount is not strictly
necessary due to the exclusive netlock, but it will be needed as we
continue unlocking.
Checking timeout_initialized() is not MP-friendly; better to do proper
initialization during object allocation. The refcount in btrace helps
to find leaks.
bug reported and fix tested by Peter J. Philipp
OK claudio@
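A minimal sketch of the initialization change, with illustrative types and
timeout handler (not the committed syn cache diff):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/timeout.h>
#include <sys/refcnt.h>

struct entry_sketch {
	struct timeout	e_timer;
	struct refcnt	e_refcnt;
};

void	entry_timer_sketch(void *);	/* hypothetical timeout handler */

static void
entry_init_sketch(struct entry_sketch *e)
{
	/* Set up once at allocation, so no timeout_initialized() checks
	 * are needed later and the refcount is valid from the start. */
	timeout_set(&e->e_timer, entry_timer_sketch, e);
	refcnt_init(&e->e_refcnt);
}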
|
|
- Add two new tracepoints, sched:fork & sched:steal
- Include the selected CPU number in sched:wakeup
- Add sched:unsleep, corresponding to sched:sleep, which matches the
addition/removal of threads on the sleep queue
ok claudio@
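A rough sketch of how such a static tracepoint is wired up, using a made-up
probe sched:example and argument list; the real fork/steal declarations live
in dt_prov_static.c, and the macro usage below is assumed from
<dev/dt/dtvar.h> and <sys/tracepoint.h>:

/* In dt_prov_static.c: declare the probe and its argument types, and add
 * it to the provider's probe table next to the other sched probes. */
DT_STATIC_PROBE2(sched, example, "pid_t", "int");

/* At the instrumented code path: */
#include <sys/param.h>
#include <sys/tracepoint.h>

void
sched_example_path_sketch(pid_t pid, int cpu)
{
	/* Expands to nothing when dt(4) is not compiled in. */
	TRACEPOINT(sched, example, pid, cpu);
}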
|
|
Also remove the priority argument to sleep_finish(); the code can use
the P_SINTR flag in p_flag to know whether the signal check is needed.
OK cheloha@ kettenis@ mpi@
|
|
Replace hand-rolled reference counting with refcnt_init(9) and hook it up
with a new dt(4) probe.
OK mvs
Feedback OK bluhm
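A minimal sketch of such a conversion, with an illustrative object type; the
actual diff touches an existing structure, and the refcnt semantics are as
documented in refcnt_init(9):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>
#include <sys/refcnt.h>

struct obj_sketch {
	struct refcnt	o_refcnt;	/* was: a bare "int o_refs;" */
};

static struct obj_sketch *
obj_alloc_sketch(void)
{
	struct obj_sketch *o;

	o = malloc(sizeof(*o), M_DEVBUF, M_WAITOK | M_ZERO);
	refcnt_init(&o->o_refcnt);	/* starts at 1, owned by the caller */
	return (o);
}

static void
obj_take_sketch(struct obj_sketch *o)
{
	refcnt_take(&o->o_refcnt);	/* was: "o->o_refs++;" */
}

static void
obj_rele_sketch(struct obj_sketch *o)
{
	/* refcnt_rele(9) returns nonzero once the last reference is gone. */
	if (refcnt_rele(&o->o_refcnt))
		free(o, M_DEVBUF, sizeof(*o));
}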
|
|
ok mpi@
|
|
|
|
longer identify function boundaries and as such no kprobes were found anymore.
Adjust the parser accordingly.
ok mpi@
|
|
Replace hand-rolled reference counting with refcnt_init(9) and hook it up
with a new dt(4) probe.
OK bluhm mvs
|
|
Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.
The timeout is now set up first thing in sleep_finish() and no longer as the
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.
OK kettenis@
|
|
ok bluhm@
|
|
This adds stacktrace_save_utrace() to extract and save the userland stack;
it is stubbed out on most archs. alpha and riscv64 do not even implement
dt(4) and stacktrace_save_at(), so the stubs are excluded there.
Additionally add a new ioctl, DTIOCGETAUXBASE, which allows btrace to
fetch the AUX_BASE value from the AUX vector of a process.
OK mpi@ (some time ago) discussed with kettenis@
|
|
ok deraadt@ miod@ krw@
|
|
|
|
They are already tracked as strings in the kernel. Export them to
userland using one ioctl(2) for all arguments of each probe.
OK mpi@
|
|
Tested on a G5 and G4 macppc.
OK miod@
|
|
API functions. Fixes flamegraphs on archs I could test.
OK bluhm@ miod@
|
|
Forgot to put it in the list of static tracepoints when I committed
the tracepoint at g2k22. Woops.
|
|
Inserts a new static dt(4) tracepoint in vmm(4) to report details
on in/out instructions (direction, port, and data).
ok mlarkin@
|
|
There was a crash due to use after free of the ifa although it is
ref counted. As ifa_refcnt was a simple integer increment, there
may be a path where multiple CPUs access it concurrently. So change
it to struct refcnt, which is MP-safe and provides dt(4) leak debugging.
The link-level address for IPsec enc(4) and various MPLS interfaces is
special: there the ifa is part of the softc. Use the refcount anyway and
add a panic to detect use after free.
bug report stsp@; OK mvs@
|
|
tracepoint for each type of refcnt we have. As a start, add inpcb
and tdb refcnt. When the counter changes, btrace may print the
actual object, the current counter, the change value and optionally
the stack trace.
discussed with visa@; OK mpi@
|
|
OK mpi@
|
|
OK mpi@
|
|
proper strings, adapt dt's exported string in the same way.
Old/new files/tools will not work the same way.
That this interface needs to also change was pointed out by jsg
|
|
One can use them on non-VMM architectures, but they obviously won't hit:
# arch -s ; btrace -l | grep vmm
sparc64
tracepoint:vmm:guest_enter
tracepoint:vmm:guest_exit
Move them under __amd64__ to avoid confusion and save a few bytes.
OK dv
|
|
OK mpi@
|
|
feedback and ok tb@ jmc@ ok ratchov@
|
|
is enabled by default, this line does not provide much information.
requested by kettenis@ deraadt@; OK mpi@
|
|
the user anyway and close(2) may crash after setuid(2).
Reported-by: syzbot+90e094f33d329fb2c3ab@syzkaller.appspotmail.com
OK deraadt@
|
|
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.
With that, 'struct emul' is unused: delete it and all its references
ok millert@
|
|
goes on in SMR.
OK mpi@
|
|
|
|
ok mpi@
|
|
|
|
KERNEL_LOCK() held
|
|
KERNEL_LOCK() held
discussed with and OK mpi@
|
|
this allows us to dynamically trace function boundaries with btrace by patching
prologues and epilogues with a breakpoint, upon which the handler records the
data and sends it back to userland for btrace to consume.
currently it's hidden behind DDBPROF, and there is still a lot to clean up and
improve, but basic scripts that observe return codes from a probed function
work.
from Tom Rollet, with various changes by me
feedback and ok mpi@
|
|
|