src - OpenBSD base system

Age	Commit message	Author
2024-10-23	remove duplicate MCX_CAP_DEVICE_DRAIN_SIGERR define	Jonathan Gray

2024-10-04	As with other multiqueue drivers, print the number of queues we set up	Jonathan Matthew
	along with the interrupt and ethernet address details. ok dlg@
2024-05-24	remove unneeded includes; ok miod@	Jonathan Gray

2024-04-12	fix non-auto setting of extended media type bits	Jonathan Gray
	found by smatch warning about uninitialised var use ok jmatthew@
2024-04-11	Match on ConnectX-6 virtual functions too, since they don't seem to be	Jonathan Matthew
	any different to earlier revisions. from Brad
2024-04-11	Add support for media types from the extended ethernet capabilities fields.	Jonathan Matthew
	If none of the regular ethernet capabilities are present, check the extended capabilities. Since we only report that the link is active if there's a detected media type, this isn't just a cosmetic change. Joerg Streckfuss reported that a gigabit SFP didn't work in a ConnectX-6 Lx, and tested that this change makes it work. ok dlg@
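The fallback described in this commit can be sketched as below. This is a minimal illustration, not the driver's code: the table contents, the media values, and every `example_` name are hypothetical, and "no regular capability present" is approximated as "no regular capability bit matched a known media type".

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MEDIA_NONE	0	/* hypothetical "no media detected" value */

/* One capability bit and the media type it maps to (all values invented). */
struct example_cap {
	uint32_t	bit;
	int		media;
};

static const struct example_cap example_eth_caps[] = {
	{ 1U << 0, 10 },
	{ 1U << 1, 20 },
};

static const struct example_cap example_ext_caps[] = {
	{ 1U << 0, 30 },
};

/* Return the media type of the first table entry whose bit is set. */
static int
example_lookup(const struct example_cap *tbl, size_t n, uint32_t bits)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (bits & tbl[i].bit)
			return tbl[i].media;
	}
	return EXAMPLE_MEDIA_NONE;
}

/* Try the regular capabilities first, then fall back to the extended ones. */
int
example_media_type(uint32_t eth_bits, uint32_t ext_bits)
{
	int media;

	media = example_lookup(example_eth_caps,
	    sizeof(example_eth_caps) / sizeof(example_eth_caps[0]), eth_bits);
	if (media == EXAMPLE_MEDIA_NONE)
		media = example_lookup(example_ext_caps,
		    sizeof(example_ext_caps) / sizeof(example_ext_caps[0]),
		    ext_bits);
	return media;
}

/* The link only counts as active when some media type was detected. */
int
example_link_active(uint32_t eth_bits, uint32_t ext_bits)
{
	return example_media_type(eth_bits, ext_bits) != EXAMPLE_MEDIA_NONE;
}
```

In this sketch a port that reports only an extended capability bit, e.g. `example_link_active(0, 1)`, is treated as having an active link, which is the non-cosmetic part of the change.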
2023-11-10	Make ifq and ifiq interface MP safe.	Alexander Bluhm
	Rename ifq_set_maxlen() to ifq_init_maxlen(). This function neither uses WRITE_ONCE() nor a mutex and is called before the ifq mutex is initialized. The new name expresses that it should be used only during interface attach when there is no concurrency. Protect ifq_len(), ifq_empty(), ifiq_len(), and ifiq_empty() with READ_ONCE(). They can be used without lock as they only read a single integer. OK dlg@
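A rough sketch of the lockless read pattern mentioned above. `EXAMPLE_READ_ONCE` is a stand-in written for this illustration (it relies on the GCC/Clang `__typeof__` extension) and is not the kernel's definition of `READ_ONCE()`; the struct and function names are likewise hypothetical.

```c
/*
 * Stand-in for the kernel's READ_ONCE(): force a single load through a
 * volatile-qualified pointer so the compiler cannot re-read or cache it.
 */
#define EXAMPLE_READ_ONCE(x)	(*(volatile __typeof__(x) *)&(x))

/* Hypothetical queue whose length is a single integer. */
struct example_ifq {
	unsigned int	len;
};

/* Safe without a lock because only one integer is read, exactly once. */
unsigned int
example_ifq_len(struct example_ifq *q)
{
	return EXAMPLE_READ_ONCE(q->len);
}

int
example_ifq_empty(struct example_ifq *q)
{
	return example_ifq_len(q) == 0;
}
```

The value may be stale by the time the caller acts on it. The pattern only guarantees a single, untorn read of the integer.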
2023-09-18	Add 100GB LR4 Ethernet capability and map it to IFM_100G_LR4.	Jonathan Matthew
	This isn't listed in the public PRM but it can be found in the Linux driver. from Olivier Croquin
2023-09-07	match on Mellanox ConnectX-6 Lx	Jonathan Gray
	from and tested by Olivier Croquin ok dlg@
2023-08-15	Replace a bunch of (1 << 31) with (1U << 31)	Miod Vallat
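For context, a small sketch of why the unsigned form is preferable. With a 32-bit `int`, `1 << 31` shifts into the sign bit, which is undefined behaviour in C, while `1U << 31` is well defined. The flag name below is hypothetical, and the signed case is modelled with `INT32_MIN` rather than by performing the undefined shift.

```c
#include <stdint.h>

/* Hypothetical register flag in the top bit; assumes a 32-bit unsigned int. */
#define EXAMPLE_FLAG_TOP	(1U << 31)

/* Widening the unsigned constant zero-extends it. */
uint64_t
example_widen_unsigned(void)
{
	return (uint64_t)EXAMPLE_FLAG_TOP;
}

/*
 * What the signed form commonly turns into in practice: INT32_MIN has the
 * same bit pattern, and converting it to 64 bits sign-extends it.
 */
uint64_t
example_widen_signed(void)
{
	return (uint64_t)(int32_t)INT32_MIN;
}
```

The first function yields `0x80000000`, the second `0xffffffff80000000`, so a mask built from the signed form can silently set the upper 32 bits when mixed with 64-bit values.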

2023-06-06	don't need mcx_uptime() now that we have nsecuptime()	David Gwynne
	ok jmatthew@
2022-11-22	Allocate additional command queue slots and use command completion events	Jonathan Matthew
	to run commands where we can sleep while waiting. Rather than actually using it as a queue, just allocate the slots to particular uses. The first slot is used for polled commands (anything run while cold), then there's one for general ioctls, one for kstat reads, and one for link operations. Since we can sleep while waiting now, we need to serialize access to the command slots. This is done with rwlocks for the ioctl and kstat slots, and the link slot is only used from a single instance task. This also means we don't need to hold the kernel lock while doing kstat reads. Using interrupt based command completion drops the time taken to read all the kstats off mcx interfaces from tens of milliseconds to almost nothing, which is a pretty big win when you're reading them every few seconds on busy firewalls. ok dlg@
2022-06-26	Break out of the switch statement rather than returning early on ioctl	Jonathan Matthew
	errors, ensuring the IPL is correctly restored. from Christian Ludwig
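The control-flow point can be sketched with a mock interrupt priority level. Everything here is hypothetical (`example_splraise()` and `example_splx()` only imitate the shape of the kernel's spl functions). It illustrates the pattern, not the driver's ioctl handler.

```c
#include <errno.h>

#define EXAMPLE_IPL_NONE	0
#define EXAMPLE_IPL_NET		1
#define EXAMPLE_CMD_UP		1

/* Mock "current interrupt priority level". */
static int example_ipl = EXAMPLE_IPL_NONE;

/* Raise the mock IPL and return the previous level. */
static int
example_splraise(int ipl)
{
	int old = example_ipl;

	example_ipl = ipl;
	return old;
}

/* Restore a previously saved mock IPL. */
static void
example_splx(int old)
{
	example_ipl = old;
}

int
example_current_ipl(void)
{
	return example_ipl;
}

int
example_ioctl(int cmd)
{
	int s, error = 0;

	s = example_splraise(EXAMPLE_IPL_NET);
	switch (cmd) {
	case EXAMPLE_CMD_UP:
		break;
	default:
		/* A "return ENOTTY;" here would skip example_splx() below. */
		error = ENOTTY;
		break;
	}
	example_splx(s);

	return error;
}
```

Because every case leaves the switch with `break`, the single `example_splx(s)` call runs on both the success and error paths.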
2022-03-11	Constify struct cfattach.	Martin Pieuchot

2022-01-09	spelling	Jonathan Gray
	feedback and ok tb@ jmc@ ok ratchov@
2021-07-23	pci_intr_msix_count() is the function that drivers using multiple MSI-X	Jonathan Matthew
	vectors use to decide whether to use MSI-X, so make it return 0 if MSI is not enabled for the device. fixes problems with ix(4) on older amd64 hardware and current riscv64 ok kettenis@ dlg@
2021-06-02	When processing a received packet, only sync the amount of bytes	Patrick Wildt
	mcx(4) told us has arrived. The DMA map's mapsize on RX packets is the length of the allocated buffer. For mcx(4), this can be more than around 9000 bytes, as each buffer will be at least as big as the maximum supported MTU. There's no need to sync the whole buffer, if it's only a small packet. ok dlg@ jmatthew@
2021-02-25	we don't have to cast to caddr_t when calling m_copydata anymore.	David Gwynne
	the first cut of this diff was made with coccinelle using this spatch:
		@rule@
		type caddr_t;
		expression m, off, len, cp;
		@@
		-m_copydata(m, off, len, (caddr_t)cp)
		+m_copydata(m, off, len, cp)
	i had to fix its opinionated idea of formatting by hand though, so i'm not sure it was worth it. ok deraadt@ bluhm@
2021-02-15	move the rearming of the cq after the refill of the rq.	David Gwynne
	this is the only real diff we have left outstanding on a box that experienced rx lockups. since adding this change it's been happy for the last 4 weeks and counting so far. ok jmatthew@
2021-01-27	do better accounting of how many msix interrupts we want to use.	David Gwynne
	ok jmatthew@
2021-01-25	raise the max number of queues/interrupts to 16, up from 1.	David Gwynne
	jmatthew@ has tried this before, but hrvoje popovski experienced breakage so it wasn't enabled. we've tightened the code up since then so it's time to try again. this diff has been tested by hrvoje popovski and myself ok jmatthew@
2021-01-25	don't lose the M_FLOWID flag if the ipv4 cksum is ok.	David Gwynne
	found while poking around with hrvoje popovski yes jmatthew@
2021-01-25	use an intrmap when establishing interrupts for queues.	David Gwynne
	mcx is still hardcoded/limited to 1 queue for now, but this lets different mcx devices use different cpus for handling packets. looks good jmatthew@
2021-01-20	Check management capabilities before trying to attach temperature sensors,	Jonathan Matthew
	avoiding an unhelpful error message if the card's firmware doesn't expose the sensor registers. tested by chris@, who saw the unhelpful error message ok dlg@
2021-01-04	the tx doorbell is next to the rx doorbell, not on top of it.	David Gwynne

2021-01-04	use bus_dmamap_sync around updates to the doorbells.	David Gwynne
	ok jmatthew@
2020-12-27	have mcx_process_txeof return the number of slots it processed.	David Gwynne
	it used a pointer in an argument to communicate that back to the caller, while being a void function. this seems more natural and brings it in line with how the rx completion function returns free slots to its caller too.
2020-12-27	do a bus space barrier after arming the eq.	David Gwynne
	ok jmatthew@
2020-12-27	disable timestamping a little bit harder to avoid divide by 0.	David Gwynne
	hrvoje popovski reports the current code faults on some boxes. i'm working on it, but the code isn't being used right now.
2020-12-27	shuffle filling the rx ring so the sw prod is updated before the hw.	David Gwynne
	ok jmatthew@
2020-12-26	reuse the calculated vector as the argument to pci_intr_map_msix.	David Gwynne
	doing the maths again feels error prone.
2020-12-26	add bus_dmamap_sync ops around the eq.	David Gwynne
	ok jmatthew@
2020-12-26	add some bus_dmamap_syncs around the rq.	David Gwynne
	ok jmatthew@
2020-12-26	sprinkle some bus_dmamap_syncs around the cq handling.	David Gwynne
	ok jmatthew@
2020-12-26	sprinkle some bus_dmamap_syncs around the sq.	David Gwynne
	ok jmatthew@
2020-12-26	better manage the lifetime of the dmamem used for various rings.	David Gwynne
	ok jmatthew@
2020-12-25	expose the mcx timer as a timecounter.	David Gwynne
	this is mostly to help me better understand where i accumulate error when trying to sync the chip to the kernel clocks. i.e., if i'm using mcx as the kernel clock source and my attempts to sync to it still produce errors, then my code is very wrong instead of slightly wrong. it's also fun and a tiny amount of code.
2020-12-17	rework the maths used to set mbuf timestamps.	David Gwynne
	there's a comment that explains how it works now, but the result is that i get much tighter and more consistent synchronisation between the kernel clock and the values derived from the mcx timestamps now. however, i only just worked out that there is still an unresolved problem where the kernel clock changes how fast it ticks. this happens when ntpd uses adjtime(2) or adjfreq(2) to try and make the kernel tick at the same rate as the rest of the universe (well, the small bit of it that it can observe). these adjustments to the kernel clock immediately skew the timestamps that mcx calculates, but then it also throws off the mcx calibration calculations that run every 30 seconds. the offsets calculated in the next calibration period are often (very) negative. eg, when things are synced up nicely and you do a read of the mcx timer and immediately follow it with a nanouptime(9) call, on this box it calculates that the time in between those two events is about 2600ns. in the calibration period after ntpd did a very small adjtime call, it now thinks the time between those two events is -700000ns. this is a pretty obvious problem in hindsight. i can't think of a simple solution to it at the moment though so i'm going to leave mcx timestamping disabled for now.
2020-12-15	fill in more of mcx_cap_device so i can get to the device frequencies.	David Gwynne
	fun fact, my ConnectX-4 Lx boards seem to run at 156MHz. less fun fact, mcx_calibrate() seems to work that out pretty well anyway, but the maths is still a bit too wonky to make it usable for mbuf timestamps.
2020-12-15	go to splhigh around the kernel clock and hardware timer reads.	David Gwynne
	the idea is to avoid some other work, like a hardware interrupt, running in between the reads of the kernel and chip clocks and therefore skewing the interval calculations. this tightens up a lot of the slop seen when using the cqe timestamp for an mbuf timestamp, but there's still something not quite right.
2020-12-15	turn hardware rx mbuf timestamping off by default.	David Gwynne
	we see too many ntp replies (appear to) arrive before the request for them was sent, which ntpd handles by disabling the peer for an hour. this was a lot easier to narrow down after fixing up bpf and timestamps, cos it let me see this:
		16:50:36.051696 802.1Q vid 871 pri 3 192.0.2.55.47079 > 162.159.200.123.123: v4 client strat 0 poll 0 prec 0 [tos 0x10]
		16:50:36.047201 802.1Q vid 871 pri 3 162.159.200.123.123 > 192.0.2.55.47079: v4 server strat 3 poll 0 prec -25 (DF)
	i'm going to borrow the link0 flag for a bit to allow turning timestamping on.
2020-12-12	Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.	jan
	OK dlg@, bluhm@ No Opinion mpi@ Not against it claudio@
2020-11-06	Match on ConnectX-6 (non-Dx) cards too.	Jonathan Matthew
	tested by Nilson Lopes
2020-11-06	Bail out early if the port type is not Ethernet, rather than failing later	Jonathan Matthew
	on in the attach process with a useless error message. tested on a ConnectX-6 card in infiniband mode by Nilson Lopes ok dlg@
2020-10-28	Add missing bus_space_barrier() in mcx_cmdq_post() - without this, cmdq	Jonathan Matthew
	operations during attach fail on some amd64 systems using the TSC delay function, seemingly as there aren't enough memory operations happening to get the doorbell write out to the device otherwise. The lapic delay function didn't expose this problem. suggested by kettenis@ ok dlg@
2020-08-21	Add kstats reporting the software and hardware producer and consumer	Jonathan Matthew
	counters for send, receive, completion and event queues, as well as the queue states. There are still some bugs in queue handling that we're trying to track down and these should help. No change in object size without kstat enabled. ok dlg@
2020-07-23	Increase the event queue size. When polling for admin command completion,	Jonathan Matthew
	we don't wait for the event to be posted to the queue, we just look at the command itself, which means we can build up a backlog of events to be posted. Newer firmware for ConnectX-4 seems to get upset if the backlog grows beyond some fraction of the event queue size, causing an interrupt storm. This was reported by patrick@ and Hrvoje Popovski (at least) while testing support for multiple tx/rx queues. With the new event queue size, we can safely create 8 queues.
2020-07-17	Virtual functions are effectively identical to full physical functions,	Jonathan Matthew
	so we can attach to them too. ok dlg@
2020-07-17	Consistently use the port type and speed register (PTYS) to determine if	Jonathan Matthew
	the link is up, rather than the operational status (PAOS). ok dlg@
2020-07-16	Pass the interrupt handler cookie instead of the pointer to it	Patrick Wildt
	to intr_barrier(9). Fixes mysterious panics seen while working on intr_barrier(9) for arm64. ok jmatthew@