path: root/usr.sbin/vmd/virtio.c
Age  Commit message  Author
2021-08-29  Mask viornd descriptor value to prevent out of bound reads.  Dave Voutila
viornd did not mask the descriptor value in the available ring, allowing guest values to read past the end of the descriptor table. While here, change fatal to fatalx because errno is not set. Reported by Ilja van Sprundel. ok mlarkin@
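A minimal sketch of the masking idea described above, not the actual vmd code. The struct follows the virtio split-ring descriptor layout; `QUEUE_SIZE`, `QUEUE_MASK`, and `desc_lookup` are hypothetical names chosen for illustration. Because the queue size is a power of two, ANDing a guest-supplied index with `size - 1` keeps it inside the descriptor table.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_SIZE 128			/* assumed queue size */
#define QUEUE_MASK (QUEUE_SIZE - 1)

/* The mask trick only works when the size is a power of two. */
static_assert((QUEUE_SIZE & QUEUE_MASK) == 0, "queue size must be 2^n");

struct desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/* Return the descriptor named by an index taken from guest memory. */
static const struct desc *
desc_lookup(const struct desc *table, uint16_t guest_idx)
{
	/* Masking forces the index into [0, QUEUE_SIZE). */
	return &table[guest_idx & QUEUE_MASK];
}
```

With this in place, an index such as 0xFFFF resolves to slot 127 rather than reading far past the table.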
2021-08-29  mask next descriptor value and fix chunk_size calculation  Dave Voutila
Guest can cause out of bounds read with a malformed descriptor. In same loop, also fix a chunk size calculation. Reported by Ilja van Sprundel. ok mlarkin@
2021-08-29  check for null vioblk info  Dave Voutila
If {c,m}alloc fail, info could be NULL and result in NULL deref. Reported by Ilja van Sprundel. ok mlarkin@
2021-08-29  correct device status write size  Dave Voutila
Reported by Ilja van Sprundel. ok mlarkin@
2021-08-29  remove old descriptor dump function  Dave Voutila
Used originally to aid dev. Unneeded. ok mlarkin@
2021-07-16  vmd(8): simplify vcpu logic, removing uart & vionet reads  dv
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread. This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events. No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.) OK mlarkin@
2021-06-21  vmd(8): support variable length vionet rx descriptor chains  dv
The original implementation of the virtio network device assumed a driver would only provide a 2-descriptor chain for receiving packets. The virtio spec allows for variable length chains and drivers, in practice, construct them when they use a sufficiently large MTU. This change lets the device use variable length chains provided by the driver, thus allowing for drivers to set an MTU up to the underlying host-side tap(4)'s limit of TUNMRU (16384). Size limitations are now enforced on both tx and rx-side dropping anything violating the underlying tap(4) min and max limits. More work is needed to increase the read(2) buffer in use by vmd to prevent packet truncation. OK mlarkin@
2021-06-17  vmd(8): handle VIRTIO_BLK_T_GET_ID, check descriptor r/w flags  dv
Linux guests like to issue VIRTIO_BLK_T_GET_ID commands in attempts to read the device serial number. It's not part of the virtio spec, but has been part of QEMU and Bhyve for multiple years. It will be landing in the next version of virtio (1.2), so this stubs out handling for the request type. The added benefit is it helps squelch log noise from Linux guests. For now, no serial number is set and the request status is set to VIRTIO_BLK_S_UNSUPP to tell the driver we don't support it. While here, swap the response to VIRTIO_BLK_T_FLUSH{,_OUT} to also return VIRTIO_BLK_S_UNSUPP. It's not negotiated nor implemented. Lastly, add checks for validating the vioblk device is only reading/writing descriptors with appropriate read/write-only flags per the virtio spec. With input from claudio@, OK mlarkin@
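The read/write-only flag checks can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the flag value 2 is the virtio "device write-only" descriptor bit, while the enum and the three helper names are hypothetical. For a block request, the header must be readable by the device, the status byte must be writable, and the data buffer's direction depends on the request type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DESC_F_WRITE 2	/* virtio: buffer is write-only for the device */

/* Hypothetical request directions, for illustration only. */
enum blk_dir {
	BLK_DIR_IN,	/* guest reads from disk: device writes the buffer */
	BLK_DIR_OUT	/* guest writes to disk: device reads the buffer */
};

/* The request header is filled in by the driver, so it must be readable. */
static bool
hdr_desc_ok(uint16_t flags)
{
	return (flags & DESC_F_WRITE) == 0;
}

/* The data buffer must be writable for reads and read-only for writes. */
static bool
data_desc_ok(enum blk_dir dir, uint16_t flags)
{
	bool writable = (flags & DESC_F_WRITE) != 0;

	return dir == BLK_DIR_IN ? writable : !writable;
}

/* The status byte is written back by the device. */
static bool
status_desc_ok(uint16_t flags)
{
	return (flags & DESC_F_WRITE) != 0;
}
```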
2021-06-16  cleanup vmd(8) includes and header files  dv
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code. No functional change. "go for it" mlarkin@
2021-06-11  vmd(8): deduplicate vioblk command logic  dv
No need for each case in the switch block to have the same logic for updating the used ring and writing the state back to the guest. Move it outside the switch. No functional change. ok mlarkin@
2021-05-18  vmd(8): guest virtio drivers can cause stack & buffer overflows  dv
A vmd guest can craft invalid virtio descriptor lengths resulting in reading and writing beyond stack-allocated buffer lengths providing an escape vector to the host. Instead of allowing the guest to dictate read/write lengths, this commit has vmd just use compile-time lengths based on the source or destination object sizes. For instances where vmd's virtio implementation can't use this method, such as reading packets from the vionet device, cap each read with a pre-computed max chunk size. Reported by Maxime Villard. Tested with help from Mischa Peters, OK mlarkin@
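A rough sketch of the "compile-time length" approach, under stated assumptions: `guest_read` is a hypothetical stand-in for a guest-memory accessor, and `blk_req_hdr` mirrors the virtio block request header only for illustration. The point is that the copy size comes from `sizeof` on the destination object, never from the length field the guest wrote into the descriptor. Rejecting descriptors shorter than the header is an extra assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct blk_req_hdr {
	uint32_t type;
	uint32_t ioprio;
	uint64_t sector;
};

/* Hypothetical guest-memory read helper with its own bounds check. */
static int
guest_read(const uint8_t *mem, size_t mem_len, uint64_t gpa, void *dst,
    size_t len)
{
	if (gpa > mem_len || len > mem_len - gpa)
		return -1;
	memcpy(dst, mem + gpa, len);
	return 0;
}

/* Read a request header; guest_len is the guest-controlled descriptor length. */
static int
read_req_hdr(const uint8_t *mem, size_t mem_len, uint64_t gpa,
    uint32_t guest_len, struct blk_req_hdr *hdr)
{
	/* Assumption: a descriptor too short for the header is rejected. */
	if (guest_len < sizeof(*hdr))
		return -1;
	/* Copy exactly sizeof(*hdr) bytes, however large guest_len claims to be. */
	return guest_read(mem, mem_len, gpa, hdr, sizeof(*hdr));
}
```

Even if the guest claims a 4096-byte descriptor, only 16 bytes ever land in the stack-allocated header.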
2021-04-22  vmd(8): guard against bad virtio drivers  dv
Add protections against guests with bad virtio-{blk,net,scsi} drivers, specifically avoiding invalid descriptor chains and invalid vionet packet sizes. This helps prevent possible lockup of the host vm process due to a spinning device event loop thread. Also fix an unneeded cast in the vioblk handling in case of invalid buffer lengths. OK mlarkin@
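One common way to guard against invalid descriptor chains is to bound the walk by the queue size, since a valid chain cannot contain more links than there are descriptors. The sketch below illustrates that idea only; it is not taken from the commit, and `QUEUE_SIZE`, `DESC_F_NEXT` (the virtio "chained" bit, value 1), and `chain_length` are names assumed for the example.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_SIZE 128			/* assumed queue size */
#define QUEUE_MASK (QUEUE_SIZE - 1)
#define DESC_F_NEXT 1			/* virtio: chain continues via 'next' */

static_assert((QUEUE_SIZE & QUEUE_MASK) == 0, "queue size must be 2^n");

struct desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/*
 * Count the descriptors in the chain starting at head.
 * Returns -1 if the chain is longer than the queue, which
 * means the guest built a loop.
 */
static int
chain_length(const struct desc *table, uint16_t head)
{
	uint16_t idx = head & QUEUE_MASK;
	int count = 1;

	while (table[idx].flags & DESC_F_NEXT) {
		if (++count > QUEUE_SIZE)
			return -1;
		idx = table[idx].next & QUEUE_MASK;
	}
	return count;
}
```

Without the bound, a descriptor whose `next` points at itself would keep the device thread spinning forever.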
2021-04-21  Fix packet size checks and remove bad casts.  dv
Because dhcpsz was an uninitialized ssize_t, it was possible that a garbage "packet" would be queued on the receiving end of the virtio network device. Change the type to size_t and add proper checks based on it being greater than zero. Remove the cast of ssize_t to uint64_t that also caused garbage sizes when dhcpsz was uninitialized and set at runtime to something < 0.
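To see why the cast was harmful, consider what happens when a negative `ssize_t` is converted to `uint64_t`. The helper names below are hypothetical, and `checked_len` is only one possible validation strategy, not the change made in the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* The problematic pattern: a negative size becomes an enormous length. */
static uint64_t
cast_len(ssize_t sz)
{
	return (uint64_t)sz;
}

/* One possible alternative: validate first, return 0 for "no packet". */
static size_t
checked_len(ssize_t sz, size_t max)
{
	if (sz <= 0 || (size_t)sz > max)
		return 0;
	return (size_t)sz;
}
```

Here `cast_len(-1)` yields `UINT64_MAX`, which any later "is the size non-zero" test would happily accept as a packet length.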
2021-03-29  Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp  dv
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them. This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal. OK mlarkin@
2021-03-26  inspect all the packets to see if they are dhcp, not just the first one  Theo de Raadt
in a ring bundle. ok florian
2019-12-11  vmd: proper concurrency control when pausing a vm  pd
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state. Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause. ok mlarkin@
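The blocking pause can be sketched with a mutex and condition variable, as below. This is a simplified model under assumptions, not vmd's implementation: the struct, its field names, and all four functions are hypothetical, and the `stop` flag and `vcpu_thread` loop exist only so the example can be exercised. The key property is that `request_pause` returns only after the vcpu thread has actually parked itself, replacing a fixed sleep.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct pause_state {
	pthread_mutex_t	mtx;
	pthread_cond_t	changed;
	bool		requested;	/* control thread wants the vm paused */
	bool		paused;		/* vcpu thread has actually stopped */
	bool		stop;		/* demo only: ask the vcpu loop to exit */
};

/* vcpu thread: called at a safe point in its run loop. */
static void
vcpu_check_pause(struct pause_state *ps)
{
	pthread_mutex_lock(&ps->mtx);
	if (ps->requested) {
		ps->paused = true;
		pthread_cond_broadcast(&ps->changed);
		while (ps->requested)
			pthread_cond_wait(&ps->changed, &ps->mtx);
		ps->paused = false;
		pthread_cond_broadcast(&ps->changed);
	}
	pthread_mutex_unlock(&ps->mtx);
}

/* Control thread: blocks until the vcpu thread reports it is paused. */
static void
request_pause(struct pause_state *ps)
{
	pthread_mutex_lock(&ps->mtx);
	ps->requested = true;
	while (!ps->paused)
		pthread_cond_wait(&ps->changed, &ps->mtx);
	pthread_mutex_unlock(&ps->mtx);
}

/* Control thread: blocks until the vcpu thread is running again. */
static void
request_unpause(struct pause_state *ps)
{
	pthread_mutex_lock(&ps->mtx);
	ps->requested = false;
	pthread_cond_broadcast(&ps->changed);
	while (ps->paused)
		pthread_cond_wait(&ps->changed, &ps->mtx);
	pthread_mutex_unlock(&ps->mtx);
}

/* Demo vcpu loop: polls for pause requests until told to stop. */
static void *
vcpu_thread(void *arg)
{
	struct pause_state *ps = arg;
	bool stop;

	do {
		vcpu_check_pause(ps);
		pthread_mutex_lock(&ps->mtx);
		stop = ps->stop;
		pthread_mutex_unlock(&ps->mtx);
	} while (!stop);
	return NULL;
}
```

The same reasoning explains why device events are detached while paused: a callback that takes a lock held across the pause would block the event thread, and with it the message that requests the unpause.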
2019-11-30  Revert previous - the stability was not as improved as we had thought and  Mike Larkin
we ended up accidentally breaking vmctl. This will need more thought. ok ori@
2019-11-29  Fix at least one cause of VMs spinning at 100% host CPU  Mike Larkin
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this. We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency. with help from and ok ori@
2019-09-24  vmd(8): fix memory leak in virtio network TX path.  Mike Larkin
ok reyk, mpi, benno, tb
2019-09-24  vmd(8): virtio.c whitespace removal  Mike Larkin
2019-01-22  vmd: reorder PCI device assignment to fix Linux network interface numbering  Mike Larkin
On some recent Linux guests, the virtio network interface is named based on its PCI slot assignment, eg "enp0s3". Prior to this change, vmd assigned disks first, meaning if you used a disk image to install Linux and then removed it after install, the network interface name would change from "enp0s3" to "enp0s2" (for example). This broke any autoconfiguration script config files written during the install and generally led to users just being confused about what was going on. This change reorders the vmd PCI device assignment to put network interfaces before disks, as disk devices don't seem to have the same naming issue. This means the slot for network interfaces won't change. IMPORTANT NOTE - if you have existing Linux guest VMs, you'll need to manually fixup your config files (once). ok ajacoutot, phessler, ccardenas, deraadt@
2019-01-10  unbreak vmd build  Stefan Fritsch
include new virtio_pcireg.h header
2018-12-06  Make it possible to define the bootdevice in vmd. This information is used  Claudio Jeker
currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
2018-11-26  Move the {qcow2,raw} create functions from vmctl into vmd/vio{qcow2,raw}.c  Reyk Floeter
This way they are in the appropriate place and code can be shared with vmd. Ok ori@ mlarkin@ ccardenas@
2018-10-19  Add support to create and convert disk images from existing images  Reyk Floeter
The -i option to vmctl create (eg. vmctl create output.qcow2 -i input.img) lets you create a new image from an input file and convert it if it is a different format. This allows converting qcow2 images from raw images, raw from qcow2, or even qcow2 from qcow2 and raw from raw to re-optimize the disk. This re-uses Ori's vioqcow2.c from vmd by reaching into it and compiling it in. The API has been adjusted to be used from both vmctl and vmd accordingly. OK mlarkin@
2018-10-08  Add support for qcow2 base images (external snapshots).  Reyk Floeter
This work is from Ori Bernstein, committing on his behalf: Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image. A limitation of this format is that modifying the base image will corrupt the derived image. This change also adds support for creating derived disk images to vmctl. To use it: vmctl create derived.qcow2 -s 16G -b base.qcow2 From Ori Bernstein OK mlarkin@ reyk@
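The lookup order for derived images can be modelled very roughly as below. This is a toy in-memory model meant only to show the fall-through from derived image to base; it does not reflect the qcow2 on-disk format or the code in vioqcow2.c, and `struct image`, `NCLUSTERS`, and `lookup_cluster` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NCLUSTERS 4	/* toy image size, in clusters */

struct image {
	const struct image	*base;			/* NULL for a standalone image */
	const uint8_t		*cluster[NCLUSTERS];	/* NULL = not allocated here */
};

/*
 * Find the data for cluster n, starting in the derived image and
 * falling back to each base image in turn.
 */
static const uint8_t *
lookup_cluster(const struct image *img, size_t n)
{
	if (n >= NCLUSTERS)
		return NULL;
	for (; img != NULL; img = img->base) {
		if (img->cluster[n] != NULL)
			return img->cluster[n];
	}
	return NULL;	/* allocated nowhere in the chain */
}
```

The model also shows why editing the base is dangerous: every derived image silently sees the changed data for any cluster it has not overridden.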
2018-10-03  Add check to ensure vioscsi pointer is valid  ccardenas
implicit ok from pd@ since he came up with the same diff
2018-09-28  Support vmd-internal's vmboot with qcow2 disk images.  Reyk Floeter
OK mlarkin@
2018-09-19  Various clean up items for disks.  ccardenas
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein. Ok miko@
2018-09-13  vmd: set irq and vm_id in virtio dev structs on restore  pd
This unbreaks vmctl receive. ok ccardenas@
2018-09-11  Be consistent in logging messages.  ccardenas
Change "fmt" to "format". Ok kn@
2018-09-11  Fail fast when we are unable to determine disk format.  ccardenas
While here, minor cleanup on logging.
2018-09-09  Add initial qcow2 image support.  ccardenas
Users are able to declare disk images as 'raw' or 'qcow2' using either vmctl or vm.conf. The default disk image format is 'raw' if not specified.

Examples of using disk format:

    vmctl start bsd -Lc -r cd64.iso -d qcow2:current.qc2

or

    vmctl start bsd -Lc -r cd64.iso -d raw:current.raw

is equivalent to

    vmctl start bsd -Lc -r cd64.iso -d current.raw

in vm.conf

    vm "current" {
        disable
        memory 2G
        disk "/home/user/vmm/current.qc2" format "qcow2"
        interface {
            switch "external"
        }
    }

or

    vm "current" {
        disable
        memory 2G
        disk "/home/user/vmm/current.raw" format "raw"
        interface {
            switch "external"
        }
    }

is equivalent to

    vm "current" {
        disable
        memory 2G
        disk "/home/user/vmm/current.raw"
        interface {
            switch "external"
        }
    }

Tested by many. Big Thanks to Ori Bernstein.
2018-08-25  Rework disks to have pluggable backends.  ccardenas
This is prep work for adding qcow2 image support. From Ori Bernstein. Many thanks! Tested by many. OK ccardenas@
2018-07-09  vmd(8): deassert interrupt pins in the PIC at the right times. Helps fix  Mike Larkin
situations where vmd gets stuck at 100% cpu usage because the guest VM is constantly trying to ack interrupts that already occurred. tested by phessler on a VM that used to exhibit the issue. ok phessler
2018-07-09  vmd(8): stash device IRQ in the device struct  Mike Larkin
ok kettenis
2018-06-19  knf  Reyk Floeter
2018-04-30  vmd(8): unbreak i386  Mike Larkin
2018-04-26  vmd(8): use #defines for queue indices and cleanup some code  Mike Larkin
ok phessler
2018-04-26  vmd(8): fix broken networking on newer linux guest kernels  Mike Larkin
Linux kernels after about 4.11.x or so exhibited problems with vmd(8)'s virtio implementation. This commit fixes two bugs - a descriptor index problem for the receive queue and a problem where the packet data was being copied into the secondary descriptor buffer (should now be the first descriptor's buffer, since that has enough size now for a non-jumbo frame). Verified on ubuntu 17.10 (linux 4.13.x) and regression tested on a variety of older linux guests and non-linux guests. ok ccardenas, phessler
2018-04-26  spelling error in log message  ccardenas
2018-02-01  vmd: fix vioscsi dump and restore  pd
This unbreaks send / receive. Also tested send / receive for vms with cdrom by booting install62.iso on a vm with a small empty disk, send to file, receive into a new vm and running an install of bsd* and base. ok ccardenas@
2018-01-03  Add initial CD-ROM support to VMD via vioscsi.  ccardenas
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
2017-09-17  vmd: send/recv pci config space instead of recreating pci devices on receive  pd
ok mlarkin@
2017-09-08  vmd: handle queue index wraparound in viornd. Without this, openbsd guests  Mike Larkin
used to crash after roughly 68 hours uptime. ok deraadt
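The commit does not spell out the exact bug, so the following is only a general illustration of wraparound-safe ring arithmetic, with hypothetical names and an assumed queue size. Virtio ring indices are free-running 16-bit counters: they must be reduced to a slot before indexing, and differences must be taken in 16-bit arithmetic so they stay correct across the 65535 to 0 rollover.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_SIZE 128			/* assumed queue size */
#define QUEUE_MASK (QUEUE_SIZE - 1)

static_assert((QUEUE_SIZE & QUEUE_MASK) == 0, "queue size must be 2^n");

/* Map a free-running ring index to a slot in the ring. */
static uint16_t
ring_slot(uint16_t idx)
{
	return idx & QUEUE_MASK;
}

/* Entries the driver has added since we last looked, wraparound-safe. */
static uint16_t
ring_pending(uint16_t avail_idx, uint16_t last_seen)
{
	return (uint16_t)(avail_idx - last_seen);
}
```

A comparison such as `avail_idx > last_seen` gives the wrong answer once the counter wraps, which is the kind of failure that only shows up after many hours of uptime.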
2017-08-20  vmd: return VIRTIO_BLK_S_UNSUPP on any unknown vioblk command, not just  Mike Larkin
VIRTIO_BLK_T_GET_ID. suggested by sf@
2017-08-10  whitespace  Mike Larkin
2017-08-05  vmd: report queue size of 0 when invalid queues are requested by the guest  Mike Larkin
Diff supplied by Nick Owens, who was kind enough to also point out the virtio spec section numbers that defined this behaviour.
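In sketch form, the behaviour amounts to a bounds check on the queue selector. The constants and the function name here are assumptions for illustration (a device with two queues of 128 entries), not values read from the diff.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_QUEUES 2	/* assumed: e.g. an rx and a tx queue */
#define QUEUE_SIZE 128	/* assumed queue size */

/* Value to return when the guest reads the queue size register. */
static uint16_t
queue_size_for(uint16_t queue_select)
{
	/* A size of 0 tells the driver that the selected queue does not exist. */
	return queue_select < NUM_QUEUES ? QUEUE_SIZE : 0;
}
```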
2017-05-30  split vioblk read/write functions into start and finish as prep for  Ted Unangst
async io operations. ok mlarkin
2017-05-30  increase vmd(8) virtio queue size from 64 to 128. Also fix an old  Mike Larkin
copypaste bug that didn't hurt us as long as all the queue sizes were the same, which was the case up to now. suggested by sf@, ok krw@