|
viornd did not mask the descriptor value in the available ring,
allowing guest values to read past the end of the descriptor table.
While here, change fatal to fatalx because errno is not set.
Reported by Ilja van Sprundel
ok mlarkin@
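A minimal sketch of the masking described above, assuming a power-of-two
queue size. QUEUE_SIZE, QUEUE_MASK and desc_slot() are hypothetical names
chosen for illustration, not identifiers taken from vmd.

```c
#include <stdint.h>

/* Hypothetical queue size; it must be a power of two for the mask to work. */
#define QUEUE_SIZE 64
#define QUEUE_MASK (QUEUE_SIZE - 1)

/*
 * Reduce a guest-supplied descriptor index to a valid slot in the
 * descriptor table, so a hostile value cannot index past the table.
 */
static inline uint16_t
desc_slot(uint16_t guest_idx)
{
	return (guest_idx & QUEUE_MASK);
}
```

Any value the guest places in the ring then maps to a slot in the range
0 to QUEUE_SIZE - 1 before it is used as an array index.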
|
|
Guest can cause out of bounds read with a malformed descriptor. In same
loop, also fix a chunk size calculation.
Reported by Ilja van Sprundel.
ok mlarkin@
|
|
If {c,m}alloc fail, info could be NULL and result in NULL deref.
Reported by Ilja van Sprundel.
ok mlarkin@
|
|
Reported by Ilja van Sprundel.
ok mlarkin@
|
|
Used originally to aid dev. Unneeded.
ok mlarkin@
|
|
Remove legacy state handling on the ns8250 and virtio network devices
originally put in place before using libevent for async device
events. The vcpu thread doesn't need to process device data as it is
handled by the libevent thread.
This has the benefit of simplifying some of the message passing
between threads introduced to the ns8250 uart since both the vcpu
and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@,
weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
|
The original implementation of the virtio network device assumed a
driver would only provide a 2-descriptor chain for receiving packets.
The virtio spec allows for variable length chains and drivers, in
practice, construct them when they use a sufficiently large MTU.
This change lets the device use variable length chains provided by
the driver, thus allowing for drivers to set an MTU up to the
underlying host-side tap(4)'s limit of TUNMRU (16384).
Size limitations are now enforced on both tx and rx-side dropping
anything violating the underlying tap(4) min and max limits.
More work is needed to increase the read(2) buffer in use by vmd
to prevent packet truncation.
OK mlarkin@
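The size enforcement can be sketched roughly as follows. This is an
illustration only: PKT_MIN is an assumed lower bound (a bare Ethernet
header), PKT_MAX mirrors the TUNMRU value quoted above, and
chain_size_ok() is a hypothetical helper that ignores the virtio net
header a real device has to account for.

```c
#include <stddef.h>
#include <stdint.h>

#define PKT_MIN 14      /* assumed minimum: a bare Ethernet header */
#define PKT_MAX 16384   /* TUNMRU, per the commit message */

/*
 * Sum the buffer lengths of a variable-length descriptor chain and
 * report whether the total is an acceptable packet size.
 * Returns 1 if the packet may be passed on, 0 if it must be dropped.
 */
static int
chain_size_ok(const uint32_t *desc_len, size_t ndesc)
{
	uint64_t total = 0;
	size_t i;

	/* Accumulate in 64 bits so large 32-bit lengths do not wrap. */
	for (i = 0; i < ndesc; i++)
		total += desc_len[i];

	return (total >= PKT_MIN && total <= PKT_MAX);
}
```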
|
|
Linux guests like to issue VIRTIO_BLK_T_GET_ID commands in attempts
to read the device serial number. It's not part of the virtio spec,
but has been part of QEMU and Bhyve for multiple years. It will be
landing in the next version of virtio (1.2), so this stubs out
handling for the request type. The added benefit is it helps squelch
log noise from Linux guests.
For now, no serial number is set and the request status is set to
VIRTIO_BLK_S_UNSUPP to tell the driver we don't support it.
While here, swap the response to VIRTIO_BLK_T_FLUSH{,_OUT} to be
also returning VIRTIO_BLK_S_UNSUPP. It's not negotiated nor
implemented. Lastly, add checks for validating the vioblk device
is only reading/writing descriptors with appropriate read/write-only
flags per the virtio spec.
With input from claudio@, OK mlarkin@
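A minimal sketch of the read/write-only flag check, using the flag values
defined by the virtio spec. The helper names are hypothetical and this is
not the code from vioblk.

```c
#include <stdint.h>

/* Descriptor flag values as defined by the virtio specification. */
#define VRING_DESC_F_NEXT	1	/* chain continues in the next field */
#define VRING_DESC_F_WRITE	2	/* buffer is device write-only */

/* The device may only read from descriptors without the WRITE flag. */
static inline int
desc_device_readable(uint16_t flags)
{
	return ((flags & VRING_DESC_F_WRITE) == 0);
}

/* The device may only write to descriptors with the WRITE flag set. */
static inline int
desc_device_writable(uint16_t flags)
{
	return ((flags & VRING_DESC_F_WRITE) != 0);
}
```

A request header would be checked with desc_device_readable() before the
device reads it, and a status byte with desc_device_writable() before the
device writes it.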
|
|
Lots of organic growth over the years led to unnecessary includes
(proc.h everywhere) and odd dependencies between header files. This
cleans things up a bit to help with upcoming cleanup around dhcp
code.
No functional change.
"go for it" mlarkin@
|
|
No need for each case in the switch block to have the same logic
for updating the used ring and writing the state back to the guest.
Move it outside the switch. No functional change.
ok mlarkin@
|
|
A vmd guest can craft invalid virtio descriptor lengths resulting
in reading and writing beyond stack-allocated buffer lengths providing
an escape vector to the host.
Instead of allowing the guest to dictate read/write lengths, this
commit has vmd just use compile-time lengths based on the source
or destination object sizes. For instances where vmd's virtio
implementation can't use this method, such as reading packets from
the vionet device, cap each read with a pre-computed max chunk size.
Reported by Maxime Villard.
Tested with help from Mischa Peters, OK mlarkin@
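The two approaches can be sketched as follows. The struct layout,
MAX_CHUNK value and function names are assumptions made for illustration
and do not come from the vmd source.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CHUNK 2048	/* hypothetical pre-computed cap per read */

/* Hypothetical fixed-size request header. */
struct req_hdr {
	uint32_t type;
	uint32_t reserved;
	uint64_t sector;
};

/*
 * Fixed-size object: the copy length comes from the destination type
 * at compile time, never from a guest-supplied descriptor length.
 */
static void
read_hdr(struct req_hdr *dst, const uint8_t *guest_mem)
{
	memcpy(dst, guest_mem, sizeof(*dst));
}

/*
 * Variable-size data: clamp the guest-supplied length to both the
 * chunk cap and the space left in the host buffer.
 */
static size_t
clamp_len(uint32_t guest_len, size_t remaining)
{
	size_t n = guest_len;

	if (n > MAX_CHUNK)
		n = MAX_CHUNK;
	if (n > remaining)
		n = remaining;
	return (n);
}
```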
|
|
Add protections against guests with bad virtio-{blk,net,scsi}
drivers, specifically avoiding invalid descriptor chains and
invalid vionet packet sizes. This helps prevent possible lockup
of the host vm process due to a spinning device event loop thread.
Also fix an unneeded cast in the vioblk handling in case of invalid
buffer lengths.
OK mlarkin@
|
|
Because dhcpsz was an uninitialized ssize_t, it was possible that a
garbage "packet" would be queued on the receiving end of the virtio
network device.
Change the type to size_t and add proper checks based on it being
greater than zero. Remove the cast of ssize_t to uint64_t that also
caused garbage sizes when dhcpsz was uninitialized and set at runtime
to something < 0.
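A small sketch of why the cast was a problem. The function names are
hypothetical; only the conversion behavior is the point.

```c
#include <stdint.h>
#include <sys/types.h>

/*
 * The old pattern: converting a negative ssize_t to uint64_t yields
 * an enormous unsigned value rather than an error.
 */
static uint64_t
unchecked_size(ssize_t sz)
{
	return ((uint64_t)sz);
}

/*
 * The pattern described above: keep the size unsigned and only
 * queue a packet when the size is greater than zero.
 */
static int
should_queue(size_t sz)
{
	return (sz > 0);
}
```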
|
|
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
|
in a ring bundle.
ok florian
|
|
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and
pause. We now define a paused and unpaused condition so that a call to
pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back
when unpausing. This is because some callbacks call pthread_mutex_lock and if
the vm is paused, it would block, also causing the libevent thread to block.
This would mean that we would not be able to process any IMSGs received from vmm
(parent process) including a message to unpause.
ok mlarkin@
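The paused and unpaused conditions can be sketched with two pthread
condition variables. This is an assumed design for illustration, with
hypothetical names throughout; it is not the vmd implementation and
leaves out the detaching of device events.

```c
#include <pthread.h>
#include <stdbool.h>

struct pause_state {
	pthread_mutex_t	mtx;
	pthread_cond_t	paused_cond;	/* signalled when the vcpu has stopped */
	pthread_cond_t	unpaused_cond;	/* signalled when the vcpu may resume */
	bool		pause_requested;
	bool		vcpu_paused;
};

static void
pause_state_init(struct pause_state *ps)
{
	pthread_mutex_init(&ps->mtx, NULL);
	pthread_cond_init(&ps->paused_cond, NULL);
	pthread_cond_init(&ps->unpaused_cond, NULL);
	ps->pause_requested = false;
	ps->vcpu_paused = false;
}

/* Called by the vcpu thread at a safe point in its run loop. */
static void
vcpu_check_pause(struct pause_state *ps)
{
	pthread_mutex_lock(&ps->mtx);
	if (ps->pause_requested) {
		ps->vcpu_paused = true;
		pthread_cond_signal(&ps->paused_cond);
		while (ps->pause_requested)
			pthread_cond_wait(&ps->unpaused_cond, &ps->mtx);
		ps->vcpu_paused = false;
	}
	pthread_mutex_unlock(&ps->mtx);
}

/* Blocks until the vcpu thread has really reached the paused state. */
static void
request_pause(struct pause_state *ps)
{
	pthread_mutex_lock(&ps->mtx);
	ps->pause_requested = true;
	while (!ps->vcpu_paused)
		pthread_cond_wait(&ps->paused_cond, &ps->mtx);
	pthread_mutex_unlock(&ps->mtx);
}

/* Lets a paused vcpu thread continue. */
static void
request_unpause(struct pause_state *ps)
{
	pthread_mutex_lock(&ps->mtx);
	ps->pause_requested = false;
	pthread_cond_signal(&ps->unpaused_cond);
	pthread_mutex_unlock(&ps->mtx);
}
```

Waiting on a condition instead of sleeping for a fixed interval means
the caller of request_pause() returns only once the vcpu thread has
acknowledged the pause.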
|
|
we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
|
After debugging with ori@, it looks like an event ends up on the wrong
libevent queue, and we end up continually de-queueing and re-queueing the
event. While it's unclear exactly why this happened, a clue
on libevent's github issues page for the same problem pointed us to using
a different event base for the device events. This seems to have unstuck
ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting
new libevent bases for those later. But those events are pretty
frequency.
with help from and ok ori@
|
|
ok reyk, mpi, benno, tb
|
|
|
|
On some recent Linux guests, the virtio network interface is named based
on its PCI slot assignment, eg "enp0s3".
Prior to this change, vmd assigned disks first, meaning if you used a disk
image to install Linux and then removed it after install, the network
interface name would change from "enp0s3" to "enp0s2" (for example). This
broke any autoconfiguration script config files written during the install
and generally led to users just being confused about what was going on.
This change reorders the vmd PCI device assignment to put network
interfaces before disks, as disk devices don't seem to have the same
naming issue. This means the slot for network interfaces won't change.
IMPORTANT NOTE - if you have existing Linux guest VMs, you'll need to
manually fixup your config files (once).
ok ajacoutot, phessler, ccardenas, deraadt@
|
|
include new virtio_pcireg.h header
|
|
currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the
internal dhcp server will pass "auto_install" as boot file to the client and
the boot loader passes the MAC of the first interface to the kernel to indicate
PXE booting. Adding boot order support to SeaBIOS is not yet implemented.
Ok ccardenas@
|
|
This way they are in the appropriate place and code can be shared with vmd.
Ok ori@ mlarkin@ ccardenas@
|
|
The -i option to vmctl create (eg. vmctl create output.qcow2 -i input.img)
lets you create a new image from an input file and convert it if it is a
different format. This allows creating qcow2 images from raw images,
raw from qcow2, or even qcow2 from qcow2 and raw from raw to re-optimize
the disk.
This re-uses Ori's vioqcow2.c from vmd by reaching into it and
compiling it in. The API has been adjusted to be used from both vmctl
and vmd accordingly.
OK mlarkin@
|
|
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.
A limitation of this format is that modifying the base image will
corrupt the derived image.
This change also adds support for creating derived disk images to
vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein
OK mlarkin@ reyk@
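The lookup order for derived images can be sketched as a walk down a
chain of images. The struct and function below are hypothetical and
greatly simplified; they are not taken from vioqcow2.c, which works with
real qcow2 cluster tables on disk.

```c
#include <stddef.h>
#include <stdint.h>

#define NCLUSTERS 4	/* tiny hypothetical image size, in clusters */

struct image {
	const struct image	*base;			/* NULL for a base image */
	const uint8_t		*clusters[NCLUSTERS];	/* NULL if not allocated */
};

/*
 * Start in the derived image and fall back to each base image in
 * turn until some image holds the requested cluster. Returns NULL
 * when no image in the chain has allocated it.
 */
static const uint8_t *
lookup_cluster(const struct image *img, size_t idx)
{
	if (idx >= NCLUSTERS)
		return (NULL);
	for (; img != NULL; img = img->base) {
		if (img->clusters[idx] != NULL)
			return (img->clusters[idx]);
	}
	return (NULL);
}
```

Because the derived image only stores the clusters it has changed, any
later modification of the base image changes what the fallback returns,
which is why modifying the base corrupts the derived image.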
|
|
implicit ok from pd@ since he came up with the same diff
|
|
OK mlarkin@
|
|
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
|
This unbreaks vmctl receive.
ok ccardenas@
|
|
Change "fmt" to "format".
Ok kn@
|
|
While here, minor cleanup on logging.
|
|
Users are able to declare disk images as 'raw' or 'qcow2' using either
vmctl or vm.conf. The default disk image format is 'raw' if not specified.
Examples of using disk format:
vmctl start bsd -Lc -r cd64.iso -d qcow2:current.qc2
or
vmctl start bsd -Lc -r cd64.iso -d raw:current.raw
is equivalent to
vmctl start bsd -Lc -r cd64.iso -d current.raw
in vm.conf
vm "current" {
disable
memory 2G
disk "/home/user/vmm/current.qc2" format "qcow2"
interface { switch "external" }
}
or
vm "current" {
disable
memory 2G
disk "/home/user/vmm/current.raw" format "raw"
interface { switch "external" }
}
is equivalent to
vm "current" {
disable
memory 2G
disk "/home/user/vmm/current.raw"
interface { switch "external" }
}
Tested by many.
Big Thanks to Ori Bernstein.
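The format prefix on the vmctl -d argument shown in the examples above
could be parsed along these lines. This is a sketch under assumptions:
the enum and function names are hypothetical and the real vmctl parsing
may differ.

```c
#include <string.h>

enum disk_fmt { FMT_RAW, FMT_QCOW2 };

/*
 * Split an optional "raw:" or "qcow2:" prefix from a disk argument.
 * *path is set to the file name part; the format defaults to raw
 * when no known prefix is present.
 */
static enum disk_fmt
parse_disk_arg(const char *arg, const char **path)
{
	if (strncmp(arg, "qcow2:", 6) == 0) {
		*path = arg + 6;
		return (FMT_QCOW2);
	}
	if (strncmp(arg, "raw:", 4) == 0) {
		*path = arg + 4;
		return (FMT_RAW);
	}
	*path = arg;
	return (FMT_RAW);
}
```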
|
|
This is prep work for adding qcow2 image support.
From Ori Bernstein. Many thanks!
Tested by many.
OK ccardenas@
|
|
situations where vmd gets stuck at 100% cpu usage because the guest VM
is constantly trying to ack interrupts that already occurred.
tested by phessler on a VM that used to exhibit the issue.
ok phessler
|
|
ok kettenis
|
|
|
|
|
|
ok phessler
|
|
Linux kernels after about 4.11.x or so exhibited problems with vmd(8)'s
virtio implementation. This commit fixes two bugs - a descriptor index
problem for the receive queue and a problem where the packet data was
being copied into the secondary descriptor buffer (should now be the
first descriptor's buffer, since that has enough size now for a non-jumbo
frame). Verified on ubuntu 17.10 (linux 4.13.x) and regression tested
on a variety of older linux guests and non-linux guests.
ok ccardenas, phessler
|
|
|
|
This unbreaks send / receive. Also tested send / receive for vms with cdrom
by booting install62.iso on a vm with a small empty disk, send to file,
receive into a new vm and running an install of bsd* and base.
ok ccardenas@
|
|
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
|
ok mlarkin@
|
|
used to crash after roughly 68 hours uptime.
ok deraadt
|
|
VIRTIO_BLK_T_GET_ID.
suggested by sf@
|
|
|
|
Diff supplied by Nick Owens, who was kind enough to also point out the
virtio spec section numbers that defined this behaviour.
|
|
async io operations. ok mlarkin
|
|
copypaste bug that didn't hurt us as long as all the queue sizes were
the same, which was the case up to now.
suggested by sf@, ok krw@
|