QEMU virtio-fs shared file system daemon
========================================

Synopsis
--------

**virtiofsd** [*OPTIONS*]

Description
-----------

Share a host directory tree with a guest through a virtio-fs device.  This
program is a vhost-user backend that implements the virtio-fs device.  Each
virtio-fs device instance requires its own virtiofsd process.

This program is designed to work with QEMU's ``--device vhost-user-fs-pci``
but should work with any virtual machine monitor (VMM) that supports
vhost-user.  See the Examples section below.

This program must be run as the root user.  The program drops privileges where
possible during startup although it must be able to create and access files
with any uid/gid:

* The ability to invoke syscalls is limited using seccomp(2).
* Linux capabilities(7) are dropped.

In "namespace" sandbox mode the program switches into a new file system
namespace and invokes pivot_root(2) to make the shared directory tree its root.
New pid and net namespaces are also created to isolate the process.

In "chroot" sandbox mode the program invokes chroot(2) to make the shared
directory tree its root. This mode is intended for container environments where
the container runtime has already set up the namespaces and the program does
not have permission to create namespaces itself.

Both sandbox modes prevent "file system escapes" due to symlinks and other file
system objects that might lead to files outside the shared directory.

Options
-------

.. program:: virtiofsd

.. option:: -h, --help

  Print help.

.. option:: -V, --version

  Print version.

.. option:: -d

  Enable debug output.

.. option:: --syslog

  Print log messages to syslog instead of stderr.

.. option:: -o OPTION

  * debug -
    Enable debug output.

  * flock|no_flock -
    Enable/disable flock.  The default is ``no_flock``.

  * modcaps=CAPLIST -
    Modify the list of capabilities allowed; CAPLIST is a colon-separated
    list of capabilities, each preceded by either + or -, e.g.
    ``+sys_admin:-chown``.

  * log_level=LEVEL -
    Print only log messages matching LEVEL or more severe.  LEVEL is one of
    ``err``, ``warn``, ``info``, or ``debug``.  The default is ``info``.

  * posix_lock|no_posix_lock -
    Enable/disable remote POSIX locks.  The default is ``no_posix_lock``.

  * readdirplus|no_readdirplus -
    Enable/disable readdirplus.  The default is ``readdirplus``.

  * sandbox=namespace|chroot -
    Sandbox mode:

    - namespace: Create mount, pid, and net namespaces and pivot_root(2) into
      the shared directory.
    - chroot: chroot(2) into shared directory (use in containers).

    The default is "namespace".

  * source=PATH -
    Share host directory tree located at PATH.  This option is required.

  * timeout=TIMEOUT -
    I/O timeout in seconds.  The default depends on the cache= option.

  * writeback|no_writeback -
    Enable/disable writeback cache. The cache allows the FUSE client to buffer
    and merge write requests.  The default is ``no_writeback``.

  * xattr|no_xattr -
    Enable/disable extended attributes (xattr) on files and directories.  The
    default is ``no_xattr``.

.. option:: --socket-path=PATH

  Listen on vhost-user UNIX domain socket at PATH.

.. option:: --socket-group=GROUP

  Set the vhost-user UNIX domain socket gid to GROUP.

.. option:: --fd=FDNUM

  Accept connections from vhost-user UNIX domain socket file descriptor FDNUM.
  The file descriptor must already be listening for connections.

.. option:: --thread-pool-size=NUM

  Restrict the number of worker threads per request queue to NUM.  The default
  is 64.

.. option:: --cache=none|auto|always

  Select the desired trade-off between coherency and performance.  ``none``
  forbids the FUSE client from caching to achieve best coherency at the cost of
  performance.  ``auto`` behaves similarly to NFS with a 1 second metadata cache
  timeout.  ``always`` sets a long cache lifetime at the expense of coherency.
  The default is ``auto``.

xattr-mapping
-------------

By default the names of xattrs used by the client are passed through to the
server file system.  This can be a problem where either those xattr names are
used by something on the server (e.g. selinux client/server confusion) or if
the virtiofsd is running in a container with restricted privileges where it
cannot access some attributes.

A mapping of xattr names can be made using ``-o xattrmap=mapping`` where the
``mapping`` string consists of a series of rules.

The first matching rule terminates the mapping.
The set of rules must include a terminating rule to match any remaining attributes
at the end.

Each rule consists of a number of fields separated with a separator that is the
first non-whitespace character in the rule.  This separator must then be used
for the whole rule.
White space may be added before and after each rule.

Using ':' as the separator, a rule is of the form:

``:type:scope:key:prepend:``

**scope** is:

- 'client' - match 'key' against an xattr name from the client for
  setxattr/getxattr/removexattr
- 'server' - match 'prepend' against an xattr name from the server
  for listxattr
- 'all' - can be used to make a single rule where both the server
  and client matches are triggered.

**type** is one of:

- 'prefix' - is designed to prepend and strip a prefix;  the modified
  attributes then being passed on to the client/server.

- 'ok' - Causes the rule set to be terminated when a match is found
  while allowing matching xattrs through unchanged.
  It is intended both as a way of explicitly terminating
  the list of rules, and to allow some xattrs to skip following rules.

- 'bad' - If a client tries to use a name matching 'key' it's
  denied using EPERM; when the server passes an attribute
  name matching 'prepend' it's hidden.  In many ways its use is very similar
  to 'ok', as either an explicit terminator or for special handling of certain
  patterns.

**key** is a string tested as a prefix on an attribute name originating
on the client.  It may be empty, in which case a 'client' rule
will always match on client names.

**prepend** is a string tested as a prefix on an attribute name originating
on the server, and used as a new prefix.  It may be empty,
in which case a 'server' rule will always match on all names from
the server.

e.g.:

  ``:prefix:client:trusted.:user.virtiofs.:``

  will match 'trusted.' attributes in client calls and prefix them before
  passing them to the server.

  ``:prefix:server::user.virtiofs.:``

  will strip 'user.virtiofs.' from all server replies.

  ``:prefix:all:trusted.:user.virtiofs.:``

  combines the previous two cases into a single rule.

  ``:ok:client:user.::``

  will allow get/set xattr for 'user.' xattrs and ignore
  following rules.

  ``:ok:server::security.:``

  will pass 'security.' xattrs in listxattr from the server
  and ignore following rules.

  ``:ok:all:::``

  will terminate the rule search passing any remaining attributes
  in both directions.

  ``:bad:server::security.:``

  would hide 'security.' xattrs in listxattr from the server.

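The rule syntax above can be illustrated by a short Python sketch that splits
a mapping string into rules.  It is a simplified model rather than virtiofsd's
own parser: the names ``Rule`` and ``parse_xattrmap`` are invented for
illustration, and only the four-field rule form is handled.

.. code-block:: python

   from typing import List, NamedTuple


   class Rule(NamedTuple):
       type: str
       scope: str
       key: str
       prepend: str


   def parse_xattrmap(mapping: str) -> List[Rule]:
       """Split a mapping string into (type, scope, key, prepend) rules."""
       rules = []
       pos = 0
       while True:
           # White space may appear before and after each rule.
           while pos < len(mapping) and mapping[pos].isspace():
               pos += 1
           if pos == len(mapping):
               return rules
           # The first non-whitespace character is this rule's separator.
           sep = mapping[pos]
           pos += 1
           fields = []
           for _ in range(4):
               end = mapping.find(sep, pos)
               if end < 0:
                   raise ValueError("rule is missing a closing separator")
               fields.append(mapping[pos:end])
               pos = end + 1
           rules.append(Rule(*fields))

For instance, ``parse_xattrmap(":ok:all:::")`` yields a single rule whose key
and prepend fields are both empty.
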
A simpler 'map' type provides a shorter syntax for the common case:

``:map:key:prepend:``

The 'map' type adds a number of separate rules to add **prepend** as a prefix
to the matched **key** (or all attributes if **key** is empty).
There may be at most one 'map' rule and it must be the last rule in the set.

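Going by the two equivalences given in the examples section below, the
expansion of a 'map' rule can be sketched in Python as follows.  This is
inferred from this document rather than taken from the virtiofsd source, and
``expand_map`` is a hypothetical helper name.

.. code-block:: python

   from typing import List, Tuple

   # Each rule is (type, scope, key, prepend).
   Rule = Tuple[str, str, str, str]


   def expand_map(key: str, prepend: str) -> List[Rule]:
       """Return the explicit rules that a 'map' rule stands for."""
       rules: List[Rule] = [("prefix", "all", key, prepend)]
       if key == "":
           # Every attribute is prefixed, so hide anything on the host
           # that does not carry the prefix.
           rules.append(("bad", "all", "", ""))
       else:
           # Hide unprefixed attributes on the host that match the key.
           rules.append(("bad", "server", "", key))
           # Stop the client from using the prefixed names directly.
           rules.append(("bad", "client", prepend, ""))
           # Let all remaining attributes through.
           rules.append(("ok", "all", "", ""))
       return rules

With an empty key the sketch produces two rules; with a non-empty key it
produces four.
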
xattr-mapping Examples
----------------------

1) Prefix all attributes with 'user.virtiofs.'

::

   -o xattrmap=":prefix:all::user.virtiofs.::bad:all:::"


This uses two rules, using : as the field separator;
the first rule prefixes and strips 'user.virtiofs.',
the second rule hides any non-prefixed attributes that
the host set.

This is equivalent to the 'map' rule:

::

   -o xattrmap=":map::user.virtiofs.:"

2) Prefix 'trusted.' attributes, allow others through

::

   -o xattrmap="/prefix/all/trusted./user.virtiofs./
       /bad/server//trusted./
       /bad/client/user.virtiofs.//
       /ok/all///"


Here there are four rules, using / as the field
separator, and also demonstrating that new lines can
be included between rules.
The first rule is the prefixing of 'trusted.' and
stripping of 'user.virtiofs.'.
The second rule hides unprefixed 'trusted.' attributes
on the host.
The third rule stops a guest from explicitly setting
the 'user.virtiofs.' path directly.
Finally, the fourth rule lets all remaining attributes
through.

This is equivalent to the 'map' rule:

::

   -o xattrmap="/map/trusted./user.virtiofs./"

3) Hide 'security.' attributes, and allow everything else

::

   -o xattrmap="/bad/all/security./security./
       /ok/all///"

The first rule combines what could be separate client and server
rules into a single 'all' rule, matching 'security.' in either
client arguments or lists returned from the host.  This stops
the client seeing any 'security.' attributes on the server and
stops it setting any.

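Example 2 can be traced with a small Python model of how an ordered rule list
is applied.  This is a sketch of the semantics described in this document, not
virtiofsd's code: the function names are invented, and raising ``LookupError``
when no rule matches is an assumption made here, since a valid rule set must
end with a terminating rule.

.. code-block:: python

   import errno
   from typing import List, Optional, Tuple

   # Each rule is (type, scope, key, prepend).
   Rule = Tuple[str, str, str, str]


   def map_client_name(rules: List[Rule], name: str) -> str:
       """Map a name from setxattr/getxattr/removexattr to the server name."""
       for rtype, scope, key, prepend in rules:
           if scope in ("client", "all") and name.startswith(key):
               if rtype == "bad":
                   raise PermissionError(errno.EPERM, "xattr name denied", name)
               if rtype == "ok":
                   return name
               if rtype == "prefix":
                   return prepend + name
       raise LookupError("no terminating rule matched " + name)


   def map_server_name(rules: List[Rule], name: str) -> Optional[str]:
       """Map a listxattr name from the server; None means it is hidden."""
       for rtype, scope, key, prepend in rules:
           if scope in ("server", "all") and name.startswith(prepend):
               if rtype == "bad":
                   return None
               if rtype == "ok":
                   return name
               if rtype == "prefix":
                   return name[len(prepend):]
       raise LookupError("no terminating rule matched " + name)


   # The four rules of example 2, in order.
   EXAMPLE_2: List[Rule] = [
       ("prefix", "all", "trusted.", "user.virtiofs."),
       ("bad", "server", "", "trusted."),
       ("bad", "client", "user.virtiofs.", ""),
       ("ok", "all", "", ""),
   ]

Under this model a client name such as ``trusted.foo`` is stored on the server
as ``user.virtiofs.trusted.foo``, while an unprefixed ``trusted.bar`` found on
the host is hidden from the guest.
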
Examples
--------

Export ``/var/lib/fs/vm001/`` on vhost-user UNIX domain socket
``/var/run/vm001-vhost-fs.sock``:

::

  host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o source=/var/lib/fs/vm001
  host# qemu-system-x86_64 \
      -chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \
      -device vhost-user-fs-pci,chardev=char0,tag=myfs \
      -object memory-backend-memfd,id=mem,size=4G,share=on \
      -numa node,memdev=mem \
      ...
  guest# mount -t virtiofs myfs /mnt
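
As a further illustration, the ``--fd`` option could be driven from a small
Python launcher that creates the listening socket itself and hands the
descriptor to virtiofsd.  This is a minimal sketch using only the options
documented above; the helper names are hypothetical, and the launcher is
assumed to run as root on a host where ``virtiofsd`` is on the ``PATH``.

.. code-block:: python

   import socket
   import subprocess
   from typing import List


   def listening_socket(path: str) -> socket.socket:
       """Create a UNIX domain socket that is already listening at path."""
       sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
       sock.bind(path)
       sock.listen(1)
       return sock


   def virtiofsd_argv(fd: int, source: str) -> List[str]:
       """Build a virtiofsd command line from --fd and -o source."""
       return ["virtiofsd", f"--fd={fd}", "-o", f"source={source}"]


   def spawn_virtiofsd(sock: socket.socket, source: str) -> subprocess.Popen:
       """Start virtiofsd with the listening socket inherited by the child."""
       fd = sock.fileno()
       # pass_fds keeps the descriptor open under the same number in the child.
       return subprocess.Popen(virtiofsd_argv(fd, source), pass_fds=[fd])

A caller would create the socket with ``listening_socket()``, pass it to
``spawn_virtiofsd()`` together with the directory to share, and point QEMU's
``-chardev socket`` at the same socket path.
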