=====================================
Wine synchronization primitive driver
=====================================

This page documents the user-space API for the winesync driver.

winesync is a support driver for emulation of NT synchronization
primitives by the Wine project or other NT emulators. It exists
because implementation in user-space, using existing tools, cannot
simultaneously satisfy performance, correctness, and security
constraints. It is implemented entirely in software, and does not
drive any hardware device.

This interface is meant as a compatibility tool only, and should not
be used for general synchronization. Instead use generic, versatile
interfaces such as futex(2) and poll(2).

Synchronization primitives
==========================

The winesync driver exposes three types of synchronization primitives:
semaphores, mutexes, and events.

A semaphore holds a single volatile 32-bit counter, and a static
32-bit integer denoting the maximum value. It is considered signaled
when the counter is nonzero. The counter is decremented by one when a
wait is satisfied. Both the initial and maximum count are established
when the semaphore is created.

A mutex holds a volatile 32-bit recursion count, and a volatile 32-bit
identifier denoting its owner. A mutex is considered signaled when its
owner is zero (indicating that it is not owned). The recursion count
is incremented when a wait is satisfied, and ownership is set to the
given identifier.

A mutex also holds an internal flag denoting whether its previous
owner has died; such a mutex is said to be inconsistent. Owner death
is not tracked automatically based on thread death, but rather must be
communicated using ``WINESYNC_IOC_KILL_OWNER``. An inconsistent mutex
is inherently considered unowned.

Except for the "unowned" semantics of zero, the actual value of the
owner identifier is not interpreted by the winesync driver at all. The
intended use is to store a thread identifier; however, the winesync
driver does not actually validate that a calling thread provides
consistent or unique identifiers.

An event holds a volatile boolean state denoting whether it is
signaled or not. There are two types of events, auto-reset and
manual-reset. An auto-reset event is designaled when a wait is
satisfied; a manual-reset event is not. The event type is specified
when the event is created.

Unless specified otherwise, all operations on an object are atomic and
totally ordered with respect to other operations on the same object.

Objects are represented by unsigned 32-bit integers.

Char device
===========

The winesync driver creates a single char device ``/dev/winesync``. Each
file description opened on the device represents a unique namespace.
That is, objects created on one open file description are shared
across all its individual descriptors, but are not shared with other
open() calls on the same device. The same file description may be
shared across multiple processes.

ioctl reference
===============

All operations on the device are done through ioctls. There are four
structures used in ioctl calls::

   struct winesync_sem_args {
	__u32 sem;
	__u32 count;
	__u32 max;
   };

   struct winesync_mutex_args {
	__u32 mutex;
	__u32 owner;
	__u32 count;
   };

   struct winesync_event_args {
	__u32 event;
	__u32 signaled;
	__u32 manual;
   };

   struct winesync_wait_args {
	__u64 timeout;
	__u64 objs;
	__u32 count;
	__u32 owner;
	__u32 index;
	__u32 alert;
   };

Depending on the ioctl, members of the structure may be used as input,
output, or not at all. All ioctls return 0 on success.

The ioctls are as follows:

.. c:macro:: WINESYNC_IOC_CREATE_SEM

  Create a semaphore object. Takes a pointer to struct
  :c:type:`winesync_sem_args`, which is used as follows:

  .. list-table::

     * - ``sem``
       - On output, contains the identifier of the created semaphore.
     * - ``count``
       - Initial count of the semaphore.
     * - ``max``
       - Maximum count of the semaphore.

  Fails with ``EINVAL`` if ``count`` is greater than ``max``.

.. c:macro:: WINESYNC_IOC_CREATE_MUTEX

  Create a mutex object. Takes a pointer to struct
  :c:type:`winesync_mutex_args`, which is used as follows:

  .. list-table::

     * - ``mutex``
       - On output, contains the identifier of the created mutex.
     * - ``count``
       - Initial recursion count of the mutex.
     * - ``owner``
       - Initial owner of the mutex.

  ``owner`` and ``count`` must be either both zero or both nonzero;
  otherwise the ioctl fails with ``EINVAL``.

.. c:macro:: WINESYNC_IOC_CREATE_EVENT

  Create an event object. Takes a pointer to struct
  :c:type:`winesync_event_args`, which is used as follows:

  .. list-table::

     * - ``event``
       - On output, contains the identifier of the created event.
     * - ``signaled``
       - If nonzero, the event is initially signaled, otherwise
         nonsignaled.
     * - ``manual``
       - If nonzero, the event is a manual-reset event, otherwise
         auto-reset.

.. c:macro:: WINESYNC_IOC_DELETE

  Delete an object of any type. Takes an input-only pointer to a
  32-bit integer denoting the object to delete.

  Wait ioctls currently in progress are not interrupted, and behave as
  if the object remains valid.

.. c:macro:: WINESYNC_IOC_PUT_SEM

  Post to a semaphore object. Takes a pointer to struct
  :c:type:`winesync_sem_args`, which is used as follows:

  .. list-table::

     * - ``sem``
       - Semaphore object to post to.
     * - ``count``
       - Count to add to the semaphore. On output, contains the
         previous count of the semaphore.
     * - ``max``
       - Not used.

  If adding ``count`` to the semaphore's current count would raise the
  latter past the semaphore's maximum count, the ioctl fails with
  ``EOVERFLOW`` and the semaphore is not affected. If raising the
  semaphore's count causes it to become signaled, eligible threads
  waiting on this semaphore will be woken and the semaphore's count
  decremented appropriately.
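
  The overflow rule above can be illustrated with a small user-space
  model. This is only a sketch of the documented semantics, not the
  driver's code: ``struct model_sem`` and both helpers are hypothetical
  names, and the error code is returned directly rather than through
  ``errno`` as a real ioctl would do.

  .. code-block:: c

     #include <errno.h>
     #include <stdint.h>

     /* Hypothetical model of a semaphore; assumes count <= max. */
     struct model_sem {
         uint32_t count;
         uint32_t max;
     };

     /*
      * Model of WINESYNC_IOC_PUT_SEM: add `add` to the count and store
      * the previous count in *prev. Returns 0 on success, or EOVERFLOW
      * if the maximum would be exceeded, leaving the semaphore untouched.
      */
     static int model_put_sem(struct model_sem *sem, uint32_t add,
                              uint32_t *prev)
     {
         /* Written as a subtraction so the check itself cannot wrap. */
         if (add > sem->max - sem->count)
             return EOVERFLOW;
         *prev = sem->count;
         sem->count += add;
         return 0;
     }

     /* Model of a satisfied wait: decrement a nonzero count by one. */
     static int model_try_acquire_sem(struct model_sem *sem)
     {
         if (sem->count == 0)
             return 0;
         sem->count--;
         return 1;
     }

  With a count of 1 and a maximum of 2, posting 2 is rejected and the
  count stays at 1, while posting 1 succeeds and reports a previous
  count of 1.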

.. c:macro:: WINESYNC_IOC_PUT_MUTEX

  Release a mutex object. Takes a pointer to struct
  :c:type:`winesync_mutex_args`, which is used as follows:

  .. list-table::

     * - ``mutex``
       - Mutex object to release.
     * - ``owner``
       - Mutex owner identifier.
     * - ``count``
       - On output, contains the previous recursion count.

  If ``owner`` is zero, the ioctl fails with ``EINVAL``. If ``owner``
  is not the current owner of the mutex, the ioctl fails with
  ``EPERM``.

  The mutex's count will be decremented by one. If decrementing the
  mutex's count causes it to become zero, the mutex is marked as
  unowned and signaled, and eligible threads waiting on it will be
  woken as appropriate.
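
  The release rules can be sketched as a user-space model. The names
  ``struct model_mutex`` and ``model_put_mutex`` are hypothetical, and
  the model returns the error code directly instead of setting
  ``errno``; it is meant only to illustrate the checks described above.

  .. code-block:: c

     #include <errno.h>
     #include <stdint.h>

     /* Hypothetical model of a mutex. */
     struct model_mutex {
         uint32_t owner;  /* 0 means unowned */
         uint32_t count;  /* recursion count */
     };

     /*
      * Model of WINESYNC_IOC_PUT_MUTEX: release one level of recursion
      * on behalf of `owner`, storing the previous count in *prev.
      */
     static int model_put_mutex(struct model_mutex *m, uint32_t owner,
                                uint32_t *prev)
     {
         if (owner == 0)
             return EINVAL;
         if (m->owner != owner)
             return EPERM;
         *prev = m->count;
         /* Dropping the last recursion level makes the mutex unowned. */
         if (--m->count == 0)
             m->owner = 0;
         return 0;
     }

  A mutex acquired twice by owner 7 therefore needs two releases by
  owner 7 before it becomes unowned; a third release fails with
  ``EPERM`` because 7 no longer matches the (zero) owner.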

.. c:macro:: WINESYNC_IOC_SET_EVENT

  Signal an event object. Takes a pointer to struct
  :c:type:`winesync_event_args`, which is used as follows:

  .. list-table::

     * - ``event``
       - Event object to set.
     * - ``signaled``
       - On output, contains the previous state of the event.
     * - ``manual``
       - Unused.

  Eligible threads will be woken, and auto-reset events will be
  designaled appropriately.

.. c:macro:: WINESYNC_IOC_RESET_EVENT

  Designal an event object. Takes a pointer to struct
  :c:type:`winesync_event_args`, which is used as follows:

  .. list-table::

     * - ``event``
       - Event object to reset.
     * - ``signaled``
       - On output, contains the previous state of the event.
     * - ``manual``
       - Unused.

.. c:macro:: WINESYNC_IOC_PULSE_EVENT

  Wake threads waiting on an event object without leaving it in a
  signaled state. Takes a pointer to struct
  :c:type:`winesync_event_args`, which is used as follows:

  .. list-table::

     * - ``event``
       - Event object to pulse.
     * - ``signaled``
       - On output, contains the previous state of the event.
     * - ``manual``
       - Unused.

  A pulse operation can be thought of as a set followed by a reset,
  performed as a single atomic operation. If two threads are waiting
  on an auto-reset event which is pulsed, only one will be woken. If
  two threads are waiting on a manual-reset event which is pulsed, both
  will be woken. However, in both cases, the event will be unsignaled
  afterwards, and a simultaneous read operation will always report the
  event as unsignaled.
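
  The difference between setting and pulsing can be sketched with a
  simplified model. It assumes every waiter is blocked in an "any" wait
  on this one event only; ``struct model_event`` and the helper names
  are hypothetical and do not reflect how the driver is implemented.

  .. code-block:: c

     #include <stdbool.h>
     #include <stdint.h>

     /* Hypothetical model of an event. */
     struct model_event {
         bool signaled;
         bool manual;
     };

     /*
      * Wake waiters on a signaled event and return how many were woken.
      * A manual-reset event wakes every waiter and stays signaled; an
      * auto-reset event wakes one waiter, which designals the event.
      */
     static uint32_t model_wake(struct model_event *ev, uint32_t waiters)
     {
         if (!ev->signaled || waiters == 0)
             return 0;
         if (ev->manual)
             return waiters;
         ev->signaled = false;
         return 1;
     }

     /* Model of WINESYNC_IOC_SET_EVENT. */
     static uint32_t model_set_event(struct model_event *ev,
                                     uint32_t waiters, bool *prev)
     {
         *prev = ev->signaled;
         ev->signaled = true;
         return model_wake(ev, waiters);
     }

     /* Model of WINESYNC_IOC_PULSE_EVENT: a set followed by a reset. */
     static uint32_t model_pulse_event(struct model_event *ev,
                                       uint32_t waiters, bool *prev)
     {
         uint32_t woken = model_set_event(ev, waiters, prev);

         ev->signaled = false;
         return woken;
     }

  In this model, pulsing with two waiters wakes one of them for an
  auto-reset event and both for a manual-reset event, and the event is
  left unsignaled either way.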

.. c:macro:: WINESYNC_IOC_READ_SEM

  Read the current state of a semaphore object. Takes a pointer to
  struct :c:type:`winesync_sem_args`, which is used as follows:

  .. list-table::

     * - ``sem``
       - Semaphore object to read.
     * - ``count``
       - On output, contains the current count of the semaphore.
     * - ``max``
       - On output, contains the maximum count of the semaphore.

.. c:macro:: WINESYNC_IOC_READ_MUTEX

  Read the current state of a mutex object. Takes a pointer to struct
  :c:type:`winesync_mutex_args`, which is used as follows:

  .. list-table::

     * - ``mutex``
       - Mutex object to read.
     * - ``owner``
       - On output, contains the current owner of the mutex, or zero
         if the mutex is not currently owned.
     * - ``count``
       - On output, contains the current recursion count of the mutex.

  If the mutex is marked as inconsistent, the function fails with
  ``EOWNERDEAD``. In this case, ``count`` and ``owner`` are set to
  zero.

.. c:macro:: WINESYNC_IOC_READ_EVENT

  Read the current state of an event object. Takes a pointer to struct
  :c:type:`winesync_event_args`, which is used as follows:

  .. list-table::

     * - ``event``
       - Event object.
     * - ``signaled``
       - On output, contains the current state of the event.
     * - ``manual``
       - On output, contains 1 if the event is a manual-reset event,
         and 0 otherwise.

.. c:macro:: WINESYNC_IOC_KILL_OWNER

  Mark any mutexes owned by the given owner as unowned and
  inconsistent. Takes an input-only pointer to a 32-bit integer
  denoting the owner. If the owner is zero, the ioctl fails with
  ``EINVAL``.

  For each mutex currently owned by the given owner, eligible threads
  waiting on said mutex will be woken as appropriate (and such waits
  will fail with ``EOWNERDEAD``, as described below).

  The operation as a whole is not atomic; however, the modification of
  each mutex is atomic and totally ordered with respect to other
  operations on the same mutex.
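
  As an illustration, the per-mutex effect can be modelled over a plain
  array. This is a sketch only: ``struct model_mutex``, its
  ``ownerdead`` flag, and ``model_kill_owner`` are hypothetical names.
  Resetting the recursion count to zero is an assumption, chosen to
  match what ``WINESYNC_IOC_READ_MUTEX`` reports for an inconsistent
  mutex.

  .. code-block:: c

     #include <errno.h>
     #include <stdbool.h>
     #include <stddef.h>
     #include <stdint.h>

     /* Hypothetical model of a mutex with its inconsistency flag. */
     struct model_mutex {
         uint32_t owner;
         uint32_t count;
         bool ownerdead;
     };

     /*
      * Model of WINESYNC_IOC_KILL_OWNER over an array of n mutexes.
      * Each mutex is updated independently, mirroring the fact that
      * the operation as a whole is not atomic.
      */
     static int model_kill_owner(struct model_mutex *m, size_t n,
                                 uint32_t owner)
     {
         if (owner == 0)
             return EINVAL;
         for (size_t i = 0; i < n; i++) {
             if (m[i].owner != owner)
                 continue;
             m[i].owner = 0;
             m[i].count = 0;
             m[i].ownerdead = true;
         }
         return 0;
     }

  Mutexes held by other owners are left untouched, so only waits on the
  affected mutexes would observe ``EOWNERDEAD``.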

.. c:macro:: WINESYNC_IOC_WAIT_ANY

  Poll on any of a list of objects, atomically acquiring at most one.
  Takes a pointer to struct :c:type:`winesync_wait_args`, which is
  used as follows:

  .. list-table::

     * - ``timeout``
       - Optional pointer to a 64-bit struct :c:type:`timespec`
         (specified as an integer so that the structure has the same
         size regardless of architecture). The timeout is specified in
         absolute format, as measured against ``CLOCK_MONOTONIC``. If
         the timeout is equal to or earlier than the current time, the
         function returns immediately without sleeping. If ``timeout``
         is zero, i.e. NULL, the function will sleep until an object
         is signaled, and will not fail with ``ETIMEDOUT``.
     * - ``objs``
       - Pointer to an array of ``count`` 32-bit object identifiers
         (specified as an integer so that the structure has the same
         size regardless of architecture). If any identifier is
         invalid, the function fails with ``EINVAL``.
     * - ``count``
       - Number of object identifiers specified in the ``objs`` array.
     * - ``owner``
       - Mutex owner identifier. If any object in ``objs`` is a mutex,
         the ioctl will attempt to acquire that mutex on behalf of
         ``owner``. If ``owner`` is zero, the ioctl fails with
         ``EINVAL``.
     * - ``index``
       - On success, contains the index (into ``objs``) of the object
         which was signaled. If ``alert`` was signaled instead,
         this contains ``count``.
     * - ``alert``
       - Optional event object identifier. If nonzero, this specifies
         an "alert" event object which, if signaled, will terminate
         the wait. If nonzero, the identifier must point to a valid
         event.
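
  A minimal sketch of how a caller might fill these fields follows. It
  uses a local mirror of the structure (``struct wait_args_mirror``,
  with the last member named ``alert`` as in the table above) because
  the uapi header is not assumed to be installed; the helper names are
  hypothetical. It also assumes a platform where ``struct timespec``
  already has 64-bit fields, as on 64-bit Linux.

  .. code-block:: c

     #define _POSIX_C_SOURCE 200809L
     #include <stdint.h>
     #include <string.h>
     #include <time.h>

     /* Local mirror of struct winesync_wait_args (assumed layout). */
     struct wait_args_mirror {
         uint64_t timeout;
         uint64_t objs;
         uint32_t count;
         uint32_t owner;
         uint32_t index;
         uint32_t alert;
     };

     /* Pointers are carried as integers, so the size never varies. */
     _Static_assert(sizeof(struct wait_args_mirror) == 32,
                    "unexpected padding");

     /* Compute an absolute CLOCK_MONOTONIC deadline, rel_ms >= 0. */
     static int deadline_after_ms(struct timespec *ts, long rel_ms)
     {
         if (clock_gettime(CLOCK_MONOTONIC, ts) != 0)
             return -1;
         ts->tv_sec += rel_ms / 1000;
         ts->tv_nsec += (rel_ms % 1000) * 1000000L;
         if (ts->tv_nsec >= 1000000000L) {
             ts->tv_sec++;
             ts->tv_nsec -= 1000000000L;
         }
         return 0;
     }

     /* A NULL deadline becomes 0, meaning "wait without a timeout". */
     static void fill_wait_args(struct wait_args_mirror *args,
                                const struct timespec *deadline,
                                const uint32_t *objs, uint32_t count,
                                uint32_t owner, uint32_t alert)
     {
         memset(args, 0, sizeof(*args));
         args->timeout = (uintptr_t)deadline;
         args->objs = (uintptr_t)objs;
         args->count = count;
         args->owner = owner;
         args->alert = alert;
     }

  Casting through ``uintptr_t`` keeps the conversion well defined on
  both 32-bit and 64-bit builds; the upper half of each ``__u64`` is
  simply zero when pointers are 32 bits wide.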

  This function attempts to acquire one of the given objects. If
  unable to do so, it sleeps until an object becomes signaled,
  subsequently acquiring it, or the timeout expires. In the latter
  case the ioctl fails with ``ETIMEDOUT``. The function only acquires
  one object, even if multiple objects are signaled.

  A semaphore is considered to be signaled if its count is nonzero,
  and is acquired by decrementing its count by one. A mutex is
  considered to be signaled if it is unowned or if its owner matches
  the ``owner`` argument, and is acquired by incrementing its
  recursion count by one and setting its owner to the ``owner``
  argument. An auto-reset event is acquired by designaling it; a
  manual-reset event is not affected by acquisition.

  Acquisition is atomic and totally ordered with respect to other
  operations on the same object. If two wait operations (with
  different ``owner`` identifiers) are queued on the same mutex, only
  one is signaled. If two wait operations are queued on the same
  semaphore, and a value of one is posted to it, only one is signaled.
  The order in which threads are signaled is not specified.

  If an inconsistent mutex is acquired, the ioctl fails with
  ``EOWNERDEAD``. Although this is a failure return, the function may
  otherwise be considered successful. The mutex is marked as owned by
  the given owner (with a recursion count of 1) and as no longer
  inconsistent, and ``index`` is still set to the index of the mutex.
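
  The mutex acquisition rules, including the inconsistent case, can be
  sketched as a user-space model. ``struct model_mutex`` and
  ``model_acquire_mutex`` are hypothetical names, and ``EAGAIN`` is used
  here only as the model's way of saying "not signaled for this owner",
  where the real ioctl would sleep instead.

  .. code-block:: c

     #include <errno.h>
     #include <stdbool.h>
     #include <stdint.h>

     /* Hypothetical model of a mutex with its inconsistency flag. */
     struct model_mutex {
         uint32_t owner;
         uint32_t count;
         bool ownerdead;
     };

     /*
      * Try to acquire the mutex on behalf of `owner`. Returns 0 on a
      * normal acquisition, EOWNERDEAD if the mutex was acquired but had
      * been inconsistent, and EAGAIN if another owner holds it.
      */
     static int model_acquire_mutex(struct model_mutex *m, uint32_t owner)
     {
         if (owner == 0)
             return EINVAL;
         if (m->owner != 0 && m->owner != owner)
             return EAGAIN;
         m->owner = owner;
         m->count++;
         if (m->ownerdead) {
             /* Acquisition clears the inconsistent state. */
             m->ownerdead = false;
             return EOWNERDEAD;
         }
         return 0;
     }

  A caller would treat ``EOWNERDEAD`` as "acquired, but the protected
  state may need recovery" rather than as a failed wait.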

  The ``alert`` argument is an "extra" event which can terminate the
  wait, independently of all other objects. If members of ``objs`` and
  ``alert`` are both simultaneously signaled, a member of ``objs``
  will always be given priority and acquired first. Aside from this,
  for "any" waits, there is no difference between passing an event as
  this parameter, and passing it as an additional object at the end of
  the ``objs`` array. For "all" waits, there is an additional
  difference, as described below.

  It is valid to pass the same object more than once, including by
  passing the same event in the ``objs`` array and in ``alert``. If a
  wakeup occurs due to that object being signaled, ``index`` is set to
  the lowest index corresponding to that object.
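
  The way ``index`` is reported can be sketched as follows. The helper
  name is hypothetical, and scanning from the lowest index is an
  assumption: the text above only guarantees the lowest index for an
  object that appears more than once, and that members of ``objs`` take
  priority over ``alert``.

  .. code-block:: c

     #include <stdbool.h>
     #include <stdint.h>

     /*
      * Given the signaled state of each entry in objs and of the alert
      * event, return the value that would be stored in `index`, or -1
      * if nothing is signaled (where the real ioctl would sleep).
      */
     static int64_t model_wait_any_index(const bool *signaled,
                                         uint32_t count,
                                         bool alert_signaled)
     {
         for (uint32_t i = 0; i < count; i++) {
             if (signaled[i])
                 return i;
         }
         /* objs are checked first, so alert only wins if none match. */
         if (alert_signaled)
             return count;
         return -1;
     }

  Because the same object passed twice has the same signaled state at
  both positions, a first-match scan naturally reports its lowest index.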

  The function may fail with ``EINTR`` if a signal is received.

.. c:macro:: WINESYNC_IOC_WAIT_ALL

  Poll on a list of objects, atomically acquiring all of them. Takes a
  pointer to struct :c:type:`winesync_wait_args`, which is used
  identically to ``WINESYNC_IOC_WAIT_ANY``, except that on success
  ``index`` is always set to zero unless the wait was terminated by
  ``alert``.

  This function attempts to simultaneously acquire all of the given
  objects. If unable to do so, it sleeps until all objects become
  simultaneously signaled, subsequently acquiring them, or the timeout
  expires. In the latter case the ioctl fails with ``ETIMEDOUT`` and
  no objects are modified.

  Objects may become signaled and subsequently designaled (through
  acquisition by other threads) while this thread is sleeping. Only
  once all objects are simultaneously signaled does the ioctl acquire
  them and return. The entire acquisition is atomic and totally
  ordered with respect to other operations on any of the given
  objects.
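
  The all-or-nothing behaviour can be illustrated with a model limited
  to semaphores, where each array element stands for the counter of a
  distinct semaphore. ``model_try_wait_all`` is a hypothetical name, and
  the model ignores locking: in the driver the check and the
  acquisition form one atomic step.

  .. code-block:: c

     #include <stdbool.h>
     #include <stddef.h>
     #include <stdint.h>

     /*
      * Acquire every semaphore only if all of them are signaled.
      * Returns false, modifying nothing, if any counter is zero.
      */
     static bool model_try_wait_all(uint32_t *counts, size_t n)
     {
         for (size_t i = 0; i < n; i++) {
             if (counts[i] == 0)
                 return false;
         }
         for (size_t i = 0; i < n; i++)
             counts[i]--;
         return true;
     }

  With counters ``{1, 0, 2}`` nothing is acquired, even though two of
  the three semaphores are signaled; once the middle one is posted, a
  retry acquires all three at once.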

  If an inconsistent mutex is acquired, the ioctl fails with
  ``EOWNERDEAD``. Similarly to ``WINESYNC_IOC_WAIT_ANY``, all objects
  are nevertheless marked as acquired. Note that if multiple mutex
  objects are specified, there is no way to know which were marked as
  inconsistent.

  As with "any" waits, the ``alert`` argument is an "extra" event
  which can terminate the wait. Critically, however, an "all" wait
  will succeed if all members in ``objs`` are signaled, *or* if
  ``alert`` is signaled. In the latter case ``index`` will be set to
  ``count``. As with "any" waits, if both conditions are filled, the
  former takes priority, and objects in ``objs`` will be acquired.

  Unlike ``WINESYNC_IOC_WAIT_ANY``, it is not valid to pass the same
  object more than once, nor is it valid to pass the same object in
  ``objs`` and in ``alert``. If this is attempted, the function fails
  with ``EINVAL``.
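
  A caller can check this constraint before issuing the ioctl. The
  helper below is a hypothetical sketch of such a check; it treats an
  ``alert`` of zero as "no alert event", as described for
  ``WINESYNC_IOC_WAIT_ANY``.

  .. code-block:: c

     #include <errno.h>
     #include <stdint.h>

     /*
      * Return EINVAL if any identifier appears twice in objs, or if a
      * nonzero alert identifier also appears in objs; otherwise 0.
      */
     static int model_check_wait_all_objs(const uint32_t *objs,
                                          uint32_t count, uint32_t alert)
     {
         for (uint32_t i = 0; i < count; i++) {
             if (alert != 0 && objs[i] == alert)
                 return EINVAL;
             for (uint32_t j = i + 1; j < count; j++) {
                 if (objs[i] == objs[j])
                     return EINVAL;
             }
         }
         return 0;
     }

  The quadratic scan is adequate for the short object lists typical of
  wait calls; a larger list could be sorted first instead.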
