Under heavy concurrent rename + change-notification load a volume can
deadlock permanently: every rename (exclusive acquire) and every open
(shared acquire) on the volume blocks, freezing the mount.
FspFileSystemNotifyBegin (FspVolumeNotifyLock) acquires the per-volume
FileRenameResource shared via an owner pointer (&VolumeNotifyCount) and
holds it for the whole Begin/End session. If a rename queues as an
exclusive waiter mid-session, the asynchronous FspVolumeNotifyWork then
re-acquires the same resource shared with ExAcquireResourceSharedLite.
Because ERESOURCE gives priority to queued exclusive waiters, that shared
acquire blocks behind the rename (the worker thread is not the owner --
the owner is the &VolumeNotifyCount pointer). But that work item is the
one that must process FspFileSystemNotifyEnd to drop VolumeNotifyCount to
0 and release the session. Since it never gets to run, the session lock
is never released and the rename waits forever, while VolumeNotifyCount
grows without bound as Begin keeps incrementing it.
Acquire the rename resource in FspVolumeNotifyWork with
ExAcquireSharedStarveExclusive instead. The enclosing Begin/End session
already holds the resource shared and already defers renames until End,
so granting this redundant shared acquire ahead of the queued exclusive
waiter preserves name-stability semantics while breaking the deadlock. A
real exclusive holder still blocks the starve-exclusive acquire, so
correctness is unchanged.
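The difference between the two acquire policies can be illustrated with a
toy, single-threaded model. This is only a sketch under stated assumptions:
toy_resource and every toy_* function are hypothetical names invented for
illustration, the model uses non-blocking "try" calls instead of real waits,
and it is not the ERESOURCE implementation or the WinFsp code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reader/writer resource (hypothetical, not ERESOURCE). */
typedef struct {
    int shared_holders;     /* current shared owners */
    bool exclusive_held;    /* true while a writer owns the resource */
    int exclusive_waiters;  /* writers queued behind the shared owners */
} toy_resource;

/* Models a plain shared acquire by a non-owner: a queued exclusive
 * waiter takes priority, so the caller would block (returns false). */
static bool toy_try_shared(toy_resource *r)
{
    if (r->exclusive_held || r->exclusive_waiters > 0)
        return false;
    r->shared_holders++;
    return true;
}

/* Models a starve-exclusive shared acquire: only a real exclusive
 * holder blocks the caller; queued exclusive waiters are bypassed. */
static bool toy_try_shared_starve_exclusive(toy_resource *r)
{
    if (r->exclusive_held)
        return false;
    r->shared_holders++;
    return true;
}

static bool toy_try_exclusive(toy_resource *r)
{
    if (r->exclusive_held || r->shared_holders > 0)
        return false;
    r->exclusive_held = true;
    return true;
}

static void toy_release_shared(toy_resource *r)
{
    assert(r->shared_holders > 0);
    r->shared_holders--;
}

/* Replays the sequence described above. Returns true if the rename
 * eventually gets the resource, false if the worker is stuck. */
static bool toy_scenario(bool starve_exclusive)
{
    toy_resource r = { 0, false, 0 };
    bool worker_ran;

    toy_try_shared(&r);              /* Begin: session holds it shared */
    if (!toy_try_exclusive(&r))      /* rename arrives mid-session ... */
        r.exclusive_waiters++;       /* ... and queues as a waiter */

    worker_ran = starve_exclusive    /* work item re-acquires shared */
        ? toy_try_shared_starve_exclusive(&r)
        : toy_try_shared(&r);
    if (!worker_ran)
        return false;                /* End is never processed */

    toy_release_shared(&r);          /* worker drops its own acquire */
    toy_release_shared(&r);          /* End releases the session */
    r.exclusive_waiters--;
    return toy_try_exclusive(&r);    /* the queued rename proceeds */
}
```

In this model toy_scenario(false) reports the stuck worker, while
toy_scenario(true) lets the worker finish and the rename then acquires the
resource exclusively.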
- Check that the operation succeeded prior to copying to the output buffer.
- Avoid information leaks by only copying what is necessary to the output
buffer (suggestion by Tay Kiat Loong).
The WinFsp "transact" protocol is used by user mode file systems to interface
with the FSD. This protocol works via the DeviceIoControl API and uses the
FSP_IOCTL_TRANSACT control code. The FSP_IOCTL_TRANSACT code is marked as
METHOD_BUFFERED.
When the DeviceIoControl call is forwarded as an IRP, the METHOD_BUFFERED flag
instructs the kernel to copy user mode buffers to kernel mode buffers (and
vice versa). However, when the DeviceIoControl call is forwarded via the
FastIO mechanism, the METHOD_BUFFERED flag is ignored. As a result, once
WinFsp added support for DeviceIoControl FastIO, the FSD started accessing
user mode buffers directly.
A malicious file system could therefore attempt exploits such as changing
or freeing a buffer while the FSD is reading it. Tay Kiat Loong developed a
proof-of-concept (POC) exploit that demonstrated this vulnerability.
This commit fixes the problem by patching FspFastIoDeviceControl to add the
missing METHOD_BUFFERED handling.
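The buffered pattern can be sketched in portable user-mode C. This is an
illustration only, not the actual FspFastIoDeviceControl patch: the names
toy_handler, toy_buffered_call and toy_upper are hypothetical, and a real
kernel implementation would additionally have to validate the user mode
addresses and guard the copies against faults. The sketch copies the input
into a private buffer before any processing, copies output back only when
the operation succeeded, and copies only the bytes actually produced.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler: works in place on the private buffer and
 * reports how many output bytes it produced. Returns 0 on success. */
typedef int (*toy_handler)(unsigned char *sys_buf, size_t in_len,
    size_t out_cap, size_t *out_len);

static int toy_buffered_call(toy_handler handler,
    const void *user_in, size_t in_len,
    void *user_out, size_t out_cap, size_t *bytes_returned)
{
    size_t cap = in_len > out_cap ? in_len : out_cap;
    size_t out_len = 0;
    unsigned char *sys_buf;
    int status;

    assert(handler != NULL);
    *bytes_returned = 0;

    /* One zeroed private buffer serves as both input and output. */
    sys_buf = calloc(cap != 0 ? cap : 1, 1);
    if (sys_buf == NULL)
        return -1;

    /* Capture the input once; later changes to user_in are not seen. */
    if (in_len != 0)
        memcpy(sys_buf, user_in, in_len);

    status = handler(sys_buf, in_len, out_cap, &out_len);
    if (status == 0 && out_len > out_cap)
        status = -1;                 /* handler overstated its output */

    /* Copy out only on success, and only the bytes produced. */
    if (status == 0) {
        if (out_len != 0)
            memcpy(user_out, sys_buf, out_len);
        *bytes_returned = out_len;
    }

    free(sys_buf);
    return status;
}

/* Example handler (hypothetical): upper-cases the input in place. */
static int toy_upper(unsigned char *sys_buf, size_t in_len,
    size_t out_cap, size_t *out_len)
{
    size_t i;

    if (in_len == 0 || in_len > out_cap)
        return -1;
    for (i = 0; i < in_len; i++)
        sys_buf[i] = (unsigned char)toupper(sys_buf[i]);
    *out_len = in_len;
    return 0;
}
```

Because the handler only ever touches the private copy, a caller that
modifies or frees its own buffer during the call cannot change what the
handler reads, and a failed call leaves the caller's output buffer untouched.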
Some third-party filters send us security descriptors in absolute rather
than self-relative format. Handle this case by converting them to
self-relative format ourselves.