Under heavy concurrent rename + change-notification load a volume can
deadlock permanently: all renames (exclusive) and opens (shared) on the
volume block, freezing the mount.
FspFileSystemNotifyBegin (FspVolumeNotifyLock) acquires the per-volume
FileRenameResource shared via an owner pointer (&VolumeNotifyCount) and
holds it for the whole Begin/End session. If a rename queues as an
exclusive waiter mid-session, the asynchronous FspVolumeNotifyWork then
re-acquires the same resource shared with ExAcquireResourceSharedLite.
Due to ERESOURCE writer-priority that shared acquire blocks behind the
queued exclusive waiter (the worker thread is not the owner -- the owner
is the &VolumeNotifyCount pointer). But that work item is the one that
must process FspFileSystemNotifyEnd to drop VolumeNotifyCount to 0 and
release the session, so it can never run: the session lock is never
released and the rename waits forever, while VolumeNotifyCount runs away
as Begin keeps incrementing it.
Acquire the rename resource in FspVolumeNotifyWork with
ExAcquireSharedStarveExclusive instead. The enclosing Begin/End session
already holds the resource shared and already defers renames until End,
so granting this redundant shared acquire ahead of the queued exclusive
waiter preserves name-stability semantics while breaking the deadlock. A
real exclusive holder still blocks the starve-exclusive acquire, so
correctness is unchanged.
If we have an fsvrt device, mount it via opening the volume.
This ensures that the fsvrt is mounted by the correct fsvol
device early on and remedies a rare case where NTFS crashes
the system when it attempts to mount our fsvrt.
Reference an fsvol device at CREATE time and dereference at CLOSE time,
to ensure that fsvol remains around for DeviceIoControl operations done
after CLEANUP.
The original WinFsp protocol for shutting down a file system was to issue
an FSP_FSCTL_STOP control code to the fsctl device. This would set the IOQ
to the "stopped" state and would also cancel all active IRP's. Cancelation
of IRP's would sometimes free buffers that may have still been in use by
the user mode file system threads; hence access violation.
To fix this problem a new control code FSP_FSCTL_STOP0 is introduced. The
new file system shutdown protocol is backwards compatible with the original
one and works as follows:
- First the file system process issues an FSP_FSCTL_STOP0 control code which
sets the IOQ to the "stopped" state but does NOT cancel IRP's.
- Then the file system process waits for its dispatcher threads to complete
(see FspFileSystemStopDispatcher).
- Finally the file system process issues an FSP_FSCTL_STOP control code
which stops the (already stopped) IOQ and cancels all IRP's.
- This commit introduces the fsmup device, which is a major change in how
network file systems are handled. Previously every network file system's
fsvol device was directly registered with the MUP. Now there is a single
fsmup device that is registered with the MUP; network file systems' fsvol
devices register with fsmup instead. The fsmup device maintains a prefix
table which it uses to demultiplex and forward requests to the appropriate
fsvol device.
- This device change was necessatitated to fix issue #87.