Added InterlockedDecrement in the error path when GetPoolBuffer fails for EncryptedIoRequest to ensure accurate tracking of pending IO requests and prevent potential resource leaks.
Major changes:
- Added pooled + elastic work item model with retry/backoff (MAX_WI_RETRIES). Removed semaphore usage.
- Introduced two completion threads to reduce contention and latency under heavy IO.
- Added BytesCompleted (per IRP) and ActualBytes (per fragment) for correct short read/write accounting. Total read/write stats now reflect the bytes actually transferred instead of the requested length.
- Moved decryption of read fragments into the IO thread. Completion threads now only finalize IRPs (reduces the race window and simplifies the flow).
- Deferred final IRP completion via FinalizeOriginalIrp to avoid inline IoCompleteRequest re-entrancy. Added a safe OOM inline fallback.
- Implemented work item pool drain & orderly shutdown (ActiveWorkItems + NoActiveWorkItemsEvent) with robust stop protocol.
- Replaced semaphore-based work item acquisition with spin lock + free list + event (WorkItemAvailableEvent). Added exponential backoff for transient exhaustion.
- Added elastic (on-demand) work item allocation with pool vs dynamic origin tracking (FromPool).
- Added FreeCompletionWorkItemPool() for symmetric cleanup; ensured all threads are explicitly awakened during stop.
- Added second completion thread replacing single CompletionThread.
- Hardened UpdateBuffer: fixed parameter name typo, added bounds/overflow checks using IntSafe (ULongLongAdd), validated Count, guarded sector end computation.
- Fixed GPT/system region write protection logic to pass correct length instead of end offset.
- Ensured ASSERTs use fragment-relative bounds (cast + length) and avoided mixed 64/32 comparisons.
- Added MAX_WI_RETRIES constant. Added WiRetryCount field in EncryptedIoRequest.
- Ensured RemoveLock is released only after all queue/accounting updates (OnItemCompleted).
- Reset/read-ahead logic preserved. The read-ahead trigger is now based on actual completion and a zero pending fragment count.
- General refactoring, clearer separation of concerns (TryAcquireCompletionWorkItem / FinalizeOriginalIrp / HandleCompleteOriginalIrp).
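The retry/backoff behavior listed above can be illustrated with a minimal sketch in portable C. It is not the driver code: the value given to MAX_WI_RETRIES, the constants BASE_DELAY_US and MAX_DELAY_US, and the helpers wi_should_retry and backoff_delay_us are all assumptions made for illustration. Only the names MAX_WI_RETRIES and WiRetryCount come from this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values: the commit names MAX_WI_RETRIES but not its value. */
#define MAX_WI_RETRIES 8u
#define BASE_DELAY_US  100u    /* assumed first retry delay  */
#define MAX_DELAY_US   10000u  /* assumed upper bound on any delay */

/* True while a request (tracked by something like WiRetryCount) may retry. */
static bool wi_should_retry(uint32_t retry_count)
{
    return retry_count < MAX_WI_RETRIES;
}

/* Delay before retry number `attempt` (0-based): doubles on every
 * attempt and is clamped to MAX_DELAY_US. */
static uint32_t backoff_delay_us(uint32_t attempt)
{
    uint32_t delay = BASE_DELAY_US;

    /* Stop doubling once the cap is reached, so the value cannot overflow. */
    while (attempt-- > 0 && delay < MAX_DELAY_US)
        delay *= 2;

    return delay > MAX_DELAY_US ? MAX_DELAY_US : delay;
}
```

In this sketch a caller that fails to obtain a work item would wait backoff_delay_us(retry_count), increment its retry count, and try again for as long as wi_should_retry() holds. Once it stops holding, the caller would move on to whatever fallback path exists.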
Safety / correctness improvements:
- Accurate short read handling (STATUS_END_OF_FILE with true byte count).
- Eliminated risk of double free or premature RemoveLock release on completion paths.
- Prevented potential overflow in sector end arithmetic.
- Reduced contention and potential deadlock scenarios present with previous semaphore wait path.
- Made the maximum work items count configurable to allow flexibility based on system needs.
- Increased the default value of max work items count to 1024 to better handle high-throughput scenarios.
- Queue write IRPs in system worker thread to avoid potential deadlocks in write scenarios.
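The overflow guard on sector end arithmetic can be sketched as follows. This is a portable stand-in, not the UpdateBuffer code: checked_add_u64 plays the role that ULongLongAdd from IntSafe plays in the driver, and range_within with its parameters is a hypothetical helper introduced only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable stand-in for an IntSafe-style checked addition:
 * returns false instead of wrapping around on overflow. */
static bool checked_add_u64(uint64_t a, uint64_t b, uint64_t *out)
{
    if (a > UINT64_MAX - b)
        return false;
    *out = a + b;
    return true;
}

/* Hypothetical helper: true only when [offset, offset + length) is a
 * non-empty range that lies entirely inside a region of region_size bytes. */
static bool range_within(uint64_t offset, uint64_t length, uint64_t region_size)
{
    uint64_t end;

    if (length == 0)
        return false;                       /* reject empty requests      */
    if (!checked_add_u64(offset, length, &end))
        return false;                       /* offset + length would wrap */
    return end <= region_size;
}
```

Without the checked addition, an offset close to the 64-bit maximum would wrap to a small end value and pass a naive `end <= region_size` comparison.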
Reduce the critical section protected by the spin lock so that it covers only the list manipulation. Move the ActiveWorkItems counter decrement outside the spin lock using InterlockedDecrement, and perform event signaling after the lock is released.
This change minimizes time spent at raised IRQL (DISPATCH_LEVEL) and reduces potential for lock contention.
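A user-mode sketch of that locking pattern, using C11 atomics in place of the kernel primitives, is shown below. It is a model under stated assumptions rather than the driver code: wi_pool, pool_init, pool_try_acquire and pool_release are hypothetical names, an atomic_flag stands in for the spin lock, atomic_fetch_sub stands in for InterlockedDecrement, and a plain counter (wakeups) stands in for signaling WorkItemAvailableEvent.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct work_item {
    struct work_item *next;
} work_item;

typedef struct {
    atomic_flag  lock;       /* stand-in for the work item spin lock     */
    work_item   *free_list;  /* stand-in for FreeWorkItemsList           */
    atomic_long  active;     /* stand-in for ActiveWorkItems             */
    atomic_ulong wakeups;    /* counts would-be event signals (assumed)  */
} wi_pool;

static void pool_lock(wi_pool *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire)) { }
}

static void pool_unlock(wi_pool *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

static void pool_init(wi_pool *p, work_item *items, size_t count)
{
    atomic_flag_clear(&p->lock);
    p->free_list = NULL;
    atomic_init(&p->active, 0);
    atomic_init(&p->wakeups, 0);
    for (size_t i = 0; i < count; ++i) {   /* push every item on the list */
        items[i].next = p->free_list;
        p->free_list = &items[i];
    }
}

/* Returns a free item, or NULL when the pool is exhausted. */
static work_item *pool_try_acquire(wi_pool *p)
{
    pool_lock(p);                          /* lock covers the pop only */
    work_item *wi = p->free_list;
    if (wi != NULL)
        p->free_list = wi->next;
    pool_unlock(p);

    if (wi != NULL)
        atomic_fetch_add(&p->active, 1);
    return wi;
}

/* Returns true when this release brought the active count to zero. */
static bool pool_release(wi_pool *p, work_item *wi)
{
    pool_lock(p);                          /* lock covers the push only */
    wi->next = p->free_list;
    p->free_list = wi;
    pool_unlock(p);

    /* Counter update and signaling both happen after the lock is dropped. */
    long remaining = atomic_fetch_sub(&p->active, 1) - 1;
    atomic_fetch_add(&p->wakeups, 1);
    return remaining == 0;
}
```

A true return value from pool_release marks the point where a shutdown path waiting for all work items to finish could be woken. In this model that decision is left to the caller.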
There was a deadlock issue in the driver caused by the CompletionThreadProc function in EncryptedIoQueue.c:
https://sourceforge.net/p/veracrypt/discussion/general/thread/f6e7f623d0/?page=20&limit=25#8362
The driver uses a single thread (CompletionThreadProc) to process IRP completions. When IoCompleteRequest is called within this thread, it can result in new IRPs being generated (e.g., for pagefile operations) that are intercepted by the driver and queued back into the CompletionThreadQueue. Since CompletionThreadProc is the only thread processing this queue and is waiting on IoCompleteRequest, these new IRPs are not handled, leading to a system freeze.
To resolve this issue, the following changes have been made:
Deferred IRP Completion Using Pre-allocated Work Items:
- Introduced a pool of pre-allocated work items (COMPLETE_IRP_WORK_ITEM) to handle IRP completions without causing additional resource allocations that could trigger new IRPs.
- The CompletionThreadProc now queues IRP completions to these work items, which are processed in a different context using IoQueueWorkItem, preventing re-entrant IRPs from blocking the completion thread.
Thread-Safe Work Item Pool Management:
- Implemented a thread-safe mechanism using a semaphore (WorkItemSemaphore), spin lock (WorkItemLock), and a free list (FreeWorkItemsList) to manage the pool of work items.
- Threads acquire and release work items safely, and if all work items are busy, threads wait until one becomes available.
Reference Counting and Improved Stop Handling:
- Added an ActiveWorkItems counter to track the number of active work items.
- Modified EncryptedIoQueueStop to wait for all active work items to complete before proceeding with cleanup, ensuring a clean shutdown.
These changes address the deadlock issue by preventing CompletionThreadProc from being blocked by re-entrant IRPs generated during IoCompleteRequest. By deferring IRP completion to a different context using pre-allocated work items and managing resources properly, we avoid the deadlock and ensure that all IRPs are processed correctly.