{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Performance Testing Analysis\n",
    "\n",
    "This notebook describes the methodology for analyzing WinFsp performance."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Collection\n",
    "\n",
    "Performance data is collected by running the script `run-all-perf-tests.bat`. This script runs a variety of performance tests against the NTFS, MEMFS and NTPTFS file systems. The tests are run a number of times (default: 3) and the results are saved in CSV files named `ntfs-N.csv`, `memfs-N.csv` and `ntptfs-N.csv`, where `N` is the test run number."
   ]
  },
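  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The CSV files are headerless; judging from how they are loaded below, each row holds a test name, an iteration count and a measured time. The following sketch parses rows in this assumed format (the values are illustrative only):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import io\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "# Illustrative rows in the assumed test,iter,time format.\n",
    "sample_csv = \"file_create_test,1000,0.20\\nfile_open_test,1000,0.09\\n\"\n",
    "pd.read_csv(io.StringIO(sample_csv), header=None, names=[\"test\", \"iter\", \"time\"])"
   ]
  },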
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Loading\n",
    "\n",
    "Data is loaded from all CSV files into a single pandas `DataFrame`. The resulting `DataFrame` has columns `test`, `iter`, `ntfs`, `memfs`, `ntptfs`. With multiple test runs there will be multiple time values for each (`test`, `iter`, file system) triple; in this case the smallest time value is entered into the `DataFrame`. The assumption is that even a seemingly idle system has some background activity that affects the results; the smallest value is preferred because it reflects the run with the least interference from other system activity.\n",
    "\n",
    "The resulting `DataFrame` will contain data similar to the following:\n",
    "\n",
    "| test              | iter  |  ntfs  | memfs  | ntptfs |\n",
    "|:------------------|------:|-------:|-------:|-------:|\n",
    "| file_create_test  | 1000  |  0.20  |  0.06  |  0.28  |\n",
    "| file_open_test    | 1000  |  0.09  |  0.05  |  0.22  |\n",
    "| ...               |  ...  |   ...  |   ...  |   ...  |"
   ]
  },
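  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of the minimum-across-runs rule used below (the run values are made up): `DataFrame.combine` with `np.minimum` keeps, position by position, the smaller of the two values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Two hypothetical runs of the same test; only the time column differs.\n",
    "run1 = pd.DataFrame({\"test\": [\"file_create_test\"], \"iter\": [1000], \"time\": [0.20]})\n",
    "run2 = pd.DataFrame({\"test\": [\"file_create_test\"], \"iter\": [1000], \"time\": [0.18]})\n",
    "run1.combine(run2, np.minimum)  # time becomes 0.18"
   ]
  },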
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import glob, os\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "nameord = [\"ntfs\", \"memfs\", \"ntptfs\"]\n",
    "\n",
    "# Group CSV files by file system name: \"ntfs-1.csv\" -> datamap[\"ntfs\"]\n",
    "datamap = {}\n",
    "for f in sorted(glob.iglob(\"*.csv\")):\n",
    "    datamap.setdefault(f.rsplit(\"-\", maxsplit=1)[0], []).append(f)\n",
    "\n",
    "df = None\n",
    "for n in nameord:\n",
    "    # Keep the elementwise minimum across all runs of file system n.\n",
    "    ndf = None\n",
    "    for f in datamap[n]:\n",
    "        df0 = pd.read_csv(f, header=None, names=[\"test\", \"iter\", n])\n",
    "        if ndf is None:\n",
    "            ndf = df0\n",
    "        else:\n",
    "            ndf = ndf.combine(df0, np.minimum)\n",
    "    # Merge the per-file-system results into a single DataFrame.\n",
    "    if df is None:\n",
    "        df = ndf\n",
    "    else:\n",
    "        df = df.merge(ndf, how=\"left\")\n",
    "#df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Analysis\n",
    "\n",
    "For each test a plot is drawn that shows how each file system performs on that test. This allows for easy comparison between the file systems on a per-test basis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Marker glyphs: N, M, P identify ntfs, memfs, ntptfs points in the plots.\n",
    "markermap = { \"ntfs\": r\"$\\mathtt{N}$\", \"memfs\": r\"$\\mathtt{M}$\", \"ntptfs\": r\"$\\mathtt{P}$\"}\n",
    "for t, tdf in df.groupby(\"test\", sort=False):\n",
    "    plt.figure(figsize=(10,8), dpi=100, facecolor=\"white\")\n",
    "    plt.title(t)\n",
    "    # File tests iterate over file counts, so label the x axis accordingly.\n",
    "    xlabel = \"iter\"\n",
    "    if t.startswith(\"file_\"):\n",
    "        xlabel = \"files\"\n",
    "    for n in nameord:\n",
    "        tdf.plot(ax=plt.gca(), x=\"iter\", xlabel=xlabel, y=n, ylabel=\"time\", marker=markermap[n], ms=8)\n",
    "    plt.legend(nameord)\n",
    "    plt.savefig(t + \".png\")\n",
    "    #plt.show()\n",
    "    plt.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### File tests\n",
    "\n",
    "File tests are tests that are performed against the hierarchical path namespace of a file system. Such tests include `file_create_test`, `file_open_test`, etc. Measured times for these tests are normalized against the `ntfs` time (so that the `ntfs` time value becomes 1) and a single aggregate plot is produced. For example, with the sample `file_create_test` values above (`ntfs` 0.20, `memfs` 0.06, `ntptfs` 0.28), the normalized values are 1.00, 0.30 and 1.40.\n",
    "\n",
    "This allows for easy comparison between file systems across all file tests."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fileord = [\"create\", \"open\", \"iter.open\", \"overwrite\", \"list\", \"list_single\", \"delete\"]\n",
    "# Select the iteration counts of interest for the file tests.\n",
    "fdf = pd.concat([df[df.iter == 5000], df[df.iter == 50]])\n",
    "# Shorten test names: \"file_create_test\" -> \"create\"\n",
    "fdf.test = fdf.test.map(lambda x: x.replace(\"file_\", \"\").replace(\"_test\", \"\"))\n",
    "fdf = fdf.set_index(\"test\").loc[fileord]\n",
    "# Normalize against ntfs; ntfs becomes 1.\n",
    "fdf.memfs /= fdf.ntfs; fdf.ntptfs /= fdf.ntfs; fdf.ntfs = 1\n",
    "plt.figure(figsize=(10,8), dpi=100, facecolor=\"white\")\n",
    "plt.suptitle(\"File Tests\", fontweight=\"light\", fontsize=20, y=0.95)\n",
    "plt.title(\"(Shorter bars are better)\")\n",
    "fdf.plot.barh(ax=plt.gca(), y=nameord).invert_yaxis()\n",
    "plt.gca().set(ylabel=None)\n",
    "for container in plt.gca().containers:\n",
    "    plt.gca().bar_label(container, fmt=\"%0.2f\", padding=4.0, fontsize=\"xx-small\")\n",
    "plt.savefig(\"file_tests.png\")\n",
    "#plt.show()\n",
    "plt.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Read/write tests\n",
    "\n",
    "Read/write tests are file I/O tests. Such tests include `rdwr_cc_write_page_test`, `rdwr_cc_read_page_test`, etc. As before, measured times for these tests are normalized against the `ntfs` time (so that the `ntfs` time value becomes 1) and a single aggregate plot is produced.\n",
    "\n",
    "This allows for easy comparison between file systems across all read/write tests."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rdwrord = [\"cc_read_page\", \"cc_write_page\", \"nc_read_page\", \"nc_write_page\", \"mmap_read\", \"mmap_write\"]\n",
    "sdf = df[df.iter == 500].copy()\n",
    "# Shorten test names: \"rdwr_cc_read_page_test\" -> \"cc_read_page\"\n",
    "sdf.test = sdf.test.map(lambda x: x.replace(\"rdwr_\", \"\").replace(\"_test\", \"\"))\n",
    "sdf = sdf.set_index(\"test\").loc[rdwrord]\n",
    "# Normalize against ntfs; ntfs becomes 1.\n",
    "sdf.memfs /= sdf.ntfs; sdf.ntptfs /= sdf.ntfs; sdf.ntfs = 1\n",
    "plt.figure(figsize=(10,8), dpi=100, facecolor=\"white\")\n",
    "plt.suptitle(\"Read/Write Tests\", fontweight=\"light\", fontsize=20, y=0.95)\n",
    "plt.title(\"(Shorter bars are better)\")\n",
    "sdf.plot.barh(ax=plt.gca(), y=nameord).invert_yaxis()\n",
    "plt.gca().set(ylabel=None)\n",
    "for container in plt.gca().containers:\n",
    "    plt.gca().bar_label(container, fmt=\"%0.2f\", padding=4.0, fontsize=\"xx-small\")\n",
    "plt.savefig(\"rdwr_tests.png\")\n",
    "#plt.show()\n",
    "plt.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ""
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "78f203ba605732dcd419e55e4a2fc56c1449fc8b262db510a48272adb5557637"
  },
  "kernelspec": {
   "display_name": "Python 3.9.7 64-bit ('base': conda)",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}