    bsd: Initialize physmem variable · 9b72ad47
    Tomasz Grabiec authored
    
    This was the cause of poor ZFS performance in misc-fs-stress test.
    
    Before:
    
     Wrote 168.129 MB in 10.12 s = 16.610 Mb/s
     Wrote 194.688 MB in 10.00 s = 19.469 Mb/s
     Wrote 183.004 MB in 10.06 s = 18.186 Mb/s
     Wrote 167.754 MB in 10.28 s = 16.315 Mb/s
    
    After:
    
     Wrote 636.227 MB in 10.00 s = 63.623 Mb/s
     Wrote 666.979 MB in 10.00 s = 66.696 Mb/s
     Wrote 613.512 MB in 10.00 s = 61.350 Mb/s
     Wrote 573.502 MB in 10.00 s = 57.346 Mb/s
     Wrote 668.607 MB in 10.00 s = 66.857 Mb/s
     Wrote 630.920 MB in 10.00 s = 63.087 Mb/s
    
    The limiting factor turned out to be the ARC cache. A check inside
    arc_tempreserve_space() was forcing the txg to be synced too often
    (once every 400ms). The arc_c variable was only 16M (arc_c_min), which
    allowed only 8M to be written per transaction. arc_c depends on
    kmem_size(), which is based on physmem, which was never initialized.
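
As a rough illustration of the chain described above (not the actual OSv or ZFS code), the sketch below models how an uninitialized physmem pins arc_c at its floor and caps the per-txg write size. All `sketch_*` names, the page size, and the divisor parameter are assumptions made for this example. Only the 16M floor, the 8M per-transaction limit, and the 400ms sync interval come from the observations above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants, used only in this sketch. */
#define SKETCH_PAGE_SIZE 4096ULL        /* assumed page size in bytes */
#define SKETCH_ARC_C_MIN (16ULL << 20)  /* 16M floor (arc_c_min) */

/* Assumed model: kmem_size() is physmem (in pages) times the page size. */
static uint64_t sketch_kmem_size(uint64_t physmem_pages)
{
    return physmem_pages * SKETCH_PAGE_SIZE;
}

/*
 * Assumed model: arc_c is a fraction (1/divisor) of kmem_size(),
 * but never below the floor. The divisor is a hypothetical parameter.
 */
static uint64_t sketch_arc_c(uint64_t physmem_pages, uint64_t divisor)
{
    assert(divisor > 0);
    uint64_t c = sketch_kmem_size(physmem_pages) / divisor;
    return c < SKETCH_ARC_C_MIN ? SKETCH_ARC_C_MIN : c;
}

/* Half of arc_c may be written per txg (16M -> 8M, as observed). */
static uint64_t sketch_txg_write_limit(uint64_t arc_c)
{
    return arc_c / 2;
}

/*
 * Upper bound on write throughput, in bytes per second, if every txg
 * is filled to its limit and one txg is synced every interval_ms.
 */
static uint64_t sketch_throughput_ceiling(uint64_t txg_limit,
                                          uint64_t interval_ms)
{
    assert(interval_ms > 0);
    return txg_limit * 1000 / interval_ms;
}
```

With physmem left at 0, this model gives arc_c = 16M and a per-txg limit of 8M. At one sync every 400ms that caps throughput at about 20 MiB/s, which is roughly in line with the "Before" numbers.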
    
    I would hold off on committing this for now, for several reasons
    that I would like you to consider.
    
    While this improves write throughput, it makes the boot time after make
    much longer. On my disk the boot time increased from 1.5s to 10s.
    This is because ZFS verifies the last 3 txgs upon mount. This patch
    increases the txg size, which results in more data to check on the next
    boot. I'm working on solving this right now.
    
    It is also worth noting that while larger transactions sync less
    often, increasing throughput, they also take longer to sync, increasing
    worst-case latency. In my test the pauses get as high as 3 seconds with
    1G of guest memory.
    
    Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
    Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>