  1. Aug 11, 2019
  2. Jul 30, 2019
  3. Jul 25, 2019
  4. Jul 16, 2019
  5. Jun 11, 2019
  6. Jun 07, 2019
  7. Jun 06, 2019
  8. Mar 06, 2019
  9. Nov 17, 2018
  10. Jul 13, 2018
  11. Jun 19, 2018
  12. Apr 27, 2018
    • Improve performance of Buf::get_*() (#195) · 51e435b7
      kohensu authored
      The new implementation tries to read the data directly from bytes() (which
      is possible most of the time); when bytes() does not hold enough data, it
      falls back to the previous code: copy the needed bytes into a temporary
      buffer before returning the data (see the sketch at the end of this
      message).
      
      Here the bench results:
                                     Before                After           x-faster
      get_f32::cursor             64 ns/iter (+/- 0)    20 ns/iter (+/- 0)    3.2
      get_f32::tbuf_1             77 ns/iter (+/- 1)    34 ns/iter (+/- 0)    2.3
      get_f32::tbuf_1_costly      87 ns/iter (+/- 0)    62 ns/iter (+/- 0)    1.4
      get_f32::tbuf_2            151 ns/iter (+/- 18)  160 ns/iter (+/- 1)    0.9
      get_f32::tbuf_2_costly     180 ns/iter (+/- 2)   187 ns/iter (+/- 2)    1.0
      
      get_f64::cursor             67 ns/iter (+/- 0)    21 ns/iter (+/- 0)    3.2
      get_f64::tbuf_1             80 ns/iter (+/- 0)    35 ns/iter (+/- 0)    2.3
      get_f64::tbuf_1_costly      82 ns/iter (+/- 3)    60 ns/iter (+/- 0)    1.4
      get_f64::tbuf_2            154 ns/iter (+/- 1)   164 ns/iter (+/- 0)    0.9
      get_f64::tbuf_2_costly     170 ns/iter (+/- 2)   187 ns/iter (+/- 1)    0.9
      
      get_u16::cursor             66 ns/iter (+/- 0)    20 ns/iter (+/- 0)    3.3
      get_u16::tbuf_1             77 ns/iter (+/- 0)    35 ns/iter (+/- 0)    2.2
      get_u16::tbuf_1_costly      85 ns/iter (+/- 2)    62 ns/iter (+/- 0)    1.4
      get_u16::tbuf_2            147 ns/iter (+/- 0)   154 ns/iter (+/- 0)    1.0
      get_u16::tbuf_2_costly     160 ns/iter (+/- 1)   177 ns/iter (+/- 0)    0.9
      
      get_u32::cursor             64 ns/iter (+/- 0)    20 ns/iter (+/- 0)    3.2
      get_u32::tbuf_1             77 ns/iter (+/- 0)    35 ns/iter (+/- 0)    2.2
      get_u32::tbuf_1_costly      91 ns/iter (+/- 2)    63 ns/iter (+/- 0)    1.4
      get_u32::tbuf_2            151 ns/iter (+/- 40)  157 ns/iter (+/- 0)    1.0
      get_u32::tbuf_2_costly     162 ns/iter (+/- 0)   180 ns/iter (+/- 0)    0.9
      
      get_u64::cursor             67 ns/iter (+/- 0)    20 ns/iter (+/- 0)    3.4
      get_u64::tbuf_1             78 ns/iter (+/- 0)    35 ns/iter (+/- 1)    2.2
      get_u64::tbuf_1_costly      87 ns/iter (+/- 1)    59 ns/iter (+/- 1)    1.5
      get_u64::tbuf_2            154 ns/iter (+/- 0)   160 ns/iter (+/- 0)    1.0
      get_u64::tbuf_2_costly     168 ns/iter (+/- 0)   184 ns/iter (+/- 0)    0.9
      
      get_u8::cursor              64 ns/iter (+/- 0)    19 ns/iter (+/- 0)    3.4
      get_u8::tbuf_1              77 ns/iter (+/- 0)    35 ns/iter (+/- 0)    2.2
      get_u8::tbuf_1_costly       68 ns/iter (+/- 0)    51 ns/iter (+/- 0)    1.3
      get_u8::tbuf_2              85 ns/iter (+/- 0)    43 ns/iter (+/- 0)    2.0
      get_u8::tbuf_2_costly       75 ns/iter (+/- 0)    61 ns/iter (+/- 0)    1.2
      get_u8::option              77 ns/iter (+/- 0)    59 ns/iter (+/- 0)    1.3
      
      The improvement on the basic std::Cursor implementation is clearly visible.
      
      The other implementations are specific to the bench tests and simply wrap a
      static slice. The variants differ as follows:
       - tbuf_1: a single call to 'bytes()' is enough.
       - tbuf_2: two calls to 'bytes()' are needed to read more than one byte.
       - the _costly versions mark 'bytes()', 'remaining()' and 'advance()' with
         #[inline(never)].
      
      The cases that are slightly slower correspond to implementations that are
      not very realistic: a read of more than one byte can never be served by a
      single bytes() call.
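      For illustration, the shape of the change is roughly as follows (a minimal
      sketch, not the crate's actual code; `SliceBuf` is a hypothetical stand-in
      for a `Buf` implementation, where `bytes()` of a real multi-segment buffer
      may expose only part of the remaining data):
      
      ```
      struct SliceBuf<'a> {
          data: &'a [u8],
          pos: usize,
      }
      
      impl<'a> SliceBuf<'a> {
          fn bytes(&self) -> &[u8] {
              &self.data[self.pos..]
          }
          fn advance(&mut self, cnt: usize) {
              self.pos += cnt;
          }
      
          fn get_u16_be(&mut self) -> u16 {
              if self.bytes().len() >= 2 {
                  // Fast path: enough contiguous data, decode in place.
                  let b = self.bytes();
                  let v = u16::from_be_bytes([b[0], b[1]]);
                  self.advance(2);
                  v
              } else {
                  // Slow path (the previous code): gather the needed bytes
                  // into a temporary buffer, then decode.
                  let mut tmp = [0u8; 2];
                  for slot in tmp.iter_mut() {
                      *slot = self.bytes()[0];
                      self.advance(1);
                  }
                  u16::from_be_bytes(tmp)
              }
          }
      }
      
      fn main() {
          let mut buf = SliceBuf { data: &[0x12, 0x34], pos: 0 };
          assert_eq!(buf.get_u16_be(), 0x1234);
      }
      ```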
    • Improve performance of Buf::get_*() (#195) · e4447220
      kohensu authored
      (Commit message identical to 51e435b7 above.)
  13. Mar 12, 2018
    • Fix `copy_to_slice` to use correct increment var · ebe52273
      Carl Lerche authored
      This patch fixes `copy_to_slice` to increment the offset by the correct
      variable. The incorrect code did not actually produce incorrect behavior:
      the only iteration where `cnt != src.len()` is the final one, and since
      `src.len()` is greater than `cnt` there, `off` was incremented by too
      much, which still caused the `off < dst.len()` loop condition to fail and
      the loop to exit.
      
      The only real danger was that incrementing by `src.len()` could overflow.
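      A minimal sketch of the loop in question (an assumed shape, not the
      crate's exact code) may make this clearer; the hypothetical `chunks`
      parameter stands in for successive `bytes()`/`advance()` calls on a Buf:
      
      ```
      fn copy_to_slice(mut chunks: &[&[u8]], dst: &mut [u8]) {
          let mut off = 0;
          while off < dst.len() {
              let src = chunks[0];
              chunks = &chunks[1..]; // stands in for self.advance(..)
              let cnt = std::cmp::min(src.len(), dst.len() - off);
              dst[off..off + cnt].copy_from_slice(&src[..cnt]);
              off += cnt; // the fix: previously this added src.len()
          }
      }
      
      fn main() {
          let mut dst = [0u8; 4];
          // The final chunk is longer than the space left in dst, so the
          // last iteration has cnt != src.len(); either increment exits the
          // loop, which is why the bug never produced wrong output.
          copy_to_slice(&[&[1, 2], &[3, 4, 5]], &mut dst);
          assert_eq!(dst, [1, 2, 3, 4]);
      }
      ```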
    • Remove ByteOrder generic methods from Buf and BufMut (#187) · 025d5334
      Sean McArthur authored
      * make Buf and BufMut usable as trait objects
      
      - All the `get_*` and `put_*` methods that take `T: ByteOrder` have
        a `where Self: Sized` bound added, so that they are only usable from
        sized types (see the sketch after this message). It was impossible to
        make `Buf` or `BufMut` into trait objects before, so this change
        doesn't break anyone.
      - Add `get_n_be`/`get_n_le`/`put_n_be`/`put_n_le` methods that can be
        used on trait objects.
      - Deprecate the export of `ByteOrder` and methods generic on it.
      
      * remove deprecated ByteOrder methods
      
      Removes the `_be` suffix from all methods, making network byte order
      (big-endian) the implied default.
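      For illustration, the `where Self: Sized` technique works roughly like
      this (a minimal sketch with a hypothetical `ReadBuf` trait, not the
      crate's actual API):
      
      ```
      trait Endianness { const BIG: bool; }
      struct BE;
      impl Endianness for BE { const BIG: bool = true; }
      
      trait ReadBuf {
          // Object-safe: this method goes in the vtable.
          fn get_u8(&mut self) -> u8;
      
          // Generic methods cannot go in a vtable; bounding them on
          // `Self: Sized` opts them out of `dyn ReadBuf` instead of making
          // the whole trait unusable as a trait object.
          fn get_u16<E: Endianness>(&mut self) -> u16
          where
              Self: Sized,
          {
              let a = self.get_u8() as u16;
              let b = self.get_u8() as u16;
              if E::BIG { (a << 8) | b } else { (b << 8) | a }
          }
      }
      
      struct Slice<'a>(&'a [u8]);
      impl<'a> ReadBuf for Slice<'a> {
          fn get_u8(&mut self) -> u8 {
              let (first, rest) = self.0.split_first().unwrap();
              self.0 = rest;
              *first
          }
      }
      
      fn main() {
          // Usable as a trait object now:
          let mut s = Slice(&[0x12, 0x34]);
          let obj: &mut dyn ReadBuf = &mut s;
          assert_eq!(obj.get_u8(), 0x12);
      
          // And still generic for sized types:
          let mut s2 = Slice(&[0x12, 0x34]);
          assert_eq!(s2.get_u16::<BE>(), 0x1234);
      }
      ```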
    • Make Buf and BufMut usable as trait objects (#186) · ce79f0a2
      Sean McArthur authored
      - All the `get_*` and `put_*` methods that take `T: ByteOrder` have
        a `where Self: Sized` bound added, so that they are only usable from
        sized types. It was impossible to make `Buf` or `BufMut` into trait
        objects before, so this change doesn't break anyone.
      - Add `get_n_be`/`get_n_le`/`put_n_be`/`put_n_le` methods that can be
        used on trait objects.
      - Deprecate the export of `ByteOrder` and methods generic on it.
      
      Fixes #163 
  14. Jan 26, 2018
  15. Jun 27, 2017
  16. May 24, 2017
  17. Apr 30, 2017
  18. Mar 19, 2017
    • Clarify when `BufMut::bytes_mut` can return &[] · bed128b2
      Carl Lerche authored
      Closes #79
    • Add inline attributes to Vec's MutBuf methods (#80) · 5a265cc8
      Dan Burkert authored
      I found this significantly improved a
      [benchmark](https://gist.github.com/danburkert/34a7d6680d97bc86dca7f396eb8d0abf)
      which calls `bytes_mut`, writes 1 byte, and advances the pointer with
      `advance_mut` in a pretty tight loop. In particular, it seems to be the
      inline annotation on `bytes_mut` which had the most effect. I also took
      the opportunity to simplify the bounds checking in `advance_mut`.
      
      before:
      
      ```
      test encode_varint_small  ... bench:         540 ns/iter (+/- 85) = 1481 MB/s
      ```
      
      after:
      
      ```
      test encode_varint_small  ... bench:         422 ns/iter (+/- 24) = 1895 MB/s
      ```
      
      As you can see, the variance is also significantly improved.
      
      Interestingly, I tried to change the last statement in `bytes_mut` from
      
      ```
      &mut slice::from_raw_parts_mut(ptr, cap)[len..]
      ```
      
      to
      
      ```
      slice::from_raw_parts_mut(ptr.offset(len as isize), cap - len)
      ```
      
      but, this caused a very measurable perf regression (almost completely
      negating the gains from marking bytes_mut inline).
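      
      For reference, the access pattern the benchmark exercises looks roughly
      like this (a safe, self-contained sketch with a hypothetical `SimpleBuf`,
      not the crate's `Vec<u8>` implementation):
      
      ```
      struct SimpleBuf {
          data: Vec<u8>,
          len: usize,
      }
      
      impl SimpleBuf {
          #[inline]
          fn bytes_mut(&mut self) -> &mut [u8] {
              &mut self.data[self.len..]
          }
          #[inline]
          fn advance_mut(&mut self, cnt: usize) {
              assert!(self.len + cnt <= self.data.len());
              self.len += cnt;
          }
      }
      
      fn fill(buf: &mut SimpleBuf, byte: u8, n: usize) {
          for _ in 0..n {
              buf.bytes_mut()[0] = byte; // write one byte...
              buf.advance_mut(1);        // ...and advance, in a tight loop
          }
      }
      
      fn main() {
          let mut buf = SimpleBuf { data: vec![0; 16], len: 0 };
          fill(&mut buf, 0xAA, 16);
          assert_eq!(buf.len, 16);
      }
      ```
      
      With the accessors marked #[inline], the per-byte calls can be collapsed
      by the optimizer, which is where the speedup comes from.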
    • Clarify BufMut::advance_mut docs (#78) · 4fe4e942
      Dan Burkert authored
      Also fixes an issue with a line wrap in the middle of an inline code
      block.
  19. Mar 16, 2017
  20. Mar 07, 2017
    • Remove buf::Source in favor of buf::IntoBuf · 06b94c55
      Carl Lerche authored
      The `Source` trait was essentially covering the same case as `IntoBuf`,
      so remove it.
      
      While technically a breaking change, this should not have any impact due
      to:
      
      1) There are no reverse dependencies that currently depend on `bytes`
      2) Source was not supposed to be implemented externally
      3) IntoBuf provides the same implementations as `Source`
      
      Given these points, the change should be safe to apply.
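      
      For illustration, the `IntoBuf` idea has roughly this shape (a minimal
      sketch; the crate's real trait has more impls and bounds its associated
      type on `Buf`):
      
      ```
      use std::io::Cursor;
      
      trait IntoBuf {
          type Buf;
          fn into_buf(self) -> Self::Buf;
      }
      
      // Anything convertible into a Buf can be passed to APIs that
      // previously took a Source.
      impl<'a> IntoBuf for &'a [u8] {
          type Buf = Cursor<&'a [u8]>;
          fn into_buf(self) -> Self::Buf {
              Cursor::new(self)
          }
      }
      
      fn main() {
          let buf = (&b"hello"[..]).into_buf();
          assert_eq!(buf.get_ref().len(), 5);
      }
      ```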
    • Provide Debug impls for all types · d70f575a
      Carl Lerche authored
  21. Mar 02, 2017
  22. Mar 01, 2017
  23. Feb 28, 2017
  24. Feb 17, 2017