Use spawn_blocking directly to read from files #25

Draft · wants to merge 1 commit into main
Conversation

stephank (Owner)

Code is rough, and I'm not sure this is a good idea. Just playing around with ideas from #24.

This bypasses tokio::fs and uses spawn_blocking directly, inspired by Tokio internals. That lets us skip Tokio's internal buffering and read straight into our own BytesMut.

It looks like, at least on macOS, the OS will fill a buffer of whatever size we give it.

Some relative results on my machine, using the example server with ab -n 1000 -c 10 http://127.0.0.1:3000/search-index.js (a 975 KB file) and tweaking our own BUF_SIZE:

before,   8 KB buffer |  244.64 req/s |  238639.96 KB/s
before,  16 KB buffer |  419.66 req/s |  409371.92 KB/s
after,    8 KB buffer |  274.51 req/s |  267781.17 KB/s
after,   16 KB buffer |  477.92 req/s |  466201.27 KB/s
after,   32 KB buffer |  774.12 req/s |  755135.07 KB/s
after,   64 KB buffer | 1143.46 req/s | 1115419.94 KB/s
after,  128 KB buffer | 1395.01 req/s | 1360802.62 KB/s
after,  256 KB buffer | 1437.07 req/s | 1401834.38 KB/s
after,  512 KB buffer | 1604.41 req/s | 1565070.82 KB/s
after, 1024 KB buffer | 1726.24 req/s | 1683912.61 KB/s

These numbers are pretty noisy, though. What I do find interesting is the apparent ~13% increase between the two runs with 8 KB buffers.

Besides skipping a copy, the other thing I thought might make a difference is the change in open_with_metadata. Going through Tokio internals, we previously made separate spawn_blocking calls for the open and stat steps there, but this PR combines them into one. To rule that out, I split the new code back into two manual spawn_blocking calls and got these numbers:

8 KB buffer |  269.98 req/s |  263361.36 KB/s

So that only accounts for maybe ~1.5% of that 13% gain.

cc @lnicola

lnicola (Contributor) commented Dec 18, 2019

That code seems (IIRC) pretty similar to what tokio does, so I didn't really expect the performance improvement. One workaround suggested in that issue is to use task::block_in_place instead of task::spawn_blocking (or tokio::spawn together with task::block_in_place). Perhaps it's worth a try?


Our testing method is a bit different. I used a large file and a single client at a time.

stephank (Owner, Author)

I guess we should have both benchmarks in code somewhere, but I'll stick with ab for now, because it's what I have. :)

Using block_in_place requires rt-threaded, which totally changes Tokio performance. Performing the same test, starting with current unmodified code in this branch:

  • New baseline for this test:
    224.71 req/s - 219455.55 KB/s
  • With rt-threaded:
    479.57 req/s - 468364.66 KB/s
  • With rt-threaded, and block_in_place in open_with_metadata:
    501.74 req/s - 490019.86 KB/s
  • With rt-threaded, and block_in_place in open_with_metadata and FileBytesStream:
    2546.89 req/s - 2487372.11 KB/s

That seems kind of ridiculous.
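For context, the runtime switch above is just a Cargo feature change. Assuming tokio 0.2-era feature names (matching this thread's date; block_in_place also needs the blocking feature there, if I recall correctly):

```toml
# Cargo.toml (hypothetical fragment, tokio 0.2-era feature names)
[dependencies]
tokio = { version = "0.2", features = ["rt-threaded", "blocking", "fs"] }
```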

I'm not even sure if using block_in_place inside a poll function is safe. This is what I was trying, essentially removing the State enum:

fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context) -> Poll<Option<Self::Item>> {
    // Run the blocking read inline on this worker thread; block_in_place
    // tells the threaded runtime to move other tasks off this worker first.
    match block_in_place(|| {
        // Make sure there are BUF_SIZE bytes of spare capacity to read into.
        self.buf.reserve(BUF_SIZE);
        // bytes_mut() hands back uninitialized capacity; treat it as a
        // plain &mut [u8] for the read call.
        let slice = self.buf.bytes_mut();
        let slice = unsafe {
            std::slice::from_raw_parts_mut(slice.as_mut_ptr() as *mut u8, slice.len())
        };
        (&*self.file).read(slice)
    }) {
        // A read of 0 bytes means EOF: end the stream.
        Ok(0) => Poll::Ready(None),
        Ok(size) => {
            // Mark the bytes the read initialized, then hand them off as an
            // immutable Bytes chunk, leaving self.buf empty but still allocated.
            unsafe { self.buf.advance_mut(size) };
            let retval = self.buf.split().freeze();
            Poll::Ready(Some(Ok(retval)))
        }
        Err(e) => Poll::Ready(Some(Err(e))),
    }
}

lnicola (Contributor) commented Dec 22, 2019

Well, that's a nice improvement, right? :-) Yeah, pity it won't work with the single-threaded runtime. Maybe we should do nothing and wait for tokio to fix the fs performance.

Otherwise it's safe to use AFAICT.
