Metadata caching #34

asomers · 2024-07-17T23:07:05Z

The current implementation of fuse-ufs reads file blocks on an as-needed basis, so it uses very little RAM. But an unintended result is that for many real-world workloads, fuse-ufs will read the same blocks from disk over and over. For example, if the user reads an entire large file, resolve_file_block will reread the indirect blocks from disk for each data block in the file. For a file with one level of indirection, that roughly doubles the amount of disk activity relative to an efficient implementation.
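To make the "doubles the disk activity" estimate concrete, here is a rough counting model. It is only a sketch: `disk_reads` and its parameters are hypothetical names, not part of fuse-ufs, and it assumes every data block is reached through exactly one level of indirection.

```rust
/// Count the block reads needed to read `nblocks` data blocks sequentially,
/// assuming each one is reached through a single indirect block.
/// `ptrs_per_indirect` is how many block pointers fit in one indirect block
/// (an assumed parameter; the real value depends on the file system's
/// block size and pointer width).
fn disk_reads(nblocks: u64, ptrs_per_indirect: u64, cache_indirect: bool) -> u64 {
    let mut reads = 0;
    // Index of the indirect block currently held in RAM, if any.
    let mut cached: Option<u64> = None;
    for blk in 0..nblocks {
        let indirect = blk / ptrs_per_indirect;
        if !cache_indirect || cached != Some(indirect) {
            reads += 1; // fetch the indirect block from disk
            cached = Some(indirect);
        }
        reads += 1; // fetch the data block itself
    }
    reads
}
```

With 4096 data blocks and an assumed 4096 pointers per indirect block, `disk_reads(4096, 4096, false)` gives 8192 reads, while `disk_reads(4096, 4096, true)` gives 4097: one read of the indirect block plus one per data block.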

An efficient implementation would cache metadata in RAM. A full-featured cache might include an LRU or even an ARC to limit the cache's size. But two facts make fuse-ufs's job easier:

The kernel will cache file content itself, above the level of fusefs. So fuse-ufs doesn't need to cache files' direct blocks.
The kernel includes a vnode cache. Under pressure, it will evict vnodes, and when it does the fuse server will get a FUSE_FORGET request.

So an easy way to handle caching would be for fuse-ufs to keep a container of open Inode objects. The lookup method will add objects to that container, and forget will remove them. Then operations like read would cache the inode's indirect blocks within the Inode object itself. For directories, readdir would do the same.
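A minimal sketch of that idea, under some assumptions: `CachedInode`, `InodeCache`, and the `read_disk` callback are hypothetical names for illustration, not fuse-ufs's actual types, and indirect blocks are modeled as plain vectors of block addresses. The lookup count mirrors the `nlookup` value that the FUSE protocol passes to forget.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory inode; not fuse-ufs's actual type.
#[derive(Default)]
struct CachedInode {
    /// Outstanding lookup count, as tracked by the kernel.
    nlookup: u64,
    /// Indirect blocks already read, keyed by on-disk block address.
    indirect: HashMap<u64, Vec<u64>>,
}

/// Container of inodes that the kernel currently references.
#[derive(Default)]
struct InodeCache {
    inodes: HashMap<u64, CachedInode>,
}

impl InodeCache {
    /// Called from `lookup`: start tracking the inode, or bump its count.
    fn lookup(&mut self, ino: u64) {
        self.inodes.entry(ino).or_default().nlookup += 1;
    }

    /// Called from `forget`: drop the inode, and with it all of its cached
    /// indirect blocks, once the kernel holds no more references.
    fn forget(&mut self, ino: u64, nlookup: u64) {
        if let Some(inode) = self.inodes.get_mut(&ino) {
            inode.nlookup = inode.nlookup.saturating_sub(nlookup);
            if inode.nlookup == 0 {
                self.inodes.remove(&ino);
            }
        }
    }

    /// Return the pointers stored in an indirect block, reading from disk
    /// only on a cache miss. `read_disk` stands in for whatever actually
    /// reads and decodes the block. Returns `None` for an untracked inode.
    fn indirect_block<F>(&mut self, ino: u64, addr: u64, read_disk: F) -> Option<&[u64]>
    where
        F: FnOnce(u64) -> Vec<u64>,
    {
        let inode = self.inodes.get_mut(&ino)?;
        let ptrs: &Vec<u64> = inode.indirect.entry(addr).or_insert_with(|| read_disk(addr));
        Some(ptrs.as_slice())
    }
}
```

Because the cached blocks live inside the inode entry, no separate eviction policy is needed in this sketch: memory is released whenever the kernel forgets the inode.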

Here is an example of a benchmark program that computes the read amplification of various operations for a similar fuse file system. It would be easy to adapt to fuse-ufs.
https://github.com/KhaledEmaraDev/xfuse/blob/b6ad7fd32ad4443b5ab4ab684ae0c7639eb31be1/benches/read-amplification.rs

realchonk · 2024-07-18T00:22:55Z

I agree. I have even done something like that before, so it's nothing new to me.

realchonk · 2024-07-19T18:47:11Z

Blocked by #41

asomers added the performance label Jul 17, 2024

realchonk added the blocked (Progress is blocked by another issue/PR) label Jul 19, 2024

asomers mentioned this issue Jul 19, 2024

Implement basic benchmarking #41

Open
