Why libtorrent 2.0's use of memory mapped files was a bad idea #7551
I think it can be salvaged though. Recently described it in #6667 (comment).
Compile 2.0 with
This would still have very poor performance compared to libtorrent 1.2's disk I/O subsystem.
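(For anyone wanting to try it: 2.0 can also be pointed at the fallback backend at runtime rather than at compile time. A minimal sketch, assuming the `posix_disk_io_constructor` hook that libtorrent 2.0 exposes through `session_params`; check the docs for your exact version:)

```cpp
#include <libtorrent/session.hpp>
#include <libtorrent/session_params.hpp>
#include <libtorrent/posix_disk_io.hpp>

int main() {
    lt::session_params params;
    // Replace the default memory-mapped disk backend with the
    // single-threaded POSIX (pread/pwrite) backend.
    params.disk_io_constructor = lt::posix_disk_io_constructor;
    lt::session ses(params);
    // ... add torrents as usual; disk I/O now goes through plain
    // POSIX calls instead of mmap.
}
```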
Those on network shares have ostensibly encountered these issues, among others.
@HanabishiRecca Arvid has been trying to 'salvage' the mmap implementation for literally three years now, but it hasn't been successful. Mind you, a lot of these knobs are Linux-only and do nothing for the other OSes. Ultimately the lesson from the paper is that to have control over I/O behavior in your program you really have to do it yourself in userspace.
Well, yeah. I tried to address excessive memory usage in particular. But Arvid is kinda stubborn in this regard and I doubt we will see "back to the roots" any time soon.
@arvidn the paper outlines the cases of many DBMS projects initially opting for mmap but then switching away when its limitations became clear and they needed control over I/O performance. However your case with libtorrent is unique in that your trajectory is the reverse of many of these projects: you started out with your own buffer pool implementation, managing file I/O in userspace, but then switched to mmap with 2.0. What were the reasons that led you to make this curious decision?
Contributions are welcome!
Windows does have some counterparts, like
I'm working on it. It's not easy to get right and efficient; contributions are welcome:
They are mostly documented here: https://github.com/arvidn/libtorrent/wiki/memory-mapped-I-O

One aspect was that balancing the sizes of the write cache, the read cache, and read-back avoidance (i.e. blocks that will need to be read back from disk in order to compute their piece hash) is not possible to do well in user space. It turns out it's not so easy in kernel space either, though.

Another aspect was the expectation that emerging fast SSDs and persistent memory (DAX) would most likely be accessed much more efficiently via memory mapped files.

The major failure case of mmap (afaict) is network mounted drives, or any FUSE drive. On these, writing a partial page in a memory mapped file becomes very expensive: the kernel has to pull the page over the network, overwrite part of it, and then flush the whole page back again over the network. Preserving the fidelity of exactly which bytes are being written helps tremendously in this scenario.
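(To make the partial-page point concrete, here's a minimal POSIX sketch, illustrative only and not libtorrent code: dirtying a few bytes through a mapping taints the whole page, while pwrite() tells the kernel exactly which bytes changed:)

```cpp
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Write 100 bytes into the middle of the second 4 KiB page, two ways.
void write_via_mmap(int fd, size_t file_size, const char* buf) {
    char* p = static_cast<char*>(
        mmap(nullptr, file_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    // The store faults the page in first (over the network, on a remote
    // mount), and writeback later flushes the entire 4 KiB page, even
    // though only 100 bytes changed.
    std::memcpy(p + 4096 + 1000, buf, 100);
    munmap(p, file_size);
}

void write_via_pwrite(int fd, const char* buf) {
    // The kernel knows exactly which 100 bytes changed, so a network
    // filesystem can transfer just those bytes.
    pwrite(fd, buf, 100, 4096 + 1000);
}
```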
I'm personally fine with POSIX I/O. The OS filesystem cache does quite a good job. (Even prior to LT 2.0 I had the in-client cache disabled anyway.)
I would have helped, but I'm not a C++ guy.
Hmm, as this new implementation is taking a while, why not just copy the disk I/O subsystem from 1.2 wholesale for 2.1, and work on the new implementation afterwards? At first you said the new implementation you're working on wouldn't cache blocks, but more recently you've stated that it would use caching, so it seems to be getting more complex over time. In the interest of pragmatism, wouldn't it be prudent to use 1.2's I/O subsystem for now?
Did you mean to include a link here Arvid? I don't see it :(
Hmm, how far out was this DAX persistent memory for consumer PCs in this idealized future scenario? Just asking because I had never heard of DAX, and if it ever comes to consumer PCs it seems it will take a long time. Maybe you were getting ahead of future hardware developments with the memory-mapped implementation?
It's a pity I don't know C++ :(
Because a lot of other things have changed around it. The 1.2 implementation doesn't fit into 2.0+. Either option is a lot of work, and I don't have a lot of time.
Yes, that was a copy-paste failure. I updated my post.
It seems Intel Optane kind of failed in the market too.
It's never too late to start!
The thing is, most heavy-lifting seeders still use HDDs, simply because of the huge amounts of storage required. I know people seeding tens or even hundreds of terabytes of data, with 10000+ tasks in a single client. And I don't think that will change soon, as SSD space remains significantly more expensive.
My current plan is to only have a store-buffer and rely on the operating system for read cache.
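(In case the term is unfamiliar, a store buffer in this sense might look roughly like the sketch below, with invented names and not the actual libtorrent design: dirty blocks sit in a map until flushed, reads are served from it when possible, and everything else falls through to pread(), where the OS page cache does the read caching:)

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <unistd.h>
#include <utility>
#include <vector>

// Hypothetical sketch of a store buffer: pending writes are held in
// memory until flushed; there is no read cache of our own, the kernel's
// page cache serves repeated reads through pread().
struct StoreBuffer {
    // (piece index, block offset within piece) -> pending block data
    std::map<std::pair<int, int>, std::vector<std::uint8_t>> pending;

    void write(int piece, int block, std::vector<std::uint8_t> data) {
        pending[{piece, block}] = std::move(data);
    }

    bool read(int fd, int piece, int block, off_t file_offset,
              std::uint8_t* out, std::size_t len) {
        if (auto it = pending.find({piece, block}); it != pending.end()) {
            std::copy_n(it->second.data(), len, out);  // still in flight
            return true;
        }
        // Fall through to the file; the OS page cache is the read cache.
        return pread(fd, out, len, file_offset) == static_cast<ssize_t>(len);
    }

    template <class F>  // F maps (piece, block) -> off_t in the file
    void flush(int fd, F to_file_offset) {
        for (auto const& [key, data] : pending)
            pwrite(fd, data.data(), data.size(),
                   to_file_offset(key.first, key.second));
        pending.clear();
    }
};
```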
Persistent memory modules are for Intel servers only; you put them in DIMM slots. Only Intel® Xeon® CPUs have hardware support for PMEM in the memory controller.
So @arvidn this wiki page is mostly about the how, not the why. The first three lines give the goals, but everything that follows is about how the implementation will work. There isn't any detailed reasoning as to why the memory-mapped design was adopted in the first place, with a careful exploration of all the pros and cons. Perhaps this was never done, which would explain all the problems since...
Huh, has your current plan changed from a few months ago? Because in September you wanted to use
Oh, I've tried Arvid, but man is it hard! And modern C++ is a career: all of C++17 and its idioms, plus the other things you use in your codebase like boost.asio, whose documentation is horrendous! Not for newbies at all! As an example of how difficult it is for me, in a November PR someone mentioned adding support for passing

Nonetheless, being the fool that I am - as a small exercise - I tried to code up an assignment to this data type and a corresponding

Mind you, this is a very small idiom of the C++ you use in your code, and in the end I could not grok it enough to make working code out of it, no matter how much I tried. And there are so many much bigger pieces of modern C++ you use, not to mention the boost.asio library, which is quite formidable to grok in and of itself.
A paper published last year goes into the details of why database systems using mmap in lieu of implementing a buffer pool inevitably run into problems from both a performance and a correctness perspective. Of the four specific issues it details, the first - transactional safety - does not apply to libtorrent, since it is not a database in the conventional sense but works with immutable torrent files. But a lot of the reports the paper references are eerily similar to issues reported by people here (e.g. "They also faced other issues when running in containerized environments or on machines without direct-attached storage"; see #7480). And let's not even detail the additional issues this approach runs into on Windows, as reported by people here.
The paper Are You Sure You Want to Use MMAP in Your DBMS? says this in its abstract:
"mmap’s perceived ease of use has seduced DBMS developers for decades as a viable alternative to implementing a buffer pool. There are, however, severe correctness and performance issues with mmap that are not immediately apparent. Such problems make it difficult, if not impossible, to use mmap correctly and efficiently in a modern DBMS. In fact, several popular DBMSs initially used mmap to support larger-than-memory databases but soon encountered these hidden perils, forcing them to switch to managing file I/O themselves after significant engineering costs."
Though the paper has its detractors, namely the creators of the LMDB and RDB projects, which use mmap (see an attempted rebuttal here), none disagree that if you want substantive control over I/O behavior you have to implement it yourself in userspace rather than rely on the OS to do it for you.
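(As a small illustration of the kind of control they mean, plain POSIX and nothing libtorrent-specific: with a file descriptor you can decide exactly when data hits disk and when it leaves the cache, something a mapping only lets you hint at:)

```cpp
#include <fcntl.h>
#include <unistd.h>

// After a piece has been hashed it won't be read again soon, so drop it
// from the page cache explicitly. Through mmap, the closest equivalents
// (msync + madvise) are hints with weaker guarantees, and dirty-page
// writeback timing stays entirely up to the kernel.
void done_with_piece(int fd, off_t offset, off_t len) {
    fdatasync(fd);                                        // make it durable first
    posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);  // then evict
}
```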