Skip to content
This repository has been archived by the owner on May 6, 2024. It is now read-only.

[WIP] Libnethack shared #140

Draft
wants to merge 14 commits into
base: main
Choose a base branch
from
Draft

[WIP] Libnethack shared #140

wants to merge 14 commits into from

Conversation

tscmoo
Copy link
Contributor

@tscmoo tscmoo commented Dec 7, 2020

I'm creating this draft PR as things appear to be fully functional, but there are at minimum some cleanups necessary, and some more functionality I'd like to add.

This introduces a "shared" mode to the libnethack approach. It is Linux-only, and implements manual loading of the shared object file to avoid the need to copy the library anywhere and to better isolate the environment.
It is enabled by default on Linux and will fall back to the existing (dlopen) behavior on other platforms.
I called it shared as the shared object is actually shared between instances. Yey.

Some advantages:

  • never creates or writes to any files on the filesystem. NetHack is tricked into using temporary files in memory when it wants to write to something. Thus it doesn't leak anything even in the event of a crash, and instances are guaranteed not to inferere with each other (not even the leaderboard is shared). chdir is also supressed.
  • Should perform better, though this remains to be tested.
    • Resets should be much faster as they now involve little more than a 130kb memcpy.
    • Copies of the library are still done in memory, but they are copy-on-write mmapped copies - this allows them to share large portions of read-only instruction data, which in turn will be shared in the instruction cache in the CPU.
  • Some functions such as fork, popen, exit will currently throw an error, but my intention is for them to just end the episode. This allows us to continue even if nethack panics or quits for some reason.
  • It will be possible to track memory resources to detect leaks or force cleanup.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 7, 2020
dlpath = os.path.join(self._vardir, "libnethack.so")
shutil.copyfile(DLPATH, dlpath)
# Create a HACKDIR for us.
if os.getenv("SLURM_JOBID"):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This slurm stuff probably doesn't belong here and I'll most likely remove it from this PR

Copy link
Contributor

@heiner heiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I can review this more than superficially.

Would be interested if this releves any lock contention issue the dlopen/dlclose cycles produced?

nle/nethack/nethack.py Outdated Show resolved Hide resolved
os.close(os.open(os.path.join(self._vardir, fn), os.O_CREAT))
os.mkdir(os.path.join(self._vardir, "save"))
if _pynethack.supports_shared():
# "shared" mode does some hacky things to enable using a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mix of 2-letter and 4-letter indentation is weeeiiiirdd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is. I apologize. I come from a world of 2 indentation. Formatting will be cleaned up.

sys/unix/nledl.c Outdated Show resolved Hide resolved
sys/unix/nleshared.cc Outdated Show resolved Hide resolved
sys/unix/nleshared.cc Show resolved Hide resolved
endaddr = 0;
size = 0;

forPh([&](Elf64_Phdr* ph) {;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a very vague notion of what this is doing: It's like a reimplementation of parts of Linux dynamic library loading. It interprets the data in the so in ELF format, creates manual copies ("forks"), and gives access to symbols a la dlsym.

I can't claim I understand much beyond that and perhaps a few comments might be good?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to outline the behavior in comments.

@@ -75,13 +75,13 @@ def _set_env_vars(options, hackdir, wizkit=None):
if wizkit is not None:
os.environ["WIZKIT"] = wizkit

_nhinstances = 0

# TODO: Not thread-safe for many reasons.
# TODO: On Linux, we could use dlmopen to use different linker namespaces,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably update some of these comments when we merge this.

@heiner
Copy link
Contributor

heiner commented Dec 7, 2020

Ah I read the PR description now.

Some functions such as fork, popen, exit will currently throw an error, but my intention is for them to just end the episode. This allows us to continue even if nethack panics or quits for some reason.

If you make them call nethack_exit(int status) that should do it.

As for the not-writing any files approach: I'm not sure to what extend users are currently using the built-in highscore (written to a file, persistent for one instance of NLE). Probably not much, and very probably not enough to warrant doing something different than you are doing here.

If only there were a POSIX way of doing this :P

@tscmoo
Copy link
Contributor Author

tscmoo commented Dec 7, 2020

If you make them call nethack_exit(int status) that should do it.

Great, thanks.

The records (highscore) file could be made persistent between resets, but is there even anything it could be used for?

If only there were a POSIX way of doing this :P

The other approach to replacing these functions would of course be to do it in nethack, through the system specific configs and defines.

sys/unix/nleshared.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@heiner heiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more question:

It will be possible to track memory resources to detect leaks or force cleanup.

How would you do that? Overriding each of malloc, alloc, posix_memalign, etc?

sys/unix/nleshared.cc Outdated Show resolved Hide resolved
@heiner
Copy link
Contributor

heiner commented Dec 8, 2020

The records (highscore) file could be made persistent between resets, but is there even anything it could be used for?

It's not very important; the only effect a persistent highscore has is that the replays contain this relative comparison of the episode that just ended with other episodes. I'm not sure to what extend researchers look at replays and the highscore data in there.

size_t baseaddr = ~0;
size_t endaddr = 0;

forPh([&](Elf64_Phdr* ph) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this is fast enough to not need the ability to break based on e.g. a return value of the lambda?

Copy link
Contributor Author

@tscmoo tscmoo Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code only runs once in the entire lifespan of the process, so... There's really nothing to be gained by making this stuff faster.
(also, it is very fast yeah)

@heiner heiner mentioned this pull request May 6, 2024
5 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants