This repository has been archived by the owner on May 6, 2024. It is now read-only.

[NO MERGING] code release for NeurIPS 2020 #128

Open · wants to merge 6 commits into base: main
Conversation

alexholdenmiller
Member

This PR will be closed and is only here to highlight the code available within this branch.

Here we release updated code to get results competitive with our NeurIPS 2020 paper.

To be clear, this is not the exact code used for the paper: we made a number of performance improvements to NLE since the original results, dramatically increasing the speed of the environment (which was already one of the fastest-performing environments when the paper was published!).

We also introduced some additional modelling options, including conditioning the model on the in-game messages (i.e. msg.model=lt_cnn) and new ways of observing the environment through different glyph types (i.e. glyph_type=all_cat). These features are now enabled by default, and the resulting model outperforms the models in the paper.

@facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Nov 13, 2020
@aleSuglia

aleSuglia commented Nov 15, 2020

@alexholdenmiller Thanks a lot for releasing the code. This is very useful indeed. Quick question: what would be the best way to extend the evaluation with a different model? I currently have a new model for NetHack, and I'm wondering which codebase I should use to get a fair comparison with your NeurIPS results. In a nutshell: if I want to claim that I'm SOTA on NetHack, would it be enough to run nle/agent/neurips_sweep.sh with my custom model and compare the results I get with the ones in the paper?

I believe at this stage it would be extremely useful to have this documented so that it is crystal clear. Thanks a lot!

@alexholdenmiller
Member Author

alexholdenmiller commented Nov 16, 2020

@aleSuglia thanks for reaching out!

That's reasonable, yes, though if you have the resources you may want to rerun the baseline as well, since a few params have changed vs. the paper (or you can change them back). I'm rerunning experiments using these params here and will publish the results in the README in this folder.

Here are the changed params, which I will also be adding to the README here:

  • disabled reward clipping but enabled reward normalization (this gives more consistent performance across tasks; by normalization I mean ONLY dividing by the running stdev, NOT subtracting the mean, so we preserve the meaning of positive vs. negative reward in this environment)
  • increased the hidden size from 128 to 256 and the embedding size from 32 to 64
  • added a "message model" which feeds the in-game messages through a convolutional model (several other choices are available)
  • instead of embedding each glyph in the observation with its unique ID (there are around 6000), we create an embedding based on several properties: the unique ID, the alphanumeric character used, the color, a special indicator, a group ID, and a sub-ID within the group. This provides a more compositional representation that the model can exploit (e.g. different monsters of the same type may have a similar character but different colors). We set it to "all_cat" in this config, which concatenates sub-embeddings for each of these properties so that they add up to 64 dims (8 for group, 24 for ID, 8 for color, 16 for character, 8 for special)
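To make the last bullet concrete, here is a minimal numpy sketch of concatenating per-property sub-embeddings into a single 64-dim glyph embedding. The vocabulary sizes and names below are illustrative assumptions, not the exact NLE constants; only the dimension split (8 + 24 + 8 + 16 + 8 = 64) comes from the comment above.

```python
import numpy as np

# Dimension split from the comment above: 8 group + 24 id + 8 color
# + 16 char + 8 special = 64 total dims.
DIMS = {"group": 8, "id": 24, "color": 8, "char": 16, "special": 8}
# Vocabulary sizes are illustrative guesses, not the exact NLE constants.
VOCAB = {"group": 32, "id": 6000, "color": 16, "char": 256, "special": 4}

rng = np.random.default_rng(0)
# One lookup table per glyph property (randomly initialized, as in training).
tables = {k: rng.normal(size=(VOCAB[k], DIMS[k])) for k in DIMS}

def embed_glyph(features):
    """features: dict mapping each property name to an integer index.
    Returns the concatenation of the per-property sub-embeddings."""
    return np.concatenate([tables[k][features[k]] for k in DIMS])

# Example: a hypothetical glyph (a 'd'-character monster).
vec = embed_glyph({"group": 3, "id": 1234, "color": 7,
                   "char": ord("d"), "special": 0})
assert vec.shape == (64,)  # sub-embeddings add up to 64 dims
```

The compositional payoff is that glyphs sharing a property (e.g. the same character or color) share part of their embedding, instead of each of the ~6000 IDs getting a fully independent vector.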
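The reward normalization in the first bullet (divide by a running stdev, but do not subtract the mean) can be sketched as below. This is a hand-rolled illustration using Welford's running-variance update; the class and method names are hypothetical, not the actual NLE/torchbeast implementation.

```python
class RewardNormalizer:
    """Divide rewards by a running stdev estimate WITHOUT subtracting
    the mean, so positive rewards stay positive and negative stay negative."""

    def __init__(self, eps=1e-8):
        self.count = 0
        self.mean = 0.0   # tracked only to compute the variance
        self.m2 = 0.0     # running sum of squared deviations (Welford)
        self.eps = eps

    def update(self, r):
        self.count += 1
        delta = r - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (r - self.mean)

    def normalize(self, r):
        self.update(r)
        if self.count < 2:
            return r  # not enough data for a stdev estimate yet
        std = (self.m2 / self.count) ** 0.5
        return r / (std + self.eps)

norm = RewardNormalizer()
rewards = [1.0, -0.5, 2.0, 0.0, -1.0]
scaled = [norm.normalize(r) for r in rewards]
# Scaling by a positive stdev preserves each reward's sign.
assert all((a > 0) == (b > 0) for a, b in zip(rewards, scaled))
```

Unlike full standardization (subtracting the mean then dividing), this keeps zero reward at zero, which matters in NetHack where the sign of the reward is itself meaningful.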
