Energy-based Models are Zero-Shot Planners for Compositional Scene Rearrangement

By Nikolaos Gkanatsios*, Ayush Jain*, Zhou Xian, Yunchu Zhang, Christopher G. Atkeson, Katerina Fragkiadaki.

Official implementation of "Energy-based Models are Zero-Shot Planners for Compositional Scene Rearrangement", accepted by RSS 2023.

Install

Requirements

We showcase the installation for CUDA 11.1 and torch==1.10.2, which is what we used for our experiments.

conda create -n "srem" python=3.8
conda activate srem
pip install -r requirements.txt
pip install -U torch==1.10.2 torchvision==0.11.3 --extra-index-url https://download.pytorch.org/whl/cu111
sh scripts/init.sh  # requires gcc>=5.5.0
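Before running the init script, it can help to confirm that the toolchain matches the versions stated above. The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `gcc_version` are hypothetical, and the check assumes `gcc` is on your PATH. Newer gcc releases may print only the major version for `-dumpversion`, which the tuple comparison below tolerates.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Turn a version string such as '9.4.0' into a tuple of ints: (9, 4, 0)."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def meets_minimum(found, minimum):
    """Return True when version string `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def gcc_version():
    """Return the local gcc version string, or None if gcc is not on PATH."""
    if shutil.which("gcc") is None:
        return None
    result = subprocess.run(
        ["gcc", "-dumpversion"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    found = gcc_version()
    if found is None:
        print("gcc not found on PATH")
    else:
        print(f"gcc {found}: ok={meets_minimum(found, '5.5.0')}")
    try:
        import torch  # only available inside the srem environment

        print("torch", torch.__version__, "CUDA", torch.version.cuda)
    except ImportError:
        print("torch is not installed in this environment")
```

With the install commands above, the torch line would be expected to report version 1.10.2 built against CUDA 11.1.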

Data Preparation

To generate simulation data for our benchmarks, run:

python demos.py

This generates data for the CLIPort tasks and for the spatial relations, shapes, and compositional benchmarks. To generate data for a single benchmark, comment out the other entries in task_list.

Download all the needed checkpoints:

wget -O checkpoints.zip "https://zenodo.org/record/8114634/files/checkpoints.zip?download=1"
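The downloaded archive still has to be unpacked. Below is a minimal sketch of doing so with Python's standard library. It assumes the archive is saved as checkpoints.zip (rename it if wget kept the query string in the filename) and extracts into the current directory; the README does not state where the checkpoints are expected, so treat the destination as an assumption. The function name `extract_checkpoints` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_checkpoints(archive, dest="."):
    """Extract `archive` into `dest` and return the sorted member names."""
    archive = Path(archive)
    if not zipfile.is_zipfile(archive):
        raise ValueError(f"{archive} is missing or not a valid zip file")
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return sorted(zf.namelist())


if __name__ == "__main__":
    # Assumed filename; adjust it to whatever name wget actually saved.
    if Path("checkpoints.zip").exists():
        for name in extract_checkpoints("checkpoints.zip"):
            print(name)
```

Checking `is_zipfile` first catches a partial or failed download before anything is written to disk.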

Usage

The scripts folder contains scripts for training the goal-conditioned transporter and evaluating our full model on each benchmark:

sh scripts/run_train_composition_one_step.sh # composition-one-step benchmark
sh scripts/run_train_composition_group.sh # composition-group benchmark
sh scripts/run_train_single_relations.sh # relations benchmark
sh scripts/run_train_shapes.sh # shapes benchmark
sh scripts/run_train_cliport.sh # cliport benchmark

If you followed the Data Preparation section, you should have access to all the checkpoints needed to do the evaluations and reproduce the results.

Training Individual Modules

Parser

To train the language parser, first generate the language data by running:

python data/create_cliport_programs.py

This creates a JSON file that pairs each language sentence with its expected program.
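To sanity-check the generated file before training, you can preview its first few entries. This is a minimal sketch that deliberately assumes nothing about the schema: the output filename and field names are not given here, so the path is passed on the command line and the helper `preview_json` (a hypothetical name) handles both list and dictionary layouts.

```python
import json
import sys


def preview_json(path, limit=3):
    """Load a JSON file and return its first `limit` entries, whatever the layout."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        return list(data.items())[:limit]
    if isinstance(data, list):
        return data[:limit]
    return [data]


if __name__ == "__main__":
    # Usage: python preview.py <path-to-generated-json>
    if len(sys.argv) > 1:
        for entry in preview_json(sys.argv[1]):
            print(entry)
```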

To train the parser, run:

sh scripts/train_parser.sh

EBM

To train the EBMs:

sh scripts/train_ebm_script.sh

This trains EBMs for every concept used in our paper. To train on a single concept, isolate the corresponding command from this script. Use the --eval flag to run inference and visualize generated samples.

Grounding Model

We train the grounding model using the publicly available code of BUTD-DETR.

Visualizations and Results

Please check our website for qualitative results and a quick overview of this project.

Acknowledgements

Parts of this code are based on the CLIPort codebase. The code for the grounding module is borrowed from BUTD-DETR. Parts of the EBM training code are based on the compose-visual-relations codebase.

Citing SREM

If you find SREM useful in your research, please consider citing:

@article{gkanatsios2023energy,
  title={Energy-based models are zero-shot planners for compositional scene rearrangement},
  author={Gkanatsios, Nikolaos and Jain, Ayush and Xian, Zhou and Zhang, Yunchu and Atkeson, Christopher and Fragkiadaki, Katerina},
  journal={arXiv preprint arXiv:2304.14391},
  year={2023}
}
