OpenCatalyst
This repository contains the following:
- Scripts to set up a Python venv and run the open_catalyst codebase on JSC JUWELS Booster
- Results of initial runs, in the OpenCatalystOutput folder
Set up
- Make a directory and clone the forked repository hpc (made compatible with JUWELS) into it
- Clone this repository in the same directory
- Execute
nice bash setup.sh
source activate.sh
Install the packages with
pip install -e .
Finally, install the pre-commit hooks:
pre-commit install
Start Training
main.py serves as the entry point to run any task. This script requires two command line arguments at a minimum:
- --mode MODE: MODE can be train, predict or run-relaxations to train a model, make predictions using an existing model, or run machine learning based relaxations using an existing model, respectively.
- --config-yml PATH: PATH is the path to a YAML configuration file. We use YAML files to supply all parameters to the script. The configs directory contains a number of example config files.
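The two required arguments can be checked before a run is launched. The following is only a minimal sketch: the helper name build_command and its validation rules are hypothetical and not part of the codebase; only the flags and mode names come from the description above.

```python
import shlex

# Modes listed in this README; the helper below is a hypothetical illustration.
VALID_MODES = ("train", "predict", "run-relaxations")


def build_command(mode, config_yml):
    """Return the argument list for a main.py run with the two required flags."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    # Assumption: config files use a .yml or .yaml extension.
    if not config_yml.endswith((".yml", ".yaml")):
        raise ValueError(f"expected a YAML config file, got {config_yml!r}")
    return ["python", "main.py", "--mode", mode, "--config-yml", config_yml]


cmd = build_command("train", "configs/mlperf_hpc.yml")
# Print the command as a single shell-ready string.
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run as-is, which avoids shell quoting issues if a config path contains spaces.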
Running main.py directly runs the model on a single CPU or GPU if one is available:
python main.py --mode train --config-yml configs/mlperf_hpc.yml
To run on a Slurm cluster, add --submit. This uses the submitit package to simplify multi-node distributed training:
python main.py --num-gpus 4 --num-nodes 2 --num-workers 1 --submit --mode train --config-yml configs/mlperf_hpc.yml
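As a rough guide to what the command above requests, the sketch below works out the total GPU count. It assumes that --num-gpus is counted per node and that one training process runs per GPU; neither is stated in this README, so check the linked documentation before relying on it. The function slurm_resources is hypothetical.

```python
def slurm_resources(num_gpus, num_nodes):
    """Estimate the resources requested by a --submit run.

    Assumptions (not confirmed by this README): --num-gpus is per node,
    and one training process is started per GPU.
    """
    if num_gpus < 1 or num_nodes < 1:
        raise ValueError("num_gpus and num_nodes must both be at least 1")
    total_gpus = num_gpus * num_nodes
    return {
        "nodes": num_nodes,
        "gpus_per_node": num_gpus,
        "total_gpus": total_gpus,
        "processes": total_gpus,  # one process per GPU, by assumption
    }


# Values taken from the example command: --num-gpus 4 --num-nodes 2
resources = slurm_resources(num_gpus=4, num_nodes=2)
print(resources["total_gpus"])
```

Under these assumptions the example command would occupy 8 GPUs across 2 nodes.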
For detailed explanations, check out Training and Evaluating models on the OC20 dataset.
Workflow
- Edit config.sh to change the name and location of the venv.
- Edit modules.sh to change the modules loaded prior to the creation of the venv.
- Edit requirements.txt to change the packages to be installed during setup.
- Edit setup.sh and activate.sh to add extra steps for custom modules.
- Create the environment with bash setup.sh.
- Create a kernel with bash create_kernel.sh.