OpenCatalyst

Chelsea Maria John authored
README

See merge request !4

This repository contains the following:

  • Scripts to set up a Python venv and run the open_catalyst codebase on JUWELS Booster at JSC
  • Results of initial runs, in the OpenCatalystOutput folder

Set up

  • Make a directory and clone the forked repository hpc (made compatible with JUWELS) into it
  • Clone this repository into the same directory
  • Execute:
nice bash setup.sh
source activate.sh

Install the packages with

pip install -e .

Finally, install the pre-commit hooks:

pre-commit install

Start Training

main.py serves as the entry point to run any task. This script requires two command line arguments at a minimum:

  • --mode MODE: MODE can be train, predict or run-relaxations to train a model, make predictions using an existing model, or run machine learning based relaxations using an existing model, respectively.
  • --config-yml PATH: PATH is the path to a YAML configuration file. We use YAML files to supply all parameters to the script. The configs directory contains a number of example config files.
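The two required arguments above can be sketched with argparse. This is an illustrative stand-in, not the parser that main.py actually uses: the names MODES and build_parser are hypothetical, and the real script may accept further options or validate them differently.

```python
import argparse

# The three modes named in this README; assumed here to be the only valid ones.
MODES = ("train", "predict", "run-relaxations")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the two required arguments (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Illustrative stand-in for the required arguments of main.py"
    )
    parser.add_argument(
        "--mode",
        choices=MODES,
        required=True,
        help="train a model, predict with an existing model, or run ML-based relaxations",
    )
    parser.add_argument(
        "--config-yml",
        required=True,
        help="path to a YAML configuration file, e.g. one from the configs directory",
    )
    return parser


if __name__ == "__main__":
    # Parse the same arguments as the single-device example in this README.
    args = build_parser().parse_args(
        ["--mode", "train", "--config-yml", "configs/mlperf_hpc.yml"]
    )
    print(args.mode, args.config_yml)
```

In this sketch argparse stores --config-yml as args.config_yml, and an unknown mode or a missing argument makes the parser exit with a usage message.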

Running main.py directly runs the model on a single CPU or GPU if one is available:

python main.py --mode train --config-yml configs/mlperf_hpc.yml

To simplify multi-node distributed training on a Slurm cluster, submit the job through the submitit package:

python main.py --num-gpus 4 --num-nodes 2 --num-workers 1 --submit --mode train --config-yml configs/mlperf_hpc.yml
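When several runs differ only in GPU or node count, it can help to assemble the command line programmatically. The sketch below is a hypothetical helper (build_command is not part of the codebase) and uses only the flags that appear in this README; it builds the argument list and does not launch anything.

```python
import shlex
from typing import List, Optional


def build_command(
    mode: str,
    config_yml: str,
    num_gpus: Optional[int] = None,
    num_nodes: Optional[int] = None,
    num_workers: Optional[int] = None,
    submit: bool = False,
) -> List[str]:
    """Assemble a main.py invocation from the flags shown in this README."""
    cmd = ["python", "main.py"]
    # Optional resource flags are added only when a value is given.
    if num_gpus is not None:
        cmd += ["--num-gpus", str(num_gpus)]
    if num_nodes is not None:
        cmd += ["--num-nodes", str(num_nodes)]
    if num_workers is not None:
        cmd += ["--num-workers", str(num_workers)]
    if submit:
        cmd.append("--submit")
    # The two required arguments come last, as in the examples above.
    cmd += ["--mode", mode, "--config-yml", config_yml]
    return cmd


if __name__ == "__main__":
    # Reproduce the multi-node example: 4 GPUs, 2 nodes, 1 worker, submitted to Slurm.
    cmd = build_command(
        "train", "configs/mlperf_hpc.yml",
        num_gpus=4, num_nodes=2, num_workers=1, submit=True,
    )
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run from inside the activated venv; shlex.join is used here only to print a copy-pasteable line.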

For detailed explanations, see Training and Evaluating models on the OC20 dataset.

Workflow

  1. Edit config.sh to change the name and location of the venv.
  2. Edit modules.sh to change the modules loaded prior to the creation of the venv.
  3. Edit requirements.txt to change the packages to be installed during setup.
  4. Edit setup.sh and activate.sh to add extra steps for custom modules.
  5. Create the environment with bash setup.sh.
  6. Create a kernel with bash create_kernel.sh.
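Before running the workflow, a quick check that the files it mentions are present can save a failed setup. The following is a minimal sketch, assuming all six files sit at the top level of this repository; the helper name missing_workflow_files is hypothetical.

```python
from pathlib import Path
from typing import List

# Files named in the workflow steps; assumed to live in the repository root.
WORKFLOW_FILES = (
    "config.sh",
    "modules.sh",
    "requirements.txt",
    "setup.sh",
    "activate.sh",
    "create_kernel.sh",
)


def missing_workflow_files(repo_dir: str) -> List[str]:
    """Return the workflow files that are not present in repo_dir."""
    root = Path(repo_dir)
    return [name for name in WORKFLOW_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Check the current directory and report anything that is absent.
    missing = missing_workflow_files(".")
    print("missing:", ", ".join(missing) if missing else "none")
```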