Merge branch 'master' into 'main'

README

See merge request !4

Commit 20c34d01 authored 2 years ago by Chelsea Maria John
--- a/README.md
+++ b/README.md
@@ -6,13 +6,12 @@ the forked [Meta OPT codebase](https://github.com/chelseajohn/metaseq.git).
 ## Getting Started 

 ### Set up
-Assuming you have already set up your environment on the
-supercomputer. If you have not, please see the ["Getting started at
+Please see the ["Getting started at
 JSC"](https://gitlab.jsc.fz-juelich.de/opengptx/infos-public/-/blob/main/documentation/getting_started_at_JSC.md)
-guide. Then 
+guide and set up your environment on the JUWELS supercomputer if you have not done so yet. Then 

 - Clone this repository
-- make required changes in `variables.bash`
+- make required location changes in `variables.bash`
 - execute 
 ```
 nice bash setup.bash
@@ -24,7 +23,7 @@ Make required changes in the `jobscript.sh` like adjusting the `#SBATCH` variabl
 ```
 sbatch jobscript.sh
 ```
-**WARNING** : PyTorch >= 1.11 will complain about not being able to handle some address families and tell you that sockets are invalid. This does **not** hinder the code from scaling according to the number of total GPUs.
+**WARNING** : PyTorch >= 1.11 will emit warnings about client socket initialization, including `(errno: 97 - Address family not supported by protocol)`. So far, this has **not** prevented the code from scaling to the total number of GPUs assigned.

 ### Launch tensorboard for the run 

@@ -44,7 +43,7 @@ tensorboard serve --logdir="INSERT_TENSORBOARD_LOGDIR" --bind_all

 ## Interactive Usage

-To work interactively, please activate the environment like this:
+To work interactively, please activate the environment using the following command:

 ```
 source activate.bash
@@ -59,7 +58,10 @@ environment, and set the variables specified in `variables.bash`.
 - JUWELS Cluster
 - JUWELS Booster

-Supported means tested and the correct CUDA compute architecture will
-be selected. Other machines can easily be supported by adjusting
-`activate.bash`.
+Other machines can easily be supported by adjusting `activate.bash` and setting the correct CUDA architecture.
+
+## Tested Models
+Test runs of 15-30 minutes were performed on the following models, training from scratch on the [OSCAR](https://huggingface.co/bigscience/misc-test-data/tree/main/stas) dataset.
+- 125m model
+- 30b model