srun -A cstdl --cpu-bind=none \
|
|
|
|
|
```
|
|
|
|
|
|
## Training with DeepSpeed
|
|
|
An example job:
|
|
|
### DeepSpeed
|
|
|
|
|
|
Run 200 steps across 4xV100 GPUs using DeepSpeed (ZeRO optimization disabled):
|
|
|
```sh
|
|
|
|
|
|
#!/usr/bin/env bash
|
deepspeed train_dalle.py \
|
|
|
|
|
```
|
|
|
|
|
|
#### Configure DeepSpeed Zero Optimization
|
|
|
> Stages 0, 1, 2, and 3 refer to disabled, optimizer state partitioning, optimizer+gradient state partitioning, and optimizer+gradient+parameter partitioning, respectively.
|
|
|
|
|
|
To change the DeepSpeed stage, find the Python dict named `deepspeed_config` in `train_dalle.py` and modify it as follows:
|
|
|
- Stage 1: Optimizer State Partitioning
|
|
|
```python
|
|
|
deepspeed_config = {
|
|
|
"zero_optimization": {
|
|
|
"stage": 1,
|
|
|
}
|
|
|
}
|
|
|
```
|
|
|
|
|
|
- Stage 2 (with `cpu_offload`, aka ZeRO-Offload): Optimizer + Gradient State Partitioning
|
|
|
Offload the optimizer (e.g. Adam) state to the CPU and partition it across GPUs/nodes:
|
|
|
```python
|
|
|
deepspeed_config = {
|
|
|
"zero_optimization": {
|
|
|
"stage": 2,
|
|
|
"cpu_offload": True,
|
|
|
"contiguous_gradients": True,
|
|
|
"overlap_comm": True
|
|
|
}
|
|
|
}
|
|
|
```
|
|
|
|
|
|
- Stage 3: Optimizer + Gradient + Parameter partitioning
|
|
|
```python
|
|
|
deepspeed_config = {
|
|
|
"zero_optimization": {
|
|
|
"stage": 3,
|
|
|
"offload_param": {
|
|
|
"device": "cpu",
|
|
|
"pin_memory": True,
|
|
|
},
|
|
|
"offload_optimizer": {
|
|
|
"device": "cpu",
|
|
|
"pin_memory": True,
|
|
|
},
|
|
|
}
|
|
|
}
|
|
|
```
|
|
|
|
|
|
- DeepSpeed ZeRO-Infinity (requires an NVMe drive): Optimizer + Gradient + Parameter + Checkpoint partitioning
|
|
|
Takes advantage of the fast read/write speeds of NVMe drives to offload the optimizer and parameter state to NVMe storage and partition it across GPUs/nodes:
|
|
|
```python
|
|
|
deepspeed_config = {
|
|
|
"zero_optimization": {
|
|
|
"stage": 3,
|
|
|
"offload_param": {
|
|
|
"device": "nvme",
|
|
|
"nvme_path": "./local_nvme",
|
|
|
"pin_memory": True,
|
|
|
},
|
|
|
"offload_optimizer": {
|
|
|
"device": "nvme",
|
|
|
"nvme_path": "./local_nvme",
|
|
|
"pin_memory": True,
|
|
|
},
|
|
|
},
|
|
|
}
|
|
|
```
|
|
|
|
|
|
- For many more tunable configuration options, see https://www.deepspeed.ai/docs/config-json
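The stage variants above differ only in their `zero_optimization` block. As a sketch, a small helper (hypothetical — not part of `train_dalle.py`) can assemble the dict for a chosen stage and offload target:

```python
# Hypothetical helper: build the `deepspeed_config` variants shown above.
# The offload layouts mirror the README examples; anything beyond them
# (e.g. other devices) is an assumption.

def make_zero_config(stage, offload_device=None, nvme_path=None):
    """Return a deepspeed_config dict for ZeRO stage 0-3."""
    if stage not in (0, 1, 2, 3):
        raise ValueError("ZeRO stage must be 0, 1, 2, or 3")
    zero = {"stage": stage}
    if stage == 2 and offload_device == "cpu":
        # Stage 2 with CPU offload (ZeRO-Offload)
        zero.update({
            "cpu_offload": True,
            "contiguous_gradients": True,
            "overlap_comm": True,
        })
    if stage == 3 and offload_device in ("cpu", "nvme"):
        target = {"device": offload_device, "pin_memory": True}
        if offload_device == "nvme":
            # ZeRO-Infinity: spill to local NVMe storage
            target["nvme_path"] = nvme_path or "./local_nvme"
        zero["offload_param"] = dict(target)
        zero["offload_optimizer"] = dict(target)
    return {"zero_optimization": zero}

deepspeed_config = make_zero_config(3, offload_device="nvme")
```

This keeps a single source of truth for the `zero_optimization` section instead of hand-editing nested dicts per experiment.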
|
|
|
|
|
|
## Monitoring a Job
|
|
|
|
|
|
You can attach interactively to a running job via `srun --pty --jobid <job-id> bash`. Once attached you can, for example, inspect GPU usage with `nvidia-smi`.
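A minimal sketch of that workflow, assuming a Slurm cluster (the job id `123456` is a placeholder, and `srun`/`nvidia-smi` exist only on the cluster, so only the attach command string is built here):

```shell
# Placeholder job id; substitute the id reported by `squeue`.
JOBID=123456
# Build the attach command shown in the text above.
ATTACH_CMD="srun --pty --jobid ${JOBID} bash"
echo "${ATTACH_CMD}"
# Once attached, for example:
#   nvidia-smi          # one-shot snapshot of GPU utilization/memory
#   nvidia-smi -l 5     # built-in loop, refreshing every 5 seconds
```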
|
|