@@ -190,7 +190,7 @@ deepspeed train_dalle.py \
```
#### Configure DeepSpeed Zero Optimization
#### Configure DeepSpeed ZeRO Offload/Infinity
> Stages 0, 1, 2, and 3 refer to disabled, optimizer state partitioning, optimizer+gradient state partitioning, and optimizer+gradient+parameter partitioning, respectively.

To change the DeepSpeed stage, find the Python dict named `deepspeed_config` in `train_dalle.py` and modify it as follows:
@@ -233,7 +233,7 @@ deepspeed_config = {
}
```
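As a rough illustration of the stage setting described above, the sketch below sets the ZeRO stage on a config dict without mutating the original. The helper `with_zero_stage`, the `ZERO_STAGES` table, and the placeholder batch size are assumptions made for this example and are not part of `train_dalle.py`; only the `zero_optimization` / `stage` keys come from the DeepSpeed config format.

```python
# Hypothetical helper for illustration; train_dalle.py does not define it.
ZERO_STAGES = {
    0: "disabled",
    1: "optimizer state partitioning",
    2: "optimizer + gradient state partitioning",
    3: "optimizer + gradient + parameter partitioning",
}


def with_zero_stage(config, stage):
    """Return a copy of `config` with `zero_optimization.stage` set to `stage`."""
    if stage not in ZERO_STAGES:
        raise ValueError(f"ZeRO stage must be one of {sorted(ZERO_STAGES)}, got {stage!r}")
    updated = dict(config)
    # Copy the nested dict so the caller's config is left untouched.
    zero = dict(updated.get("zero_optimization", {}))
    zero["stage"] = stage
    updated["zero_optimization"] = zero
    return updated


# Placeholder base config; the batch size is an arbitrary example value.
base_config = {"train_batch_size": 32}
deepspeed_config = with_zero_stage(base_config, 2)
print(deepspeed_config["zero_optimization"], "->", ZERO_STAGES[2])
```

Returning a copy keeps the base config reusable, which can make it easier to compare runs at different stages.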
- DeepSpeed Infinity (Requires NVMe drive): Optimizer + Gradient + Parameter + Checkpoint partitioning
- DeepSpeed ZeRO Infinity (Requires NVMe drive): Optimizer + Gradient + Parameter + Checkpoint partitioning
Takes advantage of fast reads and writes to NVMe drives to offload the optimizer state to the CPU and partition it across GPUs/nodes.
```python
deepspeed_config = {
... | ... | |