From c0bff5b541ad868e636c4cce4d0f7812bc30164f Mon Sep 17 00:00:00 2001
From: janEbert <janpublicebert@posteo.net>
Date: Thu, 14 Nov 2024 14:55:37 +0100
Subject: [PATCH] Link to `torchrun_jsc` repo

---
 README.md | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 8a8b897..bfcceaf 100644
--- a/README.md
+++ b/README.md
@@ -154,14 +154,16 @@ handling ([see here for more
 information](https://github.com/pytorch/pytorch/issues/73656)).
 Thankfully, there are options to fix these issues:
 
-1. Use wrappers, such as `torchrun_jsc`. <!-- For PyTorch ≥2, this
-   wrapper is minimally intrusive and does not change underlying
-   PyTorch code. Instead, it adds some extra argument configurations
-   to solve the aforementioned issues. For PyTorch <2, it instead -->
-   It modifies the underlying code on-the-fly to fix the issues.
-   `torchrun_jsc`/`python -m torchrun_jsc` is a drop-in replacement
-   for `torchrun`/`python -m torch.distributed.run` and can be
-   installed via `pip`: `python -m pip install torchrun_jsc`.
+1. Use wrappers, such as
+   [`torchrun_jsc`](https://github.com/HelmholtzAI-FZJ/torchrun_jsc).
+   <!-- For PyTorch ≥2, this wrapper is minimally intrusive and does
+   not change underlying PyTorch code. Instead, it adds some extra
+   argument configurations to solve the aforementioned issues. For
+   PyTorch <2, it instead --> It modifies the underlying code
+   on-the-fly to fix the issues. `torchrun_jsc`/`python -m
+   torchrun_jsc` is a drop-in replacement for `torchrun`/`python -m
+   torch.distributed.run` and can be installed via `pip`: `python -m
+   pip install torchrun_jsc`.
 2. Use PyTorch as provided by the module system. We include patches to
    ensure that the errors in `torchrun` are fixed and that it reliably
    works on our system.
-- 
GitLab