diff --git a/README.md b/README.md
index eef34c2ac2d030330917f9141640e4a88bf83a31..e8ec00f30f041319581dcf3789acff1b30ac6ea5 100644
--- a/README.md
+++ b/README.md
@@ -152,29 +152,26 @@
 Sadly, there are some issues with this API that make its usage on JSC
 systems difficult because of the aforementioned special hostname
 handling ([see here for more
 information](https://github.com/pytorch/pytorch/issues/73656)).
-Thankfully, there are options to fix these issues:
-
-1. Use wrappers, such as
-   [`torchrun_jsc`](https://github.com/HelmholtzAI-FZJ/torchrun_jsc).
-   <!-- For PyTorch ≥2, this wrapper is minimally intrusive and does
-   not change underlying PyTorch code. Instead, it adds some extra
-   argument configurations to solve the aforementioned issues. For
-   PyTorch <2, it instead --> It modifies the underlying code
-   on-the-fly to fix the issues. `torchrun_jsc`/`python -m
-   torchrun_jsc` is a drop-in replacement for `torchrun`/`python -m
-   torch.distributed.run` and can be installed via `pip`: `python -m
-   pip install torchrun_jsc`.
-2. Use PyTorch as provided by the module system. We include patches to
-   ensure that the errors in `torchrun` are fixed and that it reliably
-   works on our system.
-
-In our example, we always use the wrapper even if we are already using
-the module system to show off how to use it. This way, you can apply
-the same template to your own projects that may use `pip`-installed
-PyTorch versions, or even a container. This also means that you need
-to set up a virtual environment with `torchrun_jsc` installed before
-being able to use the example out-of-the-box. This can be done by
-executing `nice bash set_up.sh` once on a login node.
+Thankfully, there are options to fix these issues, such as wrappers
+like
+[`torchrun_jsc`](https://github.com/HelmholtzAI-FZJ/torchrun_jsc).
+<!-- For PyTorch ≥2, this wrapper is minimally intrusive and does not
+change underlying PyTorch code. Instead, it adds some extra argument
+configurations to solve the aforementioned issues. For PyTorch <2, it
+instead --> It modifies the underlying code on-the-fly to fix the
+issues. `torchrun_jsc`/`python -m torchrun_jsc` is a drop-in
+replacement for `torchrun`/`python -m torch.distributed.run` and can
+be installed via `pip`: `python -m pip install torchrun_jsc`.
+
+In our example, we always use the wrapper to show off how to use it.
+This way, you can apply the same template to your own projects that
+may use `pip`-installed PyTorch versions, or even a container. Despite
+the name, `torchrun_jsc` works on _any_ computer since it is basically
+just a fixed version of `torchrun`, so you don't need to adapt your
+code when switching machines. This also means that you need to set up
+a virtual environment with `torchrun_jsc` installed before being able
+to use the example out-of-the-box. This can be done by executing `nice
+bash set_up.sh` once on a login node.
 
 ### Job submission
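
A minimal sketch of what the "drop-in replacement" amounts to in practice; the virtual-environment path, training script name, and process count below are illustrative assumptions, not part of this repository (whose own setup step is `nice bash set_up.sh`):

```bash
# Hypothetical virtual environment with the wrapper installed.
python3 -m venv venv
source venv/bin/activate
python -m pip install torchrun_jsc

# A plain single-node torchrun launch might look like this
# (train.py and the process count are made up for illustration):
#   torchrun --standalone --nproc_per_node=4 train.py
# With the wrapper, only the launcher name changes:
torchrun_jsc --standalone --nproc_per_node=4 train.py
# or, equivalently:
python -m torchrun_jsc --standalone --nproc_per_node=4 train.py
```

Because only the launcher name changes, whatever arguments you already pass to `torchrun`/`python -m torch.distributed.run` should carry over unchanged to `torchrun_jsc`.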