Commit c0bff5b5 authored by Jan Ebert

Link to `torchrun_jsc` repo

parent 3ca91f8a
@@ -154,14 +154,16 @@
 handling ([see here for more
 information](https://github.com/pytorch/pytorch/issues/73656)).
 Thankfully, there are options to fix these issues:
-1. Use wrappers, such as `torchrun_jsc`. <!-- For PyTorch ≥2, this
-wrapper is minimally intrusive and does not change underlying
-PyTorch code. Instead, it adds some extra argument configurations
-to solve the aforementioned issues. For PyTorch <2, it instead -->
-It modifies the underlying code on-the-fly to fix the issues.
-`torchrun_jsc`/`python -m torchrun_jsc` is a drop-in replacement
-for `torchrun`/`python -m torch.distributed.run` and can be
-installed via `pip`: `python -m pip install torchrun_jsc`.
+1. Use wrappers, such as
+[`torchrun_jsc`](https://github.com/HelmholtzAI-FZJ/torchrun_jsc).
+<!-- For PyTorch ≥2, this wrapper is minimally intrusive and does
+not change underlying PyTorch code. Instead, it adds some extra
+argument configurations to solve the aforementioned issues. For
+PyTorch <2, it instead --> It modifies the underlying code
+on-the-fly to fix the issues. `torchrun_jsc`/`python -m
+torchrun_jsc` is a drop-in replacement for `torchrun`/`python -m
+torch.distributed.run` and can be installed via `pip`: `python -m
+pip install torchrun_jsc`.
 2. Use PyTorch as provided by the module system. We include patches to
 ensure that the errors in `torchrun` are fixed and that it reliably
 works on our system.
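
As a rough illustration of the "drop-in replacement" wording in option 1, the sketch below builds a launch command that uses `python -m torchrun_jsc` when the package is installed and falls back to `python -m torch.distributed.run` otherwise. The helper name `build_launch_command`, the script name `train.py`, and the default process count are hypothetical and not part of the README or of `torchrun_jsc`; the only launcher flag used is `--nproc_per_node`, which `torchrun` accepts.

```python
import importlib.util
import sys


def build_launch_command(script, script_args=(), nproc_per_node=4):
    """Return an argv list that launches `script` with a torchrun-style launcher.

    Hypothetical helper for illustration only.
    """
    # Prefer the wrapper module if it is installed; otherwise fall back to
    # PyTorch's own launcher module. find_spec on a top-level name only
    # looks the module up, it does not import it.
    if importlib.util.find_spec("torchrun_jsc") is not None:
        launcher = "torchrun_jsc"
    else:
        launcher = "torch.distributed.run"

    # Because the wrapper is described as a drop-in replacement, the same
    # launcher arguments are passed in both cases.
    return [
        sys.executable, "-m", launcher,
        f"--nproc_per_node={nproc_per_node}",
        script, *script_args,
    ]


if __name__ == "__main__":
    # "train.py" is a placeholder script name.
    print(" ".join(build_launch_command("train.py", ["--epochs", "1"])))
```

The resulting list can be handed to `subprocess.run` inside a job script; whether the fallback is acceptable depends on whether the PyTorch installation in use already carries the patches mentioned in option 2.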