From a64941f6bdca4e37884f74033e43502798640b36 Mon Sep 17 00:00:00 2001
From: janEbert <janpublicebert@posteo.net>
Date: Tue, 9 Jul 2024 17:26:04 +0200
Subject: [PATCH] Avoid MPI terminology

---
 README.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 74a9ccf..a11c5ed 100644
--- a/README.md
+++ b/README.md
@@ -467,9 +467,8 @@ For initialization, FSDP first defines a hierarchy of distinct, but
 possibly nested, submodules ("units") for the model. This process is
 also called "wrapping" in FSDP terminology and can be controlled using
 the `auto_wrap_policy` argument to `FullyShardedDataParallel`. The
-parameters in each unit are then split and distributed ("sharded", or
-scattered) to all GPUs. In the end, each GPU contains its own,
-distinct model shard.
+parameters in each unit are then split and distributed ("sharded") to
+all GPUs. In the end, each GPU contains its own, distinct model shard.
 
 Whenever we do a forward pass with the model, we sequentially pass
 through units in the following way: FSDP automatically collects the
@@ -511,7 +510,7 @@ shard. This also means that we have to execute saving and loading on
 every process, since the data is fully distinct.
 
 The example also contains an unused `save_model_singular` function
-that gathers the full model on the CPU and then saves it in a single
+that collects the full model on the CPU and then saves it in a single
 checkpoint file which can then be loaded in a single process. Keep in
 mind that this way of checkpointing is slower and limited by CPU
 memory.
-- 
GitLab