HPC4NS / dl_on_supercomputers
Commit efc3bd4a, authored May 4, 2021 by Fahad Khalid
Updated the course material so that the examples comply with TF2.
Parent: cc8ad9a9
Merge request: !4 Updated to use Tensorflow2
Showing 2 changed files with 14 additions and 16 deletions:
course_material/examples/mnist_epoch_distributed.py (+8, -9)
course_material/examples/mnist_single_gpu.py (+6, -7)
course_material/examples/mnist_epoch_distributed.py (+8, -9) @ efc3bd4a
@@ -4,8 +4,6 @@
 # Version 2.0 (see the NOTICE file for details).
 """
-This program is an adaptation of the following code sample:
-https://github.com/horovod/horovod/blob/master/examples/keras_mnist.py.
 The program creates and trains a shallow ANN for handwritten digit
 classification using the MNIST dataset.
@@ -13,14 +11,14 @@
 example epochs are distributed across the Horovod ranks, not data.
 To run this sample use the following command on your
-workstation/laptop equipped with a GPU:
+workstation/laptop:
 mpirun -np 1 python -u mnist_epoch_distributed.py
 If you have more than one GPU on your system, you can increase the
 number of ranks accordingly.
-The code has been tested with Python 3.7.5, tensorflow-gpu 1.13.1, and
+The code has been tested with Python 3.8.7, tensorflow 2.3.1, and
 horovod 0.16.2.
 Note: This code will NOT work on the supercomputers.
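The docstring in this hunk says that epochs, rather than data, are distributed across the Horovod ranks. A minimal sketch of such a split follows, assuming the script works like the upstream Horovod keras_mnist.py sample, which divides the total epoch count by the number of ranks and rounds up. The helper name `epochs_per_rank` and the total of 12 epochs are illustrative and do not come from this commit.

```python
import math


def epochs_per_rank(total_epochs, num_ranks):
    """Hypothetical helper: epochs each rank trains when epochs are split across ranks."""
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    # Round up so that the ranks together train at least total_epochs epochs.
    return int(math.ceil(total_epochs / num_ranks))


if __name__ == "__main__":
    # In the real script the rank count would come from hvd.size().
    for ranks in (1, 2, 4, 5):
        print(ranks, epochs_per_rank(12, ranks))
```

With one rank every epoch runs on that rank; adding ranks shortens the per-rank schedule, while each rank still reads the full dataset.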
@@ -30,16 +28,17 @@
 import math
 import tensorflow as tf
 import horovod.tensorflow.keras as hvd
-from tensorflow.python.keras import backend as K
 # Horovod: initialize Horovod.
 hvd.init()
 # Horovod: pin GPU to be used to process local rank (one GPU per process)
-config = tf.ConfigProto()
-config.gpu_options.visible_device_list = str(hvd.local_rank())
-K.set_session(tf.Session(config=config))
+gpus = tf.config.experimental.list_physical_devices('GPU')
+if gpus:
+    tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], 'GPU')
+    for gpu in gpus:
+        tf.config.experimental.set_memory_growth(gpu, True)
 # Reference to the MNIST dataset
 mnist = tf.keras.datasets.mnist
...
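The hunk above replaces the TF1 session configuration with the TF2 device API. The sketch below restates that pinning step as a standalone function, using only the `tf.config.experimental` calls that appear in the diff. The helper names `pick_device` and `pin_gpu`, and the explicit check for a local rank beyond the number of GPUs, are assumptions added for illustration and are not part of the commit.

```python
def pick_device(devices, local_rank):
    """Hypothetical helper: return the device for this rank, or None on a CPU-only host."""
    if not devices:
        return None
    if local_rank < 0 or local_rank >= len(devices):
        raise ValueError(
            "local rank %d has no matching GPU (%d found)" % (local_rank, len(devices))
        )
    return devices[local_rank]


def pin_gpu(local_rank):
    """Make only the GPU for this local rank visible and enable memory growth on it."""
    # Imported lazily so pick_device stays usable without TensorFlow installed.
    import tensorflow as tf

    gpus = tf.config.experimental.list_physical_devices('GPU')
    gpu = pick_device(gpus, local_rank)
    if gpu is not None:
        tf.config.experimental.set_visible_devices(gpu, 'GPU')
        tf.config.experimental.set_memory_growth(gpu, True)
    return gpu


if __name__ == "__main__":
    # Device names here are placeholders; real entries are PhysicalDevice objects.
    print(pick_device(["GPU:0", "GPU:1"], 1))
    print(pick_device([], 0))
```

In the training script this would be called as `pin_gpu(hvd.local_rank())` right after `hvd.init()`. The device calls are expected to run before TensorFlow initializes its GPUs; changing visible devices afterwards raises a RuntimeError.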
course_material/examples/mnist_single_gpu.py (+6, -7) @ efc3bd4a
@@ -4,17 +4,16 @@
 # Version 2.0 (see the NOTICE file for details).
 """
-This program is an adaptation of the code sample available at
-https://www.tensorflow.org/tutorials/. The program creates
-and trains a shallow ANN for handwritten digit classification
-using the MNIST dataset.
+This program is an adaptation of a previously available code sample
+at https://www.tensorflow.org/tutorials/. The program creates and trains a
+shallow ANN for handwritten digit classification using the MNIST dataset.
 To run this sample use the following command on your
-workstation/laptop equipped with a GPU:
+workstation/laptop:
 python -u mnist.py
-The code has been tested with Python 3.7.5 and tensorflow-gpu 1.13.1.
+The code has been tested with Python 3.8.7 and tensorflow 2.3.1
 Note: This code will NOT work on the supercomputers.
...