Distributed Training (TensorFlow, MPI, & Horovod)
Distributed Training + Gradient
gradient experiments run multinode \
--name multiEx \
--projectId <your-project-id> \
--experimentType GRPC \
--workerContainer tensorflow/tensorflow:1.13.1-gpu-py3 \
--workerMachineType K80 \
--workerCommand "python mnist.py" \
--workerCount 2 \
--parameterServerContainer tensorflow/tensorflow:1.13.1-gpu-py3 \
--parameterServerMachineType K80 \
--parameterServerCommand "python mnist.py" \
--parameterServerCount 1 \
--workspaceUrl https://github.com/Paperspace/mnist-sample.git \
--modelType Tensorflow
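The command above passes the same "python mnist.py" to both the workers and the parameter server, so the script itself has to work out which role it is playing. The following is a minimal sketch of one way to do that. It assumes the cluster layout reaches each node through TensorFlow's standard TF_CONFIG environment variable as plain JSON; if the platform encodes that value, it would need decoding first. The function name read_task is hypothetical and is not taken from the mnist-sample repository.

```python
import json
import os


def read_task(env=os.environ):
    """Return (task_type, task_index, cluster) described by TF_CONFIG.

    Assumption: TF_CONFIG holds plain JSON in TensorFlow's usual shape,
    {"cluster": {...}, "task": {"type": ..., "index": ...}}.
    """
    raw = env.get("TF_CONFIG")
    if not raw:
        # No cluster description: treat this as a single-machine run.
        return "worker", 0, {}
    config = json.loads(raw)
    task = config.get("task", {})
    task_type = task.get("type", "worker")
    task_index = int(task.get("index", 0))
    return task_type, task_index, config.get("cluster", {})


if __name__ == "__main__":
    task_type, task_index, cluster = read_task()
    if task_type == "ps":
        # A parameter server would host variables here instead of training.
        print(f"parameter server {task_index}: serving variables")
    else:
        # A worker would build the model and run the training loop here.
        print(f"worker {task_index}: running training")
```

With --workerCount 2 and --parameterServerCount 1, a script written this way would be expected to see two "worker" tasks and one "ps" task, each branching on its own role.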