# Run an MPI Job

The `mpi_run.py` sample creates a pipeline that runs allreduce-style distributed training.

## Requirements

- [Install arena](https://github.com/kubeflow/arena/blob/master/docs/installation/README.md)

- This sample requires distributed storage. NFS is used as the example here.

1\. Create `/data` on the NFS server by mounting its export temporarily.

```
# mkdir -p /nfs
# mount -t nfs -o vers=4.0 NFS_SERVER_IP:/ /nfs
# mkdir -p /nfs/data
# cd /
# umount /nfs
```

2\. Create the Persistent Volume. Replace `NFS_SERVER_IP` with the address of your NFS server.

```
# cat nfs-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: user-susan
  labels:
    user-susan: pipelines
spec:
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  nfs:
    server: NFS_SERVER_IP
    path: "/data"

# kubectl create -f nfs-pv.yaml
```

3\. Create the Persistent Volume Claim.

```
# cat nfs-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: user-susan
  annotations:
    description: "this is the mnist demo"
    owner: Tom
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      user-susan: pipelines

# kubectl create -f nfs-pvc.yaml
```

> Note: adding the `description` and `owner` annotations is recommended.

## Instructions

### 1. Use the command line to compile the Python code

First, install the necessary Python packages:

```shell
pip3 install http://kubeflow.oss-cn-beijing.aliyuncs.com/kfp/0.1.16/kfp.tar.gz --upgrade
pip3 install http://kubeflow.oss-cn-beijing.aliyuncs.com/kfp-arena/kfp-arena-0.6.tar.gz --upgrade
```

Then compile [mpi_run.py](mpi_run.py):

```
dsl-compile --py mpi_run.py --output mpi_run.py.tar.gz
```

Then submit [mpi_run.py.tar.gz](mpi_run.py.tar.gz) to the Kubeflow Pipelines UI.

![](choose_pipelines.jpg)

You can use the mpirun pipeline definition to submit a run and choose different parameter values.

![](submit_run.jpg)

### 2. Check the result of the MPI Run pipeline

![](demo.jpg)
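
### Optional: script the `NFS_SERVER_IP` substitution

The Persistent Volume step in Requirements asks you to edit `NFS_SERVER_IP` in `nfs-pv.yaml` by hand. That substitution can also be scripted. The following is a minimal sketch and not part of the sample: the function name `render_pv_manifest` and the address `10.0.0.5` are hypothetical, and the manifest text is copied from the Persistent Volume step with only the server address turned into a placeholder. It uses only the Python standard library.

```python
import ipaddress
from string import Template

# Manifest text copied from the Persistent Volume step in Requirements;
# only the server address is replaced by a template placeholder.
PV_TEMPLATE = Template("""\
apiVersion: v1
kind: PersistentVolume
metadata:
  name: user-susan
  labels:
    user-susan: pipelines
spec:
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  nfs:
    server: $nfs_server
    path: "/data"
""")


def render_pv_manifest(nfs_server: str) -> str:
    """Return the nfs-pv.yaml text with the NFS server address filled in.

    Hypothetical helper. It assumes the value is an IP address, because
    the sample's placeholder is named NFS_SERVER_IP.
    """
    # Raises ValueError if the string is not a valid IPv4/IPv6 address.
    ipaddress.ip_address(nfs_server)
    return PV_TEMPLATE.substitute(nfs_server=nfs_server)


if __name__ == "__main__":
    # 10.0.0.5 is a made-up address; replace it with your NFS server's IP.
    print(render_pv_manifest("10.0.0.5"), end="")
```

Redirecting the script's output to `nfs-pv.yaml` gives a file that can be passed to `kubectl create -f nfs-pv.yaml` as in the Persistent Volume step. Validating the address first keeps a typo from reaching the cluster, where it would otherwise surface only as a failed mount.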