mirror of https://github.com/kubeflow/examples.git
* Add kfmd to requirements.txt because the training code now uses kfmd to log data.
* The Dockerfile didn't build with kaniko; it looks like a permission problem when installing Python files into the conda directory. The problem appears to be fixed by not switching to user root.
* Update the base docker image to 1.13.
* Remove some references to namespace in the notebook, because the fairing code should now detect the namespace automatically and the notebook will no longer be running in the kubeflow namespace.
* When training runs in a K8s job, the code now tries to contact the metadata server, but this can fail if the Istio sidecar hasn't started yet. So we need to wait for Istio to start; we do this by trying to contact the metadata server for up to 3 minutes.
* Add a lot more explanation to the notebook to describe what is happening.
* Related to #619

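
The wait for the Istio sidecar described above can be sketched as a bounded retry loop. This is a minimal illustration, not the code from this change: the function name `wait_for_service`, the `probe` callable, and the 5-second retry interval are assumptions; only the 3-minute limit comes from the description.

```python
import time


def wait_for_service(probe, timeout_s=180.0, interval_s=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it succeeds or timeout_s elapses.

    probe is any zero-argument callable that raises while the service
    (here, hypothetically, the metadata server) is unreachable.
    Returns True once probe() returns without raising, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            probe()
            return True
        except Exception:
            # Assumption: any exception means "not ready yet", e.g. a
            # refused connection because the sidecar proxy is still starting.
            if clock() + interval_s > deadline:
                # The next attempt would land after the deadline; give up.
                return False
            sleep(interval_s)
```

A caller might pass something like `lambda: urllib.request.urlopen(METADATA_URL, timeout=5)` as `probe`, where `METADATA_URL` is a placeholder for wherever the metadata server is exposed in the cluster. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.
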
| | | |
|---|---|---|
| addgcpsecret.png | ||