mirror of https://github.com/kubeflow/examples.git
Fix a bunch of issues with the xgboost_synthetic example (#621)
* Add kfmd to requirements.txt because the training code now uses kfmd to log data.
* The Dockerfile didn't build with kaniko; it looks like a permission problem when installing Python files into the conda directory. The problem appears to be fixed by not switching to user root.
* Update the base Docker image to 1.13 (tensorflow-1.13.1-notebook-cpu).
* Remove some references to namespace in the notebook, because the fairing code should now detect the namespace automatically and the notebook will no longer be running in namespace kubeflow.
* When training runs in a K8s job, the code now tries to contact the metadata server, but this can fail if the Istio sidecar hasn't started yet. So we need to wait for Istio to start; we do this by trying to contact the metadata server for up to 3 minutes.
* Add a lot more explanation in the notebook to explain what is happening.
* Related to #619
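The "wait for Istio" step described above amounts to polling the metadata server until it answers or a 3-minute deadline passes. The following is a minimal sketch of that idea using only the standard library. It is not the code from this commit: the names `wait_for` and `metadata_server_reachable`, the 5-second retry interval, and the use of a plain HTTP request as the probe are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_for(probe, timeout_seconds=180.0, interval_seconds=5.0,
             clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True or timeout_seconds elapse.

    Hypothetical helper: returns True if probe() succeeded and False if
    the deadline passed first. clock and sleep are injectable for testing.
    """
    deadline = clock() + timeout_seconds
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused or reset, e.g. while the sidecar is
            # still starting. Treat it as "not ready yet" and retry.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_seconds)


def metadata_server_reachable(url, request_timeout=5.0):
    """Return True if an HTTP GET to url succeeds (assumed probe)."""
    try:
        with urllib.request.urlopen(url, timeout=request_timeout):
            return True
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, refused connections and timeouts.
        return False
```

A training entry point could call `wait_for(lambda: metadata_server_reachable(url))` before logging anything, where `url` is whatever address the metadata server is exposed at in the cluster (not specified here), and abort or skip logging if it returns `False`. The `retrying` package already listed in requirements.txt could express the same loop as a decorator; the standard-library version is used here only to keep the sketch self-contained.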
parent 2acf34f916
commit 5b3016fae9
@@ -4,4 +4,6 @@
 **/__pycache__
 *.zip
 mlpipeline-metrics.json
-mlpipeline-ui-metadata.json
+mlpipeline-ui-metadata.json
+build-train-deploy.py
+**/.dat
@@ -3,21 +3,7 @@
 # This docker image is based on existing notebook image
 # It also includes the dependencies required for training and deploying
 # this way we can use it as the base image
-FROM gcr.io/kubeflow-images-public/tensorflow-1.12.0-notebook-cpu:v0.5.0
-
-USER root
+FROM gcr.io/kubeflow-images-public/tensorflow-1.13.1-notebook-cpu:v0.5.0
 
 COPY requirements.txt .
 RUN pip3 --no-cache-dir install -r requirements.txt
-
-RUN apt-get update -y
-RUN apt-get install -y emacs
-
-RUN pip3 install https://storage.googleapis.com/ml-pipeline/release/0.1.20/kfp.tar.gz
-
-# Checkout kubeflow/testing because we use some of its utilities
-RUN mkdir -p /src/kubeflow && \
-    cd /src/kubeflow && \
-    git clone https://github.com/kubeflow/testing.git testing
-
-USER jovyan
File diff suppressed because it is too large
Binary file not shown (size: 112 KiB).
@@ -3,6 +3,7 @@ fire
 gitpython
 google-cloud-storage
 joblib
+kfmd
 numpy
 pandas
 retrying