4 Commits

Author SHA1 Message Date
Zhenghui Wang 74378a2990 Add end2end test for Xgboost housing example (#493)
* Add e2e test for xgboost housing example

* fix typo

add ks apply

add [

modify example to trigger tests

add prediction test

add xgboost ks param

rename the job name without _

use - instead of _

libsonnet params

rm redundant component

rename component in prow config

add ames-hoursing-env

use - for all names

use _ for params names

use xgboost_ames_accross

rename component name

shorten the name

change deploy-test command

change to xgboost-
namespace

init ks app

fix type

add conftest.py

change path

change deploy command

change dep

change the query URL for seldon

add ks_app with seldon lib

update ks_app

use ks init only

rerun

change to kf-v0-4-n00 cluster

add ks_app

use ks-13

remove --namespace

use kubeflow as namespace

delete seldon deployment

simplify ks_app

retry on 503

fix typo

query 1285

move deletion after prediction

wait 10s

always retry till 10 mins

move check to retry

fix pylint

move clean-up to the delete template

* set up xgboost component

* check in ks component & run it directly

* change comments

* add comment on why use 'ks delete'

* add two modules to pylint whitelist

* ignore tf_operator/py

* disable pylint per line

* reorder import
2019-02-12 06:37:05 -08:00
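
The retry behaviour mentioned in 74378a2990 ("retry on 503", "wait 10s", "always retry till 10 mins") might look roughly like the sketch below. This is not the code from the commit: the function name `predict_with_retry`, its parameters, and the use of `urllib` are assumptions made for illustration only.

```python
import time
import urllib.error
import urllib.request


def predict_with_retry(url, payload, timeout_s=600, wait_s=10,
                       opener=urllib.request.urlopen,
                       sleep=time.sleep, clock=time.monotonic):
    """POST payload to url, retrying on HTTP 503 until timeout_s has elapsed.

    Hypothetical helper: the defaults mirror the "wait 10s" and
    "retry till 10 mins" notes in the commit message, nothing more.
    """
    deadline = clock() + timeout_s
    while True:
        request = urllib.request.Request(
            url, data=payload, headers={"Content-Type": "application/json"})
        try:
            with opener(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Only a 503 (service not ready yet) is retried; anything else,
            # or a 503 after the deadline, is re-raised to the caller.
            if err.code != 503 or clock() >= deadline:
                raise
        sleep(wait_s)
```

The `opener`, `sleep`, and `clock` arguments exist only so the loop can be exercised without a live prediction endpoint.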
Jeremy Lewi 5b797c871e Create an E2E test for TFServing using the rest API (#479)
* Create an E2E test for TFServing using the rest API

* We use the pytest framework because
  1. it has really good support for using command line arguments
  2. can emit junit xml file to report results to prow.

Related to #270: Create a generic test runner

* Address comments.

* Fix lint.

* Add retries to the prediction.

* Add some comments.

* Fix model path.

* Fix the workflow labels
* Set the K8s service name correctly on the test.

* Fix the workflow.

* Fix lint.
2019-01-18 16:29:42 -08:00
Jeremy Lewi 2494fdf8c5 Update serving in mnist example; use 0.4 and add testing. (#469)
* Add the TFServing component
* Create TFServing components.

* The model.py code doesn't appear to be exporting a model in saved model
  format; it was missing a call to export.

  * I'm not sure how this ever worked.

* It also looks like there is a bug in the code in that it's using the cnn input fn even if the model is the linear one. I'm going to leave that as is for now.

* Create a namespace for each test run; delete the namespace on teardown
* We need to copy the GCP service account key to the new namespace.
* Add a shell script to do that.
2019-01-11 14:36:43 -08:00
Jeremy Lewi ef108dbbcc Update training to use Kubeflow 0.4 and add testing. (#465)
* Update training to use Kubeflow 0.4 and add testing.

* To support testing we need to create a ksonnet template to train
  the model so we can easily substitute in different parameters during
  training.

* We create a ksonnet component for just training; we don't use Argo.
  This makes the example much simpler.

* To support S3 we add a generic ksonnet parameter to take environment
  variables as a comma separated list of variables. This should make it
  easy for users to set the environment variables needed to talk to S3.
  This is compatible with the existing Argo workflow which supports S3.

* By default the training job runs non-distributed; this is because to
  run distributed the user needs a shared filesystem (e.g. S3/GCS/NFS).

* Update the mnist workflow to correctly build the images.

  * We didn't update the workflow in the previous example to actually
    build the correct images.

* Update the workflow to run the tfjob_test

* Related to #460 E2E test for mnist.

* Add a parameter to specify a secret that can be used to mount
  a secret such as the GCP service account key.

* Update the README with instructions for GCS and S3.

* Remove the instructions about Argo; the Argo workflow is outdated.

  Using Argo adds complexity to the example and the thinking is to remove
  that to provide a simpler example and to mirror the pytorch example.

* Add a TOC to the README

* Update prerequisite instructions.

  * Delete instructions for installing Kubeflow; just link to the
    getting started guide.

  * Argo CLI should no longer be needed.

  * GitHub token shouldn't be needed; I think that was only needed
    for ksonnet to pull the registry.

* Fix instructions; access keys shouldn't be stored as ksonnet parameters
  as these will get checked into source control.
2019-01-10 12:42:45 -08:00
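
Commit ef108dbbcc describes a generic parameter that takes environment variables as a comma-separated list. The sketch below shows one way such a string could be expanded into container `env` entries. It is written in Python rather than ksonnet, and the `NAME=VALUE` item format, the function name `parse_env_vars`, and the variable names in the usage example are all assumptions, not details taken from the commit.

```python
def parse_env_vars(spec):
    """Turn 'A=1,B=2' into a list of {'name': ..., 'value': ...} dicts.

    Hypothetical helper: assumes each comma-separated item is NAME=VALUE.
    """
    env = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input and trailing commas
        name, sep, value = item.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"expected NAME=VALUE, got {item!r}")
        env.append({"name": name.strip(), "value": value})
    return env


# Illustrative values only; these names are not from the commit.
env = parse_env_vars("S3_ENDPOINT=s3.example.com,AWS_REGION=us-west-2")
```

As the commit message itself notes, access keys should not be passed this way, since ksonnet parameters get checked into source control.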