5 Commits

Author SHA1 Message Date
cliveseldon 8d728f0b06 GitHub Summarization Seldon Update (#472)
* Update model inference wrapping to use S2I and update docs

* Add s2i reference in docs

* Fix typo highlighted in review

* Add pylint annotation to allow protected-access on the Keras make predict function method
2019-01-17 16:07:34 -08:00
Jeremy Lewi 1043bc0c26 A bunch of changes to support distributed training using tf.estimator (#265)
* Unify the code for training with Keras and TF.Estimator

Create a single train.py and trainer.py which uses Keras inside TensorFlow
Provide options to train with either Keras or TF.Estimator
The code to train with TF.Estimator doesn't work

See #196
The original PR (#203) worked around a blocking issue with Keras and TF.Estimator by commenting out
certain layers in the model architecture, leading to a model that wouldn't generate meaningful
predictions.
We weren't able to get TF.Estimator working, but this PR should make it easier to troubleshoot further.

We've unified the existing code so that we don't duplicate the code just to train with TF.Estimator.
We've added unit tests that can be used to verify training with TF.Estimator works. These tests
can also be used to reproduce the current errors with TF.Estimator.
Add a Makefile to build the Docker image

Add an NFS PVC to our Kubeflow demo deployment.

Create a tfjob-estimator component in our ksonnet component.

Changes to distributed/train.py as part of merging with notebooks/train.py:
* Add command line arguments to specify paths rather than hard-coding them.
* Remove the code at the start of train.py to wait until the input data
  becomes available.
  * I think the original intent was to allow the TFJob to be started simultaneously with the preprocessing
    job and just block until the data is available.
  * That should be unnecessary since we can just run the preprocessing job as a separate job.
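
The first change in the list above, taking paths from the command line instead of hard-coding them, can be sketched roughly as follows. This is a minimal illustration using Python's standard argparse module, not the actual code from train.py; the flag names (--input_data, --output_model) and the default value are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse file paths from the command line instead of hard-coding them.

    The flag names and the default below are hypothetical, for illustration only.
    """
    parser = argparse.ArgumentParser(description="Training script (sketch).")
    parser.add_argument(
        "--input_data",
        required=True,
        help="Path to the preprocessed training data.",
    )
    parser.add_argument(
        "--output_model",
        default="model.h5",
        help="Path where the trained model is written.",
    )
    # argv=None makes argparse fall back to sys.argv[1:]; passing a list
    # explicitly makes the function easy to call from a unit test.
    return parser.parse_args(argv)


# Example call with an explicit argument list.
args = parse_args(["--input_data", "/data/train.csv"])
print(args.input_data, args.output_model)
```

Accepting an optional argv list keeps the parsing logic testable without touching the real command line.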

Fix notebooks/train.py (#186)

The code wasn't actually calling Model.fit.
Add a unit test to verify we can invoke fit and evaluate without throwing exceptions.

* Address comments.
2018-11-07 16:23:59 -08:00
Michelle Casbon 41372c9314 Add .pylintrc (#61)
* Add .pylintrc

* Resolve lint complaints in agents/trainer/task.py

* Resolve lint complaints in the Flask app.py

* Resolve linting issues

Remove duplicate seq2seq_utils.py from workflow/workspace/src

* Use Python 3.5.2 with pylint to match prow

Put pybullet import back into agents/trainer/task.py with a pylint ignore statement
Use main(_) to ensure it works with tf.app.run
2018-03-29 08:25:02 -07:00
Hamel Husain 2ec3b03ed4 Update seq2seq_utils.py (#51)
Found a mistake in the calculation of the BLEU score.
2018-03-18 12:25:58 -07:00
Michelle Casbon 76862c5141 Remove third_party folder & MIT license file
2018-02-27 13:17:42 -05:00