
21 Commits

Author SHA1 Message Date
MrXinWang 2acf34f916 object_detection: fix typo in tf-serving.libsonnet (#618)
Modified tf-serving.libsonnet in the object_detection example to fix the error
"FileSystemStoragePathSource encountered a file-system access error:
Could not find base path /models/model for servable model"

Change-Id: I946a0a7fbb6c80992d66fe003ca90b1c21c67cfc
Signed-off-by: Henry Wang <henry.wang@arm.com>
2019-08-14 18:12:34 -07:00
Daniel Castellanos aaea8024be Fixed issue with TFJob api version in tfjob component (#541)
Signed-off-by: Luis Daniel Castellanos <luis.daniel.castellanos@intel.com>
2019-04-13 12:58:00 -07:00
Hougang Liu 1ed74b274c create pv for pets-pv (#439)
* create pv for pets-pv

Many users' k8s clusters don't have dynamic volume provisioning
enabled, so newcomers may be blocked because pets-pv stays
Pending. We can guide them to create an NFS PV as an option.

* tell user how to check if a default storage class is defined

* add link about how to create PV
2018-12-21 06:05:11 -08:00
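
One way to check for a default storage class, sketched here only as an illustration and not taken from the example's instructions, is to inspect the output of `kubectl get storageclass -o json` for the `is-default-class` annotation. The helper name `default_storage_classes` and the sample class names are hypothetical; the annotation keys are the standard Kubernetes ones (current and legacy beta).

```python
import json

# Annotation keys Kubernetes uses to mark a default StorageClass
# (the second is the older beta form).
DEFAULT_ANNOTATIONS = (
    "storageclass.kubernetes.io/is-default-class",
    "storageclass.beta.kubernetes.io/is-default-class",
)


def default_storage_classes(storage_class_list):
    """Return names of StorageClasses annotated as the cluster default.

    `storage_class_list` is the parsed JSON printed by
    `kubectl get storageclass -o json`.
    """
    names = []
    for item in storage_class_list.get("items", []):
        metadata = item.get("metadata", {})
        annotations = metadata.get("annotations") or {}
        if any(annotations.get(key) == "true" for key in DEFAULT_ANNOTATIONS):
            names.append(metadata.get("name"))
    return names


# Hypothetical kubectl output with one default and one non-default class.
sample = json.loads("""
{
  "items": [
    {"metadata": {"name": "standard",
                  "annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}},
    {"metadata": {"name": "slow"}}
  ]
}
""")

defaults = default_storage_classes(sample)
print(defaults or "no default storage class; create a PV manually")
```

If the returned list is empty, a claim that relies on dynamic provisioning would be expected to stay Pending until a matching PV is created by hand.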
Hougang Liu fc5a85b948 reconcile tensorflow serving version (#409)
Since the default OBJ_DETECTION_IMAGE uses TensorFlow 1.10.0, we
pin TF to 1.10.0 consistently across the example.

Fixes: #408
2018-12-08 14:32:46 -08:00
Sam Shi b2e6aa231c Save the batch-predict package in the image; Create a separate Dockerfile for GPU (#383)
* Save the batch-predict package in the image; create a separate Dockerfile for GPU

* remove commented code
2018-12-07 10:28:57 -08:00
Hougang Liu 9994b57497 add object detection grpc client (#378)
* add object detection grpc client

Fixes: #377

* fix kubeflow-examples-presubmit error

object_detection_grpc_client.py depends on other files in
https://github.com/tensorflow/models.git, and pylint fails
because those files need to be compiled manually.
Since mnist_DDP.py has a similar dependency, we follow the
approach used for mnist_DDP.py and skip checking this file.
2018-12-06 18:51:24 -08:00
Hougang Liu 6855802aa1 tf-training-job doesn't complete (#367)
In tensorflow/models/research/object_detection/, only
tensorflow/models/research/object_detection/legacy/train.py
supports Kubeflow so far (it constructs the cluster by reading
the TF_CONFIG environment variable).

Fixes: #277
2018-11-28 22:48:21 -08:00
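
As a rough illustration of what "constructing the cluster by reading TF_CONFIG" involves, the sketch below parses the variable using only the standard library. It is not the code in legacy/train.py: the helper name `read_tf_config` and the host names are hypothetical, and the JSON layout (a `cluster` map plus a `task` with `type` and `index`) is assumed to follow the conventional TF_CONFIG shape.

```python
import json
import os


def read_tf_config(environ=None):
    """Parse TF_CONFIG into (cluster, task_type, task_index).

    An unset TF_CONFIG is treated as a single-process run and yields
    an empty cluster, no task type, and index 0.
    """
    environ = os.environ if environ is None else environ
    config = json.loads(environ.get("TF_CONFIG", "{}"))
    cluster = config.get("cluster", {})
    task = config.get("task", {})
    return cluster, task.get("type"), int(task.get("index", 0))


# Hypothetical value of the kind injected into a worker replica's pod.
example_env = {
    "TF_CONFIG": json.dumps({
        "cluster": {
            "ps": ["ps-0:2222"],
            "worker": ["worker-0:2222", "worker-1:2222"],
        },
        "task": {"type": "worker", "index": 1},
    })
}

cluster, task_type, task_index = read_tf_config(example_env)
# The address this replica is expected to serve on.
own_address = cluster[task_type][task_index]
print(task_type, task_index, own_address)
```

A training script that never reads this variable has no way to learn its role or its peers, which is consistent with the job not completing when a different entry point is used.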
Guang Ya Liu db8f4f4b37 Highlight the kubectl command. (#369) 2018-11-28 22:41:40 -08:00
Hougang Liu 15007fdeea Add ks env configuration guideline and directory (#346) (#347) 2018-11-26 22:05:36 -08:00
Konstantinos Samaras-Tsakiris 5c38c96fae Fix #272 (#273)
* Fix #272

Fix #272 where the `create-pet-record-job` pod produces this error: `models/research/object_detection/data/pet_label_map.pbtxt; No such file or directory`

* Update create-pet-record-job.jsonnet
2018-10-22 14:57:24 -07:00
Konstantinos Samaras-Tsakiris 6edf7915f5 Fix #275 (#276)
Fix #275 by changing the default mount path for the training data.
2018-10-22 12:14:13 -07:00
Konstantinos Samaras-Tsakiris b0f9b4cfd0 Fix bash (#271)
Remove spaces around a bash variable declaration.
2018-10-22 12:02:04 -07:00
Daniel Castellanos e6b6730650 Updated object detection training example (#228)
* Updated Dockerfile.traning to use the latest TensorFlow
  and TensorFlow Object Detection API.
* Updated tf-training-job component and added a chief
  replica spec
* Corrected some typos and updated some instructions
2018-08-20 19:32:12 -07:00
Lun-Kai Hsu f3806d0bac Small fix to TF serving gpu (#221)
* Small fix to TF serving gpu

* fix

* fix

* fix
2018-08-14 14:27:35 -07:00
Daniel Castellanos 9bda30b7d9 Fixed broken links in object detection example (#211) 2018-08-03 16:05:27 -07:00
Sam Shi b6a4d06f00 Batch predict example for object detection using GPU (#199)
* adding batch-predict on GPU example

* Sync with TF-serving GPU example.

* adding visualization instructions

* change the title of readme.md

* changes according to the review comments from jlewi

* Replace the links to personal project with the one in kubeflow-example project in the yaml file

* change the procedure to build images

* polish the md file

* some minor md change

* fix a broken gs link

* fix more merge errors
2018-08-03 11:57:53 -07:00
Daniel Castellanos 091eacb4f6 Parametrize Object detection example (#192)
* Added Ksonnet prototypes to parametrize old yaml files

* Modified instructions

* Added tf-training-job component

* Removed yaml manifest files

Modified serving instructions

* Consolidate get-data and decompression jobs

* Deleted registry and prototypes

* Added components to ks-app dir
* Modified instructions

* Fixed references to user guide page

Improved instructions

* General improvements to components and instructions

* Removed obj-detection.libsonnet file
* used specific params in export-graph and create-tf-record
  instead of list params like 'args' and 'command'
* Improved instructions and removed references to yaml files
2018-08-02 18:44:26 -07:00
Lun-Kai Hsu 1746820f8f Example of TF Serving with GPU (#154)
* initial

* wip

* working now

* fix

* fix lint

* fix lint

* fix lint

* review

* move

* fix

* addressing comment

* lint

* fix
2018-07-24 21:44:55 -07:00
Roy Xue 38b3259dc1 Update pets-training to v1alpha2 (#183)
* Update pets-training to v1alpha2

* Remove GPU config
2018-07-17 21:43:18 -07:00
Roy Xue 151713c7bf Update decompress config to avoid error, fix typo (#177) 2018-07-16 10:08:55 -07:00
Daniel Castellanos b6a3c4c0ea Added tutorial for object detection distributed training (#74)
* Added tutorial for object detection distributed training

Added steps on how to leverage kubeflow tooling to
submit a distributed object detection training job
in a small kubernetes cluster (minikube, 2-4 node cluster)

* Added Jobs to prepare the training data and model

* Updated instructions

* fixed typos and added export tf graph job

* Fixed paths in jobs and instructions

* Enhanced instructions and re-arranged folder structure

* Updated links to kubeflow user guide documentation
2018-07-03 14:10:20 -07:00