Commit Graph

167 Commits

Author SHA1 Message Date
Puneith Kaul 23c3999a3d renamed to xgboost_ames_housing 2018-08-22 00:55:36 -07:00
Puneith Kaul 859bed14b9 fixed lint errors 2018-08-22 00:49:59 -07:00
Puneith Kaul 20f8568da5 fix spaces 2018-08-22 00:28:39 -07:00
Puneith Kaul ec77c2aae1 fix spaces 2018-08-22 00:24:09 -07:00
Puneith Kaul 5a91630c77 reorder imports 2018-08-22 00:16:24 -07:00
Puneith Kaul e32f935faf removed test function 2018-08-22 00:14:58 -07:00
Puneith Kaul 4168dabfa6 added bitly 2018-08-22 00:07:07 -07:00
Puneith Kaul c38a19c8ea Update housing.py 2018-08-21 23:59:22 -07:00
Puneith Kaul a85cc31fce Update HousingServe.py 2018-08-21 23:58:52 -07:00
Puneith Kaul 37cf0ef755 Update housing.py 2018-08-21 19:07:45 -07:00
Puneith Kaul ecc1aab0e4 new PR for XGBoost due to problems with history rewrite 2018-08-21 18:49:50 -07:00
Daniel Castellanos e6b6730650 Updated object detection training example (#228)
* Updated Dockerfile.traning to use latest tensorflow
  and tensorflow object detection api.
* Updated tf-training-job component and added a chief
  replica spec
* Corrected some typos and updated some instructions
2018-08-20 19:32:12 -07:00
Sanyam Kapoor f9873e6ac4 Upgrade notebook commands and other relevant changes (#229)
* Replace double quotes for field values (ks convention)

* Recreate the ksonnet application from scratch

* Fix pip commands to find requirements and redo installation, fix ks param set

* Use sed replace instead of ks param set.

* Add cells to first show JobSpec and then apply

* Upgrade T2T, fix conflicting problem types

* Update docker images

* Reduce to 200k samples for vocab

* Use Jupyter notebook service account

* Add illustrative gsutil commands to show output files, specify index files glob explicitly

* List files after index creation step

* Use the model in current repository and not upstream t2t

* Update Docker images

* Expose TF Serving Rest API at 9001

* Spawn terminal from the notebooks ui, no need to go to lab
2018-08-20 16:35:07 -07:00
Michelle Casbon 0843cdad66 Add Yelp restaurant review demo files (#220)
* Add Yelp restaurant review demo files

* Add video links

* Resolve lint issues
2018-08-15 22:49:00 -07:00
Sanyam Kapoor 4e015e76a3 Cherry pick changes to PredictionDoFn (#226)
* Cherry pick changes to PredictionDoFn

* Disable lint checks for cherry picked file

* Update TODO and notebook install instructions

* Restore CUSTOM_COMMANDS todo
2018-08-15 06:21:00 -07:00
Sanyam Kapoor 18829159b0 Add a new github function docstring extended problem (#225)
* Add a new github function docstring extended problem

* Fix lint errors

* Update images
2018-08-14 15:41:47 -07:00
Sanyam Kapoor 8fce4a7799 Allow ks param set for Code Search Ksonnet Application (#224)
* Allow ks param set for t2t-code-search

* Update notebook with working directory param set

* Abstract out common variables for easy ks param set
2018-08-14 15:29:04 -07:00
Lun-Kai Hsu f3806d0bac Small fix to TF serving gpu (#221)
* Small fix to TF serving gpu

* fix

* fix

* fix
2018-08-14 14:27:35 -07:00
Sanyam Kapoor a687c51036 Add a Jupyter notebook to be used for Kubeflow codelabs (#217)
* Add a Jupyter notebook to be used for Kubeflow codelabs

* Add help command for create_function_embeddings module

* Update README to point to Jupyter Notebook

* Add prerequisites to readme

* Update README and getting started with notebook guide

* [wip]

* Update notebook with BigQuery previews

* Update notebook to automatically select the latest MODEL_VERSION
2018-08-13 21:43:26 -07:00
Ankush Agarwal a80c15b50e Merge pull request #213 from activatedgeek/search-server-kubeflow
Update Search Index server spec
2018-08-09 14:57:49 -07:00
Sanyam Kapoor 6e9150bad6 Parametrize volumes and ports for nmslib containers 2018-08-09 10:53:23 -07:00
Sanyam Kapoor 133e054033 Refactor job and deployment specs into different functions 2018-08-09 10:53:23 -07:00
Sanyam Kapoor e34f9aca75 Build just one image with the correct tag instead of double the number 2018-08-09 10:53:23 -07:00
Sanyam Kapoor c86f306d79 Use kind Job instead of Pod 2018-08-09 10:53:23 -07:00
Sanyam Kapoor 6527aba7c1 Upgrade JS app to be served at any path prefix 2018-08-09 10:53:23 -07:00
Sanyam Kapoor 9ce23d9fc6 Working search index server 2018-08-09 10:53:23 -07:00
Sanyam Kapoor 02db0065c1 Make search index creation a one-off job 2018-08-09 10:53:23 -07:00
Sanyam Kapoor d4669467d8 Update Search Index server spec with new commands 2018-08-09 10:53:23 -07:00
Sanyam Kapoor cfdcb1292c Update Ksonnet version, Add Python2 pip (#216)
* Update Ksonnet version, Add Python2 pip

* Update ks version in README
2018-08-07 22:58:20 -07:00
Richard Liu 082561a75b Create Jupyter notebook image for codelabs (#214)
* Create Jupyter notebook image for codelabs

* Add makefile
2018-08-06 16:16:02 -07:00
Daniel Castellanos 9bda30b7d9 Fixed broken links in object detection example (#211) 2018-08-03 16:05:27 -07:00
Sanyam Kapoor f2151f66fc Merge UI and Search Server (#209)
* Use the nicer tf.gfile interface for search index creation

* Update documentation and more maintainable interface to search server

* Add ability to control number of outputs

* Serve React UI from the Flask server

* Update Dockerfile for the unified server and ui
2018-08-03 15:56:09 -07:00
Sam Shi b6a4d06f00 Batch predict example for object detection using GPU (#199)
* adding batch-predict on GPU example

* Sync with TF-serving GPU example.

* adding visualization instructions

* change the title of readme.md

* changes according to the review comments from jlewi

* Replace the links to personal project with the one in kubeflow-example project in the yaml file

* change the procedure to build images

* polish the md file

* some minor md change

* fix a broken gs link

* fix more merge errors
2018-08-03 11:57:53 -07:00
Sanyam Kapoor e9e844022e Disable Distributed Training (#207)
* Upgrade TFJob and Ksonnet app

* Container name should be tensorflow. See #563.

* Working single node training and serving on Kubeflow

* Add issue link for fixme

* Remove redundant create secrets and use Kubeflow provided secrets
2018-08-02 23:02:05 -07:00
Daniel Castellanos 091eacb4f6 Parametrize Object detection example (#192)
* Added Ksonnet prototypes to parametrize old yaml files

* Modified instructions

* Added tf-training-job component

* Removed yaml manifest files

Modified serving instructions

* Consolidate get-data and decompression jobs

* Deleted registry and prototypes

* Added components to ks-app dir
* Modified instructions

* Fixed references to user guide page

Improved instructions

* General improvements to components and instructions

* Removed obj-detection.libsonnet file
* used specific params in export-graph and create-tf-record
  instead of list params like 'args' and 'command'
* Improved instructions and removed references to yaml files
2018-08-02 18:44:26 -07:00
Sanyam Kapoor fd2e750990 Fix T2T memory problem (#205)
* Update T2T problems to workaround memory limitations

* Add max_samples_for_vocab to prevent memory overflow

* Fix a base URL to download data from, sweet spot for max samples

* Convert class variables to class properties

* Fix lint errors

* Use Python2/3 compatible code for StringIO

* Fix lint errors

* Fix source data files format

* Move to Text2TextProblem instead of TranslateProblem

* Update details for num_shards and T2T problem dataset
2018-08-01 13:37:41 -07:00
Sanyam Kapoor 767c90ff20 Refactor dataflow pipelines (#197)
* Update to a new dataflow package

* [WIP] updating docstrings, fixing redundancies

* Limit the scope of Github Transform pipeline, make everything unicode

* Add ability to start github pipelines from transformed bigquery dataset

* Upgrade batch prediction pipeline to be modular

* Fix lint errors

* Add write disposition to BigQuery transform

* Update documentation format

* Nicer names for modules

* Add unicode encoding to parsed function docstring tuples

* Use Apache Beam options parser to expose all CLI arguments
2018-07-27 06:26:56 -07:00
Lun-Kai Hsu 1746820f8f Example of TF Serving with GPU (#154)
* initial

* wip

* working now

* fix

* fix lint

* fix lint

* fix lint

* review

* move

* fix

* addressing comment

* lint

* fix
2018-07-24 21:44:55 -07:00
Lun-Kai Hsu f340a4c2c7 fix typo in OWNER (#193) 2018-07-24 08:04:55 -07:00
Sanyam Kapoor 994fdf82c0 Integrate nmslib (#194)
* Integrate NMSLib server with new data file

* Integrate UI with query URL of search server
2018-07-23 17:17:24 -07:00
Sanyam Kapoor 636cf1c3d0 Integrate batch prediction (#184)
* Refactor the dataflow package

* Create placeholder for new prediction pipeline

* [WIP] add dofn for encoding

* Merge all modules under single package

* Pipeline data flow complete, wip prediction values

* Fallback to custom commands for extra dependency

* Working Dataflow runner installs, separate docker-related folder

* [WIP] Updated local user journey in README, fully working commands, easy container translation

* Working Batch Predictions.

* Remove docstring embeddings

* Complete batch prediction pipeline

* Update Dockerfiles and T2T Ksonnet components

* Fix linting

* Downgrade runtime to Python2, wip memory issues so use lesser data

* Pin master to index 0.

* Working batch prediction pipeline

* Modular Github Batch Prediction Pipeline, stores back to BigQuery

* Fix lint errors

* Fix module-wide imports, pin batch-prediction version

* Fix relative import, update docstrings

* Add references to issue and current workaround for Batch Prediction dependency.
2018-07-23 16:26:23 -07:00
Roy Xue 38b3259dc1 Update pets-training to v1alpha2 (#183)
* Update pets-training to v1alpha2

* Remove GPU config
2018-07-17 21:43:18 -07:00
Sanyam Kapoor 2adbb7ace4 Fix transformer export (#169)
* Add auto-downloads for the data

* Make top() a no-op, working export

* Fix lint errors

* Integrate NMSlib server with TF Serving

* Clarify data URLs purpose
2018-07-16 14:06:52 -07:00
Roy Xue 151713c7bf Update decompress config to avoid error, fix typo (#177) 2018-07-16 10:08:55 -07:00
Pete MacKinnon d2c5e949e5 Update PVC to /home/jovyan (#119) 2018-07-13 14:39:26 -07:00
Jeremy Lewi eaf0298590 Create a deployment to run the HP/Katib controller for the GitHub issue example. (#161)
* Some of the code is copied over from https://github.com/kubeflow/katib/tree/master/examples/GKEDemo

  * I think it makes sense to centralize all the code in a single place.

* Update the controller program (git-issue-summarize-demo.go) so that it can
  specify the Docker image containing the training code.

* Create a ksonnet deployment for running the controller on the cluster.

* The HP tuning job isn't functional; here's an incomplete list of issues:

  * The training jobs launched fail because they don't have GCP credentials
    so they can't download the data.

  * We don't actually extract and report metrics back to Katib.

Related to: kubeflow/katib#116
2018-07-11 08:46:25 -07:00
Sanyam Kapoor d692db36e8 Search UI Components (#168)
* Initialize search UI. Needs connection to search service

* Fix page title

* Add component for code search results, dummy values for now

* Fix title and manifest

* Add mock loading UI. Need to fill in real API results

* Wrap application into Dockerfile
2018-07-10 20:08:25 -07:00
Sanyam Kapoor c5f13464b4 Add negative sampling to Transformer network (#167)
* Add negative sampling to Transformer network

* Add generate data flag, can skip t2t-datagen step
2018-07-04 20:14:22 -07:00
Daniel Castellanos b6a3c4c0ea Added tutorial for object detection distributed training (#74)
* Added tutorial for object detection distributed training

Added steps on how to leverage kubeflow tooling to
submit a distributed object detection training job
in a small kubernetes cluster (minikube, 2-4 node cluster)

* Added Jobs to prepare the training data and model

* Updated instructions

* fixed typos and added export tf graph job

* Fixed paths in jobs and instructions

* Enhanced instructions and re-arranged folder structure

* Updated links to kubeflow user guide documentation
2018-07-03 14:10:20 -07:00
Sanyam Kapoor 5a9748bf8f Add similarity transformer body (#159)
* Add similarity transformer body

* Update pipeline to Write a single CSV file

* Fix lint errors

* Use CSV writer to handle formatting rows

* Use direct transformer encoding methods with variable scopes

* Complete end-to-end training with new model and problem

* Read from multiple csv files
2018-07-03 11:14:19 -07:00