# Code Search on Kubeflow
This demo implements End-to-End Code Search on Kubeflow.
## Prerequisites
**NOTE**: If using the JupyterHub Spawner on a Kubeflow cluster, use the Docker image
`gcr.io/kubeflow-images-public/kubeflow-codelab-notebook`,
which has all the prerequisites baked in.
- **Kubeflow Latest**: This notebook assumes a Kubeflow cluster is already deployed. See Getting Started with Kubeflow.
- **Python 2.7** (bundled with `pip`): For this demo, we will use Python 2.7. This restriction is due to Apache Beam, which does not support Python 3 yet (see BEAM-1251).
- **Google Cloud SDK**: This example will use tools from the Google Cloud SDK. The SDK must be authenticated and authorized. See Authentication Overview.
- **Ksonnet 0.12**: We use Ksonnet to write Kubernetes jobs in a declarative manner to be run on top of Kubeflow.
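As a quick sanity check before starting, you can verify from a terminal that the tools listed above are installed. This is an illustrative sketch, not part of the demo itself:

```shell
# Illustrative sanity check for the prerequisites listed above;
# it only reports whether each tool is on the PATH.
for tool in python pip gcloud ks; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "MISSING: $tool"
  fi
done
```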
## Getting Started
To get started, follow the instructions below.
**NOTE**: We will assume that the Kubeflow cluster is available at `kubeflow.example.com`. Make sure
you replace this with the true FQDN of your Kubeflow cluster in any subsequent instructions.
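One illustrative way to avoid typos when substituting the FQDN is to store it in a shell variable and build URLs from it; the variable name here is our own and is not used elsewhere in this demo:

```shell
# Illustrative only: keep your cluster's real FQDN in one place.
KUBEFLOW_FQDN="kubeflow.example.com"  # replace with your cluster's FQDN
echo "JupyterHub spawner: https://${KUBEFLOW_FQDN}/hub"
```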
1. Spawn a new JupyterLab instance inside the Kubeflow cluster by pointing your browser to https://kubeflow.example.com/hub and clicking "Start My Server".

2. In the Image text field, enter
   `gcr.io/kubeflow-images-public/kubeflow-codelab-notebook:v20180808-v0.2-22-gcfdcb12`.
   This image contains all the prerequisites needed for the demo.

3. Once spawned, you should be redirected to the Jupyter Notebooks UI.

4. Spawn a new Terminal and run

   ```shell
   git clone --branch=master --depth=1 https://github.com/kubeflow/examples
   ```

   This will create an `examples` folder. It is safe to close the terminal now.

5. Navigate back to the Jupyter Notebooks UI and navigate to
   `examples/code_search`. Open the Jupyter notebook `code-search.ipynb`
   and follow along.
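Since the demo drives its Kubernetes jobs through Ksonnet, the general workflow inside the notebook's terminal looks roughly like the sketch below. The application and component names are hypothetical placeholders; the real ones come from the notebook itself:

```shell
# Hypothetical Ksonnet workflow sketch; "ks-app" and "my-component"
# are placeholders, not names taken from this repository.
if command -v ks >/dev/null 2>&1; then
  cd ks-app                          # the Ksonnet application directory
  ks env add kubeflow                # register the target cluster environment
  ks apply kubeflow -c my-component  # deploy one component to the cluster
else
  echo "ks not found: install Ksonnet 0.12 first"
fi
```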
## Acknowledgements
This project derives from hamelsmu/code_search.