Jupyter Notebook
First, copy Cornell-1000-nltk.ipynb and Twitter-5000-nltk.ipynb into the Jupyter Notebook workspace folder.
If you installed Kubeflow with Minikube, this folder is usually located at:
/tmp/hostpath-provisioner/kubeflow-user-example-com/workspace-<your Jupyter name>
Pipeline
Cornell-1000.zip and twitter-5000.zip are the compressed files generated by running Cornell-1000-nltk.ipynb and Twitter-5000-nltk.ipynb.
Each compressed file contains the YAML file of the corresponding pipeline.
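
To check what an archive contains before uploading it, a short Python sketch like the one below can list the YAML files inside it. The helper name `list_pipeline_yaml` is hypothetical, and the sketch assumes only that the archive is a regular zip file.

```python
import zipfile


def list_pipeline_yaml(zip_path):
    """Return the names of the YAML files stored inside a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith((".yaml", ".yml"))
        ]
```

For example, `list_pipeline_yaml("Cornell-1000.zip")` would return the names of the pipeline YAML files in that archive, assuming the file is in the current directory.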

Custom data
Twitter-5000-nltk and Cornell-1000-nltk share most of their code; they differ only in how the data is downloaded and read.
To use other data, you only need to classify it into positive and negative samples and store them as str values in pos_tweets and neg_tweets.
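
As a rough sketch of that preparation step, the snippet below splits labelled samples into the two variables. It assumes that pos_tweets and neg_tweets are lists of str and that your data arrives as (text, label) pairs with the labels "pos" and "neg"; the helper `split_by_label` and the sample sentences are made up for illustration.

```python
def split_by_label(samples):
    """Split (text, label) pairs into positive and negative text lists.

    The labels "pos" and "neg" are an assumption of this sketch.
    """
    pos_tweets, neg_tweets = [], []
    for text, label in samples:
        if label == "pos":
            pos_tweets.append(str(text))  # store every sample as str
        elif label == "neg":
            neg_tweets.append(str(text))
        else:
            raise ValueError(f"unknown label: {label!r}")
    return pos_tweets, neg_tweets


# Hypothetical sample data, only for illustration.
samples = [
    ("great movie", "pos"),
    ("waste of time", "neg"),
    ("loved it", "pos"),
]
pos_tweets, neg_tweets = split_by_label(samples)
print(len(pos_tweets), len(neg_tweets))  # prints "2 1"
```

Raising an error on unknown labels keeps unlabelled samples from silently dropping out of the training data.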

Port Forward
Step 1: Find the name of the pod that serves the HTTP port

Step 2: Port-forward
kubectl port-forward -n kubeflow-user-example-com <pod name> 3000:5000

Step 3: Open either of the following addresses in the browser
http://localhost:3000/
or
127.0.0.1:3000
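
If the page does not load, it can help to confirm that the forwarded port is accepting connections at all. The following is a minimal sketch that uses only the Python standard library; the helper name `is_port_open` is hypothetical, and the default port 3000 is the local port from Step 2.

```python
import socket


def is_port_open(host="127.0.0.1", port=3000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

A result of False from `is_port_open()` suggests that the kubectl port-forward command is not running or is using a different local port.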

Step 4: Predict


Accuracy
You can check the accuracy of each NLP run individually,

or you can compare several runs side by side with a run comparison.

Disabling caching in your Kubeflow Pipelines deployment
If the pipeline does not work properly when you run it again after deleting the PVC, the cause may be cached results.
Run the following commands to disable caching.
export NAMESPACE=kubeflow
kubectl patch mutatingwebhookconfiguration cache-webhook-${NAMESPACE} --type='json' -p='[{"op":"replace", "path": "/webhooks/0/rules/0/operations/0", "value": "DELETE"}]'
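
The patch above is a JSON Patch document: it replaces the first operation of the first rule of the first webhook in cache-webhook-${NAMESPACE} with DELETE. As an illustration only, the same command can be assembled in Python so that the namespace becomes a parameter; the helper name `build_cache_patch_command` is hypothetical, and the sketch only builds the argument list without running kubectl.

```python
import json


def build_cache_patch_command(namespace="kubeflow"):
    """Build the kubectl argument list that patches the cache webhook."""
    # Same JSON Patch as in the shell command above.
    patch = [
        {
            "op": "replace",
            "path": "/webhooks/0/rules/0/operations/0",
            "value": "DELETE",
        }
    ]
    return [
        "kubectl",
        "patch",
        "mutatingwebhookconfiguration",
        f"cache-webhook-{namespace}",
        "--type=json",
        "-p",
        json.dumps(patch),
    ]


cmd = build_cache_patch_command()
print(" ".join(cmd[:4]))
```

On a machine where kubectl is configured for the cluster, the returned list could be passed to `subprocess.run(cmd, check=True)`.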