+++
title = "Nuclio functions"
description = "Nuclio - High performance serverless for data processing and ML"
weight = 40
toc = true
+++
## Nuclio Overview
[nuclio](https://github.com/nuclio/nuclio) is a high-performance serverless platform that runs over Docker or Kubernetes
and automates the development, operation, and scaling of code (written in 8 supported languages).
Nuclio is focused on data analytics and ML workloads; it provides extreme performance and parallelism, supports stateful and data-intensive
workloads and GPU resource optimization, offers check-pointing, and ships with 14 native triggers/streaming protocols out of the box, including HTTP, Cron, batch, Kafka, Kinesis,
Google Pub/Sub, Azure Event Hubs, and MQTT. Additional triggers can be added dynamically (e.g. a [Twitter feed](https://github.com/v3io/tutorials/blob/master/demos/stocks/read-tweets.ipynb)).
Nuclio can run in the cloud as a [managed offering](https://www.iguazio.com/), or on any Kubernetes cluster (cloud, on-prem, or edge).
[read more about nuclio ...](https://github.com/nuclio/nuclio)
## Using Nuclio In Data Science Pipelines
Nuclio functions can be used in the following ML pipeline tasks:
- Data collectors, ETL, stream processing
- Data preparation and analysis
- Hyper parameter model training
- Real-time model serving
- Feature vector assembly (real-time data preparation)
Containerized functions (plus dependent files and spec) can be created directly from a Jupyter Notebook
using `%nuclio` magic commands or SDK API calls (see [nuclio-jupyter](https://github.com/nuclio/nuclio-jupyter)),
or they can be built and deployed from a KubeFlow Pipeline (see [nuclio pipeline components](https://github.com/kubeflow/pipelines/tree/master/components/nuclio)),
e.g. when we want to deploy or update inference functions right after we update an ML model.
## Installing Nuclio over Kubernetes
The Nuclio [Git repo](https://github.com/nuclio/nuclio) contains detailed documentation on installation and usage.
You can also follow this [interactive tutorial](https://www.katacoda.com/javajon/courses/kubernetes-serverless/nuclio).
The simplest way to install is with `Helm`. Assuming Helm is deployed on your cluster, type the following commands:
```
helm repo add nuclio https://nuclio.github.io/nuclio/charts
kubectl create ns nuclio
helm install nuclio/nuclio --name=nuclio --namespace=nuclio --set dashboard.nodePort=31000
kubectl -n nuclio get all
```
Browse to the dashboard URL; there you can create, test, and manage functions using a visual editor.
> Note: you can change the NodePort number or skip that option for in-cluster use.
## Writing and Deploying a Simple Function
The simplest way to write a nuclio function is from within Jupyter.
The entire notebook, portions of it, or code files can be turned into functions with a single magic/SDK command;
see [the SDK](https://github.com/nuclio/nuclio-jupyter) for detailed documentation.
The full notebook with the example below can be [found here](https://github.com/nuclio/nuclio-jupyter/blob/master/docs/nlp-example.ipynb).
Before you begin, install the latest `nuclio-jupyter` package:

    pip install --upgrade nuclio-jupyter
We write and test our code inside a notebook like any other data science code.
We add some `%nuclio` magic commands to describe additional configurations such as which packages to install,
CPU/Mem/GPU resources, how the code will get triggered (http, cron, stream), environment variables,
additional files we want to bundle (e.g. ML model, libraries), versioning, etc.
First we need to import the `nuclio` package (we add an `ignore` comment so this line won't be included in the compiled function code):
```python
# nuclio: ignore
import nuclio
```
We add the function spec, environment, and configuration details using magic commands:
```
%nuclio cmd pip install textblob
%nuclio env TO_LANG=fr
%nuclio config spec.build.baseImage = "python:3.6-jessie"
```
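The same environment and config settings can also be expressed through the SDK's `nuclio.ConfigSpec` (the object passed to `deploy_file` later in this guide). Below is a minimal sketch covering only the `env` and `config` options; the `pip install` step above is still handled by the `%nuclio cmd` magic:

```python
# nuclio: ignore
# a minimal sketch: the env/config magics above expressed via the SDK
spec = nuclio.ConfigSpec(
    env={'TO_LANG': 'fr'},                                 # same as: %nuclio env TO_LANG=fr
    config={'spec.build.baseImage': 'python:3.6-jessie'},  # same as: %nuclio config spec.build.baseImage = ...
)
```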
We then write our code as usual, just making sure it has a handler function which
is invoked to initiate our run. The handler accepts a context and an event, e.g.:
`def handler(context, event)`
**Function code**
The following example accepts text and does some NLP processing (correction, sentiment analysis, and translation):
```python
from textblob import TextBlob
import os

def handler(context, event):
    context.logger.info('This is an NLP example! ')

    # process and correct the text
    blob = TextBlob(str(event.body.decode('utf-8')))
    corrected = blob.correct()

    # debug print the text before and after correction
    context.logger.info_with("Corrected text",
                             corrected=str(corrected), orig=str(blob))

    # calculate sentiments
    context.logger.info_with("Sentiment",
                             polarity=str(corrected.sentiment.polarity),
                             subjectivity=str(corrected.sentiment.subjectivity))

    # read target language from environment and return translated text
    lang = os.getenv('TO_LANG', 'fr')
    return str(corrected.translate(to=lang))
```
Now we can test the function using the built-in function context and examine its output:
```python
# nuclio: ignore
event = nuclio.Event(body=b'good morninng')
handler(context, event)
```
Finally, we deploy our function using the magic commands, SDK, or a KubeFlow Pipeline.
We can simply write and run the following command in a cell:
`%nuclio deploy -n nlp -p ai -d `
If we want more control we can use the SDK:
```python
# nuclio: ignore
import requests

# deploy the notebook code with extra configuration (env vars, config, etc.)
spec = nuclio.ConfigSpec(config={'spec.maxReplicas': 2}, env={'EXTRA_VAR': 'something'})
addr = nuclio.deploy_file(name='nlp', project='ai', verbose=True, spec=spec,
                          tag='v1.1', dashboard_url='')

# invoke the generated function
resp = requests.get('http://' + addr)
print(resp.text)
```
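Since the handler reads the request body (`event.body`), we can also POST the text we want corrected and translated to the deployed endpoint. A minimal sketch, reusing the `addr` returned above:

```python
# nuclio: ignore
# a minimal sketch: POST a text body to the deployed NLP function
# (the handler corrects, analyzes, and translates it)
resp = requests.post('http://' + addr, data=b'good morninng')
print(resp.text)
```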
We can also deploy our function directly from Git:
```python
addr = nuclio.deploy_file('git://github.com/nuclio/nuclio#master:/hack/examples/python/helloworld',
                          name='hw', project='myproj', dashboard_url='')
resp = requests.get('http://' + addr)
print(resp.text)
```
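`deploy_file` also accepts a local notebook or code file path. A minimal sketch, assuming an illustrative local file `handler.py` that defines a `handler(context, event)` function:

```python
# a minimal sketch: deploy a local code file (the file name is illustrative);
# deploy_file accepts notebook (.ipynb) paths and git URLs in the same way
addr = nuclio.deploy_file('handler.py', name='hello', project='myproj',
                          dashboard_url='')
resp = requests.get('http://' + addr)
print(resp.text)
```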
Functions can also be deployed as part of a KubeFlow Pipeline step:
```python
import kfp

nuclio_dep = kfp.components.load_component_from_file('deploy/component.yaml')

def my_pipeline():
    new_func = nuclio_dep(url='git://github.com/nuclio/nuclio#master:/hack/examples/python/helloworld',
                          name='myfunc', project='myproj', tag='0.11')
    ...
```
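To run such a pipeline we still compile or submit it with the standard `kfp` client. A minimal sketch, assuming `kfp.Client()` can reach your KubeFlow Pipelines endpoint (the run name is illustrative):

```python
import kfp

# a minimal sketch: submit the pipeline above to KubeFlow Pipelines
# (assumes a reachable endpoint; run_name is illustrative)
client = kfp.Client()
client.create_run_from_pipeline_func(my_pipeline, arguments={},
                                     run_name='deploy-nuclio-func')
```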
See the [nuclio pipeline components](https://github.com/kubeflow/pipelines/tree/master/components/nuclio), which allow you to deploy, delete, or invoke functions.
> Note: Nuclio is not limited to Python. [See this example](https://github.com/nuclio/nuclio-jupyter/blob/master/docs/nuclio_bash.ipynb), which shows how to create a simple `Bash` function from a
notebook; similarly, we can create `Go` functions when we need extra performance/concurrency for our inference.
## Nuclio function examples
Some useful function example Notebooks:
- [TensorFlow Serving function](https://github.com/v3io/tutorials/blob/master/demos/image-classification/infer.ipynb)
- [Predictive Infrastructure Monitoring (Scikit Learn)](https://github.com/v3io/tutorials/blob/master/demos/netops/04-infer.ipynb)
- [Twitter Feed NLP](https://github.com/v3io/tutorials/blob/master/demos/stocks/read-tweets.ipynb)
- [Real-time Stock data reader](https://github.com/v3io/tutorials/blob/master/demos/stocks/read-stocks.ipynb)