feat(crud-web-apps): Add Prometheus metrics (kubeflow/kubeflow#7634)

* Add Prometheus metrics to CRUD backend

Use prometheus_flask_exporter library to add Prometheus metrics to
CRUD backend. With this approach all CRUD backends will be able to
enable metrics.

Signed-off-by: Robert Gildein <gildeinrobert@gmail.com>
Signed-off-by: Robert Gildein <robert.gildein@canonical.com>

* KF-6122 Add short doc about metrics and improve code

Add note to README.md about metrics and link the source code for more
information. Fix small issue and missing dependency for Python < 3.8.

Signed-off-by: Robert Gildein <robert.gildein@canonical.com>

* fix getting backend version from Python < 3.8

Signed-off-by: Robert Gildein <robert.gildein@canonical.com>

* Enable metrics by default and increase backend version to 1.2

Signed-off-by: Robert Gildein <robert.gildein@canonical.com>

* switch to group by rule instead of path

Signed-off-by: Robert Gildein <robert.gildein@canonical.com>

* fix yaml files

Signed-off-by: Robert Gildein <robert.gildein@canonical.com>

---------

Signed-off-by: Robert Gildein <gildeinrobert@gmail.com>
Signed-off-by: Robert Gildein <robert.gildein@canonical.com>
Robert Gildein 2024-09-05 14:01:15 +02:00 committed by GitHub
parent 5fef82961d
commit 5abca012ed
16 changed files with 113 additions and 38 deletions

@ -3,12 +3,14 @@
Since our CRUD web apps like the Jupyter, Tensorboards and Volumes UIs are similarly built with Angular and Python/Flask, we should factor the common code into modules and libraries.
This directory will contain:
1. A Python package with a base backend. Each one of the mentioned apps is supposed to extend this backend.
2. An Angular library that will contain the common frontend code that these apps will be sharing
## Backend
The backend will be exposing a base backend which will be taking care of:
* Serving the Single Page Application
* Adding liveness/readiness probes
* Authentication based on the `kubeflow-userid` header
@ -16,14 +18,17 @@ The backend will be exposing a base backend which will be taking care of:
* Uniform logging
* Exceptions handling
* Common helper functions for dates, yaml file parsing etc.
* Providing Prometheus metrics
### Supported ENV Vars
The following is a list of ENV vars that can be set for any web app that is using this base app.
This list is incomplete; we will be adding more variables in the future.
| ENV Var | Description |
| - | - |
| CSRF_SAMESITE | Controls the [SameSite value](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie#SameSite) of the CSRF cookie |
| METRICS | Enable the exporting of Prometheus metrics on `/metrics` path |
### How to use
@ -37,6 +42,7 @@ cd ../../common/backend && pip install -e .
This will install all the dependencies of the package and you will now be able to include code from `kubeflow.kubeflow.crud_backend` in your current Python environment.
In order to build a Docker image and use this code you could build a wheel and then install it:
```dockerfile
### Docker
FROM python:3.7 AS backend-kubeflow-wheel
@ -55,9 +61,22 @@ COPY --from=backend-kubeflow-wheel /src .
RUN pip3 install .
...
```
### Metrics
The following metrics are exported:
* `flask_http_request_duration_seconds` (Histogram)
* `flask_http_request_total` (Counter)
* `flask_http_request_exceptions_total` (Counter)
* `flask_exporter_info` (Gauge)
For more information visit the [prometheus_flask_exporter](https://github.com/rycus86/prometheus_flask_exporter).
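To check which of these metrics a running backend exposes, the response body of `/metrics` can be scanned for its `# TYPE` lines. The snippet below is a minimal sketch using only the standard library; the `SAMPLE` text, its label names and the `metric_types` helper are illustrative assumptions, not output captured from a real backend.

```python
# Hypothetical excerpt of a /metrics response body in the Prometheus text
# format; a real response contains more series and labels.
SAMPLE = """\
# HELP flask_http_request_total Total number of HTTP requests
# TYPE flask_http_request_total counter
flask_http_request_total{method="GET",status="200"} 3.0
# HELP flask_exporter_info Information about the Prometheus Flask exporter
# TYPE flask_exporter_info gauge
flask_exporter_info{version="0.23.1"} 1.0
"""


def metric_types(text: str) -> dict:
    """Map each metric name to the type declared on its `# TYPE` line."""
    types = {}
    for line in text.splitlines():
        parts = line.split()
        # A type declaration has exactly four tokens: "#", "TYPE", name, type.
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            types[parts[2]] = parts[3]
    return types


if __name__ == "__main__":
    for name, kind in metric_types(SAMPLE).items():
        print(f"{name}: {kind}")
```

In practice the text would come from an HTTP GET against the backend's `/metrics` path rather than from a constant.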
## Frontend
The common Angular library contains common code for:
* Communicating with the Central Dashboard to handle the Namespace selection
* Making http calls and handling their errors
* Surfacing errors and warnings
@ -66,6 +85,7 @@ The common Angular library contains common code for:
* Handling forms
### How to use
```bash
# build the common library
cd common/frontend/kubeflow-common-lib
@ -80,9 +100,10 @@ npm link
cd crud-web-apps/volumes/frontend
npm i
npm link kubeflow
```
### Common errors
```
NullInjectorError: StaticInjectorError(AppModule)[ApplicationRef -> NgZone]:
StaticInjectorError(Platform: core)[ApplicationRef -> NgZone]:
@ -118,26 +139,28 @@ COPY --from=frontend-kubeflow-lib /src/dist/kubeflow/ ./node_modules/kubeflow/
COPY ./components/crud-web-apps/volumes/frontend/ .
RUN npm run build -- --output-path=./dist/default --configuration=production
```
### Internationalization
Internationalization was implemented using [ngx-translate](https://github.com/ngx-translate/core).
This is based on the browser's language. If the browser detects a language that is not implemented in the application, it will default to English.
The i18n asset files are located under `frontend/src/assets/i18n` of each application (jupyter, volumes and tensorboard). One file is needed per language. The common project is duplicated in every asset.
The translation asset files are set in `app.module.ts`, which should not need to be modified.
The translation default language is set in the `app.component.ts`.
For each language added, `app.component.ts` will need to be updated.
**When a language is added:**
- Copy the en.json file and rename it to the language you want to add. As it currently is, the culture should not be included.
- Change the values to the translated ones
**When a translation is added or modified:**
- Choose an appropriate key
- Make sure to add the key in every language file
- If text is added/modified in the Common Project, it needs to be added/modified in the other applications as well.
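The rule that every key must exist in every language file can be checked mechanically. The following is a rough sketch, assuming the translation files are JSON objects that may nest; the helper names and the sample keys are hypothetical and not part of the repository.

```python
def flatten_keys(tree: dict, prefix: str = "") -> set:
    """Return dotted key paths for every leaf value in a nested dict."""
    keys = set()
    for name, value in tree.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            keys |= flatten_keys(value, path)
        else:
            keys.add(path)
    return keys


def missing_keys(reference: dict, translations: dict) -> dict:
    """Map each language to the reference keys it does not define."""
    expected = flatten_keys(reference)
    report = {}
    for lang, tree in translations.items():
        absent = sorted(expected - flatten_keys(tree))
        if absent:
            report[lang] = absent
    return report


# Hypothetical contents of en.json and fr.json.
EN = {"common": {"save": "Save", "cancel": "Cancel"}}
FR = {"common": {"save": "Enregistrer"}}

if __name__ == "__main__":
    print(missing_keys(EN, {"fr": FR}))
```

To run it against an application, each file under `frontend/src/assets/i18n` would be loaded with `json.load` and passed in place of the sample dictionaries.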

@ -6,6 +6,7 @@ from .authn import bp as authn_bp
from .config import BackendMode
from .csrf import bp as csrf_bp
from .errors import bp as errors_bp
from .metrics import enable_metrics
from .probes import bp as probes_bp
from .routes import bp as base_routes_bp
from .serving import bp as serving_bp
@ -32,4 +33,7 @@ def create_app(name, static_folder, config):
app.register_blueprint(serving_bp)
app.register_blueprint(base_routes_bp)
if config.METRICS:
enable_metrics(app)
return app

@ -43,11 +43,14 @@ class Config(object):
JSONIFY_PRETTYPRINT_REGULAR = True
LOG_LEVEL = logging.INFO
PREFIX = "/"
METRICS: bool = True
def __init__(self):
if os.environ.get("LOG_LEVEL_DEBUG", "false") == "true":
self.LOG_LEVEL = logging.DEBUG
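# Env values are strings, and bool("0") is True, so compare explicitly.
self.METRICS = os.environ.get("METRICS", "true").lower() in ("1", "true", "yes")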
class DevConfig(Config):
ENV = BackendMode.DEVELOPMENT_FULL.value
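Environment variables always arrive as strings, so a flag such as `METRICS` needs explicit parsing: in Python any non-empty string is truthy, which means `bool("0")` and `bool("false")` both evaluate to `True`. A minimal sketch of one way to read such a flag follows; the `env_flag` helper and the set of accepted spellings are assumptions for illustration, not the backend's API.

```python
import os

# Spellings treated as "enabled"; this set is an assumption of the sketch.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = True, environ=os.environ) -> bool:
    """Read a boolean flag from the environment.

    Returns `default` when the variable is unset; otherwise the value is
    enabled only if it matches one of the TRUTHY spellings.
    """
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


if __name__ == "__main__":
    print(env_flag("METRICS"))
```

With this helper, `METRICS=1` (the value set in the params files) enables metrics, an unset variable falls back to the default, and `METRICS=0` disables them.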

@ -0,0 +1,44 @@
import logging
import sys
from flask import Flask
from prometheus_flask_exporter import PrometheusMetrics
log = logging.getLogger(__name__)
def _get_backend_version() -> str:
"""Get the backend version.
The version is defined in setup.py.
"""
if sys.version_info >= (3, 8):
from importlib import metadata
else:
import importlib_metadata as metadata
return metadata.version("kubeflow")
def enable_metrics(app: Flask) -> None:
"""Enable Prometheus metrics for the backend app.
This function will enable metrics collection for all routes and expose them
at /metrics.
Default metrics are:
flask_http_request_duration_seconds (Histogram)
flask_http_request_total (Counter)
flask_http_request_exceptions_total (Counter)
flask_exporter_info (Gauge)
"""
log.info("Enabling the Prometheus metrics for %s", app.name)
backend_version = _get_backend_version()
log.debug("Backend version is %s", backend_version)
metrics = PrometheusMetrics(
app, group_by="url_rule", default_labels={"app": app.name}
)
# add default metrics with info about app
metrics.info(
"app_info", "Application info", version=backend_version, app=app.name
)
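Grouping by `url_rule` rather than by request path keeps the number of label values bounded: every concrete path that matches one route rule is counted under that rule. The sketch below imitates this effect with the standard library only; it does not use Flask's router, and the rules and request paths are hypothetical examples.

```python
import re
from collections import Counter

# Hypothetical route rules in Flask's "<placeholder>" notation.
RULES = [
    "/api/namespaces/<namespace>/notebooks",
    "/healthz/liveness",
]


def rule_regex(rule: str):
    """Compile a rule into a regex where each placeholder matches one segment."""
    literal_parts = re.split(r"<[^>]+>", rule)
    return re.compile("[^/]+".join(re.escape(part) for part in literal_parts))


def match_rule(path: str) -> str:
    """Return the first rule matching the path, or "none" if nothing matches."""
    for rule in RULES:
        if rule_regex(rule).fullmatch(path):
            return rule
    return "none"


# Hypothetical request paths seen by the backend.
REQUESTS = [
    "/api/namespaces/alice/notebooks",
    "/api/namespaces/bob/notebooks",
    "/healthz/liveness",
]

by_path = Counter(REQUESTS)
by_rule = Counter(match_rule(path) for path in REQUESTS)

if __name__ == "__main__":
    print(len(by_path), "distinct labels when grouping by path")
    print(len(by_rule), "distinct labels when grouping by rule")
```

Grouping by path creates one time series per namespace, whereas grouping by rule keeps a single series per route regardless of how many namespaces exist.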

@ -9,11 +9,13 @@ REQUIRES = [
"Werkzeug >= 0.16.0",
"Flask-Cors >= 3.0.8",
"gevent",
"prometheus-flask-exporter >= 0.23.1",
"importlib-metadata >= 1.0;python_version<'3.8'",
]
setuptools.setup(
name="kubeflow",
version="1.1",
version="1.2",
author="kubeflow-dev-team",
description="A package with a base Flask CRUD backend common code",
packages=setuptools.find_packages(),
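The `importlib-metadata` requirement is the backport of the standard library's `importlib.metadata`, which is only available from Python 3.8. A hedged sketch of a version lookup built on that pair is shown below; the `package_version` helper and its `"unknown"` fallback are assumptions added for illustration and are not part of this commit.

```python
import sys

# Use the standard library module when available, else the backport.
if sys.version_info >= (3, 8):
    from importlib import metadata
else:
    import importlib_metadata as metadata


def package_version(name: str, default: str = "unknown") -> str:
    """Return the installed version of a distribution, or `default`."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return default


if __name__ == "__main__":
    print(package_version("kubeflow"))
```

The fallback avoids an exception at startup when the code runs from a source checkout where the `kubeflow` distribution has not been installed.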

@ -6,22 +6,6 @@ from kubeflow.kubeflow.crud_backend import config, logging
log = logging.getLogger(__name__)
def get_config(mode):
"""Return a config based on the selected mode."""
config_classes = {
config.BackendMode.DEVELOPMENT.value: config.DevConfig,
config.BackendMode.DEVELOPMENT_FULL.value: config.DevConfig,
config.BackendMode.PRODUCTION.value: config.ProdConfig,
config.BackendMode.PRODUCTION_FULL.value: config.ProdConfig,
}
cfg_class = config_classes.get(mode)
if not cfg_class:
raise RuntimeError("Backend mode '%s' is not implemented. Choose one"
" of %s" % (mode, list(config_classes.keys())))
return cfg_class()
APP_NAME = os.environ.get("APP_NAME", "Jupyter Web App")
BACKEND_MODE = os.environ.get("BACKEND_MODE",
config.BackendMode.PRODUCTION.value)
@ -32,7 +16,7 @@ UI_FLAVOR = os.environ.get("UI_FLAVOR", None)
if UI_FLAVOR is None:
UI_FLAVOR = os.environ.get("UI", "default")
cfg = get_config(BACKEND_MODE)
cfg = config.get_config(BACKEND_MODE)
cfg.PREFIX = PREFIX
# Load the app based on UI_FLAVOR env var

@ -27,6 +27,8 @@ spec:
value: $(JWA_USERID_PREFIX)
- name: APP_SECURE_COOKIES
value: $(JWA_APP_SECURE_COOKIES)
- name: METRICS
value: $(JWA_APP_ENABLE_METRICS)
serviceAccountName: service-account
volumes:
- configMap:

@ -83,3 +83,10 @@ vars:
apiVersion: v1
kind: ConfigMap
name: parameters
- name: JWA_APP_ENABLE_METRICS
fieldref:
fieldPath: data.JWA_APP_ENABLE_METRICS
objref:
apiVersion: v1
kind: ConfigMap
name: parameters

@ -4,3 +4,4 @@ JWA_CLUSTER_DOMAIN=cluster.local
JWA_USERID_HEADER=kubeflow-userid
JWA_USERID_PREFIX=
JWA_APP_SECURE_COOKIES=true
JWA_APP_ENABLE_METRICS=1

@ -20,4 +20,6 @@ spec:
value: $(TWA_USERID_PREFIX)
- name: APP_SECURE_COOKIES
value: $(TWA_APP_SECURE_COOKIES)
- name: METRICS
value: $(TWA_APP_ENABLE_METRICS)
serviceAccountName: service-account

@ -64,3 +64,10 @@ vars:
apiVersion: v1
kind: ConfigMap
name: parameters
- name: TWA_APP_ENABLE_METRICS
fieldref:
fieldPath: data.TWA_APP_ENABLE_METRICS
objref:
apiVersion: v1
kind: ConfigMap
name: parameters

@ -3,3 +3,4 @@ TWA_USERID_HEADER=kubeflow-userid
TWA_USERID_PREFIX=
TWA_PREFIX=/tensorboards
TWA_APP_SECURE_COOKIES=true
TWA_APP_ENABLE_METRICS=1

@ -7,28 +7,13 @@ from kubeflow.kubeflow.crud_backend import config, logging
log = logging.getLogger(__name__)
def get_config(mode):
"""Return a config based on the selected mode."""
config_classes = {
config.BackendMode.DEVELOPMENT.value: config.DevConfig,
config.BackendMode.DEVELOPMENT_FULL.value: config.DevConfig,
config.BackendMode.PRODUCTION.value: config.ProdConfig,
config.BackendMode.PRODUCTION_FULL.value: config.ProdConfig,
}
cfg_class = config_classes.get(mode)
if not cfg_class:
raise RuntimeError("Backend mode '%s' is not implemented. Choose one"
" of %s" % (mode, list(config_classes.keys())))
return cfg_class()
APP_NAME = os.environ.get("APP_NAME", "Volumes Web App")
BACKEND_MODE = os.environ.get("BACKEND_MODE",
config.BackendMode.PRODUCTION.value)
UI_FLAVOR = os.environ.get("UI_FLAVOR", "default")
PREFIX = os.environ.get("APP_PREFIX", "/")
cfg = get_config(BACKEND_MODE)
cfg = config.get_config(BACKEND_MODE)
cfg.PREFIX = PREFIX
# Load the app based on UI_FLAVOR env var

@ -22,6 +22,8 @@ spec:
value: $(VWA_APP_SECURE_COOKIES)
- name: VOLUME_VIEWER_IMAGE
value: filebrowser/filebrowser:v2.25.0
- name: METRICS
value: $(VWA_APP_ENABLE_METRICS)
volumeMounts:
- name: viewer-spec
mountPath: /etc/config/viewer-spec.yaml

@ -67,3 +67,10 @@ vars:
apiVersion: v1
kind: ConfigMap
name: parameters
- name: VWA_APP_ENABLE_METRICS
fieldref:
fieldPath: data.VWA_APP_ENABLE_METRICS
objref:
apiVersion: v1
kind: ConfigMap
name: parameters

@ -3,3 +3,4 @@ VWA_USERID_HEADER=kubeflow-userid
VWA_USERID_PREFIX=
VWA_PREFIX=/volumes
VWA_APP_SECURE_COOKIES=true
VWA_APP_ENABLE_METRICS=1