Add Knative Serving architecture page (#5545)

* Add Knative Serving architecture page

* Reorder and cross-link architecture and request-flow pages

* Review feedback

* Update docs/serving/architecture.md

Co-authored-by: Evan Anderson <evan.k.anderson@gmail.com>

* Update docs/serving/architecture.md

Co-authored-by: Evan Anderson <evan.k.anderson@gmail.com>

* Review updates

* Review updates

---------

Co-authored-by: Evan Anderson <evan.k.anderson@gmail.com>
Reto Lehmann 2023-06-06 16:36:27 +02:00 committed by GitHub
parent 26c2b16908
commit 1292fa5b62
7 changed files with 65 additions and 8 deletions


@ -95,7 +95,10 @@ nav:
# Serving
###############################################################################
- Serving:
- Knative Serving overview: serving/README.md
- Knative Serving:
- Overview: serving/README.md
- Architecture: serving/architecture.md
- Request Flow: serving/request-flow.md
- Resources:
- Revisions:
- About Revisions: serving/revisions/README.md
@ -173,7 +176,6 @@ nav:
- Debugging application issues: serving/troubleshooting/debugging-application-issues.md
# Serving reference docs
- Reference:
- Request Flow: serving/reference/request-flow.md
- Serving API: serving/reference/serving-api.md
###############################################################################
# Eventing


@ -1,5 +1,8 @@
# Installing Knative
!!! note
    Please also take a look at the [Serving Architecture](../serving/architecture.md), which explains the Knative components and the general networking concept.

You can install the Serving component, Eventing component, or both on your
cluster by using one of the following deployment options:


@ -0,0 +1,51 @@
# Knative Serving Architecture
Knative Serving consists of several components that form the backbone of the serverless platform.
This page explains the high-level architecture of Knative Serving. Please also refer to [the Knative Serving Overview](./README.md)
and [the Request Flow](./request-flow.md) for additional information.
## Diagram
![Knative Serving Architecture](images/serving-architecture.png)
## Components
| Component   | Responsibilities |
|-------------|------------------|
| Activator   | The activator is part of the **data-plane**. It is responsible for queuing incoming requests (if a `Knative Service` is scaled to zero). It communicates with the autoscaler to bring scaled-to-zero Services back up and forwards the queued requests to them. The activator can also act as a request buffer to handle traffic bursts. Additional details can be found [here](https://github.com/knative/serving/blob/main/docs/scaling/SYSTEM.md). |
| Autoscaler  | The autoscaler is responsible for scaling the Knative Services based on configuration, metrics, and incoming requests. |
| Controller  | The controller manages the state of Knative resources within the cluster. It watches several objects, manages the lifecycle of dependent resources, and updates the resource state. |
| Queue-Proxy | The Queue-Proxy is a sidecar container in the Knative Service's Pod. It is responsible for collecting metrics and enforcing the desired concurrency when forwarding requests to the user's container. It can also act as a queue if necessary, similar to the activator. |
| Webhooks    | Knative Serving has several webhooks that are responsible for validating and mutating Knative resources. |
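
For illustration, a minimal `Knative Service` might look like the following sketch (the name and image are placeholders). The queue-proxy described above is injected automatically as a sidecar into the resulting Pods and, for example, enforces the `containerConcurrency` limit set here:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                                   # placeholder name
spec:
  template:
    spec:
      # Hard concurrency limit; the queue-proxy sidecar enforces this per Pod
      containerConcurrency: 10
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
          ports:
            - containerPort: 8080
```
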
## Networking Layer and Ingress
!!! note
    `Ingress`, in this case, does not refer to the [Kubernetes Ingress Resource](https://kubernetes.io/docs/concepts/services-networking/ingress/). It refers to the concept of exposing external access to a resource on the cluster.

Knative Serving depends on a `Networking Layer` that fulfils the [Knative Networking Specification](https://github.com/knative/networking).
For this, Knative Serving defines an internal `KIngress` resource, which acts as an abstraction over multiple pluggable networking layers. Currently, three networking layers are available and supported by the community:
* [net-kourier](https://github.com/knative-sandbox/net-kourier)
* [net-contour](https://github.com/knative-sandbox/net-contour)
* [net-istio](https://github.com/knative-sandbox/net-istio)
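
The networking layer that reconciles `KIngress` resources is selected cluster-wide through the `ingress-class` setting in the `config-network` ConfigMap; a brief sketch, using Kourier's class as an example:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-network
  namespace: knative-serving
data:
  # Ingress class used when reconciling KIngress resources;
  # net-contour and net-istio use analogous *.ingress.networking.knative.dev classes
  ingress-class: "kourier.ingress.networking.knative.dev"
```
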
## Traffic flow and DNS
!!! note
    There are minor differences between the networking layers; the following section describes the general concept. Also, there are multiple ways to expose your `Ingress Gateway` and to configure DNS. Please refer to the installation documentation for more information.
![Knative Serving Architecture Ingress](images/serving-architecture-ingress.png)
* Each networking layer has a controller that is responsible for watching the `KIngress` resources and configuring the `Ingress Gateway` accordingly. It also reports `status` information back through this resource.
* The `Ingress Gateway` is used to route requests to the `activator` or directly to a Knative Service Pod, depending on the mode (proxy or serve, see [here](https://github.com/knative/serving/blob/main/docs/scaling/SYSTEM.md) for more details). The `Ingress Gateway` handles requests from both inside and outside the cluster.
* For the `Ingress Gateway` to be reachable outside the cluster, it must be [exposed](https://kubernetes.io/docs/tutorials/kubernetes-basics/expose/expose-intro/) using a Kubernetes Service of `type: LoadBalancer` or `type: NodePort`. The community-supported networking layers include this as part of their installation. [DNS](../install/yaml-install/serving/install-serving-with-yaml.md#configure-dns) is then configured to point to the `IP` address or `Name` of the `Ingress Gateway`.
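
The exact Service that exposes the gateway is created by the networking layer; the following is only an illustrative sketch of such a Service (name, namespace, and selector are placeholders and differ per networking layer):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-gateway              # placeholder; the actual name depends on the networking layer
  namespace: networking-layer-system # placeholder namespace
spec:
  type: LoadBalancer                 # or NodePort on clusters without a load balancer
  selector:
    app: ingress-gateway             # placeholder selector matching the gateway Pods
  ports:
    - name: http
      port: 80
    - name: https
      port: 443
```
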
!!! note
    If you configure DNS, you should also configure the [same domain](./using-a-custom-domain.md) for Knative.
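
A minimal sketch of setting that domain, assuming `example.com` is the name you pointed at the `Ingress Gateway`:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-domain
  namespace: knative-serving
data:
  # Knative Services are exposed under this domain by default,
  # for example <service>.<namespace>.example.com
  example.com: ""
```
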
## Autoscaling
You can find more detailed information on our autoscaling mechanism [here](https://github.com/knative/serving/tree/main/docs/scaling).
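
Scaling behaviour is typically tuned per Revision through annotations on the Service's revision template; a brief sketch (the name, image, and values are placeholders):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                          # placeholder name
spec:
  template:
    metadata:
      annotations:
        # Soft concurrency target the autoscaler aims for per Pod
        autoscaling.knative.dev/target: "10"
        # Keep at least one Pod to avoid scale-to-zero cold starts
        autoscaling.knative.dev/min-scale: "1"
        # Never scale beyond five Pods
        autoscaling.knative.dev/max-scale: "5"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
```
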

(Binary image files changed, not shown: the request-flow diagram and the new Knative Serving architecture diagrams.)


@ -1,12 +1,13 @@
# HTTP Request Flows
While [the overview](/docs/serving) describes the logical components of Knative
While [the overview](/docs/serving) describes the logical components and
[the architecture](./architecture.md) describes the overall architecture of Knative
Serving, this page explains the behavior and flow of HTTP requests to an
application which is running on Knative Serving.
The following diagram shows the different request flows and control plane loops for Knative Serving. Note that some components, such as the autoscaler and the apiserver, are not updated on every request, but instead measure the system periodically (this is referred to as the control plane).
![Diagram of Knative request flow through HTTP router to optional Activator, then queue-proxy and user container](./request-flow.png)
![Diagram of Knative request flow through HTTP router to optional Activator, then queue-proxy and user container](images/request-flow.png)
<!-- Image original: https://docs.google.com/drawings/d/1Jipg4755BHCyqZGu1sUj7FMFUpEs-35Rf5T5chHZ6m0/edit -->
The HTTP router, activator, and autoscaler are all shared cluster-level
@ -19,7 +20,7 @@ pluggable ingress layer), and are recorded on the request in an internal header.
Once a request has been assigned to a Revision, the subsequent routing depends
on the measured traffic flow; at low or zero traffic, incoming requests are
routed to the activator, while at high traffic levels ([spare capacity greater
than `target-burst-capacity`](../../load-balancing/target-burst-capacity))
than `target-burst-capacity`](./load-balancing/target-burst-capacity.md))
traffic is routed directly to the application pods.
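
This threshold can be tuned cluster-wide in the `config-autoscaler` ConfigMap (or per Revision via the `autoscaling.knative.dev/target-burst-capacity` annotation); a brief sketch, with the value shown as an example only:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Spare capacity to cover with the activator; "0" takes the activator out of
  # the path once Pods have capacity, "-1" keeps it in the path unconditionally.
  target-burst-capacity: "211"
```
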
## Scale From Zero
@ -33,7 +34,7 @@ additional capacity is needed.
When the autoscaler detects that the available capacity for a Revision is below
the requested capacity, it [increases the number of pods requested from
Kubernetes](../../autoscaling/autoscale-go#algorithm).
Kubernetes](./autoscaling/autoscale-go/README.md#algorithm).
When these new pods become ready or an existing pod has capacity, the activator
will forward the delayed request to a ready pod. If a new pod needs to be
@ -42,7 +43,7 @@ started to handle a request, this is called a _cold-start_.
## High scale
When a Revision has a high amount of traffic ([the spare capacity is greater
than `target-burst-capacity`](../../load-balancing/target-burst-capacity)), the
than `target-burst-capacity`](./load-balancing/target-burst-capacity.md)), the
ingress router is programmed directly with the pod addresses of the Revision, and
the activator is removed from the traffic flow. This reduces latency and
increases efficiency when the additional buffering of the activator is not
@ -78,7 +79,7 @@ reliability and scaling of Knative:
activator is removed from the request path.
* Implements the [`containerConcurrency` hard limit on request
concurrency](https://knative.dev/docs/serving/autoscaling/concurrency/#hard-limit)
concurrency](./autoscaling/concurrency.md#hard-limit)
if requested.
* Handles graceful shutdown on Pod termination (refuse new requests, fail