diff --git a/config/nav.yml b/config/nav.yml
index 1dbd9f913..aa52e2c9a 100644
--- a/config/nav.yml
+++ b/config/nav.yml
@@ -95,7 +95,10 @@ nav:
   # Serving
   ###############################################################################
   - Serving:
-    - Knative Serving overview: serving/README.md
+    - Knative Serving:
+      - Overview: serving/README.md
+      - Architecture: serving/architecture.md
+      - Request Flow: serving/request-flow.md
     - Resources:
       - Revisions:
         - About Revisions: serving/revisions/README.md
@@ -173,7 +176,6 @@ nav:
         - Debugging application issues: serving/troubleshooting/debugging-application-issues.md
   # Serving reference docs
   - Reference:
-    - Request Flow: serving/reference/request-flow.md
     - Serving API: serving/reference/serving-api.md
   ###############################################################################
   # Eventing
diff --git a/docs/install/README.md b/docs/install/README.md
index c4addafb8..8f051e86f 100644
--- a/docs/install/README.md
+++ b/docs/install/README.md
@@ -1,5 +1,8 @@
 # Installing Knative
 
+!!! note
+    Please also take a look at the [Serving Architecture](../serving/architecture.md), which explains the Knative Serving components and the general networking concepts.
+
 You can install the Serving component, Eventing component, or both on your cluster by using one of the following deployment options:
diff --git a/docs/serving/architecture.md b/docs/serving/architecture.md
new file mode 100644
index 000000000..71f3bd7a8
--- /dev/null
+++ b/docs/serving/architecture.md
@@ -0,0 +1,51 @@
+# Knative Serving Architecture
+
+Knative Serving consists of several components that form the backbone of the serverless platform.
+This page explains the high-level architecture of Knative Serving. Please also refer to [the Knative Serving Overview](./README.md)
+and [the Request Flow](./request-flow.md) for additional information.
+
+## Diagram
+
+![Knative Serving Architecture](images/serving-architecture.png)
+
+## Components
+
+| Component   | Responsibilities |
+|-------------|------------------|
+| Activator   | The activator is part of the **data-plane**. It is responsible for queuing incoming requests (if a `Knative Service` is scaled to zero). It communicates with the autoscaler to bring scaled-to-zero Services back up and then forwards the queued requests to them. The activator can also act as a request buffer to handle traffic bursts. Additional details can be found [here](https://github.com/knative/serving/blob/main/docs/scaling/SYSTEM.md). |
+| Autoscaler  | The autoscaler is responsible for scaling Knative Services based on configuration, metrics, and incoming requests. |
+| Controller  | The controller manages the state of Knative resources within the cluster. It watches several objects, manages the lifecycle of dependent resources, and updates the resource state. |
+| Queue-Proxy | The Queue-Proxy is a sidecar container in the Knative Service's Pod. It is responsible for collecting metrics and enforcing the desired concurrency when forwarding requests to the user's container. It can also act as a queue if necessary, similar to the Activator. |
+| Webhooks    | Knative Serving has several webhooks that are responsible for validating and mutating Knative Resources. |
+
+## Networking Layer and Ingress
+
+!!! note
+    `Ingress` in this case does not refer to the [Kubernetes Ingress Resource](https://kubernetes.io/docs/concepts/services-networking/ingress/). It refers to the concept of exposing external access to a resource on the cluster.
+
+Knative Serving depends on a `Networking Layer` that fulfills the [Knative Networking Specification](https://github.com/knative/networking).
+For this, Knative Serving defines an internal `KIngress` resource, which acts as an abstraction for multiple pluggable networking layers. Currently, three networking layers are available and supported by the community:
+
+* [net-kourier](https://github.com/knative-sandbox/net-kourier)
+* [net-contour](https://github.com/knative-sandbox/net-contour)
+* [net-istio](https://github.com/knative-sandbox/net-istio)
+
+
+## Traffic flow and DNS
+
+!!! note
+    There are slight differences between the networking layers; the following section describes the general concept. Also, there are multiple ways to expose your `Ingress Gateway` and configure DNS. Please refer to the installation documentation for more information.
+
+![Knative Serving Architecture Ingress](images/serving-architecture-ingress.png)
+
+* Each networking layer has a controller that is responsible for watching the `KIngress` resources and configuring the `Ingress Gateway` accordingly. It also reports back `status` information through this resource.
+* The `Ingress Gateway` is used to route requests to the `activator` or directly to a Knative Service Pod, depending on the mode (proxy/serve, see [here](https://github.com/knative/serving/blob/main/docs/scaling/SYSTEM.md) for more details). The `Ingress Gateway` handles requests from both inside and outside the cluster.
+* For the `Ingress Gateway` to be reachable outside the cluster, it must be [exposed](https://kubernetes.io/docs/tutorials/kubernetes-basics/expose/expose-intro/) using a Kubernetes Service of `type: LoadBalancer` or `type: NodePort`. The community-supported networking layers include this as part of their installation. [DNS](../install/yaml-install/serving/install-serving-with-yaml.md#configure-dns) is then configured to point to the IP address or hostname of the `Ingress Gateway`.
+
+!!! note
+    If you configure DNS, you should also configure the [same domain](./using-a-custom-domain.md) for Knative (see the sketch at the end of this page).
+
+
+## Autoscaling
+
+You can find more detailed information on our autoscaling mechanism [here](https://github.com/knative/serving/tree/main/docs/scaling).
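To make the component responsibilities and the autoscaling settings described above more concrete, here is a minimal sketch of a `Knative Service` manifest: the autoscaler acts on the `autoscaling.knative.dev/*` annotations, and the queue-proxy enforces `containerConcurrency`. The service name, image, and all values below are illustrative placeholders, not part of this page.

```yaml
# Minimal sketch of a Knative Service; the name, image, and values are placeholders.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                          # placeholder name
spec:
  template:
    metadata:
      annotations:
        # Soft target: desired concurrent requests per replica, used by the autoscaler.
        autoscaling.knative.dev/target: "10"
        # A minimum of zero replicas keeps scale-to-zero enabled.
        autoscaling.knative.dev/min-scale: "0"
    spec:
      # Hard limit on concurrent requests per replica, enforced by the queue-proxy.
      containerConcurrency: 50
      containers:
        - image: ghcr.io/example/hello # placeholder image
```

Applying a manifest like this is what causes the controller to create a new Revision and the networking layer to program the `Ingress Gateway` for it, as described in the sections above.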
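For the `Traffic flow and DNS` section, a minimal sketch of the corresponding domain configuration could look as follows. It assumes that a wildcard DNS record (for example `*.example.com`) already points to the `Ingress Gateway`; `example.com` is only a placeholder domain.

```yaml
# Minimal sketch of the config-domain ConfigMap; example.com is a placeholder.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-domain
  namespace: knative-serving
data:
  # Use example.com as the default domain, so a Knative Service is exposed
  # as <service-name>.<namespace>.example.com through the Ingress Gateway.
  example.com: ""
```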
diff --git a/docs/serving/reference/request-flow.png b/docs/serving/images/request-flow.png
similarity index 100%
rename from docs/serving/reference/request-flow.png
rename to docs/serving/images/request-flow.png
diff --git a/docs/serving/images/serving-architecture-ingress.png b/docs/serving/images/serving-architecture-ingress.png
new file mode 100644
index 000000000..5aa4a1120
Binary files /dev/null and b/docs/serving/images/serving-architecture-ingress.png differ
diff --git a/docs/serving/images/serving-architecture.png b/docs/serving/images/serving-architecture.png
new file mode 100644
index 000000000..984c3cbe1
Binary files /dev/null and b/docs/serving/images/serving-architecture.png differ
diff --git a/docs/serving/reference/request-flow.md b/docs/serving/request-flow.md
similarity index 91%
rename from docs/serving/reference/request-flow.md
rename to docs/serving/request-flow.md
index 36495d95c..d1ce3e0f7 100644
--- a/docs/serving/reference/request-flow.md
+++ b/docs/serving/request-flow.md
@@ -1,12 +1,13 @@
 # HTTP Request Flows
 
-While [the overview](/docs/serving) describes the logical components of Knative
+While [the overview](/docs/serving) describes the logical components and
+[the architecture](./architecture.md) describes the overall architecture of Knative
 Serving, this page explains the behavior and flow of HTTP requests to an application which is running on Knative Serving. The following diagram shows the different request flows and control plane loops for Knative Serving. Note that some components, such as the autoscaler and the apiserver are not updated on every request, but instead measure the system periodically (this is referred to as the control plane).
 
-![Diagram of Knative request flow through HTTP router to optional Activator, then queue-proxy and user container](./request-flow.png)
+![Diagram of Knative request flow through HTTP router to optional Activator, then queue-proxy and user container](images/request-flow.png)
 
 The HTTP router, activator, and autoscaler are all shared cluster-level
@@ -19,7 +20,7 @@
 pluggable ingress layer), and are recorded on the request in an internal header. Once a request has been assigned to a Revision, the subsequent routing depends on the measured traffic flow; at low or zero traffic, incoming requests are routed to the activator, while at high traffic levels ([spare capacity greater
-than `target-burst-capacity`](../../load-balancing/target-burst-capacity))
+than `target-burst-capacity`](./load-balancing/target-burst-capacity.md))
 traffic is routed directly to the application pods.
 
 ## Scale From Zero
@@ -33,7 +34,7 @@
 additional capacity is needed. When the autoscaler detects that the available capacity for a Revision is below the requested capacity, it [increases the number of pods requested from
-Kubernetes](../../autoscaling/autoscale-go#algorithm).
+Kubernetes](./autoscaling/autoscale-go/README.md#algorithm).
 
 When these new pods become ready or an existing pod has capacity, the activator will forward the delayed request to a ready pod. If a new pod needs to be
@@ -42,7 +43,7 @@
 started to handle a request, this is called a _cold-start_.
 
 ## High scale
 
 When a Revision has a high amount of traffic ([the spare capacity is greater
-than `target-burst-capacity`](../../load-balancing/target-burst-capacity)), the
+than `target-burst-capacity`](./load-balancing/target-burst-capacity.md)), the
 ingress router is programmed directly with the pod adresses of the Revision, and the activator is removed from the traffic flow.
This reduces latency and increases efficiency when the additional buffering of the activator is not @@ -78,7 +79,7 @@ reliability and scaling of Knative: activator is removed from the request path. * Implements the [`containerConcurrency` hard limit on request - concurrency](https://knative.dev/docs/serving/autoscaling/concurrency/#hard-limit) + concurrency](./autoscaling/concurrency.md#hard-limit) if requested. * Handles graceful shutdown on Pod termination (refuse new requests, fail