diff --git a/sig-scalability/slos/connection_total_latency.md b/sig-scalability/slos/connection_total_latency.md
deleted file mode 100644
index 2b828eca6..000000000
--- a/sig-scalability/slos/connection_total_latency.md
+++ /dev/null
@@ -1,32 +0,0 @@
-## Connection Total Latency SLI details
-
-### Definition
-
-| Status | SLI |
-| --- | --- |
-| WIP | The time elapsed in seconds (s) or minutes (min) from the successful establishment of a TCP connection to a Kubernetes service to the connection being closed measured as 99th percentile over last 5 minutes aggregated across all the node instances.|
-
-### User stories
-
-- As a user of vanilla Kubernetes, I want some visibility on how longs my pods are connected
-to the services
-
-### Other notes
-
-The total connection duration can help to understand how clients interact with services, optimize resource usage, and identify potential issues like connection leaks or excessive short-lived connections.
-
-### How to measure the SLI.
-
-Requires precise timestamps for when the client sends the SYN packet and when it receives the last packet from the server. This can be done:
-
-- Client-side: In the application code or using a benchmark application.
-- Network devices: Packet inspection and analysis on nodes along the network datapath.
-
-### Caveats
-
-Important Considerations:
-
-- Network Latency: geographic distance, routing, and network congestion.
-- How quickly the server can process the SYN packet and send the SYN-ACK also contributes to the first packet latency.
-- Other traffic on the network can delay the SYN-ACK, even if the server responds quickly.
-- Client-side processing and network conditions on the client side can also introduce minor delays.
\ No newline at end of file
diff --git a/sig-scalability/slos/slos.md b/sig-scalability/slos/slos.md
index 4cba9f20c..9a7aea6bf 100644
--- a/sig-scalability/slos/slos.md
+++ b/sig-scalability/slos/slos.md
@@ -123,7 +123,6 @@ __TODO: Cluster churn should be moved to scalability thresholds.__
 | __WIP__ | In-cluster network latency from a single prober pod, measured as latency of per second ping from that pod to "null service", measured as 99th percentile over last 5 minutes. | In default Kubernetes installataion with RTT between nodes <= Y, 99th percentile of (99th percentile over all prober pods) per cluster-day[1](#footnote1) <= X | [Details](./network_latency.md) |
 | __WIP__ | In-cluster dns latency from a single prober pod, measured as latency of per second DNS lookup for "null service" from that pod, measured as 99th percentile over last 5 minutes. | In default Kubernetes installataion with RTT between nodes <= Y, 99th percentile of (99th percentile over all prober pods) per cluster-day[1](#footnote1) <= X | [Details](./dns_latency.md) |
 | __WIP__ | First Packet Latency in milliseconds (ms) from the client initiating the TCP connection to a Service (sending the SYN packet) to the client receiving the first packet from the Service backend (typically the SYN-ACK packet in the three-way handshake) measured as 99th percentile over last 5 minutes aggregated across all the node instances. | In default Kubernetes installation with RTT between nodes <= Y, 99th percentile of (99th percentile over all nodes) per cluster-day <= X | [Details](./first_packet_latency.md) |
-| __WIP__ | The time elapsed in seconds (s) or minutes (min) from the successful establishment of a TCP connection to a Kubernetes service to the connection being closed measured as 99th percentile over last 5 minutes aggregated across all the node instances. | In default Kubernetes installation with RTT between nodes <= Y, 99th percentile of (99th percentile over all nodes) per cluster-day[1](#footnote1) <= X | [Details](./connection_total_latency.md) |
 | __WIP__ | The rate of successful data transfer over a TCP connection to services, measured in bits per second (bps), kilobits per second (kbps), megabits per second (Mbps), or gigabits per second (Gbps) measured as 99th percentile over last 5 minutes aggregated across all the connections to services in a node. | In default Kubernetes installation with RTT between nodes <= Y, 99th percentile of (99th percentile over all nodes) per cluster-day[1](#footnote1) <= X | [Details](./throughput.md) |
 
 \[1\] For the purpose of visualization it will be a