website/content/en/docs/concepts/services-networking/service.md

1563 lines
65 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
reviewers:
- bprashanth
title: Service
feature:
title: Service discovery and load balancing
description: >
No need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.
description: >-
Expose an application running in your cluster behind a single outward-facing
endpoint, even when the workload is split across multiple backends.
content_type: concept
weight: 10
---
<!-- overview -->
{{< glossary_definition term_id="service" length="short" >}}
With Kubernetes you don't need to modify your application to use an unfamiliar service discovery mechanism.
Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods,
and can load-balance across them.
<!-- body -->
## Motivation
Kubernetes {{< glossary_tooltip term_id="pod" text="Pods" >}} are created and destroyed
to match the desired state of your cluster. Pods are nonpermanent resources.
If you use a {{< glossary_tooltip term_id="deployment" >}} to run your app,
it can create and destroy Pods dynamically.
Each Pod gets its own IP address, however in a Deployment, the set of Pods
running in one moment in time could be different from
the set of Pods running that application a moment later.
This leads to a problem: if some set of Pods (call them "backends") provides
functionality to other Pods (call them "frontends") inside your cluster,
how do the frontends find out and keep track of which IP address to connect
to, so that the frontend can use the backend part of the workload?
Enter _Services_.
## Service resources {#service-resource}
In Kubernetes, a Service is an abstraction which defines a logical set of Pods
and a policy by which to access them (sometimes this pattern is called
a micro-service). The set of Pods targeted by a Service is usually determined
by a {{< glossary_tooltip text="selector" term_id="selector" >}}.
To learn about other ways to define Service endpoints,
see [Services _without_ selectors](#services-without-selectors).
For example, consider a stateless image-processing backend which is running with
3 replicas. Those replicas are fungible&mdash;frontends do not care which backend
they use. While the actual Pods that compose the backend set may change, the
frontend clients should not need to be aware of that, nor should they need to keep
track of the set of backends themselves.
The Service abstraction enables this decoupling.
### Cloud-native service discovery
If you're able to use Kubernetes APIs for service discovery in your application,
you can query the {{< glossary_tooltip text="API server" term_id="kube-apiserver" >}}
for matching EndpointSlices. Kubernetes updates the EndpointSlices for a Service
whenever the set of Pods in a Service changes.
For non-native applications, Kubernetes offers ways to place a network port or load
balancer in between your application and the backend Pods.
## Defining a Service
A Service in Kubernetes is a REST object, similar to a Pod. Like all of the
REST objects, you can `POST` a Service definition to the API server to create
a new instance.
The name of a Service object must be a valid
[RFC 1035 label name](/docs/concepts/overview/working-with-objects/names#rfc-1035-label-names).
For example, suppose you have a set of Pods where each listens on TCP port 9376
and contains a label `app.kubernetes.io/name=MyApp`:
```yaml
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
selector:
app.kubernetes.io/name: MyApp
ports:
- protocol: TCP
port: 80
targetPort: 9376
```
This specification creates a new Service object named "my-service", which
targets TCP port 9376 on any Pod with the `app.kubernetes.io/name=MyApp` label.
Kubernetes assigns this Service an IP address (sometimes called the "cluster IP"),
which is used by the Service proxies
(see [Virtual IPs and service proxies](#virtual-ips-and-service-proxies) below).
The controller for the Service selector continuously scans for Pods that
match its selector, and then POSTs any updates to an Endpoint object
also named "my-service".
{{< note >}}
A Service can map _any_ incoming `port` to a `targetPort`. By default and
for convenience, the `targetPort` is set to the same value as the `port`
field.
{{< /note >}}
Port definitions in Pods have names, and you can reference these names in the
`targetPort` attribute of a Service. For example, we can bind the `targetPort`
of the Service to the Pod port in the following way:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: nginx
labels:
app.kubernetes.io/name: proxy
spec:
containers:
- name: nginx
image: nginx:stable
ports:
- containerPort: 80
name: http-web-svc
---
apiVersion: v1
kind: Service
metadata:
name: nginx-service
spec:
selector:
app.kubernetes.io/name: proxy
ports:
- name: name-of-service-port
protocol: TCP
port: 80
targetPort: http-web-svc
```
This works even if there is a mixture of Pods in the Service using a single
configured name, with the same network protocol available via different
port numbers. This offers a lot of flexibility for deploying and evolving
your Services. For example, you can change the port numbers that Pods expose
in the next version of your backend software, without breaking clients.
The default protocol for Services is TCP; you can also use any other
[supported protocol](#protocol-support).
As many Services need to expose more than one port, Kubernetes supports multiple
port definitions on a Service object.
Each port definition can have the same `protocol`, or a different one.
### Services without selectors
Services most commonly abstract access to Kubernetes Pods thanks to the selector,
but when used with a corresponding set of
{{<glossary_tooltip term_id="endpoint-slice" text="EndpointSlices">}}
objects and without a selector, the Service can abstract other kinds of backends,
including ones that run outside the cluster.
For example:
* You want to have an external database cluster in production, but in your
test environment you use your own databases.
* You want to point your Service to a Service in a different
{{< glossary_tooltip term_id="namespace" >}} or on another cluster.
* You are migrating a workload to Kubernetes. While evaluating the approach,
you run only a portion of your backends in Kubernetes.
In any of these scenarios you can define a Service _without_ a Pod selector.
For example:
```yaml
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
ports:
- protocol: TCP
port: 80
targetPort: 9376
```
Because this Service has no selector, the corresponding EndpointSlice (and
legacy Endpoints) objects are not created automatically. You can manually map the Service
to the network address and port where it's running, by adding an EndpointSlice
object manually. For example:
```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
name: my-service-1 # by convention, use the name of the Service
# as a prefix for the name of the EndpointSlice
labels:
# You should set the "kubernetes.io/service-name" label.
# Set its value to match the name of the Service
kubernetes.io/service-name: my-service
addressType: IPv4
ports:
- name: '' # empty because port 9376 is not assigned as a well-known
# port (by IANA)
appProtocol: http
protocol: TCP
port: 9376
endpoints:
- addresses:
- "10.4.5.6" # the IP addresses in this list can appear in any order
- "10.1.2.3"
```
#### Custom EndpointSlices
When you create an [EndpointSlice](#endpointslices) object for a Service, you can
use any name for the EndpointSlice. Each EndpointSlice in a namespace must have a
unique name. You link an EndpointSlice to a Service by setting the
`kubernetes.io/service-name` {{< glossary_tooltip text="label" term_id="label" >}}
on that EndpointSlice.
{{< note >}}
The endpoint IPs _must not_ be: loopback (127.0.0.0/8 for IPv4, ::1/128 for IPv6), or
link-local (169.254.0.0/16 and 224.0.0.0/24 for IPv4, fe80::/64 for IPv6).
The endpoint IP addresses cannot be the cluster IPs of other Kubernetes Services,
because {{< glossary_tooltip term_id="kube-proxy" >}} doesn't support virtual IPs
as a destination.
{{< /note >}}
For an EndpointSlice that you create yourself, or in your own code,
you should also pick a value to use for the [`endpointslice.kubernetes.io/managed-by`](/docs/reference/labels-annotations-taints/#endpointslicekubernetesiomanaged-by) label.
If you create your own controller code to manage EndpointSlices, consider using a
value similar to `"my-domain.example/name-of-controller"`. If you are using a third
party tool, use the name of the tool in all-lowercase and change spaces and other
punctuation to dashes (`-`).
If people are directly using a tool such as `kubectl` to manage EndpointSlices,
use a name that describes this manual management, such as `"staff"` or
`"cluster-admins"`. You should
avoid using the reserved value `"controller"`, which identifies EndpointSlices
managed by Kubernetes' own control plane.
#### Accessing a Service without a selector {#service-no-selector-access}
Accessing a Service without a selector works the same as if it had a selector.
In the [example](#services-without-selectors) for a Service without a selector, traffic is routed to one of the two endpoints defined in
the EndpointSlice manifest: a TCP connection to 10.1.2.3 or 10.4.5.6, on port 9376.
An ExternalName Service is a special case of Service that does not have
selectors and uses DNS names instead. For more information, see the
[ExternalName](#externalname) section later in this document.
### EndpointSlices
{{< feature-state for_k8s_version="v1.21" state="stable" >}}
[EndpointSlices](/docs/concepts/services-networking/endpoint-slices/) are objects that
represent a subset (a _slice_) of the backing network endpoints for a Service.
Your Kubernetes cluster tracks how many endpoints each EndpointSlice represents.
If there are so many endpoints for a Service that a threshold is reached, then
Kubernetes adds another empty EndpointSlice and stores new endpoint information
there.
By default, Kubernetes makes a new EndpointSlice once the existing EndpointSlices
all contain at least 100 endpoints. Kubernetes does not make the new EndpointSlice
until an extra endpoint needs to be added.
See [EndpointSlices](/docs/concepts/services-networking/endpoint-slices/) for more
information about this API.
### Endpoints
In the Kubernetes API, an
[Endpoints](/docs/reference/kubernetes-api/service-resources/endpoints-v1/)
(the resource kind is plural) defines a list of network endpoints, typically
referenced by a Service to define which Pods the traffic can be sent to.
The EndpointSlice API is the recommended replacement for Endpoints.
#### Over-capacity endpoints
Kubernetes limits the number of endpoints that can fit in a single Endpoints
object. When there are over 1000 backing endpoints for a Service, Kubernetes
truncates the data in the Endpoints object. Because a Service can be linked
with more than one EndpointSlice, the 1000 backing endpoint limit only
affects the legacy Endpoints API.
In that case, Kubernetes selects at most 1000 possible backend endpoints to store
into the Endpoints object, and sets an
{{< glossary_tooltip text="annotation" term_id="annotation" >}} on the
Endpoints:
[`endpoints.kubernetes.io/over-capacity: truncated`](/docs/reference/labels-annotations-taints/#endpoints-kubernetes-io-over-capacity).
The control plane also removes that annotation if the number of backend Pods drops below 1000.
Traffic is still sent to backends, but any load balancing mechanism that relies on the
legacy Endpoints API only sends traffic to at most 1000 of the available backing endpoints.
The same API limit means that you cannot manually update an Endpoints to have more than 1000 endpoints.
### Application protocol
{{< feature-state for_k8s_version="v1.20" state="stable" >}}
The `appProtocol` field provides a way to specify an application protocol for
each Service port. The value of this field is mirrored by the corresponding
Endpoints and EndpointSlice objects.
This field follows standard Kubernetes label syntax. Values should either be
[IANA standard service names](https://www.iana.org/assignments/service-names) or
domain prefixed names such as `mycompany.com/my-custom-protocol`.
## Virtual IPs and service proxies
Every node in a Kubernetes cluster runs a `kube-proxy`. `kube-proxy` is
responsible for implementing a form of virtual IP for `Services` of type other
than [`ExternalName`](#externalname).
### Why not use round-robin DNS?
A question that pops up every now and then is why Kubernetes relies on
proxying to forward inbound traffic to backends. What about other
approaches? For example, would it be possible to configure DNS records that
have multiple A values (or AAAA for IPv6), and rely on round-robin name
resolution?
There are a few reasons for using proxying for Services:
* There is a long history of DNS implementations not respecting record TTLs,
and caching the results of name lookups after they should have expired.
* Some apps do DNS lookups only once and cache the results indefinitely.
* Even if apps and libraries did proper re-resolution, the low or zero TTLs
on the DNS records could impose a high load on DNS that then becomes
difficult to manage.
Later in this page you can read about how various kube-proxy implementations work. Overall,
you should note that, when running `kube-proxy`, kernel level rules may be
modified (for example, iptables rules might get created), which won't get cleaned up,
in some cases until you reboot. Thus, running kube-proxy is something that should
only be done by an administrator which understands the consequences of having a
low level, privileged network proxying service on a computer. Although the `kube-proxy`
executable supports a `cleanup` function, this function is not an official feature and
thus is only available to use as-is.
### Configuration
Note that the kube-proxy starts up in different modes, which are determined by its configuration.
- The kube-proxy's configuration is done via a ConfigMap, and the ConfigMap for kube-proxy
effectively deprecates the behavior for almost all of the flags for the kube-proxy.
- The ConfigMap for the kube-proxy does not support live reloading of configuration.
- The ConfigMap parameters for the kube-proxy cannot all be validated and verified on startup.
For example, if your operating system doesn't allow you to run iptables commands,
the standard kernel kube-proxy implementation will not work.
Likewise, if you have an operating system which doesn't support `netsh`,
it will not run in Windows userspace mode.
### User space proxy mode {#proxy-mode-userspace}
In this (legacy) mode, kube-proxy watches the Kubernetes control plane for the addition and
removal of Service and Endpoint objects. For each Service it opens a
port (randomly chosen) on the local node. Any connections to this "proxy port"
are proxied to one of the Service's backend Pods (as reported via
Endpoints). kube-proxy takes the `SessionAffinity` setting of the Service into
account when deciding which backend Pod to use.
Lastly, the user-space proxy installs iptables rules which capture traffic to
the Service's `clusterIP` (which is virtual) and `port`. The rules
redirect that traffic to the proxy port which proxies the backend Pod.
By default, kube-proxy in userspace mode chooses a backend via a round-robin algorithm.
![Services overview diagram for userspace proxy](/images/docs/services-userspace-overview.svg)
### `iptables` proxy mode {#proxy-mode-iptables}
In this mode, kube-proxy watches the Kubernetes control plane for the addition and
removal of Service and Endpoint objects. For each Service, it installs
iptables rules, which capture traffic to the Service's `clusterIP` and `port`,
and redirect that traffic to one of the Service's
backend sets. For each Endpoint object, it installs iptables rules which
select a backend Pod.
By default, kube-proxy in iptables mode chooses a backend at random.
Using iptables to handle traffic has a lower system overhead, because traffic
is handled by Linux netfilter without the need to switch between userspace and the
kernel space. This approach is also likely to be more reliable.
If kube-proxy is running in iptables mode and the first Pod that's selected
does not respond, the connection fails. This is different from userspace
mode: in that scenario, kube-proxy would detect that the connection to the first
Pod had failed and would automatically retry with a different backend Pod.
You can use Pod [readiness probes](/docs/concepts/workloads/pods/pod-lifecycle/#container-probes)
to verify that backend Pods are working OK, so that kube-proxy in iptables mode
only sees backends that test out as healthy. Doing this means you avoid
having traffic sent via kube-proxy to a Pod that's known to have failed.
![Services overview diagram for iptables proxy](/images/docs/services-iptables-overview.svg)
### IPVS proxy mode {#proxy-mode-ipvs}
{{< feature-state for_k8s_version="v1.11" state="stable" >}}
In `ipvs` mode, kube-proxy watches Kubernetes Services and Endpoints,
calls `netlink` interface to create IPVS rules accordingly and synchronizes
IPVS rules with Kubernetes Services and Endpoints periodically.
This control loop ensures that IPVS status matches the desired
state.
When accessing a Service, IPVS directs traffic to one of the backend Pods.
The IPVS proxy mode is based on netfilter hook function that is similar to
iptables mode, but uses a hash table as the underlying data structure and works
in the kernel space.
That means kube-proxy in IPVS mode redirects traffic with lower latency than
kube-proxy in iptables mode, with much better performance when synchronizing
proxy rules. Compared to the other proxy modes, IPVS mode also supports a
higher throughput of network traffic.
IPVS provides more options for balancing traffic to backend Pods;
these are:
* `rr`: round-robin
* `lc`: least connection (smallest number of open connections)
* `dh`: destination hashing
* `sh`: source hashing
* `sed`: shortest expected delay
* `nq`: never queue
{{< note >}}
To run kube-proxy in IPVS mode, you must make IPVS available on
the node before starting kube-proxy.
When kube-proxy starts in IPVS proxy mode, it verifies whether IPVS
kernel modules are available. If the IPVS kernel modules are not detected, then kube-proxy
falls back to running in iptables proxy mode.
{{< /note >}}
![Services overview diagram for IPVS proxy](/images/docs/services-ipvs-overview.svg)
In these proxy models, the traffic bound for the Service's IP:Port is
proxied to an appropriate backend without the clients knowing anything
about Kubernetes or Services or Pods.
If you want to make sure that connections from a particular client
are passed to the same Pod each time, you can select the session affinity based
on the client's IP addresses by setting `service.spec.sessionAffinity` to "ClientIP"
(the default is "None").
You can also set the maximum session sticky time by setting
`service.spec.sessionAffinityConfig.clientIP.timeoutSeconds` appropriately.
(the default value is 10800, which works out to be 3 hours).
{{< note >}}
On Windows, setting the maximum session sticky time for Services is not supported.
{{< /note >}}
## Multi-Port Services
For some Services, you need to expose more than one port.
Kubernetes lets you configure multiple port definitions on a Service object.
When using multiple ports for a Service, you must give all of your ports names
so that these are unambiguous.
For example:
```yaml
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
selector:
app.kubernetes.io/name: MyApp
ports:
- name: http
protocol: TCP
port: 80
targetPort: 9376
- name: https
protocol: TCP
port: 443
targetPort: 9377
```
{{< note >}}
As with Kubernetes {{< glossary_tooltip term_id="name" text="names">}} in general, names for ports
must only contain lowercase alphanumeric characters and `-`. Port names must
also start and end with an alphanumeric character.
For example, the names `123-abc` and `web` are valid, but `123_abc` and `-web` are not.
{{< /note >}}
## Choosing your own IP address
You can specify your own cluster IP address as part of a `Service` creation
request. To do this, set the `.spec.clusterIP` field. For example, if you
already have an existing DNS entry that you wish to reuse, or legacy systems
that are configured for a specific IP address and difficult to re-configure.
The IP address that you choose must be a valid IPv4 or IPv6 address from within the
`service-cluster-ip-range` CIDR range that is configured for the API server.
If you try to create a Service with an invalid clusterIP address value, the API
server will return a 422 HTTP status code to indicate that there's a problem.
## Traffic policies
### External traffic policy
You can set the `spec.externalTrafficPolicy` field to control how traffic from external sources is routed.
Valid values are `Cluster` and `Local`. Set the field to `Cluster` to route external traffic to all ready endpoints
and `Local` to only route to ready node-local endpoints. If the traffic policy is `Local` and there are no node-local
endpoints, the kube-proxy does not forward any traffic for the relevant Service.
{{< note >}}
{{< feature-state for_k8s_version="v1.22" state="alpha" >}}
If you enable the `ProxyTerminatingEndpoints`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
for the kube-proxy, the kube-proxy checks if the node
has local endpoints and whether or not all the local endpoints are marked as terminating.
If there are local endpoints and **all** of those are terminating, then the kube-proxy ignores
any external traffic policy of `Local`. Instead, whilst the node-local endpoints remain as all
terminating, the kube-proxy forwards traffic for that Service to healthy endpoints elsewhere,
as if the external traffic policy were set to `Cluster`.
This forwarding behavior for terminating endpoints exists to allow external load balancers to
gracefully drain connections that are backed by `NodePort` Services, even when the health check
node port starts to fail. Otherwise, traffic can be lost between the time a node is still in the node pool of a load
balancer and traffic is being dropped during the termination period of a pod.
{{< /note >}}
### Internal traffic policy
{{< feature-state for_k8s_version="v1.22" state="beta" >}}
You can set the `spec.internalTrafficPolicy` field to control how traffic from internal sources is routed.
Valid values are `Cluster` and `Local`. Set the field to `Cluster` to route internal traffic to all ready endpoints
and `Local` to only route to ready node-local endpoints. If the traffic policy is `Local` and there are no node-local
endpoints, traffic is dropped by kube-proxy.
## Discovering services
Kubernetes supports 2 primary modes of finding a Service - environment
variables and DNS.
### Environment variables
When a Pod is run on a Node, the kubelet adds a set of environment variables
for each active Service. It adds `{SVCNAME}_SERVICE_HOST` and `{SVCNAME}_SERVICE_PORT` variables,
where the Service name is upper-cased and dashes are converted to underscores.
It also supports variables (see [makeLinkVariables](https://github.com/kubernetes/kubernetes/blob/dd2d12f6dc0e654c15d5db57a5f9f6ba61192726/pkg/kubelet/envvars/envvars.go#L72))
that are compatible with Docker Engine's
"_[legacy container links](https://docs.docker.com/network/links/)_" feature.
For example, the Service `redis-primary` which exposes TCP port 6379 and has been
allocated cluster IP address 10.0.0.11, produces the following environment
variables:
```shell
REDIS_PRIMARY_SERVICE_HOST=10.0.0.11
REDIS_PRIMARY_SERVICE_PORT=6379
REDIS_PRIMARY_PORT=tcp://10.0.0.11:6379
REDIS_PRIMARY_PORT_6379_TCP=tcp://10.0.0.11:6379
REDIS_PRIMARY_PORT_6379_TCP_PROTO=tcp
REDIS_PRIMARY_PORT_6379_TCP_PORT=6379
REDIS_PRIMARY_PORT_6379_TCP_ADDR=10.0.0.11
```
{{< note >}}
When you have a Pod that needs to access a Service, and you are using
the environment variable method to publish the port and cluster IP to the client
Pods, you must create the Service *before* the client Pods come into existence.
Otherwise, those client Pods won't have their environment variables populated.
If you only use DNS to discover the cluster IP for a Service, you don't need to
worry about this ordering issue.
{{< /note >}}
### DNS
You can (and almost always should) set up a DNS service for your Kubernetes
cluster using an [add-on](/docs/concepts/cluster-administration/addons/).
A cluster-aware DNS server, such as CoreDNS, watches the Kubernetes API for new
Services and creates a set of DNS records for each one. If DNS has been enabled
throughout your cluster then all Pods should automatically be able to resolve
Services by their DNS name.
For example, if you have a Service called `my-service` in a Kubernetes
namespace `my-ns`, the control plane and the DNS Service acting together
create a DNS record for `my-service.my-ns`. Pods in the `my-ns` namespace
should be able to find the service by doing a name lookup for `my-service`
(`my-service.my-ns` would also work).
Pods in other namespaces must qualify the name as `my-service.my-ns`. These names
will resolve to the cluster IP assigned for the Service.
Kubernetes also supports DNS SRV (Service) records for named ports. If the
`my-service.my-ns` Service has a port named `http` with the protocol set to
`TCP`, you can do a DNS SRV query for `_http._tcp.my-service.my-ns` to discover
the port number for `http`, as well as the IP address.
The Kubernetes DNS server is the only way to access `ExternalName` Services.
You can find more information about `ExternalName` resolution in
[DNS Pods and Services](/docs/concepts/services-networking/dns-pod-service/).
## Headless Services
Sometimes you don't need load-balancing and a single Service IP. In
this case, you can create what are termed "headless" Services, by explicitly
specifying `"None"` for the cluster IP (`.spec.clusterIP`).
You can use a headless Service to interface with other service discovery mechanisms,
without being tied to Kubernetes' implementation.
For headless `Services`, a cluster IP is not allocated, kube-proxy does not handle
these Services, and there is no load balancing or proxying done by the platform
for them. How DNS is automatically configured depends on whether the Service has
selectors defined:
### With selectors
For headless Services that define selectors, the Kubernetes control plane creates
EndpointSlice objects in the Kubernetes API, and modifies the DNS configuration to return
A or AAAA records (IPv4 or IPv6 addresses) that point directly to the Pods backing
the Service.
### Without selectors
For headless Services that do not define selectors, the control plane does
not create EndpointSlice objects. However, the DNS system looks for and configures
either:
* DNS CNAME records for [`type: ExternalName`](#externalname) Services.
* DNS A / AAAA records for all IP addresses of the Service's ready endpoints,
for all Service types other than `ExternalName`.
* For IPv4 endpoints, the DNS system creates A records.
* For IPv6 endpoints, the DNS system creates AAAA records.
## Publishing Services (ServiceTypes) {#publishing-services-service-types}
For some parts of your application (for example, frontends) you may want to expose a
Service onto an external IP address, that's outside of your cluster.
Kubernetes `ServiceTypes` allow you to specify what kind of Service you want.
`Type` values and their behaviors are:
* `ClusterIP`: Exposes the Service on a cluster-internal IP. Choosing this value
makes the Service only reachable from within the cluster. This is the
default that is used if you don't explicitly specify a `type` for a Service.
* [`NodePort`](#type-nodeport): Exposes the Service on each Node's IP at a static port
(the `NodePort`).
To make the node port available, Kubernetes sets up a cluster IP address,
the same as if you had requested a Service of `type: ClusterIP`.
* [`LoadBalancer`](#loadbalancer): Exposes the Service externally using a cloud
provider's load balancer.
* [`ExternalName`](#externalname): Maps the Service to the contents of the
`externalName` field (e.g. `foo.bar.example.com`), by returning a `CNAME` record
with its value. No proxying of any kind is set up.
{{< note >}}
You need either `kube-dns` version 1.7 or CoreDNS version 0.0.8 or higher
to use the `ExternalName` type.
{{< /note >}}
You can also use [Ingress](/docs/concepts/services-networking/ingress/) to expose your Service.
Ingress is not a Service type, but it acts as the entry point for your cluster.
It lets you consolidate your routing rules into a single resource as it can expose multiple
services under the same IP address.
### Type NodePort {#type-nodeport}
If you set the `type` field to `NodePort`, the Kubernetes control plane
allocates a port from a range specified by `--service-node-port-range` flag (default: 30000-32767).
Each node proxies that port (the same port number on every Node) into your Service.
Your Service reports the allocated port in its `.spec.ports[*].nodePort` field.
Using a NodePort gives you the freedom to set up your own load balancing solution,
to configure environments that are not fully supported by Kubernetes, or even
to expose one or more nodes' IP addresses directly.
For a node port Service, Kubernetes additionally allocates a port (TCP, UDP or
SCTP to match the protocol of the Service). Every node in the cluster configures
itself to listen on that assigned port and to forward traffic to one of the ready
endpoints associated with that Service. You'll be able to contact the `type: NodePort`
Service, from outside the cluster, by connecting to any node using the appropriate
protocol (for example: TCP), and the appropriate port (as assigned to that Service).
#### Choosing your own port {#nodeport-custom-port}
If you want a specific port number, you can specify a value in the `nodePort`
field. The control plane will either allocate you that port or report that
the API transaction failed.
This means that you need to take care of possible port collisions yourself.
You also have to use a valid port number, one that's inside the range configured
for NodePort use.
Here is an example manifest for a Service of `type: NodePort` that specifies
a NodePort value (30007, in this example).
```yaml
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
type: NodePort
selector:
app.kubernetes.io/name: MyApp
ports:
# By default and for convenience, the `targetPort` is set to the same value as the `port` field.
- port: 80
targetPort: 80
# Optional field
# By default and for convenience, the Kubernetes control plane will allocate a port from a range (default: 30000-32767)
nodePort: 30007
```
#### Custom IP address configuration for `type: NodePort` Services {#service-nodeport-custom-listen-address}
You can set up nodes in your cluster to use a particular IP address for serving node port
services. You might want to do this if each node is connected to multiple networks (for example:
one network for application traffic, and another network for traffic between nodes and the
control plane).
If you want to specify particular IP address(es) to proxy the port, you can set the
`--nodeport-addresses` flag for kube-proxy or the equivalent `nodePortAddresses`
field of the
[kube-proxy configuration file](/docs/reference/config-api/kube-proxy-config.v1alpha1/)
to particular IP block(s).
This flag takes a comma-delimited list of IP blocks (e.g. `10.0.0.0/8`, `192.0.2.0/25`)
to specify IP address ranges that kube-proxy should consider as local to this node.
For example, if you start kube-proxy with the `--nodeport-addresses=127.0.0.0/8` flag,
kube-proxy only selects the loopback interface for NodePort Services.
The default for `--nodeport-addresses` is an empty list.
This means that kube-proxy should consider all available network interfaces for NodePort.
(That's also compatible with earlier Kubernetes releases.)
{{< note >}}
This Service is visible as `<NodeIP>:spec.ports[*].nodePort` and `.spec.clusterIP:spec.ports[*].port`.
If the `--nodeport-addresses` flag for kube-proxy or the equivalent field
in the kube-proxy configuration file is set, `<NodeIP>` would be a filtered node IP address (or possibly IP addresses).
{{< /note >}}
### Type LoadBalancer {#loadbalancer}
On cloud providers which support external load balancers, setting the `type`
field to `LoadBalancer` provisions a load balancer for your Service.
The actual creation of the load balancer happens asynchronously, and
information about the provisioned balancer is published in the Service's
`.status.loadBalancer` field.
For example:
```yaml
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
selector:
app.kubernetes.io/name: MyApp
ports:
- protocol: TCP
port: 80
targetPort: 9376
clusterIP: 10.0.171.239
type: LoadBalancer
status:
loadBalancer:
ingress:
- ip: 192.0.2.127
```
Traffic from the external load balancer is directed at the backend Pods.
The cloud provider decides how it is load balanced.
Some cloud providers allow you to specify the `loadBalancerIP`. In those cases, the load-balancer is created
with the user-specified `loadBalancerIP`. If the `loadBalancerIP` field is not specified,
the loadBalancer is set up with an ephemeral IP address. If you specify a `loadBalancerIP`
but your cloud provider does not support the feature, the `loadbalancerIP` field that you
set is ignored.
To implement a Service of `type: LoadBalancer`, Kubernetes typically starts off
by making the changes that are equivalent to you requesting a Service of
`type: NodePort`. The cloud-controller-manager component then configures the external load balancer to
forward traffic to that assigned node port.
_As an alpha feature_, you can configure a load balanced Service to
[omit](#load-balancer-nodeport-allocation) assigning a node port, provided that the
cloud provider implementation supports this.
{{< note >}}
On **Azure**, if you want to use a user-specified public type `loadBalancerIP`, you first need
to create a static type public IP address resource. This public IP address resource should
be in the same resource group of the other automatically created resources of the cluster.
For example, `MC_myResourceGroup_myAKSCluster_eastus`.
Specify the assigned IP address as loadBalancerIP. Ensure that you have updated the
`securityGroupName` in the cloud provider configuration file.
For information about troubleshooting `CreatingLoadBalancerFailed` permission issues see,
[Use a static IP address with the Azure Kubernetes Service (AKS) load balancer](https://docs.microsoft.com/en-us/azure/aks/static-ip)
or [CreatingLoadBalancerFailed on AKS cluster with advanced networking](https://github.com/Azure/AKS/issues/357).
{{< /note >}}
#### Load balancers with mixed protocol types
{{< feature-state for_k8s_version="v1.24" state="beta" >}}
By default, for LoadBalancer type of Services, when there is more than one port defined, all
ports must have the same protocol, and the protocol must be one which is supported
by the cloud provider.
The feature gate `MixedProtocolLBService` (enabled by default for the kube-apiserver as of v1.24) allows the use of
different protocols for LoadBalancer type of Services, when there is more than one port defined.
{{< note >}}
The set of protocols that can be used for LoadBalancer type of Services is still defined by the cloud provider. If a
cloud provider does not support mixed protocols they will provide only a single protocol.
{{< /note >}}
#### Disabling load balancer NodePort allocation {#load-balancer-nodeport-allocation}
{{< feature-state for_k8s_version="v1.24" state="stable" >}}
You can optionally disable node port allocation for a Service of `type=LoadBalancer`, by setting
the field `spec.allocateLoadBalancerNodePorts` to `false`. This should only be used for load balancer implementations
that route traffic directly to pods as opposed to using node ports. By default, `spec.allocateLoadBalancerNodePorts`
is `true` and type LoadBalancer Services will continue to allocate node ports. If `spec.allocateLoadBalancerNodePorts`
is set to `false` on an existing Service with allocated node ports, those node ports will **not** be de-allocated automatically.
You must explicitly remove the `nodePorts` entry in every Service port to de-allocate those node ports.
#### Specifying class of load balancer implementation {#load-balancer-class}
{{< feature-state for_k8s_version="v1.24" state="stable" >}}
`spec.loadBalancerClass` enables you to use a load balancer implementation other than the cloud provider default.
By default, `spec.loadBalancerClass` is `nil` and a `LoadBalancer` type of Service uses
the cloud provider's default load balancer implementation if the cluster is configured with
a cloud provider using the `--cloud-provider` component flag.
If `spec.loadBalancerClass` is specified, it is assumed that a load balancer
implementation that matches the specified class is watching for Services.
Any default load balancer implementation (for example, the one provided by
the cloud provider) will ignore Services that have this field set.
`spec.loadBalancerClass` can be set on a Service of type `LoadBalancer` only.
Once set, it cannot be changed.
The value of `spec.loadBalancerClass` must be a label-style identifier,
with an optional prefix such as "`internal-vip`" or "`example.com/internal-vip`".
Unprefixed names are reserved for end-users.
#### Internal load balancer
In a mixed environment it is sometimes necessary to route traffic from Services inside the same
(virtual) network address block.
In a split-horizon DNS environment you would need two Services to be able to route both external
and internal traffic to your endpoints.
To set an internal load balancer, add one of the following annotations to your Service
depending on the cloud Service provider you're using.
{{< tabs name="service_tabs" >}}
{{% tab name="Default" %}}
Select one of the tabs.
{{% /tab %}}
{{% tab name="GCP" %}}
```yaml
[...]
metadata:
name: my-service
annotations:
cloud.google.com/load-balancer-type: "Internal"
[...]
```
{{% /tab %}}
{{% tab name="AWS" %}}
```yaml
[...]
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-internal: "true"
[...]
```
{{% /tab %}}
{{% tab name="Azure" %}}
```yaml
[...]
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
[...]
```
{{% /tab %}}
{{% tab name="IBM Cloud" %}}
```yaml
[...]
metadata:
name: my-service
annotations:
service.kubernetes.io/ibm-load-balancer-cloud-provider-ip-type: "private"
[...]
```
{{% /tab %}}
{{% tab name="OpenStack" %}}
```yaml
[...]
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
[...]
```
{{% /tab %}}
{{% tab name="Baidu Cloud" %}}
```yaml
[...]
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/cce-load-balancer-internal-vpc: "true"
[...]
```
{{% /tab %}}
{{% tab name="Tencent Cloud" %}}
```yaml
[...]
metadata:
annotations:
service.kubernetes.io/qcloud-loadbalancer-internal-subnetid: subnet-xxxxx
[...]
```
{{% /tab %}}
{{% tab name="Alibaba Cloud" %}}
```yaml
[...]
metadata:
annotations:
service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: "intranet"
[...]
```
{{% /tab %}}
{{% tab name="OCI" %}}
```yaml
[...]
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/oci-load-balancer-internal: true
[...]
```
{{% /tab %}}
{{< /tabs >}}
#### TLS support on AWS {#ssl-support-on-aws}
For partial TLS / SSL support on clusters running on AWS, you can add three
annotations to a `LoadBalancer` service:
```yaml
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-east-1:123456789012:certificate/12345678-1234-1234-1234-123456789012
```
The first specifies the ARN of the certificate to use. It can be either a
certificate from a third party issuer that was uploaded to IAM or one created
within AWS Certificate Manager.
```yaml
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-backend-protocol: (https|http|ssl|tcp)
```
The second annotation specifies which protocol a Pod speaks. For HTTPS and
SSL, the ELB expects the Pod to authenticate itself over the encrypted
connection, using a certificate.
HTTP and HTTPS selects layer 7 proxying: the ELB terminates
the connection with the user, parses headers, and injects the `X-Forwarded-For`
header with the user's IP address (Pods only see the IP address of the
ELB at the other end of its connection) when forwarding requests.
TCP and SSL selects layer 4 proxying: the ELB forwards traffic without
modifying the headers.
In a mixed-use environment where some ports are secured and others are left unencrypted,
you can use the following annotations:
```yaml
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443,8443"
```
In the above example, if the Service contained three ports, `80`, `443`, and
`8443`, then `443` and `8443` would use the SSL certificate, but `80` would be proxied HTTP.
From Kubernetes v1.9 onwards you can use
[predefined AWS SSL policies](https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-security-policy-table.html)
with HTTPS or SSL listeners for your Services.
To see which policies are available for use, you can use the `aws` command line tool:
```bash
aws elb describe-load-balancer-policies --query 'PolicyDescriptions[].PolicyName'
```
You can then specify any one of those policies using the
"`service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy`"
annotation; for example:
```yaml
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy: "ELBSecurityPolicy-TLS-1-2-2017-01"
```
#### PROXY protocol support on AWS
To enable [PROXY protocol](https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt)
support for clusters running on AWS, you can use the following service
annotation:
```yaml
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
```
Since version 1.3.0, the use of this annotation applies to all ports proxied by the ELB
and cannot be configured otherwise.
#### ELB Access Logs on AWS
There are several annotations to manage access logs for ELB Services on AWS.
The annotation `service.beta.kubernetes.io/aws-load-balancer-access-log-enabled`
controls whether access logs are enabled.
The annotation `service.beta.kubernetes.io/aws-load-balancer-access-log-emit-interval`
controls the interval in minutes for publishing the access logs. You can specify
an interval of either 5 or 60 minutes.
The annotation `service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-name`
controls the name of the Amazon S3 bucket where load balancer access logs are
stored.
The annotation `service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-prefix`
specifies the logical hierarchy you created for your Amazon S3 bucket.
```yaml
metadata:
name: my-service
annotations:
# Specifies whether access logs are enabled for the load balancer
service.beta.kubernetes.io/aws-load-balancer-access-log-enabled: "true"
# The interval for publishing the access logs. You can specify an interval of either 5 or 60 (minutes).
service.beta.kubernetes.io/aws-load-balancer-access-log-emit-interval: "60"
# The name of the Amazon S3 bucket where the access logs are stored
service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-name: "my-bucket"
# The logical hierarchy you created for your Amazon S3 bucket, for example `my-bucket-prefix/prod`
service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-prefix: "my-bucket-prefix/prod"
```
#### Connection Draining on AWS
Connection draining for Classic ELBs can be managed with the annotation
`service.beta.kubernetes.io/aws-load-balancer-connection-draining-enabled` set
to the value of `"true"`. The annotation
`service.beta.kubernetes.io/aws-load-balancer-connection-draining-timeout` can
also be used to set maximum time, in seconds, to keep the existing connections open before
deregistering the instances.
```yaml
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-connection-draining-enabled: "true"
service.beta.kubernetes.io/aws-load-balancer-connection-draining-timeout: "60"
```
#### Other ELB annotations
There are other annotations to manage Classic Elastic Load Balancers that are described below.
```yaml
metadata:
name: my-service
annotations:
# The time, in seconds, that the connection is allowed to be idle (no data has been sent
# over the connection) before it is closed by the load balancer
service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
# Specifies whether cross-zone load balancing is enabled for the load balancer
service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
# A comma-separated list of key-value pairs which will be recorded as
# additional tags in the ELB.
service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: "environment=prod,owner=devops"
# The number of successive successful health checks required for a backend to
# be considered healthy for traffic. Defaults to 2, must be between 2 and 10
service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: ""
# The number of unsuccessful health checks required for a backend to be
# considered unhealthy for traffic. Defaults to 6, must be between 2 and 10
service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "3"
# The approximate interval, in seconds, between health checks of an
# individual instance. Defaults to 10, must be between 5 and 300
service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "20"
# The amount of time, in seconds, during which no response means a failed
# health check. This value must be less than the service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval
# value. Defaults to 5, must be between 2 and 60
service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: "5"
# A list of existing security groups to be configured on the ELB created. Unlike the annotation
# service.beta.kubernetes.io/aws-load-balancer-extra-security-groups, this replaces all other
# security groups previously assigned to the ELB and also overrides the creation
# of a uniquely generated security group for this ELB.
# The first security group ID on this list is used as a source to permit incoming traffic to
# target worker nodes (service traffic and health checks).
# If multiple ELBs are configured with the same security group ID, only a single permit line
# will be added to the worker node security groups, that means if you delete any
# of those ELBs it will remove the single permit line and block access for all ELBs that shared the same security group ID.
# This can cause a cross-service outage if not used properly
service.beta.kubernetes.io/aws-load-balancer-security-groups: "sg-53fae93f"
# A list of additional security groups to be added to the created ELB, this leaves the uniquely
# generated security group in place, this ensures that every ELB
# has a unique security group ID and a matching permit line to allow traffic to the target worker nodes
# (service traffic and health checks).
# Security groups defined here can be shared between services.
service.beta.kubernetes.io/aws-load-balancer-extra-security-groups: "sg-53fae93f,sg-42efd82e"
# A comma separated list of key-value pairs which are used
# to select the target nodes for the load balancer
service.beta.kubernetes.io/aws-load-balancer-target-node-labels: "ingress-gw,gw-name=public-api"
```
#### Network Load Balancer support on AWS {#aws-nlb-support}
{{< feature-state for_k8s_version="v1.15" state="beta" >}}
To use a Network Load Balancer on AWS, use the annotation `service.beta.kubernetes.io/aws-load-balancer-type` with the value set to `nlb`.
```yaml
metadata:
name: my-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
```
{{< note >}}
NLB only works with certain instance classes; see the
[AWS documentation](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-register-targets.html#register-deregister-targets)
on Elastic Load Balancing for a list of supported instance types.
{{< /note >}}
Unlike Classic Elastic Load Balancers, Network Load Balancers (NLBs) forward the
client's IP address through to the node. If a Service's `.spec.externalTrafficPolicy`
is set to `Cluster`, the client's IP address is not propagated to the end
Pods.
By setting `.spec.externalTrafficPolicy` to `Local`, the client IP addresses is
propagated to the end Pods, but this could result in uneven distribution of
traffic. Nodes without any Pods for a particular LoadBalancer Service will fail
the NLB Target Group's health check on the auto-assigned
`.spec.healthCheckNodePort` and not receive any traffic.
In order to achieve even traffic, either use a DaemonSet or specify a
[pod anti-affinity](/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity)
to not locate on the same node.
You can also use NLB Services with the [internal load balancer](/docs/concepts/services-networking/service/#internal-load-balancer)
annotation.
In order for client traffic to reach instances behind an NLB, the Node security
groups are modified with the following IP rules:
| Rule | Protocol | Port(s) | IpRange(s) | IpRange Description |
|------|----------|---------|------------|---------------------|
| Health Check | TCP | NodePort(s) (`.spec.healthCheckNodePort` for `.spec.externalTrafficPolicy = Local`) | Subnet CIDR | kubernetes.io/rule/nlb/health=\<loadBalancerName\> |
| Client Traffic | TCP | NodePort(s) | `.spec.loadBalancerSourceRanges` (defaults to `0.0.0.0/0`) | kubernetes.io/rule/nlb/client=\<loadBalancerName\> |
| MTU Discovery | ICMP | 3,4 | `.spec.loadBalancerSourceRanges` (defaults to `0.0.0.0/0`) | kubernetes.io/rule/nlb/mtu=\<loadBalancerName\> |
In order to limit which client IP's can access the Network Load Balancer,
specify `loadBalancerSourceRanges`.
```yaml
spec:
loadBalancerSourceRanges:
- "143.231.0.0/16"
```
{{< note >}}
If `.spec.loadBalancerSourceRanges` is not set, Kubernetes
allows traffic from `0.0.0.0/0` to the Node Security Group(s). If nodes have
public IP addresses, be aware that non-NLB traffic can also reach all instances
in those modified security groups.
{{< /note >}}
Further documentation on annotations for Elastic IPs and other common use-cases may be found
in the [AWS Load Balancer Controller documentation](https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/guide/service/annotations/).
#### Other CLB annotations on Tencent Kubernetes Engine (TKE)
There are other annotations for managing Cloud Load Balancers on TKE as shown below.
```yaml
metadata:
name: my-service
annotations:
# Bind Loadbalancers with specified nodes
service.kubernetes.io/qcloud-loadbalancer-backends-label: key in (value1, value2)
# ID of an existing load balancer
service.kubernetes.io/tke-existed-lbidlb-6swtxxxx
# Custom parameters for the load balancer (LB), does not support modification of LB type yet
service.kubernetes.io/service.extensiveParameters: ""
# Custom parameters for the LB listener
service.kubernetes.io/service.listenerParameters: ""
# Specifies the type of Load balancer;
# valid values: classic (Classic Cloud Load Balancer) or application (Application Cloud Load Balancer)
service.kubernetes.io/loadbalance-type: xxxxx
# Specifies the public network bandwidth billing method;
# valid values: TRAFFIC_POSTPAID_BY_HOUR(bill-by-traffic) and BANDWIDTH_POSTPAID_BY_HOUR (bill-by-bandwidth).
service.kubernetes.io/qcloud-loadbalancer-internet-charge-type: xxxxxx
# Specifies the bandwidth value (value range: [1,2000] Mbps).
service.kubernetes.io/qcloud-loadbalancer-internet-max-bandwidth-out: "10"
# When this annotation is setthe loadbalancers will only register nodes
# with pod running on it, otherwise all nodes will be registered.
service.kubernetes.io/local-svc-only-bind-node-with-pod: true
```
### Type ExternalName {#externalname}
Services of type ExternalName map a Service to a DNS name, not to a typical selector such as
`my-service` or `cassandra`. You specify these Services with the `spec.externalName` parameter.
This Service definition, for example, maps
the `my-service` Service in the `prod` namespace to `my.database.example.com`:
```yaml
apiVersion: v1
kind: Service
metadata:
name: my-service
namespace: prod
spec:
type: ExternalName
externalName: my.database.example.com
```
{{< note >}}
ExternalName accepts an IPv4 address string, but as a DNS name comprised of digits, not as an IP address.
ExternalNames that resemble IPv4 addresses are not resolved by CoreDNS or ingress-nginx because ExternalName
is intended to specify a canonical DNS name. To hardcode an IP address, consider using
[headless Services](#headless-services).
{{< /note >}}
When looking up the host `my-service.prod.svc.cluster.local`, the cluster DNS Service
returns a `CNAME` record with the value `my.database.example.com`. Accessing
`my-service` works in the same way as other Services but with the crucial
difference that redirection happens at the DNS level rather than via proxying or
forwarding. Should you later decide to move your database into your cluster, you
can start its Pods, add appropriate selectors or endpoints, and change the
Service's `type`.
{{< warning >}}
You may have trouble using ExternalName for some common protocols, including HTTP and HTTPS.
If you use ExternalName then the hostname used by clients inside your cluster is different from
the name that the ExternalName references.
For protocols that use hostnames this difference may lead to errors or unexpected responses.
HTTP requests will have a `Host:` header that the origin server does not recognize;
TLS servers will not be able to provide a certificate matching the hostname that the client connected to.
{{< /warning >}}
{{< note >}}
This section is indebted to the [Kubernetes Tips - Part
1](https://akomljen.com/kubernetes-tips-part-1/) blog post from [Alen Komljen](https://akomljen.com/).
{{< /note >}}
### External IPs
If there are external IPs that route to one or more cluster nodes, Kubernetes Services can be exposed on those
`externalIPs`. Traffic that ingresses into the cluster with the external IP (as destination IP), on the Service port,
will be routed to one of the Service endpoints. `externalIPs` are not managed by Kubernetes and are the responsibility
of the cluster administrator.
In the Service spec, `externalIPs` can be specified along with any of the `ServiceTypes`.
In the example below, "`my-service`" can be accessed by clients on "`80.11.12.10:80`" (`externalIP:port`)
```yaml
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
selector:
app.kubernetes.io/name: MyApp
ports:
- name: http
protocol: TCP
port: 80
targetPort: 9376
externalIPs:
- 80.11.12.10
```
## Shortcomings
Using the userspace proxy for VIPs works at small to medium scale, but will
not scale to very large clusters with thousands of Services. The
[original design proposal for portals](https://github.com/kubernetes/kubernetes/issues/1107)
has more details on this.
Using the userspace proxy obscures the source IP address of a packet accessing
a Service.
This makes some kinds of network filtering (firewalling) impossible. The iptables
proxy mode does not
obscure in-cluster source IPs, but it does still impact clients coming through
a load balancer or node-port.
The `Type` field is designed as nested functionality - each level adds to the
previous. This is not strictly required on all cloud providers (e.g. Google Compute Engine does
not need to allocate a `NodePort` to make `LoadBalancer` work, but AWS does)
but the Kubernetes API design for Service requires it anyway.
## Virtual IP implementation {#the-gory-details-of-virtual-ips}
The previous information should be sufficient for many people who want to
use Services. However, there is a lot going on behind the scenes that may be
worth understanding.
### Avoiding collisions
One of the primary philosophies of Kubernetes is that you should not be
exposed to situations that could cause your actions to fail through no fault
of your own. For the design of the Service resource, this means not making
you choose your own port number if that choice might collide with
someone else's choice. That is an isolation failure.
In order to allow you to choose a port number for your Services, we must
ensure that no two Services can collide. Kubernetes does that by allocating each
Service its own IP address from within the `service-cluster-ip-range`
CIDR range that is configured for the API server.
To ensure each Service receives a unique IP, an internal allocator atomically
updates a global allocation map in {{< glossary_tooltip term_id="etcd" >}}
prior to creating each Service. The map object must exist in the registry for
Services to get IP address assignments, otherwise creations will
fail with a message indicating an IP address could not be allocated.
In the control plane, a background controller is responsible for creating that
map (needed to support migrating from older versions of Kubernetes that used
in-memory locking). Kubernetes also uses controllers to check for invalid
assignments (e.g. due to administrator intervention) and for cleaning up allocated
IP addresses that are no longer used by any Services.
#### IP address ranges for `type: ClusterIP` Services {#service-ip-static-sub-range}
{{< feature-state for_k8s_version="v1.25" state="beta" >}}
However, there is a problem with this `ClusterIP` allocation strategy, because a user
can also [choose their own address for the service](#choosing-your-own-ip-address).
This could result in a conflict if the internal allocator selects the same IP address
for another Service.
The `ServiceIPStaticSubrange`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled by default in v1.25
and later, using an allocation strategy that divides the `ClusterIP` range into two bands, based on
the size of the configured `service-cluster-ip-range` by using the following formula
`min(max(16, cidrSize / 16), 256)`, described as _never less than 16 or more than 256,
with a graduated step function between them_. Dynamic IP allocations will be preferentially
chosen from the upper band, reducing risks of conflicts with the IPs
assigned from the lower band.
This allows users to use the lower band of the `service-cluster-ip-range` for their
Services with static IPs assigned with a very low risk of running into conflicts.
### Service IP addresses {#ips-and-vips}
Unlike Pod IP addresses, which actually route to a fixed destination,
Service IPs are not actually answered by a single host. Instead, kube-proxy
uses iptables (packet processing logic in Linux) to define _virtual_ IP addresses
which are transparently redirected as needed. When clients connect to the
VIP, their traffic is automatically transported to an appropriate endpoint.
The environment variables and DNS for Services are actually populated in
terms of the Service's virtual IP address (and port).
kube-proxy supports three proxy modes&mdash;userspace, iptables and IPVS&mdash;which
each operate slightly differently.
#### Userspace
As an example, consider the image processing application described above.
When the backend Service is created, the Kubernetes master assigns a virtual
IP address, for example 10.0.0.1. Assuming the Service port is 1234, the
Service is observed by all of the kube-proxy instances in the cluster.
When a proxy sees a new Service, it opens a new random port, establishes an
iptables redirect from the virtual IP address to this new port, and starts accepting
connections on it.
When a client connects to the Service's virtual IP address, the iptables
rule kicks in, and redirects the packets to the proxy's own port.
The "Service proxy" chooses a backend, and starts proxying traffic from the client to the backend.
This means that Service owners can choose any port they want without risk of
collision. Clients can connect to an IP and port, without being aware
of which Pods they are actually accessing.
#### iptables
Again, consider the image processing application described above.
When the backend Service is created, the Kubernetes control plane assigns a virtual
IP address, for example 10.0.0.1. Assuming the Service port is 1234, the
Service is observed by all of the kube-proxy instances in the cluster.
When a proxy sees a new Service, it installs a series of iptables rules which
redirect from the virtual IP address to per-Service rules. The per-Service
rules link to per-Endpoint rules which redirect traffic (using destination NAT)
to the backends.
When a client connects to the Service's virtual IP address the iptables rule kicks in.
A backend is chosen (either based on session affinity or randomly) and packets are
redirected to the backend. Unlike the userspace proxy, packets are never
copied to userspace, the kube-proxy does not have to be running for the virtual
IP address to work, and Nodes see traffic arriving from the unaltered client IP
address.
This same basic flow executes when traffic comes in through a node-port or
through a load-balancer, though in those cases the client IP does get altered.
#### IPVS
iptables operations slow down dramatically in large scale cluster e.g. 10,000 Services.
IPVS is designed for load balancing and based on in-kernel hash tables.
So you can achieve performance consistency in large number of Services from IPVS-based kube-proxy.
Meanwhile, IPVS-based kube-proxy has more sophisticated load balancing algorithms
(least conns, locality, weighted, persistence).
## API Object
Service is a top-level resource in the Kubernetes REST API. You can find more details
about the [Service API object](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#service-v1-core).
## Supported protocols {#protocol-support}
### TCP
You can use TCP for any kind of Service, and it's the default network protocol.
### UDP
You can use UDP for most Services. For type=LoadBalancer Services, UDP support
depends on the cloud provider offering this facility.
### SCTP
{{< feature-state for_k8s_version="v1.20" state="stable" >}}
When using a network plugin that supports SCTP traffic, you can use SCTP for
most Services. For type=LoadBalancer Services, SCTP support depends on the cloud
provider offering this facility. (Most do not).
#### Warnings {#caveat-sctp-overview}
##### Support for multihomed SCTP associations {#caveat-sctp-multihomed}
{{< warning >}}
The support of multihomed SCTP associations requires that the CNI plugin can support the
assignment of multiple interfaces and IP addresses to a Pod.
NAT for multihomed SCTP associations requires special logic in the corresponding kernel modules.
{{< /warning >}}
##### Windows {#caveat-sctp-windows-os}
{{< note >}}
SCTP is not supported on Windows based nodes.
{{< /note >}}
##### Userspace kube-proxy {#caveat-sctp-kube-proxy-userspace}
{{< warning >}}
The kube-proxy does not support the management of SCTP associations when it is in userspace mode.
{{< /warning >}}
### HTTP
If your cloud provider supports it, you can use a Service in LoadBalancer mode
to set up external HTTP / HTTPS reverse proxying, forwarded to the Endpoints
of the Service.
{{< note >}}
You can also use {{< glossary_tooltip term_id="ingress" >}} in place of Service
to expose HTTP/HTTPS Services.
{{< /note >}}
### PROXY protocol
If your cloud provider supports it,
you can use a Service in LoadBalancer mode to configure a load balancer outside
of Kubernetes itself, that will forward connections prefixed with
[PROXY protocol](https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt).
The load balancer will send an initial series of octets describing the
incoming connection, similar to this example
```
PROXY TCP4 192.0.2.202 10.0.42.7 12345 7\r\n
```
followed by the data from the client.
## {{% heading "whatsnext" %}}
* Follow the [Connecting Applications with Services](/docs/tutorials/services/connect-applications-service/) tutorial
* Read about [Ingress](/docs/concepts/services-networking/ingress/)
* Read about [EndpointSlices](/docs/concepts/services-networking/endpoint-slices/)