Announcing Istio's 2025 Steering Committee Elections

The Istio Steering Committee oversees the administrative aspects of the project, including governance, branding, marketing, and working with the CNCF.

Every year, the leaders in the Istio project estimate the proportion of the hundreds of companies that have contributed to Istio in the past year, and use that metric to proportionally allocate nine Contribution Seats on our Steering Committee.

Then, four Community Seats are voted for by our project members, with candidates being from companies that did not receive Contribution Seats.

We are pleased to share the result of this year’s calculation, and to kick off our Community Seat election.

Contribution seats

The calculation for the 2025-2026 term reflects the deep investment of our vendors in the Istio open source project, especially in the area of ambient mode. As was the case last year, we have five companies represented in our Contribution Seats:

Company     Seat allocation
Solo.io     5
Microsoft   1
Huawei      1
Google      1
Tetrate     1

The full allocation can be seen in our formula spreadsheet.

Community Seat election

Last year, we changed the timing of the Community Seat elections to immediately follow the allocation of the Contribution Seats. It is therefore now time to collect our nominations for candidates, and ensure our voter list is correct.

Candidates

Eligibility for candidacy is defined in the Steering Committee charter as a project member who does not work for a Company that will hold a Contribution Seat during the upcoming term.

We would now like to invite members from outside our Contribution Seat holders to stand for election. Nominations are due by February 23.

Voters

Eligibility to vote is defined in the charter as either:

  • a project member who has had at least one Pull Request merged in the past 12 months, or
  • someone who has submitted the voting exception form and has been accepted by the Steering Committee as having standing in the community through contribution of another kind.

The draft list of voters has been published. If you’re not on that list and you believe you have standing in the Istio community, please submit the exception form.

Exception requests are due by February 23. Voting will start on February 24 and last until March 9.

Announcement of the new committee

Upon the completion of the election, the entire 2025-2026 committee - election winners and company-selected Contribution Seat holders - will be announced.

The Steering Committee wishes to thank its members, old and new, and looks forward to continuing to grow and improve Istio as a successful and sustainable open source project. We encourage everyone to get involved in the Istio community by contributing, standing for election, voting, and helping us shape the future of cloud native networking.

Published: Thu, 13 Feb 2025

Policy based authorization using Kyverno

Istio supports integration with many different projects. The Istio blog recently featured a post on L7 policy functionality with OpenPolicyAgent. Kyverno is a similar project, and today we will dive into how Istio and the Kyverno Authz Server can be used together to enforce Layer 7 policies in your platform.

We will show you how to get started with a simple example. You will come to see how this combination is a solid option to deliver policy quickly and transparently to application teams everywhere in the business, while also providing the data the security teams need for audit and compliance.

Try it out

When integrated with Istio, the Kyverno Authz Server can be used to enforce fine-grained access control policies for microservices.

This guide shows how to enforce access control policies for a simple microservices application.

Prerequisites

  • A Kubernetes cluster with Istio installed.
  • The istioctl command-line tool installed.

Install Istio and configure your mesh options to enable Kyverno:

$ istioctl install -y -f - <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    accessLogFile: /dev/stdout
    accessLogFormat: |
      [KYVERNO DEMO] my-new-dynamic-metadata: '%DYNAMIC_METADATA(envoy.filters.http.ext_authz)%'
    extensionProviders:
    - name: kyverno-authz-server
      envoyExtAuthzGrpc:
        service: kyverno-authz-server.kyverno.svc.cluster.local
        port: '9081'
EOF

Notice that in the configuration, we define an extensionProviders section that points to the Kyverno Authz Server installation:

[...]
    extensionProviders:
    - name: kyverno-authz-server
      envoyExtAuthzGrpc:
        service: kyverno-authz-server.kyverno.svc.cluster.local
        port: '9081'
[...]

Deploy the Kyverno Authz Server

The Kyverno Authz Server is a gRPC server capable of processing Envoy External Authorization requests.

It is configurable using Kyverno AuthorizationPolicy resources, either stored in-cluster or provided externally.

$ kubectl create ns kyverno
$ kubectl label namespace kyverno istio-injection=enabled
$ helm install kyverno-authz-server --namespace kyverno --wait --repo https://kyverno.github.io/kyverno-envoy-plugin kyverno-authz-server
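
Before moving on, you may want to confirm that the Kyverno Authz Server is running and that its service name matches the extension provider configured above (a quick sanity check; your output will vary):

$ kubectl get pods -n kyverno
$ kubectl get service kyverno-authz-server -n kyverno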

Deploy the sample application

httpbin is a well-known application that can be used to test HTTP requests, and it helps to quickly show how we can play with the request and response attributes.

$ kubectl create ns my-app
$ kubectl label namespace my-app istio-injection=enabled
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.24/samples/httpbin/httpbin.yaml -n my-app

Deploy an Istio AuthorizationPolicy

An AuthorizationPolicy defines the services that will be protected by the Kyverno Authz Server.

$ kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: my-kyverno-authz
  namespace: istio-system # This enforces the policy on the whole mesh, istio-system being the mesh root namespace
spec:
  selector:
    matchLabels:
      ext-authz: enabled
  action: CUSTOM
  provider:
    name: kyverno-authz-server
  rules: [{}] # Empty rules; the policy will apply to workloads with the ext-authz: enabled label
EOF

Notice that in this resource, we define the Kyverno Authz Server extensionProvider you set in the Istio configuration:

[...]
  provider:
    name: kyverno-authz-server
[...]

Label the app to enforce the policy

Let’s label the app to enforce the policy. The label is needed for the Istio AuthorizationPolicy to apply to the sample application pods.

$ kubectl patch deploy httpbin -n my-app --type=merge -p='{
  "spec": {
    "template": {
      "metadata": {
        "labels": {
          "ext-authz": "enabled"
        }
      }
    }
  }
}'
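
You can verify that the new label was applied and that the httpbin pod was re-created with it (output will vary):

$ kubectl get pods -n my-app -l ext-authz=enabled --show-labels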

Deploy a Kyverno AuthorizationPolicy

A Kyverno AuthorizationPolicy defines the rules used by the Kyverno Authz Server to make a decision based on a given Envoy CheckRequest.

It uses the CEL language to analyze an incoming CheckRequest and is expected to produce a CheckResponse in return.

The incoming request is available under the object field, and the policy can define variables that will be made available to all authorizations.

$ kubectl apply -f - <<EOF
apiVersion: envoy.kyverno.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: demo-policy.example.com
spec:
  failurePolicy: Fail
  variables:
  - name: force_authorized
    expression: object.attributes.request.http.?headers["x-force-authorized"].orValue("")
  - name: allowed
    expression: variables.force_authorized in ["enabled", "true"]
  authorizations:
  - expression: >
      variables.allowed
        ? envoy.Allowed().Response()
        : envoy.Denied(403).Response()
EOF

Notice that you can build the CheckResponse by hand or use CEL helper functions like envoy.Allowed() and envoy.Denied(403) to simplify creating the response message:

[...]
  - expression: >
      variables.allowed
        ? envoy.Allowed().Response()
        : envoy.Denied(403).Response()
[...]

How it works

When applying the AuthorizationPolicy, the Istio control plane (istiod) sends the required configurations to the sidecar proxy (Envoy) of the selected services in the policy. Envoy will then send the request to the Kyverno Authz Server to check if the request is allowed or not.

The Envoy proxy works by configuring filters in a chain. One of those filters is ext_authz, which implements an external authorization service with a specific message. Any server implementing the correct protobuf can connect to the Envoy proxy and provide the authorization decision; the Kyverno Authz Server is one of those servers.

Reviewing Envoy’s Authorization service documentation, you can see that the message has these attributes:

  • Ok response

    {
      "status": {...},
      "ok_response": {
        "headers": [],
        "headers_to_remove": [],
        "response_headers_to_add": [],
        "query_parameters_to_set": [],
        "query_parameters_to_remove": []
      },
      "dynamic_metadata": {...}
    }
  • Denied response

    {
      "status": {...},
      "denied_response": {
        "status": {...},
        "headers": [],
        "body": "..."
      },
      "dynamic_metadata": {...}
    }

This means that based on the response from the authz server, Envoy can add or remove headers, query parameters, and even change the response body.

We can do this as well, as documented in the Kyverno Authz Server documentation.

Testing

Let’s test the simple usage (authorization) and then let’s create a more advanced policy to show how we can use the Kyverno Authz Server to modify the request and response.

Deploy an app to run curl commands to the httpbin sample application:

$ kubectl apply -n my-app -f https://raw.githubusercontent.com/istio/istio/release-1.24/samples/curl/curl.yaml

Apply the policy:

$ kubectl apply -f - <<EOF
apiVersion: envoy.kyverno.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: demo-policy.example.com
spec:
  failurePolicy: Fail
  variables:
  - name: force_authorized
    expression: object.attributes.request.http.?headers["x-force-authorized"].orValue("")
  - name: allowed
    expression: variables.force_authorized in ["enabled", "true"]
  authorizations:
  - expression: >
      variables.allowed
        ? envoy.Allowed().Response()
        : envoy.Denied(403).Response()
EOF

The simple scenario is to allow requests if they contain the header x-force-authorized with the value enabled or true. If the header is not present or has a different value, the request will be denied.

In this case, we combined the allow and deny response handling in a single expression. However, it is possible to use multiple expressions; the first one that returns a non-null response will be used by the Kyverno Authz Server. This is useful when a rule doesn’t want to make a decision and instead delegates to the next rule:

[...]
  authorizations:
  # allow the request when the header value matches
  - expression: >
      variables.allowed
        ? envoy.Allowed().Response()
        : null
  # else deny the request
  - expression: >
      envoy.Denied(403).Response()
[...]

Simple rule

The following request will return 403:

$ kubectl exec -n my-app deploy/curl -- curl -s -w "\nhttp_code=%{http_code}" httpbin:8000/get

The following request will return 200:

$ kubectl exec -n my-app deploy/curl -- curl -s -w "\nhttp_code=%{http_code}" httpbin:8000/get -H "x-force-authorized: true"

Advanced manipulations

Now for the more advanced use case. Apply the second policy:

$ kubectl apply -f - <<EOF
apiVersion: envoy.kyverno.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: demo-policy.example.com
spec:
  variables:
  - name: force_authorized
    expression: object.attributes.request.http.headers[?"x-force-authorized"].orValue("") in ["enabled", "true"]
  - name: force_unauthenticated
    expression: object.attributes.request.http.headers[?"x-force-unauthenticated"].orValue("") in ["enabled", "true"]
  - name: metadata
    expression: '{"my-new-metadata": "my-new-value"}'
  authorizations:
    # if force_unauthenticated -> 401
  - expression: >
      variables.force_unauthenticated
        ? envoy
            .Denied(401)
            .WithBody("Authentication Failed")
            .Response()
        : null
    # if force_authorized -> 200
  - expression: >
      variables.force_authorized
        ? envoy
            .Allowed()
            .WithHeader("x-validated-by", "my-security-checkpoint")
            .WithoutHeader("x-force-authorized")
            .WithResponseHeader("x-add-custom-response-header", "added")
            .Response()
            .WithMetadata(variables.metadata)
        : null
    # else -> 403
  - expression: >
      envoy
        .Denied(403)
        .WithBody("Unauthorized Request")
        .Response()
EOF

In that policy, you can see:

  • If the request has the x-force-unauthenticated: true header (or x-force-unauthenticated: enabled), we will return 401 with the “Authentication Failed” body
  • Else, if the request has the x-force-authorized: true header (or x-force-authorized: enabled), we will return 200 and manipulate request headers, response headers and inject dynamic metadata
  • In all other cases, we will return 403 with the “Unauthorized Request” body

The corresponding CheckResponse will be returned to the Envoy proxy from the Kyverno Authz Server. Envoy will use those values to modify the request and response accordingly.

Change returned body

Let’s test the new capabilities:

$ kubectl exec -n my-app deploy/curl -- curl -s -w "\nhttp_code=%{http_code}" httpbin:8000/get

Now we can change the response body.

With a 403, the body will be changed to “Unauthorized Request”. Running the previous command, you should receive:

Unauthorized Request
http_code=403

Change returned body and status code

Running the request with the header x-force-unauthenticated: true:

$ kubectl exec -n my-app deploy/curl -- curl -s -w "\nhttp_code=%{http_code}" httpbin:8000/get -H "x-force-unauthenticated: true"

This time you should receive the body “Authentication Failed” and error 401:

Authentication Failed
http_code=401

Adding headers to request

Running a valid request:

$ kubectl exec -n my-app deploy/curl -- curl -s -w "\nhttp_code=%{http_code}" httpbin:8000/get -H "x-force-authorized: true"

You should receive the echo body with the new header x-validated-by: my-security-checkpoint and the header x-force-authorized removed:

[...]
    "X-Validated-By": [
      "my-security-checkpoint"
    ]
[...]
http_code=200

Adding headers to response

Running the same request but showing only the header:

$ kubectl exec -n my-app deploy/curl -- curl -s -I -w "\nhttp_code=%{http_code}" httpbin:8000/get -H "x-force-authorized: true"

You will find the response header added during the Authz check x-add-custom-response-header: added:

HTTP/1.1 200 OK
[...]
x-add-custom-response-header: added
[...]
http_code=200

Sharing data between filters

Finally, you can pass data to the following Envoy filters using dynamic_metadata.

This is useful when you want to pass data to another ext_authz filter in the chain or you want to print it in the application logs.

To do so, review the access log format you set earlier:

[...]
    accessLogFormat: |
      [KYVERNO DEMO] my-new-dynamic-metadata: "%DYNAMIC_METADATA(envoy.filters.http.ext_authz)%"
[...]

DYNAMIC_METADATA is a reserved keyword to access the metadata object. The rest is the name of the filter that you want to access.

In our case, the name envoy.filters.http.ext_authz is created automatically by Istio. You can verify this by dumping the Envoy configuration:

$ istioctl pc all deploy/httpbin -n my-app -oyaml | grep envoy.filters.http.ext_authz

You will see the configurations for the filter.

Let’s test the dynamic metadata. In the advanced rule, we are creating a new metadata entry: {"my-new-metadata": "my-new-value"}.

Run the request and check the logs of the application:

$ kubectl exec -n my-app deploy/curl -- curl -s -I httpbin:8000/get -H "x-force-authorized: true"
$ kubectl logs -n my-app deploy/httpbin -c istio-proxy --tail 1

You will see in the output the new attributes configured by the Kyverno policy:

[...]
[KYVERNO DEMO] my-new-dynamic-metadata: '{"my-new-metadata":"my-new-value","ext_authz_duration":5}'
[...]

Conclusion

In this guide, we have shown how to integrate Istio and the Kyverno Authz Server to enforce policies for a simple microservices application. We also showed how to use policies to modify the request and response attributes.

This is the foundational example for building a platform-wide policy system that can be used by all application teams.

Published: Mon, 25 Nov 2024

A new Phippy and Friends story: Izzy Saves the Birthday

Earlier this year, we added Izzy Dolphin, the Indo-Pacific Bottlenose, to the CNCF “Phippy and Friends” family. Ever since then, Istio lovers worldwide have been eagerly awaiting the first children’s book featuring our cute dolphin.

And here it is!

The Istio project is excited to unveil Izzy’s adventure sailing with the Phippy family at KubeCon North America 2024 this week, as together we celebrate the 10 year anniversary of Kubernetes. Copies are available at the CNCF Store, or on the online store shortly after the event.

Captain Kube hosts a grand birthday bash on a special cruise with Phippy and her friends; however, the ship is in great danger! But there is nothing to worry about when Izzy is in charge of security! Join Izzy’s smart and adventurous chase of the pirates who want to spoil Captain Kube’s birthday bash.

Why the book?

The co-authors of the book, Faseela K. and Lin Sun, are both Istio maintainers and parents. They have often found themselves in a tough spot explaining what they do at work, particularly in a context that makes sense to younger people. Their children read and enjoyed the Illustrated Children’s Guide to Kubernetes but were curious to learn more about the other characters and their roles and responsibilities!

This book is for everyone who has encountered curious little eyes that keep asking you what “Service Mesh” is. It’s also a great gift for anyone of any age who needs to understand what Istio is, or who thinks that service mesh is too complex.

Acknowledgements

The Istio Steering Committee would like to thank Faseela and Lin for writing this amazing book. Suri Patel and Alex Davy from CNCF did a wonderful job with the design and illustrations, bringing the story to life. Last, but not least, a huge thanks to Katie Greenley for her support throughout the process to make sure the book was released on time for Captain Kube’s birthday celebrations at our community’s largest international conference.

We are planning a book signing event at next year’s KubeCon EU in London.

Happy reading!

Published: Tue, 12 Nov 2024

Fast, Secure, and Simple: Istio’s Ambient Mode Reaches General Availability in v1.24

We are proud to announce that Istio’s ambient data plane mode has reached General Availability, with the ztunnel, waypoints and APIs being marked as Stable by the Istio TOC. This marks the final stage in Istio’s feature phase progression, signaling that ambient mode is fully ready for broad production usage.

Ambient mesh — and its reference implementation with Istio’s ambient mode — was announced in September 2022. Since then, our community has put in 26 months of hard work and collaboration, with contributions from Solo.io, Google, Microsoft, Intel, Aviatrix, Huawei, IBM, Red Hat, and many others. Stable status in 1.24 indicates the features of ambient mode are now fully ready for broad production workloads. This is a huge milestone for Istio, bringing Istio to production readiness without sidecars, and offering users a choice.

Why ambient mesh?

From the launch of Istio in 2017, we have observed a clear and growing demand for mesh capabilities for applications — but heard that many users found the resource overhead and operational complexity of sidecars hard to overcome. Challenges that Istio users shared with us include how sidecars can break applications after they are added, the large CPU and memory requirement for a proxy with every workload, and the inconvenience of needing to restart application pods with every new Istio release.

As a community, we designed ambient mesh from the ground up to tackle these problems, alleviating the previous barriers of complexity faced by users looking to implement service mesh. The new concept was named ‘ambient mesh’ as it was designed to be transparent to your application, with no proxy infrastructure collocated with user workloads, no subtle changes to configuration required to onboard, and no application restarts required. In ambient mode it is trivial to add or remove applications from the mesh. All you need to do is label a namespace, and all applications in that namespace are instantly added to the mesh. This immediately secures all traffic within that namespace with industry-standard mutual TLS encryption — no other configuration or restarts required! Refer to the Introducing Ambient Mesh blog for more information on why we built Istio’s ambient mode.
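
For example, assuming Istio was installed with the ambient profile and your applications live in a namespace called my-app (the namespace name here is just an illustration), enrolling them in the mesh is a single label:

$ istioctl install --set profile=ambient -y
$ kubectl label namespace my-app istio.io/dataplane-mode=ambient

From that point on, traffic between pods in my-app is captured by the ztunnel and secured with mutual TLS, with no pod restarts required.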

How does ambient mode make adoption easier?

The core innovation behind ambient mesh is that it slices Layer 4 (L4) and Layer 7 (L7) processing into two distinct layers. Istio’s ambient mode is powered by lightweight, shared L4 node proxies and optional L7 proxies, removing the need for traditional sidecar proxies from the data plane. This layered approach allows you to adopt Istio incrementally, enabling a smooth transition from no mesh, to a secure overlay (L4), to optional full L7 processing — on a per-namespace basis, as needed, across your fleet.

By utilizing ambient mesh, users bypass some of the previously restrictive elements of the sidecar model. Server-send-first protocols now work, most reserved ports are now available, and the ability for containers to bypass the sidecar — either maliciously or not — is eliminated.

The lightweight shared L4 node proxy is called the ztunnel (zero-trust tunnel). ztunnel drastically reduces the overhead of running a mesh by removing the need to potentially over-provision memory and CPU within a cluster to handle expected loads. In some use cases, the savings can exceed 90% or more, while still providing zero-trust security using mutual TLS with cryptographic identity, simple L4 authorization policies, and telemetry.

The L7 proxies are called waypoints. Waypoints process L7 functions such as traffic routing, rich authorization policy enforcement, and enterprise-grade resilience. Waypoints run outside of your application deployments and can scale independently based on your needs, which could be for the entire namespace or for multiple services within a namespace. Compared with sidecars, you don’t need one waypoint per application pod, and you can scale your waypoint effectively based on its scope, thus saving significant amounts of CPU and memory in most cases.
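
As a rough sketch of how this looks in practice (the namespace is illustrative, and the exact istioctl flags may vary between releases), a namespace-scoped waypoint can be deployed and then selected with a label:

$ istioctl waypoint apply -n my-app
$ kubectl label namespace my-app istio.io/use-waypoint=waypoint

Here waypoint is the default name given to the waypoint created by istioctl; workloads in my-app will then have their L7 processing handled by that proxy.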

The separation between the L4 secure overlay layer and L7 processing layer allows incremental adoption of the ambient mode data plane, in contrast to the earlier binary “all-in” injection of sidecars. Users can start with the secure L4 overlay, which offers a majority of features that people deploy Istio for (mTLS, authorization policy, and telemetry). Complex L7 handling such as retries, traffic splitting, load balancing, and observability collection can then be enabled on a case-by-case basis.

Rapid exploration and adoption of ambient mode

The ztunnel image on Docker Hub has reached over 1 million downloads, with ~63,000 pulls in the last week alone.

We asked a few of our users for their thoughts on ambient mode’s GA:

What is in scope?

The general availability of ambient mode means the following things are now considered stable:

Refer to the feature status page for more information.

Roadmap

We are not standing still! There are a number of features that we continue to work on for future releases, including some that are currently in Alpha/Beta.

In our upcoming releases, we expect to move quickly on the following extensions to ambient mode:

  • Full support for sidecar and ambient mode interoperability
  • Multi-cluster installations
  • Multi-network support
  • VM support

What about sidecars?

Sidecars are not going away, and remain first-class citizens in Istio. You can continue to use sidecars, and they will remain fully supported. While we believe most use cases will be best served with a mesh in ambient mode, the Istio project remains committed to ongoing sidecar mode support.

Try ambient mode today

With the 1.24 release of Istio and the GA release of ambient mode, it is now easier than ever to try out Istio on your own workloads.

  • Follow the getting started guide to explore ambient mode.
  • Read our user guides to learn how to incrementally adopt ambient for mutual TLS & L4 authorization policy, traffic management, rich L7 authorization policy, and more.
  • Explore the new Kiali 2.0 dashboard to visualize your mesh.

You can engage with the developers in the #ambient channel on the Istio Slack, or use the discussion forum on GitHub for any questions you may have.

Published: Thu, 07 Nov 2024

Istio in Salt Lake City!

An amazing lineup of Istio activities awaits you in Salt Lake City, Utah at KubeCon + CloudNativeCon North America 2024!

Follow us on X, LinkedIn or Bluesky to get live updates from the event. See you soon!

Published: Tue, 05 Nov 2024

Scaling in the Clouds: Istio Ambient vs. Cilium

A common question from prospective Istio users is “how does Istio compare to Cilium?” While Cilium originally only provided L3/L4 functionality, including network policy, recent releases have added service mesh functionality using Envoy, as well as WireGuard encryption. Like Istio, Cilium is a CNCF Graduated project, and has been around in the community for many years.

Despite offering a similar feature set on the surface, the two projects have substantially different architectures, most notably Cilium’s use of eBPF and WireGuard for processing and encrypting L4 traffic in the kernel, contrasted with Istio’s ztunnel component for L4 in user space. These differences have resulted in substantial speculation about how Istio will perform at scale compared to Cilium.

While many comparisons have been made about tenancy models, security protocols and basic performance of the two projects, there has not yet been a full evaluation published at enterprise scale. Rather than emphasizing theoretical performance, we put Istio’s ambient mode and Cilium through their paces, focusing on key metrics like latency, throughput, and resource consumption. We cranked up the pressure with realistic load scenarios, simulating a bustling Kubernetes environment. Finally, we pushed the size of our AKS cluster up to 1,000 nodes on 11,000 cores, to understand how these projects perform at scale. Our results show areas where each can improve, but also indicate that Istio is the clear winner.

Test Scenario

In order to push Istio and Cilium to their limits, we created 500 different services, each backed by 100 pods. Each service is in a separate namespace, which also contains one Fortio load generator client. We restricted the clients to a node pool of 100 32-core machines, to eliminate noise from collocated clients, and allocated the remaining 900 8-core instances to our services.

For the Istio test, we used Istio’s ambient mode, with a waypoint proxy in every service namespace, and default install parameters. In order to make our test scenarios similar, we had to turn on a few non-default features in Cilium, including WireGuard encryption, L7 Proxies, and Node Init. We also created a Cilium Network Policy in each namespace, with HTTP path-based rules. In both scenarios, we generated churn by scaling one service to between 85 and 115 instances at random every second, and relabeling one namespace every minute. To see the precise settings we used, and to reproduce our results, see my notes.
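
For a flavor of what the churn generation looks like, here is a minimal sketch (this is not the actual test harness, and the deployment and namespace names are made up for illustration):

$ while true; do
    i=$((RANDOM % 500)); replicas=$((85 + RANDOM % 31))
    kubectl scale deployment "svc-$i" -n "ns-$i" --replicas="$replicas"
    sleep 1
  done

A second loop along the same lines can relabel one randomly chosen namespace every 60 seconds.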

Scalability Scorecard

Istio was able to deliver 56% more queries at 20% lower tail latency. The CPU usage was 30% less for Cilium, though our measurement does not include the cores Cilium used to process encryption, which is done in the kernel.

Taking into account the resources used, Istio processed 2178 Queries Per Core, vs Cilium’s 1815, a 20% improvement.

  • The Cilium Slowdown: Cilium, while boasting impressive low latency with default install parameters, slows down substantially when Istio’s baseline features such as L7 policy and encryption are turned on. Additionally, Cilium’s memory and CPU utilization remained high even when no traffic was flowing in the mesh. This can impact the overall stability and reliability of your cluster, especially as it grows.
  • Istio, The Steady Performer: Istio’s ambient mode, on the other hand, showed its strength in stability and maintaining decent throughput, even with the added overhead of encryption. While Istio did consume more memory and CPU than Cilium under test, its CPU utilization settled to a fraction of Cilium’s when not under load.

Behind the Scenes: Why the Difference?

The key to understanding these performance differences lies in the architecture and design of each tool.

  • Cilium’s Control Plane Conundrum: Cilium runs a control plane instance on each node, leading to API server strain and configuration overhead as your cluster expands. This frequently caused our API server to crash, followed by Cilium becoming unready, and the entire cluster becoming unresponsive.
  • Istio’s Efficiency Edge: Istio, with its centralized control plane and identity-based approach, streamlines configuration and reduces the burden on your API server and nodes, directing critical resources to processing and securing your traffic, rather than processing configuration. Istio takes further advantage of the resources not used in the control plane by running as many Envoy instances as a workload needs, while Cilium is limited to one shared Envoy instance per node.

Digging Deeper

While the objective of this project is to compare Istio and Cilium scalability, several constraints make a direct comparison difficult.

Layer 4 Isn’t always Layer 4

While Istio and Cilium both offer L4 policy enforcement, their APIs and implementation differ substantially. Cilium implements Kubernetes NetworkPolicy, which uses labels and namespaces to block or allow access to and from IP Addresses. Istio offers an AuthorizationPolicy API, and makes allow and deny decisions based on the TLS identity used to sign each request. Most defense-in-depth strategies will need to make use of both NetworkPolicy and TLS-based policy for comprehensive security.
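
To make the difference concrete, here is a hedged sketch of the two policy styles; the names, namespaces, and service account are illustrative. The Kubernetes NetworkPolicy admits traffic based on labels and network identity, while the Istio AuthorizationPolicy admits traffic based on the peer’s mTLS service account identity:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
  namespace: backend
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: frontend
---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-frontend-identity
  namespace: backend
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/frontend/sa/frontend"]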

Not all Encryption is Created Equal

While Cilium offers IPsec for FIPS-compatible encryption, most other Cilium features such as L7 policy and load balancing are incompatible with IPsec. Cilium has much better feature compatibility when using WireGuard encryption, but WireGuard cannot be used in FIPS-compliant environments. Istio, on the other hand, because it strictly complies with TLS protocol standards, always uses FIPS-compliant mTLS by default.

Hidden Costs

While Istio operates entirely in user space, Cilium’s L4 dataplane runs in the Linux kernel using eBPF. Prometheus metrics for resource consumption only measure user space resources, meaning that all kernel resources used by Cilium are not accounted for in this test.

Recommendations: Choosing the Right Tool for the Job

So, what’s the verdict? Well, it depends on your specific needs and priorities. For small clusters with pure L3/L4 use cases and no requirement for encryption, Cilium offers a cost-effective and performant solution. However, for larger clusters and a focus on stability, scalability, and advanced features, Istio’s ambient mode, along with an alternate NetworkPolicy implementation, is the way to go. Many customers choose to combine the L3/L4 features of Cilium with the L4/L7 and encryption features of Istio for a defense-in-depth strategy.

Remember, the world of cloud-native networking is constantly evolving. Keep an eye on developments in both Istio and Cilium, as they continue to improve and address these challenges.

Let’s Keep the Conversation Going

Have you worked with Istio’s ambient mode or Cilium? What are your experiences and insights? Share your thoughts in the comments below. Let’s learn from each other and navigate the exciting world of Kubernetes together!

Published: Mon, 21 Oct 2024

More community leadership: Regularly electing the Istio Technical Oversight Committee

Like many Open Source foundations and projects, the Istio project has two governance groups: a Steering Committee, which oversees the administrative and marketing aspects of the project, and a Technical Oversight Committee (TOC), responsible for cross-cutting product and design decisions.

The Steering Committee represents the companies and contributors that support the Istio project, while the TOC is the top of an individual contributor ladder made up of our members, maintainers and working group leads.

Each year, we build our Steering Committee with representatives from our top commercial contributors, and members elected by our maintainer community. This is the group with the responsibility of electing new TOC members, who have traditionally served indefinitely.

We want to ensure that all the members of our community have the opportunity to stand for, and serve in, our leadership positions. Today, we are pleased to announce our transition to a regularly-elected TOC, with members serving two-year terms, and call for candidates for our first election.

What does the Technical Oversight Committee do?

The charter for the TOC spells out the responsibilities of its members, including:

  • Setting the overall technical direction and roadmap of the project.
  • Resolving technical issues, disagreements, and escalations.
  • Declaring maturity levels for Istio features.
  • Approving the creation and dissolution of working groups and approving leadership changes of working groups.
  • Ensuring the team adheres to our code of conduct and respects our values.
  • Fostering an environment for a healthy and happy community of developers and contributors.

While the interest of our vendors is represented by our Steering Committee, TOC membership is associated with the individual, irrespective of their current employer. Members act independently, in their individual capacities, and must prioritize the best interests of the project and the community. This has always been achieved by method of consensus, and as such we seat an even number of members. The TOC has traditionally comprised 6 members, and this remains the case going forward.

What changes with the new charter?

The key changes in the new charter, recently ratified by the Steering Committee, are:

  • Members will now serve 2 year terms.
  • The Steering Committee will vote every year to (re-)seat 3 of the 6 members on the TOC.
  • The mechanics for election are clearly defined, including the expectation for candidates to qualify for the election, and how they will be evaluated.
  • The expectations of regular meetings between the Steering and TOC have been formalized.
  • There is now a formal process for removing a TOC member, should they lose the confidence of the Steering Committee.

There is no limit on the number of terms a member may serve for, and incumbent TOC members are welcome to run again at the end of their term.

TOC member farewells

We recently announced the retirement of long-time contributor Eric Van Norman. We also now bid farewell to Neeraj Poddar from the Istio TOC. Neeraj has been involved with the project since 2017, co-founding Aspen Mesh within F5, and later leading Gloo Mesh as VP of Engineering at Solo.io. He was first elected to the TOC in 2020. Neeraj has taken a role as VP of Engineering at NimbleEdge, and we congratulate him and wish him well for the future.

Maintainers: stand in our first election

We have set our annual TOC elections to occur after the seating of the Steering Committee each year, which will put the first instance around March 2025.

However, as we currently have two vacancies, we are announcing our first election will be a by-election to fill these two seats for the remainder of their terms.

The bar for joining the TOC is deliberately set high. Candidates must be tenured maintainers, recognized within the Istio community as collaborative technical leaders, and meet qualification criteria which demonstrate their suitability for the position.

To stand for a TOC seat, please send an e-mail to elections@istio.io, including a link to a one-page Google Doc with your self-assessment against the qualification criteria. Nominations will close in two weeks, on 31 October.

Good luck!

Published: Thu, 17 Oct 2024

Can Your Platform Do Policy? Accelerate Teams With Platform L7 Policy Functionality

Shared computing platforms offer resources and shared functionality to tenant teams so that they don’t need to build everything from scratch themselves. While it can sometimes be hard to balance all the requests from tenants, it’s important that platform teams ask the question: what’s the highest value feature we can offer our tenants?

Often work is given directly to application teams to implement, but there are some features that are best implemented once, and offered as a service to all teams. One feature within the reach of most platform teams is offering a standard, responsive system for Layer 7 application authorization policy. Policy as code enables teams to lift authorization decisions out of the application layer into a lightweight and performant decoupled system. It might sound like a challenge, but it doesn’t have to be, with the right tools for the job.

We’re going to dive into how Istio and Open Policy Agent (OPA) can be used to enforce Layer 7 policies in your platform. We’ll show you how to get started with a simple example. You will come to see how this combination is a solid option to deliver policy quickly and transparently to application teams everywhere in the business, while also providing the data the security teams need for audit and compliance.

Try it out

When integrated with Istio, OPA can be used to enforce fine-grained access control policies for microservices. This guide shows how to enforce access control policies for a simple microservices application.

Prerequisites

  • A Kubernetes cluster with Istio installed.
  • The istioctl command-line tool installed.

Install Istio and configure your mesh options to enable OPA:

$ istioctl install -y -f - <<'EOF'
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    accessLogFile: /dev/stdout
    accessLogFormat: |
      [OPA DEMO] my-new-dynamic-metadata: "%DYNAMIC_METADATA(envoy.filters.http.ext_authz)%"
    extensionProviders:
    - name: "opa.local"
      envoyExtAuthzGrpc:
        service: "opa.opa.svc.cluster.local"
        port: "9191"
EOF

Notice that in the configuration, we define an extensionProviders section that points to the OPA standalone installation.

Deploy the sample application. Httpbin is a well-known application that can be used to test HTTP requests and helps to show quickly how we can play with the request and response attributes.

$ kubectl create ns my-app
$ kubectl label namespace my-app istio-injection=enabled

$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.24/samples/httpbin/httpbin.yaml -n my-app

Deploy OPA. It will fail to start because it expects a ConfigMap containing the default Rego rule to use. This ConfigMap will be deployed later in our example.

$ kubectl create ns opa
$ kubectl label namespace opa istio-injection=enabled

$ kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: opa
  name: opa
  namespace: opa
spec:
  replicas: 1
  selector:
    matchLabels:
      app: opa
  template:
    metadata:
      labels:
        app: opa
    spec:
      containers:
      - image: openpolicyagent/opa:0.61.0-envoy
        name: opa
        args:
          - "run"
          - "--server"
          - "--disable-telemetry"
          - "--config-file=/config/config.yaml"
          - "--log-level=debug" # Uncomment this line to enable debug logs
          - "--diagnostic-addr=0.0.0.0:8282"
          - "/policy/policy.rego" # Default policy
        volumeMounts:
          - mountPath: "/config"
            name: opa-config
          - mountPath: "/policy"
            name: opa-policy
      volumes:
        - name: opa-config
          configMap:
            name: opa-config
        - name: opa-policy
          configMap:
            name: opa-policy
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: opa-config
  namespace: opa
data:
  config.yaml: |
    # This is the OPA configuration; you can find more details in the official documentation
    decision_logs:
      console: true
    plugins:
      envoy_ext_authz_grpc:
        addr: ":9191"
        path: mypackage/mysubpackage/myrule # Default path for grpc plugin
    # Here you can add your own configuration with services and bundles
---
apiVersion: v1
kind: Service
metadata:
  name: opa
  namespace: opa
  labels:
    app: opa
spec:
  ports:
    - port: 9191
      protocol: TCP
      name: grpc
  selector:
    app: opa
---
EOF
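
At this point the OPA pod is expected to be stuck in a creating state, because the opa-policy ConfigMap it mounts does not exist yet. You can observe this (output will vary) with:

$ kubectl get pods -n opa
$ kubectl describe pod -n opa -l app=opa | tail -n 5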

Deploy the AuthorizationPolicy to define which services will be protected by OPA.

$ kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: my-opa-authz
  namespace: istio-system # This enforces the policy on the whole mesh, istio-system being the mesh root namespace
spec:
  selector:
    matchLabels:
      ext-authz: enabled
  action: CUSTOM
  provider:
    name: "opa.local"
  rules: [{}] # Empty rules; the policy will apply to workloads with the ext-authz: enabled label
EOF

Let’s label the app to enforce the policy:

$ kubectl patch deploy httpbin -n my-app --type=merge -p='{
  "spec": {
    "template": {
      "metadata": {
        "labels": {
          "ext-authz": "enabled"
        }
      }
    }
  }
}'

Notice that in this resource, we define the OPA extensionProvider you set in the Istio configuration:

[...]
  provider:
    name: "opa.local"
[...]

How it works

When applying the AuthorizationPolicy, the Istio control plane (istiod) sends the required configurations to the sidecar proxy (Envoy) of the selected services in the policy. Envoy will then send the request to the OPA server to check if the request is allowed or not.

The Envoy proxy works by configuring filters in a chain. One of those filters is ext_authz, which implements an external authorization service with a specific message. Any server implementing the correct protobuf can connect to the Envoy proxy and provide the authorization decision; OPA is one of those servers.

Earlier, when you installed the OPA server, you used the Envoy version of the image. This image allows the configuration of the gRPC plugin which implements the ext_authz protobuf service.

[...]
      containers:
      - image: openpolicyagent/opa:0.61.0-envoy # This is the OPA image version which brings the Envoy plugin
        name: opa
[...]

In the configuration, you have enabled the Envoy plugin and the port it will listen on:

[...]
    decision_logs:
      console: true
    plugins:
      envoy_ext_authz_grpc:
        addr: ":9191" # This is the port where the envoy plugin will listen
        path: mypackage/mysubpackage/myrule # Default path for grpc plugin
    # Here you can add your own configuration with services and bundles
[...]

Reviewing Envoy’s Authorization service documentation, you can see that the message has these attributes:

OkHttpResponse
{
  "status": {...},
  "denied_response": {...},
  "ok_response": {
      "headers": [],
      "headers_to_remove": [],
      "dynamic_metadata": {...},
      "response_headers_to_add": [],
      "query_parameters_to_set": [],
      "query_parameters_to_remove": []
    },
  "dynamic_metadata": {...}
}

This means that based on the response from the authz server, Envoy can add or remove headers, query parameters, and even change the response status. OPA can do this as well, as documented in the OPA documentation.

Testing

Let’s test the simple usage (authorization) and then let’s create a more advanced rule to show how we can use OPA to modify the request and response.

Deploy an app to run curl commands to the httpbin sample application:

$ kubectl -n my-app run --image=curlimages/curl curl -- /bin/sleep 100d

Apply the first Rego rule and restart the OPA deployment:

$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: opa-policy
  namespace: opa
data:
  policy.rego: |
    package mypackage.mysubpackage

    import rego.v1

    default myrule := false

    myrule if {
      input.attributes.request.http.headers["x-force-authorized"] == "enabled"
    }

    myrule if {
      input.attributes.request.http.headers["x-force-authorized"] == "true"
    }
EOF
$ kubectl rollout restart deployment -n opa

The simple scenario is to allow requests if they contain the header x-force-authorized with the value enabled or true. If the header is not present or has a different value, the request will be denied.

There are multiple ways to create the Rego rule. In this case, we created two different rules. They are evaluated in order, and the first one that satisfies all the conditions is the one that will be used.
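
If you want to iterate on a Rego rule before deploying it to the cluster, you can evaluate it locally with the opa CLI. A sketch, assuming you have saved the rule above as policy.rego and a trimmed-down input as input.json (both file names are illustrative):

$ cat > input.json <<EOF
{"attributes": {"request": {"http": {"headers": {"x-force-authorized": "true"}}}}}
EOF
$ opa eval --format pretty -d policy.rego -i input.json "data.mypackage.mysubpackage.myrule"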

Simple rule

The following request will return 403:

$ kubectl exec -n my-app curl -c curl  -- curl -s -w "\nhttp_code=%{http_code}" httpbin:8000/get

The following request will return 200 and the response body:

$ kubectl exec -n my-app curl -c curl  -- curl -s -w "\nhttp_code=%{http_code}" httpbin:8000/get -H "x-force-authorized: enabled"

Advanced manipulations

Now the more advanced rule. Apply the second Rego rule and restart the OPA deployment:

$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: opa-policy
  namespace: opa
data:
  policy.rego: |
    package mypackage.mysubpackage

    import rego.v1

    request_headers := input.attributes.request.http.headers

    force_unauthenticated if request_headers["x-force-unauthenticated"] == "enabled"

    default allow := false

    allow if {
      not force_unauthenticated
      request_headers["x-force-authorized"] == "true"
    }

    default status_code := 403

    status_code := 200 if allow

    status_code := 401 if force_unauthenticated

    default body := "Unauthorized Request"

    body := "Authentication Failed" if force_unauthenticated

    myrule := {
      "body": body,
      "http_status": status_code,
      "allowed": allow,
      "headers": {"x-validated-by": "my-security-checkpoint"},
      "response_headers_to_add": {"x-add-custom-response-header": "added"},
      "request_headers_to_remove": ["x-force-authorized"],
      "dynamic_metadata": {"my-new-metadata": "my-new-value"},
    }
EOF
$ kubectl rollout restart deployment -n opa

In that rule, you can see:

myrule["allowed"] := allow # Notice that `allowed` is mandatory when returning an object, like here `myrule`
myrule["headers"] := headers
myrule["response_headers_to_add"] := response_headers_to_add
myrule["request_headers_to_remove"] := request_headers_to_remove
myrule["body"] := body
myrule["http_status"] := status_code

Those are the values that will be returned to the Envoy proxy from the OPA server. Envoy will use those values to modify the request and response.

Notice that allowed is required when returning a JSON object instead of only true/false. This can be found in the OPA documentation.

Change returned body

Let’s test the new capabilities:

$ kubectl exec -n my-app curl -c curl  -- curl -s -w "\nhttp_code=%{http_code}" httpbin:8000/get

Now we can change the response body. With a 403, the body in the Rego rule is changed to “Unauthorized Request”. Running the previous command, you should receive:

Unauthorized Request
http_code=403

Change returned body and status code

Running the request with the header x-force-unauthenticated: enabled, you should receive the body “Authentication Failed” and error 401:

$ kubectl exec -n my-app curl -c curl  -- curl -s -w "\nhttp_code=%{http_code}" httpbin:8000/get -H "x-force-unauthenticated: enabled"

Adding headers to request

Running a valid request, you should receive the echo body with the new header x-validated-by: my-security-checkpoint and the header x-force-authorized removed:

$ kubectl exec -n my-app curl -c curl  -- curl -s httpbin:8000/get -H "x-force-authorized: true"

Adding headers to response

Running the same request but showing only the header, you will find the response header added during the Authz check x-add-custom-response-header: added:

$ kubectl exec -n my-app curl -c curl  -- curl -s -I httpbin:8000/get -H "x-force-authorized: true"

Sharing data between filters

Finally, you can pass data to the following Envoy filters using dynamic_metadata. This is useful when you want to pass data to another ext_authz filter in the chain or you want to print it in the application logs.

To do so, review the access log format you set earlier:

[...]
    accessLogFormat: |
      [OPA DEMO] my-new-dynamic-metadata: "%DYNAMIC_METADATA(envoy.filters.http.ext_authz)%"
[...]

DYNAMIC_METADATA is a reserved keyword to access the metadata object. The rest is the name of the filter that you want to access. In your case, the name envoy.filters.http.ext_authz is created automatically by Istio. You can verify this by dumping the Envoy configuration:

$ istioctl pc all deploy/httpbin -n my-app -oyaml | grep envoy.filters.http.ext_authz

You will see the configurations for the filter.

Let’s test the dynamic metadata. In the advanced rule, you are creating a new metadata entry: {"my-new-metadata": "my-new-value"}.

Run the request and check the logs of the application:

$ kubectl exec -n my-app curl -c curl  -- curl -s -I httpbin:8000/get -H "x-force-authorized: true"
$ kubectl logs -n my-app deploy/httpbin -c istio-proxy --tail 1

You will see in the output the new attributes configured by OPA Rego rules:

[...]
 my-new-dynamic-metadata: "{"my-new-metadata":"my-new-value","decision_id":"8a6d5359-142c-4431-96cd-d683801e889f","ext_authz_duration":7}"
[...]

Conclusion

In this guide, we have shown how to integrate Istio and OPA to enforce policies for a simple microservices application. We also showed how to use Rego to modify the request and response attributes. This is the foundational example for building a platform-wide policy system that can be used by all application teams.

Published: Mon, 14 Oct 2024

External post: The Istio Service Mesh for People Who Have Stuff to Do

Read the whole post at lucavall.in.

Published: Thu, 10 Oct 2024

Introducing the Sail Operator: a new way to manage Istio

With the recent announcement of the In-Cluster IstioOperator deprecation in Istio 1.23 and its subsequent removal in Istio 1.24, we want to build awareness of a new operator that the team at Red Hat have been developing to manage Istio as part of the istio-ecosystem organization.

The Sail Operator manages the lifecycle of Istio control planes, making it easier and more efficient for cluster administrators to deploy, configure and upgrade Istio in large scale production environments. Instead of creating a new configuration schema and reinventing the wheel, the Sail Operator APIs are built around Istio’s Helm chart APIs. All installation and configuration options that are exposed by Istio’s Helm charts are available through the Sail Operator CRDs’ values fields. This means that you can easily manage and customize Istio using familiar configurations without adding additional items to learn.

The Sail Operator has 3 main resource concepts:

  • Istio: used to manage the Istio control planes.
  • Istio Revision: represents a revision of that control plane, which is an instance of Istio with a specific version and revision name.
  • Istio CNI: used to manage the resource and lifecycle of Istio’s CNI plugin. To install the Istio CNI Plugin, you create an IstioCNI resource.

Currently, the main feature of the Sail Operator is the Update Strategy. The operator provides an interface that manages the upgrade of Istio control plane(s). It currently supports two update strategies:

  • In Place: with the InPlace strategy, the existing Istio control plane is replaced with a new version, and the workload sidecars immediately connect to the new control plane. This way, workloads don’t need to be moved from one control plane instance to another.
  • Revision Based: with the RevisionBased strategy, a new Istio control plane instance is created for every change to the Istio.spec.version field. The old control plane remains in place until all workloads have been moved to the new control plane instance. Optionally, the updateWorkloads flag can be set to automatically move workloads to the new control plane when it is ready.
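
For illustration only (the Sail Operator API is still alpha, and the group, version, and field names may change between releases), an Istio resource selecting the RevisionBased strategy might look roughly like this:

apiVersion: sailoperator.io/v1alpha1
kind: Istio
metadata:
  name: default
spec:
  namespace: istio-system
  version: v1.23.0
  updateStrategy:
    type: RevisionBased
    updateWorkloads: true # field placement shown here is indicative; check the Sail Operator docs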

We know that doing upgrades of the Istio control plane carries risk and can require substantial manual effort for large deployments, which is why this is our current focus. In the future, we are looking at how the Sail Operator can better support use cases such as multi-tenancy and isolation, multi-cluster federation, and simplified integration with 3rd party projects.

The Sail Operator project is still alpha and under heavy development. Note that as an istio-ecosystem project, it is not supported as part of the Istio project. We are actively seeking feedback and contributions from the community. If you want to get involved with the project please refer to the repo documentation and contributing guidelines. If you are a user, you can also try the new operator by following the instructions in the user documentation.

For more information, contact us:

Published: Mon, 19 Aug 2024

Istio has deprecated its In-Cluster Operator

Istio’s In-Cluster Operator has been deprecated in Istio 1.23. Users leveraging the operator — which we estimate to be fewer than 10% of our user base — will need to migrate to other install and upgrade mechanisms in order to upgrade to Istio 1.24 or above. Read on to learn why we are making this change, and what operator users need to do.

Does this affect you?

This deprecation only affects users of the In-Cluster Operator. Users who install Istio with the istioctl install command and an IstioOperator YAML file are not affected.

To determine if you are affected, run kubectl get deployment -n istio-system istio-operator and kubectl get IstioOperator. If both commands return non-empty values, your cluster will be affected. Based on recent polls, we expect that this will affect fewer than 10% of Istio users.

Operator-based installations of Istio will continue to run indefinitely, but cannot be upgraded past 1.23.x.

When do I need to migrate?

In keeping with Istio’s deprecation policy for Beta features, the Istio In-Cluster Operator will be removed with the release of Istio 1.24, roughly three months from this announcement. Istio 1.23 will be supported through March 2025, at which time operator users will need to migrate to another install mechanism to retain support.

How do I migrate?

The Istio project will continue to support installation and upgrade via the istioctl command, as well as with Helm. Because of Helm’s popularity within the platform engineering ecosystem, we recommend most users migrate to Helm. istioctl install is based on Helm templates, and future versions may integrate deeper with Helm.

Helm installs can also be managed with GitOps tools like Flux or Argo CD.

Users who prefer the operator pattern for running Istio can migrate to either of two new Istio Ecosystem projects, the Classic Operator Controller, or the Sail Operator.

Migrating to Helm

Helm migration requires translating your IstioOperator YAML into a Helm values.yaml file. Tooling to support this migration will be provided alongside the Istio 1.24 release.
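
For reference, a Helm-based control plane installation generally follows this pattern, using Istio's published charts; the values.yaml file stands in for your translated configuration:

$ helm repo add istio https://istio-release.storage.googleapis.com/charts
$ helm repo update
$ helm install istio-base istio/base -n istio-system --create-namespace
$ helm install istiod istio/istiod -n istio-system -f values.yaml --wait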

Migrating to istioctl

Identify your IstioOperator custom resource: there should be only one result.

$ kubectl get IstioOperator

Using the name of your resource, download your operator configuration in YAML format:

$ kubectl get IstioOperator <name> > istio.yaml

Disable the In-Cluster Operator. This will not disable your control plane or disrupt your current mesh traffic.

$ kubectl scale deployment -n istio-system istio-operator --replicas 0

When you are ready to upgrade Istio to version 1.24 or later, follow the upgrade instructions, using the istio.yaml file you downloaded above.
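
Assuming you use istioctl for the upgrade itself, this typically amounts to running the new istioctl binary against the exported configuration (a sketch; follow the linked upgrade instructions for your target version):

$ istioctl upgrade -f istio.yaml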

Once you have completed and verified your migration, run the following commands to clean up your operator resources:

$ kubectl delete deployment -n istio-system istio-operator
$ kubectl delete customresourcedefinition istiooperators.install.istio.io

Migrating to the Classic Operator Controller

A new ecosystem project, the Classic Operator Controller, is a fork of the original controller built into Istio. This project maintains the same API and code base as the original operator, but is maintained outside of Istio core.

Because the API is the same, migration is straightforward: only the installation of the new operator will be required.

Classic Operator Controller is not supported by the Istio project.

Migrating to Sail Operator

A new ecosystem project, the Sail Operator, is able to install and manage the lifecycle of the Istio control plane in a Kubernetes or OpenShift cluster.

Sail Operator APIs are built around Istio’s Helm chart APIs. All installation and configuration options that are exposed by Istio’s Helm charts are available through the Sail Operator CRD’s values: fields.

Sail Operator is not supported by the Istio project.

What is an operator, and why did Istio have one?

The operator pattern was popularized by CoreOS in 2016 as a method for codifying human intelligence into code. The most common use case is a database operator, where a user might have multiple database instances in one cluster, with multiple ongoing operational tasks (backups, vacuums, sharding).

Istio introduced istioctl and the in-cluster operator in version 1.4, in response to problems with Helm v2. Around the same time, Helm v3 was introduced, which addressed the community’s concerns, and is a preferred method for installing software on Kubernetes today. Support for Helm v3 was added in Istio 1.8.

Istio’s in-cluster operator handled installation of the service mesh components - an operation you generally do one time, and for one instance, per cluster. You can think of it as a way to run istioctl inside your cluster. However, this meant you had a high-privilege controller running inside your cluster, which weakens your security posture. It doesn’t handle any ongoing administration tasks; backups, snapshots, and the like are not requirements for running Istio.

The Istio operator is something you have to install into the cluster, which means you already have to manage the installation of something. Using it to upgrade the cluster likewise first required you to download and run a new version of istioctl.

Using an operator means you have created a level of indirection, where you have to have options in your custom resource to configure everything you may wish to change about an installation. Istio worked around this by offering the IstioOperator API, which allows configuration of installation options. This resource is used by both the in-cluster operator and istioctl install, so there is a trivial migration path for operator users.
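
For illustration, here is a minimal IstioOperator resource of the kind consumed by both the in-cluster operator and istioctl install; the specific settings are examples only:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: example-istiocontrolplane
  namespace: istio-system
spec:
  profile: default               # installation profile to start from
  meshConfig:
    accessLogFile: /dev/stdout   # example of an installation option being customized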

Three years ago — around the time of Istio 1.12 — we updated our documentation to say that use of the operator for new Istio installations is discouraged, and that users should use istioctl or Helm to install Istio.

Having three different installation methods has caused confusion, and in order to provide the best experience for people using Helm or istioctl - over 90% of our install base - we have decided to formally deprecate the in-cluster operator in Istio 1.23.

]]>
Wed, 14 Aug 2024 00:00:00 +0000/v1.24//blog/2024/in-cluster-operator-deprecation-announcement//v1.24//blog/2024/in-cluster-operator-deprecation-announcement/operatordeprecation
Happy 7th Birthday, Istio!

On this day in 2017, Google and IBM announced the launch of the Istio service mesh. Istio is an open technology that enables developers to seamlessly connect, manage, and secure networks of different services — regardless of platform, source, or vendor. We can hardly believe that Istio turns seven today! To celebrate the project’s 7th birthday, we wanted to highlight Istio’s momentum and its exciting future.

Rapid adoption among users

Istio, the most widely adopted service mesh project in the world, has been gathering significant momentum since its inception in 2017. Last year Istio joined Kubernetes, Prometheus, and other stalwarts of the cloud native ecosystem with its CNCF graduation. End users range from digital native startups to the world’s largest financial institutions and telcos, with case studies from companies including eBay, T-Mobile, Airbnb, Splunk, FICO, Salesforce, and many others.

Istio’s control plane and sidecar are the #3 and #4 most downloaded images on Docker Hub, each with over 10 billion downloads.

We have over 35,000 GitHub stars on Istio’s main repository, with continuing growth. Thank you to everyone who has starred the istio/istio repo.

We asked a few of our users for their thoughts on the occasion of Istio’s 7th birthday:

Amazing diversity of contributors and vendors

Over the past year, our community has seen tremendous growth in both the number of contributing companies and the number of contributors. Recall that Istio had 500 contributors when it turned three years old; in the past year alone, we have had over 1,700!

With Microsoft’s Open Service Mesh team joining the Istio community, we added Azure to the list of clouds and enterprise Kubernetes vendors providing Istio-compatible solutions, including Google Cloud, Red Hat OpenShift, VMware Tanzu, Huawei Cloud, DaoCloud, Oracle Cloud, Tencent Cloud, Akamai Cloud and Alibaba Cloud. We are also delighted to see the Amazon Web Services team publish the EKS Blueprint for Istio due to high demand from users wanting to run Istio on AWS.

Specialist network software providers are also driving Istio forward, with Solo.io, Tetrate and F5 Networks all offering enterprise Istio solutions that will run in any environment.

Below are the top contributing companies for the past year, with Solo.io, Google, and DaoCloud taking the top three places. While most of these companies are Istio vendors, Salesforce and Ericsson are end users, running Istio in production!

Here are some thoughts from our community leaders:

Continuous technical innovation

We are firm believers that diversity drives innovation. What amazes us most is the continuous innovation from the Istio community, from making upgrades easier, to adopting Kubernetes Gateway API, to adding the new sidecar-less ambient data plane mode, to making Istio easy to use and as transparent as possible.

Istio’s ambient mode was introduced in September 2022, designed for simplified operations, broader application compatibility, and reduced infrastructure cost. Ambient mode introduces lightweight, shared Layer 4 (L4) node proxies and optional Layer 7 (L7) proxies, removing the need for traditional sidecar proxies from the data plane. The core innovation behind ambient mode is that it slices the L4 and L7 processing into two distinct layers. This layered approach allows you to adopt Istio incrementally, enabling a smooth transition from no mesh, to a secure overlay (L4), to optional full L7 processing — on a per-namespace basis, as needed, across your fleet.

As part of the Istio 1.22 release, ambient mode has reached beta and you can run Istio without sidecars in production with precautions.

Here are some thoughts and well-wishes from our contributors and users:

Learn more about Istio

If you are new to Istio, here are a few resources to help you learn more:

If you are already part of the Istio community, please wish the Istio project a happy 7th birthday, and share your thoughts about the project on social media. Thank you for your help and support!

]]>
Fri, 24 May 2024 00:00:00 +0000/v1.24//blog/2024/happy-7th-birthday//v1.24//blog/2024/happy-7th-birthday/istiobirthdaymomentumfuture
Say goodbye to your sidecars: Istio's ambient mode reaches Beta in v1.22Today, Istio’s revolutionary new ambient data plane mode has reached Beta. Ambient mode is designed for simplified operations, broader application compatibility, and reduced infrastructure cost. It gives you a sidecar-less data plane that’s integrated into your infrastructure, all while maintaining Istio’s core features of zero-trust security, telemetry, and traffic management.

Ambient mode was announced in September 2022. Since then, our community has put in 20 months of hard work and collaboration, with contributions from Solo.io, Google, Microsoft, Intel, Aviatrix, Huawei, IBM, Red Hat, and many others. Beta status in 1.22 indicates the features of ambient mode are now ready for production workloads, with appropriate precautions. This is a huge milestone for Istio, bringing both Layer 4 and Layer 7 mesh features to production readiness without sidecars.

Why ambient mode?

In listening to feedback from Istio users, we observed a growing demand for mesh capabilities for applications — but heard that many of you found the resource overhead and operational complexity of sidecars hard to overcome. Challenges that sidecar users shared with us include how Istio can break applications after sidecars are added, the large consumption of CPU and memory by sidecars, and the inconvenience of the requirement to restart application pods with every new proxy release.

As a community, we designed ambient mode to tackle these problems, removing the barriers of complexity previously faced by users looking to implement a service mesh. The new feature set was named ‘ambient mode’ because it was designed to be transparent to your application: no additional configuration is required to adopt it, and no application restarts are needed.

In ambient mode it is trivial to add or remove applications from the mesh. You can now simply label a namespace, and all applications in that namespace are added to the mesh. This immediately secures all traffic with mTLS, all without sidecars or the need to restart applications.
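
For example, enrolling every workload in a namespace is a single command (the default namespace is used here for illustration):

$ kubectl label namespace default istio.io/dataplane-mode=ambient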

Refer to the Introducing Ambient Mesh blog for more information on why we built ambient mode.

How does ambient mode make adoption easier?

Istio’s ambient mode introduces lightweight, shared Layer 4 (L4) node proxies and optional Layer 7 (L7) proxies, removing the need for traditional sidecar proxies from the data plane. The core innovation behind ambient mode is that it slices the L4 and L7 processing into two distinct layers. This layered approach allows you to adopt Istio incrementally, enabling a smooth transition from no mesh, to a secure overlay (L4), to optional full L7 processing — on a per-namespace basis, as needed, across your fleet.

Ambient mode works without any modification required to your existing Kubernetes deployments. You can label a namespace to add all of its workloads to the mesh, or opt-in certain deployments as needed. By utilizing ambient mode, users bypass some of the previously restrictive elements of the sidecar model. Server-send-first protocols now work, most reserved ports are now available, and the ability for containers to bypass the sidecar — either maliciously or not — is eliminated.

The lightweight shared L4 node proxy is called the ztunnel (zero-trust tunnel). Ztunnel drastically reduces the overhead of running a mesh by removing the need to potentially over-provision memory and CPU within a cluster to handle expected loads. In some use cases, the savings can exceed 90% or more, while still providing zero-trust security using mutual TLS with cryptographic identity, simple L4 authorization policies, and telemetry.

The L7 proxies are called waypoints. Waypoints process L7 functions such as traffic routing, rich authorization policy enforcement, and enterprise-grade resilience. Waypoints run outside of your application deployments and can scale independently based on your needs, which could be for the entire namespace or for multiple services within a namespace. Compared with sidecars, you don’t need one waypoint per application pod, and you can scale your waypoint effectively based on its scope, thus saving significant amounts of CPU and memory in most cases.
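
As a sketch of what a waypoint looks like (the resource is normally generated with istioctl, and the label and listener values here are illustrative of Istio 1.22 rather than authoritative), a namespace-scoped waypoint is a Kubernetes Gateway that uses Istio's waypoint GatewayClass:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: waypoint
  namespace: default
  labels:
    istio.io/waypoint-for: service   # illustrative: the kind of traffic this waypoint serves
spec:
  gatewayClassName: istio-waypoint
  listeners:
  - name: mesh
    port: 15008
    protocol: HBONE

Workloads or services in the namespace then typically opt in to the waypoint with the istio.io/use-waypoint label.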

The separation between the L4 secure overlay layer and L7 processing layer allows incremental adoption of the ambient mode data plane, in contrast to the earlier binary “all-in” injection of sidecars. Users can start with the secure L4 overlay, which offers a majority of features that people deploy Istio for (mTLS, authorization policy, and telemetry). Complex L7 handling such as retries, traffic splitting, load balancing, and observability collection can then be enabled on a case-by-case basis.

What is in the scope of the Beta?

We recommend you explore the following Beta functions of ambient mode in production with appropriate precautions, after validating them in test environments:

Alpha features

Many other features we want to include in ambient mode have been implemented, but remain in Alpha status in this release. Please help test them, so they can be promoted to Beta in 1.23 or later:

  • Multi-cluster installations
  • DNS proxying
  • Interoperability with sidecars
  • IPv6/Dual stack
  • SOCKS5 support (for outbound)
  • Istio’s classic APIs (VirtualService and DestinationRule)

Roadmap

We have a number of features which are not yet implemented in ambient mode, but are planned for upcoming releases:

  • Controlled egress traffic
  • Multi-network support
  • Improve status messages on resources to help troubleshoot and understand the mesh
  • VM support

What about sidecars?

Sidecars are not going away, and remain first-class citizens in Istio. You can continue to use sidecars, and they will remain fully supported. For any feature outside of the Alpha or Beta scope for ambient mode, you should consider using the sidecar mode until the feature is added to ambient mode. Some use cases, such as traffic shifting based on source labels, will continue to be best implemented using the sidecar mode. While we believe most use cases will be best served with a mesh in ambient mode, the Istio project remains committed to ongoing sidecar mode support.

Try ambient mode today

With the 1.22 release of Istio and the Beta release of ambient mode, it is now easier than ever to try out Istio on your own workloads. Follow the getting started guide to explore ambient mode, or read our new user guides to learn how to incrementally adopt ambient for mutual TLS & L4 authorization policy, traffic management, rich L7 authorization policy, and more. You can engage with the developers in the #ambient channel on the Istio Slack, or use the discussion forum on GitHub for any questions you may have.

]]>
Mon, 13 May 2024 00:00:00 +0000/v1.24//blog/2024/ambient-reaches-beta//v1.24//blog/2024/ambient-reaches-beta/ambientsidecars
Introducing Istio v1 APIsIstio provides networking, security and telemetry APIs that are crucial for ensuring the robust security, seamless connectivity, and effective observability of services within the service mesh. These APIs are used on thousands of clusters across the world, securing and enhancing critical infrastructure.

Most of the features powered by these APIs have been considered stable for some time, but the API version has remained at v1beta1. As a reflection of the stability, adoption, and value of these resources, the Istio community has decided to promote these APIs to v1 in Istio 1.22.

In Istio 1.22 we are happy to announce that a concerted effort has been made to graduate the below APIs to v1:

Feature stability and API versions

Declarative APIs, such as those used by Kubernetes and Istio, decouple the description of a resource from the implementation that acts on it.

Istio’s feature phase definitions describe how a stable feature — one that is deemed ready for production use at any scale, and comes with a formal deprecation policy — should be matched with a v1 API. We are now making good on that promise, with our API versions matching our feature stability for both features that have been stable for some time, and those which are being newly designated as stable in this release.

Although there are currently no plans to discontinue support for the previous v1beta1 and v1alpha1 API versions, users are encouraged to manually transition to utilizing the v1 APIs by updating their existing YAML files.
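
In most cases this is a one-line change. For example, a VirtualService (the resource shown is illustrative) moves from networking.istio.io/v1beta1 to networking.istio.io/v1 with an otherwise identical spec:

apiVersion: networking.istio.io/v1   # previously networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews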

Telemetry API

The v1 Telemetry API is the only API that was promoted that had changes from its previous API version. The following v1alpha1 features weren’t promoted to v1:

  • metrics.reportingInterval
    • The reporting interval allows configuration of the time between outgoing calls for metrics reporting. This currently only supports TCP metrics, but we may use it for long-duration HTTP streams in the future.

      At this time, Istio lacks usage data to support the need for this feature.

  • accessLogging.filter
    • If specified, this filter will be used to select specific requests/connections for logging.

      This feature is based on a relatively new feature in Envoy, and Istio needs to further develop the use case and implementation before graduating it to v1.

  • tracing.useRequestIdForTraceSampling
    • This value is true by default. The format of the Request ID is specific to Envoy; if the proxy that first receives user traffic generates a Request ID in a different format, Envoy will break the trace because it cannot interpret that ID. Setting this value to false prevents Envoy from sampling based on the Request ID.

      There is not a strong use case for making this configurable through the Telemetry API.

Please share any feedback on these fields by creating issues on GitHub.

Overview of Istio CRDs

This is the full list of supported API versions:

Category API Versions
Networking Destination Rule v1, v1beta1, v1alpha3
Istio Gateway v1, v1beta1, v1alpha3
Service Entry v1, v1beta1, v1alpha3
Sidecar scope v1, v1beta1, v1alpha3
Virtual Service v1, v1beta1, v1alpha3
Workload Entry v1, v1beta1, v1alpha3
Workload Group v1, v1beta1, v1alpha3
Proxy Config v1beta1
Envoy Filter v1alpha3
Security Authorization Policy v1, v1beta1
Peer Authentication v1, v1beta1
Request Authentication v1, v1beta1
Telemetry Telemetry v1, v1alpha1
Extension Wasm Plugin v1alpha1

Istio can also be configured using the Kubernetes Gateway API.

Using the v1 Istio APIs

Some Istio APIs, such as Envoy Filter, Proxy Config, and Wasm Plugin, are still under active development and are subject to change between releases.

Furthermore, Istio maintains a strictly identical schema across all versions of an API due to limitations in CRD versioning. Therefore, even though there is a v1 Telemetry API, the three v1alpha1 fields mentioned above can still be utilized when declaring a v1 Telemetry API resource.

For risk-averse environments, we have added a stable validation policy, a validating admission policy which can ensure that only v1 APIs and fields are used with Istio APIs.

In new environments, selecting the stable validation policy upon installing Istio will guarantee that all future Custom Resources created or updated are v1 and contain only v1 features.

If the policy is deployed into an existing Istio installation that has Custom Resources that do not comply with it, the only allowed action is to delete the resource or remove the usage of the offending fields.

To install Istio with the stable validation policy:

$ helm install istio-base istio/base -n istio-system --set experimental.stableValidationPolicy=true

To set a specific revision when installing Istio with the policy:

$ helm install istio-base istio/base -n istio-system --set experimental.stableValidationPolicy=true --set revision=x

This feature is compatible with Kubernetes 1.30 and higher. The validations are created using CEL expressions, and users can modify the validations for their specific needs.

Summary

The Istio project is committed to delivering stable APIs and features essential for the successful operation of your service mesh. We would love to receive your feedback to help guide us in making the right decisions as we continue to refine relevant use cases and stability blockers for our features. Please share your feedback by creating issues, posting in the relevant Istio Slack channel, or by joining us in our weekly working group meeting.

]]>
Mon, 13 May 2024 00:00:00 +0000/v1.24//blog/2024/v1-apis//v1.24//blog/2024/v1-apis/istiotrafficsecuritytelemetryAPI
Gateway API Mesh Support Promoted To StableWe are thrilled to announce that Service Mesh support in the Gateway API is now officially “Stable”! With this release (part of Gateway API v1.1 and Istio v1.22), users can make use of the next-generation traffic management APIs for both ingress (“north-south”) and service mesh use cases (“east-west”).

What is the Gateway API?

The Gateway API is a collection of APIs that are part of Kubernetes, focusing on traffic routing and management. The APIs are inspired by, and serve many of the same roles as, Kubernetes’ Ingress and Istio’s VirtualService and Gateway APIs.

These APIs have been under development, both within Istio and through broad cross-project collaboration, since 2020, and have come a long way since then. While the API initially targeted only ingress use cases (which went GA last year), we had always envisioned allowing the same APIs to be used for traffic within a cluster as well.

With this release, that vision is made a reality: Istio users can use the same routing API for all of their traffic!

Getting started

Throughout the Istio documentation, all of our examples have been updated to show how to use the Gateway API, so explore some of the tasks to gain a deeper understanding.

Using Gateway API for service mesh should feel familiar both to users already using Gateway API for ingress, and users using VirtualService for service mesh today.

  • Compared to Gateway API for ingress, routes target a Service instead of a Gateway.
  • Compared to VirtualService, where routes associate with a set of hosts, routes target a Service.

Here is a simple example, which demonstrates routing requests to two different versions of a Service based on the request header:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: reviews
spec:
  parentRefs:
  - group: ""
    kind: Service
    name: reviews
    port: 9080
  rules:
  - matches:
    - headers:
      - name: my-favorite-service-mesh
        value: istio
    filters:
    - type: RequestHeaderModifier
      requestHeaderModifier:
        add:
        - name: hello
          value: world
    backendRefs:
    - name: reviews-v2
      port: 9080
  - backendRefs:
    - name: reviews-v1
      port: 9080

Breaking this down, we have a few parts:

  • First, we identify what routes we should match. By attaching our route to the reviews Service, we will apply this routing configuration to all requests that were originally targeting reviews.
  • Next, matches configures criteria for selecting which traffic this route should handle.
  • Optionally, we can modify the request. Here, we add a header.
  • Finally, we select a destination for the request. In this example, we are picking between two versions of our application.

For more details, see Istio’s traffic routing internals and Gateway API’s Service documentation.

Which API should I use?

With overlapping responsibilities (and names!), picking which APIs to use can be a bit confusing.

Here is the breakdown:

API Name Object Types Status Recommendation
Gateway APIs HTTPRoute, Gateway, … Stable in Gateway API v1.0 (2023) Use for new deployments, in particular with ambient mode
Istio APIs Virtual Service, Gateway v1 in Istio 1.22 (2024) Use for existing deployments, or where advanced features are needed
Ingress API Ingress Stable in Kubernetes v1.19 (2020) Use only for legacy deployments

Given the above, you may wonder why the Istio APIs were promoted to v1 concurrently. This was part of an effort to accurately categorize the stability of the APIs. While we view the Gateway API as the future (and present!) of traffic routing APIs, our existing APIs are here to stay for the long run, with full compatibility. This mirrors Kubernetes’ approach with Ingress, which was promoted to v1 while directing future work towards the Gateway API.

Community

This stability graduation represents the culmination of countless hours of work and collaboration across the project. It is incredible to look at the list of organizations involved in the API and consider how far we have come.

A special thanks goes out to my co-leads on the effort: Flynn, Keith Mattix, and Mike Morris, as well as the countless others involved.

Interested in getting involved, or even just providing feedback? Check out Istio’s community page or the Gateway API contributing guide!

]]>
Mon, 13 May 2024 00:00:00 +0000/v1.24//blog/2024/gateway-mesh-ga//v1.24//blog/2024/gateway-mesh-ga/istiotrafficAPI
Istio joins Phippy and friends — Welcome Izzy!Having sailed into, and proudly graduated within the Cloud Native Computing Foundation in 2023, it is now time for Istio to join the CNCF Phippy family’s mission to demystify and simplify cloud native computing.

The Istio Steering Committee is excited to unveil Izzy Dolphin, the Istio Indo-Pacific Bottlenose, who today dives into the family of “Phippy and Friends”.

Istio stands on the shoulders of several other CNCF projects, including Kubernetes, Envoy, Prometheus, and Helm. Izzy is proud to join Phippy, Hazel, and Captain Kube’s gang, taking cloud native to the masses.

Izzy not only represents the hard work and imagination of Istio’s maintainers from diverse companies, but will also help us illustrate the concepts of service mesh and Istio’s new ambient mode in an accessible way. Stay tuned as we build out our illustrated guides, in which Izzy will demystify Istio and service mesh in terms a child could understand! Next time you’re breaking down these concepts for people who don’t share your background knowledge, how about using Izzy?

Istio was initially developed by Google and IBM and built on the Envoy project from Lyft. The project now has maintainers from more than 16 companies, including many of the largest networking vendors and cloud organizations worldwide. Istio provides zero-trust networking, policy enforcement, traffic management, load balancing, and monitoring without requiring applications to be rewritten.

Over the years Istio has made substantial strides in simplifying the complex problem of cloud native networking. We understand that these concepts remain complicated for many, and that is why we are proud to join Phippy’s mission to talk about tech in an accessible, straight-forward manner. We would like to open the doors of service mesh technology to more folks than ever before through Izzy and enable you to join #teamcloudnative!

]]>
Fri, 08 Mar 2024 00:00:00 +0000/v1.24//blog/2024/istio-phippy//v1.24//blog/2024/istio-phippy/istiophippyizzydolphin
Istio's Steering Committee for 2024The Istio Steering Committee oversees the administrative aspects of the project, including governance, branding, marketing, and working with the CNCF.

Every year, the leaders in the Istio project estimate the proportion of the hundreds of companies that have contributed to Istio in the past year, and uses that metric to proportionally allocate nine Contribution Seats on our Steering Committee.

Then, four Community Seats are voted for by our project members, with candidates being from companies that did not receive Contribution Seats.

We are pleased to share the result of this year’s calculation, and changes to our Community Seat holders as a result.

Contribution seats

The calculation for the 2024-2025 term brings the most diverse set of company representation ever in the Contribution Seats, with five companies¹ represented:

Company Seat allocation
Google 3
Solo.io 2
IBM/Red Hat 2
DaoCloud 1
Huawei 1

The full allocation can be seen in our formula spreadsheet.

Community seats

As a result of this year’s calculation, two of our Community Seat holders move to Contribution Seats. This creates two extra vacancies, which are allocated to the runners-up of our last election².

We are pleased to welcome our two newest Community Seat holders, Mitch Connors from Aviatrix and Keith Mattix from Microsoft. Both are highly active maintainers and leaders in the project, and we are delighted to have them join the Steering Committee.

Proposed changes to election timing

Our charter currently allocates Contribution Seats in February and holds the Community Seat election in July. We previously anticipated a situation where people would change seat types mid-term, and this has now come to pass.

We will therefore be voting on a change to our Charter which will move our Community Seat elections to February, to be held immediately after the allocation of Contribution Seats. It is our intention that the next annual election be held in February 2025.

The full group

Following these changes, we now have representation from nine companies:

Name Company Profile Seat type
Craig Box Solo.io craigbox Contribution seat
Rob Cernich Red Hat rcernich Contribution seat
Mitch Connors Aviatrix therealmitchconnors Community seat
Iris (Shaojun) Ding Intel irisdingbj Community seat
Cameron Etezadi Google cetezadi Contribution seat
John Howard Google howardjohn Contribution seat
Faseela K Ericsson Software Technology kfaseela Community seat
Kebe Liu DaoCloud kebe7jun Contribution seat
Jamie Longmuir Red Hat longmuir Contribution seat
Keith Mattix Microsoft keithmattix Community seat
Justin Pettit Google justinpettit Contribution seat
Lin Sun Solo.io linsun Contribution seat
Zhonghu Xu Huawei hzxuzhonghu Contribution seat

Our sincerest thanks to Ameer Abbas, April Kyle Nassi, Cale Rath and Chaomeng Zhang, whose terms have come to an end.

The Steering Committee wishes to thank its members, old and new, and looks forward to continue to grow and improve Istio as a successful and sustainable open source project. We encourage everyone to get involved in the Istio community by contributing, voting, and helping us shape the future of cloud native networking.


  1. Our Steering Committee charter considers groups of companies as one, for the purposes of allocation of seats. This means we group IBM and Red Hat together as a single entity. ↩︎

  2. The first runner-up from the election was Kebe Liu from DaoCloud, who will join with their newly allocated Contribution Seat. ↩︎

]]>
Thu, 15 Feb 2024 00:00:00 +0000/v1.24//blog/2024/steering-results//v1.24//blog/2024/steering-results/istiosteeringgovernancecommunityelection
Maturing Istio Ambient: Compatibility Across Various Kubernetes Providers and CNIsThe Istio project announced ambient mesh - its new sidecar-less dataplane mode in 2022, and released an alpha implementation in early 2023.

Our alpha was focused on proving out the value of the ambient data plane mode under limited configurations and environments. However, the conditions were quite limited. Ambient mode relies on transparently redirecting traffic between workload pods and ztunnel, and the initial mechanism we used to do that conflicted with several categories of 3rd-party Container Networking Interface (CNI) implementations. Through GitHub issues and Slack discussions, we heard our users wanted to be able to use ambient mode in minikube and Docker Desktop, with CNI implementations like Cilium and Calico, and on services that ship in-house CNI implementations like OpenShift and Amazon EKS. Getting broad support for Kubernetes anywhere has become the No. 1 requirement for ambient mesh moving to beta — people have come to expect Istio to work on any Kubernetes platform and with any CNI implementation. After all, ambient wouldn’t be ambient without being all around you!

At Solo, we’ve been integrating ambient mode into our Gloo Mesh product, and came up with an innovative solution to this problem. We decided to upstream our changes in late 2023 to help ambient reach beta faster, so more users can operate ambient in Istio 1.21 or newer, and enjoy the benefits of ambient sidecar-less mesh in their platforms regardless of their existing or preferred CNI implementation.

How did we get here?

Service meshes and CNIs: it’s complicated

Istio is a service mesh, and a service mesh is not, strictly speaking, a CNI implementation: service meshes require a spec-compliant primary CNI implementation to be present in every Kubernetes cluster, and they sit on top of it.

This primary CNI implementation may be provided by your cloud provider (AKS, GKE, and EKS all ship their own), or by third-party CNI implementations like Calico and Cilium. Some service meshes may also ship bundled with their own primary CNI implementation, which they explicitly require to function.

Basically, before you can do things like secure pod traffic with mTLS and apply high-level authentication and authorization policy at the service mesh layer, you must have a functional Kubernetes cluster with a functional CNI implementation, to make sure the basic networking pathways are set up so that packets can get from one pod to another (and from one node to another) in your cluster.

Though some service meshes may also ship and require their own in-house primary CNI implementation, and it is sometimes possible to run two primary CNI implementations in parallel within the same cluster (for instance, one shipped by the cloud provider, and a 3rd-party implementation), in practice this introduces a whole host of compatibility issues, strange behaviors, reduced feature sets, and some incompatibilities due to the wildly varying mechanisms each CNI implementation might employ internally.

To avoid this, the Istio project has chosen not to ship or require our own primary CNI implementation, or even require a “preferred” CNI implementation - instead choosing to support CNI chaining with the widest possible ecosystem of CNI implementations, and ensuring maximum compatibility with managed offerings, cross-vendor support, and composability with the broader CNCF ecosystem.

Traffic redirection in ambient alpha

The istio-cni component is an optional component in the sidecar data plane mode, commonly used to remove the requirement for the NET_ADMIN and NET_RAW capabilities for users deploying pods into the mesh. istio-cni is a required component in the ambient data plane mode. The istio-cni component is not a primary CNI implementation, it is a node agent that extends whatever primary CNI implementation is already present in the cluster.

Whenever pods are added to an ambient mesh, the istio-cni component configures traffic redirection for all incoming and outgoing traffic between the pods and the ztunnel running on the pod’s node, via the node-level network namespace. The key difference between the sidecar mechanism and the ambient alpha mechanism is that in the latter, pod traffic was redirected out of the pod network namespace, and into the co-located ztunnel pod network namespace - necessarily passing through the host network namespace on the way, which is where the bulk of the traffic redirection rules to achieve this were implemented.

As we tested more broadly in multiple real-world Kubernetes environments, which have their own default CNI, it became clear that capturing and redirecting pod traffic in the host network namespace, as we were during alpha development, was not going to meet our requirements. Achieving our goals in a generic manner across these diverse environments was simply not feasible with this approach.

The fundamental problem with redirecting traffic in the host network namespace is that this is precisely the same spot where the cluster’s primary CNI implementation must configure traffic routing/networking rules. This created inevitable conflicts, most critically:

  • The primary CNI implementation’s basic host-level networking configuration could interfere with the host-level ambient networking configuration from Istio’s CNI extension, causing traffic disruption and other conflicts.
  • If users deployed a network policy to be enforced by the primary CNI implementation, the network policy might not be enforced when the Istio CNI extension is deployed (depending on how the primary CNI implementation enforces NetworkPolicy).

While we could design around this on a case-by-case basis for some primary CNI implementations, we could not sustainably approach universal CNI support. We considered eBPF, but realized any eBPF implementation would have the same basic problem, as there is no standardized way to safely chain/extend arbitrary eBPF programs at this time, and we would still potentially have a hard time supporting non-eBPF CNIs with this approach.

Addressing the challenges

A new solution was necessary - doing redirection of any sort in the node’s network namespace would create unavoidable conflicts, unless we compromised our compatibility requirements.

In sidecar mode, it is trivial to configure traffic redirection between the sidecar and application pod, as both operate within the pod’s network namespace. This led to a light-bulb moment: why not mimic sidecars, and configure the redirection in the application pod’s network namespace?

While this sounds like a “simple” thought, how would this even be possible? A critical requirement of ambient is that ztunnel must run outside application pods, in the Istio system namespace. After some research, we discovered a Linux process running in one network namespace could create and own listening sockets within another network namespace. This is a basic capability of the Linux socket API. However, to make this work operationally and cover all pod lifecycle scenarios, we had to make architectural changes to the ztunnel as well as to the istio-cni node agent.

After prototyping, and validating that this novel approach works on all the Kubernetes platforms we have access to, we built confidence in the work and decided to contribute it upstream: a new in-Pod traffic redirection mechanism between workload pods and the ztunnel node proxy component, built from the ground up to be highly compatible with all major cloud providers and CNIs.

The key innovation is to deliver the pod’s network namespace to the co-located ztunnel so that ztunnel can start its redirection sockets inside the pod’s network namespace, while still running outside the pod. With this approach, the traffic redirection between ztunnel and application pods happens in a way that’s very similar to sidecars and application pods today and is strictly invisible to any Kubernetes primary CNI operating in the node network namespace. Network policy can continue to be enforced and managed by any Kubernetes primary CNI, regardless of whether the CNI uses eBPF or iptables, without any conflict.

Technical deep dive of in-Pod traffic redirection

First, let’s go over the basics of how a packet travels between pods in Kubernetes.

Linux, Kubernetes, and CNI - what’s a network namespace, and why does it matter?

In Linux, a container is one or more Linux processes running within isolated Linux namespaces. A Linux namespace is simply a kernel flag that controls what processes running within that namespace are able to see. For instance, if you create a new Linux network namespace via the ip netns add my-linux-netns command and run a process inside it, that process can only see the networking rules created within that network namespace. It can not see any network rules created outside of it - even though everything running on that machine is still sharing one Linux networking stack.
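
To make this concrete, you can experiment with network namespaces directly on a Linux host:

$ ip netns add my-linux-netns            # create a new network namespace
$ ip netns exec my-linux-netns ip link   # commands run here only see interfaces in that namespace
$ ip netns delete my-linux-netns         # clean up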

Linux namespaces are conceptually a lot like Kubernetes namespaces - logical labels that organize and isolate different active processes, and allow you to create rules about what things within a given namespace can see and what rules are applied to them - they simply operate at a much lower level.

When a process running within a network namespace creates a TCP packet outward bound for something else, the packet must be processed by any local rules within the local network namespace first, then leave the local network namespace, passing into another one.

For example, in plain Kubernetes without any mesh installed, a pod might create a packet and send it to another pod, and the packet might (depending on how networking was set up):

  • Be processed by any rules within the source pod’s network namespace.
  • Leave the source pod network namespace, and bubble up into the node’s network namespace where it is processed by any rules in that namespace.
  • From there, finally be redirected into the target pod’s network namespace (and processed by any rules there).

In Kubernetes, the Container Runtime Interface (CRI) is responsible for talking to the Linux kernel, creating network namespaces for new pods, and starting processes within them. The CRI then invokes the Container Networking Interface (CNI), which is responsible for wiring up the networking rules in the various Linux network namespaces, so that packets leaving and entering the new pod can get where they’re supposed to go. It doesn’t matter much to Kubernetes or the container runtime what topology or mechanism the CNI uses to accomplish this - as long as packets get where they’re supposed to be, Kubernetes works and everyone is happy.

Why did we drop the previous model?

In Istio ambient mesh, every node has a minimum of two containers running as Kubernetes DaemonSets:

  • An efficient ztunnel which handles mesh traffic proxying duties, and L4 policy enforcement.
  • A istio-cni node agent that handles adding new and existing pods into the ambient mesh.

In the previous ambient mesh implementation, this is how an application pod was added to the ambient mesh:

  • The istio-cni node agent detects an existing or newly-started Kubernetes pod with its namespace labeled with istio.io/dataplane-mode=ambient, indicating that it should be included in the ambient mesh.
  • The istio-cni node agent then establishes network redirection rules in the host network namespace, such that packets entering or leaving the application pod would be intercepted and redirected to that node’s ztunnel on the relevant proxy ports (15008, 15006, or 15001).

This means that for a packet created by a pod in the ambient mesh, that packet would leave that source pod, enter the node’s host network namespace, and then ideally would be intercepted and redirected to that node’s ztunnel (running in its own network namespace) for proxying to the destination pod, with the return trip being similar.

This model worked well enough as a placeholder for the initial ambient mesh alpha implementation, but as mentioned, it has a fundamental problem - there are many CNI implementations, and in Linux there are many fundamentally different and incompatible ways in which you can configure how packets get from one network namespace to another. You can use tunnels, overlay networks, go through the host network namespace, or bypass it. You can go through the Linux user space networking stack, or you can skip it and shuttle packets back and forth in the kernel space stack, etc. For every possible approach, there’s probably a CNI implementation out there that makes use of it.

This meant that with the previous redirection approach, there were many CNI implementations ambient simply wouldn’t work with. Given the reliance on host network namespace packet redirection, any CNI that didn’t route packets through the host network namespace would need a different redirection implementation. And even for CNIs that did, we would have unavoidable and potentially unresolvable problems with conflicting host-level rules. Do we intercept before the CNI, or after? Will some CNIs break if we do one or the other, and they aren’t expecting that? Where and when is NetworkPolicy enforced, since NetworkPolicy must be enforced in the host network namespace? Do we need lots of code to special-case every popular CNI?

Istio ambient traffic redirection: the new model

In the new ambient model, this is how an application pod is added to the ambient mesh:

  • The istio-cni node agent detects a Kubernetes pod (existing or newly-started) with its namespace labeled with istio.io/dataplane-mode=ambient, indicating that it should be included in the ambient mesh.
    • If a new pod is started that should be added to the ambient mesh, a CNI plugin (as installed and managed by the istio-cni agent) is triggered by the CRI. This plugin is used to push a new pod event to the node’s istio-cni agent, and block pod startup until the agent successfully configures redirection. Since CNI plugins are invoked by the CRI as early as possible in the Kubernetes pod creation process, this ensures that we can establish traffic redirection early enough to prevent traffic escaping during startup, without relying on things like init containers.
    • If an already-running pod becomes added to the ambient mesh, a new pod event is triggered. The istio-cni node agent’s Kubernetes API watcher detects this, and redirection is configured in the same manner.
  • The istio-cni node agent enters the pod’s network namespace and establishes network redirection rules inside the pod network namespace, such that packets entering and leaving the pod are intercepted and transparently redirected to the node-local ztunnel proxy instance listening on well-known ports (15008, 15006, 15001).
  • The istio-cni node agent then informs the node ztunnel over a Unix domain socket that it should establish local proxy listening ports inside the pod’s network namespace, (on 15008, 15006, and 15001), and provides ztunnel with a low-level Linux file descriptor representing the pod’s network namespace.
    • While typically sockets are created within a Linux network namespace by the process actually running inside that network namespace, it is perfectly possible to leverage Linux’s low-level socket API to allow a process running in one network namespace to create listening sockets in another network namespace, assuming the target network namespace is known at creation time.
  • The node-local ztunnel internally spins up a new proxy instance and listen port set, dedicated to the newly-added pod.
  • Once the in-Pod redirect rules are in place and the ztunnel has established the listen ports, the pod is added to the mesh and traffic begins flowing through the node-local ztunnel, as before.

Here’s a basic diagram showing the flow of an application pod being added to the ambient mesh:

Once the pod is successfully added to the ambient mesh, traffic to and from pods in the mesh will be fully encrypted with mTLS by default, as always with Istio.

Traffic will now enter and leave the pod network namespace as encrypted traffic - it will look like every pod in the ambient mesh has the ability to enforce mesh policy and securely encrypt traffic, even though the user application running in the pod has no awareness of either.

Here’s a diagram to illustrate how encrypted traffic flows between pods in the ambient mesh in the new model:

And, as before, unencrypted plaintext traffic from outside the mesh can still be handled and policy enforced, for use cases where that is necessary:

The new ambient traffic redirection: what this gets us

The end result of the new ambient capture model is that all traffic capture and redirection happens inside the pod’s network namespace. To the node, the CNI, and everything else, it looks like there is a sidecar proxy inside the pod, even though there is no sidecar proxy running in the pod at all. Remember that the job of CNI implementations is to get packets to and from the pod. By design and by the CNI spec, they do not care what happens to packets after that point.

This approach automatically eliminates conflicts with a wide range of CNI and NetworkPolicy implementations, and drastically improves Istio ambient mesh compatibility with all major managed Kubernetes offerings across all major CNIs.

Wrapping up

Thanks to significant effort from our lovely community in testing the change across a large variety of Kubernetes platforms and CNIs, and many rounds of review from Istio maintainers, we are glad to announce that the ztunnel and istio-cni PRs implementing this feature have merged into Istio 1.21 and are enabled by default for ambient, so Istio users can start running ambient mesh on any Kubernetes platform with any CNI in Istio 1.21 or newer. We’ve tested this with GKE, AKS, and EKS and all the CNI implementations they offer, as well as with third-party CNIs like Calico and Cilium and platforms like OpenShift, with solid results.

We are extremely excited to move Istio ambient mesh forward to run everywhere with this innovative in-Pod traffic redirection approach between ztunnel and users’ application pods. With this top technical hurdle to ambient beta resolved, we can’t wait to work with the rest of the Istio community to get ambient mesh to beta soon! To learn more about ambient mesh’s beta progress, join us in the #ambient and #ambient-dev channels in Istio’s Slack, attend the weekly ambient contributor meeting on Wednesdays, or check out the ambient mesh beta project board and help us fix something!

]]>
Mon, 29 Jan 2024 00:00:00 +0000/v1.24//blog/2024/inpod-traffic-redirection-ambient//v1.24//blog/2024/inpod-traffic-redirection-ambient/AmbientIstioCNIztunneltraffic
Istio in Paris! See you at KubeCon Europe 2024There will be lots of Istio-related activity at KubeCon + CloudNativeCon Europe in Paris! We’ll keep this page updated with more details as they are published.

See you soon in Paris!

]]>
Fri, 19 Jan 2024 00:00:00 +0000/v1.24//blog/2024/kubecon-eu//v1.24//blog/2024/kubecon-eu/Istio DayIstioconferenceKubeConCloudNativeCon
Routing egress traffic to wildcard destinationsIf you are using Istio to handle application-originated traffic to destinations outside of the mesh, you’re probably familiar with the concept of egress gateways. Egress gateways can be used to monitor and forward traffic from mesh-internal applications to locations outside of the mesh. This is a useful feature if your system is operating in a restricted environment and you want to control what can be reached on the public internet from your mesh.

The use-case of configuring an egress gateway to handle arbitrary wildcard domains had been included in the official Istio docs up until version 1.13, but was subsequently removed because the documented solution was not officially supported or recommended and was subject to breakage in future versions of Istio. Nevertheless, the old solution was still usable with Istio versions before 1.20. Istio 1.20, however, dropped some Envoy functionality that was required for the approach to work.

This post attempts to describe how we resolved the issue and filled the gap with a similar approach using Istio version-independent components and Envoy features, but without the need for a separate Nginx SNI proxy. Our approach allows users of the old solution to seamlessly migrate configurations before their systems face the breaking changes in Istio 1.20.

Problem to solve

The currently documented egress gateway use-cases rely on the fact that the target of the traffic (the hostname) is statically configured in a VirtualService, telling Envoy in the egress gateway pod where to TCP proxy the matching outbound connections. You can use multiple, and even wildcard, DNS names to match the routing criteria, but you are not able to route the traffic to the exact location specified in the application request. For example you can match traffic for targets *.wikipedia.org, but you then need to forward the traffic to a single final target, e.g., en.wikipedia.org. If there is another service, e.g., anyservice.wikipedia.org, that is not hosted by the same server(s) as en.wikipedia.org, the traffic to that host will fail. This is because, even though the target hostname in the TLS handshake of the HTTP payload contains anyservice.wikipedia.org, the en.wikipedia.org servers will not be able to serve the request.

The solution to this problem at a high level is to inspect the original server name (SNI extension) in the application TLS handshake (which is sent in plain-text, so no TLS termination or other man-in-the-middle operation is needed) in every new gateway connection and use it as the target to dynamically TCP proxy the traffic leaving the gateway.

When restricting egress traffic via egress gateways, we need to lock down the egress gateways so that they can only be used by clients within the mesh. This is achieved by enforcing ISTIO_MUTUAL (mTLS peer authentication) between the application sidecar and the gateway. That means there will be two layers of TLS on the application L7 payload: the application-originated end-to-end TLS session, terminated by the final remote target, and the Istio mTLS session.

Another thing to keep in mind is that in order to mitigate any potential application pod corruption, the application sidecar and the gateway should both perform hostname list checks. This way, any compromised application pod will still only be able to access the allowed targets and nothing more.

Low-level Envoy programming to the rescue

Recent Envoy releases include a dynamic TCP forward proxy solution that uses the SNI header on a per-connection basis to determine the target of an application request. While an Istio VirtualService cannot configure a target like this, we are able to use EnvoyFilters to alter the Istio-generated routing instructions so that the SNI header is used to determine the target.

To make it all work, we start by configuring a custom egress gateway to listen for the outbound traffic. Using a DestinationRule and a VirtualService, we instruct the application sidecars to route the traffic (for a selected list of hostnames) to that gateway using Istio mTLS. On the gateway pod side, we build the SNI forwarder with the EnvoyFilters mentioned above, introducing internal Envoy listeners and clusters to make it all work. Finally, we patch the internal destination of the gateway-implemented TCP proxy to point to the internal SNI forwarder.
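
The DestinationRule mentioned here is what turns on Istio mTLS towards the gateway Service. A minimal sketch (the resource name is illustrative; the host and subset match the Service and VirtualService shown later in this post) could look like this:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: egressgateway
  namespace: istio-system
spec:
  host: egressgateway.istio-egress.svc.cluster.local
  subsets:
  - name: wildcard
    trafficPolicy:
      tls:
        mode: ISTIO_MUTUAL   # wrap traffic to the gateway in Istio mutual TLS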

The end-to-end request flow is shown in the following diagram:

Egress SNI routing with arbitrary domain names

This diagram shows an egress HTTPS request to en.wikipedia.org using SNI as a routing key.

  • Application container

    Application originates HTTP/TLS connection towards the final destination. Puts destination’s hostname into the SNI header. This TLS session is not decrypted inside the mesh. Only SNI header is inspected (as it is in cleartext).

  • Sidecar proxy

    Sidecar intercepts traffic to matching hostnames in the SNI header from the application originated TLS sessions. Based on the VirtualService, the traffic is routed to the egress gateway while wrapping original traffic into Istio mTLS as well. Outer TLS session has the gateway Service address in the SNI header.

  • Mesh listener

    A dedicated listener is created in the Gateway that mutually authenticates the Istio mTLS traffic. After the outer Istio mTLS termination, it unconditionally sends the inner TLS traffic with a TCP proxy to the other (internal) listener in the same Gateway.

  • SNI forwarder

    Another listener with SNI forwarder performs a new TLS header inspection for the original TLS session. If the inner SNI hostname matches the allowed domain names (including wildcards), it TCP proxies the traffic to the destination, read from the header per connection. This listener is internal to Envoy (allowing it to restart traffic processing to see the inner SNI value), so that no pods (inside or outside the mesh) can connect to it directly. This listener is 100% manually configured through EnvoyFilter.

Deploy the sample

In order to deploy the sample configuration, start by creating the istio-egress namespace and then use the following YAML to deploy an egress gateway, along with some RBAC and its Service. We use the gateway injection method to create the gateway in this example. Depending on your install method, you may want to deploy it differently (for example, using an IstioOperator CR or using Helm).
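
A minimal way to create the namespace (a simple sketch; no extra labels are assumed here) is:

$ kubectl create namespace istio-egress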

# New k8s cluster service to put egressgateway into the Service Registry,
# so application sidecars can route traffic towards it within the mesh.
apiVersion: v1
kind: Service
metadata:
  name: egressgateway
  namespace: istio-egress
spec:
  type: ClusterIP
  selector:
    istio: egressgateway
  ports:
  - port: 443
    name: tls-egress
    targetPort: 8443

---
# Gateway deployment with injection method
apiVersion: apps/v1
kind: Deployment
metadata:
  name: istio-egressgateway
  namespace: istio-egress
spec:
  selector:
    matchLabels:
      istio: egressgateway
  template:
    metadata:
      annotations:
        inject.istio.io/templates: gateway
      labels:
        istio: egressgateway
        sidecar.istio.io/inject: "true"
    spec:
      containers:
      - name: istio-proxy
        image: auto # The image will automatically update each time the pod starts.
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsUser: 1337
          runAsGroup: 1337

---
# Set up roles to allow reading credentials for TLS
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: istio-egressgateway-sds
  namespace: istio-egress
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "watch", "list"]
- apiGroups:
  - security.openshift.io
  resourceNames:
  - anyuid
  resources:
  - securitycontextconstraints
  verbs:
  - use

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: istio-egressgateway-sds
  namespace: istio-egress
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: istio-egressgateway-sds
subjects:
- kind: ServiceAccount
  name: default

Verify the gateway pod is up and running in the istio-egress namespace and then apply the following YAML to configure the gateway routing:

# Define a new listener that enforces Istio mTLS on inbound connections.
# This is where sidecar will route the application traffic, wrapped into
# Istio mTLS.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: egressgateway
  namespace: istio-system
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 8443
      name: tls-egress
      protocol: TLS
    hosts:
      - "*"
    tls:
      mode: ISTIO_MUTUAL

---
# VirtualService that will instruct sidecars in the mesh to route the outgoing
# traffic to the egress gateway Service if the SNI target hostname matches
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: direct-wildcard-through-egress-gateway
  namespace: istio-system
spec:
  hosts:
    - "*.wikipedia.org"
  gateways:
  - mesh
  - egressgateway
  tls:
  - match:
    - gateways:
      - mesh
      port: 443
      sniHosts:
        - "*.wikipedia.org"
    route:
    - destination:
        host: egressgateway.istio-egress.svc.cluster.local
        subset: wildcard
# Dummy routing instruction. If omitted, no reference will point to the Gateway
# definition, and istiod will optimise the whole new listener out.
  tcp:
  - match:
    - gateways:
      - egressgateway
      port: 8443
    route:
    - destination:
        host: "dummy.local"
      weight: 100

---
# Instruct sidecars to use Istio mTLS when sending traffic to the egress gateway
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: egressgateway
  namespace: istio-system
spec:
  host: egressgateway.istio-egress.svc.cluster.local
  subsets:
  - name: wildcard
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL

---
# Put the remote targets into the Service Registry
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: wildcard
  namespace: istio-system
spec:
  hosts:
    - "*.wikipedia.org"
  ports:
  - number: 443
    name: tls
    protocol: TLS

---
# Access logging for the gateway
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  accessLogging:
    - providers:
      - name: envoy

---
# And finally, the configuration of the SNI forwarder,
# its internal listener, and the patch to the original Gateway
# listener to route everything into the SNI forwarder.
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: sni-magic
  namespace: istio-system
spec:
  configPatches:
  - applyTo: CLUSTER
    match:
      context: GATEWAY
    patch:
      operation: ADD
      value:
        name: sni_cluster
        load_assignment:
          cluster_name: sni_cluster
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  envoy_internal_address:
                    server_listener_name: sni_listener
  - applyTo: CLUSTER
    match:
      context: GATEWAY
    patch:
      operation: ADD
      value:
        name: dynamic_forward_proxy_cluster
        lb_policy: CLUSTER_PROVIDED
        cluster_type:
          name: envoy.clusters.dynamic_forward_proxy
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.clusters.dynamic_forward_proxy.v3.ClusterConfig
            dns_cache_config:
              name: dynamic_forward_proxy_cache_config
              dns_lookup_family: V4_ONLY

  - applyTo: LISTENER
    match:
      context: GATEWAY
    patch:
      operation: ADD
      value:
        name: sni_listener
        internal_listener: {}
        listener_filters:
        - name: envoy.filters.listener.tls_inspector
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector

        filter_chains:
        - filter_chain_match:
            server_names:
            - "*.wikipedia.org"
          filters:
            - name: envoy.filters.network.sni_dynamic_forward_proxy
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.sni_dynamic_forward_proxy.v3.FilterConfig
                port_value: 443
                dns_cache_config:
                  name: dynamic_forward_proxy_cache_config
                  dns_lookup_family: V4_ONLY
            - name: envoy.tcp_proxy
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
                stat_prefix: tcp
                cluster: dynamic_forward_proxy_cluster
                access_log:
                - name: envoy.access_loggers.file
                  typed_config:
                    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                    path: "/dev/stdout"
                    log_format:
                      text_format_source:
                        inline_string: '[%START_TIME%] "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%
                          %PROTOCOL%" %RESPONSE_CODE% %RESPONSE_FLAGS% %RESPONSE_CODE_DETAILS% %CONNECTION_TERMINATION_DETAILS%
                          "%UPSTREAM_TRANSPORT_FAILURE_REASON%" %BYTES_RECEIVED% %BYTES_SENT% %DURATION%
                          %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% "%REQ(X-FORWARDED-FOR)%" "%REQ(USER-AGENT)%"
                          "%REQ(X-REQUEST-ID)%" "%REQ(:AUTHORITY)%" "%UPSTREAM_HOST%" %UPSTREAM_CLUSTER%
                          %UPSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_REMOTE_ADDRESS%
                          %REQUESTED_SERVER_NAME% %ROUTE_NAME%

                          '
  - applyTo: NETWORK_FILTER
    match:
      context: GATEWAY
      listener:
        filterChain:
          filter:
            name: "envoy.filters.network.tcp_proxy"
    patch:
      operation: MERGE
      value:
        name: envoy.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: tcp
          cluster: sni_cluster

Check the istiod and gateway logs for any errors or warnings. If all went well, your mesh sidecars are now routing *.wikipedia.org requests to your gateway pod while the gateway pod is then forwarding them to the exact remote host specified in the application request.
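
For example, assuming a default installation where the control plane runs as the istiod Deployment in the istio-system namespace and the gateway was deployed as above, you could check with commands along these lines:

$ kubectl logs -n istio-system deploy/istiod | grep -iE "warn|error"
$ kubectl logs -n istio-egress deploy/istio-egressgateway | grep -iE "warn|error"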

Try it out

Following other Istio egress examples, we will use the sleep pod as a test source for sending requests. Assuming automatic sidecar injection is enabled in your default namespace, deploy the test app using the following command:

$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.24/samples/sleep/sleep.yaml

Get your sleep and gateway pods:

$ export SOURCE_POD=$(kubectl get pod -l app=sleep -o jsonpath={.items..metadata.name})
$ export GATEWAY_POD=$(kubectl get pod -n istio-egress -l istio=egressgateway -o jsonpath={.items..metadata.name})

Run the following command to confirm that you are able to connect to the wikipedia.org site:

$ kubectl exec "$SOURCE_POD" -c sleep -- sh -c 'curl -s https://en.wikipedia.org/wiki/Main_Page | grep -o "<title>.*</title>"; curl -s https://de.wikipedia.org/wiki/Wikipedia:Hauptseite | grep -o "<title>.*</title>"'
<title>Wikipedia, the free encyclopedia</title>
<title>Wikipedia – Die freie Enzyklopädie</title>

We could reach both English and German wikipedia.org subdomains, great!

Normally, in a production environment, we would block external requests that are not configured to redirect through the egress gateway, but since we didn’t do that in our test environment, let’s access another external site for comparison:

$ kubectl exec "$SOURCE_POD" -c sleep -- sh -c 'curl -s https://cloud.ibm.com/login | grep -o "<title>.*</title>"'
<title>IBM Cloud</title>

Since we have access logging turned on globally (with the Telemetry CR in the manifest), we can now inspect the logs to see how the above requests were handled by the proxies.

First, check the gateway logs:

$ kubectl logs -n istio-egress $GATEWAY_POD
[...]
[2023-11-24T13:21:52.798Z] "- - -" 0 - - - "-" 813 111152 55 - "-" "-" "-" "-" "185.15.59.224:443" dynamic_forward_proxy_cluster 172.17.5.170:48262 envoy://sni_listener/ envoy://internal_client_address/ en.wikipedia.org -
[2023-11-24T13:21:52.798Z] "- - -" 0 - - - "-" 1531 111950 55 - "-" "-" "-" "-" "envoy://sni_listener/" sni_cluster envoy://internal_client_address/ 172.17.5.170:8443 172.17.34.35:55102 outbound_.443_.wildcard_.egressgateway.istio-egress.svc.cluster.local -
[2023-11-24T13:21:53.000Z] "- - -" 0 - - - "-" 821 92848 49 - "-" "-" "-" "-" "185.15.59.224:443" dynamic_forward_proxy_cluster 172.17.5.170:48278 envoy://sni_listener/ envoy://internal_client_address/ de.wikipedia.org -
[2023-11-24T13:21:53.000Z] "- - -" 0 - - - "-" 1539 93646 50 - "-" "-" "-" "-" "envoy://sni_listener/" sni_cluster envoy://internal_client_address/ 172.17.5.170:8443 172.17.34.35:55108 outbound_.443_.wildcard_.egressgateway.istio-egress.svc.cluster.local -

There are four log entries, representing two of our three curl requests. Each pair shows how a single request flows through the Envoy traffic processing pipeline. They are printed in reverse order: the 2nd and 4th lines show the requests arriving at the gateway service and being passed through the internal sni_cluster target, while the 1st and 3rd lines show that the final target is determined from the inner SNI header, i.e., the target host set by the application. The request is forwarded to dynamic_forward_proxy_cluster, which finally sends the request on from Envoy to the remote target.

Great, but where is the third request to IBM Cloud? Let’s check the sidecar logs:

$ kubectl logs $SOURCE_POD -c istio-proxy
[...]
[2023-11-24T13:21:52.793Z] "- - -" 0 - - - "-" 813 111152 61 - "-" "-" "-" "-" "172.17.5.170:8443" outbound|443|wildcard|egressgateway.istio-egress.svc.cluster.local 172.17.34.35:55102 208.80.153.224:443 172.17.34.35:37020 en.wikipedia.org -
[2023-11-24T13:21:52.994Z] "- - -" 0 - - - "-" 821 92848 55 - "-" "-" "-" "-" "172.17.5.170:8443" outbound|443|wildcard|egressgateway.istio-egress.svc.cluster.local 172.17.34.35:55108 208.80.153.224:443 172.17.34.35:37030 de.wikipedia.org -
[2023-11-24T13:21:55.197Z] "- - -" 0 - - - "-" 805 15199 158 - "-" "-" "-" "-" "104.102.54.251:443" PassthroughCluster 172.17.34.35:45584 104.102.54.251:443 172.17.34.35:45582 cloud.ibm.com -

As you can see, Wikipedia requests were sent through the gateway while the request to IBM Cloud went straight out from the application pod to the internet, as indicated by the PassthroughCluster log.

Conclusion

We implemented controlled routing for egress HTTPS/TLS traffic using egress gateways, supporting arbitrary and wildcard domain names. In a production environment, the example shown in this post would be extended to support HA requirements (e.g., adding zone aware gateway Deployments, etc.) and to restrict the direct external network access of your application so that the application can only access the public network through the gateway, which is limited to a predefined set of remote hostnames.

The solution scales easily. You can include multiple domain names in the configuration, and they will be allow-listed as soon as you roll it out! There is no need to configure per-domain VirtualServices or other routing details. Be careful, however, as the domain names are listed in multiple places in the config. If you use tooling for CI/CD (e.g., Kustomize), it’s best to extract the domain name list into a single place from which you render the required configuration resources.

That’s all! I hope this was helpful. If you’re an existing user of the previous Nginx-based solution, you can now migrate to this approach before upgrading to Istio 1.20, which will otherwise disrupt your current setup.

Happy SNI routing!

References

]]>
Fri, 01 Dec 2023 00:00:00 +0000/v1.24//blog/2023/egress-sni//v1.24//blog/2023/egress-sni/traffic-managementgatewaymeshmtlsegressremote
Istio at KubeCon North America 2023The open source and cloud native community gathered from the 6th to the 9th of November in Chicago for the final KubeCon of 2023. The four-day conference, organized by the Cloud Native Computing Foundation, was “twice the fun” for Istio, as we grew from a half-day event in Europe in April to a full day co-located event. To add to the excitement, Istio Day North America marked our first event as a CNCF graduated project.

With Istio Day NA over, that’s a wrap for our major community events for 2023. In case you missed them, Istio Day Europe was held in April, and alongside our Virtual IstioCon 2023 event, IstioCon China 2023 was held on September 26 in Shanghai, China.

Istio Day kicked off with an opening keynote from the Program Committee chairs, Faseela K and Zack Butcher. The keynote made sure to recognize the day-to-day efforts of our contributors, maintainers, release managers, and users, with some awards for our top contributors and community helpers. Rob Salmond and Andrea Ma were recognized for their selfless efforts in the Istio community, and the top 20 contributors in the last 6 months were also called out.

Top 20 contributors who were in attendance were asked to come onto the stage

The opening keynote also announced the availability of the Istio Certified Associate (ICA) exam for enrollment starting November 6th.

We were also proud to showcase a small video of many of our contributors, vendors and end-users congratulating us for the CNCF graduation!

The keynote was followed by an end user talk by Kush Trivedi and Khushboo Mittal from DevRev about their usage of Istio. We had a much-awaited session on architecting ambient for scale from John Howard, which stirred some interesting discussions in the community. We also had an interesting talk showcasing the collaboration between Lilt and Intel about Scaling AI powered translation services using Istio.

After this we stepped into another end user talk from Intuit where Karim Lakhani explained about Intuit’s modern SaaS platform deploying multiple cloud native projects including Istio. The audience was excited when Mitch Connors and Christian Hernandez did a live demo of upgrading Istio ambient mesh with Argo on a live public site, with a publicly accessible availability monitor.

Jam-packed sessions at Istio Day

The event saw more focus on security in subsequent talks, with Jackie Elliot from Microsoft digging into Istio Identity, followed by a lightning talk from Kush Mansing from Speedscale showing the impact of running services with arbitrary code on Istio. We also had a lightning talk from Xiangfeng Zhu, a PhD student at the University of Washington, where he showcased a tool developed to analyze and predict the performance overhead of Istio.

The talk from the Kiali maintainers Jay Shaughnessy and Nick Fox was very interesting, as it demonstrated many advanced ways of using Kiali to better debug Istio use cases. Ekansh Gupta from Zeta and Nirupama Singh from Reskill pitched in with another end user talk explaining best practices for upgrading Istio in their production deployments.

Istio multi-cluster is always a hot topic, and Lukonde Mwila and Ovidiu from AWS nailed it in the talk on bridging trust between multi-cluster meshes.

We also had an interactive panel discussion with the Istio TOC members, where a lot of questions came in from the audience; the strong attendance for the discussion was a testament to the continued popularity of Istio. Istio Day concluded with a brilliant workshop on getting started with ambient mesh from Christian Posta and Jim Barton from Solo.io, the hot topic the audience was most looking forward to.

The slides for all the sessions can be found in the Istio Day NA 2023 schedule.

Kush Trivedi and Khushboo Mittal from DevRev on stage

Our presence at the conference did not end with Istio Day. The first day keynote of KubeCon + CloudNativeCon started with a project update video from Mitch Connors. It was also a proud moment for us, when two of our contributors, Lin Sun and Faseela K, took home the prestigious CNCF community “Chop Wood Carry Water” award, presented by Chris Aniszczyk, CTO CNCF, at the second day keynote.

Chop Wood Carry Water winners, Faseela K and Lin Sun (second and third from left)

Some of our maintainers and contributors made it to the CNCF Fall 2023 Ambassadors list as well, Lin Sun, Mitch Connors, and Faseela K, to name a few.

The CNCF Ambassador group photo. Many Istio maintainers are in this picture!

The KubeCon maintainer track session for Istio, presented by TOC members John Howard and Louis Ryan, grabbed great attention as they talked about the current ongoing efforts and future roadmap of Istio. The technologies described in the talk, and the resulting size of the audience, underlined why Istio continues to be the most popular service mesh in the industry.

The Contribfest Hands-on Development and Contribution Workshop by Lin Sun, Eric Van Norman, Steven Landow, and Faseela K was also well received. It was great to see so many people interested in contributing to Istio and pushing their first pull request at the end of the workshop.

A much-awaited panel discussion on Service Mesh Battle Scars: Technology, Timing and Tradeoffs, led by the maintainers from three CNCF Service Mesh projects, had a huge crowd in attendance, and a lot of interesting discussions.

Istio came up as a hot topic of discussion in several other KubeCon talks as well. Here are a few we noticed:

Istio had a kiosk in the project pavilion, with the majority of questions asked being around the schedule for ambient mesh being production ready.

Discussions at the Istio kiosk

We are glad that the major question which we had at the Istio kiosk in Europe — the schedule for CNCF graduation — has been answered, and we assured everyone that we are working on ambient mesh with the same level of seriousness.

Many of our members and maintainers offered support at our kiosk, helping us answer all the questions from our users.

Members and maintainers at the Istio kiosk

Another highlight of our kiosk was that we had new Istio T-shirts sponsored by Microsoft, Solo.io, Stackgenie and Tetrate for everyone to grab!

A new crop of Istio T-shirts

We would like to express our heartfelt gratitude to our platinum sponsors Google Cloud, for supporting Istio Day North America! Last but not least, we would like to thank our Istio Day Program Committee members, for all their hard work and support!

See you in Paris in March 2024!

]]>
Thu, 16 Nov 2023 00:00:00 +0000/v1.24//blog/2023/istio-at-kubecon-na//v1.24//blog/2023/istio-at-kubecon-na/Istio DayIstioConIstioconferenceKubeConCloudNativeCon
Secure Application Communications with Mutual TLS and IstioOne of the biggest reasons users adopt service mesh is to enable secure communication among applications using mutual TLS (mTLS) based on cryptographically verifiable identities. In this blog, we’ll discuss the requirements of secure communication among applications, how mTLS enables and meets all those requirements, along with simple steps to get you started with enabling mTLS among your applications using Istio.

What do you need to secure the communications among your applications?

Modern cloud native applications are frequently distributed across multiple Kubernetes clusters or virtual machines. New versions are staged frequently, and applications can rapidly scale up and down based on user requests. As modern applications gain resource utilization efficiency by not depending on co-location, it is paramount to be able to apply access policy to, and secure the communications among, these distributed applications, because the increased number of entry points results in a larger attack surface. To ignore this is to invite massive business risk from data loss, data theft, forged data, or simple mishandling.

The following are the common key requirements for secure communications between applications:

Identities

Identity is a fundamental component of any security architecture. Before your applications can send their data securely, identities must be established for the applications. This process of establishing an identity is called identity validation - it involves some well-known, trusted authority performing one or more checks on the application workload to establish that it is what it claims to be. Once the authority is satisfied, it grants the workload an identity.

Consider the act of being issued a passport - you will request one from some authority, that authority will probably ask you for several different identity validations that prove you are who you say you are - a birth certificate, current address, medical records, etc. Once you have satisfied all the identity validations, you will (hopefully) be granted the identity document. You can give that identity document to someone else as proof that you have satisfied all the identity validation requirements of the issuing authority, and if they trust the issuing authority (and the identity document itself), they can trust what it says about you (or they can contact the trusted authority and verify the document).

An identity could take any form, but, as with any form of identity document, the weaker the identity validations are, the easier it is to forge, and the less useful that identity document is to anyone using it to make a decision. That’s why, in computing, cryptographically verifiable identities are so important - they are signed by a verifiable authority, similar to your passport and driver’s license. Identities based around anything less are a security weakness that is relatively easy to exploit.

Your system may have identities derived from network properties such as IP addresses, with distributed identity caches that track the mapping between identities and these network properties. These identities don’t offer the strong guarantees of cryptographically verifiable identities, because IP addresses can be re-allocated to different workloads and the identity caches may not always be up to date.

Using cryptographically verifiable identities for your applications is desirable, because exchanging them during connection establishment is inherently more reliable and secure than depending on systems that map IP addresses to identities. Those systems rely on distributed identity caches with eventual consistency and staleness issues, which can create a structural weakness in Kubernetes, where high rates of automated pod churn are the norm.

Confidentiality

Encrypting the data transmitted among applications is critical - because in a world where breaches are common, costly, and effectively trivial, relying entirely on secure internal environments or other security perimeters has long since ceased to be adequate. To prevent a man-in-the-middle attack, you require a unique encryption channel for a source-destination pair because you want a strong identity uniqueness guarantee to avoid confused deputy problems. In other words, it is not enough to simply encrypt the channel - it must be encrypted using unique keys directly derived from the unique source and destination identities so that only the source and destination can decrypt the data. Further, you may need to customize the encryption, e.g. by choosing specific ciphers, in accordance with what your security team requires.

Integrity

The encrypted data sent over the network from source to destination can’t be modified by any identities other than the source and destination once it is sent. In other words, data received is the same as data sent. If you don’t have data integrity, someone in the middle could modify some bits or the entire content of the data during the communication between the source and destination.

Access Policy Enforcement

Application owners need to apply access policies to their applications and have them enforced properly, consistently, and unambiguously. In order to apply policy for both ends of a communication channel, we need an application identity for each end. Once we have a cryptographically verifiable identity with an unambiguous provenance chain for both ends of a potential communication channel, we can begin to apply policies about who can communicate with what. Standard TLS, the widely used cryptographic protocol that secures communication between clients (e.g., web browsers) and servers (e.g., web servers), only really verifies and mandates an identity for one side - the server. But for comprehensive end-to-end policy enforcement, it is critical to have a reliable, verifiable, unambiguous identity for both sides - client and server. This is a common requirement for internal applications - imagine for example a scenario where only a frontend application should call the GET method for a backend checkout application, but should not be allowed to call the POST or DELETE method. Or a scenario where only applications that have a JWT token issued by a particular JWT issuer can call the GET method for a checkout application. By leveraging cryptographic identities on both ends, we can ensure powerful access policies are enforced correctly, securely, and reliably, with a validatable audit trail.
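
As a rough sketch of what such an access policy could look like with Istio’s AuthorizationPolicy (the namespace, app label and frontend service account shown here are assumptions for illustration), a rule allowing only GET calls from the frontend to the checkout application might be written like this:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: checkout-allow-frontend-get
  namespace: default
spec:
  # Apply the policy to the checkout workload (assumed label)
  selector:
    matchLabels:
      app: checkout
  action: ALLOW
  rules:
  - from:
    - source:
        # Assumed frontend service account identity
        principals: ["cluster.local/ns/default/sa/frontend"]
    to:
    - operation:
        methods: ["GET"]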

FIPS compliance

Federal Information Processing Standards (FIPS) are standards and guidelines for federal computer systems that are developed by the National Institute of Standards and Technology (NIST). Not everyone requires FIPS compliance, but FIPS compliance means meeting all the necessary security requirements established by the U.S. government for protecting sensitive information, and it is required when working with the federal government. To follow the U.S. government’s cybersecurity guidelines, many in the private sector voluntarily use these FIPS standards.

To illustrate the above secure application requirements (identity, confidentiality and integrity), let’s use the example of the frontend application calling the checkout application. Remember, you can think of the ID in the diagram as any kind of identity document, such as a government-issued passport or photo ID:

Requirements when the frontend calls the checkout application

How does mTLS satisfy the above requirements?

TLS 1.3 (the most recent TLS version at the time of writing) specification’s primary goal is to provide a secure channel between two communicating peers. The TLS secure channel has the following properties:

  1. Authentication: the server side of the channel is always authenticated, the client side is optionally authenticated. When the client is also authenticated, the secure channel becomes a mutual TLS channel.
  2. Confidentiality: Data is encrypted and only visible to the client and server. Data must be encrypted using keys that are unambiguously cryptographically bound to the source and destination identity documents in order to reliably protect the application-layer traffic.
  3. Integrity: data sent over the channel can’t be modified without detection. This is guaranteed by the fact that only source and destination have the key to encrypt and decrypt the data for a given session.

mTLS internals

We’ve established that cryptographically verifiable identities are key for securing channels and supporting access policy enforcement, and we’ve established that mTLS is a battle-tested protocol that mandates some extremely important guarantees for using cryptographically verifiable identities on both ends of a channel - let’s get into some detail on how the mTLS protocol actually works under the hood.

Handshake protocol

The handshake protocol authenticates the communicating peers, negotiates cryptographic modes and parameters, and establishes shared keying material. In other words, the role of the handshake is to verify the communicating peers’ identities and negotiate a session key, so that the rest of the connection can be encrypted based on that key. When your applications make an mTLS connection, the server and client negotiate a cipher suite, which dictates what encryption algorithm they will use for the rest of the connection, and they also negotiate the cryptographic session key to use. The whole handshake is designed to resist tampering - interference by any entity that does not possess the same unique, cryptographically verifiable identity document as the source and/or destination will be rejected. For this reason, it is important to check the whole handshake and verify its integrity before any communicating peer continues with the application data.

The handshake can be thought of as having three phases per the handshake protocol overview in the TLS 1.3 specification - again, let’s use the example of a frontend application calling a backend checkout application:

  1. Phase 1: frontend and checkout negotiate the cryptographic parameters and encryption keys that can be used to protect the rest of the handshake and traffic data.
  2. Phase 2: everything in this phase and after is encrypted. In this phase, frontend and checkout establish other handshake parameters, including whether or not the client is also authenticated - that is, mTLS.
  3. Phase 3: frontend authenticates checkout via its cryptographically verifiable identity (and, in mTLS, checkout authenticates frontend in the same way).

There are a few major handshake-related differences from TLS 1.2; refer to the TLS 1.3 specification for more details:

  1. All handshake messages (phase 2 and 3) are encrypted using the encryption keys negotiated in phase 1.
  2. Legacy symmetric encryption algorithms have been pruned.
  3. A zero round-trip time (0-RTT) mode was added, saving a round trip at connection setup.

Record protocol

Having negotiated the TLS protocol version, session key, and HMAC during the handshake phase, the peers can now securely exchange encrypted data, which is chunked by the record protocol. It is critical (and required as part of the spec) to use the exact same negotiated parameters from the handshake to encrypt the traffic, to ensure traffic confidentiality and integrity.

Putting the two protocols from the TLS 1.3 specification together and using the frontend and checkout applications to illustrate the flow as below:

mTLS flows when the frontend calls the checkout application

Who issues the identity certificates for frontend and checkout? They are commonly issued by a certificate authority (CA) which either has its own root certificate or uses an intermediate certificate from its root CA. A root certificate is basically a public key certificate that identifies a root CA, which you likely already have in your organization. The root certificate is distributed to frontend (or checkout) in addition to its own root-signed identity certificate. This is how everyday, basic Public Key Infrastructure (PKI) works - a CA has responsibility for validating an entity’s identity document, and then grants it an unforgeable identity document in the form of a certificate.

You can rely on your CA and intermediate CAs as source of identity truth in a structural fashion that maintains high availability and stable, persistently-verifiable identity guarantees in a way that a massive distributed cache of IP and identity maps simply cannot. When the frontend and checkout identity certificates are issued by the same root certificate, frontend and checkout can verify their peer identities consistently and reliably regardless of which cluster or nodes or scale they run.

You learned about how mTLS provides cryptographic identity, confidentiality and integrity, what about scalability as you grow to thousands or more applications among multiple clusters? If you establish a single root certificate across multiple clusters, the system doesn’t need to care when your application gets a connection request from another cluster as long as it is trusted by the root certificate - the system knows the identity on the connection is cryptographically verified. As your application pod changes IP or is redeployed to a different cluster or network, your application (or component acting on behalf of it) simply originates the traffic with its trusted certificate minted by the CA to the destination. It can be 500+ network hops or can be direct; your access policies for your application are enforced in the same fashion regardless of the topology, without needing to keep track of the identity cache and calculate which IP address maps to which application pod.

What about FIPS compliance? Per TLS 1.3 specification, TLS-compliant applications must implement the TLS_AES_128_GCM_SHA256 cipher suite, and are recommended to implement TLS_AES_256_GCM_SHA384, both of which are also in the guidelines for TLS by NIST. RSA or ECDSA server certificates are also recommended by both TLS 1.3 specification and NIST’s guideline for TLS. As long as you use mTLS and FIPS 140-2 or 140-3 compliant cryptographic modules for your mTLS connections, you will be on the right path for FIPS 140-2 or 140-3 validation.

What could go wrong

It is critical to implement mTLS exactly as the TLS 1.3 specification dictates. Without using proper mTLS following the TLS specification, here are a few things that can go wrong without detection:

What if someone in the middle of the connection silently captures the encrypted data?

If the connection doesn’t exactly follow the handshake and record protocols as outlined in the TLS specification - for example, if it follows the handshake protocol but does not use the negotiated session key and parameters from the handshake in the record protocol - then the handshake may be unrelated to the record protocol, and the identities could differ between the two. TLS requires that the handshake and record protocols share the same connection, because separating them increases the attack surface for man-in-the-middle attacks.

An mTLS connection has consistent end-to-end security from the start of the handshake to the finish. The data is encrypted with the session key negotiated using the public key in the certificate, and only the source and destination can decrypt it with the corresponding private key. In other words, only the owner of the certificate who holds the private key can decrypt the data. Unless a hacker has control of the certificate’s private key, he or she has no way to tamper with the mTLS connection and successfully execute a man-in-the-middle attack.

What if either source or destination identity is not cryptographically secure?

If the identity is based on network properties such as IP address, which could be re-allocated to other pods, the identity can’t be validated using cryptographic techniques. Since this type of identity isn’t based on cryptographic identity, your system likely has an identity cache to track the mapping between the identity, the pod’s network labels, the corresponding IP address, and the Kubernetes node where the pod is deployed. With an identity cache, you can run into pod IP addresses being reused and identities being mistaken, so policy isn’t enforced properly when the identity cache gets out of sync for a short period of time. For example, if you don’t have a cryptographic identity on the connection between the peers, your system has to get the identity from the identity cache, which could be outdated or incomplete.

These identity caches that map identity to workload IPs are not ACID (Atomicity, Consistency, Isolation, and Durability) and you want your security system to be applied to something with strong guarantees. Consider the following properties and questions you may want to ask yourself:

  • Staleness: How can a peer verify that an entry in the cache is current?
  • Incompleteness: If there’s a cache miss and the system fails to close the connection, does the network become unstable when it’s only the cache synchronizer that is failing?
  • What if something simply doesn’t have an IP? For example, an AWS Lambda service doesn’t by default have a public IP.
  • Non-transactional: If you read the identity twice will you see the same value? If you are not careful in your access policy or auditing implementation this can cause real issues.
  • Who will guard the guards themselves? Are there established practices to protect the cache like a CA has? What proof do you have that the cache has not been tampered with? Are you forced to reason about (and audit) the security of some complex infrastructure that is not your CA?

Some of the above are worse than others. You can apply the failing closed principle but that does not solve all of the above.

Identities are also used in enforcing access policies such as authorization policy, and these access policies are in the request path where your system has to make decisions fast to allow or deny the access. Whenever identities become mistaken, access policies could be bypassed without being detected or audited. For example, your identity cache may have your checkout pod’s prior allocated IP address associated as one of the checkout identities. If the checkout pod gets recycled and the same IP address is just allocated to one of the frontend pods, that frontend pod could have the checkout’s identity before the cache is updated, which could cause wrong access policies to be enforced.

Let us illustrate the identity cache staleness problem assuming the following large scale multi-cluster deployment:

  1. 100 clusters where each cluster has 100 nodes with 20 pods per node. The number of total pods is 200,000.
  2. 0.25% of pods are being churned at all times (rollout, restarts, recovery, node churn, …), each churn is a 10 second window.
  3. The 500 pods being churned are distributed across 10,000 nodes (caches) every 10 seconds.
  4. If the cache synchronizer stalls, how stale is the system after 5 minutes? Potentially as high as 7.5% - 500 churned pods every 10 seconds adds up to 15,000 pods over 5 minutes, which is 7.5% of the 200,000 total.

The above assumes the cache synchronizer is in a steady state. If the cache synchronizer has a brown-out, it would affect its health-checking, which increases the churn rate, leading to cascading instability.

A CA could also be compromised by an attacker who claims to represent someone else and tricks the CA into issuing a certificate. The attacker can then use that certificate to communicate with other peers. This is where certificate revocation can remediate the situation, by revoking the certificate so it is no longer valid; otherwise the attacker can exploit the compromised certificate until it expires. It is critical to keep the private key for the root certificates in an HSM that is kept offline, and to use intermediate certificates for signing workload certificates. In the event that the CA browns out or stalls for 5 minutes, you won’t be able to obtain new or renewed workload certificates, but previously issued and still-valid certificates continue to provide strong identity guarantees for your workloads. For increased issuance reliability, you can deploy intermediate CAs to different zones and regions.

mTLS in Istio

Enable mTLS

Enabling mTLS in Istio for intra-mesh applications is very simple. All you need to do is add your applications to the mesh, which can be done by labeling your namespace for either sidecar injection or ambient mode. In the case of sidecars, a rollout restart is required for the sidecar to be injected into your application pods.
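
For example, adding the applications in the default namespace to the mesh might look like one of the following (a minimal sketch; the namespace name is just an example):

# Sidecar mode: enable injection and restart workloads so sidecars are added
$ kubectl label namespace default istio-injection=enabled
$ kubectl rollout restart deployment -n default

# Ambient mode: no restart needed
$ kubectl label namespace default istio.io/dataplane-mode=ambient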

Cryptographic identity

In a Kubernetes environment, Istio creates an application’s identity based on its service account. An identity certificate is provided to each application pod in the mesh after you add your application to it.
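
For example, a workload running as the checkout service account in the default namespace (both names assumed for illustration) would typically carry a SPIFFE-style identity like:

spiffe://cluster.local/ns/default/sa/checkout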

By default, your pod’s identity certificate expires in 24 hours and Istio rotates it every 12 hours, so that in the event of a compromise (for example, a compromised CA or a stolen pod private key), the compromised certificate only works for a very limited period of time before it expires, limiting the damage it can cause.
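
If you want to confirm this on a running pod, one option (assuming istioctl is installed; the pod name and namespace below are placeholders) is to inspect the certificate currently loaded by the sidecar:

$ istioctl proxy-config secret <pod-name> -n <namespace>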

Enforce strict mTLS

The default behavior is to use mTLS whenever possible, but not to strictly enforce it. To make your application accept only mTLS traffic, you can use Istio’s PeerAuthentication policy, either mesh-wide or per namespace or workload. In addition, you can apply Istio’s AuthorizationPolicy to control access to your workloads.
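
For example, a PeerAuthentication policy that enforces strict mTLS for a whole namespace could look something like this (the namespace name is just an example):

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: foo
spec:
  mtls:
    mode: STRICT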

TLS version

TLS version 1.3 is the default in Istio for intra-mesh application communication, with Envoy’s default cipher suites (for example, TLS_AES_256_GCM_SHA384 for Istio 1.19.0). If you need an older TLS version, you can configure a different mesh-wide minimum TLS protocol version for your workloads.
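
As a sketch, assuming an IstioOperator-based installation, the mesh-wide minimum TLS protocol version can be set through meshConfig:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    meshMTLS:
      # Allow TLS 1.2 as the minimum for workload-to-workload mTLS
      minProtocolVersion: TLSV1_2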

Wrapping up

The TLS protocol, as established by the Internet Engineering Task Force (IETF), is one of the most widely-reviewed, expert-approved, battle-tested data security protocols in existence. TLS is also widely used globally - whenever you visit any secured website, you shop with confidence partly because of the padlock icon to indicate that you are securely connected to a trusted site by using TLS. The TLS 1.3 protocol was designed with end-to-end authentication, confidentiality, and integrity to ensure your application’s identity and communications are not compromised, and to prevent man-in-the-middle attacks. In order to achieve that (and to be considered standards-compliant TLS), it is not only important to properly authenticate the communicating peers but also critical to encrypt the traffic using the keys established from the handshake. Now that you know mTLS excels at satisfying your secure application communication requirements (cryptographic identities, confidentiality, integrity and access policy enforcement), you can simply use Istio to upgrade your intra-mesh application communication with mTLS out of the box - with very little configuration!

Huge thanks to Louis Ryan, Ben Leggett, John Howard, Christian Posta, Justin Pettit who contributed significant time in reviewing and proposing updates to the blog!

]]>
Tue, 17 Oct 2023 00:00:00 +0000/v1.24//blog/2023/secure-apps-with-istio//v1.24//blog/2023/secure-apps-with-istio/istiomtlstls
IstioCon China 2023 wrap-upIt’s great to be able to safely get together in person again. After two years of only running virtual events, we have filled the calendar for 2023. Istio Day Europe was held in April, and Istio Day North America is coming this November.

IstioCon is committed to the industry-leading service mesh that provides a platform to explore insights gained from real-world Istio deployments, engage in interactive hands-on activities, and connect with maintainers across the entire Istio ecosystem.

Alongside our virtual IstioCon 2023 event, IstioCon China 2023 was held on September 26 in Shanghai, China. Part of the KubeCon + CloudNativeCon + Open Source Summit China, the event was arranged and hosted by the Istio maintainers and the CNCF. We were very proud to have a strong program for IstioCon in Shanghai and pleased to bring together members of the Chinese Istio community. The event was a testament to Istio’s immense popularity in the Asia-Pacific ecosystem.

IstioCon China 2023

IstioCon China kicked off with an opening keynote from Program Committee members Jimmy Song and Zhonghu Xu. The event was packed with great content, ranging from new features to end user talks, with major focus on the new Istio ambient mesh.

IstioCon China 2023, Welcome

The welcome speech was followed by a sponsored keynote from Justin Pettit from Google, on “Istio Ambient Mesh as a Managed Infrastructure” which highlighted the importance and priority of the ambient model in the Istio community, especially for our top supporters like Google Cloud.

IstioCon China 2023, Google Cloud Sponsored Keynote

Perfectly placed after the keynote, Huailong Zhang from Intel and Yuxing Zeng from Alibaba discussed configurations for the co-existence of Ambient and Sidecar: a very relevant topic for existing users who want to experiment with the new ambient model.

IstioCon China 2023, Deep Dive into Istio Network Flows and Configurations for the co-existence of Ambient and Sidecar

Huawei’s new Istio data plane based on eBPF intends to implement the L4 and L7 capabilities in the kernel, to avoid switching between kernel mode and user mode and to reduce the latency of the data plane. This was explained in an interesting talk from Xie SongYang and Zhonghu Xu. Chun Li and Iris Ding from Intel also integrated eBPF with Istio, with their talk “Harnessing eBPF for Traffic Redirection in Istio ambient mode”, leading to more interesting discussions. DaoCloud also had a presence at the event, with Kebe Liu sharing Merbridge’s innovation in eBPF and Xiaopeng Han presenting MirageDebug for localized Istio development.

The talk from Tetrate’s Jimmy Song, about the perfect union of different GitOps and Observability tools, was also very well received. Chaomeng Zhang from Huawei presented on how cert-manager helps enhance the security and flexibility of Istio’s certificate management system, and Xi Ning Wang and Zehuan Shi from Alibaba Cloud shared the idea of using VK (Virtual Kubelet) to implement serverless mesh.

While Shivanshu Raj Shrivastava gave a perfect introduction to WebAssembly through his talk “Extending and Customizing Istio with Wasm”, Zufar Dhiyaulhaq from GoTo Financial, Indonesia shared the practice of using Coraza Proxy Wasm to extend Envoy and quickly implement custom Web Application Firewalls. Huabing Zhao from Tetrate shared Aeraki Mesh’s Dubbo service governance practices with Qin Shilin from Boss Direct. While multi-tenancy is always a hot topic with Istio, John Zheng from HP described in detail about multi-tenant management in HP OneCloud Platform.

The slides for all the sessions can be found in the IstioCon China 2023 schedule and all the presentations will be available in the CNCF YouTube Channel soon for the audience in other parts of the world.

On the show floor

Istio had a full-time kiosk in the project pavilion at KubeCon + CloudNativeCon + Open Source Summit China 2023, with the majority of questions being about ambient mesh. Many of our members and maintainers offered support at the booth, where a lot of interesting discussions happened.

KubeCon + CloudNativeCon + Open Source Summit China 2023, Istio Kiosk

Another highlight was the Istio Steering Committee members and authors of the Istio books “Cloud Native Service Mesh Istio” and “Istio: the Definitive Guide”, Zhonghu Xu and Chaomeng Zhang, spent time at the Istio booth interacting with our users and contributors.

Meet the Authors

We would like to express our heartfelt gratitude to our diamond sponsors Google Cloud, for supporting IstioCon 2023!

IstioCon 2023, Our Diamond Sponsor

Last but not least, we would like to thank our IstioCon China Program Committee members for all their hard work and support!

IstioCon China 2023, Program Committee Members (Not Pictured: Iris Ding)

See you all in Chicago in November!

]]>
Fri, 29 Sep 2023 00:00:00 +0000/v1.24//blog/2023/istiocon-china//v1.24//blog/2023/istiocon-china/Istio DayIstioConIstioconferenceKubeConCloudNativeCon
Deep Dive into the Network Traffic Path of the Coexistence of Ambient and SidecarIstio has two deployment modes: ambient mode and sidecar mode. The former is still on its way to maturity, while the latter is the classic one. The coexistence of ambient mode and sidecar mode is therefore likely to be a normal deployment pattern, which is why this blog may be helpful for Istio users.

Background

In the architecture of modern microservices, communication and management among services is critical. To address this challenge, Istio emerged as a service mesh technology. It provides traffic control, security, and strong observability by utilizing sidecars. In order to further improve the adaptability and flexibility of Istio, the Istio community began to explore a new mode - ambient mode. In this mode, Istio no longer relies on explicit sidecar injection, but achieves communication and mesh management among services through ztunnel and waypoint proxies. Ambient also brings a series of improvements, such as lower resource consumption, simpler deployment, and more flexible configuration options. When enabling ambient mode, we no longer have to restart pods, which enables Istio to play a better role in various scenarios.

There are many blogs, which can be found in the Reference Resources section of this blog, that introduce and analyze ambient, and this blog will analyze the network traffic path in Istio ambient and sidecar modes.

To clarify the network traffic paths and make it easier to understand, this blog post explores the following two scenarios with corresponding diagrams:

  • The network path of services in ambient mode to services in sidecar mode
  • The network path of services in sidecar mode to services in ambient mode

Information about the analysis

The analysis is based on Istio 1.18.2, where ambient mode uses iptables for redirection.

Ambient mode sleep to sidecar mode httpbin

Deployment and configuration for the first scenario

  • sleep is deployed in namespace foo
    • sleep pod is scheduled to Node A
  • httpbin is deployed in namespace bar
    • httpbin is scheduled to Node B
  • foo namespace enables ambient mode (foo namespace contains label: istio.io/dataplane-mode=ambient)
  • bar namespace enables sidecar injection (bar namespace contains label: istio-injection: enabled)
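
These namespace labels could be applied with commands like the following (a minimal sketch, assuming the foo and bar namespaces already exist):

$ kubectl label namespace foo istio.io/dataplane-mode=ambient
$ kubectl label namespace bar istio-injection=enabled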

With the above description, the deployment and network traffic paths are:

Ambient mode sleep to Sidecar mode httpbin

ztunnel is deployed as a DaemonSet in the istio-system namespace when ambient mode is enabled, while istio-cni and ztunnel generate iptables rules and routes for both the ztunnel pod and the application pods on each node.

All network traffic coming in/out of the pod with ambient mode enabled will go through ztunnel based on the network redirection logic. The ztunnel will then forward the traffic to the correct endpoints.

Network traffic path analysis of ambient mode sleep to sidecar mode httpbin

According to above diagram, the details of network traffic path is demonstrated as below:

(1) (2) (3) Request traffic from the sleep service is sent out through the sleep pod’s veth, where it is marked and forwarded to the istioout device on the node by the iptables and route rules. The istioout device on node A is one end of a Geneve tunnel; the other end is pistioout, which is inside the ztunnel pod on the same node.

(4) (5) When the traffic arrives through the pistioout device, the iptables rules inside the pod intercept and redirect it through the eth0 interface in the pod to port 15001.

(6) According to the original request information, ztunnel can obtain the endpoint list of the target service. It will then handle sending the request to the endpoint, such as one of the httpbin pods. Finally, the request traffic would get into the httpbin pod via the container network.

(7) The request traffic arriving in httpbin pod will be intercepted and redirected through port 15006 of the sidecar by its iptables rules.

(8) Sidecar handles the inbound request traffic coming in via port 15006, and forwards the traffic to the httpbin container in the same pod.

Sidecar mode sleep to ambient mode httpbin and helloworld

Deployment and configuration for the second scenario

  • sleep is deployed in namespace foo
    • sleep pod is scheduled to Node A
  • httpbin deployed in namespace bar-1
    • httpbin pod is scheduled to Node B
    • the waypoint proxy of httpbin is disabled
  • helloworld is deployed in namespace bar-2
    • helloworld pod is scheduled to Node D
    • the waypoint proxy of helloworld is enabled
    • the waypoint proxy is scheduled to Node C
  • foo namespace enables sidecar injection (foo namespace contains label: istio-injection: enabled)
  • bar-1 namespace enables ambient mode (bar-1 namespace contains label: istio.io/dataplane-mode=ambient)

With the above description, the deployment and network traffic paths are:

sleep to httpbin and helloworld

Network traffic path analysis of sidecar mode sleep to ambient mode httpbin

Network traffic path of a request from the sleep pod (sidecar mode) to the httpbin pod (ambient mode) is depicted in the top half of the diagram above.

(1) (2) (3) (4) The sleep container sends a request to httpbin. The request is intercepted by iptables rules and directed to port 15001 on the sidecar in the sleep pod. The sidecar then handles the request and routes the traffic based on the configuration received from istiod (the control plane), forwarding the traffic to the IP address of the httpbin pod on node B.

(5) (6) After the request is sent to the device pair (veth httpbin <-> eth0 inside the httpbin pod), it is intercepted and forwarded by the iptables and route rules to the istioin device on node B, where the httpbin pod is running. The istioin device on node B and the pistioin device inside the ztunnel pod on the same node are connected by a Geneve tunnel.

(7) (8) After the request enters the pistioin device of the ztunnel pod, the iptables rules in the ztunnel pod intercept and redirect the traffic to port 15008 of the ztunnel proxy running inside the pod.

(9) The traffic arriving at port 15008 is considered an inbound request, and the ztunnel then forwards the request to the httpbin pod on the same node B.

Network traffic path analysis of sidecar mode sleep to ambient mode helloworld via waypoint proxy

Compared with the top part of the diagram, the bottom part inserts a waypoint proxy in the path between the sleep, ztunnel and helloworld pods. The Istio control plane has all the service information and configuration of the service mesh. When the helloworld pod is deployed with a waypoint proxy, the EDS configuration for the helloworld service received by the sleep pod’s sidecar is changed to the envoy_internal_address type. This causes the request traffic going through the sidecar to be forwarded to port 15008 of the waypoint proxy on node C via the HTTP Based Overlay Network (HBONE) protocol.

The waypoint proxy is an Envoy proxy instance and forwards the request to the helloworld pod based on the routing configuration received from the control plane. Once the traffic reaches the veth on node D, it follows the same path as in the previous scenario.

Wrapping up

The sidecar mode is what made Istio a great service mesh. However, the sidecar mode can also cause problems, as it requires the app and sidecar containers to run in the same pod. Istio ambient mode implements communication among services through centralized proxies (ztunnel and waypoint). Ambient mode provides greater flexibility and scalability, reduces resource consumption since it doesn’t require a sidecar for each pod in the mesh, and allows more precise configuration. There is therefore no doubt that ambient mode is the next evolution of Istio. The coexistence of sidecar and ambient modes is likely to last a very long time: although ambient mode is still in the alpha stage and sidecar mode remains the recommended mode of Istio, ambient gives users a more lightweight option for running and adopting the Istio service mesh as it moves towards beta and future releases.


]]>
Mon, 18 Sep 2023 00:00:00 +0000/v1.24//blog/2023/traffic-for-ambient-and-sidecar//v1.24//blog/2023/traffic-for-ambient-and-sidecar/trafficambientsidecarcoexistence
Istio Announces Winners of 2023 Steering Committee ElectionThe Istio Steering Committee is pleased to announce the four winners of the 2023 election for Community Seats. The winners are:

  • Craig Box, ARMO
  • Iris Ding, Intel
  • Lin Sun, Solo.io
  • Faseela K, Ericsson Software Technology

The winners will serve on the Steering Committee for one year, starting on September 1, 2023. They will be responsible for helping to guide the development and governance of Istio, the world’s most popular service mesh.

The election was held in August 2023, and was open to any member of the Istio community who submitted a pull request or made other significant project contributions. Over 120 eligible voters evaluated the candidates on their contributions to Istio, their experience in open source governance, and their commitment to the project’s mission.

In addition to the four Community Seats, the Istio Steering Committee also consists of nine Contribution Seats, which are awarded proportionally to organizations which made significant contributions to the project. The Contribution Seats for 2023 are held by:

  • Google
  • IBM / Red Hat
  • Huawei

The Steering Committee congratulates the winners of the election, and looks forward to working with them to continue to grow and improve Istio as a successful and sustainable open source project. We encourage everyone to get involved in the Istio community, contribute, vote, and help us shape the future of service mesh.

]]>
Wed, 16 Aug 2023 00:00:00 +0000/v1.24//blog/2023/steering-election-results//v1.24//blog/2023/steering-election-results/istiosteeringgovernancecommunityelection
Kubernetes Native Sidecars in IstioIf you have heard anything about service meshes, it is that they work using the sidecar pattern: a proxy server is deployed alongside your application code. The sidecar pattern is just that: a pattern. Up until this point, there has been no formal support for sidecar containers in Kubernetes at all.

This has caused a number of problems: what if you have a job that terminates by design, but a sidecar container that doesn’t? This exact use case is the most popular ever on the Kubernetes issue tracker.

A formal proposal for adding sidecar support in Kubernetes was raised in 2019. With many stops and starts along the way, and after a reboot of the project last year, formal support for sidecars is being released to Alpha in Kubernetes 1.28. Istio has implemented support for this feature, and in this post you can learn how to take advantage of it.

Sidecar woes

Sidecar containers give a lot of power, but come with some issues. While containers within a pod can share some things, their lifecycles are entirely decoupled. To Kubernetes, both of these containers are functionally the same.

However, in Istio they are not the same - the Istio container is required for the primary application container to run, and has no value without the primary application container.

This mismatch in expectation leads to a variety of issues:

  • If the application container starts faster than Istio’s container, it cannot access the network. This wins the most +1’s on Istio’s GitHub by a landslide.
  • If Istio’s container shuts down before the application container, the application container cannot access the network.
  • If an application container intentionally exits (typically from usage in a Job), Istio’s container will still run and keep the pod running indefinitely. This is also a top GitHub issue.
  • InitContainers, which run before Istio’s container starts, cannot access the network.

Countless hours have been spent in the Istio community and beyond to work around these issues - to limited success.
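
As a concrete illustration of the Job scenario above, here is a minimal sketch (with illustrative names and output) of what happens in an injected namespace without native sidecar support: the job's only container exits, but istio-proxy keeps running and the pod never completes.

$ kubectl label namespace default istio-injection=enabled
$ kubectl create job sleep-once --image=busybox -- /bin/sleep 1
$ kubectl get pods -l job-name=sleep-once
NAME               READY   STATUS    RESTARTS   AGE
sleep-once-abc12   1/2     Running   0          5m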

Fixing the root cause

While increasingly-complex workarounds in Istio can help alleviate the pain for Istio users, ideally all of this would just work - and not just for Istio. Fortunately, the Kubernetes community has been hard at work to address these directly in Kubernetes.

In Kubernetes 1.28, a new feature to add native support for sidecars was merged, closing out over 5 years of ongoing work. With this merged, all of our issues can be addressed without workarounds!

While we are on the “GitHub issue hall of fame”, these two issues are the #1 and #6 all-time issues in Kubernetes - and have finally been closed!

A special thanks goes to the huge group of individuals involved in getting this past the finish line.

Trying it out

Although Kubernetes 1.28 was only just released, the new SidecarContainers feature is Alpha (and therefore off by default), and Istio's support for the feature has not yet shipped, we can still try it out today - just don't try this in production!

First, we need to spin up a Kubernetes 1.28 cluster, with the SidecarContainers feature enabled:

$ cat <<EOF | kind create cluster --name sidecars --image gcr.io/istio-testing/kind-node:v1.28.0 --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  SidecarContainers: true
EOF

Then we can download the latest Istio 1.19 pre-release (as 1.19 is not yet out). I used Linux here. This is a pre-release of Istio, so again - do not try this in production! When we install Istio, we will enable the feature flag for native sidecar support and turn on access logs to help demo things later.

$ TAG=1.19.0-beta.0
$ curl -L https://github.com/istio/istio/releases/download/$TAG/istio-$TAG-linux-amd64.tar.gz | tar xz
$ ./istioctl install --set values.pilot.env.ENABLE_NATIVE_SIDECARS=true -y --set meshConfig.accessLogFile=/dev/stdout

And finally we can deploy a workload:

$ kubectl label namespace default istio-injection=enabled
$ kubectl apply -f samples/sleep/sleep.yaml

Let’s look at the pod:

$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
sleep-7656cf8794-8fhdk   2/2     Running   0          51s

Everything looks normal at first glance… If we look under the hood, we can see the magic, though.

$ kubectl get pod -o "custom-columns="\
"NAME:.metadata.name,"\
"INIT:.spec.initContainers[*].name,"\
"CONTAINERS:.spec.containers[*].name"

NAME                     INIT                     CONTAINERS
sleep-7656cf8794-8fhdk   istio-init,istio-proxy   sleep

Here we can see all the containers and initContainers in the pod.

Surprise! istio-proxy is now an initContainer.

More specifically, it is an initContainer with restartPolicy: Always set (a new field, enabled by the SidecarContainers feature). This tells Kubernetes to treat it as a sidecar.

This means that later containers in the list of initContainers, and all normal containers will not start until the proxy container is ready. Additionally, the pod will terminate even if the proxy container is still running.
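
Conceptually, the injected pod spec now looks something like this simplified sketch (not the literal injector output; most fields omitted):

apiVersion: v1
kind: Pod
metadata:
  name: sleep
spec:
  initContainers:
  - name: istio-init       # sets up iptables redirection and runs to completion, as before
    image: docker.io/istio/proxyv2
  - name: istio-proxy      # the sidecar proxy
    image: docker.io/istio/proxyv2
    restartPolicy: Always  # the new field that marks this initContainer as a sidecar
  containers:
  - name: sleep            # starts only after istio-proxy reports ready
    image: istio/base
    command: ["/bin/sleep", "infinity"]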

Init container traffic

To put this to the test, let’s make our pod actually do something. Here we deploy a simple pod that sends a request in an initContainer. Normally, this would fail.

apiVersion: v1
kind: Pod
metadata:
  name: sleep
spec:
  initContainers:
  - name: check-traffic
    image: istio/base
    command:
    - curl
    - httpbin.org/get
  containers:
  - name: sleep
    image: istio/base
    command: ["/bin/sleep", "infinity"]

Checking the proxy container, we can see the request both succeeded and went through the Istio sidecar:

$ kubectl logs sleep -c istio-proxy | tail -n1
[2023-07-25T22:00:45.703Z] "GET /get HTTP/1.1" 200 - via_upstream - "-" 0 1193 334 334 "-" "curl/7.81.0" "1854226d-41ec-445c-b542-9e43861b5331" "httpbin.org" ...

If we inspect the pod, we can see our sidecar now runs before the check-traffic initContainer:

$ kubectl get pod -o "custom-columns="\
"NAME:.metadata.name,"\
"INIT:.spec.initContainers[*].name,"\
"CONTAINERS:.spec.containers[*].name"

NAME    INIT                                  CONTAINERS
sleep   istio-init,istio-proxy,check-traffic   sleep

Exiting pods

Earlier, we mentioned that when applications exit (common in Jobs), the pod would live forever. Fortunately, this is addressed as well!

First we deploy a pod that will exit after one second and doesn’t restart:

apiVersion: v1
kind: Pod
metadata:
  name: sleep
spec:
  restartPolicy: Never
  containers:
  - name: sleep
    image: istio/base
    command: ["/bin/sleep", "1"]

And we can watch its progress:

$ kubectl get pods -w
NAME    READY   STATUS     RESTARTS   AGE
sleep   0/2     Init:1/2   0          2s
sleep   0/2     PodInitializing   0          2s
sleep   1/2     PodInitializing   0          3s
sleep   2/2     Running           0          4s
sleep   1/2     Completed         0          5s
sleep   0/2     Completed         0          12s

Here we can see the application container exited, and shortly after Istio’s sidecar container exits as well. Previously, the pod would be stuck in Running, while now it can transition to Completed. No more zombie pods!

What about ambient mode?

Last year, Istio announced ambient mode - a new data plane mode for Istio that doesn’t rely on sidecar containers. So with ambient mode coming, does any of this even matter?

I would say a resounding “Yes”!

While the impacts of sidecar are lessened when ambient mode is used for a workload, I expect that almost all large scale Kubernetes users have some sort of sidecar in their deployments. This could be Istio workloads they don’t want to migrate to ambient, that they haven’t yet migrated, or things unrelated to Istio. So while there may be fewer scenarios where this matters, it still is a huge improvement for the cases where sidecars are used.

You may wonder the opposite - if all our sidecar woes are addressed, why do we need ambient mode at all? There are still a variety of benefits ambient brings with these sidecar limitations addressed. For example, this blog post goes into details about why decoupling proxies from workloads is advantageous.

Try it out yourself

We encourage the adventurous readers to try this out themselves in testing environments! Feedback for these experimental and alpha features is critical to ensure they are stable and meeting expectations before promoting them. If you try it out, let us know what you think in the Istio Slack!

In particular, the Kubernetes team is interested in hearing more about:

  • Handling of shutdown sequence, especially when there are multiple sidecars involved.
  • Backoff restart handling when sidecar containers are crashing.
  • Edge cases they have not yet considered.
]]>
Tue, 15 Aug 2023 00:00:00 +0000/v1.24//blog/2023/native-sidecars//v1.24//blog/2023/native-sidecars/istiosidecarskubernetes
Using Accelerated Offload Connection Load Balancing in IstioWhat is connection load balancing?

Load balancing is a core networking solution used to distribute traffic across multiple servers in a server farm. Load balancers improve application availability and responsiveness and prevent server overload. Each load balancer sits between client devices and backend servers, receiving and then distributing incoming requests to any available server capable of fulfilling them.

A typical web server has multiple workers (processes or threads). If many clients connect to a single worker, that worker becomes busy and introduces long tail latency while other workers sit idle, degrading the performance of the web server. Connection load balancing, also known as connection balancing, is the solution to this situation.

What does Istio do for connection load balancing?

Istio uses Envoy as the data plane.

Envoy provides a connection load balancing implementation called Exact connection balancer. As its name says, a lock is held during balancing so that connection counts are nearly exactly balanced between workers. It is “nearly” exact in the sense that a connection might close in parallel thus making the counts incorrect, but this should be rectified on the next accept. This balancer sacrifices accept throughput for accuracy and should be used when there are a small number of connections that rarely cycle, e.g., service mesh gRPC egress.
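
For reference, enabling Envoy's exact balancer is a one-line listener setting; this fragment is raw Envoy listener configuration, shown only for illustration (in Istio it would typically be applied through an EnvoyFilter patch like the DLB example later in this post):

connection_balance_config:
  exact_balance: {}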

Obviously, it is not suitable for an ingress gateway, since an ingress gateway accepts thousands of connections within a short time, and the resource cost of the lock causes a big drop in throughput.

Envoy has now integrated Intel® Dynamic Load Balancing (Intel® DLB) connection load balancing to accelerate high-connection-count cases like the ingress gateway.

How Intel® Dynamic Load Balancing accelerates connection load balancing in Envoy

Intel DLB is a hardware managed system of queues and arbiters connecting producers and consumers. It is a PCI device envisaged to live in the server CPU uncore and can interact with software running on cores, and potentially with other devices.

Intel DLB implements the following load balancing features:

  • Offloads queue management from software — useful where there are significant queuing-based costs.
    • Especially with multi-producer / multi-consumer scenarios and enqueue batching to multiple destinations.
    • In software, locks are normally required to access shared queues, which adds overhead. Intel DLB implements lock-free access to shared queues.
  • Dynamic, flow aware load balancing and reordering.
    • Ensures equal distribution of tasks and better CPU core utilization. Can provide flow-based atomicity if required.
    • Distributes high bandwidth flows across many cores without loss of packet order.
    • Better determinism and avoids excessive queuing latencies.
    • Uses less IO memory footprint and saves DDR Bandwidth.
  • Priority queuing (up to 8 levels) — allows for QOS.
    • Lower latency for traffic that is latency sensitive.
    • Optional delay measurements in the packets.
  • Scalability
    • Allows dynamic sizing of applications, seamless scale up/down.
    • Power aware; application can drop workers to lower power state in cases of lighter load.

There are three types of load balancing queues:

  • Unordered: For multiple producers and consumers. The order of tasks is not important, and each task is assigned to the processor core with the lowest current load.
  • Ordered: For multiple producers and consumers where the order of tasks is important. When multiple tasks are processed by multiple processor cores, they must be rearranged in the original order.
  • Atomic: For multiple producers and consumers, where tasks are grouped according to certain rules. These tasks are processed using the same set of resources and the order of tasks within the same group is important.

An ingress gateway is expected to process as much data as possible as quickly as possible, so Intel DLB connection load balancing uses an unordered queue.

How to use Intel DLB connection load balancing in Istio

With the 1.17 release, Istio officially supports Intel DLB connection load balancing.

The following steps show how to use Intel DLB connection load balancing in an Istio ingress gateway on an SPR (Sapphire Rapids) machine, assuming a Kubernetes cluster is already running.

Step 1: Prepare DLB environment

Install the Intel DLB driver by following the instructions on the Intel DLB driver official site.

Install the Intel DLB device plugin with the following command:

$ kubectl apply -k https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/dlb_plugin?ref=v0.26.0

For more details about the Intel DLB device plugin, please refer to Intel DLB device plugin homepage.

You can check the Intel DLB device resource:

$ kubectl describe nodes | grep dlb.intel.com/pf
  dlb.intel.com/pf:   2
  dlb.intel.com/pf:   2
...

Step 2: Download Istio

In this blog we use Istio 1.17.2. Let's download the release:

$ curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.17.2 TARGET_ARCH=x86_64 sh -
$ cd istio-1.17.2
$ export PATH=$PWD/bin:$PATH

You can check the version is 1.17.2:

$ istioctl version
no running Istio pods in "istio-system"
1.17.2

Step 3: Install Istio

Create an installation configuration for Istio. Notice that we assign 4 CPUs and 1 DLB device to the ingress gateway, and set concurrency to 4, equal to the number of CPUs.

$ cat > config.yaml << EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  profile: default
  components:
    ingressGateways:
    - enabled: true
      name: istio-ingressgateway
      k8s:
        overlays:
          - kind: Deployment
            name: istio-ingressgateway
        podAnnotations:
          proxy.istio.io/config: |
            concurrency: 4
        resources:
          requests:
            cpu: 4000m
            memory: 4096Mi
            dlb.intel.com/pf: '1'
          limits:
            cpu: 4000m
            memory: 4096Mi
            dlb.intel.com/pf: '1'
        hpaSpec:
          maxReplicas: 1
          minReplicas: 1
  values:
    telemetry:
      enabled: false
EOF

Use istioctl to install:

$ istioctl install -f config.yaml --set values.gateways.istio-ingressgateway.runAsRoot=true -y
✔ Istio core installed
✔ Istiod installed
✔ Ingress gateways installed
✔ Installation complete
Making this installation the default for injection and validation.

Thank you for installing Istio 1.17.  Please take a few minutes to tell us about your install/upgrade experience!  https://forms.gle/hMHGiwZHPU7UQRWe9

Step 4: Setup Backend Service

Since we want to use DLB connection load balancing in Istio ingress gateway, we need to create a backend service first.

We’ll use an Istio-provided sample to test, httpbin.

$ kubectl apply -f samples/httpbin/httpbin.yaml
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  # The selector matches the ingress gateway pod labels.
  # If you installed Istio using Helm following the standard documentation, this would be "istio=ingress"
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "httpbin.example.com"
EOF
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "httpbin.example.com"
  gateways:
  - httpbin-gateway
  http:
  - match:
    - uri:
        prefix: /status
    - uri:
        prefix: /delay
    route:
    - destination:
        port:
          number: 8000
        host: httpbin
EOF

You have now created a virtual service configuration for the httpbin service containing two route rules that allow traffic for paths /status and /delay.

The gateways list specifies that only requests through your httpbin-gateway are allowed. All other external requests will be rejected with a 404 response.

Step 5: Enable DLB Connection Load Balancing

$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: dlb
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
  - applyTo: LISTENER
    match:
      context: GATEWAY
    patch:
      operation: MERGE
      value:
        connection_balance_config:
            extend_balance:
              name: envoy.network.connection_balance.dlb
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.network.connection_balance.dlb.v3alpha.Dlb
EOF

If you check the log of the ingress gateway pod istio-ingressgateway-xxxx, you should see log entries similar to:

$ export POD="$(kubectl get pods -n istio-system | grep gateway | awk '{print $1}')"
$ kubectl logs -n istio-system ${POD} | grep dlb
2023-05-05T06:16:36.921299Z     warning envoy config external/envoy/contrib/network/connection_balance/dlb/source/connection_balancer_impl.cc:46        dlb device 0 is not found, use dlb device 3 instead     thread=35

Envoy will auto detect and choose the DLB device.

Step 6: Test

$ export HOST="<YOUR-HOST-IP>"
$ export PORT="$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')"
$ curl -s -I -HHost:httpbin.example.com "http://${HOST}:${PORT}/status/200"
HTTP/1.1 200 OK
server: istio-envoy
...

Note that you use the -H flag to set the Host HTTP header to httpbin.example.com, since you have no DNS binding for that host and are simply sending your request to the ingress IP.

You can also add the DNS binding in /etc/hosts and remove -H flag:

$ echo "$HOST httpbin.example.com" >> /etc/hosts
$ curl -s -I "http://httpbin.example.com:${PORT}/status/200"
HTTP/1.1 200 OK
server: istio-envoy
...

Access any other URL that has not been explicitly exposed. You should see an HTTP 404 error:

$ curl -s -I -HHost:httpbin.example.com "http://${HOST}:${PORT}/headers"
HTTP/1.1 404 Not Found
...

You can turn on debug log level to see more DLB related logs:

$ istioctl pc log ${POD}.istio-system --level debug
istio-ingressgateway-665fdfbf95-2j8px.istio-system:
active loggers:
  admin: debug
  alternate_protocols_cache: debug
  aws: debug
  assert: debug
  backtrace: debug
...

Run curl to send one request and you will see something like the following:

$ kubectl logs -n istio-system ${POD} | grep dlb
2023-05-05T06:16:36.921299Z     warning envoy config external/envoy/contrib/network/connection_balance/dlb/source/connection_balancer_impl.cc:46        dlb device 0 is not found, use dlb device 3 instead     thread=35
2023-05-05T06:37:45.974241Z     debug   envoy connection external/envoy/contrib/network/connection_balance/dlb/source/connection_balancer_impl.cc:269   worker_3 dlb send fd 45 thread=47
2023-05-05T06:37:45.974427Z     debug   envoy connection external/envoy/contrib/network/connection_balance/dlb/source/connection_balancer_impl.cc:286   worker_0 get dlb event 1        thread=46
2023-05-05T06:37:45.974453Z     debug   envoy connection external/envoy/contrib/network/connection_balance/dlb/source/connection_balancer_impl.cc:303   worker_0 dlb recv 45    thread=46
2023-05-05T06:37:45.975215Z     debug   envoy connection external/envoy/contrib/network/connection_balance/dlb/source/connection_balancer_impl.cc:283   worker_0 dlb receive none, skip thread=46

For more details about Istio Ingress Gateway, please refer to Istio Ingress Gateway Official Doc.

]]>
Tue, 08 Aug 2023 00:00:00 +0000/v1.24//blog/2023/dlb-connection-balancing//v1.24//blog/2023/dlb-connection-balancing/IstioDLBgateways
Announcing Istio's graduation within the CNCFWe are delighted to announce that Istio is now a graduated Cloud Native Computing Foundation (CNCF) project.

We would like to thank our TOC sponsors Emily Fox and Nikhita Raghunath, and everyone who has collaborated over the past six years on Istio’s design, development, and deployment.

As before, project work continues uninterrupted. We were excited to bring ambient mesh to Alpha in Istio 1.18 and are continuing to drive it to production readiness. Sidecar deployments remain the recommended method of using Istio, and our 1.19 release will support a new sidecar container feature in Alpha in Kubernetes 1.28.

We have been delighted to welcome Microsoft to our community after their decision to archive the Open Service Mesh project and collaborate together on Istio. As the third most active CNCF project in terms of PRs, and with support from over 20 vendors and dozens of contributing companies, there is simply no better choice for a service mesh.

We would like to invite the Istio community to submit a talk to the upcoming virtual IstioCon 2023, the companion full day, in-person event co-located with KubeCon China in Shanghai, or Istio Day co-located with KubeCon NA in Chicago.

Watch a video

In this video for Techstrong TV, I talk about the history of the project, and what graduation means to us.

Words of support from our alumni

When we announced our incubation, we mentioned that the journey began with Istio’s inception in 2016. One of the great things about collaborative open source projects is that people come and go from employers, but their affiliation with a project can remain. Some of our original contributors founded companies based on Istio; some moved to other companies that support it; and some are still working on it at Google or IBM, six years later.

The announcement from the CNCF and blog posts from Intel, Red Hat, Solo.io, Tetrate, VMware and DaoCloud summarize the thoughts and feelings of those working on the project today.

We also reached out to some contributors who have moved on from the project, to share their thoughts.

]]>
Wed, 12 Jul 2023 00:00:00 +0000/v1.24//blog/2023/istio-graduates-within-cncf//v1.24//blog/2023/istio-graduates-within-cncf/IstioCNCF
Istio Day North America 2023, Twice The Fun!

We all had a blast at Istio Day Europe in April. The event was incredibly well received, but organizers and attendees alike felt that a half-day was not enough to showcase all that Istio has to offer. Due to the overwhelming response, we are glad to share with all of you that Istio Day North America is going to be a full-day event, co-located with KubeCon North America in Chicago.

Submit a talk

We now encourage Istio users, developers, partners, and advocates to submit a session proposal through the CNCF event portal, which is open until August 6th.

We want to see real world examples, case studies, and success stories that can inspire newcomers to use Istio in production. The content will cover introductory to advanced levels, split into four main topic tracks:

  • New Features: What have you been working on that the community should know about?
  • Case Studies: How have you built a platform or service on top of Istio?
  • Istio Recipes: How you can solve a specific business problem using Istio.
  • Project Updates: The evolution of Istio, and the latest updates from the project maintainers.

You can pick one of these formats to submit a session proposal:

  • Presentation: 25 minutes, 1 or 2 speaker(s) presenting a topic
  • Panel Discussion: 35 minutes of discussion among 3 to 5 speakers
  • Lightning Talk: A brief 5-minute presentation, maximum of 1 speaker

Accepted speakers will receive a complimentary All-Access In-Person ticket for all four days of KubeCon + CloudNativeCon.

Timeline

  • Thursday, June 15: CFP + Sponsor Prospectus Launch
  • Sunday, August 6: CFP Closes
  • Tuesday, August 8 - Monday, August 21: CFP Review Window
  • Thursday, September 7: Speaker Notifications
  • Week of September 11: Schedule Launch & Announcement
  • Wednesday, September 20: Sponsor Sales Close
  • Monday, November 6: Event Day

Do you want to put your product or service in front of the most discerning Cloud Native users: those who demand 25% more conference than the crowd? Check out page 19 of the CNCF events prospectus to learn more. Contact sponsor@cncf.io to secure your sponsorship today! Signed contracts must be received by September 20.

Register to attend

Istio Day is a KubeCon + CloudNativeCon North America CNCF-hosted Co-located Event. In-person KubeCon + CloudNativeCon attendees have the option to buy an All-Access ticket which includes entry to all the CNCF-hosted “day 0” events, as well as the main three days of the conference. You must be attending KubeCon to attend Istio Day, but virtual registration options are available, and the recordings will be posted to YouTube soon after the event.

For those of you who can’t make it, keep your eyes peeled for announcements of IstioCon 2023 (Virtual) and Istio Day China.

Stay tuned to hear more about the event, and we hope you can join us in Chicago for Istio Day!

]]>
Fri, 16 Jun 2023 00:00:00 +0000/v1.24//blog/2023/istioday-kubecon-na-cfp//v1.24//blog/2023/istioday-kubecon-na-cfp/Istio DayIstioConIstioconferenceKubeConCloudNativeCon
Istio at KubeCon Europe 2023The open source and cloud native community gathered from 18th to 21st April in Amsterdam for the first KubeCon of 2023. The four-day conference, organized by the Cloud Native Computing Foundation, was special for Istio, as we evolved from a participant at ServiceMeshCon to hosting our first official project co-located event.

Istio Day Europe 2023, Welcome

Istio Day kicked off with an opening keynote from the Program Committee chairs, Mitch Connors and Faseela K. The event was packed with great content, ranging from new features to end user talks, and the hall was always jam-packed. The opening keynote was an ice-breaker with some Istio fun in the form of a pop quiz, and recognition for the day-to-day efforts of our contributors, maintainers, release managers, and users.

Istio Day Europe 2023, Opening Keynote

This was followed by a 2023 roadmap update session from TOC members Lin Sun and Louis Ryan. We had our much awaited session on the security posture of Ambient Mesh, from Christian Posta and John Howard, which stirred some interesting discussions in the community. After this we stepped into our first end user talk from John Keates from Wehkamp, a local Dutch company, followed by speakers from Bloomberg, Alexa Griffith and Zhenni Fu, on how they secure their highly privileged financial information using Istio. Istio Day witnessed more focus on security, which became even more prominent when Zack Butcher talked about using Istio for Controls Compliance. We also had lightning talks covering faster Istio development environments, guide for Istio resource isolation and securing hybrid cloud deployments from Mitch Connors, Zhonghu Xu and Matt Turner respectively.

Istio Day Europe 2023, Jam packed sessions

A number of our ecosystem members had Istio-related announcements at the event. Microsoft announced Istio as a managed add-on for Azure Kubernetes Service, and support for Istio is now generally available in D2iQ Kubernetes Platform.

Tetrate announced Tetrate Service Express, an Istio-based service connectivity, security and resilience automation solution for Amazon EKS, and Solo.io announced Gloo Fabric, with Istio-based application networking capabilities expanded to VM-based, container, and serverless applications across cloud environments.

Istio’s presence at the conference did not end with Istio Day. The second day keynote started with a project update video from Lin Sun. It was also a proud moment for us, when our steering committee member Craig Box was recognized as a CNCF mentor in the keynote. The maintainer track for Istio presented by TOC member Neeraj Poddar grabbed great attention as he talked about the current ongoing efforts and future roadmap of Istio. The talk, and the size of the audience, underlined why Istio continues to be the most popular service mesh in the industry.

KubeCon Europe 2023, Question: How many of you use Istio in production?

Several other sessions at KubeCon were based on Istio, and almost all of them had a huge crowd in attendance.

Istio had a full time kiosk in the KubeCon project pavilion, with the majority of questions asked being on the status of our CNCF graduation. We are so excited to know that our users are eagerly waiting for news of our graduation, and we promise we are actively working towards it!

KubeCon Europe 2023, Istio Kiosk

Many of our TOC members and maintainers also offered support at the booth, where a lot of interesting discussions happened around Istio Ambient Mesh as well.

KubeCon Europe, More support at Istio Kiosk

Another highlight was Istio TOC and steering members and authors Lin Sun and Christian Posta signing copies of the “Istio Ambient Explained” book.

KubeCon Europe 2023, Ambient Mesh book signing by authors

Last, but not least, we would like to express our heartfelt gratitude to our platinum sponsors Tetrate, for supporting Istio Day!

2023 is going to be really big for Istio, with more events planned for the coming months! Stay tuned for updates on IstioCon 2023 and Istio’s presence at KubeCon in China and North America.

]]>
Thu, 27 Apr 2023 00:00:00 +0000/v1.24//blog/2023/istio-at-kubecon-eu//v1.24//blog/2023/istio-at-kubecon-eu/Istio DayIstioConIstioconferenceKubeConCloudNativeCon
Comprehensive Network Security at SplunkWith dozens of tools for securing your network available, it is easy to find tutorials and demonstrations illustrating how these individual tools make your network more secure by adding identity, policy, and observability to your traffic. What is often less clear is how these tools interoperate to provide comprehensive security for your network in production. How many tools do you need? When is your network secure enough?

This post will explore the tools and practices leveraged by Splunk to secure their Kubernetes network infrastructure, starting with VPC design and connectivity and going all the way up the stack to HTTP Request based security. Along the way, we’ll see what it takes to provide comprehensive network security for your cloud native stack, how these tools interoperate, and where some of them can improve. Splunk uses a variety of tools to secure their network, including:

  • AWS Functionality
  • Kubernetes
  • Istio
  • Envoy
  • Aviatrix

About Splunk’s Use Case

Splunk is a technology company that provides a platform for collecting, analyzing and visualizing data generated by various sources. It is primarily used for searching, monitoring, and analyzing machine-generated big data through a web-style interface. Splunk Cloud is an initiative to move Splunk’s internal infrastructure to a cloud native architecture. Today Splunk Cloud consists of over 35 fully replicated clusters in AWS and GCP in regions around the world.

Securing Layer 3/4: AWS, Aviatrix and Kubernetes

At Splunk Cloud, we use a pattern called “cookie cutter VPCs”, where each cluster is provisioned with its own VPC, with identical private subnets for Pod and Node IPs, a public subnet for ingress and egress to and from the public internet, and an internal subnet for traffic between clusters. This keeps Pods and Nodes from separate clusters completely isolated, while allowing traffic outside the cluster to have particular rules enforced in the public and internal subnets. Additionally, this pattern avoids the possibility of RFC 1918 private IP exhaustion when leveraging many clusters.

Within each VPC, Network ACLs and Security Groups are set up to restrict connectivity to what is absolutely required. As an example, we restrict public connectivity to our Ingress nodes (that will deploy Envoy ingress gateways). In addition to ordinary east/west and north/south traffic, there are also shared services at Splunk that every cluster needs to access. Aviatrix is used to provide overlapping VPC access, while also enforcing some high level security rules (segmentation per domain).

Splunk Network Security Architecture

The next security layer in Splunk’s stack is Kubernetes itself. Validating Webhooks are used to prevent the deployment of K8S objects that would allow insecure traffic in the cluster (typically around NLBs and services). Splunk also relies on NetworkPolicies for securing and restricting Pod to Pod connectivity.
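
A minimal sketch of this kind of NetworkPolicy is shown below (the namespace name is hypothetical; the real policies are more fine-grained), restricting ingress to pods in the same namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-same-namespace
  namespace: team-a        # hypothetical namespace
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector: {}      # only pods in the same namespace may connect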

Securing Layer 7: Istio

Splunk uses Istio to enforce policy on the application layer based on the details of each request. Istio also emits Telemetry data (metrics, logs, traces) that is useful for validating request-level security.

One of the key benefits of Istio’s injection of Envoy sidecars is that Istio can provide in-transit encryption for the entire mesh without requiring any modifications to the applications. The applications send plain text HTTP requests, but the Envoy sidecar intercepts the traffic and implements Mutual TLS encryption to protect against interception or modification.
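
Mesh-wide mutual TLS of this kind is typically enforced with a PeerAuthentication resource such as the following sketch (whether Splunk uses exactly this resource is an assumption):

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system  # the root namespace, so the policy applies mesh-wide
spec:
  mtls:
    mode: STRICT           # reject plain-text traffic between workloads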

Istio manages Splunk’s ingress gateways, which receive traffic from public and internal NLBs. The gateways are managed by the platform team and run in the Istio Gateway namespace, allowing users to plug into them, but not modify them. The Gateway service is also provisioned with certificates to enforce TLS by default, and Validating Webhooks ensure that services can only connect to gateways for their own hostnames. Additionally, gateways enforce request authentication at ingress, before traffic is able to impact application pods.
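
Request authentication at the gateway is commonly expressed with a RequestAuthentication resource (usually paired with an AuthorizationPolicy requiring a valid request principal); the following is a sketch with a hypothetical identity provider:

apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: ingress-jwt
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  jwtRules:
  - issuer: "https://idp.example.com"                         # hypothetical issuer
    jwksUri: "https://idp.example.com/.well-known/jwks.json"  # hypothetical JWKS endpoint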

Because Istio and related K8S objects are relatively complex to configure, Splunk created an abstraction layer, which is a controller that configures everything for the service, including virtual services, destination rules, gateways, certificates, and more. It sets up DNS that goes directly to the right NLB. It’s a one-click solution for end-to-end network deployment. For more complex use cases, the services teams can still bypass the abstraction and configure these settings directly.

Splunk Application Platform

Pain Points

While Splunk’s architecture meets many of our needs, there are a few pain points worth discussing. Istio operates by creating as many Envoy Sidecars as application pods, which is an inefficient use of resources. In addition, when a particular application has unique needs from its sidecar, such as additional CPU or Memory, it can be difficult to adjust these settings without adjusting them for all sidecars in the mesh. Istio Sidecar injection involves a lot of magic, using a mutating webhook to add a sidecar container to every pod as it is created, which means those pods no longer match their corresponding deployments. Additionally, injection can only happen at pod creation time, which means that any time a sidecar version or parameter is updated, all pods must be restarted before they will get the new settings. Overall, this magic complicates running a service mesh in production, and adds a great deal of operational uncertainty to your application.

The Istio project is aware of these limitations, and believes they will be substantially improved by the new Ambient mode for Istio. In this mode, Layer 4 constructs like identity and encryption will be applied by a Daemon running on the node, but not in the same pod as the application. Layer 7 features will still be handled by Envoy, but Envoy will be run in an adjacent pod as part of its own deployment, rather than relying on the magic of sidecar injection. Application pods will not be modified in any way in ambient mode, which should add a good deal of predictability to service mesh operations. Ambient mode is expected to reach Alpha quality in Istio 1.18.

Conclusion

With all these layers of network security at Splunk Cloud, it is helpful to take a step back and examine the life of a request as it traverses these layers. When a client sends a request, it first connects to the NLB, which is allowed or blocked by the VPC ACLs. The NLB then proxies the request to one of the ingress nodes, which terminates TLS and inspects the request at Layer 7, choosing to allow or block it. The Envoy Gateway then validates the request using ExtAuthZ to ensure it is properly authenticated and meets quota restrictions before being allowed into the cluster. Next, the Envoy Gateway proxies the request upstream, and the network policies from Kubernetes take effect again to make sure this proxying is allowed. The sidecar on the upstream workload decrypts the request, inspects it at Layer 7, and, if allowed, sends it to the workload in clear text.

Cloud Native Network Security Matrix

Securing Splunk’s Cloud Native Network Stack while meeting the scalability needs of this large enterprise company requires careful security planning at each layer.

While applying identity, observability, and policy principles at every layer in the stack may appear redundant at first glance, each layer is able to make up for the shortcomings of the others, so that together these layers form a tight and effective barrier to unwanted access.

If you are interested in diving deeper into Splunk’s Network Security Stack, you can watch our Cloud Native SecurityCon presentation.

]]>
Mon, 03 Apr 2023 00:00:00 +0000/v1.24//blog/2023/network-security-splunk//v1.24//blog/2023/network-security-splunk/IstioSecurityUse Case
Istio Ambient Waypoint Proxy Made SimpleAmbient splits Istio’s functionality into two distinct layers, a secure overlay layer and a Layer 7 processing layer. The waypoint proxy is an optional component that is Envoy-based and handles L7 processing for workloads it manages. Since the initial ambient launch in 2022, we have made significant changes to simplify waypoint configuration, debuggability and scalability.

Architecture of waypoint proxies

Like the sidecar, the waypoint proxy is Envoy-based and is dynamically configured by Istio to serve your application's configuration. What is unique about the waypoint proxy is that it runs either per-namespace (default) or per-service account. By running outside of the application pod, a waypoint proxy can install, upgrade, and scale independently from the application, as well as reduce operational costs.

Waypoint architecture

Waypoint proxies are deployed declaratively using Kubernetes Gateway resources or the helpful istioctl command:

$ istioctl experimental waypoint generate
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: namespace
spec:
  gatewayClassName: istio-waypoint
  listeners:
  - name: mesh
    port: 15008
    protocol: HBONE

Istiod will monitor these resources and deploy and manage the corresponding waypoint deployment for users automatically.

Shift source proxy configuration to destination proxy

In the existing sidecar architecture, most traffic-shaping policies (for example, request routing, traffic shifting, or fault injection) are implemented by the source (client) proxy, while most security policies are implemented by the destination (server) proxy. This leads to a number of concerns:

  • Scaling - each source sidecar needs to know information about every other destination in the mesh. This is a polynomial scaling problem. Worse, if any destination configuration changes, we need to notify all sidecars at once.
  • Debugging - because policy enforcement is split between the client and server sidecars, it can be hard to understand the behavior of the system when troubleshooting.
  • Mixed environments - if we have systems where not all clients are part of the mesh, we get inconsistent behavior. For example, a non-mesh client wouldn’t respect a canary rollout policy, leading to unexpected traffic distribution.
  • Ownership and attribution - ideally a policy written in one namespace should only affect work done by proxies running in the same namespace. However, in this model, it is distributed and enforced by each sidecar. While Istio has designed around this constraint to make this secure, it is still not optimal.

In ambient, all policies are enforced by the destination waypoint. In many ways, the waypoint acts as a gateway into the namespace (default scope) or service account. Istio enforces that all traffic coming into the namespace goes through the waypoint, which then enforces all policies for that namespace. Because of this, each waypoint only needs to know about configuration for its own namespace.

The scalability problem, in particular, is a nuisance for users running in large clusters. If we visualize it, we can see just how big an improvement the new architecture is.

Consider a simple deployment, where we have 2 namespaces, each with 2 (color coded) deployments. The Envoy (XDS) configuration required to program the sidecars is shown as circles:

Every sidecar has configuration about all other sidecars

In the sidecar model, we have 4 workloads, each with 4 sets of configuration. If any of those configurations changed, all of them would need to be updated. In total there are 16 configurations distributed.

In the waypoint architecture, however, the configuration is dramatically simplified:

Each waypoint only has configuration for its own namespace

Here, we see a very different story. We have only 2 waypoint proxies, as each one is able to serve the entire namespace, and each one only needs configuration for its own namespace. In total we have 25% of the amount of configuration sent, even for a simple example.

If we scale each namespace up to 25 deployments with 10 pods each and each waypoint deployment with 2 pods for high availability, the numbers are even more impressive - the waypoint config distribution requires just 0.8% of the configuration distribution of the sidecar, as the table below illustrates!

Config Distribution Namespace 1 Namespace 2 Total
Sidecars 25 configurations * 250 sidecars 25 configurations * 250 sidecars 12500
Waypoints 25 configurations * 2 waypoints 25 configurations * 2 waypoints 100
Waypoints / Sidecars 0.8% 0.8% 0.8%

While we use namespace scoped waypoint proxies to illustrate the simplification above, the simplification is similar when you apply it to service account waypoint proxies.

This reduced configuration means lower resource usage (CPU, RAM, and network bandwidth) for both the control plane and data plane. While users today can see similar improvements with careful usage of exportTo in their Istio networking resources or of the Sidecar API, in ambient mode this is no longer required, making scaling a breeze.
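
For comparison, the Sidecar-API optimization mentioned above typically looks like the following sketch, which limits a namespace's egress visibility to itself and the control plane (the namespace name is hypothetical):

apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: ns1           # hypothetical namespace
spec:
  egress:
  - hosts:
    - "./*"                # only services in the same namespace
    - "istio-system/*"     # plus the control plane namespace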

What if my destination doesn’t have a waypoint proxy?

The design of ambient mode centers around the assumption that most configuration is best implemented by the service producer, rather than the service consumer. However, this isn’t always the case - sometimes we need to configure traffic management for destinations we don’t control. A common example of this would be connecting to an external service with improved resilience to handle occasional connection issues (e.g., to add a timeout for calls to example.com).
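
For comparison, in sidecar mode today such a consumer-side policy is typically expressed with a ServiceEntry plus a VirtualService, roughly like this sketch (plain HTTP on port 80 for simplicity; the 3s timeout is an arbitrary example value):

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: example-com
spec:
  hosts:
  - example.com
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: example-com-timeout
spec:
  hosts:
  - example.com
  http:
  - timeout: 3s            # fail calls to example.com that take longer than 3 seconds
    route:
    - destination:
        host: example.com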

This is an area under active development in the community, where we design how traffic can be routed to your egress gateway and how you can configure the egress gateway with your desired policies. Look out for future blog posts in this area!

A deep-dive of waypoint configuration

Assuming you have followed the ambient get started guide up to and including the control traffic section, you have deployed a waypoint proxy for the bookinfo-reviews service account to direct 90% traffic to reviews v1 and 10% traffic to reviews v2.
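
For context, the 90/10 split from the getting started guide is a standard VirtualService, roughly like the sketch below (the guide also applies a DestinationRule defining the v1 and v2 subsets):

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10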

Use istioctl to retrieve the listeners for the reviews waypoint proxy:

$ istioctl proxy-config listener deploy/bookinfo-reviews-istio-waypoint --waypoint
LISTENER              CHAIN                                                 MATCH                                         DESTINATION
envoy://connect_originate                                                       ALL                                           Cluster: connect_originate
envoy://main_internal inbound-vip|9080||reviews.default.svc.cluster.local-http  ip=10.96.104.108 -> port=9080                 Inline Route: /*
envoy://main_internal direct-tcp                                            ip=10.244.2.14 -> ANY                         Cluster: encap
envoy://main_internal direct-tcp                                            ip=10.244.1.6 -> ANY                          Cluster: encap
envoy://main_internal direct-tcp                                            ip=10.244.2.11 -> ANY                         Cluster: encap
envoy://main_internal direct-http                                           ip=10.244.2.11 -> application-protocol='h2c'  Cluster: encap
envoy://main_internal direct-http                                           ip=10.244.2.11 -> application-protocol='http/1.1' Cluster: encap
envoy://main_internal direct-http                                           ip=10.244.2.14 -> application-protocol='http/1.1' Cluster: encap
envoy://main_internal direct-http                                           ip=10.244.2.14 -> application-protocol='h2c'  Cluster: encap
envoy://main_internal direct-http                                           ip=10.244.1.6 -> application-protocol='h2c'   Cluster: encap
envoy://main_internal direct-http                                           ip=10.244.1.6 -> application-protocol='http/1.1'  Cluster: encap
envoy://connect_terminate default                                               ALL                                           Inline Route:

For requests arriving on port 15008, which by default is Istio's inbound HBONE port, the waypoint proxy terminates the HBONE connection and forwards the request to the main_internal listener to enforce any workload policies such as AuthorizationPolicy. If you are not familiar with internal listeners, they are Envoy listeners that accept user-space connections without using the system network API. The --waypoint flag added to the istioctl proxy-config command, above, instructs it to show the details of the main_internal listener, its filter chains, chain matches, and destinations.

Note 10.96.104.108 is the reviews’ service VIP and 10.244.x.x are the reviews’ v1/v2/v3 pod IPs, which you can view for your cluster using the kubectl get svc,pod -o wide command. For plain text or HBONE terminated inbound traffic, it will be matched on the service VIP and port 9080 for reviews or by pod IP address and application protocol (either ANY, h2c, or http/1.1).

Checking out the clusters for the reviews waypoint proxy, you get the main_internal cluster along with a few inbound clusters. Other than the clusters for infrastructure, the only Envoy clusters created are for services and pods running in the same service account. No clusters are created for services or pods running elsewhere.

$ istioctl proxy-config clusters deploy/bookinfo-reviews-istio-waypoint
SERVICE FQDN                         PORT SUBSET  DIRECTION   TYPE         DESTINATION RULE
agent                                -    -       -           STATIC
connect_originate                    -    -       -           ORIGINAL_DST
encap                                -    -       -           STATIC
kubernetes.default.svc.cluster.local 443  tcp     inbound-vip EDS
main_internal                        -    -       -           STATIC
prometheus_stats                     -    -       -           STATIC
reviews.default.svc.cluster.local    9080 http    inbound-vip EDS
reviews.default.svc.cluster.local    9080 http/v1 inbound-vip EDS
reviews.default.svc.cluster.local    9080 http/v2 inbound-vip EDS
reviews.default.svc.cluster.local    9080 http/v3 inbound-vip EDS
sds-grpc                             -    -       -           STATIC
xds-grpc                             -    -       -           STATIC
zipkin                               -    -       -           STRICT_DNS

Note that there are no outbound clusters in the list, which you can confirm using istioctl proxy-config cluster deploy/bookinfo-reviews-istio-waypoint --direction outbound! What’s nice is that you didn’t need to configure exportTo on any other bookinfo services (for example, the productpage or ratings services). In other words, the reviews waypoint is not made aware of any unnecessary clusters, without any extra manual configuration from you.

Display the list of routes for the reviews waypoint proxy:

$ istioctl proxy-config routes deploy/bookinfo-reviews-istio-waypoint
NAME                                                    DOMAINS MATCH              VIRTUAL SERVICE
encap                                                   *       /*
inbound-vip|9080|http|reviews.default.svc.cluster.local *       /*                 reviews.default
default

Recall that you didn't configure any Sidecar resources or exportTo configuration on your Istio networking resources. You did, however, deploy the bookinfo-productpage route to configure an ingress gateway to route to productpage, but the reviews waypoint has not been made aware of any such irrelevant routes.

Displaying the detailed information for the inbound-vip|9080|http|reviews.default.svc.cluster.local route, you’ll see the weight-based routing configuration directing 90% of the traffic to reviews v1 and 10% of the traffic to reviews v2, along with some of Istio’s default retry and timeout configurations. This confirms the traffic and resiliency policies are shifted from the source to destination oriented waypoint as discussed earlier.

$ istioctl proxy-config routes deploy/bookinfo-reviews-istio-waypoint --name "inbound-vip|9080|http|reviews.default.svc.cluster.local" -o yaml
- name: inbound-vip|9080|http|reviews.default.svc.cluster.local
  validateClusters: false
  virtualHosts:
  - domains:
    - '*'
    name: inbound|http|9080
    routes:
    - decorator:
        operation: reviews:9080/*
      match:
        prefix: /
      metadata:
        filterMetadata:
          istio:
            config: /apis/networking.istio.io/v1alpha3/namespaces/default/virtual-service/reviews
      route:
        maxGrpcTimeout: 0s
        retryPolicy:
          hostSelectionRetryMaxAttempts: "5"
          numRetries: 2
          retriableStatusCodes:
          - 503
          retryHostPredicate:
          - name: envoy.retry_host_predicates.previous_hosts
            typedConfig:
              '@type': type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate
          retryOn: connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes
        timeout: 0s
        weightedClusters:
          clusters:
          - name: inbound-vip|9080|http/v1|reviews.default.svc.cluster.local
            weight: 90
          - name: inbound-vip|9080|http/v2|reviews.default.svc.cluster.local
            weight: 10

Check out the endpoints for reviews waypoint proxy:

$ istioctl proxy-config endpoints deploy/bookinfo-reviews-istio-waypoint
ENDPOINT                                            STATUS  OUTLIER CHECK CLUSTER
127.0.0.1:15000                                     HEALTHY OK            prometheus_stats
127.0.0.1:15020                                     HEALTHY OK            agent
envoy://connect_originate/                          HEALTHY OK            encap
envoy://connect_originate/10.244.1.6:9080           HEALTHY OK            inbound-vip|9080|http/v2|reviews.default.svc.cluster.local
envoy://connect_originate/10.244.1.6:9080           HEALTHY OK            inbound-vip|9080|http|reviews.default.svc.cluster.local
envoy://connect_originate/10.244.2.11:9080          HEALTHY OK            inbound-vip|9080|http/v1|reviews.default.svc.cluster.local
envoy://connect_originate/10.244.2.11:9080          HEALTHY OK            inbound-vip|9080|http|reviews.default.svc.cluster.local
envoy://connect_originate/10.244.2.14:9080          HEALTHY OK            inbound-vip|9080|http/v3|reviews.default.svc.cluster.local
envoy://connect_originate/10.244.2.14:9080          HEALTHY OK            inbound-vip|9080|http|reviews.default.svc.cluster.local
envoy://main_internal/                              HEALTHY OK            main_internal
unix://./etc/istio/proxy/XDS                        HEALTHY OK            xds-grpc
unix://./var/run/secrets/workload-spiffe-uds/socket HEALTHY OK            sds-grpc

Note that you don’t get any endpoints related to any services other than reviews, even though you have a few other services in the default and istio-system namespace.

Wrapping up

We are very excited about the waypoint simplification focusing on destination oriented waypoint proxies. This is another significant step towards simplifying Istio’s usability, scalability and debuggability which are top priorities on Istio’s roadmap. Follow our getting started guide to try the ambient alpha build today and experience the simplified waypoint proxy!

]]>
Fri, 31 Mar 2023 00:00:00 +0000/v1.24//blog/2023/waypoint-proxy-made-simple//v1.24//blog/2023/waypoint-proxy-made-simple/istioambientwaypoint
Using eBPF for traffic redirection in Istio ambient mode

In Istio’s new ambient mode, the istio-cni component running on each Kubernetes worker node is responsible for redirecting application traffic to the zero-trust tunnel (ztunnel) on that node. By default it relies on iptables and Generic Network Virtualization Encapsulation (Geneve) overlay tunnels to achieve this redirection. We have now added support for an eBPF-based method of traffic redirection.

Why eBPF

Although performance is an essential consideration in the implementation of Istio ambient mode redirection, ease of programmability is also important, as it makes it possible to implement versatile and customized requirements. With eBPF, you can leverage additional context in the kernel to bypass complex routing and simply send packets to their final destination.

Furthermore, eBPF enables deeper visibility and additional context for packets in the kernel, allowing for more efficient and flexible management of data flow compared with iptables.

How it works

An eBPF program, attached to the traffic control (tc) ingress and egress hooks, has been compiled into the Istio CNI component. istio-cni watches pod events and attaches or detaches the eBPF program on the related network interfaces as pods are moved into or out of ambient mode.

Using an eBPF program (instead of iptables) eliminates the need for Geneve encapsulation, allowing routing to be customized in kernel space instead. This yields both performance gains and additional flexibility in routing.
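
To make the mechanism more concrete, here is a minimal, hypothetical sketch of how a compiled eBPF object could be attached to a pod’s interface with the tc tooling. istio-cni performs the equivalent steps programmatically inside the pod’s network namespace; the object file name (ambient_redirect.o) and section name (tc) are placeholders:

$ tc qdisc add dev eth0 clsact
$ tc filter add dev eth0 ingress bpf da obj ambient_redirect.o sec tc
$ tc filter add dev eth0 egress bpf da obj ambient_redirect.o sec tc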

ambient eBPF architecture

All traffic to/from the application pod will be intercepted by eBPF and redirected to the corresponding ztunnel pod. On the ztunnel side, proper redirection will be performed based on connection lookup results within the eBPF program. This provides more efficient control of the network traffic between the application and ztunnel.

How to enable eBPF redirection in Istio ambient mode

Follow the instructions in Getting Started with Ambient Mesh to set up your cluster, with a small change: when you install Istio, set the values.cni.ambient.redirectMode configuration parameter to ebpf.

$ istioctl install --set profile=ambient --set values.cni.ambient.redirectMode="ebpf"

Check the istio-cni logs to confirm eBPF redirection is on:

ambient Writing ambient config: {"ztunnelReady":true,"redirectMode":"eBPF"}
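
Assuming the default DaemonSet name, a command along the following lines should surface that log line (the DaemonSet may live in kube-system in some installations):

$ kubectl -n istio-system logs ds/istio-cni-node | grep ambient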

Performance gains

The latency and throughput (QPS) with eBPF redirection are somewhat better than with iptables. The following tests were run in a kind cluster with a Fortio client sending requests to a Fortio server, both running in ambient mode (with eBPF debug logging disabled) and on the same Kubernetes worker node.

$ fortio load -uniform -t 60s -qps 0 -c <num_connections> http://<fortio-svc-name>:8080
(Chart: Max QPS, with varying number of connections)
$ fortio load -uniform -t 60s -qps 8000 -c <num_connections> http://<fortio-svc-name>:8080
(Chart: P75 latency in ms at QPS 8000, with varying number of connections)

Wrapping up

Both eBPF and iptables have their own advantages and disadvantages when it comes to traffic redirection. eBPF is a modern, flexible, and powerful alternative that allows for more customization in rule creation and offers better performance. However, it requires a modern kernel version (4.20 or later for this redirection case), which may not be available on some systems. On the other hand, iptables is widely used and compatible with most Linux distributions, even those with older kernels. However, it lacks the flexibility and extensibility of eBPF and may have lower performance.

Ultimately, the choice between eBPF and iptables for traffic redirection will depend on the specific needs and requirements of the system, as well as the user’s level of expertise in using each tool. Some users may prefer the simplicity and compatibility of iptables, while others may require the flexibility and performance of eBPF.

There is still plenty of work to be done, including integration with various CNI plugins, and contributions to improve the ease of use would be greatly welcomed. Please join us in #ambient on the Istio slack.

]]>
Wed, 29 Mar 2023 00:00:00 +0000/v1.24//blog/2023/ambient-ebpf-redirection//v1.24//blog/2023/ambient-ebpf-redirection/istioambientztunneleBPF
Support for Dual Stack Kubernetes ClustersOver the past year, both Intel and F5 have collaborated on an effort to bring support for Kubernetes Dual-Stack networking to Istio.

Background

The journey has taken us longer than anticipated and we continue to have work to do. The team initially started with a design based on a reference implementation from F5. The design led to an RFC that caused us to re-examine our approach. Notably, there were concerns about memory and performance issues that the community wanted addressed before implementation. The original design had to duplicate Envoy configuration for listeners, clusters, routes and endpoints. Given that many people already experience Envoy memory and CPU consumption issues, early feedback urged us to completely re-evaluate this approach. Many proxies transparently handle outbound dual-stack traffic regardless of how the traffic originated, and much of the earliest feedback was to implement the same behavior in Istio and Envoy.

Redefining Dual Stack Support

Much of the feedback provided by the community for the original RFC was to update Envoy to better support dual-stack use cases internally, instead of supporting this within Istio. This led us to a new, simplified design that incorporates both the lessons learned and that feedback.

Support for Dual Stack in Istio 1.17

We have worked with the Envoy community to address numerous concerns, which is one reason why dual-stack enablement has taken a while to implement. We have implemented IP family matching for outbound listeners and support for multiple addresses per listener. Alex Xu has also been working fervently to resolve long-outstanding issues, such as giving Envoy a smarter way to pick endpoints for dual stack. Some of these improvements to Envoy, such as the ability to enable socket options on multiple addresses, have landed in the Istio 1.17 release (e.g. extra source addresses on inbound clusters).

The Envoy API changes made by the team can be found at their site at Listener addresses and bind config. Making sure we can have proper support at both the downstream and upstream connection for Envoy is important for realizing dual-stack support.

In total the team has submitted over a dozen PRs to Envoy and are working on at least a half dozen more to make Envoy adoption of dual stack easier for Istio.

Meanwhile, on the Istio side you can track the progress in Issue #40394. Progress has slowed down a bit lately as we continue working with Envoy on various issues, however, we are happy to announce experimental support for dual stack in Istio 1.17!

A Quick Experiment using Dual Stack

  1. Enable dual stack experimental support on Istio 1.17.0+ with the following:

    $ istioctl install -y -f - <<EOF
    apiVersion: install.istio.io/v1alpha1
    kind: IstioOperator
    spec:
      meshConfig:
        defaultConfig:
          proxyMetadata:
            ISTIO_DUAL_STACK: "true"
      values:
        pilot:
          env:
            ISTIO_DUAL_STACK: "true"
    EOF
  2. Create three namespaces:

    • dual-stack: tcp-echo will listen on both an IPv4 and IPv6 address.
    • ipv4: tcp-echo will listen on only an IPv4 address.
    • ipv6: tcp-echo will listen on only an IPv6 address.
    $ kubectl create namespace dual-stack
    $ kubectl create namespace ipv4
    $ kubectl create namespace ipv6
  3. Enable sidecar injection on all of those namespaces as well as the default namespace:

    $ kubectl label --overwrite namespace default istio-injection=enabled
    $ kubectl label --overwrite namespace dual-stack istio-injection=enabled
    $ kubectl label --overwrite namespace ipv4 istio-injection=enabled
    $ kubectl label --overwrite namespace ipv6 istio-injection=enabled
  4. Create tcp-echo deployments in the namespaces:

    $ kubectl apply --namespace dual-stack -f https://raw.githubusercontent.com/istio/istio/release-1.24/samples/tcp-echo/tcp-echo-dual-stack.yaml
    $ kubectl apply --namespace ipv4 -f https://raw.githubusercontent.com/istio/istio/release-1.24/samples/tcp-echo/tcp-echo-ipv4.yaml
    $ kubectl apply --namespace ipv6 -f https://raw.githubusercontent.com/istio/istio/release-1.24/samples/tcp-echo/tcp-echo-ipv6.yaml
  5. Create sleep deployment in the default namespace:

    $ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.24/samples/sleep/sleep.yaml
  6. Verify the traffic:

    $ kubectl exec -it "$(kubectl get pod -l app=sleep -o jsonpath='{.items[0].metadata.name}')" -- sh -c "echo dualstack | nc tcp-echo.dual-stack 9000"
    hello dualstack
    $ kubectl exec -it "$(kubectl get pod -l app=sleep -o jsonpath='{.items[0].metadata.name}')" -- sh -c "echo ipv4 | nc tcp-echo.ipv4 9000"
    hello ipv4
    $ kubectl exec -it "$(kubectl get pod -l app=sleep -o jsonpath='{.items[0].metadata.name}')" -- sh -c "echo ipv6 | nc tcp-echo.ipv6 9000"
    hello ipv6

Now you can experiment with dual-stack services in your environment!

Important Changes to Listeners and Endpoints

For the above experiment, you’ll notice changes have been made to both listeners and endpoints. First, check the listeners:

$ istioctl proxy-config listeners "$(kubectl get pod -n dual-stack -l app=tcp-echo -o jsonpath='{.items[0].metadata.name}')" -n dual-stack --port 9000

You will see listeners are now bound to multiple addresses, but only for dual stack services. Other services will only be listening on a single IP address.

"name": "fd00:10:96::f9fc_9000",
"address": {
    "socketAddress": {
        "address": "fd00:10:96::f9fc",
        "portValue": 9000
    }
},
"additionalAddresses": [
    {
        "address": {
            "socketAddress": {
                "address": "10.96.106.11",
                "portValue": 9000
            }
        }
    }
],

Virtual inbound addresses are now also configured to listen on both 0.0.0.0 and [::].

"name": "virtualInbound",
"address": {
    "socketAddress": {
        "address": "0.0.0.0",
        "portValue": 15006
    }
},
"additionalAddresses": [
    {
        "address": {
            "socketAddress": {
                "address": "::",
                "portValue": 15006
            }
        }
    }
],
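
If you want to inspect the virtualInbound listener yourself, the same istioctl command used above should work; 15006 is Istio’s standard inbound capture port:

$ istioctl proxy-config listeners "$(kubectl get pod -n dual-stack -l app=tcp-echo -o jsonpath='{.items[0].metadata.name}')" -n dual-stack --port 15006 -o json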

Envoy’s endpoints now are configured to route to both IPv4 and IPv6:

$ istioctl proxy-config endpoints "$(kubectl get pod -l app=sleep -o jsonpath='{.items[0].metadata.name}')" --port 9000
ENDPOINT                 STATUS      OUTLIER CHECK     CLUSTER
10.244.0.19:9000         HEALTHY     OK                outbound|9000||tcp-echo.ipv4.svc.cluster.local
10.244.0.26:9000         HEALTHY     OK                outbound|9000||tcp-echo.dual-stack.svc.cluster.local
fd00:10:244::1a:9000     HEALTHY     OK                outbound|9000||tcp-echo.dual-stack.svc.cluster.local
fd00:10:244::18:9000     HEALTHY     OK                outbound|9000||tcp-echo.ipv6.svc.cluster.local

Get Involved

Plenty of work remains, and you are welcome to help us with the remaining tasks needed for dual stack support to get to Alpha here.

For instance, Iris Ding (Intel) and Li Chun (Intel) are already working with the community on network traffic redirection for ambient mode, and we are hoping ambient will support dual stack in its upcoming alpha release in Istio 1.18.

We would love your feedback, and if you are eager to work with us please stop by our Slack channel, #dual-stack, within the Istio Slack.

Thank you to the team that has worked on Istio dual-stack!

]]>
Fri, 10 Mar 2023 00:00:00 +0000/v1.24//blog/2023/experimental-dual-stack//v1.24//blog/2023/experimental-dual-stack/dual-stack
Istio Ambient Service Mesh Merged to Istio’s Main BranchIstio ambient service mesh was launched in Sept 2022 in an experimental branch, introducing a new data plane mode for Istio without sidecars. Through collaboration with the Istio community, across Google, Solo.io, Microsoft, Intel, Aviatrix, Huawei, IBM and others, we are excited to announce that Istio ambient mesh has graduated from the experimental branch and been merged into Istio’s main branch! This is a significant milestone for ambient mesh, paving the way for releasing ambient in Istio 1.18 and installing it by default in future Istio releases.

Major Changes from the Initial Launch

Ambient mesh is designed for simplified operations, broader application compatibility, and reduced infrastructure cost. The ultimate goal of ambient is to be transparent to your applications and we have made a few changes to make the ztunnel and waypoint components simpler and lightweight.

  • The ztunnel component has been rewritten from the ground up to be fast, secure, and lightweight. Refer to Introducing Rust-Based Ztunnel for Istio Ambient Service Mesh for more information.
  • We made significant changes to simplify waypoint proxy’s configuration to improve its debuggability and performance. Refer to Istio Ambient Waypoint Proxy Made Simple for more information.
  • Added the istioctl x waypoint command to help you conveniently deploy waypoint proxies, along with istioctl pc workload to help you view workload information.
  • We gave users the ability to explicitly bind Istio policies such as AuthorizationPolicy to waypoint proxies, instead of selecting the destination workload.

Get involved

Follow our getting started guide to try the ambient pre-alpha build today. We’d love to hear from you! To learn more about ambient:

  • Join us in the #ambient and #ambient-dev channel in Istio’s slack.
  • Attend the weekly ambient contributor meeting on Wednesdays.
  • Check out the Istio and ztunnel repositories, submit issues or PRs!
]]>
Tue, 28 Feb 2023 00:00:00 +0000/v1.24//blog/2023/ambient-merged-istio-main//v1.24//blog/2023/ambient-merged-istio-main/istioambient
Introducing Rust-Based Ztunnel for Istio Ambient Service MeshThe ztunnel (zero trust tunnel) component is a purpose-built per-node proxy for Istio ambient mesh. It is responsible for securely connecting and authenticating workloads within ambient mesh. Ztunnel is designed to focus on a small set of features for your workloads in ambient mesh such as mTLS, authentication, L4 authorization and telemetry, without terminating workload HTTP traffic or parsing workload HTTP headers. The ztunnel ensures traffic is efficiently and securely transported to the waypoint proxies, where the full suite of Istio’s functionality, such as HTTP telemetry and load balancing, is implemented.

Because ztunnel is designed to run on all of your Kubernetes worker nodes, it is critical to keep its resource footprint small. Ztunnel is designed to be an invisible (or “ambient”) part of your service mesh with minimal impact on your workloads.

Ztunnel architecture

Similar to sidecars, ztunnel also serves as an xDS client and CA client:

  1. During startup, it securely connects to the Istiod control plane using its service account token. Once the connection from ztunnel to Istiod is established securely using TLS, it starts to fetch xDS configuration as an xDS client. This works similarly to sidecars or gateways or waypoint proxies, except that Istiod recognizes the request from ztunnel and sends the purpose-built xDS configuration for ztunnel, which you will learn more about soon.
  2. It also serves as a CA client to manage and provision mTLS certificates on behalf of all co-located workloads it manages.
  3. As traffic comes in or goes out, it serves as a core proxy that handles the inbound and outbound traffic (either out-of-mesh plain text or in-mesh HBONE) for all co-located workloads it manages.
  4. It provides L4 telemetry (metrics and logs) along with an admin server with debugging information to help you debug ztunnel if needed.
Ztunnel architecture

Why not reuse Envoy?

When Istio ambient service mesh was announced on Sept 7, 2022, the ztunnel was implemented using an Envoy proxy. Given that we use Envoy for the rest of Istio - sidecars, gateways, and waypoint proxies - it was natural for us to start implementing ztunnel using Envoy.

However, we found that while Envoy was a great fit for other use cases, it was challenging to implement ztunnel in Envoy, as many of the tradeoffs, requirements, and use cases are dramatically different from those of a sidecar proxy or ingress gateway. In addition, most of the things that make Envoy such a great fit for those other use cases, such as its rich L7 feature set and extensibility, went to waste in ztunnel, which didn’t need those features.

A purpose-built ztunnel

After having trouble bending Envoy to our needs, we started investigating making a purpose-built implementation of the ztunnel. Our hypothesis was that by designing with a single focused use case in mind from the beginning, we could develop a solution that was simpler and more performant than molding a general purpose project to our bespoke use cases. The explicit decision to make ztunnel simple was key to this hypothesis; similar logic wouldn’t hold up to rewriting the gateway, for example, which has a huge list of supported features and integrations.

This purpose-built ztunnel involved two key areas:

  • The configuration protocol between ztunnel and its Istiod
  • The runtime implementation of ztunnel

Configuration protocol

Envoy proxies use the xDS Protocol for configuration. This is a key part of what makes Istio work well, offering rich and dynamic configuration updates. However, as we tread off the beaten path, the config becomes more and more bespoke, which means it’s much larger and more expensive to generate. In a sidecar, a single Service with one pod generates roughly 350 lines of xDS (in YAML), which has already been challenging to scale. The Envoy-based ztunnel was far worse, and in some areas had N^2 scaling attributes.

To keep the ztunnel configuration as small as possible, we investigated using a purpose-built configuration protocol that contains precisely the information we need (and nothing more), in an efficient format. For example, a single pod can be represented concisely:

name: helloworld-v1-55446d46d8-ntdbk
namespace: default
serviceAccount: helloworld
node: ambient-worker2
protocol: TCP
status: Healthy
waypointAddresses: []
workloadIp: 10.244.2.8
canonicalName: helloworld
canonicalRevision: v1
workloadName: helloworld-v1
workloadType: deployment

This information is transported over the xDS transport API, but uses a custom ambient-specific type. Refer to the workload xDS configuration section to learn more about the configuration details.

By having a purpose built API, we can push logic into the proxy instead of in Envoy configuration. For example, to configure mTLS in Envoy, we need to add an identical large set of configuration tuning the precise TLS settings for each service; with ztunnel, we need only a single enum to declare whether mTLS should be used or not. The rest of the complex logic is embedded directly into ztunnel code.

With this efficient API between Istiod and ztunnel, we found we could configure ztunnels with information about large meshes (such as those with 100,000 pods) with orders of magnitude less configuration, which means less CPU, memory, and network costs.

Runtime implementation

As the name suggests, ztunnel uses an HTTPS tunnel to carry users’ requests. While Envoy supports this tunneling, we found the configuration model limiting for our needs. Roughly speaking, Envoy operates by sending requests through a series of “filters”, starting with accepting a request and ending with sending a request. With our requirements, which have multiple layers of requests (the tunnel itself and the users’ requests), as well as a need to apply per-pod policy after load balancing, we found we would need to loop through these filters 4 times per connection in our prior Envoy-based ztunnel. While Envoy has some optimizations for essentially “sending a request to itself” in memory, this was still very complex and expensive.

By building out our own implementation, we could design around these constraints from the ground up. In addition, we have more flexibility in all aspects of the design. For example, we could choose to share connections across threads or implement more bespoke requirements around isolation between service accounts. After establishing that a purpose built proxy was viable, we set out to choose the implementation details.

A Rust-based ztunnel

With the goal to make ztunnel fast, secure, and lightweight, Rust was an obvious choice. However, it wasn’t our first. Given Istio’s current extensive usage of Go, we had hoped we could make a Go-based implementation meet these goals. In initial prototypes, we built out some simple versions of both a Go-based implementation as well as a Rust-based one. From our tests, we found that the Go-based version didn’t meet our performance and footprint requirements. While it’s likely we could have optimized it further, we felt that a Rust-based proxy would give us the long-term optimal implementation.

A C++ implementation – likely reusing parts of Envoy – was also considered. However, this option was not pursued due to lack of memory safety, developer experience concerns, and a general industry trend towards Rust.

This process of elimination left us with Rust, which was a perfect fit. Rust has a strong history of success in high performance, low resource utilization applications, especially in network applications (including service mesh). We chose to build on top of the Tokio and Hyper libraries, two of the de-facto standards in the ecosystem that are extensively battle-tested and easy to write highly performant asynchronous code with.

A quick tour of the Rust-based ztunnel

Workload xDS configuration

The workload xDS configurations are very easy to understand and debug. You can view them by sending a request to localhost:15000/config_dump from one of your ztunnel pods, or use the convenient istioctl pc workload command. There are two key workload xDS configurations: workloads and policies.
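
For example, one way to pull the raw config dump is to port-forward the ztunnel admin port and query it with curl; the app=ztunnel label and the istio-system namespace are assumptions based on a default installation:

$ ZTUNNEL=$(kubectl -n istio-system get pod -l app=ztunnel -o jsonpath='{.items[0].metadata.name}')
$ kubectl -n istio-system port-forward "$ZTUNNEL" 15000 &
$ curl -s localhost:15000/config_dump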

Before your workloads are included in your ambient mesh, you will still be able to see them in ztunnel’s config dump, as ztunnel is aware of all of the workloads regardless of whether they are ambient-enabled or not. For example, below is a sample workload configuration for a newly deployed helloworld v1 pod which is out of the mesh, as indicated by protocol: TCP:

{
  "workloads": {
    "10.244.2.8": {
      "workloadIp": "10.244.2.8",
      "protocol": "TCP",
      "name": "helloworld-v1-cross-node-55446d46d8-ntdbk",
      "namespace": "default",
      "serviceAccount": "helloworld",
      "workloadName": "helloworld-v1-cross-node",
      "workloadType": "deployment",
      "canonicalName": "helloworld",
      "canonicalRevision": "v1",
      "node": "ambient-worker2",
      "authorizationPolicies": [],
      "status": "Healthy"
    }
  }
}

After the pod is included in ambient (by labeling the default namespace with istio.io/dataplane-mode=ambient), the protocol value is replaced with HBONE, instructing ztunnel to upgrade all incoming and outgoing communication for the helloworld-v1 pod to HBONE.

{
  "workloads": {
    "10.244.2.8": {
      "workloadIp": "10.244.2.8",
      "protocol": "HBONE",
      ...
}
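
For reference, enrolling the namespace in ambient is done with a standard kubectl label command, exactly as mentioned above:

$ kubectl label namespace default istio.io/dataplane-mode=ambient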

After you deploy any workload level authorization policy, the policy configuration will be pushed as xDS configuration from Istiod to ztunnel and shown under policies:

{
  "policies": {
    "default/hw-viewer": {
      "name": "hw-viewer",
      "namespace": "default",
      "scope": "WorkloadSelector",
      "action": "Allow",
      "groups": [[[{
        "principals": [{"Exact": "cluster.local/ns/default/sa/sleep"}]
      }]]]
    }
  }
  ...
}
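
For reference, a workload-level policy roughly like the following would produce the hw-viewer entry above. This is only a sketch: the app: helloworld selector is an assumption about how the policy is scoped to the workload.

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: hw-viewer
  namespace: default
spec:
  selector:
    matchLabels:
      app: helloworld   # assumed label; scopes the policy to the helloworld workload
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/sleep"]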

You’ll also notice the workload’s configuration is updated with reference to the authorization policy.

{
  "workloads": {
    "10.244.2.8": {
    "workloadIp": "10.244.2.8",
    ...
    "authorizationPolicies": [
        "default/hw-viewer"
    ],
  }
  ...
}

L4 telemetry provided by ztunnel

You may be pleasantly surprised that the ztunnel logs are easy to understand. For example, you’ll see the HTTP CONNECT request on the destination ztunnel that indicates the source pod IP (peer_ip) and destination pod IP.

2023-02-15T20:40:48.628251Z  INFO inbound{id=4399fa68cf25b8ebccd472d320ba733f peer_ip=10.244.2.5 peer_id=spiffe://cluster.local/ns/default/sa/sleep}: ztunnel::proxy::inbound: got CONNECT request to 10.244.2.8:5000

You can view L4 metrics for your workloads by accessing the localhost:15020/metrics API, which provides the full set of standard TCP metrics, with the same labels that sidecars expose. For example:

istio_tcp_connections_opened_total{
  reporter="source",
  source_workload="sleep",
  source_workload_namespace="default",
  source_principal="spiffe://cluster.local/ns/default/sa/sleep",
  destination_workload="helloworld-v1",
  destination_workload_namespace="default",
  destination_principal="spiffe://cluster.local/ns/default/sa/helloworld",
  request_protocol="tcp",
  connection_security_policy="mutual_tls"
  ...
} 1
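
If you don’t have Prometheus set up yet, you can spot-check these metrics with the same port-forward approach used for the config dump, assuming the metrics are served on the ztunnel pod’s port 15020 as described above:

$ kubectl -n istio-system port-forward "$ZTUNNEL" 15020 &
$ curl -s localhost:15020/metrics | grep istio_tcp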

If you install Prometheus and Kiali, you can view these metrics easily from Kiali’s UI.

Kiali dashboard - L4 telemetry provided by ztunnel

Wrapping up

We are super excited that the new Rust-based ztunnel is drastically simpler, more lightweight and more performant than the prior Envoy-based ztunnel. With the purposefully designed workload xDS for the Rust-based ztunnel, you’ll not only be able to understand the xDS configuration much more easily, but also see drastically reduced network traffic and cost between the Istiod control plane and ztunnels. With Istio ambient now merged into the upstream master branch, you can try the new Rust-based ztunnel by following our getting started guide.

]]>
Tue, 28 Feb 2023 00:00:00 +0000/v1.24//blog/2023/rust-based-ztunnel//v1.24//blog/2023/rust-based-ztunnel/istioambientztunnel
Announcing the Contribution Seat holders for 2023The Istio Steering Committee consists of 9 Contribution Seats, proportionally allocated based on corporate contributions to the project, and 4 elected Community Seats.

Last year, we elected four members to the community seats. It’s now time to announce the companies who fuel our growth by selecting the Contribution Seat members. As per the Steering charter, every February we look at which companies have made the most contributions to Istio based on an annually agreed metric.

According to our seat allocation process, this year Google will be allocated 5 seats and IBM/Red Hat will be allocated 2. We are also pleased to announce that Huawei, the third largest contributor to Istio in the last 12 months, has earned two Contribution Seats.

Based on this, here is the complete list of Istio Steering Committee members, including both the Contribution and Community Seats:

Our sincerest thanks to Louis Ryan, Srihari Angaluri, Kebe Liu and Jason McGee, all long-time contributors to the Istio project, whose terms have come to an end.

]]>
Mon, 06 Feb 2023 00:00:00 +0000/v1.24//blog/2023/steering-contribution-seat-results//v1.24//blog/2023/steering-contribution-seat-results/istiosteeringgovernancecommunitycontributor
Istio publishes results of 2022 security auditIstio is a project that platform engineers trust to enforce security policy in their production Kubernetes environments. We pay a lot of care to security in our code, and maintain a robust vulnerability program. To validate our work, we periodically invite external review of the project, and we are pleased to publish the results of our second security audit.

The auditors’ assessment was that “Istio is a well-maintained project that has a strong and sustainable approach to security”. No critical issues were found; the highlight of the report was the discovery of a vulnerability in the Go programming language.

We would like to thank the Cloud Native Computing Foundation for funding this work, as a benefit offered to us after we joined the CNCF in August. It was arranged by OSTIF, and performed by ADA Logics.

Scope and overall findings

Istio received its first security assessment in 2020, with its data plane, the Envoy proxy, having been independently assessed in 2018 and 2021. The Istio Product Security Working Group and ADA Logics therefore decided on the following scope:

  • Produce a formal threat model, to guide this and future security audits
  • Carry out a manual code audit for security issues
  • Review the fixes for the issues found in the 2020 audit
  • Review and improve Istio’s fuzzing suite
  • Perform a SLSA review of Istio

Once again, no Critical issues were found in the review. The assessment found 11 security issues: two High, four Medium, four Low and one Informational. All the reported issues have been fixed.

Aside from their observations above, the auditors note that Istio follows a high level of industry standards in dealing with security. In particular, they highlight that:

  • The Istio Product Security Working Group responds swiftly to security disclosures
  • The documentation on the project’s security is comprehensive, well-written and up to date
  • Security vulnerability disclosures follow industry standards and security advisories are clear and detailed
  • Security fixes include regression tests

Resolution and learnings

Request smuggling vulnerability in Go

The auditors uncovered a situation where Istio could accept traffic using HTTP/2 Over Cleartext (h2c), a method of making an unencrypted connection with HTTP/1.1 and then upgrading to HTTP/2. The Go library for h2c connections reads the entire request into memory, and its documentation notes that if you wish to avoid this, the request should be wrapped in a MaxBytesHandler.

In fixing this bug, Istio TOC member John Howard noticed that the recommended fix introduces a request smuggling vulnerability. The Go team thus published CVE-2022-41721 — the only vulnerability discovered by this audit!

Istio has since been changed to disable h2c upgrade support throughout.

Improvements to file fetching

The most common class of issue found were related to Istio fetching files over a network (for example, the Istio Operator installing Helm charts, or the WebAssembly module downloader):

  • A crafted Helm chart could exhaust disk space (#1) or overwrite other files in the Operator’s pod (#2)
  • File handles were not closed in the case of an error, and could be exhausted (#3)
  • Crafted files could exhaust memory (#4 and #5)

To execute these code paths, an attacker would need enough privilege to either specify a URL for a Helm chart or a WebAssembly module. With such access, they would not need an exploit: they could already cause an arbitrary chart to be installed to the cluster or an arbitrary WebAssembly module to be loaded into memory on the proxy servers.

The auditors and maintainers both note that the Operator is not recommended as a method of installation, as this requires a high-privilege controller to run in the cluster.

Other issues

The remaining issues found were:

  • In some testing code, or where a control plane component connects to another component over localhost, minimum TLS settings were not enforced (#6)
  • Operations that failed may not return error codes (#7)
  • A deprecated library was being used (#8)
  • TOC/TOU race conditions in a library used for copying files (#9)
  • A user could exhaust the memory of the Security Token Service if running in Debug mode (#11)

Please refer to the full report for details.

Reviewing the 2020 report

All 18 issues reported in Istio’s first security assessment were found to have been fixed.

Fuzzing

The OSS-Fuzz project helps open source projects perform free fuzz testing. Istio is integrated into OSS-Fuzz with 63 fuzzers running continuously: this support was built by ADA Logics and the Istio team in late 2021.

The assessment notes that “Istio benefits largely from having a substantial fuzz test suite that runs continuously on OSS-Fuzz”, and identified a few APIs in security-critical code that would benefit from further fuzzing. Six new fuzzers were contributed as a result of this work; by the end of the audit, the new tests had run over 3 billion times.

SLSA

Supply chain Levels for Software Artifacts (SLSA) is a check-list of standards and controls to prevent tampering, improve integrity, and secure software packages and infrastructure. It is organized into a series of levels that provide increasing integrity guarantees.

Istio does not currently generate provenance artifacts, so it does not meet the requirements for any SLSA levels. Work on reaching SLSA Level 1 is currently underway. If you would like to get involved, please join the Istio Slack and reach out to our Test and Release working group.

Get involved

If you want to get involved with Istio product security, or become a maintainer, we’d love to have you! Join our public meetings to raise issues or learn about what we are doing to keep Istio secure.

]]>
Mon, 30 Jan 2023 00:00:00 +0000/v1.24//blog/2023/ada-logics-security-assessment//v1.24//blog/2023/ada-logics-security-assessment/istiosecurityauditada logicsassessmentcncfostif
Join us for Istio Day at KubeCon Europe 2023!Istio is sailing up the canals this April! We are delighted to announce Istio Day Europe 2023, a “Day 0” event co-located with KubeCon + CloudNativeCon Europe 2023.

Istio Day is the perfect opportunity to meet the Istio maintainers and contributors in person, and hear from users why Istio is constantly ranked the #1 service mesh in production.

Submit a talk

We now encourage Istio users, developers, partners, and advocates to submit a session proposal through the CNCF event portal, which is open until February 12.

We want to see real world examples, case studies, and success stories that can inspire newcomers to use Istio in production. The content will cover introductory to advanced levels, split into four main topic tracks:

  • New Features: What have you been working on that the community should know about?
  • Case Studies: How have you built a platform or service on top of Istio?
  • Istio Recipes: How you can solve a specific business problem using Istio.
  • Project Updates: The evolution of Istio, and the latest updates from the project maintainers.

You can pick one of these formats to submit a session proposal:

  • Presentation: 25 minutes, 1 or 2 speaker(s) presenting a topic
  • Panel Discussion: 35 minutes of discussion among 3 to 5 speakers
  • Lightning Talk: A brief 5-minute presentation, maximum of 1 speaker

Accepted speakers will receive a complimentary All-Access In-Person ticket for all four days of KubeCon + CloudNativeCon.

Do you want to put your product or service in front of the most discerning Cloud Native users: those who demand 25% more conference than the crowd? Check out page 11 of the CNCF events prospectus to learn more.

Register to attend

Istio Day is a CNCF-hosted co-located event on 18 April 2023. KubeCon + CloudNativeCon Europe in-person attendees now have the option to buy an All-Access ticket, which includes entry to all the Day 0 events, as well as the main three days of the conference. You must be attending KubeCon to attend Istio Day, but virtual registration options are available, and the recordings will be posted to YouTube soon after the event.

For those of you who can’t make it, keep your eyes peeled for announcements of IstioCon 2023 and Istio Day North America later this year.

Stay tuned to hear more about the event, and we hope you can join us at Istio Day Europe!

]]>
Fri, 27 Jan 2023 00:00:00 +0000/v1.24//blog/2023/istioday-kubecon-eu//v1.24//blog/2023/istioday-kubecon-eu/Istio DayIstioConIstioconferenceKubeConCloudNativeCon
Getting started with the Kubernetes Gateway APIWhether you’re running your Kubernetes application services using Istio, or any service mesh for that matter, or simply using ordinary services in a Kubernetes cluster, you need to provide access to your application services for clients outside of the cluster. If you’re using plain Kubernetes clusters, you’re probably using Kubernetes Ingress resources to configure the incoming traffic. If you’re using Istio, you are more likely to be using Istio’s recommended configuration resources, Gateway and VirtualService, to do the job.

The Kubernetes Ingress resource has for some time been known to have significant shortcomings, especially when using it to configure ingress traffic for large applications and when working with protocols other than HTTP. One problem is that it configures both the client-side L4-L6 properties (e.g., ports, TLS, etc.) and service-side L7 routing in a single resource, configurations that for large applications should be managed by different teams and in different namespaces. Also, by trying to draw a common denominator across different HTTP proxies, Ingress is only able to support the most basic HTTP routing and ends up pushing every other feature of modern proxies into non-portable annotations.

To overcome Ingress’ shortcomings, Istio introduced its own configuration API for ingress traffic management. With Istio’s API, the client-side representation is defined using an Istio Gateway resource, with L7 traffic moved to a VirtualService, not coincidentally the same configuration resource used for routing traffic between services inside the mesh. Although the Istio API provides a good solution for ingress traffic management for large-scale applications, it is unfortunately an Istio-only API. If you are using a different service mesh implementation, or no service mesh at all, you’re out of luck.

Enter Gateway API

There’s a lot of excitement surrounding a new Kubernetes traffic management API, dubbed Gateway API, which has recently been promoted to Beta. Gateway API provides a set of Kubernetes configuration resources for ingress traffic control that, like Istio’s API, overcomes the shortcomings of Ingress, but unlike Istio’s, is a standard Kubernetes API with broad industry agreement. There are several implementations of the API in the works, including a Beta implementation in Istio, so now may be a good time to start thinking about how you can start moving your ingress traffic configuration from Kubernetes Ingress or Istio Gateway/VirtualService to the new Gateway API.

Whether or not you use, or plan to use, Istio to manage your service mesh, the Istio implementation of the Gateway API can easily be used to get started with your cluster ingress control. Even though it’s still a Beta feature in Istio, mostly driven by the fact that the Gateway API is itself still a Beta level API, Istio’s implementation is quite robust because under the covers it uses Istio’s same tried-and-proven internal resources to implement the configuration.

Gateway API quick-start

To get started using the Gateway API, you first need to install the CRDs, which don’t come installed by default on most Kubernetes clusters, at least not yet:

$ kubectl get crd gateways.gateway.networking.k8s.io &> /dev/null || \
  { kubectl kustomize "github.com/kubernetes-sigs/gateway-api/config/crd?ref=v1.2.0" | kubectl apply -f -; }

Once the CRDs are installed, you can use them to create Gateway API resources to configure ingress traffic, but in order for the resources to work, the cluster needs to have a gateway controller running. You can enable Istio’s gateway controller implementation by simply installing Istio with the minimal profile:

$ curl -L https://istio.io/downloadIstio | sh -
$ cd istio-1.24.3
$ ./bin/istioctl install --set profile=minimal -y

Your cluster will now have a fully-functional implementation of the Gateway API, via Istio’s gateway controller named istio.io/gateway-controller, ready to use.

Deploy a Kubernetes target service

To try out the Gateway API, we’ll use the Istio helloworld sample as an ingress target, but only running as a simple Kubernetes service without sidecar injection enabled. Because we’re only going to use the Gateway API to control ingress traffic into the “Kubernetes cluster”, it makes no difference if the target service is running inside or outside of a mesh.

We’ll use the following command to deploy the helloworld service:

$ kubectl create ns sample
$ kubectl apply -f samples/helloworld/helloworld.yaml -n sample

The helloworld service includes two backing deployments, corresponding to different versions (v1 and v2). We can confirm they are both running using the following command:

$ kubectl get pod -n sample
NAME                             READY   STATUS    RESTARTS   AGE
helloworld-v1-776f57d5f6-s7zfc   1/1     Running   0          10s
helloworld-v2-54df5f84b-9hxgww   1/1     Running   0          10s

Configure the helloworld ingress traffic

With the helloworld service up and running, we can now use the Gateway API to configure ingress traffic for it.

The ingress entry point is defined using a Gateway resource:

$ kubectl create namespace sample-ingress
$ kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: sample-gateway
  namespace: sample-ingress
spec:
  gatewayClassName: istio
  listeners:
  - name: http
    hostname: "*.sample.com"
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: All
EOF

The controller that will implement a Gateway is selected by referencing a GatewayClass. There must be at least one GatewayClass defined in the cluster to have functional Gateways. In our case, we’re selecting Istio’s gateway controller, istio.io/gateway-controller, by referencing its associated GatewayClass (named istio) with the gatewayClassName: istio setting in the Gateway.
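
You can verify that the istio GatewayClass exists and has been accepted by Istio’s controller with a simple kubectl query; the output shown here is indicative and will vary with your versions:

$ kubectl get gatewayclass istio
NAME    CONTROLLER                    ACCEPTED   AGE
istio   istio.io/gateway-controller   True       2m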

Notice that unlike Ingress, a Kubernetes Gateway doesn’t include any references to the target service, helloworld. With the Gateway API, routes to services are defined in separate configuration resources that get attached to the Gateway to direct subsets of traffic to specific services, like helloworld in our example. This separation allows us to define the Gateway and routes in different namespaces, presumably managed by different teams. Here, while acting in the role of cluster operator, we’re applying the Gateway in the sample-ingress namespace. We’ll add the route, below, in the sample namespace, next to the helloworld service itself, on behalf of the application developer.

Because the Gateway resource is owned by a cluster operator, it can very well be used to provide ingress for more than one team’s services, in our case more than just the helloworld service. To emphasize this point, we’ve set hostname to *.sample.com in the Gateway, allowing routes for multiple subdomains to be attached.

After applying the Gateway resource, we need to wait for it to be ready before retrieving its external address:

$ kubectl wait -n sample-ingress --for=condition=programmed gateway sample-gateway
$ export INGRESS_HOST=$(kubectl get -n sample-ingress gateway sample-gateway -o jsonpath='{.status.addresses[0].value}')

Next, we attach an HTTPRoute to the sample-gateway (i.e., using the parentRefs field) to expose and route traffic to the helloworld service:

$ kubectl apply -n sample -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: helloworld
spec:
  parentRefs:
  - name: sample-gateway
    namespace: sample-ingress
  hostnames: ["helloworld.sample.com"]
  rules:
  - matches:
    - path:
        type: Exact
        value: /hello
    backendRefs:
    - name: helloworld
      port: 5000
EOF

Here we’ve exposed the /hello path of the helloworld service to clients outside of the cluster, specifically via host helloworld.sample.com. You can confirm the helloworld sample is accessible using curl:

$ for run in {1..10}; do curl -HHost:helloworld.sample.com http://$INGRESS_HOST/hello; done
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v2, instance: helloworld-v2-54dddc5567-2lm7b
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v2, instance: helloworld-v2-54dddc5567-2lm7b
Hello version: v2, instance: helloworld-v2-54dddc5567-2lm7b
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v2, instance: helloworld-v2-54dddc5567-2lm7b
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v2, instance: helloworld-v2-54dddc5567-2lm7b

Since no version routing has been configured in the route rule, you should see an equal split of traffic, about half handled by helloworld-v1 and the other half handled by helloworld-v2.

Configure weight-based version routing

Among other “traffic shaping” features, you can use Gateway API to send all of the traffic to one of the versions or split the traffic based on request percentages. For example, you can use the following rule to distribute the helloworld traffic 90% to v1, 10% to v2:

$ kubectl apply -n sample -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: helloworld
spec:
  parentRefs:
  - name: sample-gateway
    namespace: sample-ingress
  hostnames: ["helloworld.sample.com"]
  rules:
  - matches:
    - path:
        type: Exact
        value: /hello
    backendRefs:
    - name: helloworld-v1
      port: 5000
      weight: 90
    - name: helloworld-v2
      port: 5000
      weight: 10
EOF

Gateway API relies on version-specific backend service definitions for the route targets, helloworld-v1 and helloworld-v2 in this example. The helloworld sample already includes service definitions for the helloworld versions v1 and v2; we just need to run the following command to create them:

$ kubectl apply -n sample -f samples/helloworld/gateway-api/helloworld-versions.yaml
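
For reference, each of these per-version services is an ordinary Kubernetes Service that narrows the selector to a single version label. The following is a rough sketch based on the helloworld sample’s app and version labels:

apiVersion: v1
kind: Service
metadata:
  name: helloworld-v1
  labels:
    app: helloworld
spec:
  ports:
  - port: 5000
    name: http
  selector:
    app: helloworld
    version: v1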

Now, we can run the previous curl commands again:

$ for run in {1..10}; do curl -HHost:helloworld.sample.com http://$INGRESS_HOST/hello; done
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj
Hello version: v2, instance: helloworld-v2-54dddc5567-2lm7b
Hello version: v1, instance: helloworld-v1-78b9f5c87f-2sskj

This time we see that about 9 out of 10 requests are now handled by helloworld-v1 and only about 1 in 10 are handled by helloworld-v2.

Gateway API for internal mesh traffic

You may have noticed that we’ve been talking about the Gateway API only as an ingress configuration API, often referred to as north-south traffic management, and not an API for service-to-service (aka, east-west) traffic management within a cluster.

If you are using a service mesh, it would be highly desirable to use the same API resources to configure both ingress traffic routing and internal traffic, similar to the way Istio uses VirtualService to configure route rules for both. Fortunately, the Kubernetes Gateway API community is working to add this support. Although not as mature as the Gateway API for ingress traffic, an effort known as the Gateway API for Mesh Management and Administration (GAMMA) initiative is underway to make this a reality, and Istio intends to make Gateway API the default API for all of its traffic management in the future.

The first significant Gateway Enhancement Proposal (GEP) has recently been accepted and is, in fact, already available to use in Istio. To try it out, you’ll need to use the experimental version of the Gateway API CRDs, instead of the standard Beta version we installed above, but otherwise you’re ready to go. Check out the Istio request routing task to get started.

Summary

In this article, we’ve seen how a lightweight, minimal install of Istio can be used to provide a Beta-quality implementation of the new Kubernetes Gateway API for cluster ingress traffic control. For Istio users, the Istio implementation also lets you start trying out the experimental Gateway API support for east-west traffic management within the mesh.

Much of Istio’s documentation, including all of the ingress tasks and several mesh-internal traffic management tasks, already includes parallel instructions for configuring traffic using either the Gateway API or the Istio configuration API. Check out the Gateway API task for more information about the Gateway API implementation in Istio.

]]>
Wed, 14 Dec 2022 00:00:00 +0000/v1.24//blog/2022/getting-started-gtwapi//v1.24//blog/2022/getting-started-gtwapi/traffic-managementgatewaygateway-apiapigammasig-network
2022 Istio Steering Committee Election ResultsThe Istio Steering Committee consists of 9 proportionally-allocated Contribution Seats, and 4 elected Community Seats. Our third annual election for our Community Seats has concluded, and we are pleased to announce the choice of our members:

We would like to extend our heartfelt thanks to Zack Butcher, Lin Sun and Zhonghu Xu, whose terms have now ended. With Contribution Seat holders from Google, IBM, Red Hat and DaoCloud, we have representation from 8 organizations on the Steering Committee, reflecting the breadth of our worldwide contributor ecosystem.

Thank you to everyone who participated in the election process, with special thanks to our election officers Josh Berkus, Cameron Etezadi and Ram Vennam.

]]>
Fri, 04 Nov 2022 00:00:00 +0000/v1.24//blog/2022/steering-election-results//v1.24//blog/2022/steering-election-results/istiosteeringgovernancecommunityelection
Announcing Istio's acceptance as a CNCF projectWe are pleased to share that Istio is now an official incubating CNCF project.

In April, Istio applied to become a CNCF project. Today, the TOC announced they have voted to accept our application.

This journey began with Istio’s inception in 2016. We are grateful for all who have collaborated over the last six years on Istio’s design, development, and deployment.

We especially appreciate the efforts of TOC sponsor Dave Zolotusky, TAG Network, and the engineering teams at Airbnb, Intuit, Splunk, and WP Engine for sharing their feedback as end users.

While project work continues uninterrupted, with the acceptance of Istio we will now begin the process of transferring trademarks and build infrastructure to CNCF ownership. We are hard at work on our upcoming 1.16 release, while continuing to collect feedback on ambient mesh and driving it to production readiness. Our project members are currently electing our community representatives to the Steering Committee for the next year.

As a CNCF project, we will now be much more visible at KubeCon NA in October. Come to the maintainer session, find us in the project pavilion, or grab an Istio t-shirt at the CNCF Store. Watch our Twitter throughout the conference for more exciting updates!

]]>
Wed, 28 Sep 2022 00:00:00 +0000/v1.24//blog/2022/istio-accepted-into-cncf//v1.24//blog/2022/istio-accepted-into-cncf/IstioCNCF
Ambient Mode Security Deep DiveWe recently announced Istio’s new ambient mode, which is a sidecar-less data plane for Istio and the reference implementation of the ambient mesh pattern. As stated in the announcement blog, the top concerns we address with ambient mesh are simplified operations, broader application compatibility, reduced infrastructure costs and improved performance. When designing the ambient data plane, we wanted to carefully balance the concerns around operations, cost, and performance while not sacrificing security or functionality. As the components of ambient mesh run outside of the application pods, the security boundaries have changed – we believe for the better. In this blog, we go into some detail about these changes and how they compare to a sidecar deployment.

Layering of ambient mesh data plane

To recap, Istio’s ambient mode introduces a layered mesh data plane with a secure overlay responsible for transport security and routing, with the option to add L7 capabilities for namespaces that need them. To understand more, please see the announcement blog and the getting started blog. The secure overlay consists of a node-shared component, the ztunnel, which is deployed as a DaemonSet and is responsible for L4 telemetry and mTLS. The L7 layer of the mesh is provided by waypoint proxies: full L7 Envoy proxies that are deployed per identity/workload type. Some of the core implications of this design include:

  • Separation of application from data plane
  • Components of the secure overlay layer resemble that of a CNI
  • Simplicity of operations is better for security
  • Avoiding multi-tenant L7 proxies
  • Sidecars are still a first-class supported deployment

Separation of application and data plane

Although the primary goal of ambient mesh is simplifying operations of the service mesh, it does serve to improve security as well. Complexity breeds vulnerabilities and enterprise applications (and their transitive dependencies, libraries, and frameworks) are exceedingly complex and prone to vulnerabilities. From handling complex business logic to leveraging OSS libraries or buggy internal shared libraries, a user’s application code is a prime target for attackers (internal or external). If an application is compromised, credentials, secrets, and keys are exposed to an attacker including those mounted or stored in memory. When looking at the sidecar model, an application compromise includes takeover of the sidecar and any associated identity/key material. In Istio’s ambient mode, no data plane components run in the same pod as the application and therefore an application compromise does not lead to the access of secrets.

What about Envoy Proxy as a potential target for vulnerabilities? Envoy is an extremely hardened piece of infrastructure under intense scrutiny and run at scale in critical environments (e.g., used in production to front Google’s network). However, since Envoy is software, it is not immune to vulnerabilities. When those vulnerabilities do arise, Envoy has a robust CVE process for identifying them, fixing them quickly, and rolling them out to customers before they have the chance for wide impact.

Circling back to the earlier comment that “complexity breeds vulnerabilities”, the most complex parts of Envoy Proxy are in its L7 processing, and indeed historically the majority of Envoy’s vulnerabilities have been in its L7 processing stack. But what if you just use Istio for mTLS? Why take the risk of deploying a full-blown L7 proxy, with its higher chance of a CVE, when you don’t use that functionality? This is where separating L4 and L7 mesh capabilities comes into play. While in sidecar deployments you adopt all of the proxy, even if you use only a fraction of the functionality, in ambient mode we can limit the exposure by providing a secure overlay and only layering in L7 as needed. Additionally, the L7 components run completely separately from the applications and do not provide an avenue for attack.

Pushing L4 down into the CNI

The L4 components of the ambient mode data plane run as a DaemonSet, or one per node. This means it is shared infrastructure for any of the pods running on a particular node. This component is particularly sensitive and should be treated at the same level as any other shared component on the node such as any CNI agents, kube-proxy, kubelet, or even the Linux kernel. Traffic from workloads is redirected to the ztunnel which then identifies the workload and selects the right certificates to represent that workload in a mTLS connection.

The ztunnel uses a distinct credential for every pod which is only issued if the pod is currently running on the node. This ensures that the blast radius for a compromised ztunnel is that only credentials for pods currently scheduled on that node could be stolen. This is a similar property to other well implemented shared node infrastructure including other secure CNI implementations. The ztunnel does not use cluster-wide or per-node credentials which, if stolen, could immediately compromise all application traffic in the cluster unless a complex secondary authorization mechanism is also implemented.

If we compare this to the sidecar model, we notice that the ztunnel is shared, and a compromise could result in exfiltration of the identities of the applications running on the node. However, the likelihood of a CVE in this component is lower than that of an Istio sidecar, since the attack surface is greatly reduced (only L4 handling); the ztunnel does not do any L7 processing. In addition, a CVE in a sidecar (with its larger L7 attack surface) is not truly contained to the particular workload that was compromised: any serious CVE in a sidecar is likely repeatable against any of the workloads in the mesh.

Simplicity of operations is better for security

Ultimately, Istio is a critical piece of infrastructure that must be maintained. Istio is trusted to implement some of the tenets of zero-trust network security on behalf of applications, and rolling out patches on a schedule or on demand is paramount. Platform teams often have predictable patching or maintenance cycles, which is quite different from applications. Applications are likely to be updated when new capabilities and functionality are required, usually as part of a project. This approach to application changes, upgrades, and framework and library patches is highly unpredictable, allows a lot of time to pass, and does not lend itself to safe security practices. Therefore, keeping these security features part of the platform, and separate from the applications, is likely to lead to a better security posture.

As we’ve identified in the announcement blog, operating sidecars can be more complex because of their invasive nature (injecting the sidecar and changing the deployment descriptors, restarting the applications, race conditions between containers, etc.). Upgrades to workloads with sidecars require more planning, and rolling restarts may need to be coordinated so as not to bring down the application. With ambient mode, upgrades to the ztunnel can coincide with normal node patching or upgrades, while the waypoint proxies are part of the network and can be upgraded completely transparently to the applications as needed.

Avoiding multi-tenant L7 proxies

Supporting L7 protocols such as HTTP 1/2/3 and gRPC, parsing headers, implementing retries, and allowing data plane customizations with Wasm and/or Lua is significantly more complex than supporting L4. There is a lot more code to implement these behaviors (including user-supplied code for things like Lua and Wasm), and this complexity can lead to vulnerabilities. Because of this, CVEs have a higher chance of being discovered in these areas of L7 functionality.

Each namespace/identity has its own L7 proxies; no multi-tenant proxies

In ambient mode, we do not share L7 processing in a proxy across multiple identities. Each identity (service account in Kubernetes) has its own dedicated L7 proxy (waypoint proxy) which is very similar to the model we use with sidecars. Trying to co-locate multiple identities and their distinct complex policies and customizations adds a lot of variability to a shared resource which leads to unfair cost attribution at best and total proxy compromise at worst.

Sidecars are still a first-class supported deployment

We understand that some folks are comfortable with the sidecar model and its known security boundaries and wish to stay with that model. With Istio, sidecars are first-class citizens in the mesh, and platform owners have the choice to continue using them. If a platform owner wants to support both sidecar and ambient modes, they can. A workload with the ambient data plane can natively communicate with workloads that have a sidecar deployed. As folks better understand the security posture of ambient mode, we are confident that it will become the preferred data plane mode of Istio, with sidecars used for specific optimizations.

]]>
Wed, 07 Sep 2022 09:00:00 -0600/v1.24//blog/2022/ambient-security//v1.24//blog/2022/ambient-security/ambient
Get Started with Istio Ambient Mesh

Ambient mesh is a new data plane mode for Istio introduced today. Following this getting started guide, you can experience how ambient mesh can simplify your application onboarding, help with ongoing operations, and reduce service mesh infrastructure resource usage.

Install Istio with Ambient Mode

  1. Download the preview version of Istio with support for ambient mesh.
  2. Check out supported environments. We recommend using a Kubernetes cluster of version 1.21 or newer, with two or more nodes. If you don’t have a Kubernetes cluster, you can set one up locally (e.g. using kind, as below) or deploy one in Google or AWS Cloud:
$ kind create cluster --config=- <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: ambient
nodes:
- role: control-plane
- role: worker
- role: worker
EOF

The ambient profile is designed to help you get started with ambient mesh. Install Istio with the ambient profile on your Kubernetes cluster, using the istioctl downloaded above:

$ istioctl install --set profile=ambient

After running the above command, you’ll get the following output that indicates these four components are installed successfully!

✔ Istio core installed
✔ Istiod installed
✔ Ingress gateways installed
✔ CNI installed
✔ Installation complete

By default, the ambient profile has the Istio core, Istiod, ingress gateway, zero-trust tunnel agent (ztunnel) and CNI plugin enabled. The Istio CNI plugin is responsible for detecting which application pods are part of the ambient mesh and configuring the traffic redirection between the ztunnels. You’ll notice the following pods are installed in the istio-system namespace with the default ambient profile:

$ kubectl get pod -n istio-system
NAME                                    READY   STATUS    RESTARTS   AGE
istio-cni-node-97p9l                    1/1     Running   0          29s
istio-cni-node-rtnvr                    1/1     Running   0          29s
istio-cni-node-vkqzv                    1/1     Running   0          29s
istio-ingressgateway-5dc9759c74-xlp2j   1/1     Running   0          29s
istiod-64f6d7db7c-dq8lt                 1/1     Running   0          47s
ztunnel-bq6w2                           1/1     Running   0          47s
ztunnel-tcn4m                           1/1     Running   0          47s
ztunnel-tm9zl                           1/1     Running   0          47s

The istio-cni and ztunnel components are deployed as Kubernetes DaemonSets which run on every node. Each Istio CNI pod checks all pods co-located on the same node to see if those pods are part of the ambient mesh. For those pods, the CNI plugin configures traffic redirection so that all incoming and outgoing traffic to the pods is redirected to the co-located ztunnel first. As new pods are deployed to or removed from the node, the CNI plugin continues to monitor and update the redirection logic accordingly.
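
Since both the CNI agent and the ztunnel are node-scoped, a quick sanity check after installation is to confirm that each is deployed as a DaemonSet whose pod count matches your node count. A minimal check, assuming the default istio-system installation namespace:

$ kubectl get daemonset -n istio-system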

Deploy Your Applications

You’ll use the sample bookinfo application, which is part of your Istio download from the previous steps. In ambient mode, you deploy applications to your Kubernetes cluster exactly the same way you would without Istio. This means you can have your applications running in your Kubernetes cluster before you enable ambient mesh, and have them join the mesh without needing to restart or reconfigure them.

$ kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml
$ kubectl apply -f https://raw.githubusercontent.com/linsun/sample-apps/main/sleep/sleep.yaml
$ kubectl apply -f https://raw.githubusercontent.com/linsun/sample-apps/main/sleep/notsleep.yaml
Applications not in the ambient mesh with plain text traffic

Note: sleep and notsleep are two simple applications that can serve as curl clients.

Connect productpage to the Istio ingress gateway so you can access the bookinfo app from outside of the cluster:

$ kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml

Test your bookinfo application; it should work with or without the gateway. Note: you can replace istio-ingressgateway.istio-system below with the gateway’s load balancer IP (or hostname) if it has one:

$ kubectl exec deploy/sleep -- curl -s http://istio-ingressgateway.istio-system/productpage | head -n1
$ kubectl exec deploy/sleep -- curl -s http://productpage:9080/ | head -n1
$ kubectl exec deploy/notsleep -- curl -s http://productpage:9080/ | head -n1

Adding your application to the ambient mesh

You can enable all pods in a given namespace to be part of the ambient mesh by simply labeling the namespace:

$ kubectl label namespace default istio.io/dataplane-mode=ambient

Congratulations! You have successfully added all pods in the default namespace to the ambient mesh. The best part is that there is no need to restart or redeploy anything!
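
If you later want to take the namespace back out of the mesh, removing the label with standard kubectl syntax should undo the opt-in; this is a hedged aside, and the exact behavior may differ in this preview build:

$ kubectl label namespace default istio.io/dataplane-mode-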

Send some test traffic:

$ kubectl exec deploy/sleep -- curl -s http://istio-ingressgateway.istio-system/productpage | head -n1
$ kubectl exec deploy/sleep -- curl -s http://productpage:9080/ | head -n1
$ kubectl exec deploy/notsleep -- curl -s http://productpage:9080/ | head -n1

You’ll immediately gain mTLS communication among the applications in the ambient mesh.

Inbound requests from sleep to `productpage` and from `productpage` to reviews with secure overlay layer

If you are curious about the X.509 certificate for each identity, you can learn more about it by stepping through a certificate:

$ istioctl pc secret ds/ztunnel -n istio-system -o json | jq -r '.dynamicActiveSecrets[0].secret.tlsCertificate.certificateChain.inlineBytes' | base64 --decode | openssl x509 -noout -text -in /dev/stdin

For example, the output shows the certificate for the sleep principal, valid for 24 hours and issued by the local Kubernetes cluster.

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 307564724378612391645160879542592778778 (0xe762cfae32a3b8e3e50cb9abad32b21a)
    Signature Algorithm: SHA256-RSA
        Issuer: O=cluster.local
        Validity
            Not Before: Aug 29 21:00:14 2022 UTC
            Not After : Aug 30 21:02:14 2022 UTC
        Subject:
        Subject Public Key Info:
            Public Key Algorithm: RSA
                Public-Key: (2048 bit)
                Modulus:
                    ac:db:1a:77:72:8a:99:28:4a:0c:7e:43:fa:ff:35:
                    75:aa:88:4b:80:4f:86:ca:69:59:1c:b5:16:7b:71:
                    dd:74:57:e2:bc:cf:ed:29:7d:7b:fa:a2:c9:06:e6:
                    d6:41:43:2a:3c:2c:18:8e:e8:17:f6:82:7a:64:5f:
                    c4:8a:a4:cd:f1:4a:9c:3f:e0:cc:c5:d5:79:49:37:
                    30:10:1b:97:94:2c:b7:1b:ed:a2:62:d9:3b:cd:3b:
                    12:c9:b2:6c:3c:2c:ac:54:5b:a7:79:97:fb:55:89:
                    ca:08:0e:2e:2a:b8:d2:e0:3b:df:b2:21:99:06:1b:
                    60:0d:e8:9d:91:dc:93:2f:7c:27:af:3e:fc:42:99:
                    69:03:9c:05:0b:c2:11:25:1f:71:f0:8a:b1:da:4a:
                    da:11:7c:b4:14:df:6e:75:38:55:29:53:63:f5:56:
                    15:d9:6f:e6:eb:be:61:e4:ce:4b:2a:f9:cb:a6:7f:
                    84:b7:4c:e4:39:c1:4b:1b:d4:4c:70:ac:98:95:fe:
                    3e:ea:5a:2c:6c:12:7d:4e:24:ab:dc:0e:8f:bc:88:
                    02:f2:66:c9:12:f0:f7:9e:23:c9:e2:4d:87:75:b8:
                    17:97:3c:96:83:84:3f:d1:02:6d:1c:17:1a:43:ce:
                    68:e2:f3:d7:dd:9e:a6:7d:d3:12:aa:f5:62:91:d9:
                    8d
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                Server Authentication, Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier:
                keyid:93:49:C1:B8:AB:BF:0F:7D:44:69:5A:C3:2A:7A:3C:79:19:BE:6A:B7
            X509v3 Subject Alternative Name: critical
                URI:spiffe://cluster.local/ns/default/sa/sleep

Note: If you don’t get any output, it may mean that ds/ztunnel selected a node whose ztunnel doesn’t manage any certificates. You can instead specify a particular ztunnel pod (e.g. istioctl pc secret ztunnel-tcn4m -n istio-system) that manages one of the sample application pods.

Secure application access

After you have added your application to the ambient mesh, you can secure application access using L4 authorization policies. This lets you control access to and from a service based on client workload identities, but not yet at the L7 level, for example by HTTP method such as GET or POST.

L4 Authorization Policies

Explicitly allow the sleep and istio-ingressgateway service accounts to call the productpage service:

$ kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: productpage-viewer
 namespace: default
spec:
 selector:
   matchLabels:
     app: productpage
 action: ALLOW
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/default/sa/sleep", "cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account"]
EOF

Confirm the above authorization policy is working:

$ # this should succeed
$ kubectl exec deploy/sleep -- curl -s http://istio-ingressgateway.istio-system/productpage | head -n1
$ # this should succeed
$ kubectl exec deploy/sleep -- curl -s http://productpage:9080/ | head -n1
$ # this should fail with an empty reply
$ kubectl exec deploy/notsleep -- curl -s http://productpage:9080/ | head -n1

Layer 7 Authorization Policies

Using the Kubernetes Gateway API, you can deploy a waypoint proxy for the productpage service that uses the bookinfo-productpage service account. Any traffic going to the productpage service will be mediated, enforced and observed by the Layer 7 (L7) proxy.

$ kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
 name: productpage
 annotations:
   istio.io/service-account: bookinfo-productpage
spec:
 gatewayClassName: istio-mesh
EOF

Note the gatewayClassName has to be istio-mesh for the waypoint proxy.

View the productpage waypoint proxy status; you should see the details of the gateway resource with Ready status:

$ kubectl get gateway productpage -o yaml
...
status:
  conditions:
  - lastTransitionTime: "2022-09-06T20:24:41Z"
    message: Deployed waypoint proxy to "default" namespace for "bookinfo-productpage"
      service account
    observedGeneration: 1
    reason: Ready
    status: "True"
    type: Ready

Update the AuthorizationPolicy to explicitly allow the sleep and istio-ingressgateway service accounts to issue GET requests to the productpage service, but perform no other operations:

$ kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: productpage-viewer
 namespace: default
spec:
 selector:
   matchLabels:
     app: productpage
 action: ALLOW
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/default/sa/sleep", "cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account"]
   to:
   - operation:
       methods: ["GET"]
EOF

Confirm the above authorization policy is working:

$ # this should fail with an RBAC error because it is not a GET operation
$ kubectl exec deploy/sleep -- curl -s http://productpage:9080/ -X DELETE | head -n1
$ # this should fail with an RBAC error because the identity is not allowed
$ kubectl exec deploy/notsleep -- curl -s http://productpage:9080/  | head -n1
$ # this should continue to work
$ kubectl exec deploy/sleep -- curl -s http://productpage:9080/ | head -n1
Inbound requests from sleep to `productpage` and from `productpage` to reviews with secure overlay and L7 processing layers

With the productpage waypoint proxy deployed, you’ll also automatically get L7 metrics for all requests to the productpage service:

$ kubectl exec deploy/bookinfo-productpage-waypoint-proxy -- curl -s http://localhost:15020/stats/prometheus | grep istio_requests_total

You’ll notice metrics with response_code=403 along with some with response_code=200, like the example below:

istio_requests_total{
  response_code="403",
  source_workload="notsleep",
  source_workload_namespace="default",
  source_principal="spiffe://cluster.local/ns/default/sa/notsleep",
  destination_workload="productpage-v1",
  destination_principal="spiffe://cluster.local/ns/default/sa/bookinfo-productpage",
  connection_security_policy="mutual_tls",
  ...
}

The metric shows two 403 responses recorded when the source workload (notsleep) called the destination workload (productpage-v1), along with the source and destination principals, over a mutual TLS connection.

Control Traffic

Deploy a waypoint proxy for the reviews service, using the bookinfo-reviews service account, so that any traffic going to the reviews service will be mediated by the waypoint proxy.

$ kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
 name: reviews
 annotations:
   istio.io/service-account: bookinfo-reviews
spec:
 gatewayClassName: istio-mesh
EOF

Apply the reviews virtual service to send 90% of the traffic to reviews v1 and 10% to reviews v2.

$ kubectl apply -f samples/bookinfo/networking/virtual-service-reviews-90-10.yaml
$ kubectl apply -f samples/bookinfo/networking/destination-rule-reviews.yaml
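
For reference, the weighted routing in that sample is roughly equivalent to the following VirtualService. This is a sketch of the bundled sample (which also relies on the subsets defined in the accompanying destination rule), not a verbatim copy:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10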

Confirm that roughly 10% of the 100 requests go to reviews-v2:

$ kubectl exec -it deploy/sleep -- sh -c 'for i in $(seq 1 100); do curl -s http://istio-ingressgateway.istio-system/productpage | grep reviews-v.-; done'

Wrapping up

The existing Istio resources continue to work, regardless of whether you choose the sidecar or ambient data plane mode.

Take a look at the short video to watch Lin run through the Istio ambient mesh demo in 5 minutes:

What’s next

We are super excited about the new Istio ambient data plane with its simple “ambient” architecture. Onboarding your applications onto a service mesh with ambient mode is now as easy as labeling a namespace. Your applications instantly gain benefits such as mTLS with cryptographic identity for mesh traffic and L4 observability. If you need to control access or routes, increase resiliency, or gain L7 metrics among your applications in the ambient mesh, you can apply waypoint proxies to them as needed. We’re big fans of paying for only what we need, as it not only saves compute resources but also saves the operational cost of constantly updating many proxies! We invite you to try the new Istio ambient data plane architecture to experience how simple it is. We look forward to your feedback in the Istio community!

]]>
Wed, 07 Sep 2022 08:00:00 -0600/v1.24//blog/2022/get-started-ambient//v1.24//blog/2022/get-started-ambient/ambientdemoguide
Introducing Ambient Mesh

Today, we are excited to introduce “ambient mesh”, and its reference implementation: a new Istio data plane mode that’s designed for simplified operations, broader application compatibility, and reduced infrastructure cost. Ambient mesh gives users the option to forgo sidecar proxies in favor of a data plane that’s integrated into their infrastructure, all while maintaining Istio’s core features of zero-trust security, telemetry, and traffic management. We are sharing a preview of ambient mesh with the Istio community that we are working to bring to production readiness in the coming months.

Istio and sidecars

Since its inception, a defining feature of Istio’s architecture has been the use of sidecars – programmable proxies deployed alongside application containers. Sidecars allow operators to reap Istio’s benefits, without requiring applications to undergo major surgery and its associated costs.

Istio’s traditional model deploys Envoy proxies as sidecars within the workloads’ pods

Although sidecars have significant advantages over refactoring applications, they do not provide a perfect separation between applications and the Istio data plane. This results in a few limitations:

  • Invasiveness - Sidecars must be “injected” into applications by modifying their Kubernetes pod spec and redirecting traffic within the pod (see the example after this list). As a result, installing or upgrading sidecars requires restarting the application pod, which can be disruptive for workloads.
  • Underutilization of resources - Since the sidecar proxy is dedicated to its associated workload, the CPU and memory resources must be provisioned for worst case usage of each individual pod. This adds up to large reservations that can lead to underutilization of resources across the cluster.
  • Traffic breaking - Traffic capture and HTTP processing, as typically done by Istio’s sidecars, is computationally expensive and can break some applications with non-conformant HTTP implementations.
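
As a concrete illustration of the first point, sidecar injection is typically opted into per namespace, and existing pods only receive the proxy after a restart. A minimal sketch, where the namespace and deployment names are hypothetical:

$ # opt the namespace in to automatic sidecar injection
$ kubectl label namespace default istio-injection=enabled
$ # existing pods only get the sidecar once they are recreated
$ kubectl rollout restart deployment/my-app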

While sidecars have their place — more on that later — we think there is a need for a less invasive and easier option that will be a better fit for many service mesh users.

Slicing the layers

Traditionally, Istio implements all data plane functionality, from basic encryption through advanced L7 policy, in a single architectural component: the sidecar. In practice, this makes sidecars an all-or-nothing proposition. Even if a workload just needs simple transport security, administrators still need to pay the operational cost of deploying and maintaining a sidecar. Sidecars have a fixed operational cost per workload that does not scale to fit the complexity of the use case.

The ambient data plane takes a different approach. It splits Istio’s functionality into two distinct layers. At the base, there’s a secure overlay that handles routing and zero trust security for traffic. Above that, when needed, users can enable L7 processing to get access to the full range of Istio features. The L7 processing mode, while heavier than the secure overlay, still runs as an ambient component of the infrastructure, requiring no modifications to application pods.

Layers of the ambient mesh

This layered approach allows users to adopt Istio in a more incremental fashion, smoothly transitioning from no mesh, to the secure overlay, to full L7 processing — on a per-namespace basis, as needed. Furthermore, workloads running in different ambient layers, or with sidecars, interoperate seamlessly, allowing users to mix and match capabilities based on the particular needs as they change over time.

Building an ambient mesh

Istio’s ambient data plane mode uses a shared agent, running on each node in the Kubernetes cluster. This agent is a zero-trust tunnel (or ztunnel), and its primary responsibility is to securely connect and authenticate elements within the mesh. The networking stack on the node redirects all traffic of participating workloads through the local ztunnel agent. This fully separates the concerns of Istio’s data plane from those of the application, ultimately allowing operators to enable, disable, scale, and upgrade the data plane without disturbing applications. The ztunnel performs no L7 processing on workload traffic, making it significantly leaner than sidecars. This large reduction in complexity and associated resource costs make it amenable to delivery as shared infrastructure.

Ztunnels enable the core functionality of a service mesh: zero trust. A secure overlay is created when ambient mode is enabled for a namespace. It provides workloads with mTLS, telemetry, authentication, and L4 authorization, without terminating or parsing HTTP.

Ambient mesh uses a shared, per-node ztunnel to provide a zero-trust secure overlay

After ambient mode is enabled and a secure overlay is created, a namespace can be configured to utilize L7 features. This allows a namespace to implement the full set of Istio capabilities, including the Virtual Service API, L7 telemetry, and L7 authorization policies. Namespaces operating in this mode use one or more Envoy-based waypoint proxies to handle L7 processing for workloads in that namespace. Istio’s control plane configures the ztunnels in the cluster to pass all traffic that requires L7 processing through the waypoint proxy. Importantly, from a Kubernetes perspective, waypoint proxies are just regular pods that can be auto-scaled like any other Kubernetes deployment. We expect this to yield significant resource savings for users, as the waypoint proxies can be auto-scaled to fit the real time traffic demand of the namespaces they serve, not the maximum worst-case load operators expect.
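
As a sketch of what that auto-scaling could look like, a standard HorizontalPodAutoscaler can target a waypoint deployment just like any other Deployment. The deployment name below is taken from the getting started guide and may differ in your environment, and the API version assumes Kubernetes 1.23 or newer:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: productpage-waypoint
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: bookinfo-productpage-waypoint-proxy
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80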

When additional features are needed, ambient mesh deploys waypoint proxies, which ztunnels connect through for policy enforcement

Ambient mesh uses HTTP CONNECT over mTLS to implement its secure tunnels and insert waypoint proxies in the path, a pattern we call HBONE (HTTP-Based Overlay Network Environment). HBONE provides for a cleaner encapsulation of traffic than TLS on its own while enabling interoperability with common load-balancer infrastructure. FIPS builds are used by default to meet compliance needs. More details on HBONE, its standards-based approach, and plans for UDP and other non-TCP protocols will be provided in a future blog.

Mixing sidecar and ambient modes in a single mesh does not introduce limitations on the capabilities or security properties of the system. The Istio control plane ensures that policies are properly enforced regardless of the deployment model chosen. Ambient mode simply introduces an option that has better ergonomics and more flexibility.

Why no L7 processing on the local node?

Ambient mode uses a shared ztunnel agent on the node, which handles the zero trust aspects of the mesh, while L7 processing happens in the waypoint proxy in separately scheduled pods. Why bother with the indirection, and not just use a shared full L7 proxy on the node? There are several reasons for this:

  • Envoy is not inherently multi-tenant. As a result, we have security concerns with commingling complex processing rules for L7 traffic from multiple unconstrained tenants in a shared instance. By strictly limiting to L4 processing, we reduce the vulnerability surface area significantly.
  • The mTLS and L4 features provided by the ztunnel need a much smaller CPU and memory footprint when compared to the L7 processing required in the waypoint proxy. By running waypoint proxies as a shared namespace resource, we can scale them independently based on the needs of that namespace, and its costs are not unfairly distributed across unrelated tenants.
  • By reducing ztunnel’s scope we allow for it to be replaced by other secure tunnel implementations that can meet a well-defined interoperability contract.

But what about those extra hops?

With ambient mode, a waypoint isn’t necessarily guaranteed to be on the same node as the workloads it serves. While at first glance this may appear to be a performance concern, we’re confident that latency will ultimately be in-line with Istio’s current sidecar implementation. We’ll discuss more in a dedicated performance blog post, but for now we’ll summarize with two points:

  • The majority of Istio’s network latency does not, in fact, come from the network (modern cloud providers have extremely fast networks). Instead the biggest culprit is the intensive L7 processing Istio needs to implement its sophisticated feature set. Unlike sidecars, which implement two L7 processing steps for each connection (one for each sidecar), ambient mode collapses these two steps into one. In most cases, we expect this reduced processing cost to compensate for an additional network hop.
  • Users often deploy a mesh to enable a zero-trust security posture as a first-step and then selectively enable L7 capabilities as needed. Ambient mode allows those users to bypass the cost of L7 processing entirely when it’s not needed.

Resource overhead

Overall we expect Istio’s ambient mode to have fewer and more predictable resource requirements for most users. The ztunnel’s limited responsibilities allows it to be deployed as a shared resource on the node. This will substantially reduce the per-workload reservations required for most users. Furthermore, since the waypoint proxies are normal Kubernetes pods, they can be dynamically deployed and scaled based on the real-time traffic demands of the workloads they serve.

Sidecars, on the other hand, need to reserve memory and CPU for the worst case for each workload. Making these calculations is complicated, so in practice administrators tend to over-provision. This leads to underutilized nodes, because high reservations prevent other workloads from being scheduled. Ambient mode’s lower fixed per-node overhead and dynamically scaled waypoint proxies will require far fewer resource reservations in aggregate, leading to more efficient use of a cluster.

What about security?

With a radically new architecture naturally comes questions around security. The ambient mode security blog does a deep dive, but we’ll summarize here.

Sidecars co-locate with the workloads they serve and as a result, a vulnerability in one compromises the other. In the ambient mesh model, even if an application is compromised, the ztunnels and waypoint proxies can still enforce strict security policy on the compromised application’s traffic. Furthermore, given that Envoy is a mature battle-tested piece of software used by the world’s largest network operators, it is likely less vulnerable than the applications it runs alongside.

While the ztunnel is a shared resource, it only has access to the keys of the workloads currently running on its node. Thus, its blast radius is no worse than any other encrypted CNI that relies on per-node keys for encryption. Also, given the ztunnel’s limited, L4-only attack surface and Envoy’s aforementioned security properties, we feel this risk is limited and acceptable.

Finally, while the waypoint proxies are a shared resource, they can be limited to serving just one service account. This makes them no worse than sidecars are today; if one waypoint proxy is compromised, the credential associated with that waypoint is lost, and nothing else.

Is this the end of the road for the sidecar?

Definitely not. While we believe ambient mesh will be the best option for many mesh users going forward, sidecars continue to be a good choice for those that need dedicated data plane resources, such as for compliance or performance tuning. Istio will continue to support sidecars, and importantly, allow them to interoperate seamlessly with ambient mode. In fact, the ambient mode code we’re releasing today already supports interoperation with sidecar-based Istio.

Learn more

Take a look at a short video to watch Christian run through the Istio ambient mode components and demo some capabilities:

Get involved

What we have released today is an early version of ambient mode in Istio, and it is very much still under active development. We are excited to share it with the broader community and look forward to getting more people involved in shaping it as we move to production readiness in 2023.

We would love your feedback to help shape the solution. A build of Istio which supports ambient mode is available to download and try in the Istio Experimental repo. A list of missing features and work items is available in the README. Please try it out and let us know what you think!

Thank you to the team that contributed to the launch of ambient mesh!

  • Google: Craig Box, John Howard, Ethan J. Jackson, Abhi Joglekar, Steven Landow, Oliver Liu, Justin Pettit, Doug Reid, Louis Ryan, Kuat Yessenov, Francis Zhou
  • Solo.io: Aaron Birkland, Kevin Dorosh, Greg Hanson, Daniel Hawton, Denis Jannot, Yuval Kohavi, Idit Levine, Yossi Mesika, Neeraj Poddar, Nina Polshakova, Christian Posta, Lin Sun, Eitan Yarmush
]]>
Wed, 07 Sep 2022 07:00:00 -0600/v1.24//blog/2022/introducing-ambient-mesh//v1.24//blog/2022/introducing-ambient-mesh/ambient
Extending Gateway API support in Istio

Today we want to congratulate the Kubernetes SIG Network community on the beta release of the Gateway API specification. Alongside this milestone, we are pleased to announce that support for using the Gateway API in Istio ingress is being promoted to Beta, and our intention for the Gateway API to become the default API for all Istio traffic management in the future. We are also excited to welcome our friends from the Service Mesh Interface (SMI) community, who are joining us in a new effort to standardize service mesh use cases using the Gateway API.

The history of Istio’s traffic management APIs

API design is more of an art than a science, and Istio is often used as an API to configure the serving of other APIs! In the case of traffic routing alone, we must consider producer vs consumer, routing vs. post-routing, and how to express a complex feature set with the correct number of objects — factoring in that these must be owned by different teams.

When we launched Istio in 2017, we brought many years of experience from Google’s production API serving infrastructure and IBM’s Amalgam8 project, and mapped it onto Kubernetes. We soon came up against the limitations of Kubernetes’ Ingress API. A desire to support all proxy implementations meant that Ingress only supported the most basic of HTTP routing features, with other features often implemented as vendor-specific annotations. The Ingress API was shared between infrastructure admins (“create and configure a load balancer”), cluster operators (“manage a TLS certificate for my entire domain”) and application users (“use it to route /foo to the foo service”).

We rewrote our traffic APIs in early 2018 to address user feedback, and to more adequately address these concerns.

A primary feature of Istio’s new model was having separate APIs that describe infrastructure (the load balancer, represented by the Gateway), and application (routing and post-routing, represented by the VirtualService and DestinationRule).

Ingress worked well as a lowest common denominator between different implementations, but its shortcomings led SIG Network to investigate the design of a “version 2”. A user survey in 2018 was followed by a proposal for new APIs in 2019, based in large part on Istio’s traffic APIs. That effort came to be known as the “Gateway API”.

The Gateway API was built to be able to model many more use cases, with extension points to enable functionality that differs between implementations. Furthermore, adopting the Gateway API opens a service mesh up to compatibility with the whole ecosystem of software that is written to support it. You don’t have to ask your vendor to support Istio routing directly: all they need to do is create Gateway API objects, and Istio will do what it needs to do, out of the box.
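
For example, a minimal ingress configuration using the Gateway API might look like the following. The gateway, route, and backend service names here are hypothetical; Istio's implementation is selected via the istio GatewayClass:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: my-gateway
  namespace: default
spec:
  gatewayClassName: istio
  listeners:
  - name: http
    port: 80
    protocol: HTTP
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: foo-route
  namespace: default
spec:
  parentRefs:
  - name: my-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /foo
    backendRefs:
    - name: foo
      port: 8080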

Support for the Gateway API in Istio

Istio added support for the Gateway API in November 2020, with support marked Alpha along with the API implementation. With the Beta release of the API spec we are pleased to announce support for ingress use in Istio is being promoted to Beta. We also encourage early adopters to start experimenting with the Gateway API for mesh (service-to-service) use, and we will move that support to Beta when SIG Network has standardized the required semantics.

Around the time of the v1 release of the API, we intend to make the Gateway API the default method for configuring all traffic routing in Istio - for ingress (north-south) and service-to-service (east-west). At that time, we will change our documentation and examples to reflect the recommendation.

Just like Kubernetes intends to support the Ingress API for many years after the Gateway API goes stable, the Istio APIs (Gateway, VirtualService and DestinationRule) will remain supported for the foreseeable future.

Not only that, but you can continue to use the existing Istio traffic APIs alongside the Gateway API, for example, using an HTTPRoute with an Istio VirtualService.

The similarity between the APIs means that we will be able to offer a tool to easily convert Istio API objects to Gateway API objects, and we will release this alongside the v1 version of the API.

Other parts of Istio functionality, including policy and telemetry, will continue to be configured using Istio-specific APIs while we work with SIG Network on standardization of these use cases.

Welcoming the SMI community to the Gateway API project

Throughout its design and implementation, members of the Istio team have been working with members of SIG Network on the implementation of the Gateway API, making sure the API was suitable for use in mesh use cases.

We are delighted to be formally joined in this effort by members of the Service Mesh Interface (SMI) community, including leaders from Linkerd, Consul and Open Service Mesh, who have collectively decided to standardize their API efforts on the Gateway API. To that end, we have set up a Gateway API Mesh Management and Administration (GAMMA) workstream within the Gateway API project. John Howard, a member of the Istio Technical Oversight Committee and a lead of our Networking WG, will be a lead of this group.

Our combined next steps are to provide enhancement proposals to the Gateway API project to support mesh use cases. We have started looking at API semantics for mesh traffic management, and will work with vendors and communities implementing Gateway API in their projects to build on a standard implementation. After that, we intend to build a representation for authorization and authentication policy.

With SIG Network as a vendor neutral forum for ensuring the service mesh community implements the Gateway API using the same semantics, we look forward to having a standard API which works with all projects, regardless of their technology stack or proxy.

]]>
Wed, 13 Jul 2022 00:00:00 +0000/v1.24//blog/2022/gateway-api-beta//v1.24//blog/2022/gateway-api-beta/traffic-managementgatewaygateway-apiapigammasig-network
CryptoMB - TLS handshake acceleration for Istio

Cryptographic operations are among the most compute-intensive and critical operations when it comes to secured connections. Istio uses Envoy as the “gateways/sidecar” to handle secure connections and intercept the traffic.

Depending upon the use case, when an ingress gateway must handle a large number of incoming TLS connections, or many secured service-to-service connections pass through sidecar proxies, the load on Envoy increases. The potential performance depends on many factors, such as the size of the cpuset on which Envoy is running, incoming traffic patterns, and key size. These factors can impact how many new incoming TLS requests Envoy can serve. To achieve performance improvements and accelerated handshakes, a new feature was introduced in Envoy 1.20 and Istio 1.14. It is achieved with 3rd Gen Intel® Xeon® Scalable processors, the Intel® Integrated Performance Primitives (Intel® IPP) crypto library, CryptoMB Private Key Provider Method support in Envoy, and Private Key Provider configuration in Istio using ProxyConfig.

CryptoMB

The Intel IPP crypto library supports multi-buffer crypto operations. Briefly, multi-buffer cryptography is implemented with Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions using a SIMD (single instruction, multiple data) mechanism. Up to eight RSA or ECDSA operations are gathered into a buffer and processed at the same time, providing potentially improved performance. Intel AVX-512 instructions are available on the recently launched 3rd generation Intel Xeon Scalable server processors (Ice Lake server).

The idea of Envoy’s CryptoMB private key provider is that incoming TLS handshakes’ RSA operations are accelerated using Intel AVX-512 multi-buffer instructions.

Accelerate Envoy with Intel AVX-512 instructions

Envoy uses BoringSSL as the default TLS library. BoringSSL supports setting private key methods for offloading asynchronous private key operations, and Envoy implements a private key provider framework to allow creation of Envoy extensions which handle TLS handshakes private key operations (signing and decryption) using the BoringSSL hooks.

CryptoMB private key provider is an Envoy extension which handles BoringSSL TLS RSA operations using Intel AVX-512 multi-buffer acceleration. When a new handshake happens, BoringSSL invokes the private key provider to request the cryptographic operation, and then the control returns to Envoy. The RSA requests are gathered in a buffer. When the buffer is full or the timer expires, the private key provider invokes Intel AVX-512 processing of the buffer. When processing is done, Envoy is notified that the cryptographic operation is done and that it may continue with the handshakes.

Envoy <-> BoringSSL <-> PrivateKeyProvider

The Envoy worker thread has a buffer size for eight RSA requests. When the first RSA request is stored in the buffer, a timer will be initiated (timer duration is set by the poll_delay field in the CryptoMB configuration).

Buffer timer started

When the buffer is full or when the timer expires, the crypto operations are performed for all RSA requests simultaneously. The SIMD (single instruction, multiple data) processing gives the potential performance benefit compared to the non-accelerated case.

Buffer timer expired

Envoy CryptoMB Private Key Provider configuration

A regular TLS configuration only uses a private key. When a private key provider is used, the private key field is replaced with a private key provider field. It contains two fields, provider name and typed config. Typed config is CryptoMbPrivateKeyMethodConfig, and it specifies the private key and the poll delay.

TLS configuration with just a private key.

tls_certificates:
  certificate_chain: { "filename": "/path/cert.pem" }
  private_key: { "filename": "/path/key.pem" }

TLS configuration with CryptoMB private key provider.

tls_certificates:
  certificate_chain: { "filename": "/path/cert.pem" }
  private_key_provider:
    provider_name: cryptomb
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.private_key_providers.cryptomb.v3alpha.CryptoMbPrivateKeyMethodConfig
      private_key: { "filename": "/path/key.pem" }
      poll_delay: 10ms

Istio CryptoMB Private Key Provider configuration

In Istio, the CryptoMB private key provider can be configured mesh-wide, for specific gateways, or for specific pods using pod annotations. The user provides the privateKeyProvider in the ProxyConfig, along with the pollDelay value. The configuration below is applied mesh-wide (to gateways and all sidecars).

Sample mesh wide configuration

Istio Mesh wide Configuration

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: example-istiocontrolplane
spec:
  profile: demo
  components:
    egressGateways:
    - name: istio-egressgateway
      enabled: true
    ingressGateways:
    - name: istio-ingressgateway
      enabled: true
  meshConfig:
    defaultConfig:
      privateKeyProvider:
        cryptomb:
          pollDelay: 10ms

Istio Gateways Configuration

If you want to apply a private key provider configuration to the ingress gateway only, use the following sample configuration.

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: example-istiocontrolplane
spec:
  profile: demo
  components:
    egressGateways:
    - name: istio-egressgateway
      enabled: true
    ingressGateways:
    - name: istio-ingressgateway
      enabled: true
      k8s:
        podAnnotations:
          proxy.istio.io/config: |
            privateKeyProvider:
              cryptomb:
                pollDelay: 10ms

Istio Sidecar Configuration using pod annotations

If you want to apply a private key provider configuration to specific application pods, configure them using pod annotations as in the sample below.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
      annotations:
        proxy.istio.io/config: |
          privateKeyProvider:
            cryptomb:
              pollDelay: 10ms
    spec:
      serviceAccountName: httpbin
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80

Performance

The potential performance benefit depends on many factors. For example, the size of the cpuset Envoy is running on, incoming traffic pattern, encryption type (RSA or ECDSA), and key size.

Below, we show performance based on the total latency measured between the k6 client, the gateway, and the Fortio server. These results show the relative performance improvement of using the CryptoMB provider, and are in no way representative of Istio’s general performance or benchmark results. Our measurements use different client tools (k6 and Fortio), a different setup (client, gateway and server running on separate nodes), and we create a new TLS handshake with every HTTP request.

We have published a white paper with general cryptographic performance numbers.

Istio ingress gateway TLS handshake performance comparison. Tested using 1.14-dev on May 10th 2022

Configuration used in above comparison.

  • Azure AKS Kubernetes cluster
    • v1.21
    • Three-node cluster
    • Each node Standard_D4ds_v5: 3rd Generation Intel® Xeon® Platinum 8370C (Ice Lake), 4 vCPU, 16 GB memory
  • Istio
    • 1.14-dev
    • Istio ingress gateway pod
      • resources.request.cpu: 2
      • resources.request.memory: 4 GB
      • resources.limits.cpu: 2
      • resources.limits.memory: 4 GB
  • K6
    • loadimpact/k6:latest
  • Fortio
    • fortio/fortio:1.27.0
  • K6 client, envoy and fortio pods are forced to run on separate nodes via Kubernetes AntiAffinity and node selectors
  • In above picture
    • Istio is installed with above configuration
    • Istio with CryptoMB (AVX-512) with above configuration + below settings
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    ingressGateways:
    - enabled: true
      name: istio-ingressgateway
      k8s:
        # this controls the SDS service which configures ingress gateway
        podAnnotations:
          proxy.istio.io/config: |
            privateKeyProvider:
              cryptomb:
                pollDelay: 1ms
  values:
    # Annotate pods with
    #     inject.istio.io/templates: sidecar, cryptomb
    sidecarInjectorWebhook:
      templates:
        cryptomb: |
          spec:
            containers:
            - name: istio-proxy
]]>
Wed, 15 Jun 2022 00:00:00 +0000/v1.24//blog/2022/cryptomb-privatekeyprovider//v1.24//blog/2022/cryptomb-privatekeyprovider/IstioCryptoMBgatewayssidecar
Istio has applied to become a CNCF project

The Istio project is pleased to announce its intention to join the Cloud Native Computing Foundation (CNCF). With the support of the Istio Steering Committee, Google has submitted an application proposal for Istio to join the CNCF, the home of its companion projects Kubernetes and Envoy.

It is almost 5 years since Google, IBM and Lyft launched Istio 0.1 in May 2017. That first version set the standard for what a service mesh should be: traffic management, policy enforcement, and observability, powered by sidecars next to workloads. We’re proud to be the most popular service mesh according to a recent CNCF survey, and look forward to working closer with the CNCF communities around networking and service mesh.

As we deepen our integration with Kubernetes through the Gateway API and gRPC with proxyless mesh — not to mention Envoy, which has grown up beside Istio — we think it’s time to unite the premier Cloud Native stack under a single umbrella.

What’s next?

Today is just the start of a journey. The CNCF Technical Oversight Committee will carefully consider our application, and perform due diligence. After that, they’ll open up for a vote, and if successful, the project will be transferred.

The work we did in establishing guidelines for the Istio trademark through the Open Usage Commons (OUC) will ensure the whole ecosystem can continue to use the Istio trademarks in a free and fair fashion. The trademarks will move to the Linux Foundation but continue to be managed under OUC’s trademark guidelines.

Google currently funds and manages Istio’s build/test infrastructure. The company has committed to continue sponsoring this infrastructure as it moves to management by the CNCF, and it will be supported with credits from Google and other contributors after the transition is complete.

Nothing about our current open governance model has to change as a result of this transfer. We will continue to reward corporate contribution, community influence and long-term maintainership through our Steering Committee and Technical Oversight Committee model. Istio is key to the future of Google Cloud and Google intends to continue investing heavily in the project.

We want to thank the ecosystem of Istio users, integrated projects, and professional services vendors. Please send us a PR if you want to be listed on our site!

Istio is the building block for products by over 20 different vendors. No other service mesh has a comparable footprint. We want to thank all the clouds, technology enterprises, startups and everyone else who has built a product based on Istio, or who makes Istio available with their hosted Kubernetes service. We look forward to our continued collaboration.

Finally, we want to thank Google for their stewardship of the Istio community to date, their immeasurable contributions to Istio, and for their continued support during this transition.

See also

For more perspectives on today’s news, please read blog posts from Google, IBM, Tetrate, VMware, Solo.io, Aspen Mesh and Red Hat.

]]>
Mon, 25 Apr 2022 00:00:00 +0000/v1.24//blog/2022/istio-has-applied-to-join-the-cncf//v1.24//blog/2022/istio-has-applied-to-join-the-cncf/IstioCNCF
Configuring istioctl for a remote cluster

When using the istioctl CLI on a remote cluster of an external control plane or a multicluster Istio deployment, some of the commands will not work by default. For example, istioctl proxy-status requires access to the istiod service to retrieve the status and configuration of the proxies it’s managing. If you try running it on a remote cluster, you’ll get an error message like this:

$ istioctl proxy-status
Error: unable to find any Istiod instances

Notice that the error message doesn’t just say that it’s unable to access the istiod service, it specifically mentions its inability to find istiod instances. This is because the istioctl proxy-status implementation needs to retrieve the sync status of not just any single istiod instance, but rather all of them. When there is more than one istiod instance (replica) running, each instance is only connected to a subset of the service proxies running in the mesh. The istioctl command needs to return the status for the entire mesh, not just the subset managed by one of the instances.

In an ordinary Istio installation where the istiod service is running locally on the cluster (i.e., a primary cluster), the command is implemented by simply finding all of the running istiod pods, calling each one in turn, and then aggregating the result before returning it to the user.
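
To see the first step of that flow for yourself, you can list the istiod pods the CLI would discover, assuming the default installation namespace and the standard app=istiod label:

$ kubectl get pods -n istio-system -l app=istiod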

CLI with local access to istiod pods

When using a remote cluster, on the other hand, this is not possible since the istiod instances are running outside of the mesh cluster and not accessible to the mesh user. The instances may not even be deployed using pods on a Kubernetes cluster.

Fortunately, istioctl provides a configuration option to address this issue. You can configure istioctl with the address of an external proxy service that will have access to the istiod instances. Unlike an ordinary load-balancer service, which would delegate incoming requests to one of the instances, this proxy service must instead delegate to all of the istiod instances, aggregate the responses, and then return the combined result.

If the external proxy service is, in fact, running on another Kubernetes cluster, the proxy implementation code can be very similar to the implementation code that istioctl runs in the primary cluster case, i.e., find all of the running istiod pods, call each one in turn, and then aggregate the result.

CLI without local access to istiod pods

An Istio Ecosystem project that includes an implementation of such an istioctl proxy server can be found here. To try it out, you’ll need two clusters, one of which is configured as a remote cluster using a control plane installed in the other cluster.

Install Istio with a remote cluster topology

To demonstrate istioctl working on a remote cluster, we’ll start by using the external control plane install instructions to set up a single remote cluster mesh with an external control plane running in a separate external cluster.

After completing the installation, we should have two environment variables, CTX_REMOTE_CLUSTER and CTX_EXTERNAL_CLUSTER, containing the context names of the remote (mesh) and external (control plane) clusters, respectively.

We should also have the helloworld and sleep samples running in the mesh, i.e., on the remote cluster:

$ kubectl get pod -n sample --context="${CTX_REMOTE_CLUSTER}"
NAME                             READY   STATUS    RESTARTS   AGE
helloworld-v1-776f57d5f6-tmpkd   2/2     Running   0          10s
sleep-557747455f-v627d           2/2     Running   0          9s

Notice that if you try to run istioctl proxy-status in the remote cluster, you will see the error message described earlier:

$ istioctl proxy-status --context="${CTX_REMOTE_CLUSTER}"
Error: unable to find any Istiod instances

Configure istioctl to use the sample proxy service

To configure istioctl, we first need to deploy the proxy service next to the running istiod pods. In our installation, we’ve deployed the control plane in the external-istiod namespace, so we start the proxy service on the external cluster using the following command:

$ kubectl apply -n external-istiod --context="${CTX_EXTERNAL_CLUSTER}" \
    -f https://raw.githubusercontent.com/istio-ecosystem/istioctl-proxy-sample/main/istioctl-proxy.yaml
service/istioctl-proxy created
serviceaccount/istioctl-proxy created
secret/jwt-cert-key-secret created
deployment.apps/istioctl-proxy created
role.rbac.authorization.k8s.io/istioctl-proxy-role created
rolebinding.rbac.authorization.k8s.io/istioctl-proxy-role created

You can run the following command to confirm that the istioctl-proxy service is running next to istiod:

$ kubectl get po -n external-istiod --context="${CTX_EXTERNAL_CLUSTER}"
NAME                              READY   STATUS    RESTARTS   AGE
istioctl-proxy-664bcc596f-9q8px   1/1     Running   0          15s
istiod-666fb6694d-jklkt           1/1     Running   0          5m31s

The proxy service is a gRPC server that is serving on port 9090:

$ kubectl get svc istioctl-proxy -n external-istiod --context="${CTX_EXTERNAL_CLUSTER}"
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
istioctl-proxy   ClusterIP   172.21.127.192   <none>        9090/TCP   11m

Before we can use it, however, we need to expose it outside of the external cluster. There are many ways to do that, depending on the deployment environment. In our setup, we have an ingress gateway running on the external cluster, so we could update it to also expose port 9090, update the associated virtual service to direct port 9090 requests to the proxy service, and then configure istioctl to use the gateway address for the proxy service. This would be a “proper” approach.
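
A sketch of what that “proper” approach could look like is shown below, using an Istio Gateway and VirtualService to route port 9090 to the proxy service. The resource names are hypothetical, and the ingress gateway Service itself would also need to be updated to expose port 9090:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: istioctl-proxy-gateway
  namespace: external-istiod
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 9090
      name: grpc-istioctl
      protocol: GRPC
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: istioctl-proxy
  namespace: external-istiod
spec:
  hosts:
  - "*"
  gateways:
  - istioctl-proxy-gateway
  http:
  - match:
    - port: 9090
    route:
    - destination:
        host: istioctl-proxy.external-istiod.svc.cluster.local
        port:
          number: 9090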

However, since this is just a simple demonstration where we have access to both clusters, we will simply port-forward the proxy service to localhost:

$ kubectl port-forward -n external-istiod service/istioctl-proxy 9090:9090 --context="${CTX_EXTERNAL_CLUSTER}"

We now configure istioctl to use localhost:9090 to access the proxy by setting the ISTIOCTL_XDS_ADDRESS environment variable:

$ export ISTIOCTL_XDS_ADDRESS=localhost:9090
$ export ISTIOCTL_ISTIONAMESPACE=external-istiod
$ export ISTIOCTL_PREFER_EXPERIMENTAL=true

Because our control plane is running in the external-istiod namespace, instead of the default istio-system, we also need to set the ISTIOCTL_ISTIONAMESPACE environment variable.

Setting ISTIOCTL_PREFER_EXPERIMENTAL is optional. It instructs istioctl to redirect istioctl command calls to an experimental equivalent, istioctl x command, for any command that has both a stable and experimental implementation. In our case we need to use istioctl x proxy-status, the version that implements the proxy delegation feature.

Run the istioctl proxy-status command

Now that we’re finished configuring istioctl we can try it out by running the proxy-status command again:

$ istioctl proxy-status --context="${CTX_REMOTE_CLUSTER}"
NAME                                                      CDS        LDS        EDS        RDS        ISTIOD         VERSION
helloworld-v1-776f57d5f6-tmpkd.sample                     SYNCED     SYNCED     SYNCED     SYNCED     <external>     1.12.1
istio-ingressgateway-75bfd5668f-lggn4.external-istiod     SYNCED     SYNCED     SYNCED     SYNCED     <external>     1.12.1
sleep-557747455f-v627d.sample                             SYNCED     SYNCED     SYNCED     SYNCED     <external>     1.12.1

As you can see, this time it correctly displays the sync status of all the services running in the mesh. Notice that the ISTIOD column returns the generic value <external>, instead of the instance name (e.g., istiod-666fb6694d-jklkt) that would be displayed if the pod were running locally. In this case, this detail is neither available to, nor needed by, the mesh user. It’s only visible on the external cluster, for the mesh operator to see.

Summary

In this article, we used a sample proxy server to configure istioctl to work with an external control plane installation. We saw how some of the istioctl CLI commands don’t work out of the box on a remote cluster managed by an external control plane. Commands such as istioctl proxy-status, among others, need access to the istiod instances managing the mesh, which are unavailable when the control plane is running outside of the mesh cluster. To address this issue, istioctl was configured to delegate to a proxy server, running alongside the external control plane, which accesses the istiod instances on its behalf.

]]>
Fri, 25 Mar 2022 00:00:00 +0000/v1.24//blog/2022/istioctl-proxy//v1.24//blog/2022/istioctl-proxy/istioctlcliexternalremotemulticluster
Register now for IstioCon 2022!IstioCon is the annual user-centered event for Istio, the industry’s most popular service mesh. The event will take place April 25-29, will be 100% virtual, and registration is now open free of charge. If you are among the first 400 people to register for the conference, you are eligible to receive a conference t-shirt!

In 2021, more than 4,000 people from across 84 countries joined the event online, to hear from 27 end-user companies how they are using Istio in production. Participants were able to learn how Airbnb navigated scalability issues to finally find a solution in Istio, how HP set up a secure and wise platform with Istio, and how eBay used Istio to create federated access points, among many more examples of using Istio in production.

IstioCon 2022 will be an industry-focused event, a platform connecting contributors and users to discuss how Istio is used in different architectural setups, what its limitations are, and where to take the project next. The main focus will be on end-user companies, as we look forward to sharing a diverse set of case studies showing how to use Istio in production. The content will be categorized according to expertise level.

This community-led event also has an interactive social hour to take the load off and mesh with the Istio community, vendors, and maintainers. Participation in the event is free of charge, register today for a chance to get the conference t-shirt!

]]>
Mon, 21 Mar 2022 00:00:00 +0000/v1.24//blog/2022/istiocon-register//v1.24//blog/2022/istiocon-register/IstioConIstioconference
Merbridge - Accelerate your mesh with eBPFThe secret of Istio’s abilities in traffic management, security, observability and policy is all in the Envoy proxy. Istio uses Envoy as the “sidecar” to intercept service traffic, with the kernel’s netfilter packet filter functionality configured by iptables.

There are shortcomings in using iptables to perform this interception. Since netfilter is a highly versatile tool for filtering packets, several routing rules and data filtering processes are applied before a packet reaches the destination socket. For example, on the way from the network layer to the transport layer, netfilter processes the packet several times using predefined rules such as pre_routing and post_routing. When the packet becomes a TCP or UDP packet and is forwarded to user space, additional steps such as packet validation, protocol policy processing and destination socket lookup are performed. When a sidecar is configured to intercept traffic, the original data path can become very long, since these duplicated steps are performed several times.

Over the past two years, eBPF has become a trending technology, and many projects based on eBPF have been released to the community. Tools like Cilium and Pixie show great use cases for eBPF in observability and network packet processing. With eBPF’s sockops and redir capabilities, data packets can be processed efficiently by directly being transported from an inbound socket to an outbound socket. In an Istio mesh, it is possible to use eBPF to replace iptables rules, and accelerate the data plane by shortening the data path.

We have created an open source project called Merbridge, and by applying the following command to your Istio-managed cluster, you can use eBPF to achieve such network acceleration.

$ kubectl apply -f https://raw.githubusercontent.com/merbridge/merbridge/main/deploy/all-in-one.yaml

With Merbridge, the packet datapath can be shortened directly from one socket to another destination socket, and here’s how it works.

Using eBPF sockops for performance optimization

A network connection is essentially socket communication. eBPF provides a function, bpf_msg_redirect_hash, to directly forward packets sent by the application on an inbound socket to an outbound socket. Inside that function, developers can apply any logic to decide the packet's destination. Thanks to this capability, the datapath of packets can be noticeably optimized in the kernel.

The sock_map is the crucial piece in recording information for packet forwarding. When a packet arrives, an existing socket is selected from the sock_map to forward the packet to. As a result, we need to save all the socket information for packets to make the transportation process function properly. When there are new socket operations — like a new socket being created — the sock_ops function is executed. The socket metadata is obtained and stored in the sock_map to be used when processing packets. The common key type in the sock_map is a “quadruple” of source and destination addresses and ports. With the key and the rules stored in the map, the destination socket will be found when a new packet arrives.

The Merbridge approach

Let’s introduce the detailed design and implementation principles of Merbridge step by step, with a real scenario.

Istio sidecar traffic interception based on iptables

Istio Sidecar Traffic Interception Based on iptables

When external traffic hits your application’s ports, it will be intercepted by a PREROUTING rule in iptables, forwarded to port 15006 of the sidecar container, and handed over to Envoy for processing. This is shown as steps 1-4 in the red path in the above diagram.

Envoy processes the traffic using the policies issued by the Istio control plane. If allowed, the traffic will be sent to the actual container port of the application container.

When the application tries to access other services, it will be intercepted by an OUTPUT rule in iptables, and then be forwarded to port 15001 of the sidecar container, where Envoy is listening. This is steps 9-12 on the red path, similar to inbound traffic processing.

Traffic to the application port needs to be forwarded to the sidecar, then sent from the sidecar port to the container port, which is overhead. Moreover, because iptables is a general-purpose packet filter, its performance is not always ideal: every filtering rule it applies adds delay to the whole datapath. Although iptables is the common way to do packet filtering, in the Envoy proxy case the longer datapath amplifies the packet filtering bottleneck in the kernel.

If we use sockops to directly connect the sidecar’s socket to the application’s socket, the traffic will not need to go through iptables rules, and thus performance can be improved.

Processing outbound traffic

As mentioned above, we would like to use eBPF’s sockops to bypass iptables to accelerate network requests. At the same time, we also do not want to modify any parts of Istio, to make Merbridge fully adaptive to the community version. As a result, we need to simulate what iptables does in eBPF.

Traffic redirection in iptables utilizes its DNAT function. When trying to simulate the capabilities of iptables using eBPF, there are two main things we need to do:

  1. Modify the destination address, when the connection is initiated, so that traffic can be sent to the new interface.
  2. Enable Envoy to identify the original destination address, to be able to identify the traffic.

For the first part, we can use eBPF’s connect program to process it, by modifying user_ip and user_port.

For the second part, we need to understand the concept of ORIGINAL_DST which belongs to the netfilter module in the kernel.

When an application (including Envoy) receives a connection, it will call the get_sockopt function to obtain ORIGINAL_DST. If going through the iptables DNAT process, iptables will set this parameter, with the “original IP + port” value, to the current socket. Thus, the application can get the original destination address according to the connection.

We have to modify this call process through eBPF’s get_sockopts function. (bpf_setsockopt is not used here because this parameter does not currently support the optname of SO_ORIGINAL_DST).

Referring to the figure below, when an application initiates a request, it will go through the following steps:

  1. When the application initiates a connection, the connect program will modify the destination address to 127.x.y.z:15001, and use cookie_original_dst to save the original destination address.
  2. In the sockops program, the current socket information and the quadruple are saved in sock_pair_map. At the same time, the same quadruple and its corresponding original destination address are written to pair_original_dst. (A cookie is not used here because it cannot be obtained in the get_sockopt program).
  3. After Envoy receives the connection, it will call the get_sockopt function to read the destination address of the current connection. get_sockopt will extract and return the original destination address from pair_original_dst, according to the quadruple information. Thus, the connection is completely established.
  4. In the data transport step, the redir program will read the sock information from sock_pair_map according to the quadruple information, and then forward it directly through bpf_msg_redirect_hash to speed up the request.
Processing Outbound Traffic

Why do we set the destination address to 127.x.y.z instead of 127.0.0.1? When multiple pods exist, their connection quadruples could conflict if they all used 127.0.0.1. Because each pod has a different IP, deriving the address from it keeps the quadruples distinct and gracefully avoids conflicts.

Inbound traffic processing

The processing of inbound traffic is basically similar to that of outbound traffic, with the only difference being that the destination port is changed to 15006.

It should be noted that since eBPF cannot take effect in a specified namespace like iptables, the change will be global, which means that if we use a Pod that is not originally managed by Istio, or an external IP address, serious problems will be encountered — like the connection not being established at all.

As a result, we designed a tiny control plane (deployed as a DaemonSet), which watches all pods (similar to the kubelet watching pods on the node) and writes the IP addresses of pods that have had the sidecar injected to the local_pod_ips map.

When processing inbound traffic, if the destination address is not in the map, we will not do anything to the traffic.

Otherwise, the steps are the same as for outbound traffic.

Processing Inbound Traffic

Same-node acceleration

Theoretically, acceleration between Envoy sidecars on the same node can be achieved directly through inbound traffic processing. However, Envoy will raise an error when accessing the application of the current pod in this scenario.

In Istio, Envoy accesses the application by using the current pod IP and port number. With the above scenario, we realized that the pod IP exists in the local_pod_ips map as well, and the traffic will be redirected to the pod IP on port 15006 again because it is the same address that the inbound traffic comes from. Redirecting to the same inbound address causes an infinite loop.

Here comes the question: are there any ways to get the IP address in the current namespace with eBPF? The answer is yes!

We have designed a feedback mechanism: When Envoy tries to establish the connection, we redirect it to port 15006. However, in the sockops step, we will determine if the source IP and the destination IP are the same. If yes, it means the wrong request is sent, and we will discard this connection in the sockops process. In the meantime, the current ProcessID and IP information will be written into the process_ip map, to allow eBPF to support correspondence between processes and IPs.

When the next request is sent, the same process need not be performed again. We will check directly from the process_ip map if the destination address is the same as the current IP address.

Same-node acceleration

Connection relationship

Before applying eBPF using Merbridge, the data path between pods looks like this:

iptables's data path

After applying Merbridge, the outbound traffic will skip many filter steps to improve the performance:

eBPF's data path

If two pods are on the same machine, the connection can even be faster:

eBPF's data path on the same machine

Performance results

Let’s see the effect on overall latency using eBPF instead of iptables (lower is better):

Latency vs Client Connections Graph

We can also see overall QPS after using eBPF (higher is better). Test results are generated with wrk.

QPS vs Client Connections Graph

Summary

We have introduced the core ideas of Merbridge in this post. By replacing iptables with eBPF, the data transportation process can be accelerated in a mesh scenario. At the same time, Istio will not be changed at all. This means if you do not want to use eBPF any more, just delete the DaemonSet, and the datapath will be reverted to the traditional iptables-based routing without any problems.

Merbridge is a completely independent open source project. It is still at an early stage, and we look forward to more users and developers getting engaged. It would be greatly appreciated if you would try this new technology to accelerate your mesh, and provide us with some feedback!

See also

]]>
Mon, 07 Mar 2022 00:00:00 +0000/v1.24//blog/2022/merbridge//v1.24//blog/2022/merbridge/Istioebpfiptablessidecar
Join us for IstioCon 2022!IstioCon 2022, set for April 25-29, will be the second annual conference for Istio, the industry’s most popular service mesh. This year’s conference will again be 100% virtual, connecting community members across the globe with Istio’s ecosystem. Visit the conference website for all the information related to the event.

IstioCon provides an opportunity to showcase the lessons learned from running Istio in production, hands-on experiences from the Istio community, and will feature maintainers from across the Istio ecosystem. This year’s IstioCon features sessions focused on sharing real world examples, case studies, and success stories that can inspire newcomers to use Istio in production. The content will range from introductory to advanced levels, split into four main topic tracks:

  • Getting started & getting involved
  • Tools, features & functionality: Observability, traceability, and other things built on top of Istio.
  • Infrastructure & networking: How Istio works, with deep-dives into performance, cost, and multi-cloud environments.
  • Tech evolution & what’s next: The evolution of Istio, new standards, new extensions, and how to address problems that are interesting to tackle.

At this time, we encourage Istio users, developers, partners, and advocates to submit a session proposal through the conference’s CFP portal. The conference offers a mix of keynotes, technical talks, lightning talks, workshops, and roadmap sessions. Choose from the following formats to submit a session proposal for IstioCon:

  • Presentation: 40 minute presentation, maximum of 2 speakers
  • Panel: 40 minutes of discussion among 3 to 5 speakers
  • Workshop: 160 minute (2h 40m), in-depth, hands-on presentation with 1–4 speakers
  • Lightning Talk: 10 minute presentation, limited to 1 speaker

This community-led event also has an interactive social hour to take the load off and mesh with the Istio community, vendors, and maintainers. Participation in the event is free of charge, and will only require participants to register in order to attend.

Stay tuned to hear more about this conference, and we hope you can join us at IstioCon 2022!

]]>
Mon, 14 Feb 2022 00:00:00 +0000/v1.24//blog/2022/istiocon-2022//v1.24//blog/2022/istiocon-2022/IstioConIstioconference
An easier way to add virtual machines to Istio service meshVirtual Machine Traffic Flow

Some of the complexities involved with joining a virtual machine to the mesh are due to the sheer number of features that a service mesh provides. But what if you only need a subset of those features, such as secure communication from your virtual machine to services running inside your service mesh? With only a few tradeoffs, you can give your virtual machine service mesh features without all of the overhead.

What about local development? As more and more microservices are deployed to Kubernetes, and as your dependency graph starts to resemble a spider web, it has become increasingly difficult to do local development. What if a local machine could simply join a service mesh and make calls to mesh applications? This could save time and money by not requiring developers to wait for their code to be deployed.

Reduce Complexity

Virtual Machine Istio Installation

Today, adding a virtual machine to your Istio service mesh involves a lot of moving parts. You must create a Kubernetes service account and an Istio workload entry, and then generate configuration, all before on-boarding a single virtual machine. There are also complexities to automating this, especially for auto-scaling VMs. Finally, you are required to expose Istiod externally to your cluster.

The complexity of adding a virtual machine comes from the expectation that the VM should participate 100% within the service mesh. For many, this is not a necessity; by looking at the actual requirements of your system, you may be able to simplify your virtual machine on-boarding and still get the features you need.

So what are some use cases that can still be met while making virtual machines easier to use with a service mesh?

Single Direction Traffic Flow

Sometimes a virtual machine just needs to talk securely to applications within the service mesh. This is often the case when migrating VM-based applications to Kubernetes, where other VMs may still depend on those applications. With the approach described below you can achieve this without all of the operational overhead shown above.

Single Direction Traffic Flow

Developer Access to Service Mesh

Engineers often do not have the resources to run all of the required microservices for their environment. The approach below gives a developer's local machine the same kind of secure access to mesh applications that virtual machines get.

Local Machine Service Mesh

Decouple Envoy and Istio

The largest source of complexity for virtual machines is connecting Envoy to istiod to get its configuration. A simpler approach is to just not connect them at all. Even though Istio will no longer know about the virtual machines that are communicating in the mesh, that communication can still be secure and authenticated. The trick is issuing virtual machines their own workload certificates, rooted in the same trust chain as the mesh workloads. This also means that the end user is responsible for configuring Envoy manually on the virtual machine. For most this shouldn’t be an issue, because the configuration is not expected to change very often.

A Simpler On-boarding Experience

Virtual Machine Traffic Flow

We can achieve a simpler setup by utilizing some built-in Istio features. First we need to expose a secure tunnel for applications outside the mesh to communicate with applications within.

To do this we simply need to create an Istio east-west gateway and enable AUTO_PASSTHROUGH. This automatically configures the east-west gateway to pass traffic through to the correct service over mTLS, giving your virtual machine end-to-end authenticated encryption with the application it’s trying to reach.
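For reference, an east-west gateway server with AUTO_PASSTHROUGH enabled is typically configured along these lines. This is a minimal sketch; the gateway name, selector label and port follow the common Istio multi-cluster convention and may differ in your installation:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: cross-network-gateway
  namespace: istio-system
spec:
  selector:
    istio: eastwestgateway
  servers:
  - port:
      number: 15443
      name: tls
      protocol: TLS
    tls:
      mode: AUTO_PASSTHROUGH
    hosts:
    - "*.local"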

Virtual Machine On-boarding Steps

Due to the complexity involved in configuring Envoy to talk to istiod, it is more practical to configure the virtual machine's Envoy directly. At first this sounds quite daunting, but because of the reduced scope we only need to enable a few features to make this work. Envoy needs to be configured to know about each service mesh application that the virtual machine will communicate with. We then configure these as clusters within Envoy and set them up to use mTLS, passing through the east-west gateway in the service mesh. Second, a listener needs to be exposed to handle incoming traffic from the virtual machine application. Finally, certificates need to be issued for each virtual machine that share the same root of trust as the service mesh applications. This allows end-to-end encryption as well as the ability to authorize which applications the virtual machine can communicate with.
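To make this concrete, a single upstream cluster in the virtual machine's static Envoy configuration could be sketched roughly as follows. This is an illustration only: the service name, port, gateway address and certificate paths are all assumptions, and the SNI value follows Istio's AUTO_PASSTHROUGH naming convention so the east-west gateway can route the connection:

static_resources:
  clusters:
  - name: helloworld
    type: LOGICAL_DNS
    connect_timeout: 5s
    load_assignment:
      cluster_name: helloworld
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: eastwest-gateway.example.com   # assumed east-west gateway address
                port_value: 15443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        # SNI in this form lets the AUTO_PASSTHROUGH gateway select the in-mesh service
        sni: outbound_.5000_._.helloworld.sample.svc.cluster.local
        common_tls_context:
          tls_certificates:
          - certificate_chain: { filename: /etc/certs/cert-chain.pem }   # assumed VM workload certificate
            private_key: { filename: /etc/certs/key.pem }
          validation_context:
            trusted_ca: { filename: /etc/certs/root-cert.pem }           # same root of trust as the mesh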

Easier to Automate

Given that no initialization has to occur on the service mesh cluster when on-boarding a virtual machine, it is much easier to automate. The configuration needed for the virtual machine Envoy can be added to your pipeline; the Envoy container can either be pulled via Docker or added to your image building infrastructure; and the mTLS certificates can be provisioned and maintained by a third party such as HashiCorp Vault.

More Runtime Support

Because this installation method does not require access to the underlying OS networking, you can run this approach in more types of environments, including Windows and Docker. The only requirement is that your Envoy build includes the Istio extensions found here. Using Docker, you can now run the Envoy proxy on your local machine and communicate with the service mesh directly.

Runtime Support

Advanced Use Cases

gRPC to JSON

This technique can also be leveraged to enable virtual machine applications to communicate with gRPC applications without having to speak gRPC themselves. Using Envoy’s gRPC-JSON transcoding, the virtual machine application can communicate with its local Envoy over REST, and Envoy will translate that to gRPC.

gRPC to JSON Transformation

Multi Direction

Even though your service mesh may not know about the virtual machines that are communicating with it, you can still add them as external endpoints using Service Entries. That service entry could, for example, be an HTTPS load balancer endpoint that manages traffic to multiple virtual machines. This setup is often still more feasible than fully on-boarding virtual machines into the service mesh.

Multi Direction Traffic Flow
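A hedged sketch of such a ServiceEntry might look like the following; the host name, port and protocol are placeholders for your own VM-backed service:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: vm-legacy-app
spec:
  hosts:
  - legacy-app.example.com   # placeholder: the load balancer in front of your VMs
  location: MESH_EXTERNAL
  ports:
  - number: 443
    name: https
    protocol: TLS
  resolution: DNS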

Forwarding Proxy

Maybe installing Envoy on every virtual machine is still too complex. An alternative is to run Envoy on its own virtual machine (or as an autoscaling group) and have it act as a forwarding proxy into the mesh. This is a much simpler way to access mesh services, as the virtual machines that run the applications are left untouched.

Forwarding Proxy

Part 2…

In part 2, I will explain how to configure Istio as well as a virtual machine to communicate within the mesh. If you would like a preview, feel free to reach out to nick.nellis@solo.io

Special Thanks

A special thanks to Dave Ortiz for this virtual machine idea, and congrats to Constant Contact, a newly registered Istio user!

]]>
Mon, 20 Dec 2021 00:00:00 +0000/v1.24//blog/2021/simple-vms//v1.24//blog/2021/simple-vms/virtualmachineIstionetworkingenvoy
Announcing the alpha availability of WebAssembly PluginsIstio 1.9 introduced experimental support for WebAssembly (Wasm) module distribution and a Wasm extensions ecosystem repository with canonical examples and use cases for extension development. Over the past 9 months, the Istio, Envoy, and Proxy-Wasm communities have continued our joint effort to make Wasm extensibility stable, reliable, and easy to adopt, and we are pleased to announce Alpha support for Wasm extensibility in Istio 1.12! In the following sections, we’ll walk through the updates that have been made to the Wasm support for the 1.12 release.

New WasmPlugin API

With the new WasmPlugin CRD in the extensions.istio.io API group, we’re introducing a new high-level API for extending the functionality of the Istio proxy with custom Wasm modules. This effort builds on the excellent work that has gone into the Proxy-Wasm specification and implementation over the last two years. From now on, you no longer need to use EnvoyFilter resources to add custom Wasm modules to your proxies. Instead, you can now use a WasmPlugin resource:

apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: your-filter
spec:
  selector:
    matchLabels:
      app: server
  phase: AUTHN
  priority: 10
  pluginConfig:
    someSetting: true
    someOtherSetting: false
    youNameIt:
    - first
    - second
  url: docker.io/your-org/your-filter:1.0.0

There are a lot of similarities and a few differences between WasmPlugin and EnvoyFilter, so let’s go through the fields one by one.

The above example deploys a Wasm module to all workloads (including gateway pods) that match the selector field - this very much works the same as in an EnvoyFilter.

The next field below that is the phase. This determines where in the proxy’s filter chain the Wasm module will be injected. We have defined four distinct phases for injection:

  • AUTHN: prior to any Istio authentication and authorization filters.
  • AUTHZ: after the Istio authentication filters and before any first-class authorization filters, i.e., before AuthorizationPolicy resources have been applied.
  • STATS: after all authorization filters and prior to the Istio stats filter.
  • UNSPECIFIED_PHASE: let the control plane decide where to insert. This will generally be at the end of the filter chain, right before the router. This is the default value for this phase field.

The pluginConfig field is used for configuring your Wasm plugin. Whatever you put into this field will be encoded as JSON and passed on to your filter, where you can access it in the configuration callback of the Proxy-Wasm SDKs. For example, you can retrieve the config with onConfigure in the C++ SDK, on_configure in the Rust SDK, or the OnPluginStart callback in the Go SDK.

The url field specifies where to pull the Wasm module. You’ll notice that the url in this example is a docker URI. Apart from loading Wasm modules via HTTP, HTTPS and the local file system (using file://), we are introducing the OCI image format as the preferred mechanism for distributing Wasm modules.

One last thing to note: currently, the WasmPlugin API only applies to inbound HTTP filter chains. Support for network filters and outbound traffic will be added in the future.

Wasm image specification

We believe that containers are the ideal way to store, publish and manage proxy extensions, so we worked with Solo.io to extend their existing Proxy-Wasm container format with a variant that aims to be compatible with all registries and CLI toolchains. Depending on your processes, you can now build your proxy extension containers using your existing container CLI tooling, such as the Docker CLI or buildah.

To learn how to build OCI images, please refer to these instructions.

Image fetcher in Istio agent

Since Istio 1.9, Istio-agent has provided a reliable solution for loading Wasm binaries, fetched from remote HTTP sources configured in the EnvoyFilters, by leveraging the xDS proxy inside istio-agent and Envoy’s Extension Configuration Discovery Service (ECDS). The same mechanism applies for the new Wasm API implementation in Istio 1.12. You can use HTTP remote resources reliably without concern that Envoy might get stuck with a bad configuration when a remote fetch fails.

In addition, Istio 1.12 expands this capability to Wasm OCI images. This means the Istio-agent is now able to fetch Wasm images from any OCI registry, including Docker Hub, Google Container Registry (GCR), Amazon Elastic Container Registry (Amazon ECR), etc. After fetching images, Istio-agent extracts and caches the Wasm binaries from them, and then inserts them into the Envoy filter chains.

Remote Wasm module fetch flow

Improvements in Envoy Wasm runtime

The Wasm runtime powered by V8 in Envoy has been shipped since Istio 1.5 and there have been a lot of improvements since then.

WASI support

First, some of the WASI (WebAssembly System Interface) system calls are now supported. For example, the clock_time_get system call can be made from Wasm programs, so you can use std::time::SystemTime::now() in Rust or time.Now().UnixNano() in Go in your Envoy Wasm extensions, just like on any other native platform. Another example: random_get is now supported by Envoy, so the “crypto/rand” package is available in Go as a cryptographically secure random number generator. We are also currently looking into file system support, as we have seen requests for reading and writing local files from Wasm programs running in Envoy.

Debuggability

Next is the improvement in debuggability. The Envoy runtime now emits the stack trace of your program when it causes runtime errors, for example, when null pointer exceptions occur in C++ or the panic function is called in Go or Rust. While Envoy error messages did not previously include anything about the cause, they now show the trace which you can use to debug your program:

Function: proxy_on_request_headers failed: Uncaught RuntimeError: unreachable
Proxy-Wasm plugin in-VM backtrace:
  0:  0xdbd - runtime._panic
  1:  0x103ab - main.anotherCalculation
  2:  0x10399 - main.someCalculation
  3:  0xea57 - main.myHeaderHandler
  4:  0xea15 - proxy_on_request_headers

The above is an example stack trace from a Go SDK based Wasm extension. You might notice that the output does not include file names and line numbers in the trace. This is an important future work item and open issue related to the DWARF format for WebAssembly and the Exception Handling proposal for the WebAssembly specification.

Strace support for Wasm programs

You can also see strace-equivalent logs emitted by Envoy. With the Istio proxy’s component log level set to wasm:trace, you can observe all the system calls and Proxy-Wasm ABI calls that cross the boundary between the Wasm virtual machine and Envoy. The following is an example of such a log stream:

[host->vm] proxy_on_context_create(2, 1)
[host<-vm] proxy_on_context_create return: void
[host->vm] proxy_on_request_headers(2, 8, 1)
[vm->host] wasi_snapshot_preview1.random_get(86928, 32)
[vm<-host] wasi_snapshot_preview1.random_get return: 0
[vm->host] env.proxy_log(2, 87776, 18)

This is especially useful to debug a Wasm program’s execution at runtime, for example, to verify it is not making any malicious system calls.

Arbitrary Prometheus namespace for in-Wasm metrics

The last update is about metrics. Wasm extensions have been able to define their own custom metrics and expose them in Envoy, just like any other metric, but prior to Istio 1.12, all of these custom metrics were prefixed with the envoy_ Prometheus namespace and users were not able to use their own namespaces. Now, you can choose whatever namespace you want, and your metrics will be exposed in Envoy as-is, without being prefixed by envoy_.

Note that in order to actually expose these custom metrics, you have to configure ProxyConfig.proxyStatsMatcher, either in meshConfig for mesh-wide configuration or via the proxy.istio.io/config annotation for per-proxy configuration. For details, please refer to Envoy Statistics.
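For example, a mesh-wide configuration exposing metrics under a custom prefix could be sketched like this; the my_plugin prefix is a placeholder for whatever namespace your Wasm extension uses:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      proxyStatsMatcher:
        inclusionPrefixes:
        - my_plugin   # placeholder: the metric namespace used by your Wasm extension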

Future work and looking for feedback

Although we have announced the alpha availability of Wasm plugins, there is still a lot of work left to be done. One important work item is “Image pull secrets” support in the Wasm API which will allow you to easily consume OCI images in a private repository. Others include first-class support for L4 filters, signature verification of Wasm binaries, runtime improvements in Envoy, Proxy-Wasm SDK improvements, documentation, etc.

This is just the beginning of our plan to provide 1st-class Wasm support in Istio. We would love to hear your feedback so that we can improve the developer experience using Wasm plugins, in future releases of Istio!

]]>
Thu, 16 Dec 2021 00:00:00 +0000/v1.24//blog/2021/wasm-api-alpha//v1.24//blog/2021/wasm-api-alpha/wasmextensibilityWebAssembly
gRPC Proxyless Service MeshIstio dynamically configures its Envoy sidecar proxies using a set of discovery APIs, collectively known as the xDS APIs. These APIs aim to become a universal data-plane API. The gRPC project has significant support for the xDS APIs, which means you can manage gRPC workloads without having to deploy an Envoy sidecar along with them. You can learn more about the integration in a KubeCon EU 2021 talk from Megan Yahya. The latest updates on gRPC’s support can be found in their proposals along with implementation status.

Istio 1.11 adds experimental support for adding gRPC services directly to the mesh. We support basic service discovery, some VirtualService based traffic policy, and mutual TLS.

Supported Features

The current implementation of the xDS APIs within gRPC is limited in some areas compared to Envoy. The following features should work, although this is not an exhaustive list and other features may have partial functionality:

  • Basic service discovery. Your gRPC service can reach other pods and virtual machines registered in the mesh.
  • DestinationRule:
    • Subsets: Your gRPC service can split traffic based on label selectors to different groups of instances.
    • The only Istio loadBalancer currently supported is ROUND_ROBIN; consistentHash will be added in future versions of Istio (it is already supported by gRPC).
    • tls settings are restricted to DISABLE or ISTIO_MUTUAL. Other modes will be treated as DISABLE.
  • VirtualService:
    • Header match and URI match in the format /ServiceName/RPCName.
    • Override destination host and subset.
    • Weighted traffic shifting.
  • PeerAuthentication:
    • Only DISABLE and STRICT are supported. Other modes will be treated as DISABLE.
    • Support for auto-mTLS may exist in a future release.

Other features including faults, retries, timeouts, mirroring and rewrite rules may be supported in a future release. Some of these features are awaiting implementation in gRPC, and others require work in Istio to support. The status of xDS features in gRPC can be found here. The status of Istio’s support will be tracked in future official docs.

Architecture Overview

Diagram of how gRPC services communicate with the istiod

Although this doesn’t use a proxy for data plane communication, it still requires an agent for initialization and communication with the control-plane. First, the agent generates a bootstrap file at startup the same way it would generate bootstrap for Envoy. This tells the gRPC library how to connect to istiod, where it can find certificates for data plane communication, and what metadata to send to the control plane. Next, the agent acts as an xDS proxy, connecting and authenticating with istiod on the application’s behalf. Finally, the agent fetches and rotates certificates used in data plane traffic.

Changes to application code

To enable the xDS features in gRPC, there are a handful of required changes your application must make. Your gRPC version should be at least 1.39.0.

In the client

The following side-effect import will register the xDS resolvers and balancers within gRPC. It should be added in your main package or in the same package calling grpc.Dial.

import _ "google.golang.org/grpc/xds"

When creating a gRPC connection the URL must use the xds:/// scheme.

conn, err := grpc.DialContext(ctx, "xds:///foo.ns.svc.cluster.local:7070")

Additionally, for (m)TLS support, a special TransportCredentials option has to be passed to DialContext. The FallbackCreds allow us to succeed when istiod doesn’t send security config.

import "google.golang.org/grpc/credentials/xds"

...

creds, err := xds.NewClientCredentials(xds.ClientOptions{
FallbackCreds: insecure.NewCredentials()
})
// handle err
conn, err := grpc.DialContext(
ctx,
"xds:///foo.ns.svc.cluster.local:7070",
grpc.WithTransportCredentials(creds),
)

On the server

To support server-side configurations, such as mTLS, there are a couple of modifications that must be made.

First, we use a special constructor to create the GRPCServer:

import "google.golang.org/grpc/xds"

...

server = xds.NewGRPCServer()
RegisterFooServer(server, &fooServerImpl)

If your protoc generated Go code is out of date, you may need to regenerate it to be compatible with the xDS server. Your generated RegisterFooServer function should look like the following:

func RegisterFooServer(s grpc.ServiceRegistrar, srv FooServer) {
    s.RegisterService(&FooServer_ServiceDesc, srv)
}

Finally, as with the client-side changes, we must enable security support. Here, xdscreds refers to the google.golang.org/grpc/credentials/xds package (imported with an alias, since google.golang.org/grpc/xds is already imported as xds) and insecure to google.golang.org/grpc/credentials/insecure:

creds, err := xdscreds.NewServerCredentials(xdscreds.ServerOptions{FallbackCreds: insecure.NewCredentials()})
// handle err
server = xds.NewGRPCServer(grpc.Creds(creds))

In your Kubernetes Deployment

Assuming your application code is compatible, the Pod simply needs the annotation inject.istio.io/templates: grpc-agent. This adds a sidecar container running the agent described above, and some environment variables that gRPC uses to find the bootstrap file and enable certain features.

For gRPC servers, your Pod should also be annotated with proxy.istio.io/config: '{"holdApplicationUntilProxyStarts": true}' to make sure the in-agent xDS proxy and bootstrap file are ready before your gRPC server is initialized.
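Putting both annotations together, the relevant part of a Deployment's pod template might look like the following minimal sketch; the Deployment name, labels, and image are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-v1   # placeholder name
spec:
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
      annotations:
        inject.istio.io/templates: grpc-agent
        proxy.istio.io/config: '{"holdApplicationUntilProxyStarts": true}'
    spec:
      containers:
      - name: app
        image: example/echo:latest   # placeholder image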

Example

In this guide you will deploy echo, an application that already supports both server-side and client-side proxyless gRPC. With this app you can try out some supported traffic policies enabling mTLS.

Prerequisites

This guide requires the Istio (1.11+) control plane to be installed before proceeding.

Deploy the application

Create an injection-enabled namespace echo-grpc. Next deploy two instances of the echo app as well as the Service.

$ kubectl create namespace echo-grpc
$ kubectl label namespace echo-grpc istio-injection=enabled
$ kubectl -n echo-grpc apply -f samples/grpc-echo/grpc-echo.yaml

Make sure the two pods are running:

$ kubectl -n echo-grpc get pods
NAME                       READY   STATUS    RESTARTS   AGE
echo-v1-69d6d96cb7-gpcpd   2/2     Running   0          58s
echo-v2-5c6cbf6dc7-dfhcb   2/2     Running   0          58s

Test the gRPC resolver

First, port-forward 17171 to one of the Pods. This port serves a non-xDS-backed gRPC server that lets us make requests from the port-forwarded Pod.

$ kubectl -n echo-grpc port-forward $(kubectl -n echo-grpc get pods -l version=v1 -ojsonpath='{.items[0].metadata.name}') 17171 &

Next, we can fire off a batch of 5 requests:

$ grpcurl -plaintext -d '{"url": "xds:///echo.echo-grpc.svc.cluster.local:7070", "count": 5}' :17171 proto.EchoTestService/ForwardEcho | jq -r '.output | join("")'  | grep Hostname
Handling connection for 17171
[0 body] Hostname=echo-v1-7cf5b76586-bgn6t
[1 body] Hostname=echo-v2-cf97bd94d-qf628
[2 body] Hostname=echo-v1-7cf5b76586-bgn6t
[3 body] Hostname=echo-v2-cf97bd94d-qf628
[4 body] Hostname=echo-v1-7cf5b76586-bgn6t

You can also use Kubernetes-like name resolution for short names:

$ grpcurl -plaintext -d '{"url": "xds:///echo:7070"}' :17171 proto.EchoTestService/ForwardEcho | jq -r '.output | join
("")'  | grep Hostname
[0 body] Hostname=echo-v1-7cf5b76586-ltr8q
$ grpcurl -plaintext -d '{"url": "xds:///echo.echo-grpc:7070"}' :17171 proto.EchoTestService/ForwardEcho | jq -r
'.output | join("")'  | grep Hostname
[0 body] Hostname=echo-v1-7cf5b76586-ltr8q
$ grpcurl -plaintext -d '{"url": "xds:///echo.echo-grpc.svc:7070"}' :17171 proto.EchoTestService/ForwardEcho | jq -r
'.output | join("")'  | grep Hostname
[0 body] Hostname=echo-v2-cf97bd94d-jt5mf

Creating subsets with destination rule

First, create a subset for each version of the workload.

$ cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: echo-versions
  namespace: echo-grpc
spec:
  host: echo.echo-grpc.svc.cluster.local
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
EOF

Traffic shifting

Using the subsets defined above, you can send 80 percent of the traffic to a specific version:

$ cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: echo-weights
  namespace: echo-grpc
spec:
  hosts:
  - echo.echo-grpc.svc.cluster.local
  http:
  - route:
    - destination:
        host: echo.echo-grpc.svc.cluster.local
        subset: v1
      weight: 20
    - destination:
        host: echo.echo-grpc.svc.cluster.local
        subset: v2
      weight: 80
EOF

Now, send a set of 10 requests:

$ grpcurl -plaintext -d '{"url": "xds:///echo.echo-grpc.svc.cluster.local:7070", "count": 10}' :17171 proto.EchoTestService/ForwardEcho | jq -r '.output | join("")'  | grep ServiceVersion

The response should contain mostly v2 responses:

[0 body] ServiceVersion=v2
[1 body] ServiceVersion=v2
[2 body] ServiceVersion=v1
[3 body] ServiceVersion=v2
[4 body] ServiceVersion=v1
[5 body] ServiceVersion=v2
[6 body] ServiceVersion=v2
[7 body] ServiceVersion=v2
[8 body] ServiceVersion=v2
[9 body] ServiceVersion=v2

Enabling mTLS

Due to the changes to the application itself required to enable security in gRPC, Istio’s traditional method of automatically detecting mTLS support is unreliable. For this reason, the initial release requires explicitly enabling mTLS on both the client and server.

To enable client-side mTLS, apply a DestinationRule with tls settings:

$ cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: echo-mtls
  namespace: echo-grpc
spec:
  host: echo.echo-grpc.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
EOF

Now an attempt to call the server that is not yet configured for mTLS will fail.

$ grpcurl -plaintext -d '{"url": "xds:///echo.echo-grpc.svc.cluster.local:7070"}' :17171 proto.EchoTestService/ForwardEcho | jq -r '.output | join("")'
Handling connection for 17171
ERROR:
Code: Unknown
Message: 1/1 requests had errors; first error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure

To enable server-side mTLS, apply a PeerAuthentication.

$ cat <<EOF | kubectl apply -f -
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: echo-mtls
  namespace: echo-grpc
spec:
  mtls:
    mode: STRICT
EOF

Requests will start to succeed after applying the policy.

$ grpcurl -plaintext -d '{"url": "xds:///echo.echo-grpc.svc.cluster.local:7070"}' :17171 proto.EchoTestService/ForwardEcho | jq -r '.output | join("")'
Handling connection for 17171
[0] grpcecho.Echo(&{xds:///echo.echo-grpc.svc.cluster.local:7070 map[] 0  5s false })
[0 body] x-request-id=0
[0 body] Host=echo.echo-grpc.svc.cluster.local:7070
[0 body] content-type=application/grpc
[0 body] user-agent=grpc-go/1.39.1
[0 body] StatusCode=200
[0 body] ServiceVersion=v1
[0 body] ServicePort=17070
[0 body] Cluster=
[0 body] IP=10.68.1.18
[0 body] IstioVersion=
[0 body] Echo=
[0 body] Hostname=echo-v1-7cf5b76586-z5p8l

Limitations

The initial release comes with several limitations that may be fixed in a future version:

  • Auto-mTLS isn’t supported, and permissive mode isn’t supported. Instead we require explicit mTLS configuration with STRICT on the server and ISTIO_MUTUAL on the client. Envoy can be used during the migration to STRICT.
  • grpc.Serve(listener) or grpc.Dial("xds:///...") called before the bootstrap is written or xDS proxy is ready can cause a failure. holdApplicationUntilProxyStarts can be used to work around this, or the application can be more robust to these failures.
  • If the xDS-enabled gRPC server uses mTLS then you will need to make sure your health checks can work around this. Either a separate port should be used, or your health-checking client needs a way to get the proper client certificates (see the sketch after this list).
  • The implementation of xDS in gRPC does not match Envoy’s. Certain behaviors may be different, and some features may be missing. The feature status for gRPC provides more detail. Make sure to test that any Istio configuration actually applies to your proxyless gRPC apps.
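As an illustration of the separate-port option, a Kubernetes readiness probe could target a plaintext port that is not served by the xDS-enabled gRPC server. This is a minimal sketch; port 8081 is an assumption, not something the echo sample exposes:

readinessProbe:
  tcpSocket:
    port: 8081   # assumed plaintext health port, outside the mTLS-protected gRPC server
  initialDelaySeconds: 5
  periodSeconds: 10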

Performance

Experiment Setup

  • Using Fortio, a Go-based load testing app
    • Slightly modified, to support gRPC’s XDS features (PR)
  • Resources:
    • GKE 1.20 cluster with 3 e2-standard-16 nodes (16 CPUs + 64 GB memory each)
    • Fortio client and server apps: 1.5 vCPU, 1000 MiB memory
    • Sidecar (istio-agent and possibly Envoy proxy): 1 vCPU, 512 MiB memory
  • Workload types tested:
    • Baseline: regular gRPC with no Envoy proxy or Proxyless xDS in use
    • Envoy: standard istio-agent + Envoy proxy sidecar
    • Proxyless: gRPC using the xDS gRPC server implementation and xds:/// resolver on the client
    • mTLS enabled/disabled via PeerAuthentication and DestinationRule

Latency

p50 latency comparison chart
p99 latency comparison chart

There is a marginal increase in latency when using the proxyless gRPC resolvers. Compared to Envoy this is a massive improvement that still allows for advanced traffic management features and mTLS.

istio-proxy container resource usage

Client mCPU Client Memory (MiB) Server mCPU Server Memory (MiB)
Envoy Plaintext 320.44 66.93 243.78 64.91
Envoy mTLS 340.87 66.76 309.82 64.82
Proxyless Plaintext 0.72 23.54 0.84 24.31
Proxyless mTLS 0.73 25.05 0.78 25.43

Even though we still require an agent, the agent uses less than 0.1% of a full vCPU, and only 25 MiB of memory, which is less than half of what running Envoy requires.

These metrics don’t include additional resource usage by gRPC in the application container, but serve to demonstrate the resource usage impact of the istio-agent when running in this mode.

]]>
Thu, 28 Oct 2021 00:00:00 +0000/v1.24//blog/2021/proxyless-grpc//v1.24//blog/2021/proxyless-grpc/
Aeraki — Manage Any Layer-7 Protocol in Istio Service MeshAeraki [Air-rah-ki] is the Greek word for ‘breeze’. While Istio connects microservices in a service mesh, Aeraki provides a framework to allow Istio to support more layer-7 protocols other than just HTTP and gRPC. We hope this breeze can help Istio sail a little further.

Lack of Protocols Support in Service Mesh

We are now facing some challenges with service meshes:

  • Istio and other popular service mesh implementations have very limited support for layer 7 protocols other than HTTP and gRPC.
  • Envoy RDS (Route Discovery Service) is solely designed for HTTP. Other protocols such as Dubbo and Thrift can only use listener in-line routes for traffic management, which breaks existing connections when routes change.
  • It takes a lot of effort to introduce a proprietary protocol into a service mesh. You’ll need to write an Envoy filter to handle the traffic in the data plane, and a control plane to manage those Envoys.

Those obstacles make it very hard, if not impossible, for users to manage the traffic of other widely-used layer-7 protocols in microservices. For example, in a microservices application, we may have the below protocols:

  • RPC: HTTP, gRPC, Thrift, Dubbo, Proprietary RPC Protocol …
  • Messaging: Kafka, RabbitMQ …
  • Cache: Redis, Memcached …
  • Database: MySQL, PostgreSQL, MongoDB …
Common Layer-7 Protocols Used in Microservices

If you have already invested a lot of effort in migrating to a service mesh, of course, you want to get the most out of it — managing the traffic of all the protocols in your microservices.

Aeraki’s approach

To address these problems, we created an open-source project, Aeraki Mesh, to provide a non-intrusive, extendable way to manage any layer-7 traffic in an Istio service mesh.

Aeraki Architecture

As this diagram shows, Aeraki Framework consists of the following components:

  • Aeraki: Aeraki provides high-level, user-friendly traffic management rules to operators, translates the rules to Envoy filter configurations, and leverages Istio’s EnvoyFilter API to push the configurations to the sidecar proxies. Aeraki also serves as the RDS server for MetaProtocol proxies in the data plane. Contrary to Envoy RDS, which focuses on HTTP, Aeraki RDS aims to provide a general dynamic route capability for all layer-7 protocols.
  • MetaProtocol Proxy: MetaProtocol Proxy provides common capabilities for layer-7 protocols, such as load balancing, circuit breaking, routing, rate limiting, fault injection, and auth. Layer-7 protocols can be built on top of MetaProtocol. To add a new protocol into the service mesh, the only thing you need to do is implement the codec interface and a couple of lines of configuration. If you have special requirements which can’t be accommodated by the built-in capabilities, MetaProtocol Proxy also has an application-level filter chain mechanism, allowing users to write their own layer-7 filters to add custom logic into MetaProtocol Proxy.

Dubbo and Thrift have already been implemented based on MetaProtocol. More protocols are on the way. If you’re using a close-source, proprietary protocol, you can also manage it in your service mesh simply by writing a MetaProtocol codec for it.

Most request/response style, stateless protocols can be built on top of the MetaProtocol Proxy. However, some protocols’ routing policies are too “special” to be normalized in MetaProtocol. For example, the Redis proxy uses a slot number to map a client query to a specific Redis server node, and the slot number is computed from the key in the request. Aeraki can still manage those protocols as long as there’s an available Envoy filter on the Envoy proxy side. Currently, for protocols in this category, Redis and Kafka are supported in Aeraki.

Deep Dive Into MetaProtocol

Let’s look into how MetaProtocol works. Before MetaProtocol is introduced, if we want to proxy traffic for a specific protocol, we need to write an Envoy filter that understands that protocol and add the code to manipulate the traffic, including routing, header modification, fault injection, traffic mirroring, etc.

For most request/response style protocols, the code for traffic manipulation is very similar. Therefore, to avoid duplicating these functionalities in different Envoy filters, Aeraki Framework implements most of the common functions of a layer-7 protocol proxy in a single place — the MetaProtocol Proxy filter.

MetaProtocol Proxy

This approach significantly lowers the barrier to writing a new Envoy filter: instead of writing a fully functional filter, you now only need to implement the codec interface. In addition to that, the control plane is already in place: Aeraki works at the control plane to provide MetaProtocol configuration and dynamic routes for all protocols built on top of MetaProtocol.

Writing an Envoy Filter Before and After MetaProtocol

There are two important data structures in MetaProtocol Proxy: Metadata and Mutation. Metadata is used for routing, and Mutation is used for header manipulation.

On the request path, the decoder (the decode method of the codec implementation) populates the Metadata data structure with key-value pairs parsed from the request, then the Metadata is passed to the MetaProtocol Router. The Router selects an appropriate upstream cluster after matching the Metadata against the route configuration it receives from Aeraki via RDS.

A custom filter can populate the Mutation data structure with arbitrary key-value pairs if the request needs to be modified, for example by adding a header or changing the value of a header. The Mutation data structure is then passed to the encoder (the encode method of the codec implementation). The encoder is responsible for writing the key-value pairs into the wire protocol.

The Request Path

The response path is similar to the request path, only in a different direction.

The Response Path

An Example

If you need to implement an application protocol based on MetaProtocol, you can follow the steps below (using Thrift as an example):

Data Plane

  • Implement the codec interface to encode and decode the protocol package. You can refer to the Dubbo codec and Thrift codec as references when writing your own implementation.

  • Define the protocol with Aeraki ApplicationProtocol CRD, as this YAML snippet shows:

apiVersion: metaprotocol.aeraki.io/v1alpha1
kind: ApplicationProtocol
metadata:
  name: thrift
  namespace: istio-system
spec:
  protocol: thrift
  codec: aeraki.meta_protocol.codec.thrift

Control Plane

You don’t need to implement the control plane. Aeraki watches services and traffic rules, generates the configurations for the sidecar proxies, and sends the configurations to the data plane via EnvoyFilter and MetaProtocol RDS.

Protocol selection

Similar to Istio, protocols are identified by service port prefix. Please name service ports with this pattern: tcp-metaprotocol-{application protocol}-xxx. For example, a Thrift service port should be named tcp-metaprotocol-thrift.
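For example, a Kubernetes Service for the Thrift sample might declare its port like the sketch below; the service name and port number are placeholders:

apiVersion: v1
kind: Service
metadata:
  name: thrift-sample-server
  namespace: thrift
spec:
  selector:
    app: thrift-sample-server
  ports:
  - name: tcp-metaprotocol-thrift   # the prefix tells Aeraki which codec to apply
    port: 9090
    targetPort: 9090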

Traffic management

You can change the route via MetaRouter CRD. For example: send 20% of the requests to v1 and 80% to v2:

apiVersion: metaprotocol.aeraki.io/v1alpha1
kind: MetaRouter
metadata:
  name: test-metaprotocol-route
spec:
  hosts:
    - thrift-sample-server.thrift.svc.cluster.local
  routes:
    - name: traffic-spilt
      route:
        - destination:
            host: thrift-sample-server.thrift.svc.cluster.local
            subset: v1
          weight: 20
        - destination:
            host: thrift-sample-server.thrift.svc.cluster.local
            subset: v2
          weight: 80

Hope this helps if you need to manage protocols other than HTTP in a service mesh. Reach out to zhaohuabing if you have any questions.

Reference

]]>
Tue, 28 Sep 2021 00:00:00 +0000/v1.24//blog/2021/aeraki//v1.24//blog/2021/aeraki/istioenvoyaeraki
Announcing Extended Support for Istio 1.9In keeping with our 2021 theme of improving Day 2 Istio operations, the Istio team has been evaluating extending the support window for our releases to give users more time to upgrade. For starters, we are extending the support window of Istio 1.9 by six weeks, to October 5, 2021. We hope that this additional support window will allow the many users who are currently using Istio 1.9 to upgrade, either to Istio 1.10 or directly to Istio 1.11. By overlapping support between 1.9 and 1.11, we intend to create a stable cadence of upgrade windows twice a year for users upgrading directly across two minor versions (i.e. 1.9 to 1.11). Users who prefer upgrading through each minor release to get all the latest and greatest features may continue doing so quarterly.

Extended Support and Upgrades

During this extended period of support, Istio 1.9 will receive CVE and critical bug fixes only, as our goal is simply to provide users with time to migrate off the release and on to 1.10 or 1.11. And speaking of users, we would love to hear how we’re doing at improving your Day 2 experience of Istio. Is two upgrades per year not the right number? Is a six week upgrade window too short? Please share your thoughts with us on slack (in the user-experience channel), or on twitter. Thanks!

]]>
Fri, 03 Sep 2021 00:00:00 +0000/v1.24//blog/2021/extended-support//v1.24//blog/2021/extended-support/upgradeIstiosupport
Announcing the results of Istio’s first security assessmentThe Istio service mesh has gained broad production adoption across a wide variety of industries. The success of the project, and its critical usage for enforcing key security policies in infrastructure, warranted an open and neutral assessment of the security risks associated with the project.

To achieve this goal, the Istio community contracted the NCC Group last year to conduct a third-party security assessment of the project. The goal of the review was “to identify security issues related to the Istio code base, highlight high-risk configurations commonly used by administrators, and provide perspective on whether security features sufficiently address the concerns they are designed to provide”.

NCC Group carried out the review over a period of five weeks with collaboration from subject matter experts across the Istio community. In this blog, we will examine the key findings of the report, actions taken to implement various fixes and recommendations, and our plan of action for continuous security evaluation and improvement of the Istio project. You can download and read the unabridged version of the security assessment report.

Scope and Key Findings

The assessment evaluated Istio’s architecture as a whole for security related issues, with focus on key components like istiod (Pilot), Ingress/Egress gateways, and Istio’s overall Envoy usage as its data plane proxy. Additionally, Istio documentation, including security guides, were audited for correctness and clarity. The report was compiled against Istio version 1.6.5, and since then the Product Security Working Group has issued several security releases as new vulnerabilities were disclosed, along with fixes to address concerns raised in the report.

An important conclusion from the report is that the auditors found no “Critical” issues within the Istio project. This finding validates the continuous and proactive security review and vulnerability management process implemented by Istio’s Product Security Working Group (PSWG). For the remaining issues surfaced by the report, the PSWG went to work on addressing them, and we are glad to report that all issues marked “High”, and several marked “Medium/Low”, have been resolved in the releases following the report.

The report also makes strategic recommendations around creating a hardening guide which is now available in our Security Best Practices guide. This is a comprehensive document which pulls together recommendations from security experts within the Istio community, and industry leaders running Istio in production. Work is underway to create an opinionated and hardened security profile for installing Istio in secure environments, but in the interim we recommend users follow the Security Best Practices guide and configure Istio to meet their security requirements. With that, let’s look at the analysis and resolution for various issues raised in the report.

Resolution and learnings

Inability to secure control plane network communications

The report flags configuration options that were available in older versions of Istio to control how communication is secured to the control plane. Since 1.7, Istio by default secures all control plane communication and many configuration options mentioned in the report to manage control plane encryption are no longer required.

The debug endpoint mentioned in the report is enabled by default (as of Istio 1.10) to allow users to debug their Istio service mesh using the istioctl tool. It can be disabled by setting the environment variable ENABLE_DEBUG_ON_HTTP to false as mentioned in the Security Best Practices guide. Additionally, in an upcoming version (1.11), this debug endpoint will be secured by default and a valid Kubernetes service account token will be required to gain access.
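For illustration, a minimal sketch of disabling the debug endpoint with an IstioOperator overlay follows; the exact field layout may differ depending on how you install Istio:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    pilot:
      k8s:
        env:
        # Disable the istiod debug interface served over HTTP
        - name: ENABLE_DEBUG_ON_HTTP
          value: "false"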

The report points out gaps in the security related documentation published with Istio 1.6. Since then, we have created a detailed Security Best Practices guide with recommendations to ensure users can deploy Istio securely to meet their requirements. Moving forward, we will continue to augment this documentation with more hardening recommendations. We advise users to monitor the guide for updates.

Lack of VirtualService Gateway field validation enables request hijacking

For this issue, the report uses a valid but permissive Gateway configuration that can cause requests to be routed incorrectly. Similar to Kubernetes RBAC, Istio APIs, including Gateways, can be tuned to be permissive or restrictive depending upon your requirements. However, the report surfaced missing links in our documentation related to best practices and guiding our users to secure their environments. To address them, we have added a section to our Security Best Practices guide with steps for running Gateways securely. In particular, we strongly recommend following the section on using namespace prefixes in the hosts specification of Gateway resources to harden your configuration and prevent this type of request hijacking.
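As a rough sketch of that recommendation (the gateway name, credential name, and myapp namespace below are placeholders, not values from the report):

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: myapp-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: myapp-credential
    hosts:
    # Namespace-prefixed host: only VirtualServices in the myapp namespace
    # can bind to this server, unlike the permissive "*" default.
    - "myapp/myapp.example.com"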

Ingress Gateway configuration generation enables request hijacking

The report raises possible request hijacking when using the default mechanism of selecting gateway workloads by labels across namespaces in a Gateway resource. This behavior was chosen as the default because it allows application teams to manage their own Gateway and VirtualService resources, while operations teams centrally manage the ingress gateway workloads to meet their unique security requirements, such as running them on dedicated nodes. As highlighted in the report, if this deployment topology is not a requirement in your environment, it is strongly recommended to co-locate Gateway resources with your gateway workloads and set the environment variable PILOT_SCOPE_GATEWAY_TO_NAMESPACE to true.
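If you install with istioctl, one way to set this is a sketch like the following, assuming the standard values passthrough for istiod environment variables:

$ istioctl install --set values.pilot.env.PILOT_SCOPE_GATEWAY_TO_NAMESPACE=true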

Please refer to the gateway deployment topologies guide to understand the various recommended deployment models by the Istio community. Additionally, as mentioned in the Security Best Practices guide, Gateway resource creation should be access controlled using Kubernetes RBAC or other policy enforcement mechanisms to ensure only authorized entities can create them.

Other Medium and Low Severity Issues

The report includes two medium severity issues related to debug information exposed at various levels within the project, which could be used to gain access to sensitive information or orchestrate Denial of Service (DoS) attacks. While Istio by default enables these debug interfaces for profiling or enabling tools like “istioctl”, they can be disabled by setting the environment variable ENABLE_DEBUG_ON_HTTP to false as discussed above.

The report correctly points out that various utilities like sudo, tcpdump, etc. installed in the default images shipped by Istio can lead to privilege escalation attacks. These utilities are provided to aid runtime debugging of packets flowing through the mesh, and we recommend that users run hardened versions of these images in production.

The report also surfaces a known architectural limitation with any sidecar proxy-based service mesh implementation which uses iptables for intercepting traffic. This mechanism is susceptible to sidecar proxy bypass, which is a valid concern for secure environments. It can be addressed by following the defense-in-depth recommendation of the Security Best Practices guide. We are also investigating more secure options in collaboration with the Kubernetes community.
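As one illustrative extra layer (an assumption about your workload rather than a prescription from the report), a Kubernetes NetworkPolicy can limit which ports can reach a pod even if the sidecar is bypassed; the httpbin label and port 8080 here are placeholders:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: httpbin-ingress
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: httpbin
  policyTypes:
  - Ingress
  ingress:
  - ports:
    # Only allow the application port; traffic to other ports is dropped at L3/L4
    - protocol: TCP
      port: 8080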

The tradeoff between useful and secure

You may have noticed a trend in the findings of the assessment and the recommendations made to address them. Istio provides various configuration options to create a more secure installation based on your requirements, and we have introduced a comprehensive Security Best Practices guide for our users to follow. As Istio is widely adopted in production, we must balance switching to secure defaults against possible migration issues for existing users on upgrade. The Istio Product Security Working Group evaluates each of these issues and creates a plan of action to enable secure defaults on a case-by-case basis, after giving our users a number of releases to opt in to the secure configuration and migrate their workloads.

Lastly, there were several lessons for us during and after undergoing a neutral security assessment. The primary one was to ensure our security practices are robust enough to respond quickly to the findings and, more importantly, to make security enhancements while maintaining our standards for disruption-free upgrades.

To continue this endeavor, we are always looking for feedback and participation in the Istio Product Security Working Group, so join our public meetings to raise issues or learn about what we are doing to keep Istio secure!

]]>
Tue, 13 Jul 2021 00:00:00 +0000/v1.24//blog/2021/ncc-security-assessment//v1.24//blog/2021/ncc-security-assessment/istiosecurityauditnccassessment
Join us at the Istio Community Meetup in ChinaWith the rapid popularization of cloud native technology in China, Istio has also gained popularity in this corner of the world. Almost all Chinese cloud service providers have created, and are running, service mesh products based on Istio.

We welcomed thousands of Istio users and developers to the first IstioCon in February 2021, and the attendees expressed an interest in participating in more meetups and helping to grow the community at the local level.

To this end, the Istio community united six partners — the China Academy of Information and Communications Technology, Alibaba Cloud, Huawei Cloud, Intel, Tencent Cloud, and Tetrate — to co-host the first official Istio Community Meetup China. We have invited a number of industry experts to share comprehensive Istio technical practices with everyone at an in-person meetup. We will serve some refreshments, and seats are limited, so we will operate on a first-come first-served basis. Please register to attend.

Time and day: 13:30-17:30 (CST), July 10, 2021

Venue: Industrial Internet Exhibition Hall, 2nd Floor, Research Building, China Academy of Information and Communications Technology, No. 52 Huayuan North Road, Haidian District, Beijing

Agenda

Session time (CST) Title
13:30 - 13:50 Sign in
13:50 - 14:00 Welcome
Craig Box, Istio Steering Committee member, Google Cloud
Iris Ding, cloud computing engineer, Intel
14:00 - 14:30 Interpretation of the “Service Mesh Technical Capability Requirements” Standard
Yin Xia Mengxue, Engineer, Cloud Computing Department, Academy of Information and Communications Technology
14:30 - 15:00 Service Mesh Data Plane Hot Upgrade
Shi Zehuan, Alibaba Cloud
15:00 - 15:30 Envoy Principles and Common Pitfalls in Production
Zhang Wei, Data Plane Technical Expert, Huawei Cloud Service Mesh
15:30 - 15:45 Coffee break
15:45 - 16:15 Use eBPF to accelerate Istio/Envoy networking
Zhong Luyao, Intel
16:15 - 16:45 Full-stack service mesh: how Aeraki helps you manage any Layer 7 traffic in Istio
Huabing Zhao, Senior Engineer, Tencent Cloud
16:45 - 17:15 Securing workload deployment with Istio CNI
Zhang Zhihan, Tetrate

We want to thank our community in China who have worked on this event, especially Iris Ding, Wei W Hu, Jimmy Song, Zhonghu Xu, Xining Wang, and Huabing Zhao. We hope you can join!

]]>
Tue, 06 Jul 2021 00:00:00 +0000/v1.24//blog/2021/istio-community-meetup-china//v1.24//blog/2021/istio-community-meetup-china/communitymeetupChina
Steering and TOC updatesLast year we introduced a new Steering Committee charter, which shares governance responsibilities between Contribution Seats, selected based on contributions to the project, and Community Seats, elected by the project members. We elected four members, with the committee representing seven different companies.

It’s now time to kick off our 2021 election for Community Seats. Members have two weeks to submit nominations, and voting will run from 12 to 25 July. You can learn all about the election, including how to stand and how to vote, in the istio/community repository on GitHub.

Just like last year, any project member can stand for election. All Istio members who have been active in the last 12 months are eligible to vote.

Technical Oversight Committee updates

We wish to offer our thanks to Dan Berg and Joshua Blatt, both long-time contributors to the Istio project, who have recently taken new jobs outside the service mesh space. That left two vacancies on the Istio Technical Oversight Committee (TOC), responsible for cross-cutting product and design decisions.

TOC members are elected by the Steering Committee from the working group leads, and last week we voted for two new members:

  • John Howard, from Google, has become one of the most active contributors to Istio since joining the project in January 2019. He is currently a lead in the Networking working group, and has also served as an Environments working group lead and release manager for version 1.4.
  • Brian Avery, from Red Hat, has been active in the Istio community for over 3 years. He served as Release Manager for Istio 1.3 and 1.6, and has remained actively involved in the Istio release process, including introducing tooling for release notes, streamlining the feature maturity process, and working on documentation testing. Most recently, Brian was a lead in the Test and Release and Product Security working groups.

Congratulations to John and Brian!

As our new TOC members step into their roles, they will be vacating their current positions as working group leads. We are always on the lookout for community members who are interested in joining, or leading, Istio working groups. If you’re interested, please reach out in the working group channels on Slack, or during the public working group meetings of your interest.

]]>
Tue, 29 Jun 2021 00:00:00 +0000/v1.24//blog/2021/steering-election//v1.24//blog/2021/steering-election/istiosteeringgovernancecommunityelection
Configuring failover for external servicesIstio’s powerful APIs can be used to solve a variety of service mesh use cases. Many users know about its strong ingress and east-west capabilities but it also offers many features for egress (outgoing) traffic. This is especially useful when your application needs to talk to an external service - such as a database endpoint provided by a cloud provider. There are often multiple endpoints to choose from depending on where your workload is running. For example, Amazon’s DynamoDB provides several endpoints across their regions. You typically want to choose the endpoint closest to your workload for latency reasons, but you may need to configure automatic failover to another endpoint in case things are not working as expected.

Similar to services running inside the service mesh, you can configure Istio to detect outliers and failover to a healthy endpoint, while still being completely transparent to your application. In this example, we’ll use Amazon DynamoDB endpoints and pick a primary region that is the same or close to workloads running in a Google Kubernetes Engine (GKE) cluster. We’ll also configure a failover region.

Routing Endpoint
Primary http://dynamodb.us-east-1.amazonaws.com
Failover http://dynamodb.us-west-1.amazonaws.com

failover

Define external endpoints using a ServiceEntry

Locality load balancing works based on region or zone, which are usually inferred from labels set on the Kubernetes nodes. First, determine the location of your workloads:

$ kubectl describe node | grep failure-domain.beta.kubernetes.io/region
                    failure-domain.beta.kubernetes.io/region=us-east1
                    failure-domain.beta.kubernetes.io/region=us-east1

In this example, the GKE cluster nodes are running in us-east1.

Next, create a ServiceEntry which aggregates the endpoints you want to use. In this example, we have selected mydb.com as the host. This is the address your application should be configured to connect to. Set the locality of the primary endpoint to the same region as your workload:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-svc-dns
spec:
  hosts:
  - mydb.com
  location: MESH_EXTERNAL
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS
  endpoints:
  - address: dynamodb.us-east-1.amazonaws.com
    locality: us-east1
    ports:
      http: 80
  - address: dynamodb.us-west-1.amazonaws.com
    locality: us-west
    ports:
      http: 80

Let’s deploy a sleep container to use as a test source for sending requests.

$ kubectl apply -f samples/sleep/sleep.yaml

From the sleep container, send a request to http://mydb.com 5 times:

$ for i in {1..5}; do kubectl exec deploy/sleep -c sleep -- curl -sS http://mydb.com; echo; sleep 2; done
healthy: dynamodb.us-east-1.amazonaws.com
healthy: dynamodb.us-west-1.amazonaws.com
healthy: dynamodb.us-west-1.amazonaws.com
healthy: dynamodb.us-east-1.amazonaws.com
healthy: dynamodb.us-east-1.amazonaws.com

You will see that Istio is sending requests to both endpoints. We only want it to send to the endpoint marked with the same region as our nodes.

For that, we need to configure a DestinationRule.

Set failover conditions using a DestinationRule

Istio’s DestinationRule lets you configure load balancing, connection pool, and outlier detection settings. We can specify the conditions used to identify an endpoint as unhealthy and remove it from the load balancing pool.

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: mydynamodb
spec:
  host: mydb.com
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 15s
      baseEjectionTime: 1m

The above DestinationRule configures the endpoints to be scanned every 15 seconds, and if any endpoint fails with a 5xx error code, even once, it will be marked unhealthy for one minute. If this circuit breaker is not triggered, the traffic will route to the same region as the pod.

If we run our curl again, we should see that traffic is always going to the us-east1 endpoint.

$ for i in {1..5}; do kubectl exec deploy/sleep -c sleep -- curl -sS http://mydb.com; echo; sleep 2; done

healthy: dynamodb.us-east-1.amazonaws.com
healthy: dynamodb.us-east-1.amazonaws.com
healthy: dynamodb.us-east-1.amazonaws.com
healthy: dynamodb.us-east-1.amazonaws.com
healthy: dynamodb.us-east-1.amazonaws.com

Simulate a failure

Next, let’s see what happens if the us-east endpoint goes down. To simulate this, let’s modify the ServiceEntry and set the us-east endpoint to an invalid port:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-svc-dns
spec:
  hosts:
  - mydb.com
  location: MESH_EXTERNAL
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS
  endpoints:
  - address: dynamodb.us-east-1.amazonaws.com
    locality: us-east1
    ports:
      http: 81 # INVALID - This is purposefully wrong to trigger failover
  - address: dynamodb.us-west-1.amazonaws.com
    locality: us-west
    ports:
      http: 80

Running our curl again shows that traffic is automatically failed over to our us-west region after failing to connect to the us-east endpoint:

$ for i in {1..5}; do kubectl exec deploy/sleep -c sleep -- curl -sS http://mydb.com; echo; sleep 2; done
upstream connect error or disconnect/reset before headers. reset reason: connection failure
healthy: dynamodb.us-west-1.amazonaws.com
healthy: dynamodb.us-west-1.amazonaws.com
healthy: dynamodb.us-west-1.amazonaws.com
healthy: dynamodb.us-west-1.amazonaws.com

You can check the outlier status of the us-east endpoint by running:

$ istioctl pc endpoints <sleep-pod> | grep mydb
ENDPOINT                         STATUS      OUTLIER CHECK     CLUSTER
52.119.226.80:81                 HEALTHY     FAILED            outbound|80||mydb.com
52.94.12.144:80                  HEALTHY     OK                outbound|80||mydb.com

Failover for HTTPS

Configuring failover for external HTTPS services is just as easy. Your application can still continue to use plain HTTP, and you can let the Istio proxy perform the TLS origination to the HTTPS endpoint.

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-svc-dns
spec:
  hosts:
  - mydb.com
  ports:
  - number: 80
    name: http-port
    protocol: HTTP
    targetPort: 443
  resolution: DNS
  endpoints:
  - address: dynamodb.us-east-1.amazonaws.com
    locality: us-east1
  - address: dynamodb.us-west-1.amazonaws.com
    locality: us-west

The above ServiceEntry defines the mydb.com service on port 80 and redirects traffic to the real DynamoDB endpoints on port 443.

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: mydynamodb
spec:
  host: mydb.com
  trafficPolicy:
    tls:
      mode: SIMPLE
    loadBalancer:
      simple: ROUND_ROBIN
      localityLbSetting:
        enabled: true
        failover:
          - from: us-east1
            to: us-west
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 15s
      baseEjectionTime: 1m

The DestinationRule now performs TLS origination and configures the outlier detection. The rule also has a failover field configured where you can specify exactly what regions are failover targets. This is useful when you have several regions defined.

Wrapping Up

Istio’s VirtualService and DestinationRule APIs provide traffic routing, failure recovery and fault injection features so that you can create resilient applications. The ServiceEntry API extends many of these features to external services that are not part of your service mesh.

]]>
Fri, 04 Jun 2021 00:00:00 +0000/v1.24//blog/2021/external-locality-failover//v1.24//blog/2021/external-locality-failover/localityregionfailoverIstiooutlierexternal
Safely upgrade the Istio control plane with revisions and tagsLike all security software, your service mesh should be kept up-to-date. The Istio community releases new versions every quarter, with regular patch releases for bug fixes and security vulnerabilities. The operator of a service mesh will need to upgrade the control plane and data plane components many times. You must take care when upgrading, as a mistake could affect your business traffic. Istio has many mechanisms to make it safe to perform upgrades in a controlled manner, and in Istio 1.10 we further improve this operational experience.

Background

In Istio 1.6, we added basic support for upgrading the service mesh following a canary pattern using revisions. Using this approach, you can run multiple control planes side-by-side without impacting an existing deployment and slowly migrate workloads from the old control plane to the new.

To support this revision-based upgrade, Istio introduced an istio.io/rev label for namespaces. This indicates which control plane revision should inject sidecar proxies for the workloads in the respective namespace. For example, a label of istio.io/rev=1-9-5 indicates that control plane revision 1-9-5 should inject 1-9-5 proxies into the workloads in that namespace.

If you wanted to upgrade the data-plane proxies for a particular namespace, you would update the istio.io/rev label to point to a new version, such as istio.io/rev=1-10-0. Manually changing (or even trying to orchestrate) changes of labels across a large number of namespaces can be error-prone and lead to unintended downtime.
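For illustration, manually moving a single namespace to a new revision looks roughly like this (my-namespace is a placeholder; --overwrite is needed because the label already exists):

$ kubectl label namespace my-namespace istio.io/rev=1-10-0 --overwrite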

Introducing Revision Tags

In Istio 1.10, we’ve improved revision-based upgrades with a new feature called revision tags. A revision tag reduces the number of changes an operator has to make to use revisions, and safely upgrade an Istio control plane. You use the tag as the label for your namespaces, and assign a revision to that tag. This means you don’t have to change the labels on a namespace while upgrading, and minimizes the number of manual steps and configuration changes.

For example, you can define a tag named prod-stable and point it to the 1-9-5 revision of a control plane. You can also define another tag named prod-canary which points to the 1-10-0 revision. You may have a lot of important namespaces in your cluster, and you can label those namespaces with istio.io/rev=prod-stable. In other namespaces you may be willing to test the new version of Istio, and you can label that namespace istio.io/rev=prod-canary. The tag will indirectly associate those namespaces with the 1-9-5 revision for prod-stable and 1-10-0 for prod-canary respectively.

Stable revision tags

Once you’ve determined the new control plane is suitable for the rest of the prod-stable namespaces, you can change the tag to point to the new revision. This enables you to update all the namespaces labeled prod-stable to the new 1-10-0 revision without making any changes to the labels on the namespaces. You will need to restart the workloads in a namespace once you’ve changed the tag to point to a different revision.

Updated revision tags

Once you’re satisfied with the upgrade to the new control-plane revision, you can remove the old control plane.

Stable revision tags in action

To create a new prod-stable tag for a revision 1-9-5, run the following command:

$ istioctl x revision tag set prod-stable --revision 1-9-5

You can then label your namespaces with the istio.io/rev=prod-stable label. Note, if you installed a default revision (i.e., no revision) of Istio, you will first have to remove the standard injection label:

$ kubectl label ns istioinaction istio-injection-
$ kubectl label ns istioinaction istio.io/rev=prod-stable

You can list the tags in your mesh with the following:

$ istioctl x revision tag list

TAG         REVISION NAMESPACES
prod-stable 1-9-5    istioinaction

A tag is implemented with a MutatingWebhookConfiguration. You can verify a corresponding MutatingWebhookConfiguration has been created:

$ kubectl get MutatingWebhookConfiguration

NAME                             WEBHOOKS   AGE
istio-revision-tag-prod-stable   2          75s
istio-sidecar-injector           1          5m32s

Let’s say you are trying to canary a new revision of the control plane based on 1.10.0. First you would install the new version using a revision:

$ istioctl install -y --set profile=minimal --revision 1-10-0

You can create a new tag called prod-canary and point that to your 1-10-0 revision:

$ istioctl x revision tag set prod-canary --revision 1-10-0

Then label your namespaces accordingly:

$ kubectl label ns istioinaction-canary istio.io/rev=prod-canary

If you list out the tags in your mesh, you will see two stable tags pointing to two different revisions:

$ istioctl x revision tag list

TAG         REVISION NAMESPACES
prod-stable 1-9-5    istioinaction
prod-canary 1-10-0   istioinaction-canary

Any of the namespaces that you have labeled with istio.io/rev=prod-canary will be injected by the control plane that corresponds to the prod-canary stable tag name (which in this example points to the 1-10-0 revision). When you’re ready, you can switch the prod-stable tag to the new control plane with:

$ istioctl x revision tag set prod-stable --revision 1-10-0 --overwrite

Any time you switch a tag to point to a new revision, you will need to restart the workloads in any respective namespace to pick up the new revision’s proxy.
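For example, assuming the workloads in the istioinaction namespace are Deployments, a rolling restart can be triggered with:

$ kubectl rollout restart deployments -n istioinaction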

When both the prod-stable and prod-canary no longer point to the old revision, it may be safe to remove the old revision as follows:

$ istioctl x uninstall --revision 1-9-5

Wrapping up

Using revisions makes it safer to canary changes to an Istio control plane. In large environments with lots of namespaces, you may prefer to use stable tags, as we’ve introduced in this blog, to reduce the number of moving pieces and simplify any automation you may build around updating an Istio control plane. Please check out the 1.10 release and the new tag feature and give us your feedback!

]]>
Wed, 26 May 2021 00:00:00 +0000/v1.24//blog/2021/revision-tags//v1.24//blog/2021/revision-tags/upgradesrevisionsoperationscanary
Happy Birthday, Istio!Celebrating Istio’s 4th birthday

Four years ago today, the Istio project was born to the open source world. To celebrate this anniversary, we are hosting a week-long birthday celebration that focuses on contributions to the Istio project that stem from using Istio in production. Read on to learn how to participate in this celebration and enter a chance to win some Istio swag.

Istio's 4th Birthday!

A year of important developments for Istio

Over the last 12 months, the Istio project has been very focused on the day-0 & day-1 experience for users by actively listening to our users through UX surveys and GitHub issues.

  • We simplified the control plane architecture and made Istio easier to install, configure and upgrade.
  • We provided clarity and process to our feature status and promotion of features and APIs.
  • We simplified the debugging experience with various istioctl commands.
  • We expanded the mesh to services running in VMs and multiple clusters.
  • We made StatefulSet easier to use in Istio 1.10 with zero-configuration.
  • We made various performance improvements to the Istio control plane and data plane via discovery selectors, sidecar resources etc.
  • We introduced WebAssembly as our extensibility platform which has helped users tailor Istio to their needs.
  • We beefed up our CVE management and release processes to meet enterprise needs.

Read more about improvements to Istio in 2020 that made this technology easier to use.

In February 2021, we celebrated the first IstioCon! This community-led event was an opportunity for users and developers to share many examples of how they use Istio in production and lessons learned from it. IstioCon was a great opportunity to feature more than 25 end-user companies, like Salesforce, T-Mobile, and Airbnb, to hear from maintainers across the Istio ecosystem, and to share the Istio project roadmap.

This inaugural community conference was a major success, with more than 4,000 registrants from 80 countries participating. The program was conducted during US and Asia time zones, and in English and Chinese languages to accommodate big user communities in various continents. Learn more about the impact of IstioCon and find the presentations on the conference website.

How to participate in Istio’s 4th Birthday celebration

Contributions are key to the long life of an open source project. This is why, on its 4th birthday, we want to hear about your contributions to the Istio project. To participate in this campaign, share on Twitter a contribution you made to the project and why it matters, using the hashtag #IstioTurns4 and #IstioBirthday. You can submit posts from Monday, May 24th at 9 am Pacific, until Friday, May 28th, at 12 pm Pacific, to enter a chance to win some Istio swag.

The other way of participating in this campaign is by joining the Istio community meetup, which will take place on Thursday, May 27th at 10 am Pacific. At this event, we will have Pratima Nambiar discuss contributions that have stemmed from using Istio in production at Salesforce. Join the event, and ask a question or make a comment on the demo, and enter a chance to win some Istio swag.

Istio Community Meetup!
]]>
Mon, 24 May 2021 00:00:00 +0000/v1.24//blog/2021/istio-4th-birthday//v1.24//blog/2021/istio-4th-birthday/communitybirthdaycelebration
Announcing Support for 1.8 to 1.10 Direct UpgradesAs Service Mesh technology moves from cutting edge to stable infrastructure, many users have expressed an interest in upgrading their service mesh less frequently, as qualifying a new minor release can take a lot of time. Upgrading can be especially difficult for users who don’t keep up with new releases, as Istio has not supported upgrades across multiple minor versions. To upgrade from 1.6.x to 1.8.x, users first had to upgrade to 1.7.x and then to 1.8.x.

With the release of Istio 1.10, we are announcing Alpha level support for upgrading directly from Istio 1.8.x to 1.10.x, without upgrading to 1.9.x. We hope this will reduce the operational burden of running Istio, in keeping with our 2021 theme of improving Day 2 Operations.

Upgrade From 1.8 to 1.10

For direct upgrades we recommend using the canary upgrade method so that control plane functionality can be verified before cutting workloads over to the new version. We’ll also be using revision tags in this guide, an improvement to canary upgrades that was introduced in 1.10, so users don’t have to change the labels on a namespace while upgrading.

First, using a version 1.10 or newer istioctl, create a revision tag named stable pointing to your existing 1.8 revision. From now on, let’s assume this revision is called 1-8-5:

$ istioctl x revision tag set stable --revision 1-8-5

If your 1.8 installation did not have an associated revision, we can create this revision tag with:

$ istioctl x revision tag set stable --revision default

Now, relabel your namespaces that were previously labeled with istio-injection=enabled or istio.io/rev=<REVISION> with istio.io/rev=stable. Download the Istio 1.10.0 release and install the new control plane with a revision:

$ istioctl install --revision 1-10-0 -y

Now verify that the 1.10 revision has come up correctly and is healthy. Once you are satisfied with the stability of the new revision, you can point the tag to it:

$ istioctl x revision tag set stable --revision 1-10-0 --overwrite

Verify that the revision tag stable is pointing to the new revision:

$ istioctl x revision tag list
TAG    REVISION NAMESPACES
stable 1-10-0        ...

Once you are prepared to move existing workloads over to the new 1.10 revision, the workloads must be restarted so that their sidecar proxies use the new control plane. We can go through namespaces one by one and roll the workloads over to the new version:

$ kubectl rollout restart deployments -n …

Notice an issue after rolling out workloads to the new Istio version? No problem! Since you’re using canary upgrades, the old control plane is still running and we can just switch back over.

$ istioctl x revision tag set stable --revision 1-8-5 --overwrite

Then after triggering another rollout, your workloads will be back on the old version.

We look forward to hearing about your experience with direct upgrades, and look forward to improving and expanding this functionality in the future.

]]>
Mon, 24 May 2021 00:00:00 +0000/v1.24//blog/2021/direct-upgrade//v1.24//blog/2021/direct-upgrade/upgradeIstiorevision
StatefulSets Made Easier With Istio 1.10Kubernetes StatefulSets are commonly used to manage stateful applications. In addition to managing the deployment and scaling of a set of Pods, StatefulSets provide guarantees about the ordering and uniqueness of those Pods. Common applications used with StatefulSets include ZooKeeper, Cassandra, Elasticsearch, Redis and NiFi.

The Istio community has been making gradual progress towards zero-configuration support for StatefulSets; from automatic mTLS, to eliminating the need to create DestinationRule or ServiceEntry resources, to the most recent pod networking changes in Istio 1.10.

What is unique about using a StatefulSet with a service mesh? The StatefulSet pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling. The kind of apps that run in a StatefulSet are often those that need to communicate among their pods, and, as they come from a world of hard-coded IP addresses, may listen on the pod IP only, instead of 0.0.0.0.

ZooKeeper, for example, is configured by default to not listen on all IPs for quorum communication:

quorumListenOnAllIPs=false

Over the last few releases, the Istio community has reported many issues around support for applications running in StatefulSets.

StatefulSets in action, prior to Istio 1.10

In a GKE cluster running Kubernetes 1.19, we have Istio 1.9.5 installed. We enabled automatic sidecar injection in the default namespace, then we installed ZooKeeper using the Helm charts provided by Bitnami, along with the Istio sleep pod for interactive debugging:

$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install my-release bitnami/zookeeper --set replicaCount=3
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.24/samples/sleep/sleep.yaml

After a few minutes, all pods come up nicely with sidecar proxies:

$ kubectl get pods,svc
NAME                             READY   STATUS    RESTARTS   AGE
pod/my-release-zookeeper-0       2/2     Running   0          3h4m
pod/my-release-zookeeper-1       2/2     Running   0          3h4m
pod/my-release-zookeeper-2       2/2     Running   0          3h5m
pod/sleep-8f795f47d-qkgh4        2/2     Running   0          3h8m

NAME                                    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
service/my-release-zookeeper            ClusterIP   10.100.1.113   <none>        2181/TCP,2888/TCP,3888/TCP   3h
service/my-release-zookeeper-headless   ClusterIP   None           <none>        2181/TCP,2888/TCP,3888/TCP   3h
service/sleep                           ClusterIP   10.100.9.26    <none>        80/TCP                       3h

The pods report a Running status, but is the ZooKeeper service actually working? Let’s find out! ZooKeeper listens on 3 ports:

  • Port 2181 is the TCP port for clients to connect to the ZooKeeper service
  • Port 2888 is the TCP port for peers to connect to other peers
  • Port 3888 is the dedicated TCP port for leader election

By default, the ZooKeeper installation configures port 2181 to listen on 0.0.0.0 but ports 2888 and 3888 only listen on the pod IP. Let’s check out the network status on each of these ports from one of the ZooKeeper pods:

$ kubectl exec my-release-zookeeper-1 -c istio-proxy -- netstat -na | grep -E '(2181|2888|3888)'
tcp        0      0 0.0.0.0:2181            0.0.0.0:*               LISTEN
tcp        0      0 10.96.7.7:3888          0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:2181          127.0.0.1:37412         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37486         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37456         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37498         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37384         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37514         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37402         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37434         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37526         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37374         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37442         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:37464         TIME_WAIT

There is nothing ESTABLISHED on port 2888 or 3888. Next, let us get the ZooKeeper server status:

$ kubectl exec my-release-zookeeper-1 -c zookeeper -- /opt/bitnami/zookeeper/bin/zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.

From the above output, you can see the ZooKeeper service is not functioning properly. Let us check the cluster configuration for one of the ZooKeeper pods:

$ istioctl proxy-config cluster my-release-zookeeper-1 --port 3888 --direction inbound -o json
[
    {
        "name": "inbound|3888||",
        "type": "STATIC",
        "connectTimeout": "10s",
        "loadAssignment": {
            "clusterName": "inbound|3888||",
            "endpoints": [
                {
                    "lbEndpoints": [
                        {
                            "endpoint": {
                                "address": {
                                    "socketAddress": {
                                        "address": "127.0.0.1",
                                        "portValue": 3888
                                    }
                                }
                            }
                        }
                    ]
                }
            ]
        },
...

What is interesting here is that the inbound on port 3888 has 127.0.0.1 as its endpoint. This is because the Envoy proxy, in versions of Istio prior to 1.10, redirects the inbound traffic to the loopback interface, as described in our blog post about the change.

StatefulSets in action with Istio 1.10

Now, we have upgraded our cluster to Istio 1.10 and configured the default namespace to enable 1.10 sidecar injection. Let’s perform a rolling restart of the ZooKeeper StatefulSet to update the pods to use the new version of the sidecar proxy:

$ kubectl rollout restart statefulset my-release-zookeeper

Once the ZooKeeper pods reach the running status, let’s check out the network connections for these 3 ports from any of the ZooKeeper pods:

$ kubectl exec my-release-zookeeper-1 -c istio-proxy -- netstat -na | grep -E '(2181|2888|3888)'
tcp        0      0 0.0.0.0:2181            0.0.0.0:*               LISTEN
tcp        0      0 10.96.8.10:2888         0.0.0.0:*               LISTEN
tcp        0      0 10.96.8.10:3888         0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.6:42571         10.96.8.10:2888         ESTABLISHED
tcp        0      0 10.96.8.10:2888         127.0.0.6:42571         ESTABLISHED
tcp        0      0 127.0.0.6:42655         10.96.8.10:2888         ESTABLISHED
tcp        0      0 10.96.8.10:2888         127.0.0.6:42655         ESTABLISHED
tcp        0      0 10.96.8.10:37876        10.96.6.11:3888         ESTABLISHED
tcp        0      0 10.96.8.10:44872        10.96.7.10:3888         ESTABLISHED
tcp        0      0 10.96.8.10:37878        10.96.6.11:3888         ESTABLISHED
tcp        0      0 10.96.8.10:44870        10.96.7.10:3888         ESTABLISHED
tcp        0      0 127.0.0.1:2181          127.0.0.1:54508         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54616         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54664         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54526         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54532         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54578         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54634         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54588         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54610         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54550         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54560         TIME_WAIT
tcp        0      0 127.0.0.1:2181          127.0.0.1:54644         TIME_WAIT

There are ESTABLISHED connections on both port 2888 and 3888! Next, let us check out the ZooKeeper server status:

$ kubectl exec my-release-zookeeper-1 -c zookeeper -- /opt/bitnami/zookeeper/bin/zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower

The ZooKeeper service is now running!

We can connect to each of the ZooKeeper pods from the sleep pod and run the below command to discover the server status of each pod within the StatefulSet. Note that there is no need to create ServiceEntry resources for any of the ZooKeeper pods and we can call these pods directly using their DNS names (e.g. my-release-zookeeper-0.my-release-zookeeper-headless) from the sleep pod.

$ kubectl exec -it deploy/sleep -c sleep -- sh  -c 'for x in my-release-zookeeper-0.my-release-zookeeper-headless my-release-zookeeper-1.my-release-zookeeper-headless my-release-zookeeper-2.my-release-zookeeper-headless; do echo $x; echo srvr|nc $x 2181; echo; done'
my-release-zookeeper-0.my-release-zookeeper-headless
Zookeeper version: 3.7.0-e3704b390a6697bfdf4b0bef79e3da7a4f6bac4b, built on 2021-03-17 09:46 UTC
Latency min/avg/max: 1/7.5/20
Received: 3845
Sent: 3844
Connections: 1
Outstanding: 0
Zxid: 0x200000002
Mode: follower
Node count: 6

my-release-zookeeper-1.my-release-zookeeper-headless
Zookeeper version: 3.7.0-e3704b390a6697bfdf4b0bef79e3da7a4f6bac4b, built on 2021-03-17 09:46 UTC
Latency min/avg/max: 0/0.0/0
Received: 3856
Sent: 3855
Connections: 1
Outstanding: 0
Zxid: 0x200000002
Mode: follower
Node count: 6

my-release-zookeeper-2.my-release-zookeeper-headless
Zookeeper version: 3.7.0-e3704b390a6697bfdf4b0bef79e3da7a4f6bac4b, built on 2021-03-17 09:46 UTC
Latency min/avg/max: 0/0.0/0
Received: 3855
Sent: 3854
Connections: 1
Outstanding: 0
Zxid: 0x200000002
Mode: leader
Node count: 6
Proposal sizes last/min/max: 48/48/48

Now that our ZooKeeper service is running, let’s use Istio to secure all communication to our regular and headless services. Apply mutual TLS to the default namespace:

$ kubectl apply -n default -f - <<EOF
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
spec:
  mtls:
    mode: STRICT
EOF

Continue sending some traffic from the sleep pod and bring up the Kiali dashboard to visualize the services in the default namespace:

Visualize the ZooKeeper Services in Kiali

The padlock icons on the traffic flows indicate that the connections are secure.

Wrapping up

With the new networking changes in Istio 1.10, a Kubernetes pod with a sidecar has the same networking behavior as a pod without a sidecar. This change enables stateful applications to function properly in Istio as we have shown you in this post. We believe this is a huge step towards Istio’s goal of providing transparent service mesh and zero-configuration Istio.

]]>
Wed, 19 May 2021 00:00:00 +0000/v1.24//blog/2021/statefulsets-made-easier//v1.24//blog/2021/statefulsets-made-easier/statefulsetIstionetworkinglocalhostloopbacketh0
Updates to how Istio security releases are handled: Patch Tuesday, embargoes, and 0-daysWhile most of the work in the Istio Product Security Working Group is done behind the scenes, we are listening to the community in setting expectations for security releases. We understand that it is difficult for mesh administrators, operators and vendors to be aware of security bulletins and security releases.

We currently disclose vulnerabilities and security releases via numerous channels.

When operating any software, it is preferable to plan for possible downtime when upgrading. Given the work that the Istio community is doing around Day 2 operations in 2021, the Environments working group has done a good job of streamlining many of the upgrade issues users have seen. The Product Security Working Group intends to help Day 2 operations by having routine security release days so that upgrade operations can be planned in advance by our users.

Patch Tuesdays

The Product Security working group intends to ship a security release on the second Tuesday of each month. These security releases may contain fixes for multiple CVEs. The working group intends for these releases to contain only security fixes, although that may not always be possible.

When the Product Security working group intends to ship an upcoming security patch, an announcement will be made on the Istio discussion board 2 weeks prior to release. If you’re running Istio in production, we suggest you watch the Announcements category to be notified of such a release. If no such announcement is made there will not be a security release for that month, barring some exceptions listed below.

First Patch Tuesday

We are pleased to announce that Istio 1.9.5, and the final release of Istio 1.8, 1.8.6, are the first security releases to fit this pattern. As Istio 1.10 will be shipping soon, we intend to continue this new tradition in June.

These releases fix 3 CVEs. Please see the release pages for information regarding the specific CVEs fixed.

Unscheduled security releases

0-day vulnerabilities

Unfortunately, 0-day vulnerabilities cannot be planned. Upon disclosure, the Product Security Working Group will need to issue an out-of-band security release. The above methods will be used to disclose such issues, so please use at least one of them to be notified of such disclosures.

Third party embargoes

Similar to 0-day vulnerabilities, security releases can be dictated by third party embargoes, namely Envoy. When this occurs, Istio will release a same-day patch once the embargo is lifted.

Security Best Practices

The Istio Security Best Practices guide has seen many improvements over the past few months. We recommend you check it regularly, as many of our recent security bulletins can be mitigated by following the methods discussed there.

Early Disclosure List

If you meet the criteria to be a part of the Istio Early Disclosure list, please apply for membership. Patches for upcoming security releases will be made available to the early disclosure list ~2 weeks prior to Istio’s Patch Tuesday.

There will be times when an upcoming Istio security release will also need patches from Envoy. We cannot redistribute Envoy patches due to their embargo. Please refer to Envoy’s guidance on how to join their early disclosure list.

Security Feedback

The Product Security Working Group holds bi-weekly meetings on Tuesdays from 9:00-9:30 Pacific. For more information see the Istio Working Group Calendar.

Our next public meeting will be held on May 25, 2021. Please join us!

]]>
Tue, 11 May 2021 00:00:00 +0000/v1.24//blog/2021/patch-tuesdays//v1.24//blog/2021/patch-tuesdays/cveproduct security
Use discovery selectors to configure namespaces for your Istio service meshAs users move their services to run in the Istio service mesh, they are often surprised that the control plane watches and processes all of the Kubernetes resources, from all namespaces in the cluster, by default. This can be an issue for very large clusters with lots of namespaces and deployments, or even for a moderately sized cluster with rapidly churning resources (for example, Spark jobs).

Both in the community as well as for our large-scale customers at Solo.io, we need a way to dynamically restrict the set of namespaces that are part of the mesh so that the Istio control plane only processes resources in those namespaces. The ability to restrict the namespaces enables Istiod to watch and push fewer resources and associated changes to the sidecars, thus improving the overall performance on the control plane and data plane.

Background

By default, Istio watches all Namespaces, Services, Endpoints and Pods in a cluster. For example, in my Kubernetes cluster, I deployed the sleep service in the default namespace, and the httpbin service in the ns-x namespace. I’ve added the sleep service to the mesh, but I have no plan to add the httpbin service to the mesh, or have any service in the mesh interact with the httpbin service.

Use istioctl proxy-config endpoint command to display all the endpoints for the sleep deployment:

Endpoints for Sleep Deployment

Note that the httpbin service endpoint in the ns-x namespace is in the list of discovered endpoints. This may not be an issue when you only have a few services. However, when you have hundreds of services that don’t interact with any of the services running in the Istio service mesh, you probably don’t want your Istio control plane to watch these services and send their information to the sidecars of your services in the mesh.

Introducing Discovery Selectors

Starting with Istio 1.10, we are introducing the new discoverySelectors option to MeshConfig, which is an array of Kubernetes selectors. The exact type is []LabelSelector, as defined here, allowing both simple selectors and set-based selectors. These selectors apply to labels on namespaces.

You can configure each label selector for expressing a variety of use cases, including but not limited to:

  • Arbitrary label names/values, for example, all namespaces with label istio-discovery=enabled
  • A list of namespace labels using set-based selectors which carries OR semantics, for example, all namespaces with label istio-discovery=enabled OR region=us-east1
  • Inclusion and/or exclusion of namespaces, for example, all namespaces with label istio-discovery=enabled AND label key app equal to helloworld

Note: discoverySelectors is not a security boundary. Istiod will continue to have access to all namespaces even when you have configured your discoverySelectors.

Discovery Selectors in Action

Assuming you know which namespaces to include as part of the service mesh, as a mesh administrator, you can configure discoverySelectors at installation time or post-installation by adding your desired discovery selectors to Istio’s MeshConfig resource. For example, you can configure Istio to discover only the namespaces that have the label istio-discovery=enabled.

  1. Using our examples earlier, let’s label the default namespace with label istio-discovery=enabled.

    $ kubectl label namespace default istio-discovery=enabled
  2. Use istioctl to apply the YAML with discoverySelectors to update your Istio installation. Note: to avoid any impact to your stable environment, we recommend that you use a different revision for your Istio installation:

    $ istioctl install --skip-confirmation -f - <<EOF
    apiVersion: install.istio.io/v1alpha1
    kind: IstioOperator
    metadata:
      namespace: istio-system
    spec:
      # You may override parts of meshconfig by uncommenting the following lines.
      meshConfig:
        discoverySelectors:
          - matchLabels:
              istio-discovery: enabled
    EOF
  3. Display the endpoint configuration for the sleep deployment:

    Endpoints for Sleep Deployment With Discovery Selectors

    Note this time the httpbin service in the ns-x namespace is NOT in the list of discovered endpoints, along with many other services that are not in the default namespace. If you display routes (or cluster or listeners) information for the sleep deployment, you will also notice much less configuration is returned:

    Routes for Sleep Deployment With Discovery Selectors

You can configure multiple labels within a single matchLabels selector for AND semantics, or use multiple selectors in the discoverySelectors list for OR semantics among groups of labels. Whether you deploy services or pods to namespaces with different sets of labels, or multiple application teams in your organization use different labeling conventions, discoverySelectors provides the flexibility you need. Furthermore, you can use matchLabels and matchExpressions together per our documentation. Refer to the Kubernetes selector docs for additional detail on selector semantics.
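For illustration, here is a sketch of a MeshConfig discoverySelectors fragment that uses both selector styles side by side (the label keys and values are arbitrary examples, not required conventions):

meshConfig:
  discoverySelectors:
  # Entries in this list are ORed together
  - matchLabels:            # keys within one selector are ANDed
      istio-discovery: enabled
      env: prod
  - matchExpressions:
    - key: region
      operator: In
      values:
      - us-east1
      - us-west1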

Discovery Selectors vs Sidecar Resource

The discoverySelectors configuration enables users to dynamically restrict the set of namespaces that are part of the mesh. A Sidecar resource also controls the visibility of sidecar configurations and what gets pushed to the sidecar proxy. What are the differences between them?

  • The discoverySelectors configuration declares what Istio control plane watches and processes. Without discoverySelectors configuration, the Istio control plane watches and processes all namespaces/services/endpoints/pods in the cluster regardless of the sidecar resources you have.
  • discoverySelectors is configured globally for the mesh by the mesh administrators. While Sidecar resources can also be configured for the mesh globally by the mesh administrators in the MeshConfig root namespace, they are commonly configured by service owners for their namespaces.

You can use discoverySelectors with Sidecar resources. You can use discoverySelectors to configure, at the mesh-wide level, which namespaces the Istio control plane should watch and process. For the namespaces in the Istio service mesh, you can then create Sidecar resources globally or per namespace to further control what gets pushed to the sidecar proxies. Let us add the Bookinfo services to the ns-y namespace in the mesh, as shown in the diagram below. discoverySelectors enables us to declare that the default and ns-y namespaces are part of the mesh. How can we configure the sleep service not to see anything other than the default namespace? By adding a Sidecar resource for the default namespace, we can effectively configure the sleep sidecar to only have visibility into the clusters/routes/listeners/endpoints associated with its own namespace plus any other required namespaces (a sketch of such a Sidecar resource follows the diagram).

Discovery Selectors vs Sidecar Resource
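A minimal sketch of such a Sidecar resource, assuming the sleep workload only needs to reach services in its own namespace and the Istio control plane:

apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: default
spec:
  egress:
  - hosts:
    - "./*"             # services in the same namespace
    - "istio-system/*"  # the Istio control plane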

Wrapping up

Discovery selectors are powerful configurations to tune the Istio control plane to only watch and process specific namespaces. If you don’t want all namespaces in your Kubernetes cluster to be part of the service mesh or you have multiple Istio service meshes within your Kubernetes cluster, we highly recommend that you explore this configuration and reach out to us for feedback on our Istio slack or GitHub.

]]>
Fri, 30 Apr 2021 00:00:00 +0000/v1.24//blog/2021/discovery-selectors//v1.24//blog/2021/discovery-selectors/discoveryselectorsIstionamespacessidecar
Upcoming networking changes in Istio 1.10Background

While Kubernetes networking is customizable, a typical pod’s network will look like this:

A pod's network

An application may choose to bind to either the loopback interface lo (typically binding to 127.0.0.1), or the pod’s network interface eth0 (typically to the pod’s IP), or both (typically binding to 0.0.0.0).

Binding to lo allows calls such as curl localhost to work from within the pod. Binding to eth0 allows calls to the pod from other pods.

Typically, an application will bind to both. However, applications with internal-only functionality, such as an admin interface, may choose to bind only to lo to avoid access from other pods. Additionally, some applications, typically stateful applications, choose to bind only to eth0.

Current behavior

In Istio prior to release 1.10, the Envoy proxy, running in the same pod as the application, binds to the eth0 interface and redirects all inbound traffic to the lo interface.

A pod's network with Istio today

This has two important side effects that cause the behavior to differ from standard Kubernetes:

  • Applications binding only to lo will receive traffic from other pods, when otherwise this is not allowed.
  • Applications binding only to eth0 will not receive traffic.

Applications that bind to both interfaces (which is typical) will not be impacted.

Future behavior

Starting with Istio 1.10, the networking behavior is changed to align with the standard behavior present in Kubernetes.

A pod's network with Istio in the future

Here we can see that the proxy no longer redirects the traffic to the lo interface, but instead forwards it to the application on eth0. As a result, the standard behavior of Kubernetes is retained, but we still get all the benefits of Istio. This change allows Istio to get closer to its goal of being a drop-in transparent proxy that works with existing workloads with zero configuration. Additionally, it avoids unintended exposure of applications binding only to lo.

Am I impacted?

For new users, this change should only be an improvement. However, if you are an existing user, you may have come to depend on the old behavior, intentionally or accidentally.

To help detect these situations, we have added a check to find pods that will be impacted. You can run the istioctl experimental precheck command to get a report of any pods binding to lo on a port exposed in a Service. This command is available in Istio 1.10+. Without action, these ports will no longer be accessible upon upgrade.

$ istioctl experimental precheck
Error [IST0143] (Pod echo-local-849647c5bd-g9wxf.default) Port 443 is exposed in a Service but listens on localhost. It will not be exposed to other pods.
Error [IST0143] (Pod echo-local-849647c5bd-g9wxf.default) Port 7070 is exposed in a Service but listens on localhost. It will not be exposed to other pods.
Error: Issues found when checking the cluster. Istio may not be safe to install or upgrade.
See https://istio.io/latest/docs/reference/config/analysis for more information about causes and resolutions.

Migration

If you are currently binding to lo, you have a few options:

  • Switch your application to bind to all interfaces (0.0.0.0 or ::).

  • Explicitly configure the port using the Sidecar ingress configuration to send to lo, preserving the old behavior.

    For example, to configure requests to be sent to localhost for the ratings application:

    apiVersion: networking.istio.io/v1beta1
    kind: Sidecar
    metadata:
      name: ratings
    spec:
      workloadSelector:
        labels:
          app: ratings
      ingress:
      - port:
          number: 8080
          protocol: HTTP
          name: http
        defaultEndpoint: 127.0.0.1:8080
  • Disable the change entirely with the PILOT_ENABLE_INBOUND_PASSTHROUGH=false environment variable in Istiod, to keep the same behavior as prior to Istio 1.10 (one way to set this is sketched below). This option will be removed in the future.
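
For reference, here is a minimal sketch of setting that variable, assuming you install Istio with the IstioOperator API; only the env entry matters, and the rest of your operator spec stays as it is:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    pilot:
      k8s:
        env:
        # Restores the pre-1.10 inbound behavior; note this option is slated for removal.
        - name: PILOT_ENABLE_INBOUND_PASSTHROUGH
          value: "false"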

]]>
Thu, 15 Apr 2021 00:00:00 +0000/v1.24//blog/2021/upcoming-networking-changes//v1.24//blog/2021/upcoming-networking-changes/
Istio and Envoy WebAssembly Extensibility, One Year OnOne year ago today, in the 1.5 release, we introduced WebAssembly-based extensibility to Istio. Over the course of the year, the Istio, Envoy, and Proxy-Wasm communities have continued our joint efforts to make WebAssembly (Wasm) extensibility stable, reliable, and easy to adopt. Let’s walk through the updates to Wasm support through the Istio 1.9 release, and our plans for the future.

WebAssembly support merged in upstream Envoy

After adding experimental support for Wasm and the WebAssembly for Proxies (Proxy-Wasm) ABI to Istio’s fork of Envoy, we collected some great feedback from our community of early adopters. This, combined with the experience gained from developing core Istio Wasm extensions, helped us mature and stabilize the runtime. These improvements unblocked merging Wasm support directly into Envoy upstream in October 2020, allowing it to become part of all official Envoy releases. This was a significant milestone, since it indicates that:

  • The runtime is ready for wider adoption.
  • The programming ABI/API, extension configuration API, and runtime behavior, are becoming stable.
  • You can expect a larger community of adoption and support moving forward.

wasm-extensions Ecosystem Repository

As an early adopter of the Envoy Wasm runtime, the Istio Extensions and Telemetry working group gained a lot of experience in developing extensions. We built several first-class extensions, including metadata exchange, Prometheus stats, and attribute generation. In order to share our learning more broadly, we created a wasm-extensions repository in the istio-ecosystem organization. This repository serves two purposes:

  • It provides canonical example extensions, covering several highly demanded features (such as basic authentication).
  • It provides a guide for Wasm extension development, testing, and release. The guide is based on the same build tool chains and test frameworks that are used, maintained and tested by the Istio extensibility team.

The guide currently covers WebAssembly extension development and unit testing with C++, as well as integration testing with a Go test framework, which simulates a real runtime by running a Wasm module with the Istio proxy binary. In the future, we will also add several more canonical extensions, such as an integration with Open Policy Agent, and header manipulation based on JWT tokens.

Wasm module distribution via the Istio Agent

Prior to Istio 1.9, Envoy remote data sources were needed to distribute remote Wasm modules to the proxy. In this example, you can see two EnvoyFilter resources are defined: one to add a remote fetch Envoy cluster, and the other one to inject a Wasm filter into the HTTP filter chain. This method has a drawback: if remote fetch fails, either due to bad configuration or transient error, Envoy will be stuck with the bad configuration. If a Wasm extension is configured as fail closed, a bad remote fetch will stop Envoy from serving. To fix this issue, a fundamental change is needed to the Envoy xDS protocol to make it allow asynchronous xDS responses.

Istio 1.9 provides a reliable distribution mechanism out of the box by leveraging the xDS proxy inside istio-agent and Envoy’s Extension Configuration Discovery Service (ECDS).

istio-agent intercepts the extension config resource update from istiod, reads the remote fetch hint from it, downloads the Wasm module, and rewrites the ECDS configuration with the path of the downloaded Wasm module. If the download fails, istio-agent will reject the ECDS update and prevent a bad configuration reaching Envoy. For more detail, please see our docs on Wasm module distribution.

Remote Wasm module fetch flow

Istio Wasm SIG and Future Work

Although we have made a lot of progress on Wasm extensibility, there are still many aspects of the project that remain to be completed. In order to consolidate the efforts from various parties and better tackle the challenges ahead, we have formed an Istio WebAssembly SIG, with the aim of providing a standard and reliable way for Istio to consume Wasm extensions. Here are some of the things we are working on:

  • A first-class extension API: Currently, Wasm extensions need to be injected via Istio’s EnvoyFilter API. A first-class extension API will make using Wasm with Istio easier, and we expect this to be introduced in Istio 1.10.
  • Distribution artifacts interoperability: Built on top of Solo.io’s WebAssembly OCI image spec effort, a standard Wasm artifacts format will make it easy to build, pull, publish, and execute.
  • Container Storage Interface (CSI) based artifacts distribution: Using istio-agent to distribute modules is easy for adoption, but may not be efficient as each proxy will keep a copy of the Wasm module. As a more efficient solution, with Ephemeral CSI, a DaemonSet will be provided which could configure storage for pods. Working similarly to a CNI plugin, a CSI driver would fetch the Wasm module out-of-band from the xDS flow and mount it inside the rootfs when the pod starts up.

If you would like to join us, the group will meet every other week Tuesdays at 2PM PT. You can find the meeting on the Istio working group calendar.

We look forward to seeing how you will use Wasm to extend Istio!

]]>
Fri, 05 Mar 2021 00:00:00 +0000/v1.24//blog/2021/wasm-progress//v1.24//blog/2021/wasm-progress/wasmextensibilityWebAssembly
Migrate pre-Istio 1.4 Alpha security policy to the current APIsIn versions of Istio prior to 1.4, security policy was configured using v1alpha1 APIs (MeshPolicy, Policy, ClusterRbacConfig, ServiceRole and ServiceRoleBinding). After consulting with our early adopters, we made major improvements to the policy system and released v1beta1 APIs along with Istio 1.4. These refreshed APIs (PeerAuthentication, RequestAuthentication and AuthorizationPolicy) helped standardize how we define policy targets in Istio, helped users understand where policies were applied, and cut the number of configuration objects required.

The old APIs were deprecated in Istio 1.4. Two releases after the v1beta1 APIs were introduced, Istio 1.6 removed support for the v1alpha1 APIs.

If you are using a version of Istio prior to 1.6 and you want to upgrade, you will have to migrate your alpha security policy objects to the beta API. This tutorial will help you make that move.

Overview

Your control plane must first be upgraded to a version that supports the v1beta1 security policy.

It is recommended to first upgrade to Istio 1.5 as a transitional version, because it is the only version that supports both v1alpha1 and v1beta1 security policies. You will complete the security policy migration in Istio 1.5, remove the v1alpha1 security policy, and then continue to upgrade to later Istio versions. For a given workload, the v1beta1 version will take precedence over the v1alpha1 version.

Alternatively, if you want to do a skip-level upgrade directly from Istio 1.4 to 1.6 or later, you should use the canary upgrade method to install a new Istio version as a separate control plane, and gradually migrate your workloads to the new control plane completing the security policy migration at the same time.

In either case, it is recommended to migrate using namespace granularity: for each namespace, find all the v1alpha1 policies that have an effect on workloads in the namespace and migrate all the policies to v1beta1 at the same time. This allows a safer migration as you can make sure everything is working as expected, and then move forward to the next namespace.

Major differences

Before starting the migration, read through the v1beta1 authentication and authorization documentation to understand the v1beta1 policy.

You should examine all of your existing v1alpha1 security policies, find out what fields are used and which policies need migration, compare the findings with the major differences listed below and confirm there are no blocking issues (e.g., using an alpha feature that is no longer supported in beta):

Major Differences v1alpha1 v1beta1
API stability not backward compatible backward compatible
mTLS MeshPolicy and Policy PeerAuthentication
JWT MeshPolicy and Policy RequestAuthentication
Authorization ClusterRbacConfig, ServiceRole and ServiceRoleBinding AuthorizationPolicy
Policy target service name based workload selector based
Port number service ports workload ports

Although RequestAuthentication in v1beta1 security policy is similar to the v1alpha1 JWT policy, there is a notable semantics change. The v1alpha1 JWT policy needs to be migrated to two v1beta1 resources: RequestAuthentication and AuthorizationPolicy. This will change the JWT deny message due to the use of AuthorizationPolicy. In the alpha version, the HTTP code 401 is returned with the body Origin authentication failed. In the beta version, the HTTP code 403 is returned with the body RBAC: access denied.

The v1alpha1 JWT policy triggerRule field is replaced by the AuthorizationPolicy with the exception that the regex field is no longer supported.

Migration flow

This section describes in detail how to migrate a v1alpha1 security policy.

Step 1: Find the policies affecting the namespace

For each namespace, find all v1alpha1 security policies that have an effect on workloads in the namespace. The result could include:

  • a single MeshPolicy that applies to all services in the mesh;
  • a single namespace-level Policy that applies to all workloads in the namespace;
  • multiple service-level Policy objects that apply to the selected services in the namespace;
  • a single ClusterRbacConfig that enables the RBAC on the whole namespace or some services in the namespace;
  • multiple namespace-level ServiceRole and ServiceRoleBinding objects that apply to all services in the namespace;
  • multiple service-level ServiceRole and ServiceRoleBinding objects that apply to the selected services in the namespace;

Step 2: Convert service name to workload selector

The v1alpha1 policy selects targets using their service name. You should refer to the corresponding service definition to decide the workload selector that should be used in the v1beta1 policy.

A single v1alpha1 policy may include multiple services. It will need to be migrated to multiple v1beta1 policies because the v1beta1 policy currently only supports at most one workload selector per policy.

Also note the v1alpha1 policy uses service port but the v1beta1 policy uses the workload port. This means the port number might be different in the migrated v1beta1 policy.

Step 3: Migrate authentication policy

For each v1alpha1 authentication policy, migrate with the following rules:

  1. If the whole namespace is enabled with mTLS or JWT, create the PeerAuthentication, RequestAuthentication and AuthorizationPolicy without a workload selector for the whole namespace. Fill out the policy based on the semantics of the corresponding MeshPolicy or Policy for the namespace.

  2. If a workload is enabled with mTLS or JWT, create the PeerAuthentication, RequestAuthentication and AuthorizationPolicy with a corresponding workload selector for the workload. Fill out the policy based on the semantics of the corresponding MeshPolicy or Policy for the workload.

  3. For mTLS related configuration, use STRICT mode if the alpha policy is using STRICT, or use PERMISSIVE in all other cases.

  4. For JWT related configuration, refer to the end-user authentication documentation to learn how to migrate to RequestAuthentication and AuthorizationPolicy.

A security policy migration tool is provided to automatically migrate the authentication policy. Please refer to the tool’s README for its usage.

Step 4: Migrate RBAC policy

For each v1alpha1 RBAC policy, migrate with the following rules:

  1. If the whole namespace is enabled with RBAC, create an AuthorizationPolicy without a workload selector for the whole namespace. Leave the spec without any rules so that it denies all requests to the namespace by default.

  2. If a workload is enabled with RBAC, create an AuthorizationPolicy with a corresponding workload selector for the workload. Add rules based on the semantics of the corresponding ServiceRole and ServiceRoleBinding for the workload.

Step 5: Verify migrated policy

  1. Double check the migrated v1beta1 policies: make sure there are no policies with duplicate names, the namespace is specified correctly and all v1alpha1 policies for the given namespace are migrated.

  2. Dry-run the v1beta1 policy with the command kubectl apply --dry-run=server -f beta-policy.yaml to make sure it is valid.

  3. Apply the v1beta1 policy to the given namespace and closely monitor the effect. Make sure to test both allow and deny scenarios if JWT or authorization are used.

  4. Migrate the next namespace. Only remove the v1alpha1 policy after completing migration for all namespaces successfully.

Example

v1alpha1 policy

This section gives a full example showing the migration for namespace foo. Assume the namespace foo has the following v1alpha1 policies that affect the workloads in it:

# A MeshPolicy that enables mTLS globally, including the whole foo namespace
apiVersion: "authentication.istio.io/v1alpha1"
kind: "MeshPolicy"
metadata:
  name: "default"
spec:
  peers:
  - mtls: {}
---
# A Policy that enables mTLS permissive mode and enables JWT for the httpbin service on port 8000
apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: httpbin
  namespace: foo
spec:
  targets:
  - name: httpbin
    ports:
    - number: 8000
  peers:
  - mtls:
      mode: PERMISSIVE
  origins:
  - jwt:
      issuer: testing@example.com
      jwksUri: https://www.example.com/jwks.json
      triggerRules:
      - includedPaths:
        - prefix: /admin/
        excludedPaths:
        - exact: /admin/status
  principalBinding: USE_ORIGIN
---
# A ClusterRbacConfig that enables RBAC globally, including the foo namespace
apiVersion: "rbac.istio.io/v1alpha1"
kind: ClusterRbacConfig
metadata:
  name: default
spec:
  mode: 'ON'
---
# A ServiceRole that enables RBAC for the httpbin service
apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: httpbin
  namespace: foo
spec:
  rules:
  - services: ["httpbin.foo.svc.cluster.local"]
    methods: ["GET"]
---
# A ServiceRoleBinding for the above ServiceRole
apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: httpbin
  namespace: foo
spec:
  subjects:
  - user: cluster.local/ns/foo/sa/sleep
  roleRef:
    kind: ServiceRole
    name: httpbin

httpbin service

The httpbin service has the following definition:

apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: foo
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin

This means the service name httpbin should be replaced by the workload selector app: httpbin, and the service port 8000 should be replaced by the workload port 80.

v1beta1 authentication policy

The migrated v1beta1 policies for the v1alpha1 authentication policies in foo namespace are listed below:

# A PeerAuthentication that enables mTLS for the foo namespace, migrated from the MeshPolicy
# Alternatively the MeshPolicy could also be migrated to a PeerAuthentication at mesh level
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: foo
spec:
  mtls:
    mode: STRICT
---
# A PeerAuthentication that enables mTLS for the httpbin workload, migrated from the Policy
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: httpbin
  namespace: foo
spec:
  selector:
    matchLabels:
      app: httpbin
  # port level mtls set for the workload port 80 corresponding to the service port 8000
  portLevelMtls:
    80:
      mode: PERMISSIVE
---
# A RequestAuthentication that enables JWT for the httpbin workload, migrated from the Policy
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: httpbin
  namespace: foo
spec:
  selector:
    matchLabels:
      app: httpbin
  jwtRules:
  - issuer: testing@example.com
    jwksUri: https://www.example.com/jwks.json
---
# An AuthorizationPolicy that requires JWT validation for the httpbin workload, migrated from the Policy
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: httpbin-jwt
  namespace: foo
spec:
  # Use DENY action to explicitly deny requests without JWT token
  action: DENY
  selector:
    matchLabels:
      app: httpbin
  rules:
  - from:
    - source:
        # This makes sure requests without JWT token will be denied
        notRequestPrincipals: ["*"]
    to:
    - operation:
        # This should be the workload port 80, not the service port 8000
        ports: ["80"]
        # The paths and notPaths are converted from the trigger rule in the Policy
        paths: ["/admin/*"]
        notPaths: ["/admin/status"]

v1beta1 authorization policy

The migrated v1beta1 policies for the v1alpha1 RBAC policies in foo namespace are listed below:

# An AuthorizationPolicy that denies by default, migrated from the ClusterRbacConfig
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: default
  namespace: foo
spec:
  # An empty spec with no rules matches nothing, so all requests are denied by default
  {}
---
# An AuthorizationPolicy that enforces authorization for the httpbin workload, migrated from the ServiceRole and ServiceRoleBinding
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: httpbin
  namespace: foo
spec:
  selector:
    matchLabels:
      app: httpbin
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/foo/sa/sleep"]
    to:
    - operation:
        methods: ["GET"]

Finish the upgrade

Congratulations; having reached this point, you should only have v1beta1 policy objects, and you will be able to continue upgrading Istio to 1.6 and beyond.

]]>
Wed, 03 Mar 2021 00:00:00 +0000/v1.24//blog/2021/migrate-alpha-policy//v1.24//blog/2021/migrate-alpha-policy/securitypolicymigratealphabetadeprecatepeerjwtauthorization
Zero Configuration IstioWhen a new user encounters Istio for the first time, they are sometimes overwhelmed by the vast feature set it exposes. Unfortunately, this can give the impression that Istio is needlessly complex and not fit for small teams or clusters.

One great part about Istio, however, is that it aims to bring as much value as possible to users out of the box, without any configuration at all. This enables users to get most of the benefits of Istio with minimal effort. For some users with simple requirements, custom configuration may never be required at all. Others will be able to incrementally add Istio configuration once they are more comfortable and as they need it, such as to add ingress routing, fine-tune networking settings, or lock down security policies.

Getting started

To get started, check out our getting started documentation, where you will learn how to install Istio. If you are already familiar, you can simply run istioctl install.

Next, we will explore all the benefits Istio provides us, without any configuration or changes to application code.

Security

Istio automatically enables mutual TLS for traffic between pods in the mesh. This enables applications to forgo complex TLS configuration and certificate management, and offload all transport layer security to the sidecar.

Once comfortable with automatic TLS, you may choose to allow only mTLS traffic, or configure custom authorization policies for your needs.
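
For example, when you later decide to allow only mTLS traffic, a single mesh-wide PeerAuthentication is enough. The following is a sketch that assumes istio-system is your root namespace; it is not needed for the zero-configuration setup described here:

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system  # assumed root namespace of the mesh
spec:
  mtls:
    mode: STRICT           # reject plain-text traffic to workloads in the mesh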

Observability

Istio automatically generates detailed telemetry for all service communications within a mesh. This telemetry provides observability of service behavior, empowering operators to troubleshoot, maintain, and optimize their applications – without imposing any additional burdens on service developers. Through Istio, operators gain a thorough understanding of how monitored services are interacting, both with other services and with the Istio components themselves.

All of this functionality is added by Istio without any configuration. Integrations with tools such as Prometheus, Grafana, Jaeger, Zipkin, and Kiali are also available.

For more information about the observability Istio provides, check out the observability overview.

Traffic Management

While Kubernetes provides a lot of networking functionality, such as service discovery and DNS, this is done at Layer 4, which can have unintended inefficiencies. For example, in a simple HTTP application sending traffic to a service with 3 replicas, we can see unbalanced load:

$ curl http://echo/{0..5} -s | grep Hostname
Hostname=echo-cb96f8d94-2ssll
Hostname=echo-cb96f8d94-2ssll
Hostname=echo-cb96f8d94-2ssll
Hostname=echo-cb96f8d94-2ssll
Hostname=echo-cb96f8d94-2ssll
Hostname=echo-cb96f8d94-2ssll
$ curl http://echo/{0..5} -s | grep Hostname
Hostname=echo-cb96f8d94-879sn
Hostname=echo-cb96f8d94-879sn
Hostname=echo-cb96f8d94-879sn
Hostname=echo-cb96f8d94-879sn
Hostname=echo-cb96f8d94-879sn
Hostname=echo-cb96f8d94-879sn

The problem here is that Kubernetes determines the backend to send to when the connection is established, and all future requests on the same connection are sent to the same backend. In our example, the first six requests are all sent to echo-cb96f8d94-2ssll, while the next set (using a new connection) are all sent to echo-cb96f8d94-879sn. Our third instance never receives any requests.

With Istio, HTTP traffic (including HTTP/2 and gRPC) is automatically detected, and our services will automatically be load balanced per request, rather than per connection:

$ curl http://echo/{0..5} -s | grep Hostname
Hostname=echo-cb96f8d94-wf4xk
Hostname=echo-cb96f8d94-rpfqz
Hostname=echo-cb96f8d94-cgmxr
Hostname=echo-cb96f8d94-wf4xk
Hostname=echo-cb96f8d94-rpfqz
Hostname=echo-cb96f8d94-cgmxr

Here we can see our requests are round-robin load balanced between all backends.

In addition to these better defaults, Istio offers customization of a variety of traffic management settings, including timeouts, retries, and much more.
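
As a small taste of that, the following VirtualService sketch adds a timeout and retries for the echo service used above; the numbers are arbitrary examples, not recommendations:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: echo
spec:
  hosts:
  - echo
  http:
  - route:
    - destination:
        host: echo
    timeout: 5s         # fail requests that take longer than 5 seconds overall
    retries:
      attempts: 3       # retry a failed request up to three times
      perTryTimeout: 2s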

]]>
Thu, 25 Feb 2021 00:00:00 +0000/v1.24//blog/2021/zero-config-istio//v1.24//blog/2021/zero-config-istio/
IstioCon 2021: Schedule Is Live!IstioCon 2021 is a week-long, community-led, virtual conference starting on February 22. This event provides an opportunity to hear the lessons learned from companies like Atlassian, Airbnb, FICO, eBay, T-Mobile and Salesforce running Istio in production, hands-on experiences from the Istio community, and will feature maintainers from across the Istio ecosystem.

You can now find the full schedule of events which includes a series of English sessions and Chinese sessions.

By attending the conference, you’ll connect with community members from across the globe. Each day you will find keynotes, technical talks, lightning talks, panel discussions, workshops and roadmap sessions led by diverse speakers representing the Istio community. You can also connect with other Istio and Open Source ecosystem community members through social hour events that include activities on the social platform Gather.town, a live cartoonist, virtual swag bags, raffles, live music and games.

Don’t miss it! Registration is free. We look forward to seeing you at the first IstioCon!

]]>
Tue, 16 Feb 2021 00:00:00 +0000/v1.24//blog/2021/istiocon-2021-program//v1.24//blog/2021/istiocon-2021-program/IstioConIstioconference
Better External AuthorizationBackground

Istio’s authorization policy provides access control for services in the mesh. It is a fast, powerful, and widely used feature. We have made continuous improvements to make policy more flexible since its first release in Istio 1.4, including the DENY action, exclusion semantics, X-Forwarded-For header support, nested JWT claim support and more. These features improve the flexibility of the authorization policy, but there are still many use cases that cannot be supported with this model, for example:

  • You have your own in-house authorization system that cannot be easily migrated to, or cannot be easily replaced by, the authorization policy.

  • You want to integrate with a 3rd-party solution (e.g. Open Policy Agent or oauth2 proxy) which may require use of the low-level Envoy configuration APIs in Istio, or may not be possible at all.

  • Authorization policy lacks necessary semantics for your use case.

Solution

In Istio 1.9, we have implemented extensibility into authorization policy by introducing a CUSTOM action, which allows you to delegate the access control decision to an external authorization service.

The CUSTOM action allows you to integrate Istio with an external authorization system that implements its own custom authorization logic. The following diagram shows the high level architecture of this integration:

External Authorization Architecture

At configuration time, the mesh admin configures an authorization policy with a CUSTOM action to enable the external authorization on a proxy (either gateway or sidecar). The admin should verify the external auth service is up and running.

At runtime,

  1. A request is intercepted by the proxy, and the proxy will send check requests to the external auth service, as configured by the user in the authorization policy.

  2. The external auth service will make the decision whether to allow it or not.

  3. If allowed, the request will continue and will then be checked against any local authorization defined by ALLOW/DENY actions.

  4. If denied, the request will be rejected immediately.

Let’s look at an example authorization policy with the CUSTOM action:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: ext-authz
  namespace: istio-system
spec:
  # The selector applies to the ingress gateway in the istio-system namespace.
  selector:
    matchLabels:
      app: istio-ingressgateway
  # The action "CUSTOM" delegates the access control to an external authorizer, this is different from
  # the ALLOW/DENY action that enforces the access control right inside the proxy.
  action: CUSTOM
  # The provider specifies the name of the external authorizer defined in the meshconfig, which tells where and how to
  # talk to the external auth service. We will cover this more later.
  provider:
    name: "my-ext-authz-service"
  # The rule specifies that the access control is triggered only if the request path has the prefix "/admin/".
  # This allows you to easily enable or disable the external authorization based on the requests, avoiding the external
  # check request if it is not needed.
  rules:
  - to:
    - operation:
        paths: ["/admin/*"]

It refers to a provider called my-ext-authz-service which is defined in the mesh config:

extensionProviders:
# The name "my-ext-authz-service" is referred to by the authorization policy in its provider field.
- name: "my-ext-authz-service"
  # The "envoyExtAuthzGrpc" field specifies the type of the external authorization service is implemented by the Envoy
  # ext-authz filter gRPC API. The other supported type is the Envoy ext-authz filter HTTP API.
  # See more in https://www.envoyproxy.io/docs/envoy/v1.16.2/intro/arch_overview/security/ext_authz_filter.
  envoyExtAuthzGrpc:
    # The service and port specifies the address of the external auth service, "ext-authz.istio-system.svc.cluster.local"
    # means the service is deployed in the mesh. It can also be defined out of the mesh or even inside the pod as a separate
    # container.
    service: "ext-authz.istio-system.svc.cluster.local"
    port: 9000

An authorization policy with the CUSTOM action enables the external authorization at runtime. It can be configured to trigger the external authorization conditionally, based on the request, using the same rules that you have already been using with other actions.

The external authorization service is currently defined in the meshconfig API and referred to by its name. It can be deployed in the mesh with or without a sidecar proxy. If it is deployed with a proxy, you can further use PeerAuthentication to enable mTLS between the proxy and your external authorization service, as sketched below.
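
For example, assuming the external authorizer runs in istio-system with a sidecar and carries the label app: ext-authz (both are assumptions for this sketch), a workload-level PeerAuthentication can require mTLS for the check requests:

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: ext-authz
  namespace: istio-system   # assumed namespace of the external authorizer
spec:
  selector:
    matchLabels:
      app: ext-authz        # assumed label on the external authorizer pods
  mtls:
    mode: STRICT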

The CUSTOM action is currently in the experimental stage; the API might change in a non-backward compatible way based on user feedback. The authorization policy rules currently don’t support authentication fields (e.g. source principal or JWT claim) when used with the CUSTOM action. Only one provider is allowed for a given workload, but you can still use different providers on different workloads.

For more information, please see the Better External Authorization design doc.

Example with OPA

In this section, we will demonstrate using the CUSTOM action with the Open Policy Agent as the external authorizer on the ingress gateway. We will conditionally enable the external authorization on all paths except /ip.

You can also refer to the external authorization task for a more basic introduction that uses a sample ext-authz server.

Create the example OPA policy

Run the following command to create an OPA policy that allows the request if the prefix of the path matches the base64-encoded claim “path” in the JWT token:

$ cat > policy.rego <<EOF
package envoy.authz

import input.attributes.request.http as http_request

default allow = false

token = {"valid": valid, "payload": payload} {
    [_, encoded] := split(http_request.headers.authorization, " ")
    [valid, _, payload] := io.jwt.decode_verify(encoded, {"secret": "secret"})
}

allow {
    is_token_valid
    action_allowed
}

is_token_valid {
  token.valid
  now := time.now_ns() / 1000000000
  token.payload.nbf <= now
  now < token.payload.exp
}

action_allowed {
  startswith(http_request.path, base64url.decode(token.payload.path))
}
EOF
$ kubectl create secret generic opa-policy --from-file policy.rego

Deploy httpbin and OPA

Enable the sidecar injection:

$ kubectl label ns default istio-injection=enabled

Run the following command to deploy the example application httpbin and OPA. OPA can be deployed either as a separate container in the httpbin pod or in a completely separate pod:

$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: httpbin-with-opa
  labels:
    app: httpbin-with-opa
    service: httpbin-with-opa
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin-with-opa
---
# Define the service entry for the local OPA service on port 9191.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: local-opa-grpc
spec:
  hosts:
  - "local-opa-grpc.local"
  endpoints:
  - address: "127.0.0.1"
  ports:
  - name: grpc
    number: 9191
    protocol: GRPC
  resolution: STATIC
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: httpbin-with-opa
  labels:
    app: httpbin-with-opa
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin-with-opa
  template:
    metadata:
      labels:
        app: httpbin-with-opa
    spec:
      containers:
        - image: docker.io/kennethreitz/httpbin
          imagePullPolicy: IfNotPresent
          name: httpbin
          ports:
          - containerPort: 80
        - name: opa
          image: openpolicyagent/opa:latest-envoy
          securityContext:
            runAsUser: 1111
          volumeMounts:
          - readOnly: true
            mountPath: /policy
            name: opa-policy
          args:
          - "run"
          - "--server"
          - "--addr=localhost:8181"
          - "--diagnostic-addr=0.0.0.0:8282"
          - "--set=plugins.envoy_ext_authz_grpc.addr=:9191"
          - "--set=plugins.envoy_ext_authz_grpc.query=data.envoy.authz.allow"
          - "--set=decision_logs.console=true"
          - "--ignore=.*"
          - "/policy/policy.rego"
          livenessProbe:
            httpGet:
              path: /health?plugins
              scheme: HTTP
              port: 8282
            initialDelaySeconds: 5
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /health?plugins
              scheme: HTTP
              port: 8282
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: opa-policy
          secret:
            secretName: opa-policy
EOF

Define external authorizer

Run the following command to edit the meshconfig:

$ kubectl edit configmap istio -n istio-system

Add the following extensionProviders to the meshconfig:

apiVersion: v1
data:
  mesh: |-
    # Add the following contents:
    extensionProviders:
    - name: "opa.local"
      envoyExtAuthzGrpc:
        service: "local-opa-grpc.local"
        port: "9191"

Create an AuthorizationPolicy with a CUSTOM action

Run the following command to create the authorization policy that enables the external authorization on all paths except /ip:

$ kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: httpbin-opa
spec:
  selector:
    matchLabels:
      app: httpbin-with-opa
  action: CUSTOM
  provider:
    name: "opa.local"
  rules:
  - to:
    - operation:
        notPaths: ["/ip"]
EOF

Test the OPA policy

  1. Create a client pod to send the request:

    $ kubectl apply -f samples/sleep/sleep.yaml
    $ export SLEEP_POD=$(kubectl get pod -l app=sleep -o jsonpath={.items..metadata.name})
  2. Use a test JWT token signed by the OPA:

    $ export TOKEN_PATH_HEADERS="eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJwYXRoIjoiTDJobFlXUmxjbk09IiwibmJmIjoxNTAwMDAwMDAwLCJleHAiOjE5MDAwMDAwMDB9.9yl8LcZdq-5UpNLm0Hn0nnoBHXXAnK4e8RSl9vn6l98"

    The test JWT token has the following claims:

    {
      "path": "L2hlYWRlcnM=",
      "nbf": 1500000000,
      "exp": 1900000000
    }

    The path claim has value L2hlYWRlcnM= which is the base64 encode of /headers.

  3. Send a request to path /headers without a token. This should be rejected with 403 because there is no JWT token:

    $ kubectl exec ${SLEEP_POD} -c sleep  -- curl http://httpbin-with-opa:8000/headers -s -o /dev/null -w "%{http_code}\n"
    403

  4. Send a request to path /get with a valid token. This should be rejected with 403 because the path /get does not match the path /headers encoded in the token:

    $ kubectl exec ${SLEEP_POD} -c sleep  -- curl http://httpbin-with-opa:8000/get -H "Authorization: Bearer $TOKEN_PATH_HEADERS" -s -o /dev/null -w "%{http_code}\n"
    403
  5. Send a request to path /headers with a valid token. This should be allowed with 200 because the path matches the path encoded in the token:

    $ kubectl exec ${SLEEP_POD} -c sleep  -- curl http://httpbin-with-opa:8000/headers -H "Authorization: Bearer $TOKEN_PATH_HEADERS" -s -o /dev/null -w "%{http_code}\n"
    200
  6. Send a request to path /ip without a token. This should be allowed with 200 because the path /ip is excluded from authorization:

    $ kubectl exec ${SLEEP_POD} -c sleep  -- curl http://httpbin-with-opa:8000/ip -s -o /dev/null -w "%{http_code}\n"
    200
  7. Check the proxy and OPA logs to confirm the result.

Summary

In Istio 1.9, the CUSTOM action in the authorization policy allows you to easily integrate Istio with any external authorization system with the following benefits:

  • First-class support in the authorization policy API

  • Ease of use: define the external authorizer simply with a URL and enable it with an authorization policy; no more hassle with the EnvoyFilter API

  • Conditional triggering, allowing improved performance

  • Support for various deployment types of the external authorizer:

    • A normal service and pod with or without proxy

    • Inside the workload pod as a separate container

    • Outside the mesh

We’re working to promote this feature to a more stable stage in upcoming versions and welcome your feedback at discuss.istio.io.

Acknowledgements

Thanks to Craig Box, Christian Posta and Limin Wang for reviewing drafts of this blog.

]]>
Tue, 09 Feb 2021 00:00:00 +0000/v1.24//blog/2021/better-external-authz//v1.24//blog/2021/better-external-authz/authorizationaccess controlopaoauth2
Proxying legacy services using Istio egress gatewaysAt Deutsche Telekom Pan-Net, we have embraced Istio as the umbrella to cover our services. Unfortunately, there are services which have not yet been migrated to Kubernetes, or cannot be.

We can set Istio up as a proxy service for these upstream services. This allows us to benefit from capabilities like authorization/authentication, traceability and observability, even while legacy services stand as they are.

At the end of this article there is a hands-on exercise where you can simulate the scenario. In the exercise, an upstream service hosted at https://httpbin.org will be proxied by an Istio egress gateway.

If you are familiar with Istio, one of the methods offered to connect to upstream services is through an egress gateway.

You can deploy one to control all the upstream traffic or you can deploy multiple in order to have fine-grained control and satisfy the single-responsibility principle as this picture shows:

Overview multiple Egress Gateways

With this model, one egress gateway is in charge of exactly one upstream service.

Although the Operator spec allows you to deploy multiple egress gateways, the manifest can become unmanageable:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
[...]
spec:
    egressGateways:
    - name: egressgateway-1
      enabled: true
    - name: egressgateway-2
      enabled: true
    [egressgateway-3, egressgateway-4, ...]
    - name: egressgateway-N
      enabled: true
[...]

As a benefit of decoupling egress gateways from the Operator manifest, you gain the ability to set up custom readiness probes that keep both services (the gateway and the upstream service) aligned.

You can also inject OPA as a sidecar into the pod to perform authorization with complex rules (OPA envoy plugin).

Authorization with OPA and `healthcheck` to external

As you can see, your possibilities increase and Istio becomes very extensible.

Let’s look at how you can implement this pattern.

Solution

There are several ways to perform this task, but here you will find how to define multiple Operators and deploy the generated resources.

In the following section you will deploy an egress gateway to connect to an upstream service: httpbin (https://httpbin.org/)

At the end, you will have:

Communication

Hands on

Prerequisites

  • kind (Kubernetes-in-Docker - perfect for local development)
  • istioctl

Kind

Save this as config.yaml.

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
  - |
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    metadata:
      name: config
    apiServer:
      extraArgs:
        "service-account-issuer": "kubernetes.default.svc"
        "service-account-signing-key-file": "/etc/kubernetes/pki/sa.key"

$ kind create cluster --name <my-cluster-name> --config config.yaml

Where <my-cluster-name> is the name for the cluster.

Istio Operator with Istioctl

Install the Operator

$ istioctl operator init --watchedNamespaces=istio-operator
$ kubectl create ns istio-system

Save this as operator.yaml:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-operator
  namespace: istio-operator
spec:
  profile: default
  tag: 1.8.0
  meshConfig:
    accessLogFile: /dev/stdout
    outboundTrafficPolicy:
      mode: REGISTRY_ONLY

$ kubectl apply -f operator.yaml

Deploy Egress Gateway

The steps for this task assume:

  • The service is installed under the namespace: httpbin.
  • The service name is: http-egress.

Istio 1.8 introduced the possibility to apply overlay configuration, giving fine-grained control over the created resources.

Save this as egress.yaml:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  profile: empty
  tag: 1.8.0
  namespace: httpbin
  components:
    egressGateways:
    - name: httpbin-egress
      enabled: true
      label:
        app: istio-egressgateway
        istio: egressgateway
        custom-egress: httpbin-egress
      k8s:
        overlays:
        - kind: Deployment
          name: httpbin-egress
          patches:
          - path: spec.template.spec.containers[0].readinessProbe
            value:
              failureThreshold: 30
              exec:
                command:
                  - /bin/sh
                  - -c
                  - curl http://localhost:15021/healthz/ready && curl https://httpbin.org/status/200
              initialDelaySeconds: 1
              periodSeconds: 2
              successThreshold: 1
              timeoutSeconds: 1
  values:
    gateways:
      istio-egressgateway:
        runAsRoot: true

Create the namespace where you will install the egress gateway:

$ kubectl create ns httpbin

As it is described in the documentation, you can deploy several Operator resources. However, they have to be pre-parsed and then applied to the cluster.

$ istioctl manifest generate -f egress.yaml | kubectl apply -f -

Istio configuration

Now you will configure Istio to allow connections to the upstream service at https://httpbin.org.

Certificate for TLS

You need a certificate to make a secure connection from outside the cluster to your egress service.

How to generate a certificate is explained in the Istio ingress documentation.

Create and apply one to be used at the end of this article to access the service from outside the cluster (<my-proxied-service-hostname>):

$ kubectl create -n istio-system secret tls <my-secret-name> --key=<key> --cert=<cert>

Where <my-secret-name> is the name used later in the Gateway resource, and <key> and <cert> are the key and certificate files.

Ingress Gateway

Create a Gateway resource so that the ingress gateway accepts requests.

An example:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: my-ingressgateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - "<my-proxied-service-hostname>"
    port:
      name: http
      number: 80
      protocol: HTTP
    tls:
     httpsRedirect: true
  - port:
      number: 443
      name: https
      protocol: HTTPS
    hosts:
    - "<my-proxied-service-hostname>"
    tls:
      mode: SIMPLE
      credentialName: <my-secret-name>

Where <my-proxied-service-hostname> is the hostname to access the service through the my-ingressgateway and <my-secret-name> is the secret which contains the certificate.

Egress Gateway

Create another Gateway object, but this time to operate the egress gateway you have already installed:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: "httpbin-egress"
  namespace: "httpbin"
spec:
  selector:
    istio: egressgateway
    service.istio.io/canonical-name: "httpbin-egress"
  servers:
  - hosts:
    - "<my-proxied-service-hostname>"
    port:
      number: 80
      name: http
      protocol: HTTP

Where <my-proxied-service-hostname> is the hostname to access through the my-ingressgateway.

Virtual Service

Create a VirtualService for three use cases:

  • Mesh gateway for service-to-service communications within the mesh
  • Ingress Gateway for the communication from outside the mesh
  • Egress Gateway for the communication to the upstream service
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: "httpbin-egress"
  namespace: "httpbin"
spec:
  hosts:
  - "<my-proxied-service-hostname>"
  gateways:
  - mesh
  - "istio-system/my-ingressgateway"
  - "httpbin/httpbin-egress"
  http:
  - match:
    - gateways:
      - "istio-system/my-ingressgateway"
      - mesh
      uri:
        prefix: "/"
    route:
    - destination:
        host: "httpbin-egress.httpbin.svc.cluster.local"
        port:
          number: 80
  - match:
    - gateways:
      - "httpbin/httpbin-egress"
      uri:
        prefix: "/"
    route:
    - destination:
        host: "httpbin.org"
        subset: "http-egress-subset"
        port:
          number: 443

Where <my-proxied-service-hostname> is the hostname to access through the my-ingressgateway.

Service Entry

Create a ServiceEntry to allow the communication to the upstream service:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: "httpbin-egress"
  namespace: "httpbin"
spec:
  hosts:
  - "httpbin.org"
  location: MESH_EXTERNAL
  ports:
  - number: 443
    name: https
    protocol: TLS
  resolution: DNS

Destination Rule

Create a DestinationRule to allow TLS origination for egress traffic as explained in the documentation

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: "httpbin-egress"
  namespace: "httpbin"
spec:
  host: "httpbin.org"
  subsets:
  - name: "http-egress-subset"
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN
      portLevelSettings:
      - port:
          number: 443
        tls:
          mode: SIMPLE

Peer Authentication

To secure the service-to-service, you need to enforce mTLS:

apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "httpbin-egress"
  namespace: "httpbin"
spec:
  mtls:
    mode: STRICT

Test

Verify that your objects were all specified correctly:

$ istioctl analyze --all-namespaces

External access

Test the egress gateway from outside the cluster by forwarding the ingress gateway service’s port and calling the service:

$ kubectl -n istio-system port-forward svc/istio-ingressgateway 15443:443
$ curl -vvv -k -HHost:<my-proxied-service-hostname> --resolve "<my-proxied-service-hostname>:15443:127.0.0.1" --cacert <cert> "https://<my-proxied-service-hostname>:15443/status/200"

Where <my-proxied-service-hostname> is the hostname to access through the my-ingressgateway and <cert> is the certificate defined for the ingress gateway. The certificate is needed because the gateway is configured with tls.mode: SIMPLE, which terminates TLS at the ingress gateway using the certificate you created earlier.

Service-to-service access

Test the egress gateway from inside the cluster by deploying the sleep service. This is useful when you design failover.

$ kubectl label namespace httpbin istio-injection=enabled --overwrite
$ kubectl apply -n httpbin -f  https://raw.githubusercontent.com/istio/istio/release-1.24/samples/sleep/sleep.yaml
$ kubectl exec -n httpbin "$(kubectl get pod -n httpbin -l app=sleep -o jsonpath={.items..metadata.name})" -- curl -vvv http://<my-proxied-service-hostname>/status/200

Where <my-proxied-service-hostname> is the hostname to access through the my-ingressgateway.

Now it is time to create a second, third and fourth egress gateway pointing to other upstream services.

Final thoughts

Istio might seem complex to configure. But it is definitely worthwhile, due to the huge set of benefits it brings to your services (with an extra Olé! for Kiali).

The way Istio is developed allows us, with minimal effort, to satisfy uncommon requirements like the one presented in this article.

To finish, I just wanted to point out that Istio, as a good cloud native technology, does not require a large team to maintain. For example, our current team is composed of 3 engineers.

To discuss more about Istio and its possibilities, please contact one of us.

]]>
Wed, 16 Dec 2020 00:00:00 +0000/v1.24//blog/2020/proxying-legacy-services-using-egress-gateways//v1.24//blog/2020/proxying-legacy-services-using-egress-gateways/configurationegressgatewayexternalservice
Proxy protocol on AWS NLB and Istio ingress gatewayThis blog presents my latest experience about how to configure and enable proxy protocol with stack of AWS NLB and Istio Ingress gateway. The Proxy Protocol was designed to chain proxies and reverse-proxies without losing the client information. The proxy protocol prevents the need for infrastructure changes or NATing firewalls, and offers the benefits of being protocol agnostic and providing good scalability. Additionally, we also enable the X-Forwarded-For HTTP header in the deployment to make the client IP address easy to read. In this blog, traffic management of Istio ingress is shown with an httpbin service on ports 80 and 443 to demonstrate the use of proxy protocol. Note that both v1 and v2 of the proxy protocol work for the purpose of this example, but because the AWS NLB currently only supports v2, proxy protocol v2 is used in the rest of this blog by default. The following image shows the use of proxy protocol v2 with an AWS NLB.

AWS NLB portal to enable proxy protocol

Separate setups for 80 and 443

Before going through the following steps, an AWS environment that is configured with the proper VPC, IAM, and Kubernetes setup is assumed.

Step 1: Install Istio with AWS NLB

The blog Configuring Istio Ingress with AWS NLB provides detailed steps to set up AWS IAM roles and enable the usage of AWS NLB by Helm. You can also use other automation tools, such as Terraform, to achieve the same goal. In the following example, more complete configurations are shown in order to enable proxy protocol and X-Forwarded-For at the same time.

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    proxy.istio.io/config: '{"gatewayTopology" : { "numTrustedProxies": 2 } }'
  labels:
    app: istio-ingressgateway
    istio: ingressgateway
    release: istio
  name: istio-ingressgateway

Step 2: Create proxy-protocol Envoy Filter

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: proxy-protocol
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
  - applyTo: LISTENER
    patch:
      operation: MERGE
      value:
        listener_filters:
        - name: envoy.filters.listener.proxy_protocol
        - name: envoy.filters.listener.tls_inspector

Step 3: Enable X-Forwarded-For header

This blog includes several samples of configuring Gateway Network Topology. In the following example, the configurations are tuned to enable X-Forwarded-For without any middle proxy.

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: ingressgateway-settings
  namespace: istio-system
spec:
  configPatches:
  - applyTo: NETWORK_FILTER
    match:
      listener:
        filterChain:
          filter:
            name: envoy.http_connection_manager
    patch:
      operation: MERGE
      value:
        name: envoy.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
          skip_xff_append: false
          use_remote_address: true
          xff_num_trusted_hops: 1

Step 4: Deploy ingress gateway for httpbin on ports 80 and 443

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  selector:
    istio: ingressgateway # use Istio default gateway implementation
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "a25fa0b4835b.elb.us-west-2.amazonaws.com"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "a25fa0b4835b.elb.us-west-2.amazonaws.com"
  gateways:
  - httpbin-gateway
  http:
  - match:
    - uri:
        prefix: /headers
    route:
    - destination:
        port:
          number: 8000
        host: httpbin
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: mygateway2
spec:
  selector:
    istio: ingressgateway # use istio default ingress gateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: httpbin-credential # must be the same as secret
    hosts:
    - "a25fa0b4835b.elb.us-west-2.amazonaws.com"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "a25fa0b4835b.elb.us-west-2.amazonaws.com"
  gateways:
  - mygateway2
  http:
  - match:
    - uri:
        prefix: /headers
    route:
    - destination:
        port:
          number: 8000
        host: httpbin

Step 5: Check header output of httpbin

Check port 443 (80 will be similar) and compare the cases with and without proxy protocol.

//////with proxy_protocol enabled in the stack
*   Trying YY.XXX.141.26...
* TCP_NODELAY set
* Connection failed
* connect to YY.XXX.141.26 port 443 failed: Operation timed out
*   Trying YY.XXX.205.117...
* TCP_NODELAY set
* Connected to a25fa0b4835b.elb.us-west-2.amazonaws.com (XX.YYY.205.117) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: new_certificates/example.com.crt
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=a25fa0b4835b.elb.us-west-2.amazonaws.com; O=httpbin organization
*  start date: Oct 29 20:39:12 2020 GMT
*  expire date: Oct 29 20:39:12 2021 GMT
*  common name: a25fa0b4835b.elb.us-west-2.amazonaws.com (matched)
*  issuer: O=example Inc.; CN=example.com
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fc6c8810800)
> GET /headers?show_env=1 HTTP/2
> Host: a25fa0b4835b.elb.us-west-2.amazonaws.com
> User-Agent: curl/7.64.1
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS == 2147483647)!
< HTTP/2 200
< server: istio-envoy
< date: Thu, 29 Oct 2020 21:39:46 GMT
< content-type: application/json
< content-length: 629
< access-control-allow-origin: *
< access-control-allow-credentials: true
< x-envoy-upstream-service-time: 2
<
{
  "headers": {
    "Accept": "*/*",
    "Content-Length": "0",
    "Host": "a25fa0b4835b.elb.us-west-2.amazonaws.com",
    "User-Agent": "curl/7.64.1",
    "X-B3-Sampled": "0",
    "X-B3-Spanid": "74f99a1c6fc29975",
    "X-B3-Traceid": "85db86fe6aa322a074f99a1c6fc29975",
    "X-Envoy-Attempt-Count": "1",
    "X-Envoy-Decorator-Operation": "httpbin.default.svc.cluster.local:8000/headers*",
    "X-Envoy-External-Address": "XX.110.54.41",
    "X-Forwarded-For": "XX.110.54.41",
    "X-Forwarded-Proto": "https",
    "X-Request-Id": "5c3bc236-0c49-4401-b2fd-2dbfbce506fc"
  }
}
* Connection #0 to host a25fa0b4835b.elb.us-west-2.amazonaws.com left intact
* Closing connection 0
////////// without proxy_protocol
*   Trying YY.XXX.141.26...
* TCP_NODELAY set
* Connection failed
* connect to YY.XXX.141.26 port 443 failed: Operation timed out
*   Trying YY.XXX.205.117...
* TCP_NODELAY set
* Connected to a25fa0b4835b.elb.us-west-2.amazonaws.com (YY.XXX.205.117) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: new_certificates/example.com.crt
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=a25fa0b4835b.elb.us-west-2.amazonaws.com; O=httpbin organization
*  start date: Oct 29 20:39:12 2020 GMT
*  expire date: Oct 29 20:39:12 2021 GMT
*  common name: a25fa0b4835b.elb.us-west-2.amazonaws.com (matched)
*  issuer: O=example Inc.; CN=example.com
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fbf8c808200)
> GET /headers?show_env=1 HTTP/2
> Host: a25fa0b4835b.elb.us-west-2.amazonaws.com
> User-Agent: curl/7.64.1
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS == 2147483647)!
< HTTP/2 200
< server: istio-envoy
< date: Thu, 29 Oct 2020 20:44:01 GMT
< content-type: application/json
< content-length: 612
< access-control-allow-origin: *
< access-control-allow-credentials: true
< x-envoy-upstream-service-time: 1
<
{
  "headers": {
    "Accept": "*/*",
    "Content-Length": "0",
    "Host": "a25fa0b4835b.elb.us-west-2.amazonaws.com",
    "User-Agent": "curl/7.64.1",
    "X-B3-Sampled": "0",
    "X-B3-Spanid": "69913a6e6e949334",
    "X-B3-Traceid": "729d5da3618545da69913a6e6e949334",
    "X-Envoy-Attempt-Count": "1",
    "X-Envoy-Decorator-Operation": "httpbin.default.svc.cluster.local:8000/headers*",
    "X-Envoy-Internal": "true",
    "X-Forwarded-For": "172.16.5.30",
    "X-Forwarded-Proto": "https",
    "X-Request-Id": "299c7f8a-5f89-480a-82c9-028c76d45d84"
  }
}
* Connection #0 to host a25fa0b4835b.elb.us-west-2.amazonaws.com left intact
* Closing connection 0

Conclusion

This blog presents the deployment of a stack consisting of an AWS NLB and an Istio ingress gateway, both with proxy protocol enabled. We hope it is useful if you want a hands-on, informal walkthrough of enabling the protocol. Note, however, that the X-Forwarded-For header shown here should only be relied on for convenient inspection during testing; defending against spoofed X-Forwarded-For headers is outside the scope of this blog.

References

]]>
Fri, 11 Dec 2020 00:00:00 +0000/v1.24//blog/2020/show-source-ip//v1.24//blog/2020/show-source-ip/trafficManagementprotocol extending
Join us for the first IstioCon in 2021!IstioCon 2021 will be the inaugural conference for Istio, the industry’s most popular service mesh. In its inaugural year, IstioCon will be 100% virtual, connecting community members across the globe with Istio’s ecosystem. This conference will take place at the end of February.

All the information related to IstioCon will be published on the conference website. IstioCon provides an opportunity to showcase the lessons learned from running Istio in production, hands-on experiences from the Istio community, and will feature maintainers from across the Istio ecosystem. At this time, we encourage Istio users, developers, partners, and advocates to submit a session proposal through the conference’s CFP portal. The conference offers a mix of keynotes, technical talks, lightning talks, workshops, and roadmap sessions. Choose from the following formats to submit a session proposal for IstioCon:

  • Presentation: 40 minute presentation, maximum of 2 speakers
  • Panel: 40 minutes of discussion among 3 to 5 speakers
  • Workshop: 160 minute (2h 40m), in-depth, hands-on presentation with 1–4 speakers
  • Lightning Talk: 10 minute presentation, limited to 1 speaker

This community-led event also has in store two social hours to take the load off and mesh with the Istio community, vendors, and maintainers. Participation in the event is free of charge, and will only require participants to register in order to join.

Stay tuned to hear more about this conference, and we hope you can join us at the first IstioCon in 2021!

]]>
Tue, 08 Dec 2020 00:00:00 +0000/v1.24//blog/2020/istiocon-2021//v1.24//blog/2020/istiocon-2021/IstioConIstioconference
Handling Docker Hub rate limitingSince November 20th, 2020, Docker Hub has introduced rate limits on image pulls.

Because Istio uses Docker Hub as the default registry, usage on a large cluster may lead to pods failing to startup due to exceeding rate limits. This can be especially problematic for Istio, as there is typically the Istio sidecar image alongside most pods in the cluster.

Mitigations

Istio allows you to specify a custom Docker registry so that container images are fetched from your private registry instead of Docker Hub. This can be configured by passing --set hub=<some-custom-registry> at installation time.

Istio provides official mirrors on Google Container Registry. This can be configured with --set hub=gcr.io/istio-release. This is available for Istio 1.5+.
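
For example, a minimal sketch of both options with an istioctl-based installation (the private registry name below is a placeholder):

$ # Use the official mirror on Google Container Registry (Istio 1.5+)
$ istioctl install --set hub=gcr.io/istio-release
$ # Or point Istio at your own registry
$ istioctl install --set hub=my-registry.example.com/istio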

Alternatively, you can copy the official Istio images to your own registry. This is especially useful if your cluster runs in an environment with a registry tailored for your use case (for example, on AWS you may want to mirror images to Amazon ECR) or you have air gapped security requirements where access to public registries is restricted. This can be done with the following script:

$ SOURCE_HUB=istio
$ DEST_HUB=my-registry # Replace this with the destination hub
$ IMAGES=( install-cni operator pilot proxyv2 ) # Images to mirror.
$ VERSIONS=( 1.7.5 1.8.0 ) # Versions to copy
$ VARIANTS=( "" "-distroless" ) # Variants to copy
$ for image in "${IMAGES[@]}"; do
$   for version in "${VERSIONS[@]}"; do
$     for variant in "${VARIANTS[@]}"; do
$       name=$image:$version$variant
$       docker pull $SOURCE_HUB/$name
$       docker tag $SOURCE_HUB/$name $DEST_HUB/$name
$       docker push $DEST_HUB/$name
$       docker rmi $SOURCE_HUB/$name
$       docker rmi $DEST_HUB/$name
$     done
$   done
$ done
]]>
Mon, 07 Dec 2020 00:00:00 +0000/v1.24//blog/2020/docker-rate-limit//v1.24//blog/2020/docker-rate-limit/docker
Expanding into New Frontiers - Smart DNS Proxying in IstioDNS resolution is a vital component of any application infrastructure on Kubernetes. When your application code attempts to access another service in the Kubernetes cluster or even a service on the internet, it has to first lookup the IP address corresponding to the hostname of the service, before initiating a connection to the service. This name lookup process is often referred to as service discovery. In Kubernetes, the cluster DNS server, be it kube-dns or CoreDNS, resolves the service’s hostname to a unique non-routable virtual IP (VIP), if it is a service of type clusterIP. The kube-proxy on each node maps this VIP to a set of pods of the service, and forwards the traffic to one of them selected at random. When using a service mesh, the sidecar works similarly to the kube-proxy as far as traffic forwarding is concerned.

The following diagram depicts the role of DNS today:

Role of DNS in Istio, today

Problems posed by DNS

While the role of DNS within the service mesh may seem insignificant, it has consistently stood in the way of expanding the mesh to VMs and enabling seamless multicluster access.

VM access to Kubernetes services

Consider the case of a VM with a sidecar. As shown in the illustration below, applications on the VM look up the IP addresses of services inside the Kubernetes cluster as they typically have no access to the cluster’s DNS server.

DNS resolution issues on VMs accessing Kubernetes services

It is technically possible to use kube-dns as a name server on the VM, if you are willing to engage in convoluted workarounds involving dnsmasq and external exposure of kube-dns using NodePort services, assuming you manage to convince your cluster administrator to do so. Even so, you are opening the door to a host of security issues. At the end of the day, these are point solutions that are typically out of reach for teams without the organizational capability and domain expertise to build and maintain them.

External TCP services without VIPs

It is not just the VMs in the mesh that suffer from the DNS issue. For the sidecar to accurately distinguish traffic between two different TCP services that are outside the mesh, the services must be on different ports or they need to have a globally unique VIP, much like the clusterIP assigned to Kubernetes services. But what if there is no VIP? Cloud hosted services like hosted databases, typically do not have a VIP. Instead, the provider’s DNS server returns one of the instance IPs that can then be directly accessed by the application. For example, consider the two service entries below, pointing to two different AWS RDS services:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: db1
  namespace: ns1
spec:
  hosts:
  - mysql-instance1.us-east-1.rds.amazonaws.com
  ports:
  - name: mysql
    number: 3306
    protocol: TCP
  resolution: DNS
---
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: db2
  namespace: ns1
spec:
  hosts:
  - mysql-instance2.us-east-1.rds.amazonaws.com
  ports:
  - name: mysql
    number: 3306
    protocol: TCP
  resolution: DNS

The sidecar has a single listener on 0.0.0.0:3306 that looks up the IP address of mysql-instance1.us-east1.rds.amazonaws.com from public DNS servers and forwards traffic to it. It cannot route traffic to db2 as it has no way of distinguishing whether traffic arriving at 0.0.0.0:3306 is bound for db1 or db2. The only way to accomplish this is to set the resolution to NONE causing the sidecar to blindly forward any traffic on port 3306 to the original IP requested by the application. This is akin to punching a hole in the firewall allowing all traffic to port 3306 irrespective of the destination IP. To get traffic flowing, you are now forced to compromise on the security posture of your system.
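
For illustration, the NONE fallback for such hosts looks roughly like the sketch below (the resource name and wildcard host are hypothetical); this is the workaround being criticized, not a recommendation:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-dbs
  namespace: ns1
spec:
  hosts:
  - "*.us-east-1.rds.amazonaws.com"
  ports:
  - name: mysql
    number: 3306
    protocol: TCP
  # NONE makes the sidecar forward traffic on port 3306 to whatever IP the
  # application resolved, without being able to tell the databases apart.
  resolution: NONE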

Resolving DNS for services in remote clusters

The DNS limitations of a multicluster mesh are well known. Services in one cluster cannot lookup the IP addresses of services in other clusters, without clunky workarounds such as creating stub services in the caller namespace.

Taking control of DNS

All in all, DNS has been a thorny issue in Istio for a while. It was time to slay the beast. We (the Istio networking team) decided to tackle the problem once and for all in a way that is completely transparent to you, the end user. Our first attempt involved utilizing Envoy’s DNS proxy. It turned out to be very unreliable, and disappointing overall due to the general lack of sophistication in the c-ares DNS library used by Envoy. Determined to solve the problem, we decided to implement the DNS proxy in the Istio sidecar agent, written in Go. We were able to optimize the implementation to handle all the scenarios that we wanted to tackle without compromising on scale and stability. The Go DNS library we use is the same one used by scalable DNS implementations such as CoreDNS, Consul, Mesos, etc. It has been battle tested in production for scale and stability.

Starting with Istio 1.8, the Istio agent on the sidecar will ship with a caching DNS proxy, programmed dynamically by Istiod. Istiod pushes the hostname-to-IP-address mappings for all the services that the application may access based on the Kubernetes services and service entries in the cluster. DNS lookup queries from the application are transparently intercepted and served by the Istio agent in the pod or VM. If the query is for a service within the mesh, irrespective of the cluster that the service is in, the agent responds directly to the application. If not, it forwards the query to the upstream name servers defined in /etc/resolv.conf. The following diagram depicts the interactions that occur when an application tries to access a service using its hostname.

Smart DNS proxying in Istio sidecar agent

As you will see in the following sections, the DNS proxying feature has had an enormous impact across many aspects of Istio.

Reduced load on your DNS servers w/ faster resolution

The load on your cluster’s Kubernetes DNS server drops drastically, as almost all DNS queries are resolved within the pod by Istio. The bigger the footprint of the mesh on a cluster, the lower the load on your DNS servers. Implementing our own DNS proxy in the Istio agent has allowed us to implement cool optimizations such as CoreDNS auto-path without the correctness issues that CoreDNS currently faces.

To understand the impact of this optimization, let’s take a simple DNS lookup scenario, in a standard Kubernetes cluster without any custom DNS setup for pods - i.e., with the default setting of ndots:5 in /etc/resolv.conf. When your application starts a DNS lookup for productpage.ns1.svc.cluster.local, it appends the DNS search namespaces in /etc/resolv.conf (e.g., ns1.svc.cluster.local) as part of the DNS query, before querying the host as-is. As a result, the first DNS query that is actually sent out will look like productpage.ns1.svc.cluster.local.ns1.svc.cluster.local, which will inevitably fail DNS resolution when Istio is not involved. If your /etc/resolv.conf has 5 search namespaces, the application will send two DNS queries for each search namespace, one for the IPv4 A record and another for the IPv6 AAAA record, and then a final pair of queries with the exact hostname used in the code. Before establishing the connection, the application performs 12 DNS lookup queries for each host!
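
For reference, a pod’s /etc/resolv.conf in such a setup looks roughly like this (the name server IP and the five search domains are purely illustrative):

nameserver 10.96.0.10
search ns1.svc.cluster.local svc.cluster.local cluster.local us-east-1.compute.internal example.internal
options ndots:5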

With Istio’s implementation of the CoreDNS style auto-path technique, the sidecar agent will detect the real hostname being queried within the first query and return a CNAME record pointing to productpage.ns1.svc.cluster.local as part of this DNS response, as well as the A/AAAA record for productpage.ns1.svc.cluster.local. The application receiving this response can now extract the IP address immediately and proceed to establishing a TCP connection to that IP. The smart DNS proxy in the Istio agent dramatically cuts down the number of DNS queries from 12 to just 2!

VMs to Kubernetes integration

Since the Istio agent performs local DNS resolution for services within the mesh, DNS lookup queries for Kubernetes services from VMs will now succeed without requiring clunky workarounds for exposing kube-dns outside the cluster. The ability to seamlessly resolve internal services in a cluster will now simplify your monolith to microservice journey, as the monolith on VMs can now access microservices on Kubernetes without additional levels of indirection via API gateways.

Automatic VIP allocation where possible

You may ask, how does this DNS functionality in the agent solve the problem of distinguishing between multiple external TCP services without VIPs on the same port?

Taking inspiration from Kubernetes, Istio will now automatically allocate non-routable VIPs (from the Class E subnet) to such services as long as they do not use a wildcard host. The Istio agent on the sidecar will use the VIPs as responses to the DNS lookup queries from the application. Envoy can now clearly distinguish traffic bound for each external TCP service and forward it to the right target. With the introduction of the DNS proxying, you will no longer need to use resolution: NONE for non-wildcard TCP services, improving your overall security posture. Istio cannot help much with wildcard external services (e.g., *.us-east1.rds.amazonaws.com). You will have to resort to NONE resolution mode to handle such services.

Multicluster DNS lookup

For the adventurous lot, attempting to weave a multicluster mesh where applications directly call internal services of a namespace in a remote cluster, the DNS proxy functionality comes in quite handy. Your applications can resolve Kubernetes services on any cluster in any namespace, without the need to create stub Kubernetes services in every cluster.

The benefits of the DNS proxy extend beyond the multicluster models that are currently described in Istio today. At Tetrate, we use this mechanism extensively in our customers’ multicluster deployments to enable sidecars to resolve DNS for hosts exposed at ingress gateways of all the clusters in a mesh, and access them over mutual TLS.

Concluding thoughts

The problems caused by lack of control over DNS have often been overlooked and ignored in its entirety when it comes to weaving a mesh across many clusters, different environments, and integrating external services. The introduction of a caching DNS proxy in the Istio sidecar agent solves these issues. Exercising control over the application’s DNS resolution allows Istio to accurately identify the target service to which traffic is bound, and enhance the overall security, routing, and telemetry posture in Istio within and across clusters.

Smart DNS proxying is enabled in the preview profile in Istio 1.8. Please try it out!
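
If you would rather enable it on top of an existing installation than use the preview profile, the flag involved looks roughly like this in an IstioOperator overlay; treat this as a sketch and check the Istio 1.8 documentation for the authoritative setting:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      proxyMetadata:
        # Ask the Istio agent to capture and serve DNS queries from the application.
        ISTIO_META_DNS_CAPTURE: "true"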

]]>
Thu, 12 Nov 2020 00:00:00 +0000/v1.24//blog/2020/dns-proxy//v1.24//blog/2020/dns-proxy/dnssidecarmulticlustervmexternal services
2020 Steering Committee Election ResultsLast month, we announced a revision to our Steering Committee charter, opening up governance roles to more contributors and community members. The Steering Committee now consists of 9 proportionally-allocated Contribution Seats, and 4 elected Community Seats.

We have now concluded our inaugural election for the Community Seats, and we’re excited to welcome the following new members to the Committee:

They join Contribution Seat holders from Google, IBM/Red Hat and Salesforce. We now have representation from 7 organizations on Steering, reflecting the breadth of our contributor ecosystem.

Thank you to everyone who participated in the election process. The next election will be in July 2021.

]]>
Tue, 29 Sep 2020 00:00:00 +0000/v1.24//blog/2020/steering-election-results//v1.24//blog/2020/steering-election-results/istiosteeringgovernancecommunityelection
Large Scale Security Policy Performance TestsOverview

Istio has a wide range of security policies which can be easily configured into systems of services. As the number of applied policies increases, it is important to understand the relationship of latency, memory usage, and CPU usage of the system.

This blog post goes over common security policies use cases and how the number of security policies or the number of specific rules in a security policy can affect the overall latency of requests.

Setup

There are a wide range of security policies and many more combinations of those policies. We will go over 6 of the most commonly used test cases.

The following test cases are run in an environment which consists of a Fortio client sending requests to a Fortio server, with a baseline of no Envoy sidecars deployed. The following data was gathered by using the Istio performance benchmarking tool.

In these test cases, requests either do not match any rules or match only the very last rule in the security policies. This ensures that the RBAC filter evaluates all policy rules, and never matches a rule before viewing all the policies. Even though this is not necessarily what will happen in your own system, this setup provides data for the worst possible performance of each test case.

Test cases

  1. Mutual TLS STRICT vs plaintext.

  2. A single authorization policy with a variable number of principal rules as well as a PeerAuthentication policy. The principal rule is dependent on the PeerAuthentication policy being applied to the system (a sketch of such a policy pair appears after this list).

  3. A single authorization policy with a variable number of requestPrincipal rules as well as a RequestAuthentication policy. The requestPrincipal is dependent on the RequestAuthentication policy being applied to the system.

  4. A single authorization policy with a variable number of paths vs sourceIP rules.

  5. A variable number of authorization policies consisting of a single path or sourceIP rule.

  6. A single RequestAuthentication policy with variable number of JWTRules rules.
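
As an illustration of test case 2, the policy pair could look roughly like the sketch below; the namespace, workload labels, and principals are hypothetical and not the exact benchmark configuration:

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: test-ns
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: principals-policy
  namespace: test-ns
spec:
  selector:
    matchLabels:
      app: fortio-server
  action: ALLOW
  rules:
  - from:
    - source:
        principals:
        # The benchmark varies the number of entries in this list.
        - cluster.local/ns/test-ns/sa/client-1
        - cluster.local/ns/test-ns/sa/client-2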

Data

The y-axis of each graph is the latency in milliseconds, and the x-axis is the number of concurrent connections. The x-axis of each graph consists of 3 data points that represent a small load (qps=100, conn=8), medium load (qps=500, conn=32), and large load (qps=1000, conn=64).
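
For context, loads of this shape can be generated with Fortio invocations roughly of the form below (the target URL and duration are placeholders):

$ # large load: 1000 qps over 64 concurrent connections for 60 seconds
$ fortio load -qps 1000 -c 64 -t 60s http://fortio-server.test-ns:8080/echo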

The difference in latency between mutual TLS mode STRICT and plaintext is very small at lower loads. As the `qps` and `conn` increase, the latency of requests with mutual TLS STRICT increases. The additional latency at larger loads is minimal compared to the increase that comes from adding sidecars to the plaintext case in the first place.

Conclusion

  • In general, adding security policies does not add relatively high overhead to the system. The policies that add the most latency include:

    1. RequestAuthentication policy with JWTRules rules.

    2. Authorization policy with requestPrincipal rules.

    3. Authorization policy with principals rules.

  • In lower loads (requests with lower qps and conn) the difference in latency for most policies is minimal.

  • Envoy proxy sidecars increase latency more than most policies, even if the policies are large.

  • The latency increase from extremely large policies is comparable to the latency increase from adding Envoy proxy sidecars in the first place, relative to having no sidecars.

  • Two different tests determined that the sourceIP rule is marginally slower than a path rule.

If you are interested in creating your own large scale security policies and running performance tests with them, see the performance benchmarking tool README.

If you are interested in reading more about the security policies tests, see our design doc. If you don’t already have access, you can join the Istio team drive.

]]>
Tue, 15 Sep 2020 00:00:00 +0000/v1.24//blog/2020/large-scale-security-policy-performance-tests//v1.24//blog/2020/large-scale-security-policy-performance-tests/testsecurity policyperformance
Deploying Istio Control Planes Outside the MeshOverview

From experience working with various service mesh users and vendors, we believe there are 3 key personas for a typical service mesh:

  • Mesh Operator, who manages the service mesh control plane installation and upgrade.

  • Mesh Admin, often referred to as Platform Owner, who owns the service mesh platform and defines the overall strategy and implementation for service owners to adopt service mesh.

  • Mesh User, often referred to as Service Owner, who owns one or more services in the mesh.

Prior to version 1.7, Istio required the control plane to run in one of the primary clusters in the mesh, leading to a lack of separation between the mesh operator and the mesh admin. Istio 1.7 introduces a new external control plane deployment model which enables mesh operators to install and manage mesh control planes on separate external clusters. This deployment model allows a clear separation between mesh operators and mesh admins. Istio mesh operators can now run Istio control planes for mesh admins while mesh admins can still control the configuration of the control plane without worrying about installing or managing the control plane. This model is transparent to mesh users.

External control plane deployment model

After installing Istio using the default installation profile, you will have an Istiod control plane installed in a single cluster like the diagram below:

Istio mesh in a single cluster

With the new deployment model in Istio 1.7, it’s possible to run Istiod on an external cluster, separate from the mesh services as shown in the diagram below. The external control plane cluster is owned by the mesh operator while the mesh admin owns the cluster running services deployed in the mesh. The mesh admin has no access to the external control plane cluster. Mesh operators can follow the external istiod single cluster step by step guide to explore more on this. (Note: In some internal discussions among Istio maintainers, this model was previously referred to as “central istiod”.)

Single cluster Istio mesh with Istiod in an external control plane cluster

Mesh admins can expand the service mesh to multiple clusters, which are managed by the same Istiod running in the external cluster. None of the mesh clusters are primary clusters, in this case. They are all remote clusters. However, one of them also serves as the Istio configuration cluster, in addition to running services. The external control plane reads Istio configurations from the config cluster and Istiod pushes configuration to the data plane running in both the config cluster and other remote clusters as shown in the diagram below.

Multicluster Istio mesh with Istiod in an external control plane cluster

Mesh operators can further expand this deployment model to manage multiple Istio control planes from an external cluster running multiple Istiod control planes:

Multiple single clusters with multiple Istiod control planes in an external control plane cluster

In this case, each Istiod manages its own remote cluster(s). Mesh operators can even install their own Istio mesh in the external control plane cluster and configure its istio-ingress gateway to route traffic from remote clusters to their corresponding Istiod control planes. To learn more about this, check out these steps.

Conclusion

The external control plane deployment model enables the Istio control plane to be run and managed by mesh operators who have operational expertise in Istio, and provides a clean separation between service mesh control and data planes. Mesh operators can run the control plane in their own clusters or other environments, providing the control plane as a service to mesh admins. Mesh operators can run multiple Istiod control planes in a single cluster, deploying their own Istio mesh and using istio-ingress gateways to control access to these Istiod control planes. Through the examples provided here, mesh operators can explore different implementation choices and choose what works best for them.

This new model reduces complexity for mesh admins by allowing them to focus on mesh configurations without operating the control plane themselves. Mesh admins can continue to configure mesh-wide settings and Istio resources without any access to external control plane clusters. Mesh users can continue to interact with the service mesh without any changes.

]]>
Thu, 27 Aug 2020 00:00:00 +0000/v1.24//blog/2020/new-deployment-model//v1.24//blog/2020/new-deployment-model/istioddeployment modelinstalldeploy1.7
Introducing the new Istio steering committeeToday, the Istio project is pleased to announce a new revision to its steering charter, which opens up governance roles to more contributors and community members. This revision solidifies our commitment to open governance, ensuring that the community around the project will always be able to steer its direction, and that no one company has majority voting control over the project.

The Istio Steering Committee oversees the administrative aspects of the project and sets the marketing direction. From the earliest days of the project, it was bootstrapped with members from Google and IBM, the two founders and largest contributors, with the explicit intention that other seats would be added. We are very happy to deliver on that promise today, with a new charter designed to reward contribution and community.

The new Steering Committee consists of 13 seats: 9 proportionally allocated Contribution Seats, and 4 elected Community Seats.

Contribution Seats

The direction of a project is set by the people who contribute to it. We’ve designed our committee to reflect that, with 9 seats to be attributed in proportion to contributions made to Istio in the previous 12 months. In Kubernetes, the mantra was “chop wood, carry water,” and we similarly want to reward companies who are fueling the growth of the project with contributions.

This year, we’ve chosen to use merged pull requests as our proxy for proportional contribution. We know that no measure of contribution is perfect, and as such we will explicitly reconsider the formula every year. (Other measures we considered, including commits, comments, and actions, gave the same results for this period.)

In order to ensure corporate diversity, there will always be a minimum of three companies represented in Contribution Seats.

Community Seats

There are many wonderful contributors to the Istio community, including developers, SREs and mesh admins, working for companies large and small. We wanted to ensure that their voices were included, both in terms of representation and selection.

We have added 4 seats for representatives from 4 different organizations, who are not represented in the Contribution Seat allocation. These seats will be voted on by the Istio community in an annual election.

Any project member can stand for election; all Istio members who have been active in the last 12 months are eligible to vote.

Corporate diversification is the goal

Our goal is that the governance of Istio reflects the diverse set of contributors. Both Google and IBM/Red Hat will have fewer seats than previously, and the new model is designed to ensure representation from at least 7 different organizations.

We also want to make it clear that no single vendor, no matter how large their contribution, has majority voting control over the Istio project. We’ve implemented a cap on the number of seats a company can hold, such that they can neither unilaterally win a vote nor veto a decision of the rest of the committee.

The 2020 committee and election

According to our seat allocation process, this year Google will be allocated 5 seats and IBM/Red Hat will be allocated 3. As the third largest contributor to Istio in the last 12 months, we are pleased to announce that Salesforce has earned a Contribution Seat.

The first election for Community Seats begins today. Members have two weeks to nominate themselves, and voting will run from 14 to 27 September. You can learn all about the election in the istio/community repository on GitHub. We’re also hosting a special community meeting this Thursday at 10:00 Pacific to discuss the changes and the election process. We’d love to see you there!

]]>
Mon, 24 Aug 2020 00:00:00 +0000/v1.24//blog/2020/steering-changes//v1.24//blog/2020/steering-changes/istiosteeringgovernancecommunityelection
Using MOSN with Istio: an alternative data planeMOSN (Modular Open Smart Network) is a network proxy server written in Go. It was built at Ant Group as a sidecar/API Gateway/cloud-native Ingress/Layer 4 or Layer 7 load balancer etc. Over time, we’ve added extra features, like a multi-protocol framework, multi-process plug-in mechanism, a DSL, and support for the xDS APIs. Supporting xDS means we are now able to use MOSN as the network proxy for Istio. This configuration is not supported by the Istio project; for help, please see Learn More below.

Background

In the service mesh world, using Istio as the control plane has become the mainstream. Because Istio was built on Envoy, it uses Envoy’s data plane APIs (collectively known as the xDS APIs). These APIs have been standardized separately from Envoy, and so by implementing them in MOSN, we are able to drop in MOSN as a replacement for Envoy. Istio’s integration of third-party data planes can be implemented in three steps, as follows.

  • Implement xDS protocols to fulfill the capabilities for data plane related services.
  • Build proxyv2 images using Istio’s script and set the relevant SIDECAR and other parameters.
  • Specify a specific data plane via the istioctl tool and set the proxy-related configuration.

Architecture

MOSN has a layered architecture with four layers, NET/IO, Protocol, Stream, and Proxy, as shown in the following figure.

The architecture of MOSN
  • NET/IO acts as the network layer, monitoring connections and incoming packets, and as a mount point for the listener filter and network filter.
  • Protocol is the multi-protocol engine layer that examines packets and uses the corresponding protocol for decode/encode processing.
  • Stream does a secondary encapsulation of the decode packet into stream, which acts as a mount for the stream filter.
  • Proxy acts as a forwarding framework for MOSN, and does proxy processing on the encapsulated streams.

Why use MOSN?

Before starting our service mesh transformation, we expected that, as the next generation of Ant Group’s infrastructure, the service mesh would inevitably bring revolutionary changes as well as evolution costs. We had a very ambitious blueprint: to consolidate and refine the capabilities of our existing network and middleware layers into a low-level platform for the next-generation architecture, one that would carry the responsibility for all service-to-service communication.

This is a long-term project that takes years to build and must meet our needs for the next five or even ten years, built by a team that spans business, SRE, middleware, and infrastructure departments. We needed a network proxy forwarding plane that offers flexible extension, high performance, and room for long-term evolution. Nginx and Envoy have accumulated capabilities over many years and have active communities in the network proxy space, and we borrowed from these and other excellent open source proxies. At the same time, we wanted to improve research and development efficiency and extensibility. Because the mesh transformation involves many departments and engineers, we also had to consider the cost of cross-team collaboration. We therefore developed MOSN, a new network proxy written in Go for cloud-native scenarios. We thoroughly investigated and tested Go’s performance early on to confirm it could meet the performance requirements of Ant Group’s services.

We also received a lot of feedback and requirements from the end user community, and many of them mirrored our own. So we combined the community’s situation with ours and developed MOSN with the goal of serving both the community and our users. We believe that competition in open source is mainly competition between standards and specifications, so we aim to make the most suitable implementation choices based on open standards.

What is the difference between MOSN and Istio’s default proxy?

Differences in language stacks

MOSN is written in Go. Go has strong guarantees in terms of production efficiency and memory security. At the same time, Go has an extensive library ecosystem in the cloud-native era. The performance is acceptable and usable in the service mesh scenario. Therefore, MOSN has a lower intellectual cost for companies and individuals using languages such as Go and Java.

Differentiation of core competence

  • MOSN supports a multi-protocol framework, and users can easily access private protocols with a unified routing framework.
  • A multi-process plug-in mechanism, which makes it easy to extend MOSN with plug-ins that run as independent processes, for management, bypass, and other functional modules.
  • Support for Chinese national cryptography (SM) algorithms at the transport layer, to meet Chinese encryption compliance requirements, etc.

What are the drawbacks of MOSN

  • Because MOSN is written in Go, its performance is not quite as good as that of the Istio default proxy, but it is acceptable and usable in the service mesh scenario.
  • Compared with the Istio default proxy, some features are not yet fully supported, such as WASM, HTTP3, Lua, etc. However, these are all on the MOSN roadmap, and the goal is to be fully compatible with Istio.

MOSN with Istio

The following describes how to set up MOSN as the data plane for Istio.

Setup Istio

You can download a zip file for your operating system from the Istio release page. This file contains the installation files, examples, and the istioctl command line tool. To download Istio (this example uses Istio 1.5.2), use the following command.

$ export ISTIO_VERSION=1.5.2
$ curl -L https://istio.io/downloadIstio | sh -

The downloaded Istio package is named istio-1.5.2 and contains:

  • install/kubernetes: Contains YAML installation files related to Kubernetes.
  • examples/: Contains example applications.
  • bin/: Contains the istioctl client files.

Switch to the folder where Istio is located.

$ cd istio-$ISTIO_VERSION/

Add the istioctl client path to $PATH with the following command.

$ export PATH=$PATH:$(pwd)/bin

Setting MOSN as the Data Plane

It is possible to flexibly customize the Istio control plane and data plane configuration parameters using the istioctl command line tool. MOSN can be specified as the data plane for Istio using the following command.

$ istioctl manifest apply --set .values.global.proxy.image="mosnio/proxyv2:1.5.2-mosn" --set meshConfig.defaultConfig.binaryPath="/usr/local/bin/mosn"

Check that the Istio-related services and pods are deployed successfully.

$ kubectl get svc -n istio-system
$ kubectl get pods -n istio-system

If the pods are in the Running state, then Istio has been successfully installed using MOSN and you can now deploy the Bookinfo sample.

Bookinfo Examples

You can run the Bookinfo sample by following the MOSN with Istio tutorial where you can find instructions for using MOSN and Istio. You can install MOSN and get to the same point you would have using the default Istio instructions with Envoy.
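
For reference, the Bookinfo deployment itself uses the standard manifests shipped in the release package; a rough sketch, with paths relative to the istio-1.5.2 directory and assuming sidecar injection is enabled for the default namespace:

$ kubectl label namespace default istio-injection=enabled
$ kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml
$ kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml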

Moving forward

Next, MOSN will not only be compatible with the features of the latest version of Istio, but also evolve in the following aspects.

  • As a microservices runtime, MOSN-oriented programming makes services lighter, smaller, and faster.
  • Programmable, support WASM.
  • More scenario support, Cache Mesh/Message Mesh/Block-chain Mesh etc.

MOSN is an open source project that anyone in the community can use, improve, and enjoy. We’d love you to join us! Here are a few ways to find out what’s happening and get involved.

Learn More

]]>
Tue, 28 Jul 2020 00:00:00 +0000/v1.24//blog/2020/mosn-proxy//v1.24//blog/2020/mosn-proxy/mosnsidecarproxy
Open and neutral: transferring our trademarks to the Open Usage CommonsSince day one, the Istio project has believed in the importance of being contributor-run, open, transparent and available to all. In that spirit, Google is pleased to announce that it will be transferring ownership of the project’s trademarks to the new Open Usage Commons.

Istio is an open source project, released under the Apache 2.0 license. That means people can copy, modify, distribute, make, use and sell the source code. The only freedom people don’t have under the Apache 2.0 license is to use the name Istio, or its logo, in a way that would confuse consumers.

As one of the founders of the project, Google is the current owner of the Istio trademark. While anyone who is using the software in accordance with the license can use the trademarks, the historic ownership has caused some confusion and uncertainty about who can use the name and how, and at times this confusion has been a barrier to community growth. So today, as part of Istio’s continued commitment to openness, Google is announcing that the Istio trademarks will be transferred to a new organization, the Open Usage Commons, to provide neutral, independent oversight of the marks.

A neutral home for Istio’s trademarks

The Open Usage Commons is a new organization that is focused solely on providing management and guidance of open source project trademarks in a way that is aligned with the Open Source Definition. For projects, particularly projects with robust ecosystems like Istio, ensuring that the trademark is available to anyone who is using the software in accordance with the license is important. The trademark allows maintainers to grow a community and use the name to do so. It also lets ecosystem partners create services on top of the project, and it enables developers to create tooling and integrations that reference the project. Maintainers, ecosystem partners, and developers alike must feel confident in their investments in Istio - for the long term. Google thinks having the Istio trademarks in the Open Usage Commons is the right way to give that clarity and provide that confidence.

The Open Usage Commons will work with the Istio Steering Committee to generate trademark usage guidelines. There will be no immediate changes to the Istio usage guidelines, and if you are currently using the Istio marks in a way that follows the existing brand guide, you can continue to do so.

You can learn more about open source project IP and the Open Usage Commons at openusage.org.

A continued commitment to open

The Open Usage Commons is focused on project trademarks; it does not address other facets of an open project, like rules around who gets decision-making votes. Similar to many projects in their early days, Istio’s committees started as small groups that stemmed from the founding companies. But Istio has grown and matured (last year Istio was #4 on GitHub’s list of fastest growing open source projects!), and it is time for the next evolution of Istio’s governance.

Recently, we were proud to appoint Neeraj Poddar, Co-founder & Chief Architect of Aspen Mesh, to the Technical Oversight Committee — the group responsible for all technical decision-making in the project. Neeraj is a long-time contributor to the project and served as a Working Group lead. The TOC is now made up of 7 members from 4 different companies - Tetrate, IBM, Google & now Aspen Mesh.

Our community is currently discussing how the Steering Committee, which oversees marketing and community activities, should be governed, to reflect the expanding community and ecosystem. If you have ideas for this new governance, visit the pull request on GitHub where an active discussion is taking place.

In the last 12 months, Istio has had commits from more than 100 organizations and currently has 70 maintainers from 14 different companies. This trend is the kind of contributor diversity the project’s founders intended, and nurturing it remains a priority. Google is excited about what the future holds for Istio, and hopes you’ll be a part of it.

]]>
Wed, 08 Jul 2020 00:00:00 +0000/v1.24//blog/2020/open-usage//v1.24//blog/2020/open-usage/trademarkgovernancesteering
Reworking our Addon IntegrationsStarting with Istio 1.6, we are introducing a new method for integration with telemetry addons, such as Grafana, Prometheus, Zipkin, Jaeger, and Kiali.

In previous releases, these addons were bundled as part of the Istio installation. This allowed users to quickly get started with Istio without any complicated configurations to install and integrate these addons. However, it came with some issues:

  • The Istio addon installations were not as up to date or feature rich as upstream installation methods. Users were left missing out on some of the great features provided by these applications, such as:
    • Persistent storage
    • Features like Alertmanager for Prometheus
    • Advanced security settings
  • Integration with existing deployments that were using these features was more challenging than it should be.

Changes

In order to address these gaps, we have made a number of changes:

  • Added a new Integrations documentation section to explain which applications Istio can integrate with, how to use them, and best practices.

  • Reduced the amount of configuration required to set up telemetry addons

  • Removed the bundled addon installations from istioctl and the operator. Istio does not install components that are not delivered by the Istio project. As a result, Istio will stop shipping installation artifacts related to addons. However, Istio will guarantee version compatibility where necessary. It is the user’s responsibility to install these components by using the official Integrations documentation and artifacts provided by the respective projects. For demos, users can deploy simple YAML files from the samples/addons/ directory.
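
For example, a demo-grade Prometheus and Grafana can be deployed from the release package roughly as follows (the file names assume the samples/addons/ layout of the release archive):

$ kubectl apply -f samples/addons/prometheus.yaml
$ kubectl apply -f samples/addons/grafana.yaml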

We hope these changes allow users to make the most of these addons so as to fully experience what Istio can offer.

Timeline

  • Istio 1.6: The new demo deployments for telemetry addons are available under samples/addons/ directory.
  • Istio 1.7: Upstream installation methods or the new samples deployment are the recommended installation methods. Installation by istioctl is deprecated.
  • Istio 1.8: Installation of addons by istioctl is removed.
]]>
Thu, 04 Jun 2020 00:00:00 +0000/v1.24//blog/2020/addon-rework//v1.24//blog/2020/addon-rework/telemetryaddonsintegrationsgrafanaprometheus
Introducing Workload EntriesIntroducing Workload Entries: Bridging Kubernetes and VMs

Historically, Istio has provided a great experience to workloads that run on Kubernetes, but it has been less smooth for other types of workloads, such as Virtual Machines (VMs) and bare metal. The gaps included, to name a few, the inability to declaratively specify the properties of a sidecar on a VM, the inability to properly respond to the lifecycle changes of the workload (e.g., booting, transitioning from not ready to ready, health checks), and cumbersome DNS workarounds as the workloads are migrated into Kubernetes.

Istio 1.6 has introduced a few changes in how you manage non-Kubernetes workloads, driven by a desire to make it easier to gain Istio’s benefits for use cases beyond containers, such as running traditional databases on a platform outside of Kubernetes, or adopting Istio’s features for existing applications without rewriting them.

Background

Prior to Istio 1.6, non-containerized workloads were configurable simply as an IP address in a ServiceEntry, which meant that they only existed as part of a service. Istio lacked a first-class abstraction for these non-containerized workloads, something similar to how Kubernetes treats Pods as the fundamental unit of compute - a named object that serves as the collection point for all things related to a workload - name, labels, security properties, lifecycle status events, etc. Enter WorkloadEntry.

Consider the following ServiceEntry describing a service implemented by a few tens of VMs with IP addresses:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: svc1
spec:
  hosts:
  - svc1.internal.com
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: STATIC
  endpoints:
  - address: 1.1.1.1
  - address: 2.2.2.2
  ....

If you wanted to migrate this service into Kubernetes in an active-active manner - i.e. launch a bunch of Pods, send a portion of the traffic to the Pods over Istio mutual TLS (mTLS) and send the rest to the VMs without sidecars - how would you do it? You would have needed to use a combination of a Kubernetes service, a virtual service, and a destination rule to achieve the behavior. Now, let’s say you decided to add sidecars to these VMs, one by one, such that you want only the traffic to the VMs with sidecars to use Istio mTLS. If any other Service Entry happens to include the same VM in its addresses, things start to get very complicated and error prone.
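
For illustration only, here is a rough sketch of what that pre-1.6 combination might have looked like; the weights, namespaces, and names are hypothetical, and the exact configuration would depend on your migration plan:

apiVersion: v1
kind: Service
metadata:
  name: svc1
  namespace: ns1
spec:
  selector:
    app: svc1
  ports:
  - name: http
    port: 80
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: svc1-split
  namespace: ns1
spec:
  hosts:
  - svc1.internal.com
  http:
  - route:
    # Kubernetes Pods, reached over Istio mutual TLS
    - destination:
        host: svc1.ns1.svc.cluster.local
      weight: 20
    # VM endpoints from the service entry, reached as plaintext
    - destination:
        host: svc1.internal.com
      weight: 80
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: svc1-pods
  namespace: ns1
spec:
  host: svc1.ns1.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: svc1-vms
  namespace: ns1
spec:
  host: svc1.internal.com
  trafficPolicy:
    tls:
      mode: DISABLE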

The primary source of these complications is that Istio lacked a first-class definition of a non-containerized workload, whose properties can be described independently of the service(s) it is part of.

The internals of service entries pointing to workload entries

Workload Entry: A Non-Kubernetes Endpoint

WorkloadEntry was created specifically to solve this problem. WorkloadEntry allows you to describe non-Pod endpoints that should still be part of the mesh, and treat them the same as a Pod. From here everything becomes easier, like enabling MUTUAL_TLS between workloads, whether they are containerized or not.

To create a WorkloadEntry and attach it to a ServiceEntry you can do something like this:

---
apiVersion: networking.istio.io/v1alpha3
kind: WorkloadEntry
metadata:
  name: vm1
  namespace: ns1
spec:
  address: 1.1.1.1
  labels:
    app: foo
    instance-id: vm-78ad2
    class: vm
---
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: svc1
  namespace: ns1
spec:
  hosts:
  - svc1.internal.com
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: STATIC
  workloadSelector:
    labels:
      app: foo

This creates a new WorkloadEntry with a set of labels and an address, and a ServiceEntry that uses a WorkloadSelector to select all endpoints with the desired labels, in this case including the WorkloadEntry created for the VM.

The internals of service entries pointing to workload entries

Notice that the ServiceEntry can reference both Pods and WorkloadEntries, using the same selector. VMs and Pods can now be treated identically by Istio, rather than being kept separate.

If you were to migrate some of your workloads to Kubernetes, and you choose to keep a substantial number of your VMs, the WorkloadSelector can select both Pods and VMs, and Istio will automatically load balance between them. The 1.6 changes also mean that WorkloadSelector syncs configuration between the Pods and VMs, and removes the manual requirement to target both infrastructures with duplicate policies like mTLS and authorization.

The Istio 1.6 release provides a great starting point for what will be possible in the future of Istio. The ability to describe what exists outside of the mesh the same way you describe a Pod leads to added benefits like an improved bootstrapping experience, but these benefits are merely side effects. The core benefit is that you can now have VMs and Pods co-exist without any configuration needed to bridge the two together.

]]>
Thu, 21 May 2020 00:00:00 +0000/v1.24//blog/2020/workload-entry//v1.24//blog/2020/workload-entry/vmworkloadentrymigration1.6baremetalserviceentrydiscovery
Safely Upgrade Istio using a Canary Control Plane DeploymentCanary deployments are a core feature of Istio. Users rely on Istio’s traffic management features to safely control the rollout of new versions of their applications, while making use of Istio’s rich telemetry to compare the performance of canaries. However, when it came to upgrading Istio, there was not an easy way to canary the upgrade, and due to the in-place nature of the upgrade, issues or changes found affect the entire mesh at once.

Istio 1.6 will support a new upgrade model to safely canary-deploy new versions of Istio. In this new model, proxies will associate with a specific control plane that they use. This allows a new version to deploy to the cluster with less risk - no proxies connect to the new version until the user explicitly chooses to. This allows gradually migrating workloads to the new control plane, while monitoring changes using Istio telemetry to investigate any issues, just like using VirtualService for workloads. Each independent control plane is referred to as a “revision” and has an istio.io/rev label.

Understanding upgrades

Upgrading Istio is a complicated process. During the transition period between two versions, which might take a long time for large clusters, there are version differences between the proxies and the control plane. In the old model, the old and new control planes use the same Service, so traffic is randomly distributed between the two, offering no control to the user. In the new model, however, there is no cross-version communication. Let’s look at how the upgrade changes:

Configuring

Control plane selection is done based on the sidecar injection webhook. Each control plane is configured to select objects with a matching istio.io/rev label on the namespace. Then, the upgrade process configures the pods to connect to a control plane specific to that revision. Unlike in the current model, this means that a given proxy connects to the same revision during its lifetime. This avoids subtle issues that might arise when a proxy switches which control plane it is connected to.

The new istio.io/rev label will replace the istio-injection=enabled label when using revisions. For example, if we had a revision named canary, we would label our namespaces that we want to use this revision with istio.io/rev=canary. See the upgrade guide for more information.
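
As a minimal sketch, deploying and adopting a canary revision might look like the following; the namespace and revision names are examples, and the upgrade guide remains the authoritative reference:

$ # Install a second control plane under the revision name "canary"
$ istioctl install --set revision=canary
$ # Switch a namespace from the default injection label to the canary revision
$ kubectl label namespace test-ns istio-injection- istio.io/rev=canary
$ # Restart workloads so their new sidecars connect to the canary control plane
$ kubectl rollout restart deployment -n test-ns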

]]>
Tue, 19 May 2020 00:00:00 +0000/v1.24//blog/2020/multiple-control-planes//v1.24//blog/2020/multiple-control-planes/installupgraderevisioncontrol plane
Direct encrypted traffic from IBM Cloud Kubernetes Service Ingress to Istio Ingress GatewayIn this blog post I show how to configure the Ingress Application Load Balancer (ALB) on IBM Cloud Kubernetes Service (IKS) to direct traffic to the Istio ingress gateway, while securing the traffic between them using mutual TLS authentication.

When you use IKS without Istio, you may control your ingress traffic using the provided ALB. This ingress-traffic routing is configured using a Kubernetes Ingress resource with ALB-specific annotations. IKS provides a DNS domain name, a TLS certificate that matches the domain, and a private key for the certificate. IKS stores the certificates and the private key in a Kubernetes secret.

When you start using Istio in your IKS cluster, the recommended method to send traffic to your Istio enabled workloads is by using the Istio Ingress Gateway instead of using the Kubernetes Ingress. One of the main reasons to use the Istio ingress gateway is the fact that the ALB provided by IKS will not be able to communicate directly with the services inside the mesh when you enable STRICT mutual TLS. During your transition to having only the Istio ingress gateway as your main entry point, you can continue to use the traditional Ingress for non-Istio services while using the Istio ingress gateway for services that are part of the mesh.

IKS provides a convenient way for clients to access Istio ingress gateway by letting you register a new DNS subdomain for the Istio gateway’s IP with an IKS command. The domain is in the following format: <cluster_name>-<globally_unique_account_HASH>-0001.<region>.containers.appdomain.cloud, for example mycluster-a1b2cdef345678g9hi012j3kl4567890-0001.us-south.containers.appdomain.cloud. In the same way as for the ALB domain, IKS provides a certificate and a private key, storing them in another Kubernetes secret.

This blog describes how you can chain together the IKS Ingress ALB and the Istio ingress gateway to send traffic to your Istio enabled workloads while being able to continue using the ALB specific features and the ALB subdomain name. You configure the IKS Ingress ALB to direct traffic to the services inside an Istio service mesh through the Istio ingress gateway, while using mutual TLS authentication between the ALB and the gateway. For the mutual TLS authentication, you will configure the ALB and the Istio ingress gateway to use the certificates and keys provided by IKS for the ALB and NLB subdomains. Using certificates provided by IKS saves you the overhead of managing your own certificates for the connection between the ALB and the Istio ingress gateway.

You will use the NLB subdomain certificate as the server certificate for the Istio ingress gateway as intended. The NLB subdomain certificate represents the identity of the server that serves a particular NLB subdomain, in this case, the ingress gateway.

You will use the ALB subdomain certificate as the client certificate in mutual TLS authentication between the ALB and the Istio ingress gateway. When the ALB acts as a server, it presents the ALB certificate to clients so that they can authenticate the ALB. When the ALB acts as a client of the Istio ingress gateway, it presents the same certificate to the Istio ingress gateway, so that the Istio ingress gateway can authenticate the ALB.

Traffic to the services without an Istio sidecar can continue to flow as before directly from the ALB.

The diagram below exemplifies the described setting. It shows two services in the cluster, service A and service B. service A has an Istio sidecar injected and requires mutual TLS. service B has no Istio sidecar. service B can be accessed by clients through the ALB, which directly communicates with service B. service A can be also accessed by clients through the ALB, but in this case the traffic must pass through the Istio ingress gateway. Mutual TLS authentication between the ALB and the gateway is based on the certificates provided by IKS. The clients can also access the Istio ingress gateway directly. IKS registers different DNS domains for the ALB and for the ingress gateway.

A cluster with the ALB and the Istio ingress gateway

Initial setting

  1. Create the httptools namespace and enable Istio sidecar injection:

    $ kubectl create namespace httptools
    $ kubectl label namespace httptools istio-injection=enabled
    namespace/httptools created
    namespace/httptools labeled
  2. Deploy the httpbin sample to httptools:

    $ kubectl apply -f samples/httpbin/httpbin.yaml -n httptools
    service/httpbin created
    deployment.apps/httpbin created

Create secrets for the ALB and the Istio ingress gateway

IKS generates a TLS certificate and a private key and stores them as a secret in the default namespace when you register a DNS domain for an external IP by using the ibmcloud ks nlb-dns-create command. IKS stores the ALB’s certificate and private key also as a secret in the default namespace. You need these credentials to establish the identities that the ALB and the Istio ingress gateway will present during the mutual TLS authentication between them. You will configure the ALB and the Istio ingress gateway to exchange these certificates, to trust the certificates of one another, and to use their private keys to encrypt and sign the traffic.

  1. Store the name of your cluster in the CLUSTER_NAME environment variable:

    $ export CLUSTER_NAME=<your cluster name>
  2. Store the domain name and the secret name of your ALB in the ALB_INGRESS_DOMAIN and ALB_SECRET environment variables:

    $ ibmcloud ks cluster get --cluster $CLUSTER_NAME | grep Ingress
    Ingress Subdomain:              <your ALB ingress domain>
    Ingress Secret:                 <your ALB secret>
    $ export ALB_INGRESS_DOMAIN=<your ALB ingress domain>
    $ export ALB_SECRET=<your ALB secret>
  3. Store the external IP of your istio-ingressgateway service in an environment variable.

    $ export INGRESS_GATEWAY_IP=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ echo INGRESS_GATEWAY_IP = $INGRESS_GATEWAY_IP
  4. Create a DNS domain and certificates for the IP of the Istio Ingress Gateway service:

    $ ibmcloud ks nlb-dns create classic --cluster $CLUSTER_NAME --ip $INGRESS_GATEWAY_IP --secret-namespace istio-system
    Host name subdomain is created as <some domain>
  5. Store the domain name from the previous command in an environment variable:

    $ export INGRESS_GATEWAY_DOMAIN=<the domain from the previous command>
  6. List the registered domain names:

    $ ibmcloud ks nlb-dnss --cluster $CLUSTER_NAME
    Retrieving host names, certificates, IPs, and health check monitors for network load balancer (NLB) pods in cluster <your cluster>...
    OK
    Hostname                          IP(s)                       Health Monitor   SSL Cert Status   SSL Cert Secret Name                          Secret Namespace
    <your ingress gateway hostname>   <your ingress gateway IP>   None             created           <the matching secret name>           istio-system
    ...

    Wait until the SSL Cert Status (the fourth field) of the new domain name becomes created (initially it is pending).

  7. Store the name of the secret of the new domain name:

    $ export INGRESS_GATEWAY_SECRET=<the secret's name as shown in the SSL Cert Secret Name column>
  8. Extract the certificate and the key from the secret provided for the ALB:

    $ mkdir alb_certs
    $ kubectl get secret $ALB_SECRET --namespace=default -o yaml | grep 'tls.key:' | cut -f2 -d: | base64 --decode > alb_certs/client.key
    $ kubectl get secret $ALB_SECRET --namespace=default -o yaml | grep 'tls.crt:' | cut -f2 -d: | base64 --decode > alb_certs/client.crt
    $ ls -al alb_certs
    -rw-r--r--   1 user  staff  3738 Sep 11 07:57 client.crt
    -rw-r--r--   1 user  staff  1675 Sep 11 07:57 client.key
  9. Download the issuer certificate of the Let’s Encrypt certificate, which is the issuer of the certificates provided by IKS. You specify this certificate as the certificate of a certificate authority to trust, for both the ALB and the Istio ingress gateway.

    $ curl https://letsencrypt.org/certs/trustid-x3-root.pem --output trusted.crt
  10. Create a Kubernetes secret to be used by the ALB to establish the mutual TLS connection:

    $ kubectl create secret generic alb-certs -n istio-system --from-file=trusted.crt --from-file=alb_certs/client.crt --from-file=alb_certs/client.key
    secret "alb-certs" created
  11. For mutual TLS, a separate Secret named <tls-cert-secret>-cacert with a cacert key is needed for the ingress gateway.

    $ kubectl create -n istio-system secret generic $INGRESS_GATEWAY_SECRET-cacert --from-file=ca.crt=trusted.crt
    secret/cluster_name-hash-XXXX-cacert created

Configure a mutual TLS ingress gateway

In this section you configure the Istio ingress gateway to perform mutual TLS between external clients and the gateway. You use the certificates and the keys provided to you for the ingress gateway and the ALB.

  1. Define a Gateway to allow access on port 443 only, with mutual TLS:

    $ kubectl apply -n httptools -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: default-ingress-gateway
    spec:
      selector:
        istio: ingressgateway # use istio default ingress gateway
      servers:
      - port:
          number: 443
          name: https
          protocol: HTTPS
        tls:
          mode: MUTUAL
          credentialName: $INGRESS_GATEWAY_SECRET
        hosts:
        - "$INGRESS_GATEWAY_DOMAIN"
        - "httpbin.$ALB_INGRESS_DOMAIN"
    EOF
  2. Configure routes for traffic entering via the Gateway:

    $ kubectl apply -n httptools -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: default-ingress
    spec:
      hosts:
      - "$INGRESS_GATEWAY_DOMAIN"
      - "httpbin.$ALB_INGRESS_DOMAIN"
      gateways:
      - default-ingress-gateway
      http:
      - match:
        - uri:
            prefix: /status
        route:
        - destination:
            port:
              number: 8000
            host: httpbin.httptools.svc.cluster.local
    EOF
  3. Send a request to httpbin with curl, passing the client certificate (the --cert option) and the private key (the --key option):

    $ curl https://$INGRESS_GATEWAY_DOMAIN/status/418 --cert alb_certs/client.crt  --key alb_certs/client.key
    
    -=[ teapot ]=-
    
       _...._
     .'  _ _ `.
    | ."` ^ `". _,
    \_;`"---"`|//
      |       ;/
      \_     _/
        `"""`
  4. Remove the directory with the ALB certificates and keys, and the downloaded trusted certificate:

    $ rm -r alb_certs trusted.crt

Configure the ALB

You need to configure your Ingress resource to direct traffic to the Istio ingress gateway while using the certificate stored in the alb-certs secret. Normally, the ALB decrypts HTTPS requests before forwarding traffic to your apps. You can configure the ALB to re-encrypt the traffic before it is forwarded to the Istio ingress gateway by using the ssl-services annotation on the Ingress resource. This annotation also allows you to specify the certificate stored in the alb-certs secret, required for mutual TLS.

  1. Configure the Ingress resource for the ALB. You must create the Ingress resource in the istio-system namespace in order to forward the traffic to the Istio ingress gateway.

    $ kubectl apply -f - <<EOF
    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: alb-ingress
      namespace: istio-system
      annotations:
        ingress.bluemix.net/ssl-services: "ssl-service=istio-ingressgateway ssl-secret=alb-certs proxy-ssl-name=$INGRESS_GATEWAY_DOMAIN"
    spec:
      tls:
      - hosts:
        - httpbin.$ALB_INGRESS_DOMAIN
        secretName: $ALB_SECRET
      rules:
      - host: httpbin.$ALB_INGRESS_DOMAIN
        http:
          paths:
          - path: /status
            backend:
              serviceName: istio-ingressgateway
              servicePort: 443
    EOF
  2. Test the ALB ingress:

    $ curl https://httpbin.$ALB_INGRESS_DOMAIN/status/418
    
    -=[ teapot ]=-
    
       _...._
     .'  _ _ `.
    | ."` ^ `". _,
    \_;`"---"`|//
      |       ;/
      \_     _/
        `"""`

Congratulations! You configured the IKS Ingress ALB to send encrypted traffic to the Istio ingress gateway. You allocated a host name and certificate for your Istio ingress gateway and used that certificate as the server certificate for Istio ingress gateway. As the client certificate of the ALB you used the certificate provided by IKS for the ALB. Once you had the certificates deployed as Kubernetes secrets, you directed the ingress traffic from the ALB to the Istio ingress gateway for some specific paths and used the certificates for mutual TLS authentication between the ALB and the Istio ingress gateway.

Cleanup

  1. Delete the Gateway configuration, the VirtualService, and the secrets:

    $ kubectl delete ingress alb-ingress -n istio-system
    $ kubectl delete virtualservice default-ingress -n httptools
    $ kubectl delete gateway default-ingress-gateway -n httptools
    $ kubectl delete secrets alb-certs -n istio-system
    $ rm -rf alb_certs trusted.crt
    $ unset CLUSTER_NAME ALB_INGRESS_DOMAIN ALB_SECRET INGRESS_GATEWAY_DOMAIN INGRESS_GATEWAY_SECRET
  2. Shut down the httpbin service:

    $ kubectl delete -f samples/httpbin/httpbin.yaml -n httptools
  3. Delete the httptools namespace:

    $ kubectl delete namespace httptools
]]>
Fri, 15 May 2020 00:00:00 +0000/v1.24//blog/2020/alb-ingress-gateway-iks//v1.24//blog/2020/alb-ingress-gateway-iks/traffic-managementingresssds-credentialsiksmutual-tls
Provision a certificate and key for an application without sidecars

Istio sidecars obtain their certificates using the secret discovery service. A service in the service mesh may not need (or want) an Envoy sidecar to handle its traffic. In this case, the service will need to obtain a certificate itself if it wants to connect to other TLS or mutual TLS secured services.

Even when a service does not need a sidecar to manage its traffic, a sidecar can still be deployed solely to provision the private key and certificates, obtained through the CSR flow with the CA, and then share them with the service through a file mounted in tmpfs. We use Prometheus as our example application for provisioning a certificate with this mechanism.

In the example application (i.e., Prometheus), a sidecar is added to the Prometheus deployment by setting the flag .Values.prometheus.provisionPrometheusCert to true (this flag is set to true by default in an Istio installation). The deployed sidecar then requests a certificate and shares it with Prometheus.
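
The sharing works through an in-memory volume that is mounted by both the application container and the sidecar. The snippet below is an illustrative sketch of that wiring rather than the exact Prometheus manifest; the volume name is an assumption, and the mount path matches the directory shown next:

# Fragment of a pod spec: an in-memory volume shared between the application and the sidecar
volumes:
- name: istio-certs
  emptyDir:
    medium: Memory
containers:
- name: prometheus
  volumeMounts:
  - name: istio-certs        # the sidecar writes key.pem, cert-chain.pem and root-cert.pem here
    mountPath: /etc/istio-certs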

The key and certificate provisioned for the example application are mounted in the directory /etc/istio-certs/. We can list the key and certificate provisioned for the application by running the following command:

$ kubectl exec -it `kubectl get pod -l app=prometheus -n istio-system -o jsonpath='{.items[0].metadata.name}'` -c prometheus -n istio-system -- ls -la /etc/istio-certs/

The output from the above command should include non-empty key and certificate files, similar to the following:

-rwxr-xr-x    1 root     root          2209 Feb 25 13:06 cert-chain.pem
-rwxr-xr-x    1 root     root          1679 Feb 25 13:06 key.pem
-rwxr-xr-x    1 root     root          1054 Feb 25 13:06 root-cert.pem

If you want to use this mechanism to provision a certificate for your own application, take a look at our Prometheus example application and simply follow the same pattern.

]]>
Wed, 25 Mar 2020 00:00:00 +0000/v1.24//blog/2020/proxy-cert//v1.24//blog/2020/proxy-cert/certificatesidecar
Extended and Improved WebAssemblyHub to Bring the Power of WebAssembly to Envoy and IstioOriginally posted on the Solo.io blog

As organizations adopt Envoy-based infrastructure like Istio to help solve challenges with microservices communication, they inevitably find themselves needing to customize some part of that infrastructure to fit within their organization’s constraints. WebAssembly (Wasm) has emerged as a safe, secure, and dynamic environment for platform extension.

In the recent announcement of Istio 1.5, the Istio project lays the foundation for bringing WebAssembly to the popular Envoy proxy. Solo.io is collaborating with Google and the Istio community to simplify the overall experience of creating, sharing, and deploying WebAssembly extensions to Envoy and Istio. It wasn’t that long ago that Google and others laid the foundation for containers, and Docker built a great user experience to make it consumable. Similarly, this effort makes Wasm consumable by building the best user experience for WebAssembly on Istio.

Back in December 2019, Solo.io began an effort to provide a great developer experience for WebAssembly with the announcement of WebAssembly Hub. The WebAssembly Hub allows developers to very quickly spin up a new WebAssembly project in C++ (we’re expanding this language choice, see below), build it using Bazel in Docker, and push it to an OCI-compliant registry. From there, operators had to pull the module and configure Envoy proxies themselves to load it from disk. While beta support in Gloo, an API Gateway built on Envoy, allowed you to declaratively and dynamically load the module, the Solo.io team wanted to bring the same effortless and secure experience to other Envoy-based frameworks as well - like Istio.

There has been a lot of interest in the innovation in this area, and the Solo.io team has been working hard to further the capabilities of WebAssembly Hub and the workflows it supports. In conjunction with Istio 1.5, Solo.io is thrilled to announce new enhancements to WebAssembly Hub that evolve the viability of WebAssembly with Envoy for production, improve the developer experience, and streamline using Wasm with Envoy in Istio.

Evolving toward production

The Envoy community is working hard to bring Wasm support into the upstream project (right now it lives on a working development fork), with Istio declaring Wasm support an Alpha feature. In Gloo 1.0, we also announced early, non-production support for Wasm. What is Gloo? Gloo is a modern API Gateway and Ingress Controller (built on Envoy Proxy) that supports routing and securing incoming traffic to legacy monoliths, microservices / Kubernetes and serverless functions. Dev and ops teams are able to shape and control traffic patterns from external end users/clients to backend application services. Gloo is a Kubernetes and Istio native ingress gateway.

Although it’s still maturing in each individual project, there are things that we, as a community, can do to improve the foundation for production support.

The first area is standardizing what a WebAssembly extension for Envoy looks like. Solo.io, Google, and the Istio community have defined an open specification for bundling and distributing WebAssembly modules as OCI images. This specification provides a powerful model for distributing any type of Wasm module including Envoy extensions.

This is open to the community - Join in the effort

The next area is improving the experience of deploying Wasm extensions into an Envoy-based framework running in production. In the Kubernetes ecosystem, it is considered a best practice in production to use declarative CRD-based configuration to manage cluster configuration. The new WebAssembly Hub Operator adds a single, declarative CRD which automatically deploys and configures Wasm filters to Envoy proxies running inside of a Kubernetes cluster. This operator enables GitOps workflows and cluster automation to manage Wasm filters without human intervention or imperative workflows. We will provide more information about the Operator in an upcoming blog post.

Lastly, the interactions between developers of Wasm extensions and the teams that deploy them need some kind of role-based access, organization management, and facilities to share, discover, and consume these extensions. The WebAssembly Hub adds team management features like permissions, organizations, user management, sharing, and more.

Improving the developer experience

As developers want to target more languages and runtimes, the experience must be kept as simple and as productive as possible. Multi-language support and runtime ABI (Application Binary Interface) targets should be handled automatically in tooling.

One of the benefits of Wasm is the ability to write modules in many languages. The collaboration between Solo.io and Google provides out-of-the-box support for Envoy filters written in C++, Rust, and AssemblyScript. We will continue to add support for more languages.

Wasm extensions use the Application Binary Interface (ABI) within the Envoy proxy to which they are deployed. The WebAssembly Hub provides strong ABI versioning guarantees between Envoy, Istio, and Gloo to prevent unpredictable behavior and bugs. All you have to worry about is writing your extension code.

Lastly, like Docker, the WebAssembly Hub stores and distributes Wasm extensions as OCI images. This makes pushing, pulling, and running extensions as easy as working with Docker containers. Wasm extension images are versioned and cryptographically secure, making it safe to run extensions locally the same way you would in production. This allows teams to trust what they build and push, as well as trust the source when they pull down and deploy images.
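
As an illustration of that Docker-like flow (the image reference below is a placeholder, not a published extension), pushing and later pulling an extension image might look like:

$ wasme push webassemblyhub.io/myorg/my-filter:v0.1
$ wasme pull webassemblyhub.io/myorg/my-filter:v0.1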

WebAssembly Hub with Istio

The WebAssembly Hub now fully automates the process of deploying Wasm extensions to Istio (as well as other Envoy-based frameworks like the Gloo API Gateway) installed in Kubernetes. With this deployment feature, the WebAssembly Hub relieves the operator or end user from having to manually configure the Envoy proxies in their Istio service mesh to use their WebAssembly modules.

Take a look at the following video to see just how easy it is to get started with WebAssembly and Istio:

Get Started

We hope that the WebAssembly Hub will become a meeting place for the community to share, discover, and distribute Wasm extensions. By providing a great user experience, we hope to make developing, installing, and running Wasm easier and more rewarding. Join us at the WebAssembly Hub, share your extensions and ideas, and join an upcoming webinar.

]]>
Wed, 25 Mar 2020 00:00:00 +0000/v1.24//blog/2020/wasmhub-istio//v1.24//blog/2020/wasmhub-istio/wasmextensibilityalphaperformanceoperator
Introducing istiod: simplifying the control planeMicroservices are a great pattern when they map services to disparate teams that deliver them, or when the value of independent rollout and the value of independent scale are greater than the cost of orchestration. We regularly talk to customers and teams running Istio in the real world, and they told us that none of these were the case for the Istio control plane. So, in Istio 1.5, we’ve changed how Istio is packaged, consolidating the control plane functionality into a single binary called istiod.

History of the Istio control plane

Istio implements a pattern that has been in use at both Google and IBM for many years, which later became known as “service mesh”. By pairing client and server processes with proxy servers, the proxies act as an application-aware data plane that’s not simply moving packets around hosts, or pulses over wires.

This pattern helps the world come to terms with microservices: fine-grained, loosely-coupled services connected via lightweight protocols. Common cross-platform and cross-language standards like HTTP and gRPC, which replace proprietary transports, and the widespread availability of the needed libraries, empower different teams to write different parts of an overall architecture in whatever language makes the most sense. Furthermore, each service can scale independently as needed. A desire to implement security, observability and traffic control for such a network powers Istio’s popularity.

Istio’s control plane is, itself, a modern, cloud-native application. Thus, it was built from the start as a set of microservices. Individual Istio components like service discovery (Pilot), configuration (Galley), certificate generation (Citadel) and extensibility (Mixer) were all written and deployed as separate microservices. The need for these components to communicate securely and be observable provided opportunities for Istio to eat its own dogfood (or “drink its own champagne”, to use a more French version of the metaphor!).

The cost of complexity

Good teams look back upon their choices and, with the benefit of hindsight, revisit them. Generally, when a team adopts microservices and their inherent complexity, they look for improvements in other areas to justify the tradeoffs. Let’s look at the Istio control plane through that lens.

  • Microservices empower you to write in different languages. The data plane (the Envoy proxy) is written in C++, and this boundary benefits from a clean separation in terms of the xDS APIs. However, all of the Istio control plane components are written in Go. We were able to choose the appropriate language for the appropriate job: highly performant C++ for the proxy, but accessible and speedy-development for everything else.

  • Microservices empower you to allow different teams to manage services individually. In the vast majority of Istio installations, all the components are installed and operated by a single team or individual. The componentization done within Istio is aligned along the boundaries of the development teams who build it. This would make sense if the Istio components were delivered as a managed service by the people who wrote them, but this is not the case! Making life simpler for the development teams had an outsized impact on the usability for the orders-of-magnitude more users.

  • Microservices empower you to decouple versions, and release different components at different times. All the components of the control plane have always been released at the same version, at the same time. We have never tested or supported running different versions of (for example) Citadel and Pilot.

  • Microservices empower you to scale components independently. In Istio 1.5, control plane costs are dominated by a single feature: serving the Envoy xDS APIs that program the data plane. Every other feature has a marginal cost, which means there is very little value to having those features in separately-scalable microservices.

  • Microservices empower you to maintain security boundaries. Another good reason to separate an application into different microservices is if they have different security roles. Multiple Istio microservices like the sidecar injector, the Envoy bootstrap, Citadel, and Pilot hold nearly equivalent permissions to change the proxy configuration. Therefore, exploiting any of these services would cause near equivalent damage. When you deploy Istio, all the components are installed by default into the same Kubernetes namespace, offering limited security isolation.

The benefit of consolidation: introducing istiod

Having established that many of the common benefits of microservices didn’t apply to the Istio control plane, we decided to unify them into a single binary: istiod (the ’d’ is for daemon).

Let’s look at the benefits of the new packaging:

  • Installation becomes easier. Fewer Kubernetes deployments and associated configurations are required, so the set of configuration options and flags for Istio is reduced significantly. In the simplest case, you can start the Istio control plane, with all features enabled, by starting a single Pod.

  • Configuration becomes easier. Many of the configuration options that Istio has today are ways to orchestrate the control plane components, and so are no longer needed. You also no longer need to change cluster-wide PodSecurityPolicy to deploy Istio.

  • Using VMs becomes easier. To add a workload to a mesh, you now just need to install one agent and the generated certificates. That agent connects back to only a single service.

  • Maintenance becomes easier. Installing, upgrading, and removing Istio no longer require a complicated dance of version dependencies and startup orders. For example: To upgrade, you only need to start a new istiod version alongside your existing control plane, canary it, and then move all traffic over to it.

  • Scalability becomes easier. There is now only one component to scale.

  • Debugging becomes easier. Fewer components means less cross-component environmental debugging.

  • Startup time goes down. Components no longer need to wait for each other to start in a defined order.

  • Resource usage goes down and responsiveness goes up. Communication between components becomes guaranteed, and not subject to gRPC size limits. Caches can be shared safely, which decreases the resource footprint as a result.

istiod unifies functionality that Pilot, Galley, Citadel and the sidecar injector previously performed, into a single binary.

A separate component, the istio-agent, helps each sidecar connect to the mesh by securely passing configuration and secrets to the Envoy proxies. While the agent, strictly speaking, is still part of the control plane, it runs on a per-pod basis. We’ve further simplified by rolling per-node functionality that used to run as a DaemonSet, into that per-pod agent.

Extra for experts

There will still be some cases where you might want to run Istio components independently, or replace certain components.

Some users might want to use a Certificate Authority (CA) outside the mesh, and we have documentation on how to do that. If you do your certificate provisioning using a different tool, we can use that instead of the built-in CA.
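
As a sketch of that documented flow, you supply your own signing certificate and key in a cacerts secret, which the control plane uses instead of generating a self-signed root (the file names follow the plug-in CA documentation; the files themselves come from your own CA):

$ kubectl create secret generic cacerts -n istio-system \
    --from-file=ca-cert.pem --from-file=ca-key.pem \
    --from-file=root-cert.pem --from-file=cert-chain.pem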

Moving forward

At its heart, istiod is just a packaging and optimization change. It’s built on the same code and API contracts as the separate components, and remains covered by our comprehensive test suite. This gives us confidence in making it the default in Istio 1.5. The service is now called istiod - you’ll still see an istio-pilot service for existing proxies while the upgrade process completes.

While the move to istiod may seem like a big change, and is a huge improvement for the people who administer and maintain the mesh, it won’t make the day-to-day life of using Istio any different. istiod is not changing any of the APIs used to configure your mesh, so your existing processes will all stay the same.

Does this change imply that microservices are a mistake for all workloads and architectures? Of course not. They are a tool in a toolbelt, and they work best when they are reflected in your organizational reality. Instead, this change shows a willingness in the project to change based on user feedback, and a continued focus on simplification for all users. Microservices have to be right sized, and we believe we have found the right size for Istio.

]]>
Thu, 19 Mar 2020 00:00:00 +0000/v1.24//blog/2020/istiod//v1.24//blog/2020/istiod/istiodcontrol planeoperator
Declarative WebAssembly deployment for IstioAs outlined in the Istio 2020 trade winds blog and more recently announced with Istio 1.5, WebAssembly (Wasm) is now an (alpha) option for extending the functionality of the Istio service proxy (Envoy proxy). With Wasm, users can build support for new protocols, custom metrics, loggers, and other filters. Working closely with Google, we in the community (Solo.io) have focused on the user experience of building, socializing, and deploying Wasm extensions to Istio. We’ve announced WebAssembly Hub and associated tooling to build a “docker-like” experience for working with Wasm.

Background

With the WebAssembly Hub tooling, we can use the wasme CLI to easily bootstrap a Wasm project for Envoy, push it to a repository, and then pull/deploy it to Istio. For example, to deploy a Wasm extension to Istio with wasme we can run the following:

$  wasme deploy istio webassemblyhub.io/ceposta/demo-add-header:v0.2 \
  --id=myfilter \
  --namespace=bookinfo \
  --config 'tomorrow'

This will add the demo-add-header extension to all workloads running in the bookinfo namespace. We can get more fine-grained control over which workloads get the extension by using the --labels parameter:

$  wasme deploy istio webassemblyhub.io/ceposta/demo-add-header:v0.2 \
  --id=myfilter  \
  --namespace=bookinfo  \
  --config 'tomorrow' \
  --labels app=details

This is a much easier experience than manually creating EnvoyFilter resources and trying to get the Wasm module to each of the pods that are part of the workload you’re trying to target. However, this is a very imperative approach to interacting with Istio. Just like users typically don’t use kubectl directly in production and prefer a declarative, resource-based workflow, we want the same for making customizations to our Istio proxies.

A declarative approach

The WebAssembly Hub tooling also includes an operator for deploying Wasm extensions to Istio workloads. The operator allows users to define their WebAssembly extensions using a declarative format and leave it to the operator to reconcile the deployment. For example, we use a FilterDeployment resource to define what image and workloads need the extension:

apiVersion: wasme.io/v1
kind: FilterDeployment
metadata:
  name: bookinfo-custom-filter
  namespace: bookinfo
spec:
  deployment:
    istio:
      kind: Deployment
      labels:
        app: details
  filter:
    config: 'world'
    image: webassemblyhub.io/ceposta/demo-add-header:v0.2
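
Applying it then follows the usual declarative workflow; assuming the document above is saved to a file (the file name is arbitrary), it is just:

$ kubectl apply -f bookinfo-custom-filter.yaml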

We could then take this FilterDeployment document and version it with the rest of our Istio resources. You may be wondering why we need this Custom Resource to configure Istio’s service proxy to use a Wasm extension when Istio already has the EnvoyFilter resource.

Let’s take a look at exactly how all of this works under the covers.

How it works

Under the covers the operator is doing a few things that aid in deploying and configuring a Wasm extension into the Istio service proxy (Envoy Proxy).

  • Set up local cache of Wasm extensions
  • Pull desired Wasm extension into the local cache
  • Mount the wasm-cache into appropriate workloads
  • Configure Envoy with EnvoyFilter CRD to use the Wasm filter
Understanding how wasme operator works

At the moment, the Wasm image needs to be published into a registry for the operator to correctly cache it. The cache pods run as a DaemonSet on each node so that the cache can be mounted into the Envoy container. This is being improved, as it’s not the ideal mechanism. Ideally we wouldn’t have to deal with mounting anything and could stream the module to the proxy directly over HTTP, so stay tuned for updates (should land within the next few days). The mount is established by using the sidecar.istio.io/userVolume and sidecar.istio.io/userVolumeMount annotations. See the docs on Istio Resource Annotations for more about how that works.
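
As a rough sketch (the exact JSON the operator writes may differ), those annotations on the workload’s pod template could look like the following, matching the cache path referenced in the EnvoyFilter shown below:

annotations:
  sidecar.istio.io/userVolume: '[{"name":"wasme-cache","hostPath":{"path":"/var/local/lib/wasme-cache"}}]'
  sidecar.istio.io/userVolumeMount: '[{"name":"wasme-cache","mountPath":"/var/local/lib/wasme-cache"}]'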

Once the Wasm module is cached correctly and mounted into the workload’s service proxy, the operator then configures the EnvoyFilter resources.

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: details-v1-myfilter
  namespace: bookinfo
spec:
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.http_connection_manager
            subFilter:
              name: envoy.router
    patch:
      operation: INSERT_BEFORE
      value:
        config:
          config:
            configuration: tomorrow
            name: myfilter
            rootId: add_header
            vmConfig:
              code:
                local:
                  filename: /var/local/lib/wasme-cache/44bf95b368e78fafb663020b43cf099b23fc6032814653f2f47e4d20643e7267
              runtime: envoy.wasm.runtime.v8
              vmId: myfilter
        name: envoy.filters.http.wasm
  workloadSelector:
    labels:
      app: details
      version: v1

You can see the EnvoyFilter resource configures the proxy to add the envoy.filter.http.wasm filter and load the Wasm module from the wasme-cache.

Once the Wasm extension is loaded into the Istio service proxy, it will extend the capabilities of the proxy with whatever custom code you introduced.

Next Steps

In this blog we explored options for installing Wasm extensions into Istio workloads. The easiest way to get started with WebAssembly on Istio is to use the wasme tool to bootstrap a new Wasm project with C++, AssemblyScript [or Rust coming really soon!]. For example, to set up a C++ Wasm module, you can run:

$ wasme init ./filter --language cpp --platform istio --platform-version 1.5.x

If we didn’t have the extra flags, wasme init would enter an interactive mode walking you through the correct values to choose.

Take a look at the WebAssembly Hub wasme tooling to get started with Wasm on Istio.

Learn more

]]>
Mon, 16 Mar 2020 00:00:00 +0000/v1.24//blog/2020/deploy-wasm-declarative//v1.24//blog/2020/deploy-wasm-declarative/wasmextensibilityalphaoperator
Redefining extensibility in proxies - introducing WebAssembly to Envoy and IstioSince adopting Envoy in 2016, the Istio project has always wanted to provide a platform on top of which a rich set of extensions could be built, to meet the diverse needs of our users. There are many reasons to add capability to the data plane of a service mesh — to support newer protocols, integrate with proprietary security controls, or enhance observability with custom metrics, to name a few.

Over the last year and a half our team here at Google has been working on adding dynamic extensibility to the Envoy proxy using WebAssembly. We are delighted to share that work with the world today, as well as unveiling WebAssembly (Wasm) for Proxies (Proxy-Wasm): an ABI, which we intend to standardize; SDKs; and its first major implementation, the new, lower-latency Istio telemetry system.

We have also worked closely with the community to ensure that there is a great developer experience for users to get started quickly. The Google team has been working closely with the team at Solo.io, who have built the WebAssembly Hub, a service for building, sharing, discovering and deploying Wasm extensions. With the WebAssembly Hub, Wasm extensions are as easy to manage, install and run as containers.

This work is being released today in Alpha and there is still lots of work to be done, but we are excited to get this into the hands of developers so they can start experimenting with the tremendous possibilities this opens up.

Background

The need for extensibility has been a founding tenet of both the Istio and Envoy projects, but the two projects took different approaches. The Istio project focused on enabling a generic out-of-process extension model called Mixer with a lightweight developer experience, while Envoy focused on in-proxy extensions.

Each approach has its share of pros and cons. The Istio model led to significant resource inefficiencies that impacted tail latencies and resource utilization. This model was also intrinsically limited - for example, it was never going to provide support for implementing custom protocol handling.

The Envoy model imposed a monolithic build process, and required extensions to be written in C++, limiting the developer ecosystem. Rolling out a new extension to the fleet required pushing new binaries and rolling restarts, which can be difficult to coordinate, and risk downtime. This also incentivized developers to upstream extensions into Envoy that were used by only a small percentage of deployments, just to piggyback on its release mechanisms.

Over time some of the most performance-sensitive features of Istio have been upstreamed into Envoy - policy checks on traffic, and JWT authentication, for example. Still, we have always wanted to converge on a single stack for extensibility that imposes fewer tradeoffs: something that decouples Envoy releases from its extension ecosystem, enables developers to work in their languages of choice, and enables Istio to reliably roll out new capability without downtime risk. Enter WebAssembly.

What is WebAssembly?

WebAssembly (Wasm) is a portable bytecode format for executing code written in multiple languages at near-native speed. Its initial design goals align well with the challenges outlined above, and it has sizable industry support behind it. Wasm is the fourth standard language (following HTML, CSS and JavaScript) to run natively in all the major browsers, having become a W3C Recommendation in December 2019. That gives us confidence in making a strategic bet on it.

While WebAssembly started life as a client-side technology, there are a number of advantages to using it on the server. The runtime is memory-safe and sandboxed for security. There is a large tooling ecosystem for compiling and debugging Wasm in its textual or binary format. The W3C and BytecodeAlliance have become active hubs for other server-side efforts. For example, the Wasm community is standardizing a “WebAssembly System Interface” (WASI) at the W3C, with a sample implementation, which provides an OS-like abstraction to Wasm ‘programs’.

Bringing WebAssembly to Envoy

Over the past 18 months, we have been working with the Envoy community to build Wasm extensibility into Envoy and contribute it upstream. We’re pleased to announce it is available as Alpha in the Envoy build shipped with Istio 1.5, with source in the envoy-wasm development fork and work ongoing to merge it into the main Envoy tree. The implementation uses the WebAssembly runtime built into Google’s high performance V8 engine.

In addition to the underlying runtime, we have also built:

  • A generic Application Binary Interface (ABI) for embedding Wasm in proxies, which means compiled extensions will work across different versions of Envoy - or even other proxies, should they choose to implement the ABI

  • SDKs for easy extension development in C++, Rust and AssemblyScript, with more to follow

  • Comprehensive samples and instructions on how to deploy in Istio and standalone Envoy

  • Abstractions to allow for other Wasm runtimes to be used, including a ’null’ runtime which simply compiles the extension natively into Envoy — very useful for testing and debugging

Using Wasm for extending Envoy brings us several key benefits:

  • Agility: Extensions can be delivered and reloaded at runtime using the Istio control plane. This enables a fast develop → test → release cycle for extensions without requiring Envoy rollouts.

  • Stock releases: Once merging into the main tree is complete, Istio and others will be able to use stock releases of Envoy, instead of custom builds. This will also free the Envoy community to move some of the built-in extensions to this model, thereby reducing their supported footprint.

  • Reliability and isolation: Extensions are deployed inside a sandbox with resource constraints, which means they can now crash, or leak memory, without bringing the whole Envoy process down. CPU and memory usage can also be constrained.

  • Security: The sandbox has a clearly defined API for communicating with Envoy, so extensions only have access to, and can modify, a limited number of properties of a connection or request. Furthermore, because Envoy mediates this interaction, it can hide or sanitize sensitive information from the extension (e.g. “Authorization” and “Cookie” HTTP headers, or the client’s IP address).

  • Flexibility: over 30 programming languages can be compiled to WebAssembly, allowing developers from all backgrounds - C++, Go, Rust, Java, TypeScript, etc. - to write Envoy extensions in their language of choice.

“I am extremely excited to see WASM support land in Envoy; this is the future of Envoy extensibility, full stop. Envoy’s WASM support coupled with a community driven hub will unlock an incredible amount of innovation in the networking space across both service mesh and API gateway use cases. I can’t wait to see what the community builds moving forward.” – Matt Klein, Envoy creator.

For technical details of the implementation, look out for an upcoming post to the Envoy blog.

The Proxy-Wasm interface between host environment and extensions is deliberately proxy agnostic. We’ve built it into Envoy, but it was designed to be adopted by other proxy vendors. We want to see a world where you can take an extension written for Istio and Envoy and run it in other infrastructure; you’ll hear more about that soon.

Building on WebAssembly in Istio

Istio moved several of its extensions into its build of Envoy as part of the 1.5 release, in order to significantly improve performance. While doing that work we have been testing to ensure those same extensions can compile and run as Proxy-Wasm modules with no variation in behavior. We’re not quite ready to make this setup the default, given that we consider Wasm support to be Alpha; however, this has given us a lot of confidence in our general approach and in the host environment, ABI and SDKs that have been developed.

We have also been careful to ensure that the Istio control plane and its Envoy configuration APIs are Wasm-ready. We have samples showing how several common customizations, such as custom header decoding or programmatic routing, can be performed - both frequent asks from users. As we move this support to Beta, you will see documentation showing best practices for using Wasm with Istio.

Finally, we are working with the many vendors who have written Mixer adapters, to help them with a migration to Wasm — if that is the best path forward. Mixer will move to a community project in a future release, where it will remain available for legacy use cases.

Developer Experience

Powerful tooling is nothing without a great developer experience. Solo.io recently announced the release of WebAssembly Hub, a set of tools and repository for building, deploying, sharing and discovering Envoy Proxy Wasm extensions for Envoy and Istio.

The WebAssembly Hub fully automates many of the steps required for developing and deploying Wasm extensions. Using WebAssembly Hub tooling, users can easily compile their code - in any supported language - into Wasm extensions. The extensions can then be uploaded to the Hub registry, and be deployed and undeployed to Istio with a single command.

Behind the scenes the Hub takes care of much of the nitty-gritty, such as pulling in the correct toolchain, ABI version verification, permission control, and more. The workflow also eliminates toil with configuration changes across Istio service proxies by automating the deployment of your extensions. This tooling helps users and operators avoid unexpected behaviors due to misconfiguration or version mismatches.

The WebAssembly Hub tools provide a powerful CLI as well as an elegant and easy-to-use graphical user interface. An important goal of the WebAssembly Hub is to simplify the experience around building Wasm modules and provide a place of collaboration for developers to share and discover useful extensions.

Check out the getting started guide to create your first Proxy-Wasm extension.

Next Steps

In addition to working towards a beta release, we are committed to making sure that there is a durable community around Proxy-Wasm. The ABI needs to be finalized, and turning it into a standard will be done with broader feedback within the appropriate standards body. Completing upstreaming support into the Envoy mainline is still in progress. We are also seeking an appropriate community home for the tooling and the WebAssembly Hub.

Learn more

]]>
Thu, 05 Mar 2020 00:00:00 +0000/v1.24//blog/2020/wasm-announce//v1.24//blog/2020/wasm-announce/wasmextensibilityalphaperformanceoperator
Istio in 2020 - Following the Trade WindsIstio solves real problems that people encounter running microservices. Even very early pre-release versions helped users debug the latency in their architecture, increase the reliability of services, and transparently secure traffic behind the firewall.

Last year, the Istio project experienced major growth. After a 9-month gestation before the 1.1 release in Q1, we set a goal of having a quarterly release cadence. We knew it was important to deliver value consistently and predictably. With three releases landing in the successive quarters as planned, we are proud to have reached that goal.

During that time, we improved our build and test infrastructure, resulting in higher quality and easier release cycles. We doubled down on user experience, adding many commands to make operating and debugging the mesh easier. We also saw tremendous growth in the number of developers and companies contributing to the product - culminating in us being #4 on GitHub’s top ten list of fastest growing projects!

We have ambitious goals for Istio in 2020 and there are many major efforts underway, but at the same time we strongly believe that good infrastructure should be “boring.” Using Istio in production should be a seamless experience; performance should not be a concern, upgrades should be a non-event and complex tasks should be automated away. With our investment in a more powerful extensibility story we think the pace of innovation in the service mesh space can increase while Istio focuses on being gloriously dull. More details on our major efforts in 2020 below.

Sleeker, smoother and faster

Istio provided for extensibility from day one, implemented by a component called Mixer. Mixer is a platform that allows custom adapters to act as an intermediary between the data plane and the backends you use for policy or telemetry. Mixer necessarily added overhead to requests because it required extensions to be out-of-process. So, we’re moving to a model that enables extension directly in the proxies instead.

Most of Mixer’s use cases for policy enforcement are already addressed with Istio’s authentication and authorization policies, which allow you to control workload-to-workload and end-user-to-workload authorization directly in the proxy. Common monitoring use cases have already moved into the proxy too - we have introduced in-proxy support for sending telemetry to Prometheus and Stackdriver.

Our benchmarking shows that the new telemetry model reduces our latency dramatically and gives us industry-leading performance, with 50% reductions in both latency and CPU consumption.

A new model for Istio extensibility

The model that replaces Mixer uses extensions in Envoy to provide even more capability. The Istio community is leading the implementation of a WebAssembly (Wasm) runtime in Envoy, which lets us implement extensions that are modular, sandboxed, and developed in one of over 20 languages. Extensions can be dynamically loaded and reloaded while the proxy continues serving traffic. Wasm extensions will also be able to extend the platform in ways that Mixer simply couldn’t. They can act as custom protocol handlers and transform payloads as they pass through Envoy — in short they can do the same things as modules built into Envoy.

We’re working with the Envoy community on ways to discover and distribute these extensions. We want to make WebAssembly extensions as easy to install and run as containers. Many of our partners have written Mixer adapters, and together we are getting them ported to Wasm. We are also developing guides and codelabs on how to write your own extensions for custom integrations.

By changing the extension model, we were also able to remove dozens of CRDs. You no longer need a unique CRD for every piece of software you integrate with Istio.

Installing Istio 1.5 with the ‘preview’ configuration profile won’t install Mixer. If you upgrade from a previous release, or install the ‘default’ profile, we still keep Mixer around, to be safe. When using Prometheus or Stackdriver for metrics, we recommend you try out the new mode and see how much your performance improves.
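
At the time of 1.5, trying the preview profile is a one-liner (shown as a sketch; consult the installation docs for your version):

$ istioctl manifest apply --set profile=preview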

You can keep Mixer installed and enabled if you need it. Eventually Mixer will become a separately released add-on to Istio that is part of the istio-ecosystem.

Fewer moving parts

We are also simplifying the deployment of the rest of the control plane. To that end, we combined several of the control plane components into a single component: Istiod. This binary includes the features of Pilot, Citadel, Galley, and the sidecar injector. This approach improves many aspects of installing and managing Istio – reducing installation and configuration complexity, maintenance effort, and issue diagnosis time while increasing responsiveness. Read more about Istiod in this post from Christian Posta.

We are shipping Istiod as the default for all profiles in 1.5.

To reduce the per-node footprint, we are getting rid of the node-agent, used to distribute certificates, and moving its functionality to the istio-agent, which already runs in each Pod. For those of you who like pictures we are moving from this …

The Istio architecture today

to this…

The Istio architecture in 2020

In 2020, we will continue to invest in onboarding to achieve our goal of a “zero config” default that doesn’t require you to change any of your application’s configuration to take advantage of most Istio features.

Improved lifecycle management

To improve Istio’s life-cycle management, we moved to an operator-based installation. We introduced the IstioOperator CRD and two installation modes:

  • Human-triggered: use istioctl to apply the settings to the cluster.
  • Machine-triggered: use a controller that continually watches for changes in that CRD and applies them to the cluster in real time.
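
A minimal IstioOperator resource, usable with either mode, might look like the following sketch (the profile and name are chosen for illustration):

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: example-istiocontrolplane
spec:
  profile: default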

In 2020, upgrades will get easier too. We will add support for “canarying” new versions of the Istio control plane, which allows you to run a new version alongside the existing version and gradually switch your data plane over to use the new one.

Secure By Default

Istio already provides the fundamentals for strong service security: reliable workload identity, robust access policies and comprehensive audit logging. We’re stabilizing APIs for these features; many Alpha APIs are moving to Beta in 1.5, and we expect them to all be v1 by the end of 2020. To learn more about the status of our APIs, see our features page.
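
For example, with the beta security API, locking the whole mesh to mutual TLS is a single small resource (a sketch; the mesh-wide scope comes from placing it in the root namespace):

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT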

Network traffic is also becoming more secure by default. After many users enabled it in preview, automated rollout of mutual TLS is becoming the recommended practice in Istio 1.5.

In addition, we will make Istio require fewer privileges and simplify its dependencies, which in turn makes it a more robust system. Historically, you had to mount certificates into Envoy using Kubernetes Secrets, which were mounted as files into each proxy. By leveraging the Secret Discovery Service we can distribute these certificates securely, without the risk of them being intercepted by other workloads on the machine. This mode will become the default in 1.5.

Getting rid of the node-agent not only simplifies the deployment, but also removes the requirement for a cluster-wide PodSecurityPolicy, further improving the security posture of your cluster.

Other features

Here’s a snapshot of some more exciting things you can expect from Istio in 2020:

  • Integration with more hosted Kubernetes environments - service meshes powered by Istio are currently available from 15 vendors, including Google, IBM, Red Hat, VMware, Alibaba and Huawei
  • More investment in istioctl and its ability to help diagnose problems
  • Better integration of VM-based workloads into meshes
  • Continued work towards making multi-cluster and multi-network meshes easier to configure, maintain, and run
  • Integration with more service discovery systems, including Functions-as-a-Service
  • Implementation of the new Kubernetes service APIs, which are currently in development
  • An enhancement repository, to track feature development
  • Making it easier to run Istio without needing Kubernetes!

From the seas to the skies, we’re excited to see where you take Istio next.

]]>
Tue, 03 Mar 2020 00:00:00 +0000/v1.24//blog/2020/tradewinds-2020//v1.24//blog/2020/tradewinds-2020/roadmapsecurityperformanceoperator
Remove cross-pod unix domain socketsIn Istio versions before 1.5, during secret discovery service (SDS) execution, the SDS client and the SDS server communicate through a cross-pod Unix domain socket (UDS), which needs to be protected by Kubernetes pod security policies.

With Istio 1.5, Pilot Agent, Envoy, and Citadel Agent will be running in the same container (the architecture is shown in the following diagram). To defend against attackers eavesdropping on the cross-pod UDS between Envoy (SDS client) and Citadel Agent (SDS server), Istio 1.5 merges Pilot Agent and Citadel Agent into a single Istio Agent and makes the UDS between Envoy and Citadel Agent private to the Istio Agent container. The Istio Agent container is deployed as the sidecar of the application service container.

The architecture of Istio Agent
]]>
Thu, 20 Feb 2020 00:00:00 +0000/v1.24//blog/2020/istio-agent//v1.24//blog/2020/istio-agent/securitysecret discovery serviceunix domain socket
Multicluster Istio configuration and service discovery using AdmiralAt Intuit, we read the blog post Multi-Mesh Deployments for Isolation and Boundary Protection and immediately related to some of the problems mentioned. We realized that even though we wanted to configure a single multi-cluster mesh, instead of a federation of multiple meshes as described in the blog post, the same non-uniform naming issues also applied in our environment. This blog post explains how we solved these problems using Admiral, an open source project under istio-ecosystem in GitHub.

Background

Using Istio, we realized the configuration for multi-cluster was complex and challenging to maintain over time. As a result, we chose the model described in Multi-Cluster Istio Service Mesh with replicated control planes for scalability and other operational reasons. Following this model, we had to solve these key requirements before widely adopting an Istio service mesh:

  • Creation of service DNS entries decoupled from the namespace, as described in Features of multi-mesh deployments.
  • Service discovery across many clusters.
  • Supporting active-active & HA/DR deployments. We also had to support these crucial resiliency patterns with services being deployed in globally unique namespaces across discrete clusters.

We have over 160 Kubernetes clusters with a globally unique namespace name across all clusters. In this configuration, we can have the same service workload deployed in different regions running in namespaces with different names. As a result, following the routing strategy mentioned in Multicluster version routing, the example name foo.namespace.global wouldn’t work across clusters. We needed a globally unique and discoverable service DNS that resolves service instances in multiple clusters, each instance running/addressable with its own unique Kubernetes FQDN. For example, foo.global should resolve to both foo.uswest2.svc.cluster.local & foo.useast2.svc.cluster.local if foo is running in two Kubernetes clusters with different names. Also, our services need additional DNS names with different resolution and global routing properties. For example, foo.global should resolve locally first, then route to a remote instance using topology routing, while foo-west.global and foo-east.global (names used for testing) should always resolve to the respective regions.

Contextual Configuration

After further investigation, it was apparent that configuration needed to be contextual: each cluster needs a configuration specifically tailored for its view of the world.

For example, we have a payments service consumed by orders and reports. The payments service has an HA/DR deployment across us-east (cluster 3) and us-west (cluster 2). The payments service is deployed in namespaces with different names in each region. The orders service is deployed in a different cluster than payments in us-west (cluster 1). The reports service is deployed in the same cluster as payments in us-west (cluster 2).

Cross cluster workload communication with Istio

Istio ServiceEntry yaml for payments service in Cluster 1 and Cluster 2 below illustrates the contextual configuration that other services need to use the payments service:

Cluster 1 Service Entry

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: payments.global-se
spec:
  addresses:
  - 240.0.0.10
  endpoints:
  - address: ef394f...us-east-2.elb.amazonaws.com
    locality: us-east-2
    ports:
      http: 15443
  - address: ad38bc...us-west-2.elb.amazonaws.com
    locality: us-west-2
    ports:
      http: 15443
  hosts:
  - payments.global
  location: MESH_INTERNAL
  ports:
  - name: http
    number: 80
    protocol: http
  resolution: DNS

Cluster 2 Service Entry

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: payments.global-se
spec:
  addresses:
  - 240.0.0.10
  endpoints:
  - address: ef39xf...us-east-2.elb.amazonaws.com
    locality: us-east-2
    ports:
      http: 15443
  - address: payments.default.svc.cluster.local
    locality: us-west-2
    ports:
      http: 80
  hosts:
  - payments.global
  location: MESH_INTERNAL
  ports:
  - name: http
    number: 80
    protocol: http
  resolution: DNS

The payments ServiceEntry (an Istio CRD) from the point of view of the reports service in Cluster 2 sets the us-west locality to point to the local Kubernetes FQDN and the us-east locality to point to the istio-ingressgateway (load balancer) of Cluster 3. The payments ServiceEntry from the point of view of the orders service in Cluster 1 sets the us-west locality to point to the Cluster 2 istio-ingressgateway and the us-east locality to point to the istio-ingressgateway of Cluster 3.

But wait, there’s even more complexity: What if the payments service wants to move traffic to the us-east region for planned maintenance in us-west? This would require the payments service to change the Istio configuration in all of its clients’ clusters. That would be nearly impossible to do without automation.

Admiral to the Rescue

Admiral is that automation.

Admiral is a controller of Istio control planes.

Cross cluster workload communication with Istio and Admiral

Admiral provides automatic configuration for an Istio mesh spanning multiple clusters to work as a single mesh based on a unique service identifier that associates workloads running on multiple clusters to a service. It also provides automatic provisioning and syncing of Istio configuration across clusters. This removes the burden on developers and mesh operators, which helps scale beyond a few clusters.
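
To give a concrete picture of the "unique service identifier", the sketch below shows a payments workload carrying that identifier as a label. This is an illustration, not a manifest required by Admiral: the label name mirrors the identityLabel used in the Dependency example later in this post, and every other value is a placeholder.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments
  namespace: payments-us-west   # hypothetical, globally unique namespace
spec:
  replicas: 2
  selector:
    matchLabels:
      app: payments
  template:
    metadata:
      labels:
        app: payments
        identity: payments      # the service identifier Admiral uses to associate workloads across clusters
    spec:
      containers:
      - name: payments
        image: example/payments:latest   # placeholder image
        ports:
        - containerPort: 8080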

Admiral CRDs

Global Traffic Routing

With Admiral’s global traffic policy CRD, the payments service can update regional traffic weights and Admiral updates the Istio configuration in all clusters that consume the payments service.

apiVersion: admiral.io/v1alpha1
kind: GlobalTrafficPolicy
metadata:
  name: payments-gtp
spec:
  selector:
    identity: payments
  policy:
  - dns: default.payments.global
    lbType: 1
    target:
    - region: us-west-2/*
      weight: 10
    - region: us-east-2/*
      weight: 90

In the example above, 90% of the payments service traffic is routed to the us-east region. This Global Traffic Configuration is automatically converted into Istio configuration and contextually mapped into Kubernetes clusters to enable multi-cluster global routing for the payments service for its clients within the Mesh.

This Global Traffic Routing feature relies on Istio’s locality load-balancing per service available in Istio 1.5 or later.
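
For illustration only, the Istio configuration generated from a global traffic policy like the one above could resemble the following DestinationRule sketch. The exact resources and names Admiral generates may differ, and the outlier detection values here are assumptions (Istio only activates locality load balancing when outlier detection is configured):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: payments-global-dr          # hypothetical name
spec:
  host: payments.global
  trafficPolicy:
    outlierDetection:               # required for locality load balancing to take effect
      consecutiveErrors: 5
      interval: 5s
      baseEjectionTime: 30s
    loadBalancer:
      localityLbSetting:
        distribute:
        - from: "us-west-2/*"
          to:
            "us-east-2/*": 90       # matches the 90% weight in the GlobalTrafficPolicy
            "us-west-2/*": 10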

Dependency

The Admiral Dependency CRD allows us to specify a service’s dependencies based on a service identifier. This optimizes the delivery of Admiral-generated configuration to only the clusters where the dependent clients of a service are running (instead of writing it to all clusters). Admiral also configures and/or updates the Sidecar Istio CRD in the client’s workload namespace to limit the Istio configuration to only its dependencies. We use service-to-service authorization information recorded elsewhere to generate these dependency records for Admiral to use.

An example dependency for the orders service:

apiVersion: admiral.io/v1alpha1
kind: Dependency
metadata:
  name: dependency
  namespace: admiral
spec:
  source: orders
  identityLabel: identity
  destinations:
  - payments

The Dependency resource is optional; a missing dependency for a service results in the Istio configuration for that service being pushed to all clusters.

Summary

Admiral provides a new Global Traffic Routing and unique service naming functionality to address some challenges posed by the Istio model described in multi-cluster deployment with replicated control planes. It removes the need for manual configuration synchronization between clusters and generates contextual configuration for each cluster. This makes it possible to operate a Service Mesh composed of many Kubernetes clusters.

We think the Istio and service mesh community would benefit from this approach, so we open sourced Admiral and would love your feedback and support!

Published: Sun, 05 Jan 2020 | /v1.24/blog/2020/multi-cluster-mesh-automation/ | Tags: traffic-management, automation, configuration, multicluster, multi-mesh, gateway, federated, global, identifer

Secure Webhook Management

Istio has two webhooks: Galley and the sidecar injector. Galley validates Kubernetes resources and the sidecar injector injects sidecar containers into pods.

By default, Galley and the sidecar injector manage their own webhook configurations. This can pose a security risk if they are compromised, for example, through buffer overflow attacks. Configuring a webhook is a highly privileged operation as a webhook may monitor and mutate all Kubernetes resources.

In the following example, the attacker compromises Galley and modifies the webhook configuration of Galley to eavesdrop on all Kubernetes secrets (the clientConfig is modified by the attacker to direct the secrets resources to a service owned by the attacker).

An example attack
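
As a rough sketch of the attack described above, a tampered validating webhook configuration could look like this; the attacker-owned service, namespace and path are hypothetical, and the resource is abbreviated:

apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: istio-galley
webhooks:
- name: pilot.validation.istio.io
  clientConfig:
    # An attacker who can edit this section can redirect admission requests,
    # and the resources they carry, to an arbitrary service.
    service:
      name: exfiltration-svc        # hypothetical attacker-owned service
      namespace: attacker-ns        # hypothetical namespace
      path: /capture
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["secrets"]          # widened to receive all Secret writes
  failurePolicy: Ignore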

To protect against this kind of attack, Istio 1.4 introduces a new feature to securely manage webhooks using istioctl:

  1. istioctl, instead of Galley and the sidecar injector, manages the webhook configurations. Galley and the sidecar injector are de-privileged, so even if they are compromised, they will not be able to alter the webhook configurations.

  2. Before configuring a webhook, istioctl will verify the webhook server is up and that the certificate chain used by the webhook server is valid. This reduces the errors that can occur before a server is ready or if a server has invalid certificates.

To try this new feature, refer to the Istio webhook management task.

Published: Thu, 14 Nov 2019 | /v1.24/blog/2019/webhook/ | Tags: security, kubernetes, webhook

Introducing the Istio v1beta1 Authorization Policy

Istio 1.4 introduces the v1beta1 authorization policy, which is a major update to the previous v1alpha1 role-based access control (RBAC) policy. The new policy provides these improvements:

  • Aligns with Istio configuration model.
  • Improves the user experience by simplifying the API.
  • Supports more use cases (e.g. Ingress/Egress gateway support) without added complexity.

The v1beta1 policy is not backward compatible and requires a one-time conversion. A tool is provided to automate this process. The previous configuration resources ClusterRbacConfig, ServiceRole, and ServiceRoleBinding will not be supported from Istio 1.6 onwards.

This post describes the new v1beta1 authorization policy model, its design goals and the migration from v1alpha1 RBAC policies. See the authorization concept page for a detailed in-depth explanation of the v1beta1 authorization policy.

We welcome your feedback about the v1beta1 authorization policy at discuss.istio.io.

Background

To date, Istio has provided RBAC policies to enforce access control on services using three configuration resources: ClusterRbacConfig, ServiceRole and ServiceRoleBinding. With this API, users have been able to enforce access control at the mesh, namespace and service level. Like other RBAC policies, Istio RBAC uses the same concepts of role and binding for granting permissions to identities.

Although Istio RBAC has been working reliably, we’ve found that many improvements were possible.

For example, users have mistakenly assumed that access control is enforced at the service level because ServiceRole uses a service to specify where to apply the policy. However, the policy is actually applied on workloads; the service is only used to find the corresponding workload. This nuance is significant when multiple services refer to the same workload: a ServiceRole for service A will also affect service B if the two services refer to the same workload, which can cause confusion and incorrect configuration.

Another example is that it has proven difficult for users to maintain and manage Istio RBAC configurations because of the need to deeply understand three related resources.

Design goals

The new v1beta1 authorization policy had several design goals:

  • Align with Istio Configuration Model for better clarity on the policy target. The configuration model provides a unified configuration hierarchy, resolution and target selection.

  • Improve the user experience by simplifying the API. It’s easier to manage one custom resource definition (CRD) that includes all access control specifications, instead of multiple CRDs.

  • Support more use cases without added complexity. For example, allow the policy to be applied on Ingress/Egress gateway to enforce access control for traffic entering/exiting the mesh.

AuthorizationPolicy

An AuthorizationPolicy custom resource enables access control on workloads. This section gives an overview of the changes in the v1beta1 authorization policy.

An AuthorizationPolicy includes a selector and a list of rules. The selector specifies the workloads to which the policy applies, and the rules specify the detailed access control for those workloads.

Rules are additive, which means a request is allowed if any rule allows it. Each rule includes lists of from, to and when clauses, which specify who is allowed to do what under which conditions.

The selector replaces the functionality provided by ClusterRbacConfig and the services field in ServiceRole. The rules replace the other fields in ServiceRole and ServiceRoleBinding.

Example

The following authorization policy applies to workloads with app: httpbin and version: v1 label in the foo namespace:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: httpbin
 namespace: foo
spec:
 selector:
   matchLabels:
     app: httpbin
     version: v1
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/default/sa/sleep"]
   to:
   - operation:
       methods: ["GET"]
   when:
   - key: request.headers[version]
     values: ["v1", "v2"]

The policy allows principal cluster.local/ns/default/sa/sleep to access the workload using the GET method when the request includes a version header of value v1 or v2. Any requests not matched with the policy will be denied by default.

Assuming the httpbin service is defined as:

apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: foo
spec:
  selector:
    app: httpbin
    version: v1
  ports:
    # omitted

You would need to configure three resources to achieve the same result in v1alpha1:

apiVersion: "rbac.istio.io/v1alpha1"
kind: ClusterRbacConfig
metadata:
  name: default
spec:
  mode: 'ON_WITH_INCLUSION'
  inclusion:
    services: ["httpbin.foo.svc.cluster.local"]
---
apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: httpbin
  namespace: foo
spec:
  rules:
  - services: ["httpbin.foo.svc.cluster.local"]
    methods: ["GET"]
    constraints:
    - key: request.headers[version]
      values: ["v1", "v2"]
---
apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: httpbin
  namespace: foo
spec:
  subjects:
  - user: "cluster.local/ns/default/sa/sleep"
  roleRef:
    kind: ServiceRole
    name: "httpbin"

Workload selector

A major change in the v1beta1 authorization policy is that it now uses workload selector to specify where to apply the policy. This is the same workload selector used in the Gateway, Sidecar and EnvoyFilter configurations.

The workload selector makes it clear that the policy is applied and enforced on workloads instead of services. If a policy applies to a workload that is used by multiple different services, the same policy will affect the traffic to all the different services.

You can simply leave the selector empty to apply the policy to all workloads in a namespace. The following policy applies to all workloads in the namespace bar:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: policy
 namespace: bar
spec:
 rules:
 # omitted

Root namespace

A policy in the root namespace applies to all workloads in the mesh, in every namespace. The root namespace is configurable in the MeshConfig and has the default value of istio-system.
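
As a minimal sketch, assuming you configure mesh options through MeshConfig, changing the root namespace is a single field; how this fragment is supplied depends on your installation method:

# MeshConfig fragment: policies in this namespace apply mesh-wide
rootNamespace: istio-config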

For example, suppose you installed Istio in the istio-system namespace, deployed workloads in the default and bookinfo namespaces, and changed the root namespace from its default value to istio-config. The following policy will apply to workloads in every namespace, including default, bookinfo and istio-system:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: policy
 namespace: istio-config
spec:
 rules:
 # omitted

Ingress/Egress Gateway support

The v1beta1 authorization policy can also be applied on an ingress or egress gateway to enforce access control on traffic entering or leaving the mesh; you only need to change the selector to select the ingress or egress gateway workload.

The following policy applies to workloads with the app: istio-ingressgateway label:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: ingress
 namespace: istio-system
spec:
 selector:
   matchLabels:
     app: istio-ingressgateway
 rules:
 # omitted

Remember the authorization policy only applies to workloads in the same namespace as the policy, unless the policy is applied in the root namespace:

  • If you don’t change the default root namespace value (i.e. istio-system), the above policy will apply to workloads with the app: istio-ingressgateway label in every namespace.

  • If you have changed the root namespace to a different value, the above policy will only apply to workloads with the app: istio-ingressgateway label in the istio-system namespace.

Comparison

The following table highlights the key differences between the old v1alpha1 RBAC policies and the new v1beta1 authorization policy.

Feature | v1alpha1 RBAC policy | v1beta1 Authorization Policy
API stability | alpha: not backward compatible | beta: backward compatibility guaranteed
Number of CRDs | three: ClusterRbacConfig, ServiceRole and ServiceRoleBinding | one: AuthorizationPolicy
Policy target | service | workload
Deny-by-default behavior | enabled explicitly by configuring ClusterRbacConfig | enabled implicitly with AuthorizationPolicy
Ingress/Egress gateway support | not supported | supported
The "*" value in policy | matches all contents (empty and non-empty) | matches non-empty contents only

The following tables show the relationship between the v1alpha1 and v1beta1 API.

ClusterRbacConfig

ClusterRbacConfig.Mode | AuthorizationPolicy
OFF | No policy applied
ON | A deny-all policy applied in the root namespace
ON_WITH_INCLUSION | Policies should be applied to the namespaces or workloads included by the ClusterRbacConfig
ON_WITH_EXCLUSION | Policies should be applied to the namespaces or workloads excluded by the ClusterRbacConfig
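
For example, the ON mode maps to a deny-all policy in the root namespace. Assuming the default root namespace istio-system, such a policy could be written as follows (an AuthorizationPolicy with an empty spec matches nothing and therefore denies all requests, as shown again in the migration example below):

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: deny-all
 namespace: istio-system
spec:
 {}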

ServiceRole

ServiceRole | AuthorizationPolicy
services | selector
paths | paths in to
methods | methods in to
destination.ip in constraint | Not supported
destination.port in constraint | ports in to
destination.labels in constraint | selector
destination.namespace in constraint | Replaced by the namespace of the policy, i.e. the namespace in metadata
destination.user in constraint | Not supported
experimental.envoy.filters in constraint | experimental.envoy.filters in when
request.headers in constraint | request.headers in when

ServiceRoleBinding

ServiceRoleBinding | AuthorizationPolicy
user | principals in from
group | request.auth.claims[group] in when
source.ip in property | ipBlocks in from
source.namespace in property | namespaces in from
source.principal in property | principals in from
request.headers in property | request.headers in when
request.auth.principal in property | requestPrincipals in from or request.auth.principal in when
request.auth.audiences in property | request.auth.audiences in when
request.auth.presenter in property | request.auth.presenter in when
request.auth.claims in property | request.auth.claims in when

Beyond all the differences, the v1beta1 policy is enforced by the same engine in Envoy and supports the same authenticated identities (mutual TLS or JWT), conditions and other primitives (e.g. IP and port) as the v1alpha1 policy.
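
For instance, a v1beta1 rule that keys off a JWT identity and claim, mirroring the request.auth.* mappings in the tables above, could look like this sketch (the issuer pattern and claim value are illustrative assumptions):

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: httpbin-jwt
 namespace: foo
spec:
 selector:
   matchLabels:
     app: httpbin
 rules:
 - from:
   - source:
       requestPrincipals: ["https://accounts.example.com/*"]   # hypothetical issuer/subject pattern
   when:
   - key: request.auth.claims[group]
     values: ["admin"]                                         # hypothetical claim value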

Future of the v1alpha1 policy

The v1alpha1 RBAC policy (ClusterRbacConfig, ServiceRole, and ServiceRoleBinding) is deprecated by the v1beta1 authorization policy.

Istio 1.4 continues to support the v1alpha1 RBAC policy to give you enough time to move away from the alpha policies.

Migration from the v1alpha1 policy

Istio only supports one of the two versions for a given workload:

  • If there is only v1beta1 policy for a workload, the v1beta1 policy will be used.
  • If there is only v1alpha1 policy for a workload, the v1alpha1 policy will be used.
  • If there are both v1beta1 and v1alpha1 policies for a workload, only the v1beta1 policy will be used and the v1alpha1 policy will be ignored.

General Guideline

The typical flow for migrating to the v1beta1 policy is to start by checking the ClusterRbacConfig to determine which namespaces or services have RBAC enabled.

For each service enabled with RBAC:

  1. Get the workload selector from the service definition.
  2. Create a v1beta1 policy with the workload selector.
  3. Update the v1beta1 policy for each ServiceRole and ServiceRoleBinding applied to the service.
  4. Apply the v1beta1 policy and monitor the traffic to make sure the policy is working as expected.
  5. Repeat the process for the next service enabled with RBAC.

For each namespace enabled with RBAC:

  1. Apply a v1beta1 policy that denies all traffic to the given namespace.

Migration Example

Assume you have the following v1alpha1 policies for the httpbin service in the foo namespace:

apiVersion: "rbac.istio.io/v1alpha1"
kind: ClusterRbacConfig
metadata:
  name: default
spec:
  mode: 'ON_WITH_INCLUSION'
  inclusion:
    namespaces: ["foo"]
---
apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: httpbin
  namespace: foo
spec:
  rules:
  - services: ["httpbin.foo.svc.cluster.local"]
    methods: ["GET"]
---
apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: httpbin
  namespace: foo
spec:
  subjects:
  - user: "cluster.local/ns/default/sa/sleep"
  roleRef:
    kind: ServiceRole
    name: "httpbin"

Migrate the above policies to v1beta1 in the following ways:

  1. Assume the httpbin service has the following workload selector:

    selector:
      app: httpbin
      version: v1
  2. Create a v1beta1 policy with the workload selector:

    apiVersion: security.istio.io/v1beta1
    kind: AuthorizationPolicy
    metadata:
     name: httpbin
     namespace: foo
    spec:
     selector:
       matchLabels:
         app: httpbin
         version: v1
  3. Update the v1beta1 policy with each ServiceRole and ServiceRoleBinding applied to the service:

    apiVersion: security.istio.io/v1beta1
    kind: AuthorizationPolicy
    metadata:
     name: httpbin
     namespace: foo
    spec:
     selector:
       matchLabels:
         app: httpbin
         version: v1
     rules:
     - from:
       - source:
           principals: ["cluster.local/ns/default/sa/sleep"]
       to:
       - operation:
           methods: ["GET"]
  4. Apply the v1beta1 policy and monitor the traffic to make sure it works as expected.

  5. Apply the following v1beta1 policy that denies all traffic to the foo namespace because the foo namespace is enabled with RBAC:

    apiVersion: security.istio.io/v1beta1
    kind: AuthorizationPolicy
    metadata:
     name: deny-all
     namespace: foo
    spec:
     {}

Make sure the v1beta1 policy is working as expected and then you can delete the v1alpha1 policies from the cluster.

Automation of the Migration

To help ease the migration, the istioctl experimental authz convert command is provided to automatically convert the v1alpha1 policies to the v1beta1 policy.

You can evaluate the command but it is experimental in Istio 1.4 and doesn’t support the full v1alpha1 semantics as of the date of this blog post.

The command to support the full v1alpha1 semantics is expected in a patch release following Istio 1.4.

Published: Thu, 14 Nov 2019 | /v1.24/blog/2019/v1beta1-authorization-policy/ | Tags: security, RBAC, access control, authorization

Introducing the Istio Operator

Kubernetes operators provide a pattern for encoding human operational knowledge in software and are a popular way to simplify the administration of software infrastructure components. Istio is a natural candidate for an automated operator as it is challenging to administer.

Up until now, Helm has been the primary tool to install and upgrade Istio. Istio 1.4 introduces a new method of installation using istioctl. This new installation method builds on the strengths of Helm with the addition of the following:

  • Users only need to install one tool: istioctl
  • All API fields are validated
  • Small customizations not in the API don’t require chart or API changes
  • Version specific upgrade hooks can be easily and robustly implemented

The Helm installation method is in the process of being deprecated. Upgrading from Istio 1.4 onwards, for installations not originally installed with Helm, will also be handled by a new istioctl upgrade feature.

The new istioctl installation commands use a custom resource to configure the installation. The custom resource is part of a new Istio operator implementation intended to simplify the common administrative tasks of installation, upgrade, and complex configuration changes for Istio. Validation and checking for installation and upgrade is tightly integrated with the tools to prevent common errors and simplify troubleshooting.

The Operator API

Every operator implementation requires a custom resource definition (CRD) to define its custom resource, that is, its API. Istio’s operator API is defined by the IstioControlPlane CRD, which is generated from an IstioControlPlane proto. The API supports all of Istio’s current configuration profiles using a single field to select the profile. For example, the following IstioControlPlane resource configures Istio using the demo profile:

apiVersion: install.istio.io/v1alpha2
kind: IstioControlPlane
metadata:
  namespace: istio-operator
  name: example-istiocontrolplane
spec:
  profile: demo

You can then customize the configuration with additional settings. For example, to disable telemetry:

apiVersion: install.istio.io/v1alpha2
kind: IstioControlPlane
metadata:
  namespace: istio-operator
  name: example-istiocontrolplane
spec:
  profile: demo
  telemetry:
    enabled: false

Installing with istioctl

The recommended way to use the Istio operator API is through a new set of istioctl commands. For example, to install Istio into a cluster:

$ istioctl manifest apply -f <your-istiocontrolplane-customresource>

Make changes to the installation configuration by editing the configuration file and executing istioctl manifest apply again.

To upgrade to a new version of Istio:

$ istioctl x upgrade -f <your-istiocontrolplane-config-changes>

In addition to specifying the complete configuration in an IstioControlPlane resource, the istioctl commands can also be passed individual settings using a --set flag:

$ istioctl manifest apply --set telemetry.enabled=false

There are also a number of other istioctl commands that, for example, help you list, display, and compare configuration profiles and manifests.

Refer to the Istio install instructions for more details.

Istio Controller (alpha)

Operator implementations use a Kubernetes controller to continuously monitor their custom resource and apply the corresponding configuration changes. The Istio controller monitors an IstioControlPlane resource and reacts to changes by updating the Istio installation configuration in the corresponding cluster.

In the 1.4 release, the Istio controller is in the alpha phase of development and not fully integrated with istioctl. It is, however, available for experimentation using kubectl commands. For example, to install the controller and a default version of Istio into your cluster, run the following command:

$ kubectl apply -f https://<repo URL>/operator.yaml
$ kubectl apply -f https://<repo URL>/default-cr.yaml

You can then make changes to the Istio installation configuration:

$ kubectl edit istiocontrolplane example-istiocontrolplane -n istio-system

As soon as the resource is updated, the controller will detect the changes and respond by updating the Istio installation correspondingly.

Both the operator controller and istioctl commands share the same implementation. The significant difference is the execution context. In the istioctl case, the operation runs in the admin user’s command execution and security context. In the controller case, a pod in the cluster runs the code in its security context. In both cases, configuration is validated against a schema and the same correctness checks are performed.

Migration from Helm

To help ease the transition from previous configurations using Helm, istioctl and the controller support pass-through access for the full Helm installation API.

You can pass Helm configuration options using istioctl --set by prepending the string values. to the option name. For example, instead of this Helm command:

$ helm template ... --set global.mtls.enabled=true

You can use this istioctl command:

$ istioctl manifest generate ... --set values.global.mtls.enabled=true

You can also set Helm configuration values in an IstioControlPlane custom resource. See Customize Istio settings using Helm for details.
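
As a rough sketch of what that can look like (the exact nesting of the pass-through values field is an assumption; see the linked guide for the authoritative layout):

apiVersion: install.istio.io/v1alpha2
kind: IstioControlPlane
metadata:
  namespace: istio-operator
  name: example-istiocontrolplane
spec:
  profile: demo
  # Helm pass-through values, comparable to --set values.global.mtls.enabled=true
  values:
    global:
      mtls:
        enabled: true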

Another feature to help with the transition from Helm is the alpha istioctl manifest migrate command. This command can be used to automatically convert a Helm values.yaml file to a corresponding IstioControlPlane configuration.

Implementation

Several frameworks have been created to help implement operators by generating stubs for some or all of the components. The Istio operator was created with the help of a combination of kubebuilder and operator framework. Istio’s installation now uses a proto to describe the API such that runtime validation can be executed against a schema.

More information about the implementation can be found in the README and ARCHITECTURE documents in the Istio operator repository.

Summary

Starting in Istio 1.4, Helm installation is being replaced by new istioctl commands using a new operator custom resource definition, IstioControlPlane, for the configuration API. An alpha controller is also available for early experimentation with the operator.

The new istioctl commands and operator controller both validate configuration schemas and perform a range of checks for installation change or upgrade. These checks are tightly integrated with the tools to prevent common errors and simplify troubleshooting.

The Istio maintainers expect that this new approach will improve the user experience during Istio installation and upgrade, better stabilize the installation API, and help users better manage and monitor their Istio installations.

We welcome your feedback about the new installation approach at discuss.istio.io.

Published: Thu, 14 Nov 2019 | /v1.24/blog/2019/introducing-istio-operator/ | Tags: install, configuration, istioctl, operator

Introducing istioctl analyze

Istio 1.4 introduces an experimental new tool to help you analyze and debug your clusters running Istio.

istioctl analyze is a diagnostic tool that detects potential issues with your Istio configuration, as well as gives general insights to improve your configuration. It can run against a live cluster or a set of local configuration files. It can also run against a combination of the two, allowing you to catch problems before you apply changes to a cluster.

To get started with it in just minutes, head over to the documentation.

Designed to be approachable for novice users

One of the key design goals that we followed for this feature is to make it extremely approachable. This is achieved by making the command useful without requiring any complex parameters.

In practice, here are some of the scenarios that it goes after:

  • “There is some problem with my cluster, but I have no idea where to start”
  • “Things are generally working, but I’m wondering if there is anything I could improve”

In that sense, it is very different from some of the more advanced diagnostic tools, which go after scenarios along the lines of (taking istioctl proxy-config as an example):

  • “Show me the Envoy configuration for this specific pod so I can see if anything looks wrong”

This can be very useful for advanced debugging, but it requires a lot of expertise before you can figure out that you need to run this specific command, and which pod to run it on.

So really, the one-line pitch for analyze is: just run it! It’s completely safe, it takes no thinking, it might help you, and at worst, you’ll have wasted a minute!

Improving this tool over time

In Istio 1.4, analyze comes with a nice set of analyzers that can detect a number of common issues. But this is just the beginning, and we are planning to keep growing and fine tuning the analyzers with each release.

In fact, we would welcome suggestions from Istio users. Specifically, if you encounter a situation where you think an issue could be detected via configuration analysis, but is not currently flagged by analyze, please do let us know. The best way to do this is to open an issue on GitHub.

Published: Thu, 14 Nov 2019 | /v1.24/blog/2019/introducing-istioctl-analyze/ | Tags: debugging, istioctl, configuration

DNS Certificate Management

By default, Citadel manages the DNS certificates of the Istio control plane. Citadel is a large component that maintains its own private signing key, and acts as a Certificate Authority (CA).

New in Istio 1.4, we introduce a feature to securely provision and manage DNS certificates signed by the Kubernetes CA, which has the following advantages.

  • Lighter weight DNS certificate management with no dependency on Citadel.

  • Unlike Citadel, this feature doesn’t maintain a private signing key, which enhances security.

  • Simplified root certificate distribution to TLS clients. Clients no longer need to wait for Citadel to generate and distribute its CA certificate.

The following diagram shows the architecture of provisioning and managing DNS certificates in Istio. Chiron is the component that provisions and manages DNS certificates in Istio.

The architecture of provisioning and managing DNS certificates in Istio

To try this new feature, refer to the DNS certificate management task.

Published: Thu, 14 Nov 2019 | /v1.24/blog/2019/dns-cert/ | Tags: security, kubernetes, certificates, DNS

Announcing Istio client-go

We are pleased to announce the initial release of the Istio client go repository which enables developers to gain programmatic access to Istio APIs in a Kubernetes environment. The generated Kubernetes informers and client set in this repository make it easy for developers to create controllers and perform Create, Read, Update and Delete (CRUD) operations for all Istio Custom Resource Definitions (CRDs).

This functionality was highly requested by many Istio users, as is evident from the feature requests on the clients generated by Aspen Mesh and the Knative project. If you’re currently using one of the above-mentioned clients, you can easily switch to using Istio client go like this:

import (
  ...
  - versionedclient "github.com/aspenmesh/istio-client-go/pkg/client/clientset/versioned"
  + versionedclient "istio.io/client-go/pkg/clientset/versioned"
)

As the generated client sets are functionally equivalent, switching the imported client libraries should be sufficient in order to consume the newly generated library.

How to use client-go

The Istio client go repository follows the same branching strategy as the Istio API repository, as the client repository depends on the API definitions. If you want to use a stable client set, you can use the release branches or tagged versions in the client go repository. Using the client set is very similar to using the Kubernetes client go. Here’s a quick example of using the client to list all Istio virtual services in the passed namespace:

package main

import (
  "log"
  "os"

  metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
  "k8s.io/client-go/tools/clientcmd"

  versionedclient "istio.io/client-go/pkg/clientset/versioned"
)

func main() {
  kubeconfig := os.Getenv("KUBECONFIG")
  namespace := os.Getenv("NAMESPACE")
  if len(kubeconfig) == 0 || len(namespace) == 0 {
    log.Fatalf("Environment variables KUBECONFIG and NAMESPACE need to be set")
  }
  restConfig, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
  if err != nil {
    log.Fatalf("Failed to create k8s rest client: %s", err)
  }

  ic, err := versionedclient.NewForConfig(restConfig)
  if err != nil {
    log.Fatalf("Failed to create istio client: %s", err)
  }
  // Print all VirtualServices
  vsList, err := ic.NetworkingV1alpha3().VirtualServices(namespace).List(metav1.ListOptions{})
  if err != nil {
    log.Fatalf("Failed to get VirtualService in %s namespace: %s", namespace, err)
  }
  for i := range vsList.Items {
    vs := vsList.Items[i]
    log.Printf("Index: %d VirtualService Hosts: %+v\n", i, vs.Spec.GetHosts())
  }
}

You can find a more in-depth example here.

Useful tools created for generating Istio client-go

If you’re wondering why it took so long, or why it was difficult to generate this client set, this section is for you. In Istio, we use protobuf specifications to write APIs, which are then converted to Go definitions using the protobuf tool chain. There are three major challenges you might face if you’re trying to generate a Kubernetes client set from a protobuf-generated API:

  • Creating Kubernetes Wrapper Types - The Kubernetes client generation library only works for Go objects that follow the Kubernetes object specification, for example the Authentication Policy Kubernetes wrappers. This means that for every API which needs programmatic access, you need to create these wrappers. Additionally, there is a fair amount of boilerplate needed for every CRD group, version and kind that needs client code generation. To automate this process, we created a Kubernetes type generator tool which can automatically create the Kubernetes types based on annotations. The annotations parsed by this tool and the various available options are explained in the README. Note that if you’re using protobuf tools to generate Go types, you need to add these annotations as comments in the proto files, so that the comments are present in the generated Go files which are then used by this tool.

  • Generating deep copy methods - In the Kubernetes client machinery, if you want to mutate any object returned from the client set, you are required to make a copy of the object to prevent modifying the object in-place in the cache store. The canonical way to do this is to create a deepcopy method on all nested types. We created a tool, protoc deep copy generator, which is a protoc plugin that can automatically create deepcopy methods based on annotations, using the Proto library utility Proto Clone. Here’s an example of the generated deepcopy method.

  • Marshaling and Unmarshaling types to/from JSON - For the types generated from proto definitions, it is often problematic to use the default Go JSON encoder/decoder, as there are various fields like protobuf’s oneof which require special handling. Additionally, any proto fields with underscores in their names might serialize/deserialize to different field names depending on the encoder/decoder, as the Go struct tags are generated differently. It is always recommended to use protobuf primitives for serializing/deserializing to JSON instead of relying on the default Go library. We created a tool, protoc JSON shim, which is a protoc plugin that can automatically create marshalers/unmarshalers for all Go types generated from proto definitions. Here’s an example of the code generated by this tool.

I’m hoping that the newly released client library enables users to create more integrations and controllers for the Istio APIs, and the tools mentioned above can be used by developers to generate Kubernetes client set from Proto APIs.

Published: Thu, 14 Nov 2019 | /v1.24/blog/2019/announcing-istio-client-go/ | Tags: client-go, tools, crd

Istio as a Proxy for External Services

The Control Ingress Traffic and the Ingress Gateway without TLS Termination tasks describe how to configure an ingress gateway to expose services inside the mesh to external traffic. The services can be HTTP or HTTPS. In the case of HTTPS, the gateway passes the traffic through, without terminating TLS.

This blog post describes how to use the same ingress gateway mechanism of Istio to enable access to external services, rather than to applications inside the mesh. This way, Istio as a whole can serve simply as a proxy server, with the added value of observability, traffic management and policy enforcement.

This post shows how to configure access to an HTTP and an HTTPS external service, namely httpbin.org and edition.cnn.com.

Configure an ingress gateway

  1. Define an ingress gateway with a servers: section configuring ports 80 and 443. Ensure mode: is set to PASSTHROUGH for tls: on port 443, which instructs the gateway to pass the ingress traffic AS IS, without terminating TLS.

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: proxy
    spec:
      selector:
        istio: ingressgateway # use istio default ingress gateway
      servers:
      - port:
          number: 80
          name: http
          protocol: HTTP
        hosts:
        - httpbin.org
      - port:
          number: 443
          name: tls
          protocol: TLS
        tls:
          mode: PASSTHROUGH
        hosts:
        - edition.cnn.com
    EOF
  2. Create service entries for the httpbin.org and edition.cnn.com services to make them accessible from the ingress gateway:

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: httpbin-ext
    spec:
      hosts:
      - httpbin.org
      ports:
      - number: 80
        name: http
        protocol: HTTP
      resolution: DNS
      location: MESH_EXTERNAL
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: cnn
    spec:
      hosts:
      - edition.cnn.com
      ports:
      - number: 443
        name: tls
        protocol: TLS
      resolution: DNS
      location: MESH_EXTERNAL
    EOF
  3. Create a service entry and configure a destination rule for the localhost service. You need this service entry in the next step as a destination for traffic to the external services from applications inside the mesh, in order to block that traffic from inside the mesh. In this example you use Istio as a proxy between external applications and external services.

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: localhost
    spec:
      hosts:
      - localhost.local
      location: MESH_EXTERNAL
      ports:
      - number: 80
        name: http
        protocol: HTTP
      - number: 443
        name: tls
        protocol: TLS
      resolution: STATIC
      endpoints:
      - address: 127.0.0.1
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: localhost
    spec:
      host: localhost.local
      trafficPolicy:
        tls:
          mode: DISABLE
          sni: localhost.local
    EOF
  4. Create a virtual service for each external service to configure routing to it. Both virtual services include the proxy gateway in the gateways: section and in the match: section for HTTP and HTTPS traffic respectively.

    Notice the route: section for the mesh gateway, the gateway that represents the applications inside the mesh. The route: for the mesh gateway shows how the traffic is directed to the localhost.local service, effectively blocking the traffic.

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: httpbin
    spec:
      hosts:
      - httpbin.org
      gateways:
      - proxy
      - mesh
      http:
      - match:
        - gateways:
          - proxy
          port: 80
          uri:
            prefix: /status
        route:
        - destination:
            host: httpbin.org
            port:
              number: 80
      - match:
        - gateways:
          - mesh
          port: 80
        route:
        - destination:
            host: localhost.local
            port:
              number: 80
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: cnn
    spec:
      hosts:
      - edition.cnn.com
      gateways:
      - proxy
      - mesh
      tls:
      - match:
        - gateways:
          - proxy
          port: 443
          sni_hosts:
          - edition.cnn.com
        route:
        - destination:
            host: edition.cnn.com
            port:
              number: 443
      - match:
        - gateways:
          - mesh
          port: 443
          sni_hosts:
          - edition.cnn.com
        route:
        - destination:
            host: localhost.local
            port:
              number: 443
    EOF
  5. Enable Envoy’s access logging.

  6. Follow the instructions in Determining the ingress IP and ports to define the INGRESS_PORT, SECURE_INGRESS_PORT and INGRESS_HOST environment variables.

  7. Access the httpbin.org service through your ingress IP and port which you stored in the $INGRESS_HOST and $INGRESS_PORT environment variables, respectively, during the previous step. Access the /status/418 path of the httpbin.org service that returns the HTTP status 418 I’m a teapot.

    $ curl $INGRESS_HOST:$INGRESS_PORT/status/418 -Hhost:httpbin.org
    
    -=[ teapot ]=-
    
       _...._
     .'  _ _ `.
    | ."` ^ `". _,
    \_;`"---"`|//
      |       ;/
      \_     _/
        `"""`
  8. If the Istio ingress gateway is deployed in the istio-system namespace, print the gateway’s log with the following command:

    $ kubectl logs -l istio=ingressgateway -c istio-proxy -n istio-system | grep 'httpbin.org'
  9. Search the log for an entry similar to:

    [2019-01-31T14:40:18.645Z] "GET /status/418 HTTP/1.1" 418 - 0 135 187 186 "10.127.220.75" "curl/7.54.0" "28255618-6ca5-9d91-9634-c562694a3625" "httpbin.org" "34.232.181.106:80" outbound|80||httpbin.org - 172.30.230.33:80 10.127.220.75:52077 -
  10. Access the edition.cnn.com service through your ingress gateway:

    $ curl -s --resolve edition.cnn.com:$SECURE_INGRESS_PORT:$INGRESS_HOST https://edition.cnn.com:$SECURE_INGRESS_PORT | grep -o "<title>.*</title>"
    <title>CNN International - Breaking News, US News, World News and Video</title>
  11. If the Istio ingress gateway is deployed in the istio-system namespace, print the gateway’s log with the following command:

    $ kubectl logs -l istio=ingressgateway -c istio-proxy -n istio-system | grep 'edition.cnn.com'
  12. Search the log for an entry similar to:

    [2019-01-31T13:40:11.076Z] "- - -" 0 - 589 17798 1644 - "-" "-" "-" "-" "172.217.31.132:443" outbound|443||edition.cnn.com 172.30.230.33:54508 172.30.230.33:443 10.127.220.75:49467 edition.cnn.com

Cleanup

Remove the gateway, the virtual services and the service entries:

$ kubectl delete gateway proxy
$ kubectl delete virtualservice cnn httpbin
$ kubectl delete serviceentry cnn httpbin-ext localhost
$ kubectl delete destinationrule localhost

Published: Tue, 15 Oct 2019 | /v1.24/blog/2019/proxy/ | Tags: traffic-management, ingress, https, http

Multi-Mesh Deployments for Isolation and Boundary Protection

Various compliance standards require protection of sensitive data environments. Some of the important standards and the types of sensitive data they protect appear in the following table:

Standard Sensitive data
PCI DSS payment card data
FedRAMP federal information, data and metadata
HIPAA personal health data
GDPR personal data

PCI DSS, for example, recommends putting the cardholder data environment on a network separate from the rest of the system. It also requires using a DMZ, and setting firewalls between the public Internet and the DMZ, and between the DMZ and the internal network.

Isolation of sensitive data environments from other information systems can reduce the scope of compliance checks and improve the security of the sensitive data. Reducing the scope reduces the risk of failing a compliance check and reduces the cost of compliance, since there are fewer components to check and secure according to compliance requirements.

You can achieve isolation of sensitive data by separating the parts of the application that process that data into a separate service mesh, preferably on a separate network, and then connect the meshes with different compliance requirements together in a multi-mesh deployment. The process of connecting inter-mesh applications is called mesh federation.

Note that using mesh federation to create a multi-mesh deployment is very different than creating a multicluster deployment, which defines a single service mesh composed from services spanning more than one cluster. Unlike multi-mesh, a multicluster deployment is not suitable for applications that require isolation and boundary protection.

In this blog post I describe the requirements for isolation and boundary protection, and outline the principles of multi-mesh deployments. Finally, I touch on the current state of mesh-federation support and automation work under way for Istio.

Isolation and boundary protection

Isolation and boundary protection mechanisms are explained in the NIST Special Publication 800-53, Revision 4, Security and Privacy Controls for Federal Information Systems and Organizations, Appendix F, Security Control Catalog, SC-7 Boundary Protection.

In particular, see the Boundary protection, isolation of information system components control enhancement.

Various compliance standards recommend isolating environments that process sensitive data from the rest of the organization. The Payment Card Industry (PCI) Data Security Standard recommends implementing network isolation for the cardholder data environment and requires isolating this environment from the DMZ. FedRAMP Authorization Boundary Guidance describes the authorization boundary for federal information and data, while NIST Special Publication 800-37, Revision 2, Risk Management Framework for Information Systems and Organizations: A System Life Cycle Approach for Security and Privacy recommends protecting such a boundary in Appendix G, Authorization Boundary Considerations.

Boundary protection, in particular, means:

  • put an access control mechanism at the boundary (firewall, gateway, etc.)
  • monitor the incoming/outgoing traffic at the boundary
  • all the access control mechanisms must be deny-all by default
  • do not expose private IP addresses from the boundary
  • do not let components from outside the boundary impact security inside the boundary

Multi-mesh deployments facilitate division of a system into subsystems with different security and compliance requirements, and facilitate the boundary protection. You put each subsystem into a separate service mesh, preferably on a separate network. You connect the Istio meshes using gateways. The gateways monitor and control cross-mesh traffic at the boundary of each mesh.

Features of multi-mesh deployments

  • non-uniform naming. The withdraw service in the accounts namespace in one mesh might have different functionality and APIs than the withdraw services in the accounts namespaces of other meshes. Such a situation could happen in an organization where there is no uniform policy on naming of namespaces and services, or when the meshes belong to different organizations.
  • expose-nothing by default. None of the services in a mesh are exposed by default, the mesh owners must explicitly specify which services are exposed.
  • boundary protection. The access control of the traffic must be enforced at the ingress gateway, which stops forbidden traffic from entering the mesh. This requirement implements Defense-in-depth principle and is part of some compliance standards, such as the Payment Card Industry (PCI) Data Security Standard.
  • common trust may not exist. The Istio sidecars in one mesh may not trust the Citadel certificates in other meshes, due to some security requirement or due to the fact that the mesh owners did not initially plan to federate the meshes.

While expose-nothing by default and boundary protection are required to facilitate compliance and improve security, non-uniform naming and common trust may not exist are required when connecting meshes of different organizations, or of an organization that cannot enforce uniform naming or cannot or may not establish common trust between the meshes.

An optional feature that you may want to use is service location transparency: consuming services send requests to the exposed services in remote meshes using local service names. The consuming services are oblivious to the fact that some of the destinations are in remote meshes and some are local services. The access is uniform, using the local service names, for example, in Kubernetes, reviews.default.svc.cluster.local. Service location transparency is useful in the cases when you want to be able to change the location of the consumed services, for example when some service is migrated from private cloud to public cloud, without changing the code of your applications.

The current mesh-federation work

While you can perform mesh federation using standard Istio configurations already today, it requires writing a lot of boilerplate YAML files and is error-prone. There is an effort under way to automate the mesh federation process. In the meantime, you can look at these multi-mesh deployment examples to get an idea of what a generated federation might include.

Summary

In this blog post I described the requirements for isolation and boundary protection of sensitive data environments by using Istio multi-mesh deployments. I outlined the principles of Istio multi-mesh deployments and reported the current work on mesh federation in Istio.

I will be happy to hear your opinion about multi-mesh and multicluster at discuss.istio.io.

Published: Wed, 02 Oct 2019 | /v1.24/blog/2019/isolated-clusters/ | Tags: traffic-management, multicluster, security, gateway, tls

Monitoring Blocked and Passthrough External Service Traffic

Understanding, controlling and securing your external service access is one of the key benefits that you get from a service mesh like Istio. From a security and operations point of view, it is critical to monitor what external service traffic is getting blocked, as it might surface possible misconfigurations or a security vulnerability if an application is attempting to communicate with a service that it should not be allowed to. Similarly, if you currently have a policy of allowing any external service access, it is beneficial to monitor the traffic so you can incrementally add explicit Istio configuration to allow access and better secure your cluster. In either case, having visibility into this traffic via telemetry is quite helpful as it enables you to create alerts and dashboards, and better reason about your security posture. This was a highly requested feature by production users of Istio and we are excited that support for this was added in release 1.3.

To implement this, the Istio default metrics are augmented with explicit labels to capture blocked and passthrough external service traffic. This blog will cover how you can use these augmented metrics to monitor all external service traffic.

The Istio control plane configures the sidecar proxy with predefined clusters called BlackHoleCluster and PassthroughCluster, which block or allow all traffic respectively. To understand these clusters, let’s start with what external and internal services mean in the context of an Istio service mesh.

External and internal services

Internal services are defined as services which are part of your platform and are considered to be in the mesh. For internal services, the Istio control plane provides all the required configuration to the sidecars by default. For example, in Kubernetes clusters, Istio configures the sidecars for all Kubernetes services to preserve the default Kubernetes behavior of all services being able to communicate with each other.

External services are services which are not part of your platform, i.e. services which are outside of the mesh. For external services, Istio provides two options: the first blocks all external service access (enabled by setting global.outboundTrafficPolicy.mode to REGISTRY_ONLY) and the second allows all access to external services (enabled by setting global.outboundTrafficPolicy.mode to ALLOW_ANY). The default option for this setting (as of Istio 1.3) is to allow all external service access. This option can be configured via mesh configuration.
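
As a minimal sketch, the relevant mesh configuration fragment looks roughly like this; where you set it (for example, through Helm values or the mesh ConfigMap) depends on your installation:

outboundTrafficPolicy:
  mode: REGISTRY_ONLY   # block external traffic unless a ServiceEntry explicitly allows it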

This is where the BlackHole and Passthrough clusters are used.

What are BlackHole and Passthrough clusters?

  • BlackHoleCluster - The BlackHoleCluster is a virtual cluster created in the Envoy configuration when global.outboundTrafficPolicy.mode is set to REGISTRY_ONLY. In this mode, all traffic to external services is blocked unless service entries are explicitly added for each service. To implement this, the default virtual outbound listener at 0.0.0.0:15001, which uses the original destination, is set up as a TCP proxy with the BlackHoleCluster as the static cluster. The configuration for the BlackHoleCluster looks like this:

    {
      "name": "BlackHoleCluster",
      "type": "STATIC",
      "connectTimeout": "10s"
    }

    As you can see, this cluster is static with no endpoints, so all the traffic will be dropped. Additionally, Istio creates a unique listener for every port/protocol combination of the platform services; if a request is made to an external service on one of those ports, that listener is hit instead of the virtual listener. In that case, the route configuration of every virtual route in Envoy is augmented to add the BlackHoleCluster like this:

    {
      "name": "block_all",
      "domains": [
        "*"
      ],
      "routes": [
        {
          "match": {
            "prefix": "/"
          },
          "directResponse": {
            "status": 502
          }
        }
      ]
    }

    The route is set up as a direct response with a 502 response code, which means that if no other routes match, the Envoy proxy will directly return a 502 HTTP status code.

  • PassthroughCluster - The PassthroughCluster is a virtual cluster created in the Envoy configuration when global.outboundTrafficPolicy.mode is set to ALLOW_ANY. In this mode, all traffic to any external service is allowed. To implement this, the default virtual outbound listener at 0.0.0.0:15001, which uses SO_ORIGINAL_DST, is set up as a TCP proxy with the PassthroughCluster as the static cluster. The configuration for the PassthroughCluster looks like this:

    {
      "name": "PassthroughCluster",
      "type": "ORIGINAL_DST",
      "connectTimeout": "10s",
      "lbPolicy": "ORIGINAL_DST_LB",
      "circuitBreakers": {
        "thresholds": [
          {
            "maxConnections": 102400,
            "maxRetries": 1024
          }
        ]
      }
    }

    This cluster uses the original destination load balancing policy, which configures Envoy to send the traffic to the original destination, i.e. pass it through.

    Similar to the BlackHoleCluster, for every port/protocol based listener the virtual route configuration is augmented to add the PassthroughCluster as the default route:

    {
      "name": "allow_any",
      "domains": [
        "*"
      ],
      "routes": [
        {
          "match": {
            "prefix": "/"
          },
          "route": {
            "cluster": "PassthroughCluster"
          }
        }
      ]
    }

Prior to Istio 1.3, no metrics were reported when traffic hit these clusters, or, if metrics were reported, they had no explicit labels set, resulting in a lack of visibility into the traffic flowing through the mesh.

The next section covers how to take advantage of this enhancement, as the metrics and labels emitted depend on whether the virtual outbound listener or an explicit port/protocol listener is hit.

Using the augmented metrics

To capture all external service traffic in either case (BlackHole or Passthrough), you need to monitor the istio_requests_total and istio_tcp_connections_closed_total metrics. Depending on which Envoy listener type is invoked, TCP proxy or HTTP proxy, one of these metrics is incremented.

Additionally, in the case of a TCP proxy listener, to see the IP address of the external service that is blocked or allowed via the BlackHole or Passthrough cluster, you need to add the destination_ip label to the istio_tcp_connections_closed_total metric. In this scenario, the host name of the external service is not captured. This label is not added by default, but it can easily be added by augmenting the Istio configuration for attribute generation and the Prometheus handler. Be careful about cardinality explosion in the time series if you have many services with non-stable IP addresses.
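
As a hedged illustration (the rule names and aggregation labels below are assumptions for this sketch, not Istio defaults), Prometheus recording rules along these lines can aggregate the two metrics into per-workload views of external traffic:

groups:
  - name: istio-external-traffic.rules
    rules:
      # HTTP requests that fell through to the PassthroughCluster, by source workload and external host.
      - record: workload:external_http_requests:rate5m
        expr: sum(rate(istio_requests_total{destination_service_name="PassthroughCluster"}[5m])) by (source_workload, destination_service)
      # TCP connections closed against the BlackHoleCluster, i.e. blocked external access.
      - record: workload:blocked_tcp_connections:rate5m
        expr: sum(rate(istio_tcp_connections_closed_total{destination_service_name="BlackHoleCluster"}[5m])) by (source_workload, destination_ip)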

PassthroughCluster metrics

This section explains the metrics and the labels emitted based on the listener type invoked in Envoy.

  • HTTP proxy listener: This happens when the port of the external service is the same as one of the service ports defined in the cluster. In this scenario, when the PassthroughCluster is hit, istio_requests_total is incremented like this:

    {
      "metric": {
        "__name__": "istio_requests_total",
        "connection_security_policy": "unknown",
        "destination_app": "unknown",
        "destination_principal": "unknown",
        "destination_service": "httpbin.org",
        "destination_service_name": "PassthroughCluster",
        "destination_service_namespace": "unknown",
        "destination_version": "unknown",
        "destination_workload": "unknown",
        "destination_workload_namespace": "unknown",
        "instance": "100.96.2.183:42422",
        "job": "istio-mesh",
        "permissive_response_code": "none",
        "permissive_response_policyid": "none",
        "reporter": "source",
        "request_protocol": "http",
        "response_code": "200",
        "response_flags": "-",
        "source_app": "sleep",
        "source_principal": "unknown",
        "source_version": "unknown",
        "source_workload": "sleep",
        "source_workload_namespace": "default"
      },
      "value": [
        1567033080.282,
        "1"
      ]
    }

    Note that the destination_service_name label is set to PassthroughCluster to indicate that this cluster was hit and the destination_service is set to the host of the external service.

  • TCP proxy virtual listener - If the external service port doesn’t map to any HTTP-based service ports within the cluster, this listener is invoked and istio_tcp_connections_closed_total is the metric that is incremented:

    {
      "status": "success",
      "data": {
        "resultType": "vector",
        "result": [
          {
            "metric": {
              "__name__": "istio_tcp_connections_closed_total",
              "connection_security_policy": "unknown",
              "destination_app": "unknown",
              "destination_ip": "52.22.188.80",
              "destination_principal": "unknown",
              "destination_service": "unknown",
              "destination_service_name": "PassthroughCluster",
              "destination_service_namespace": "unknown",
              "destination_version": "unknown",
              "destination_workload": "unknown",
              "destination_workload_namespace": "unknown",
              "instance": "100.96.2.183:42422",
              "job": "istio-mesh",
              "reporter": "source",
              "response_flags": "-",
              "source_app": "sleep",
              "source_principal": "unknown",
              "source_version": "unknown",
              "source_workload": "sleep",
              "source_workload_namespace": "default"
            },
            "value": [
              1567033761.879,
              "1"
            ]
          }
        ]
      }
    }

    In this case, destination_service_name is set to PassthroughCluster and the destination_ip is set to the IP address of the external service. The destination_ip label can be used to do a reverse DNS lookup and get the host name of the external service. As this cluster is passthrough, other TCP related metrics like istio_tcp_connections_opened_total, istio_tcp_received_bytes_total and istio_tcp_sent_bytes_total are also updated.

BlackHoleCluster metrics

Similar to the PassthroughCluster, this section explains the metrics and the labels emitted based on the listener type invoked in Envoy.

  • HTTP proxy listener: This happens when the port of the external service is the same as one of the service ports defined in the cluster. In this scenario, when the BlackHoleCluster is hit, istio_requests_total is incremented like this:

    {
      "metric": {
        "__name__": "istio_requests_total",
        "connection_security_policy": "unknown",
        "destination_app": "unknown",
        "destination_principal": "unknown",
        "destination_service": "httpbin.org",
        "destination_service_name": "BlackHoleCluster",
        "destination_service_namespace": "unknown",
        "destination_version": "unknown",
        "destination_workload": "unknown",
        "destination_workload_namespace": "unknown",
        "instance": "100.96.2.183:42422",
        "job": "istio-mesh",
        "permissive_response_code": "none",
        "permissive_response_policyid": "none",
        "reporter": "source",
        "request_protocol": "http",
        "response_code": "502",
        "response_flags": "-",
        "source_app": "sleep",
        "source_principal": "unknown",
        "source_version": "unknown",
        "source_workload": "sleep",
        "source_workload_namespace": "default"
      },
      "value": [
        1567034251.717,
        "1"
      ]
    }

    Note the destination_service_name label is set to BlackHoleCluster and the destination_service to the host name of the external service. The response code should always be 502 in this case.

  • TCP proxy virtual listener - If the external service port doesn’t map to any HTTP-based service ports within the cluster, this listener is invoked and istio_tcp_connections_closed_total is the metric that is incremented:

    {
      "metric": {
        "__name__": "istio_tcp_connections_closed_total",
        "connection_security_policy": "unknown",
        "destination_app": "unknown",
        "destination_ip": "52.22.188.80",
        "destination_principal": "unknown",
        "destination_service": "unknown",
        "destination_service_name": "BlackHoleCluster",
        "destination_service_namespace": "unknown",
        "destination_version": "unknown",
        "destination_workload": "unknown",
        "destination_workload_namespace": "unknown",
        "instance": "100.96.2.183:42422",
        "job": "istio-mesh",
        "reporter": "source",
        "response_flags": "-",
        "source_app": "sleep",
        "source_principal": "unknown",
        "source_version": "unknown",
        "source_workload": "sleep",
        "source_workload_namespace": "default"
      },
      "value": [
        1567034481.03,
        "1"
      ]
    }

    Note that the destination_ip label represents the IP address of the external service and the destination_service_name is set to BlackHoleCluster to indicate that this traffic was blocked by the mesh. It is interesting to note that in the BlackHoleCluster case, other TCP-related metrics like istio_tcp_connections_opened_total are not incremented, as no connection is ever established.

Monitoring these metrics can help operators easily understand all the external services consumed by the applications in their cluster.

]]>
Sat, 28 Sep 2019 00:00:00 +0000/v1.24//blog/2019/monitoring-external-service-traffic//v1.24//blog/2019/monitoring-external-service-traffic/monitoringblackholepassthrough
Mixer Adapter for KnativeThis post demonstrates how you can use Mixer to push application logic into Istio. It describes a Mixer adapter which implements the Knative scale-from-zero logic with simple code and similar performance to the original implementation.

Knative serving

Knative Serving builds on Kubernetes to support deploying and serving of serverless applications. A core capability of serverless platforms is scale-to-zero functionality which reduces resource usage and cost of inactive workloads. A new mechanism is required to scale from zero when an idle application receives a new request.

The following diagram represents the current Knative architecture for scale-from-zero.

Knative scale-from-zero

Traffic for an idle application is redirected to the Activator component by programming Istio with VirtualServices and DestinationRules. When the Activator receives a new request, it:

  1. buffers incoming requests
  2. triggers the Autoscaler
  3. redirects requests to the application after it has been scaled up, including retries and load-balancing (if needed)

Once the application is up and running again, Knative restores the routing from Activator to the running application.

Mixer adapter

Mixer provides a rich intermediation layer between the Istio components and infrastructure backends. It is designed as a stand-alone component, separate from Envoy, and has a simple extensibility model to enable Istio to interoperate with a wide breadth of backends. Mixer is inherently easier to extend than Envoy is.

Mixer is an attribute processing engine that uses operator-supplied configuration to map request attributes from the Istio proxy into calls to the infrastructure backend systems via a pluggable set of adapters. Adapters enable Mixer to expose a single consistent API, independent of the infrastructure backends in use. The exact set of adapters used at runtime is determined through operator configuration and can easily be extended to target new or custom infrastructure backends.

In order to achieve Knative scale-from-zero, we use a Mixer out-of-process adapter to call the Autoscaler. Out-of-process adapters for Mixer allow developers to use any programming language and to build and maintain their extensions as stand-alone programs, without the need to build the Istio proxy.

The following diagram represents the Knative design using the Mixer adapter.

Knative scale-from-zero

In this design, there is no need to change the routing from/to Activator for an idle application as in the original Knative setup. When the Istio proxy represented by the ingress gateway component receives a new request for an idle application, it informs Mixer, including all the relevant metadata information. Mixer then calls your adapter which triggers the Knative Autoscaler using the original Knative protocol.

Istio’s use of Mixer adapters makes it possible to replace otherwise complex networking-based application logic with a more straightforward implementation, as demonstrated in the Knative adapter.

When the adapter receives a message from Mixer, it sends a StatMessage directly to the Autoscaler component using the Knative protocol. The metadata (namespace and service name) required by the Autoscaler is transferred by the Istio proxy to Mixer and from there to the adapter.

Summary

I compared the cold-start time of the original Knative reference architecture to the new Istio Mixer adapter reference architecture. The results show similar cold-start times. The implementation using the Mixer adapter is simpler, as it is not necessary to handle low-level network-based mechanisms; these are handled by Envoy.

The next step is converting this Mixer adapter into an Envoy-specific filter running inside an ingress gateway. This will further reduce the latency overhead (no more calls to Mixer and the adapter) and remove the dependency on the Istio Mixer.

]]>
Wed, 18 Sep 2019 00:00:00 +0000/v1.24//blog/2019/knative-activator-adapter//v1.24//blog/2019/knative-activator-adapter/mixeradapterknativescale-from-zero
App Identity and Access AdapterIf you are running your containerized applications on Kubernetes, you can benefit from using the App Identity and Access Adapter for an abstracted level of security with zero code changes or redeploys.

Whether your computing environment is based on a single cloud provider, a combination of multiple cloud providers, or following a hybrid cloud approach, having a centralized identity management can help you to preserve existing infrastructure and avoid vendor lock-in.

With the App Identity and Access Adapter, you can use any OAuth2/OIDC provider: IBM Cloud App ID, Auth0, Okta, Ping Identity, AWS Cognito, Azure AD B2C and more. Authentication and authorization policies can be applied in a streamlined way in all environments — including frontend and backend applications — all without code changes or redeploys.

Understanding Istio and the adapter

Istio is an open source service mesh that transparently layers onto distributed applications and seamlessly integrates with Kubernetes. To reduce the complexity of deployments Istio provides behavioral insights and operational control over the service mesh as a whole. See the Istio Architecture for more details.

Istio uses Envoy proxy sidecars to mediate inbound and outbound traffic for all pods in the service mesh. Istio extracts telemetry from the Envoy sidecars and sends it to Mixer, the Istio component responsible for collecting telemetry and enforcing policy.

The App Identity and Access adapter extends the Mixer functionality by analyzing the telemetry (attributes) against various access control policies across the service mesh. The access control policies can be linked to particular Kubernetes services and can be finely tuned to specific service endpoints. For more information about policies and telemetry, see the Istio documentation.

When App Identity and Access Adapter is combined with Istio, it provides a scalable, integrated identity and access solution for multicloud architectures that does not require any custom application code changes.

Installation

The App Identity and Access adapter can be installed using Helm directly from the github.com repository:

$ helm repo add appidentityandaccessadapter https://raw.githubusercontent.com/ibm-cloud-security/app-identity-and-access-adapter/master/helm/appidentityandaccessadapter
$ helm install --name appidentityandaccessadapter appidentityandaccessadapter/appidentityandaccessadapter

Alternatively, you can clone the repository and install the Helm chart locally:

$ git clone git@github.com:ibm-cloud-security/app-identity-and-access-adapter.git
$ helm install ./helm/appidentityandaccessadapter --name appidentityandaccessadapter

Protecting web applications

Web applications are most commonly protected by the OpenID Connect (OIDC) workflow called authorization_code. When an unauthenticated/unauthorized user is detected, they are automatically redirected to the identity service of your choice and presented with the authentication page. When authentication completes, the browser is redirected back to an implicit /oidc/callback endpoint intercepted by the adapter. At this point, the adapter obtains access and identity tokens from the identity service and then redirects users back to their originally requested URL in the web app.

Authentication state and tokens are maintained by the adapter. Each request processed by the adapter will include the Authorization header bearing both access and identity tokens in the following format: Authorization: Bearer <access_token> <id_token>.

Developers can leverage the tokens to adjust the application experience, e.g. displaying the user name, adjusting the UI based on the user role, etc.

To terminate the authenticated session and wipe the tokens (user logout), simply redirect the browser to the /oidc/logout endpoint under the protected service. For example, if you’re serving your app from https://example.com/myapp, redirect users to https://example.com/myapp/oidc/logout.

Whenever the access token expires, a refresh token is used to automatically acquire new access and identity tokens without your users needing to re-authenticate. If the configured identity provider returns a refresh token, it is persisted by the adapter and used to retrieve new access and identity tokens when the old ones expire.

Applying web application protection

Protecting web applications requires creating two types of resources - use OidcConfig resources to define various OIDC providers, and Policy resources to define the web app protection policies.

apiVersion: "security.cloud.ibm.com/v1"
kind: OidcConfig
metadata:
    name: my-oidc-provider-config
    namespace: sample-namespace
spec:
    discoveryUrl: <discovery-url-from-oidc-provider>
    clientId: <client-id-from-oidc-provider>
    clientSecretRef:
        name: <kubernetes-secret-name>
        key: <kubernetes-secret-key>
apiVersion: "security.cloud.ibm.com/v1"
kind: Policy
metadata:
    name: my-sample-web-policy
    namespace: sample-namespace
spec:
    targets:
    - serviceName: <kubernetes-service-name-to-protect>
        paths:
        - prefix: /webapp
            method: ALL
            policies:
            - policyType: oidc
                config: my-oidc-provider-config
                rules: // optional
                - claim: iss
                    match: ALL
                    source: access_token
                    values:
                    - <expected-issuer-id>
                - claim: scope
                    match: ALL
                    source: access_token
                    values:
                    - openid

Read more about protecting web applications

Protecting backend application and APIs

Backend applications and APIs are protected using the Bearer Token flow, where an incoming token is validated against a particular policy. The Bearer Token authorization flow expects a request to contain the Authorization header with a valid access token in JWT format. The expected header structure is Authorization: Bearer {access_token}. If the token is successfully validated, the request is forwarded to the requested service. If token validation fails, HTTP 401 is returned to the client with a list of the scopes that are required to access the API.

Applying backend application and APIs protection

Protecting backend applications and APIs requires creating two types of resources - use JwtConfig resources to define various JWT providers, and Policy resources to define the backend app protection policies.

apiVersion: "security.cloud.ibm.com/v1"
kind: JwtConfig
metadata:
    name: my-jwt-config
    namespace: sample-namespace
spec:
    jwksUrl: <the-jwks-url>
apiVersion: "security.cloud.ibm.com/v1"
kind: Policy
metadata:
    name: my-sample-backend-policy
    namespace: sample-namespace
spec:
    targets:
    - serviceName: <kubernetes-service-name-to-protect>
        paths:
        - prefix: /api/files
            method: ALL
            policies:
            - policyType: jwt
                config: my-oidc-provider-config
                rules: // optional
                - claim: iss
                    match: ALL
                    source: access_token
                    values:
                    - <expected-issuer-id>
                - claim: scope
                    match: ALL
                    source: access_token
                    values:
                    - files.read
                    - files.write

Read more about protecting backend applications

Known limitations

At the time of writing, there are two known limitations of the App Identity and Access adapter:

  • If you use the App Identity and Access adapter for web applications, you should not create more than a single replica of the adapter. Due to the way Envoy proxy handled HTTP headers, it was impossible to return multiple Set-Cookie headers from Mixer back to Envoy, so we couldn’t set all the cookies required for web application scenarios. The issue was recently addressed in Envoy and Mixer, and we’re planning to address it in future versions of our adapter. Note that this only affects web applications; it doesn’t affect backend apps and APIs in any way.

  • As a general best practice, you should always consider using mutual TLS for any in-cluster communication. At the moment, the communication channel between Mixer and the App Identity and Access adapter does not use mutual TLS. In the future, we plan to address this by implementing the approach described in the Mixer Adapter developer guide.

Summary

When a multicloud strategy is in place, security can become complicated as the environment grows and diversifies. While cloud providers supply protocols and tools to ensure their offerings are safe, the development teams are still responsible for the application-level security, such as API access control with OAuth2, defending against man-in-the-middle attacks with traffic encryption, and providing mutual TLS for service access control. However, this becomes complex in a multicloud environment since you might need to define those security details for each service separately. With proper security protocols in place, those external and internal threats can be mitigated.

Development teams have spent time making their services portable to different cloud providers, and in the same regard, the security in place should be flexible and not infrastructure-dependent.

Istio and the App Identity and Access Adapter allow you to secure your Kubernetes apps with absolutely zero code changes or redeployments, regardless of which programming language and frameworks you use. Following this approach ensures maximum portability of your apps and the ability to easily enforce the same security policies across multiple environments.

You can read more about the App Identity and Access Adapter in the release blog.

]]>
Wed, 18 Sep 2019 00:00:00 +0000/v1.24//blog/2019/app-identity-and-access-adapter//v1.24//blog/2019/app-identity-and-access-adapter/securityoidcjwtpolicies
Change in Secret Discovery Service in Istio 1.3In Istio 1.3, we are taking advantage of improvements in Kubernetes to issue certificates for workload instances more securely.

When a Citadel Agent sends a certificate signing request to Citadel to get a certificate for a workload instance, it includes the JWT that the Kubernetes API server issued representing the service account of the workload instance. If Citadel can authenticate the JWT, it extracts the service account name needed to issue the certificate for the workload instance.

Before Kubernetes 1.12, the Kubernetes API server issued JWTs with the following problems:

  1. The tokens don’t have important fields to limit their scope of usage, such as aud or exp. See Bound Service Tokens for more info.
  2. The tokens are mounted onto all the pods without a way to opt-out. See Service Account Token Volumes for motivation.

Kubernetes 1.12 introduces trustworthy JWTs to solve these issues. However, support for the aud field to have a different value than the API server audience didn’t become available until Kubernetes 1.13. To better secure the mesh, Istio 1.3 only supports trustworthy JWTs and requires the value of the aud field to be istio-ca when you enable SDS. Before upgrading your Istio deployment to 1.3 with SDS enabled, verify that you use Kubernetes 1.13 or later.

Make the following considerations based on your platform of choice:

  • GKE: Upgrade your cluster version to at least 1.13.
  • On-prem Kubernetes and GKE on-prem: Add extra configuration to your Kubernetes API server (a sketch of the relevant flags follows this list). You may also want to refer to the api-server page for the most up-to-date flag names.
  • For other platforms, check with your provider. If your vendor does not support trustworthy JWTs, you will need to fall back to the file-mount approach to propagate the workload keys and certificates in Istio 1.3.
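
As a minimal sketch, assuming a self-managed control plane where you can edit the kube-apiserver static pod manifest (flag names follow Kubernetes 1.13-era releases and may differ on your platform; the issuer value and file path are illustrative), the relevant flags look roughly like this:

spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --service-account-issuer=kubernetes.default.svc          # issuer value is an assumption
        - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
        - --service-account-api-audiences=kubernetes.default.svc,istio-ca   # must include istio-ca for SDS
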
]]>
Tue, 10 Sep 2019 00:00:00 +0000/v1.24//blog/2019/trustworthy-jwt-sds//v1.24//blog/2019/trustworthy-jwt-sds/securityPKIcertificatenodeagentsds
The Evolution of Istio's APIsOne of Istio’s main goals has always been, and continues to be, enabling teams to develop abstractions that work best for their specific organization and workloads. Istio provides robust and powerful building blocks for service-to-service networking. Since Istio 0.1, the Istio team has been learning from production users about how they map their own architectures, workloads, and constraints to Istio’s capabilities, and we’ve been evolving Istio’s APIs to make them work better for you.

Evolving Istio’s APIs

The next step in Istio’s evolution is to sharpen our focus and align with the roles of Istio’s users. A security admin should be able to interact with an API that logically groups and simplifies security operations within an Istio mesh; the same goes for service operators and traffic management operations.

Taking it a step further, there’s an opportunity to provide improved experiences for beginning, intermediate, and advanced use cases for each role. There are many common use cases that can be addressed with obvious default settings and a better defined initial experience that requires little to no configuration. For intermediate use cases, the Istio team wants to leverage contextual cues from the environment and provide you with a simpler configuration experience. Finally, for advanced scenarios, our goal is to make easy things easy and hard things possible.

To provide these sorts of role-centric abstractions, however, the APIs underneath them must be able to describe all of Istio’s power and capabilities. Historically, Istio’s approach to API design followed paths similar to those of other infrastructure APIs. Istio follows these design principles:

  1. The Istio APIs should seek to:
    • Properly represent the underlying resources to which they are mapped
    • Not hide any of the underlying resource’s useful capabilities
  2. The Istio APIs should also be composable, so end users can combine infrastructure APIs in a way that makes sense for their own needs.
  3. The Istio APIs should be flexible: Within an organization, it should be possible to have different representations of the underlying resources and surface the ones that make sense for each individual team.

Over the course of the next several releases we will share our progress as we strengthen the alignment between Istio’s APIs and the roles of Istio users.

Composability and abstractions

Istio and Kubernetes often go together, but Istio is much more than an add-on to Kubernetes – it is as much a platform as Kubernetes is. Istio aims to provide infrastructure, and surface the capabilities you need in a powerful service mesh. For example, there are platform-as-a-service offerings that use Kubernetes as their foundation, and build on Kubernetes’ composability to provide a subset of APIs to application developers.

The number of objects that must be configured to deploy applications is a concrete example of Kubernetes’ composability. By our count, at least 10 objects need to be configured: Namespace, Service, Ingress, Deployment, HorizontalPodAutoscaler, Secret, ConfigMap, RBAC, PodDisruptionBudget, and NetworkPolicy.

It sounds complicated, but not everyone needs to interact with those concepts. Some are the responsibility of different teams like the cluster, network, or security admin teams, and many provide sensible defaults. A great benefit of cloud native platforms and deployment tools is that they can hide that complexity by taking in a small amount of information and configuring those objects for you.

Another example of composability in the networking space can be found in the Google Cloud HTTP(S) Load Balancer (GCLB). To correctly use an instance of the GCLB, six different infrastructure objects need to be created and configured. This design is the result of our 20 years of experience in operating distributed systems and there is a reason why each one is separate from the others. But the steps are simplified when you’re creating an instance via the Google Cloud console. We provide the more common end-user/role-specific configurations, and you can configure less common settings later. Ultimately, the goals of infrastructure APIs are to offer the most flexibility without sacrificing functionality.

Knative is a platform for building, running, and operating serverless workloads that provides a great real-world example of role-centric, higher-level APIs. Knative Serving, a component of Knative that builds on Kubernetes and Istio to support deploying and serving serverless applications and functions, provides an opinionated workflow for application developers to manage routes and revisions of their services. Thanks to that opinionated approach, Knative Serving exposes a subset of Istio’s networking APIs that are most relevant to application developers via a simplified Routes object that supports revisions and traffic routing, abstracting Istio’s VirtualService and DestinationRule resources.

As Istio has matured, we’ve also seen production users develop workload- and organization-specific abstractions on top of Istio’s infrastructure APIs.

AutoTrader UK has one of our favorite examples of a custom platform built on Istio. In an interview with the Kubernetes Podcast from Google, Russel Warman and Karl Stoney describe their Kubernetes-based delivery platform, with cost dashboards using Prometheus and Grafana. With minimal effort, they added configuration options to determine what their developers want configured on the network, and it now manages the Istio objects required to make that happen. There are countless other platforms being built in enterprise and cloud-native companies: some designed to replace a web of company-specific custom scripts, and some aimed to be a general-purpose public tool. As more companies start to talk about their tooling publicly, we’ll bring their stories to this blog.

What’s coming next

Some areas of improvement that we’re working on for upcoming releases include:

  • Installation profiles to set up standard patterns for ingress and egress, with the Istio operator
  • Automatic inference of container ports and protocols for telemetry
  • Support for routing all traffic by default to constrain routing incrementally
  • Add a single global flag to enable mutual TLS and encrypt all inter-pod traffic

Oh, and if for some reason you judge a toolbox by the list of CRDs it installs, in Istio 1.2 we cut the number from 54 down to 23. Why? It turns out that if you have a bunch of features, you need to have a way to configure them all. With the improvements we’ve made to our installer, you can now install Istio using a configuration that works with your adapters.

All service meshes, and Istio by extension, seek to automate complex infrastructure operations like networking and security. That means there will always be complexity in their APIs, but Istio will always aim to solve the needs of operators, while continuing to evolve the API to provide robust building blocks and to prioritize flexibility through role-centric abstractions.

We can’t wait for you to join our community to see what you build with Istio next!

]]>
Mon, 05 Aug 2019 00:00:00 +0000/v1.24//blog/2019/evolving-istios-apis//v1.24//blog/2019/evolving-istios-apis/apiscomposabilityevolution
Secure Control of Egress Traffic in Istio, part 3Welcome to part 3 in our series about secure control of egress traffic in Istio. In the first part in the series, I presented the attacks involving egress traffic and the requirements we collected for a secure control system for egress traffic. In the second part in the series, I presented the Istio way of securing egress traffic and showed how you can prevent the attacks using Istio.

In this installment, I compare secure control of egress traffic in Istio with alternative solutions such as using Kubernetes network policies and legacy egress proxies and firewalls. Finally, I describe the performance considerations regarding the secure control of egress traffic in Istio.

Alternative solutions for egress traffic control

First, let’s remember the requirements for egress traffic control we previously collected:

  1. Support of TLS with SNI or of TLS origination.
  2. Monitor SNI and the source workload of every egress access.
  3. Define and enforce policies per cluster.
  4. Define and enforce policies per source, Kubernetes-aware.
  5. Prevent tampering.
  6. Traffic control is transparent to the applications.

Next, I’m going to cover two alternative solutions for egress traffic control: the Kubernetes network policies and egress proxies and firewalls. I show the requirements they satisfy, and, more importantly, the requirements they can’t satisfy.

Kubernetes provides a native solution for traffic control, and in particular, for control of egress traffic, through the network policies. Using these network policies, cluster operators can configure which pods can access specific external services. Cluster operators can identify pods by pod labels, namespace labels, or by IP ranges. To specify the external services, cluster operators can use IP ranges, but cannot use domain names like cnn.com. This is because Kubernetes network policies are not DNS-aware. Network policies satisfy the first requirement since they can control any TCP traffic. Network policies only partially satisfy the third and the fourth requirements because cluster operators can specify policies per cluster or per pod but operators can’t identify external services by domain names. Network policies only satisfy the fifth requirement if the attackers are not able to break from a malicious container into the Kubernetes node and interfere with the implementation of the policies inside said node. Lastly, network policies do satisfy the sixth requirement: Operators don’t need to change the code or the container environment. In summary, we can say that Kubernetes Network Policies provide transparent, Kubernetes-aware egress traffic control, which is not DNS-aware.
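
To make the comparison concrete, here is a minimal sketch of a Kubernetes NetworkPolicy for egress control (the namespace, labels, and CIDR are illustrative assumptions); note that external services can only be named by IP blocks, not by domain names:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-egress-by-ip
  namespace: sample-namespace
spec:
  podSelector:
    matchLabels:
      app: my-app              # the pods whose egress is being controlled
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 192.0.2.0/24   # external service addresses; DNS names are not possible
      ports:
        - protocol: TCP
          port: 443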

The second alternative predates the Kubernetes network policies: using a DNS-aware egress proxy or firewall. With this approach, you configure applications to direct their traffic to the proxy using some proxy protocol, for example, SOCKS. Since operators must configure the applications, this solution is not transparent. Moreover, operators can’t use pod labels or pod service accounts to configure the proxies, because the egress proxies don’t know about them. Therefore, the egress proxies are not Kubernetes-aware and can’t fulfill the fourth requirement, because they cannot enforce policies by source when the source is specified as a Kubernetes artifact. In summary, egress proxies can fulfill the first, second, third and fifth requirements, but can’t satisfy the fourth and the sixth requirements, because they are not Kubernetes-aware and not transparent.

Advantages of Istio egress traffic control

Istio egress traffic control is DNS-aware: you can define policies based on URLs or on wildcard domains like *.ibm.com. In this sense, it is better than Kubernetes network policies which are not DNS-aware.
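
As a hedged sketch (the host and port are illustrative, and the exact API version depends on your Istio release), such a DNS-aware allow-list entry is typically expressed with a ServiceEntry:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-ibm
spec:
  hosts:
    - "*.ibm.com"          # wildcard domain allowed for egress
  location: MESH_EXTERNAL
  resolution: NONE         # wildcard hosts cannot be resolved to static endpoints
  ports:
    - number: 443
      name: tls
      protocol: TLS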

Istio egress traffic control is transparent with regard to TLS traffic, since Istio is transparent: you don’t need to change the applications or configure their containers. For HTTP traffic with TLS origination, you must configure the applications in the mesh to use HTTP instead of HTTPS.

Istio egress traffic control is Kubernetes-aware: the identity of the source of egress traffic is based on Kubernetes service accounts. Istio egress traffic control is better than the legacy DNS-aware proxies or firewalls which are not transparent and not Kubernetes-aware.

Istio egress traffic control is secure: it is based on the strong identity of Istio and, when you apply additional security measures, Istio’s traffic control is resilient to tampering.

Additionally, Istio’s egress traffic control provides the following advantages:

  • Define access policies in the same language for ingress, egress, and in-cluster traffic. You need to learn a single policy and configuration language for all types of traffic.
  • Out-of-the-Box integration of Istio’s egress traffic control with Istio’s policy and observability adapters.
  • Write the adapters to use external monitoring or access control systems with Istio only once and apply them for all types of traffic: ingress, egress, and in-cluster.
  • Use Istio’s traffic management features for egress traffic: load balancing, passive and active health checking, circuit breaker, timeouts, retries, fault injection, and others.

We refer to a system with the advantages above as Istio-aware.

The following table summarizes the egress traffic control features that Istio and the alternative solutions provide:

                    Istio Egress Traffic Control    Kubernetes Network Policies    Legacy Egress Proxy or Firewall
DNS-aware           Yes                             No                             Yes
Kubernetes-aware    Yes                             Yes                            No
Transparent         Yes                             Yes                            No
Istio-aware         Yes                             No                             No

Performance considerations

Controlling egress traffic using Istio has a price: increased latency of calls to external services and increased CPU usage by the cluster’s pods. Traffic passes through two proxies:

  • The application’s sidecar proxy
  • The egress gateway’s proxy

If you use TLS egress traffic to wildcard domains, you must add an additional proxy between the application and the external service. Since the traffic between the egress gateway’s proxy and the proxy needed for the configuration of arbitrary domains using wildcards is on the pod’s local network, that traffic shouldn’t have a significant impact on latency.

See a performance evaluation of different Istio configurations set to control egress traffic. I would encourage you to carefully measure different configurations with your own applications and your own external services, before you decide whether you can afford the performance overhead for your use cases. You should weigh the required level of security versus your performance requirements and compare the performance overhead of all alternative solutions.

Let me share my thoughts on the performance overhead that controlling egress traffic with Istio adds: accessing external services can already have high latency, so the overhead added by two or three proxies inside the cluster is likely not very significant by comparison. After all, applications with a microservice architecture can have chains of dozens of calls between microservices. Therefore, an additional hop with one or two proxies in the egress gateway should not have a large impact.

Moreover, we continue to work towards reducing Istio’s performance overhead. Possible optimizations include:

  • Extending Envoy to handle wildcard domains: This would eliminate the need for a third proxy between the application and the external services for that use case.
  • Using mutual TLS for authentication only without encrypting the TLS traffic, since the traffic is already encrypted.

Summary

I hope that after reading this series you are convinced that controlling egress traffic is very important for the security of your cluster. Hopefully, I also managed to convince you that Istio is an effective tool to control egress traffic securely, and that Istio has multiple advantages over the alternative solutions. Istio is the only solution I’m aware of that lets you:

  • Control egress traffic in a secure and transparent way
  • Specify external services as domain names
  • Use Kubernetes artifacts to specify the traffic source

In my opinion, secure control of egress traffic is a great choice if you are looking for your first Istio use case. In this case, Istio already provides you with some benefits even before you start using all other Istio features: traffic management, security, policies and observability, applied to traffic between microservices inside the cluster.

So, if you haven’t had the chance to work with Istio yet, install Istio on your cluster and check our egress traffic control tasks and the tasks for the other Istio features. We also want to hear from you, please join us at discuss.istio.io.

]]>
Mon, 22 Jul 2019 00:00:00 +0000/v1.24//blog/2019/egress-traffic-control-in-istio-part-3//v1.24//blog/2019/egress-traffic-control-in-istio-part-3/traffic-managementegresssecuritygatewaytls
Secure Control of Egress Traffic in Istio, part 2Welcome to part 2 in our new series about secure control of egress traffic in Istio. In the first part in the series, I presented the attacks involving egress traffic and the requirements we collected for a secure control system for egress traffic. In this installment, I describe the Istio way to securely control the egress traffic, and show how Istio can help you prevent the attacks.

Secure control of egress traffic in Istio

To implement secure control of egress traffic in Istio, you must direct TLS traffic to external services through an egress gateway. Alternatively, you can direct HTTP traffic through an egress gateway and let the egress gateway perform TLS origination.

Both alternatives have their pros and cons, and you should choose between them according to your circumstances. The choice mainly depends on whether your application can send unencrypted HTTP requests and whether your organization’s security policies allow sending unencrypted HTTP requests. For example, if your application uses a client library that encrypts the traffic without an option to disable the encryption, you cannot send unencrypted HTTP traffic. The same holds if your organization’s security policies do not allow sending unencrypted HTTP requests inside the pod (outside the pod the traffic is encrypted by Istio).

If the application sends HTTP requests and the egress gateway performs TLS origination, you can monitor HTTP information like HTTP methods, headers, and URL paths. You can also define policies based on said HTTP information. If the application performs TLS origination, you can monitor SNI and the service account of the source pod’s TLS traffic, and define policies based on SNI and service accounts.

You must ensure that traffic from your cluster to the outside cannot bypass the egress gateway. Istio cannot enforce it for you, so you must apply some additional security mechanisms, for example, the Kubernetes network policies or an L3 firewall. See an example of the Kubernetes network policies configuration. According to the Defense in depth concept, the more security mechanisms you apply for the same goal, the better.
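
As a hedged sketch (the application namespace and label names are assumptions, and it relies on the standard kubernetes.io/metadata.name namespace label), a Kubernetes NetworkPolicy of roughly this shape only allows egress from application pods to the istio-system namespace, so traffic cannot leave the mesh except through the egress gateway:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-istio-system-only
  namespace: my-app-namespace
spec:
  podSelector: {}              # all pods in the application namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: istio-system
      # In practice you typically also need to allow DNS traffic (e.g. to kube-system).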

You must also ensure that the Istio control plane and the egress gateway cannot be compromised. While you may have hundreds or thousands of application pods in your cluster, there are only a dozen or so Istio control plane pods and gateways. You can and should focus on protecting the control plane pods and the gateways, since it is easy (there is only a small number of pods to protect) and most crucial for the security of your cluster. If attackers compromise the control plane or the egress gateway, they could violate any policy.

You might have multiple tools to protect the control plane pods, depending on your environment. The reasonable security measures are:

  • Run the control plane pods on nodes separate from the application nodes.
  • Run the control plane pods in their own separate namespace.
  • Apply the Kubernetes RBAC and network policies to protect the control plane pods.
  • Monitor the control plane pods more closely than you do the application pods.

Once you direct egress traffic through an egress gateway and apply the additional security mechanisms, you can securely monitor and enforce security policies for the traffic.

The following diagram shows Istio’s security architecture, augmented with an L3 firewall which is part of the additional security mechanisms that should be provided outside of Istio.

Istio Security Architecture with Egress Gateway and L3 Firewall

You can configure the L3 firewall trivially to only allow incoming traffic through the Istio ingress gateway and only allow outgoing traffic through the Istio egress gateway. The Istio proxies of the gateways enforce policies and report telemetry just as all other proxies in the mesh do.

Now let’s examine possible attacks and let me show you how the secure control of egress traffic in Istio prevents them.

Preventing possible attacks

Consider the following security policies for egress traffic:

  • Application A is allowed to access *.ibm.com, which includes all the external services with URLs matching *.ibm.com.
  • Application B is allowed to access mongo1.composedb.com.
  • All egress traffic is monitored.

Suppose the attackers have the following goals:

  • Access *.ibm.com from your cluster.
  • Access *.ibm.com from your cluster, unmonitored. The attackers want their traffic to be unmonitored to prevent a possibility that you will detect the forbidden access.
  • Access mongo1.composedb.com from your cluster.

Now suppose that the attackers manage to break into one of the pods of application A, and try to use the compromised pod to perform the forbidden access. The attackers may try their luck and access the external services in a straightforward way. You will react to the straightforward attempts as follows:

  • Initially, there is no way to prevent a compromised application A from accessing *.ibm.com, because the compromised pod is indistinguishable from the original pod.
  • Fortunately, you can monitor all access to external services, detect suspicious traffic, and thwart attackers from gaining unmonitored access to *.ibm.com. For example, you could apply anomaly detection tools on the egress traffic logs.
  • To stop attackers from accessing mongo1.composedb.com from your cluster, Istio will correctly detect the source of the traffic, application A in this case, and verify that it is not allowed to access mongo1.composedb.com according to the security policies mentioned above.

Having failed to achieve their goals in a straightforward way, the malicious actors may resort to advanced attacks:

  • Bypass the container’s sidecar proxy to be able to access any external service directly, without the sidecar’s policy enforcement and reporting. This attack is prevented by a Kubernetes Network Policy or by an L3 firewall that allow egress traffic to exit the mesh only from the egress gateway.
  • Compromise the egress gateway to be able to force it to send fake information to the monitoring system or to disable enforcement of the security policies. This attack is prevented by applying the special security measures to the egress gateway pods.
  • Impersonate as application B since application B is allowed to access mongo1.composedb.com. This attack, fortunately, is prevented by Istio’s strong identity support.

As far as we can see, all the forbidden access is prevented, or at least is monitored and can be prevented later. If you see other attacks that involve egress traffic or security holes in the current design, we would be happy to hear about it.

Summary

Hopefully, I managed to convince you that Istio is an effective tool to prevent attacks involving egress traffic. In the next part of this series, I compare secure control of egress traffic in Istio with alternative solutions such as Kubernetes Network Policies and legacy egress proxies/firewalls.

]]>
Wed, 10 Jul 2019 00:00:00 +0000/v1.24//blog/2019/egress-traffic-control-in-istio-part-2//v1.24//blog/2019/egress-traffic-control-in-istio-part-2/traffic-managementegresssecuritygatewaytls
Best Practices: Benchmarking Service Mesh PerformanceService meshes add a lot of functionality to application deployments, including traffic policies, observability, and secure communication. But adding a service mesh to your environment comes at a cost, whether that’s time (added latency) or resources (CPU cycles). To make an informed decision on whether a service mesh is right for your use case, it’s important to evaluate how your application performs when deployed with a service mesh.

Earlier this year, we published a blog post on Istio’s performance improvements in version 1.1. Following the release of Istio 1.2, we want to provide guidance and tools to help you benchmark Istio’s data plane performance in a production-ready Kubernetes environment.

Overall, we found that Istio’s sidecar proxy latency scales with the number of concurrent connections. At 1000 requests per second (RPS), across 16 connections, Istio adds 3 milliseconds per request in the 50th percentile, and 10 milliseconds in the 99th percentile.

In the Istio Tools repository, you’ll find scripts and instructions for measuring Istio’s data plane performance, with additional instructions on how to run the scripts with Linkerd, another service mesh implementation. Follow along as we detail some best practices for each step of the performance test framework.

1. Use a production-ready Istio installation

To accurately measure the performance of a service mesh at scale, it’s important to use an adequately-sized Kubernetes cluster. We test using three worker nodes, each with at least 4 vCPUs and 15 GB of memory.

Then, it’s important to use a production-ready Istio installation profile on that cluster. This lets us achieve performance-oriented settings such as control plane pod autoscaling, and ensures that resource limits are appropriate for heavy traffic load. The default Istio installation is suitable for most benchmarking use cases. For extensive performance benchmarking, with thousands of proxy-injected services, we also provide a tuned Istio install that allocates extra memory and CPU to the Istio control plane.
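
As a rough sketch of what such performance-oriented tuning can look like with a current IstioOperator-based install (this is not the tuned profile from the Istio Tools repository; the resource values and autoscaling bounds are illustrative assumptions), you might raise the control plane’s requests and replica range like this:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: "2"          # illustrative values, not the benchmarked profile
            memory: 4Gi
        hpaSpec:
          minReplicas: 2
          maxReplicas: 5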

Istio’s demo installation is not suitable for performance testing, because it is designed to be deployed on a small trial cluster, and has full tracing and access logs enabled to showcase Istio’s features.

2. Focus on the data plane

Our benchmarking scripts focus on evaluating the Istio data plane: the Envoy proxies that mediate traffic between application containers. Why focus on the data plane? Because at scale, with lots of application containers, the data plane’s memory and CPU usage quickly eclipses that of the Istio control plane. Let’s look at an example of how this happens:

Say you run 2,000 Envoy-injected pods, each handling 1,000 requests per second. Each proxy is using 50 MB of memory, and to configure all these proxies, Pilot is using 1 vCPU and 1.5 GB of memory. All together, the Istio data plane (the sum of all the Envoy proxies) is using 100 GB of memory, compared to Pilot’s 1.5 GB.

It is also important to focus on data plane performance for latency reasons. This is because most application requests move through the Istio data plane, not the control plane. There are two exceptions:

  1. Telemetry reporting: Each proxy sends raw telemetry data to Mixer, which Mixer processes into metrics, traces, and other telemetry. The raw telemetry data is similar to access logs, and therefore comes at a cost. Access log processing consumes CPU and keeps a worker thread from picking up the next unit of work. At higher throughput, it is more likely that the next unit of work is waiting in the queue to be picked up by the worker. This can lead to long-tail (99th percentile) latency for Envoy.
  2. Custom policy checks: When using custom Istio policy adapters, policy checks are on the request path. This means that request headers and metadata on the data path will be sent to the control plane (Mixer), resulting in higher request latency. Note: These policy checks are disabled by default, as the most common policy use case (RBAC) is performed entirely by the Envoy proxies.

Both of these exceptions will go away in a future Istio release, when Mixer V2 moves all policy and telemetry features directly into the proxies.

Next, when testing Istio’s data plane performance at scale, it’s important to test not only at increasing requests per second, but also against an increasing number of concurrent connections. This is because real-world, high-throughput traffic comes from multiple clients. The provided scripts allow you to perform the same load test with any number of concurrent connections, at increasing RPS.

Lastly, our test environment measures requests between two pods, not many. The client pod is Fortio, which sends traffic to the server pod.

Why test with only two pods? Because scaling up throughput (RPS) and connections (threads) has a greater effect on Envoy’s performance than increasing the total size of the service registry — or, the total number of pods and services in the Kubernetes cluster. When the size of the service registry grows, Envoy does have to keep track of more endpoints, and lookup time per request does increase, but by a tiny constant. If you have many services, and this constant becomes a latency concern, Istio provides a Sidecar resource, which allows you to limit which services each Envoy knows about.
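
A minimal sketch of such a Sidecar resource (the namespace is illustrative) limits the proxies in a namespace to configuration for their own namespace plus the control plane:

apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: my-namespace      # applies to all proxies in this namespace
spec:
  egress:
    - hosts:
        - "./*"                # services in the same namespace
        - "istio-system/*"     # the Istio control plane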

3. Measure with and without proxies

While many Istio features, such as mutual TLS authentication, rely on an Envoy proxy next to an application pod, you can selectively disable sidecar proxy injection for some of your mesh services. As you scale up Istio for production, you may want to incrementally add the sidecar proxy to your workloads.

To that end, the test scripts provide three different modes. These modes analyze Istio’s performance when a request goes through both the client and server proxies (both), just the server proxy (serveronly), and neither proxy (baseline).

You can also disable Mixer to stop Istio’s telemetry during the performance tests, which provides results in line with the performance we expect when the Mixer V2 work is completed. Istio also supports Envoy native telemetry, which performs similarly to having Istio’s telemetry disabled.

Istio 1.2 Performance

Let’s see how to use this test environment to analyze the data plane performance of Istio 1.2. We also provide instructions to run the same performance tests for the Linkerd data plane. Currently, only latency benchmarking is supported for Linkerd.

For measuring Istio’s sidecar proxy latency, we look at the 50th, 90th, and 99th percentiles for an increasing number of concurrent connections, keeping request throughput (RPS) constant.

We found that with 16 concurrent connections and 1000 RPS, Istio adds 3ms over the baseline (P50) when a request travels through both a client and server proxy. (Subtract the pink line, base, from the green line, both.) At 64 concurrent connections, Istio adds 12ms over the baseline, but with Mixer disabled (nomixer_both), Istio only adds 7ms.

In the 90th percentile, with 16 concurrent connections, Istio adds 6ms; with 64 connections, Istio adds 20ms.

Finally, in the 99th percentile, with 16 connections, Istio adds 10ms over the baseline. At 64 connections, Istio adds 25ms with Mixer, or 10ms without Mixer.

For CPU usage, we measured with an increasing request throughput (RPS), and a constant number of concurrent connections. We found that Envoy’s maximum CPU usage at 3000 RPS, with Mixer enabled, was 1.2 vCPUs. At 1000 RPS, one Envoy uses approximately half of a CPU.

Summary

In the process of benchmarking Istio’s performance, we learned several key lessons:

  • Use an environment that mimics production.
  • Focus on data plane traffic.
  • Measure against a baseline.
  • Increase concurrent connections as well as total throughput.

For a mesh with 1000 RPS across 16 connections, Istio 1.2 adds just 3 milliseconds of latency over the baseline, in the 50th percentile.

Also check out the Istio Performance and Scalability guide for the most up-to-date performance data.

Thank you for reading, and happy benchmarking!

]]>
Tue, 09 Jul 2019 00:00:00 +0000/v1.24//blog/2019/performance-best-practices//v1.24//blog/2019/performance-best-practices/performancescalabilityscalebenchmarks
Extending Istio Self-Signed Root Certificate LifetimeIstio self-signed certificates have historically had a 1 year default lifetime. If you are using Istio self-signed certificates, you need to schedule regular root transitions before they expire. An expiration of a root certificate may lead to an unexpected cluster-wide outage. The issue affects new clusters created with versions up to 1.0.7 and 1.1.7.

See Extending Self-Signed Certificate Lifetime for information on how to gauge the age of your certificates and how to perform rotation.

]]>
Fri, 07 Jun 2019 00:00:00 +0000/v1.24//blog/2019/root-transition//v1.24//blog/2019/root-transition/securityPKIcertificateCitadel
Secure Control of Egress Traffic in Istio, part 1This is part 1 of a new series on secure control of egress traffic in Istio. In this installment, I explain why you should apply egress traffic control to your cluster, the attacks involving egress traffic you want to prevent, and the requirements a system for egress traffic control must meet to do so. Once you agree that you should control the egress traffic leaving your cluster, the following questions arise: what is required from a system for secure control of egress traffic, and which solution best fulfills these requirements? (Spoiler: Istio, in my opinion.) Future installments will describe the implementation of secure control of egress traffic in Istio and compare it with other solutions.

The most important security aspect for a service mesh is probably ingress traffic. You definitely must prevent attackers from penetrating the cluster through ingress APIs. Having said that, securing the traffic leaving the mesh is also very important. Once your cluster is compromised, and you must be prepared for that scenario, you want to reduce the damage as much as possible and prevent the attackers from using the cluster for further attacks on external services and legacy systems outside of the cluster. To achieve that goal, you need secure control of egress traffic.

Compliance requirements are another reason to implement secure control of egress traffic. For example, the Payment Card Industry (PCI) Data Security Standard requires that inbound and outbound traffic be restricted to that which is necessary:

And specifically regarding outbound traffic:

Let’s start with the attacks that involve egress traffic.

The attacks

An IT organization must assume it will be attacked if it hasn’t been attacked already, and that part of its infrastructure could already be compromised or become compromised in the future. Once attackers are able to penetrate an application in a cluster, they can proceed to attack external services: legacy systems, external web services and databases. The attackers may want to steal the data of the application and transfer it to their external servers. Attackers’ malware may require access to attackers’ servers to download updates. The attackers may use pods in the cluster to perform DDoS attacks or to break into external systems. Even though you cannot know all the possible types of attacks, you want to reduce the possibilities for attacks, both known and unknown.

External attackers gain access to an application’s container from outside the mesh through a bug in the application, but attackers can also be internal, for example, malicious DevOps people inside the organization.

To prevent the attacks described above, some form of egress traffic control must be applied. Let me present egress traffic control in the following section.

The solution: secure control of egress traffic

Secure control of egress traffic means monitoring the egress traffic and enforcing all the security policies regarding it. Monitoring the egress traffic enables you to analyze it, possibly offline, and detect attacks even if you were unable to prevent them in real time. Another good practice to reduce the possibility of attacks is to specify policies that limit access following the need-to-know principle: only the applications that need external services should be allowed to access the external services they need.

Let me now turn to the requirements for egress traffic control we collected.

Requirements for egress traffic control

My colleagues at IBM and I collected requirements for secure control of egress traffic from several customers, and combined them with the egress traffic control requirements from the Kubernetes Network Special Interest Group.

Istio 1.1 satisfies all gathered requirements:

  1. Support for TLS with SNI or for TLS origination by Istio.

  2. Monitor SNI and the source workload of every egress access.

  3. Define and enforce policies per cluster, e.g.:

    • all applications in the cluster may access service1.foo.com (a specific host)

    • all applications in the cluster may access any host of the form *.bar.com (a wildcarded domain)

    All unspecified access must be blocked.

  4. Define and enforce policies per source, Kubernetes-aware:

    • application A may access *.foo.com.

    • application B may access *.bar.com.

    All other access must be blocked, in particular access of application A to service1.bar.com.

  5. Prevent tampering. In case an application pod is compromised, prevent the compromised pod from escaping monitoring, from sending fake information to the monitoring system, and from breaking the egress policies.

  6. Nice to have: traffic control is transparent to the applications.

Let me explain each requirement in more detail. The first requirement states that only TLS traffic to the external services must be supported. The requirement emerged from the observation that all traffic leaving the cluster must be encrypted. This means that either the applications perform TLS origination or Istio must perform TLS origination for them. Note that when an application performs TLS origination, the Istio proxies cannot see the original traffic, only the encrypted traffic, so they see the TLS protocol only. For the proxies it does not matter whether the original protocol is HTTP or MongoDB; all the Istio proxies can see is TLS traffic.

The second requirement states that SNI and the source of the traffic must be monitored. Monitoring is the first step to prevent attacks. Even if attackers would be able to access external services from the cluster, if the access is monitored, there is a chance to discover the suspicious traffic and take a corrective action.

Note that in the case of TLS originated by an application, the Istio sidecar proxies can only see TCP traffic and a TLS handshake that includes SNI. A label of the source pod could identify the source of the traffic, but the service account of the pod or some other source identifier could be used as well. We say such an egress control system is Kubernetes-aware: the system must understand Kubernetes artifacts like pods and service accounts. If the system is not Kubernetes-aware, it can only use the IP address as the identifier of the source.

The third requirement states that Istio operators must be able to define policies for egress traffic for the entire cluster. The policies state which external services may be accessed by any pod in the cluster. The external services can be identified either by a Fully qualified domain name of the service, e.g. www.ibm.com or by a wildcarded domain, e.g. *.ibm.com. Only the specified external services may be accessed, all other egress traffic is blocked.

This requirement originates from the need to prevent attackers from accessing malicious sites, for example for downloading updates or instructions for their malware. You also want to limit the number of external sites that the attackers can access and attack. You want to allow access only to the external services that the applications in the cluster need to access, and to block access to all other services; this way you reduce the attack surface. While the external services can have their own security mechanisms, you want to exercise defense in depth and have multiple security layers: a security layer in your cluster in addition to the security layers in the external systems.

This requirement means that the external services must be identifiable by domain names. We say such an egress control system is DNS-aware. If the system is not DNS-aware, the external services must be specified by IP addresses. Using IP addresses is not convenient and often is not feasible, since the IP addresses of a service can change. Sometimes not all the IP addresses of a service are even known, for example in the case of CDNs.
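
To make the third requirement concrete, here is a minimal sketch of a cluster-wide egress policy in Istio, assuming the mesh’s outbound traffic policy is set to REGISTRY_ONLY so that unspecified access is blocked, and using the hypothetical wildcarded domain from the list above:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: bar-com-egress
spec:
  hosts:
  - "*.bar.com"     # wildcarded domain allowed for egress
  ports:
  - number: 443
    name: tls
    protocol: TLS   # TLS originated by the applications; the proxies only see SNI
                    # assumes meshConfig outboundTrafficPolicy.mode is REGISTRY_ONLY

With REGISTRY_ONLY in effect, traffic to hosts that are not declared by any ServiceEntry is rejected by the sidecar proxies, which is exactly the "all unspecified access must be blocked" part of the requirement.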

The fourth requirement states that the source of the egress traffic must be added to the policies, effectively extending the third requirement. Policies can specify which source can access which external service, and the source must be identified just as in the second requirement, for example, by a label of the source pod or by the service account of the pod. It means that policy enforcement must also be Kubernetes-aware. If policy enforcement is not Kubernetes-aware, the policies must identify the source of traffic by the IP of the pod, which is not convenient, especially since pods come and go, so their IPs are not static.
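
As a sketch only, one possible Kubernetes-aware building block in Istio for this kind of per-source restriction is the Sidecar resource, which limits the egress hosts visible to a particular workload. The workload label (app: app-a) and domain below are hypothetical, and a complete solution would combine this with the cluster-wide mechanisms above:

apiVersion: networking.istio.io/v1alpha3
kind: Sidecar
metadata:
  name: app-a-egress
  namespace: default
spec:
  workloadSelector:
    labels:
      app: app-a      # hypothetical label; applies only to the app-a pods
  egress:
  - hosts:
    - "./*.foo.com"   # only hosts matching *.foo.com declared in this namespace are reachable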

The fifth requirement states that even if the cluster is compromised and the attackers control some of the pods, they must not be able to cheat the monitoring or to violate policies of the egress control system. We say that such a system provides secure control of egress traffic.

The sixth requirement states that traffic control should be provided without changing the application containers, in particular without changing the code of the applications and without changing the environment of the containers. We call such egress traffic control transparent.

In the next posts I will show that Istio can function as an egress traffic control system that satisfies all of these requirements; in particular, it is transparent, DNS-aware, and Kubernetes-aware.

Summary

I hope that you are convinced that controlling egress traffic is important for the security of your cluster. In part 2 of this series I describe the Istio way to perform secure control of egress traffic, and in part 3 I compare it with alternative solutions such as Kubernetes Network Policies and legacy egress proxies/firewalls.

]]>
Wed, 22 May 2019 00:00:00 +0000/v1.24//blog/2019/egress-traffic-control-in-istio-part-1//v1.24//blog/2019/egress-traffic-control-in-istio-part-1/traffic-managementegresssecurity
Architecting Istio 1.1 for PerformanceHyper-scale, microservice-based cloud environments have been exciting to build but challenging to manage. Along came Kubernetes (container orchestration) in 2014, followed by Istio (container service management) in 2017. Both open-source projects enable developers to scale container-based applications without spending too much time on administration tasks.

Now, new enhancements in Istio 1.1 deliver scale-up with improved application performance and service management efficiency. Simulations using our sample commercial airline reservation application show the following improvements, compared to Istio 1.0.

We’ve seen substantial application performance gains:

  • up to 30% reduction in application average latency
  • up to 40% faster service startup times in a large mesh

As well as impressive improvements in service management efficiency:

  • up to 90% reduction in Pilot CPU usage in a large mesh
  • up to 50% reduction in Pilot memory usage in a large mesh

With Istio 1.1, organizations can be more confident in their ability to scale applications with consistency and control – even in hyper-scale cloud environments.

Congratulations to the Istio experts around the world who contributed to this release. We could not be more pleased with these results.

Istio 1.1 performance enhancements

As members of the Istio Performance and Scalability workgroup, we have done extensive performance evaluations. We introduced many performance design features for Istio 1.1, in collaboration with other Istio contributors. Some of the most visible performance enhancements in 1.1 include:

  • Significant reduction in default collection of Envoy-generated statistics
  • Added load-shedding functionality to Mixer workloads
  • Improved the protocol between Envoy and Mixer
  • Namespace isolation, to reduce operational overhead
  • Configurable concurrent worker threads, which can improve overall throughput
  • Configurable filters that limit telemetry data
  • Removal of synchronization bottlenecks

Continuous code quality and performance verification

Regression Patrol drives continuous improvement in Istio performance and quality. Behind the scenes, the Regression Patrol helps Istio developers to identify and fix code issues. Daily builds are checked using a customer-centric benchmark, BluePerf. The results are published to the Istio community web portal. Various application configurations are evaluated to help provide insights on Istio component performance.

Another tool that is used to evaluate the performance of Istio’s builds is Fortio, which provides a synthetic end-to-end load testing benchmark.
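
For reference, a typical Fortio invocation for such an end-to-end load test looks roughly like the following; the target URL, connection count, and QPS are placeholders rather than the exact values used by the benchmark:

$ fortio load -c 16 -qps 1000 -t 5m -p "50,90,99" http://fortio-server:8080/echo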

Summary

Istio 1.1 was designed for performance and scalability. The Istio Performance and Scalability workgroup measured significant performance improvements over 1.0. Istio 1.1 introduces new features and optimizations to help harden the service mesh for enterprise microservice workloads. The Istio 1.1 Performance and Tuning Guide documents performance simulations, provides sizing and capacity planning guidance, and includes best practices for tuning custom use cases.

Disclaimer

The performance data contained herein was obtained in a controlled, isolated environment. Actual results that may be obtained in other operating environments may vary significantly. There is no guarantee that the same or similar results will be obtained elsewhere.

]]>
Tue, 19 Mar 2019 00:00:00 +0000/v1.24//blog/2019/istio1.1_perf//v1.24//blog/2019/istio1.1_perf/performancescalabilityscalebenchmarks
Version Routing in a Multicluster Service MeshIf you’ve spent any time looking at Istio, you’ve probably noticed that it includes a lot of features that can be demonstrated with simple tasks and examples running on a single Kubernetes cluster. Because most, if not all, real-world cloud and microservices-based applications are not that simple and will need to have the services distributed and running in more than one location, you may be wondering if all these things will be just as simple in your real production environment.

Fortunately, Istio provides several ways to configure a service mesh so that applications can, more-or-less transparently, be part of a mesh where the services are running in more than one cluster, i.e., in a multicluster deployment. The simplest way to set up a multicluster mesh, because it has no special networking requirements, is using a replicated control plane model. In this configuration, each Kubernetes cluster contributing to the mesh has its own control plane, but each control plane is synchronized and running under a single administrative control.

In this article we’ll look at how one of the features of Istio, traffic management, works in a multicluster mesh with a replicated control plane topology. We’ll show how to configure Istio route rules to call remote services in a multicluster service mesh by deploying the Bookinfo sample with version v1 of the reviews service running in one cluster, and versions v2 and v3 running in a second cluster.

Set up clusters

To start, you’ll need two Kubernetes clusters, both running a slightly customized configuration of Istio.

  • Set up a multicluster environment with two Istio clusters by following the replicated control planes instructions.

  • The kubectl command is used to access both clusters with the --context flag. Use the following command to list your contexts:

    $ kubectl config get-contexts
    CURRENT   NAME       CLUSTER    AUTHINFO       NAMESPACE
    *         cluster1   cluster1   user@foo.com   default
              cluster2   cluster2   user@foo.com   default
  • Export the following environment variables with the context names of your configuration:

    $ export CTX_CLUSTER1=<cluster1 context name>
    $ export CTX_CLUSTER2=<cluster2 context name>

Deploy version v1 of the bookinfo application in cluster1

Run the productpage and details services and version v1 of the reviews service in cluster1:

$ kubectl label --context=$CTX_CLUSTER1 namespace default istio-injection=enabled
$ kubectl apply --context=$CTX_CLUSTER1 -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: productpage
  labels:
    app: productpage
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: productpage
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: productpage-v1
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: productpage
        version: v1
    spec:
      containers:
      - name: productpage
        image: istio/examples-bookinfo-productpage-v1:1.10.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
---
apiVersion: v1
kind: Service
metadata:
  name: details
  labels:
    app: details
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: details
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: details-v1
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: details
        version: v1
    spec:
      containers:
      - name: details
        image: istio/examples-bookinfo-details-v1:1.10.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
---
apiVersion: v1
kind: Service
metadata:
  name: reviews
  labels:
    app: reviews
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: reviews
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: reviews-v1
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: reviews
        version: v1
    spec:
      containers:
      - name: reviews
        image: istio/examples-bookinfo-reviews-v1:1.10.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
EOF

Deploy bookinfo v2 and v3 services in cluster2

Run the ratings service and version v2 and v3 of the reviews service in cluster2:

$ kubectl label --context=$CTX_CLUSTER2 namespace default istio-injection=enabled
$ kubectl apply --context=$CTX_CLUSTER2 -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: ratings
  labels:
    app: ratings
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: ratings
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ratings-v1
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: ratings
        version: v1
    spec:
      containers:
      - name: ratings
        image: istio/examples-bookinfo-ratings-v1:1.10.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
---
apiVersion: v1
kind: Service
metadata:
  name: reviews
  labels:
    app: reviews
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: reviews
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: reviews-v2
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: reviews
        version: v2
    spec:
      containers:
      - name: reviews
        image: istio/examples-bookinfo-reviews-v2:1.10.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: reviews-v3
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: reviews
        version: v3
    spec:
      containers:
      - name: reviews
        image: istio/examples-bookinfo-reviews-v3:1.10.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
EOF

Access the bookinfo application

Just like any application, we’ll use an Istio gateway to access the bookinfo application.

  • Create the bookinfo gateway in cluster1:

    $ kubectl apply --context=$CTX_CLUSTER1 -f @samples/bookinfo/networking/bookinfo-gateway.yaml@
  • Follow the Bookinfo sample instructions to determine the ingress IP and port and then point your browser to http://$GATEWAY_URL/productpage.

You should see the productpage with reviews, but without ratings, because only v1 of the reviews service is running on cluster1 and we have not yet configured access to cluster2.

Create a service entry and destination rule on cluster1 for the remote reviews service

As described in the setup instructions, remote services are accessed with a .global DNS name. In our case, it’s reviews.default.global, so we need to create a service entry and destination rule for that host. The service entry will use the cluster2 gateway as the endpoint address to access the service. You can use the gateway’s DNS name, if it has one, or its public IP, like this:

$ export CLUSTER2_GW_ADDR=$(kubectl get --context=$CTX_CLUSTER2 svc --selector=app=istio-ingressgateway \
    -n istio-system -o jsonpath="{.items[0].status.loadBalancer.ingress[0].ip}")

Now create the service entry and destination rule using the following command:

$ kubectl apply --context=$CTX_CLUSTER1 -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: reviews-default
spec:
  hosts:
  - reviews.default.global
  location: MESH_INTERNAL
  ports:
  - name: http1
    number: 9080
    protocol: http
  resolution: DNS
  addresses:
  - 240.0.0.3
  endpoints:
  - address: ${CLUSTER2_GW_ADDR}
    labels:
      cluster: cluster2
    ports:
      http1: 15443 # Do not change this port value
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-global
spec:
  host: reviews.default.global
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
  subsets:
  - name: v2
    labels:
      cluster: cluster2
  - name: v3
    labels:
      cluster: cluster2
EOF

The address 240.0.0.3 of the service entry can be any arbitrary unallocated IP. Using an IP from the class E addresses range 240.0.0.0/4 is a good choice. Check out the gateway-connected multicluster example for more details.

Note that the labels of the subsets in the destination rule map to the service entry endpoint label (cluster: cluster2) corresponding to the cluster2 gateway. Once the request reaches the destination cluster, a local destination rule will be used to identify the actual pod labels (version: v2 or version: v3) corresponding to the requested subset.

Create a destination rule on both clusters for the local reviews service

Technically, we only need to define the subsets of the local service that are being used in each cluster (i.e., v1 in cluster1, v2 and v3 in cluster2), but for simplicity we’ll just define all three subsets in both clusters, since there’s nothing wrong with defining subsets for versions that are not actually deployed.

$ kubectl apply --context=$CTX_CLUSTER1 -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews.default.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
  - name: v3
    labels:
      version: v3
EOF
$ kubectl apply --context=$CTX_CLUSTER2 -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews.default.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
  - name: v3
    labels:
      version: v3
EOF

Create a virtual service to route reviews service traffic

At this point, all calls to the reviews service will go to the local reviews pods (v1) because, if you look at the source code, you will see that the productpage implementation is simply making requests to http://reviews:9080 (which expands to the host reviews.default.svc.cluster.local), the local version of the service. The corresponding remote service is named reviews.default.global, so route rules are needed to redirect requests to that global host.

Apply the following virtual service to direct traffic for user jason to reviews versions v2 and v3 (50/50) which are running on cluster2. Traffic for any other user will go to reviews version v1.

$ kubectl apply --context=$CTX_CLUSTER1 -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews.default.svc.cluster.local
  http:
  - match:
    - headers:
        end-user:
          exact: jason
    route:
    - destination:
        host: reviews.default.global
        subset: v2
      weight: 50
    - destination:
        host: reviews.default.global
        subset: v3
      weight: 50
  - route:
    - destination:
        host: reviews.default.svc.cluster.local
        subset: v1
EOF

Return to your browser and log in as user jason. If you refresh the page several times, you should see the display alternating between black and red ratings stars (v2 and v3). If you log out, you will only see reviews without ratings (v1).

Summary

In this article, we’ve seen how to use Istio route rules to distribute the versions of a service across clusters in a multicluster service mesh with a replicated control plane model. In this example, we manually configured the .global service entry and destination rules needed to provide connectivity to one remote service, reviews. In general, however, if we wanted to enable any service to run either locally or remotely, we would need to create .global resources for every service. Fortunately, this process could be automated and likely will be in a future Istio release.

]]>
Thu, 07 Feb 2019 00:00:00 +0000/v1.24//blog/2019/multicluster-version-routing//v1.24//blog/2019/multicluster-version-routing/traffic-managementmulticluster
Sail the Blog!Welcome to the Istio blog!

To make it easier to publish your content on our website, we updated the content types guide.

The goal of the updated guide is to make sharing and finding content easier.

We want to make sharing timely information on Istio easy and the Istio blog is a good place to start.

We welcome your posts to the blog if you think your content falls into one of the following categories:

  • Your post details your experience using and configuring Istio. Ideally, your post shares a novel experience or perspective.
  • Your post highlights Istio features.
  • Your post details how to accomplish a task or fulfill a specific use case using Istio.

Posting your blog is only one PR away and, if you wish, you can request a review.

We look forward to reading about your Istio experience on the blog soon!

]]>
Tue, 05 Feb 2019 00:00:00 +0000/v1.24//blog/2019/sail-the-blog//v1.24//blog/2019/sail-the-blog/communityblogcontributionguideguidelineevent
Egress Gateway Performance InvestigationThe main objective of this investigation was to determine the impact on performance and resource utilization when an egress gateway is added to the service mesh to access an external service (MongoDB, in this case). The steps to configure an egress gateway for an external MongoDB are described in the blog Consuming External MongoDB Services.

The application used for this investigation was the Java version of Acmeair, which simulates an airline reservation system. This application is used in the Performance Regression Patrol of Istio daily builds, but on that setup the microservices have been accessing the external MongoDB directly via their sidecars, without an egress gateway.

The diagram below illustrates how regression patrol currently runs with Acmeair and Istio:

Acmeair benchmark in the Istio performance regression patrol environment

Another difference is that in the regression patrol setup, the application communicates with the external DB using the plain MongoDB protocol. The first change made for this study was to establish TLS communication between the MongoDB and its clients running within the application, as this is a more realistic scenario.

Several cases for accessing the external database from the mesh were tested; they are described next.

Egress traffic cases

Case 1: Bypassing the sidecar

In this case, the sidecar does not intercept the communication between the application and the external DB. This is accomplished by setting the init container argument -x with the CIDR of the MongoDB, which makes the sidecar ignore messages to/from this IP address. For example:

    - -x
    - "169.47.232.211/32"
Traffic to external MongoDB by-passing the sidecar
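
A pod annotation that excludes outbound IP ranges from interception should achieve the same bypass without editing the init container arguments directly. A minimal sketch of the relevant pod template fragment, using the MongoDB CIDR from this study:

template:
  metadata:
    annotations:
      # traffic to this CIDR bypasses the sidecar proxy
      traffic.sidecar.istio.io/excludeOutboundIPRanges: "169.47.232.211/32"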

Case 2: Through the sidecar, with service entry

This is the default configuration when the sidecar is injected into the application pod. All messages are intercepted by the sidecar and routed to the destination according to the configured rules, including the communication with external services. The MongoDB was defined as a ServiceEntry.

Sidecar intercepting traffic to external MongoDB
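
A minimal sketch of such a ServiceEntry, assuming a hypothetical hostname for the external MongoDB and TLS on the standard MongoDB port, might look like this:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-mongodb
spec:
  hosts:
  - my-mongodb.example.com   # hypothetical external DB hostname
  ports:
  - number: 27017
    name: tls
    protocol: TLS            # the application originates TLS; the sidecar passes it through
  resolution: DNS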

Case 3: Egress gateway

The egress gateway and corresponding destination rule and virtual service resources are defined for accessing MongoDB. All traffic to and from the external DB goes through the egress gateway (envoy).

Introduction of the egress gateway to access MongoDB

Case 4: Mutual TLS between sidecars and the egress gateway

In this case, there is an extra layer of security between the sidecars and the gateway, so some impact in performance is expected.

Enabling mutual TLS between sidecars and the egress gateway

Case 5: Egress gateway with SNI proxy

This scenario is used to evaluate the case where another proxy is required to access wildcarded domains. This may be required due to current limitations of Envoy. An nginx proxy was created as a sidecar in the egress gateway pod.

Egress gateway with additional SNI Proxy

Environment

  • Istio version: 1.0.2
  • K8s version: 1.10.5_1517
  • Acmeair App: 4 services (1 replica of each), inter-service transactions, external MongoDB, avg payload: 620 bytes.

Results

JMeter was used to generate the workload, which consisted of a sequence of 5-minute runs, each one using a growing number of clients making HTTP requests. The numbers of clients used were 1, 5, 10, 20, 30, 40, 50 and 60.

Throughput

The chart below shows the throughput obtained for the different cases:

Throughput obtained for the different cases

As you can see, there is no major impact in having sidecars and the egress gateway between the application and the external MongoDB, but enabling mutual TLS and then adding the SNI proxy caused a degradation in the throughput of about 10% and 24%, respectively.

Response time

The average response times for the different requests were collected when traffic was being driven with 20 clients. The chart below shows the average, median, and 90th, 95th and 99th percentile values for each case:

Response times obtained for the different configurations

Likewise, there is not much difference in the response times for the first three cases, but mutual TLS and the extra proxy add noticeable latency.

CPU utilization

The CPU usage was collected for all Istio components as well as for the sidecars during the runs. For a fair comparison, CPU used by Istio was normalized by the throughput obtained for a given run. The results are shown in the following graph:

CPU usage normalized by TPS

In terms of CPU consumption per transaction, Istio has used significantly more CPU only in the egress gateway + SNI proxy case.

Conclusion

In this investigation, we tried different options to access an external TLS-enabled MongoDB to compare their performance. The introduction of the egress gateway did not have a significant impact on performance, nor did it add meaningful CPU consumption. Only when enabling mutual TLS between the sidecars and the egress gateway, or when using an additional SNI proxy for wildcarded domains, could we observe some degradation.

]]>
Thu, 31 Jan 2019 00:00:00 +0000/v1.24//blog/2019/egress-performance//v1.24//blog/2019/egress-performance/performancetraffic-managementegressmongo
Demystifying Istio's Sidecar Injection ModelA simple overview of an Istio service-mesh architecture always starts with describing the control-plane and data-plane.

From Istio’s documentation:

Istio Architecture

It is important to understand that the sidecar injection into the application pods happens automatically, though manual injection is also possible. Traffic is directed from the application services to and from these sidecars without developers needing to worry about it. Once the applications are connected to the Istio service mesh, developers can start using and reaping the benefits of all that the service mesh has to offer. However, how does the data plane plumbing happen and what is really required to make it work seamlessly? In this post, we will deep-dive into the specifics of the sidecar injection models to gain a very clear understanding of how sidecar injection works.

Sidecar injection

In simple terms, sidecar injection is adding the configuration of additional containers to the pod template. The added containers needed for the Istio service mesh are:

istio-init This init container is used to set up the iptables rules so that inbound/outbound traffic will go through the sidecar proxy. An init container is different from an app container in the following ways:

  • It runs before an app container is started and it always runs to completion.
  • If there are multiple init containers, each must complete successfully before the next container is started.

So, you can see how this type of container is perfect for a set-up or initialization job which does not need to be a part of the actual application container. In this case, istio-init does just that and sets up the iptables rules.

istio-proxy This is the actual sidecar proxy (based on Envoy).

Manual injection

In the manual injection method, you can use istioctl to modify the pod template and add the configuration of the two containers previously mentioned. For both manual and automatic injection, Istio takes the configuration from the istio-sidecar-injector configuration map (configmap) and the mesh’s istio configmap.

Let’s look at the configuration of the istio-sidecar-injector configmap, to get an idea of what actually is going on.

$ kubectl -n istio-system get configmap istio-sidecar-injector -o=jsonpath='{.data.config}'
SNIPPET from the output:

policy: enabled
template: |-
  initContainers:
  - name: istio-init
    image: docker.io/istio/proxy_init:1.0.2
    args:
    - "-p"
    - [[ .MeshConfig.ProxyListenPort ]]
    - "-u"
    - 1337
    .....
    imagePullPolicy: IfNotPresent
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
    restartPolicy: Always

  containers:
  - name: istio-proxy
    image: [[ if (isset .ObjectMeta.Annotations "sidecar.istio.io/proxyImage") -]]
    "[[ index .ObjectMeta.Annotations "sidecar.istio.io/proxyImage" ]]"
    [[ else -]]
    docker.io/istio/proxyv2:1.0.2
    [[ end -]]
    args:
    - proxy
    - sidecar
    .....
    env:
    .....
    - name: ISTIO_META_INTERCEPTION_MODE
      value: [[ or (index .ObjectMeta.Annotations "sidecar.istio.io/interceptionMode") .ProxyConfig.InterceptionMode.String ]]
    imagePullPolicy: IfNotPresent
    securityContext:
      readOnlyRootFilesystem: true
      [[ if eq (or (index .ObjectMeta.Annotations "sidecar.istio.io/interceptionMode") .ProxyConfig.InterceptionMode.String) "TPROXY" -]]
      capabilities:
        add:
        - NET_ADMIN
    restartPolicy: Always
    .....

As you can see, the configmap contains the configuration for both the istio-init init container and the istio-proxy proxy container. The configuration includes the name of the container image and arguments like interception mode, capabilities, etc.

From a security point of view, it is important to note that istio-init requires the NET_ADMIN capability to modify iptables within the pod’s namespace, and so does istio-proxy if configured in TPROXY mode. As this is restricted to the pod’s namespace, there should be no problem. However, I have noticed that recent OpenShift versions may have some issues with it and a workaround is needed. One such option is mentioned at the end of this post.

To modify the current pod template for sidecar injection, you can:

$ istioctl kube-inject -f demo-red.yaml | kubectl apply -f -

OR

To use modified configmaps or local configmaps:

  • Create inject-config.yaml and mesh-config.yaml from the configmaps

    $ kubectl -n istio-system get configmap istio-sidecar-injector -o=jsonpath='{.data.config}' > inject-config.yaml
    $ kubectl -n istio-system get configmap istio -o=jsonpath='{.data.mesh}' > mesh-config.yaml
  • Modify the existing pod template, in my case, demo-red.yaml:

    $ istioctl kube-inject --injectConfigFile inject-config.yaml --meshConfigFile mesh-config.yaml --filename demo-red.yaml --output demo-red-injected.yaml
  • Apply the demo-red-injected.yaml

    $ kubectl apply -f demo-red-injected.yaml

As seen above, we create a new template using the sidecar-injector and the mesh configuration to then apply that new template using kubectl. If we look at the injected YAML file, it has the configuration of the Istio-specific containers, as we discussed above. Once we apply the injected YAML file, we see two containers running. One of them is the actual application container, and the other is the istio-proxy sidecar.

$ kubectl get pods | grep demo-red
demo-red-pod-8b5df99cc-pgnl7   2/2       Running   0          3d

The count is not 3 because the istio-init container is an init-type container that exits after doing what it is supposed to do, which is setting up the iptables rules within the pod. To confirm that the init container exited, let’s look at the output of kubectl describe:

$ kubectl describe pod demo-red-pod-8b5df99cc-pgnl7
SNIPPET from the output:

Name:               demo-red-pod-8b5df99cc-pgnl7
Namespace:          default
.....
Labels:             app=demo-red
                    pod-template-hash=8b5df99cc
                    version=version-red
Annotations:        sidecar.istio.io/status={"version":"3c0b8d11844e85232bc77ad85365487638ee3134c91edda28def191c086dc23e","initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-certs...
Status:             Running
IP:                 10.32.0.6
Controlled By:      ReplicaSet/demo-red-pod-8b5df99cc
Init Containers:
  istio-init:
    Container ID:  docker://bef731eae1eb3b6c9d926cacb497bb39a7d9796db49cd14a63014fc1a177d95b
    Image:         docker.io/istio/proxy_init:1.0.2
    Image ID:      docker-pullable://docker.io/istio/proxy_init@sha256:e16a0746f46cd45a9f63c27b9e09daff5432e33a2d80c8cc0956d7d63e2f9185
    .....
    State:          Terminated
      Reason:       Completed
    .....
    Ready:          True
Containers:
  demo-red:
    Container ID:   docker://8cd9957955ff7e534376eb6f28b56462099af6dfb8b9bc37aaf06e516175495e
    Image:          chugtum/blue-green-image:v3
    Image ID:       docker-pullable://docker.io/chugtum/blue-green-image@sha256:274756dbc215a6b2bd089c10de24fcece296f4c940067ac1a9b4aea67cf815db
    State:          Running
      Started:      Sun, 09 Dec 2018 18:12:31 -0800
    Ready:          True
  istio-proxy:
    Container ID:  docker://ca5d690be8cd6557419cc19ec4e76163c14aed2336eaad7ebf17dd46ca188b4a
    Image:         docker.io/istio/proxyv2:1.0.2
    Image ID:      docker-pullable://docker.io/istio/proxyv2@sha256:54e206530ba6ca9b3820254454e01b7592e9f986d27a5640b6c03704b3b68332
    Args:
      proxy
      sidecar
      .....
    State:          Running
      Started:      Sun, 09 Dec 2018 18:12:31 -0800
    Ready:          True
    .....

As seen in the output, the State of the istio-init container is Terminated with the Reason being Completed. The only two containers running are the main application demo-red container and the istio-proxy container.

Automatic injection

Most of the time, you don’t want to manually inject a sidecar every time you deploy an application using the istioctl command, but would prefer that Istio automatically inject the sidecar into your pod. This is the recommended approach, and for it to work, all you need to do is label the namespace where you are deploying the app with istio-injection=enabled.

Once labeled, Istio injects the sidecar automatically for any pod you deploy in that namespace. In the following example, the sidecar gets automatically injected in the deployed pods in the istio-dev namespace.
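
For example, the istio-dev namespace shown in the listing below would have been labeled with a single command like this (the namespace name is taken from this example):

$ kubectl label namespace istio-dev istio-injection=enabled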

$ kubectl get namespaces --show-labels
NAME           STATUS    AGE       LABELS
default        Active    40d       <none>
istio-dev      Active    19d       istio-injection=enabled
istio-system   Active    24d       <none>
kube-public    Active    40d       <none>
kube-system    Active    40d       <none>

But how does this work? To get to the bottom of this, we need to understand Kubernetes admission controllers.

From Kubernetes documentation:

For automatic sidecar injection, Istio relies on Mutating Admission Webhook. Let’s look at the details of the istio-sidecar-injector mutating webhook configuration.

$ kubectl get mutatingwebhookconfiguration istio-sidecar-injector -o yaml
SNIPPET from the output:

apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"admissionregistration.k8s.io/v1beta1","kind":"MutatingWebhookConfiguration","metadata":{"annotations":{},"labels":{"app":"istio-sidecar-injector","chart":"sidecarInjectorWebhook-1.0.1","heritage":"Tiller","release":"istio-remote"},"name":"istio-sidecar-injector","namespace":""},"webhooks":[{"clientConfig":{"caBundle":"","service":{"name":"istio-sidecar-injector","namespace":"istio-system","path":"/inject"}},"failurePolicy":"Fail","name":"sidecar-injector.istio.io","namespaceSelector":{"matchLabels":{"istio-injection":"enabled"}},"rules":[{"apiGroups":[""],"apiVersions":["v1"],"operations":["CREATE"],"resources":["pods"]}]}]}
  creationTimestamp: 2018-12-10T08:40:15Z
  generation: 2
  labels:
    app: istio-sidecar-injector
    chart: sidecarInjectorWebhook-1.0.1
    heritage: Tiller
    release: istio-remote
  name: istio-sidecar-injector
  .....
webhooks:
- clientConfig:
    service:
      name: istio-sidecar-injector
      namespace: istio-system
      path: /inject
  name: sidecar-injector.istio.io
  namespaceSelector:
    matchLabels:
      istio-injection: enabled
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - pods

This is where you can see the webhook namespaceSelector label that is matched for sidecar injection with the label istio-injection: enabled. In this case, you also see the operations and resources for which this is done when the pods are created. When an apiserver receives a request that matches one of the rules, the apiserver sends an admission review request to the webhook service specified in the clientConfig section, which has the name: istio-sidecar-injector key-value pair. We should be able to see that this service is running in the istio-system namespace.

$ kubectl get svc --namespace=istio-system | grep sidecar-injector
istio-sidecar-injector   ClusterIP   10.102.70.184   <none>        443/TCP             24d

This configuration ultimately does pretty much the same thing we saw with manual injection, except that it happens automatically during pod creation, so you won’t see the change in the deployment. You need to use kubectl describe to see the sidecar proxy and the init container.

The automatic sidecar injection not only depends on the namespaceSelector mechanism of the webhook, but also on the default injection policy and the per-pod override annotation.

If you look at the istio-sidecar-injector ConfigMap again, it has the default injection policy defined. In our case, it is enabled by default.

$ kubectl -n istio-system get configmap istio-sidecar-injector -o=jsonpath='{.data.config}'
SNIPPET from the output:

policy: enabled
template: |-
  initContainers:
  - name: istio-init
    image: "gcr.io/istio-release/proxy_init:1.0.2"
    args:
    - "-p"
    - [[ .MeshConfig.ProxyListenPort ]]

You can also use the annotation sidecar.istio.io/inject in the pod template to override the default policy. The following example disables the automatic injection of the sidecar for the pods in a Deployment.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ignored
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
    spec:
      containers:
      - name: ignored
        image: tutum/curl
        command: ["/bin/sleep","infinity"]

This example shows that whether or not the sidecar is automatically injected depends on several variables, based on your namespace, ConfigMap, and pod settings. They are:

  • webhooks namespaceSelector (istio-injection: enabled)
  • default policy (Configured in the ConfigMap istio-sidecar-injector)
  • per-pod override annotation (sidecar.istio.io/inject)

The injection status table shows a clear picture of the final injection status based on the value of the above variables.

Traffic flow from application container to sidecar proxy

Now that we are clear about how a sidecar container and an init container are injected into an application manifest, how does the sidecar proxy grab the inbound and outbound traffic to and from the container? We did briefly mention that it is done by setting up iptables rules within the pod namespace, which in turn is done by the istio-init container. Now, it is time to verify what actually gets updated within the namespace.

Let’s get into the application pod namespace we deployed in the previous section and look at the configured iptables. I am going to show an example using nsenter. Alternatively, you can enter the container in a privileged mode to see the same information. For folks without access to the nodes, using exec to get into the sidecar and running iptables is more practical.

$ docker inspect b8de099d3510 --format '{{ .State.Pid }}'
4125
$ nsenter -t 4125 -n iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N ISTIO_INBOUND
-N ISTIO_IN_REDIRECT
-N ISTIO_OUTPUT
-N ISTIO_REDIRECT
-A PREROUTING -p tcp -j ISTIO_INBOUND
-A OUTPUT -p tcp -j ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp -m tcp --dport 80 -j ISTIO_IN_REDIRECT
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15001
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -j ISTIO_REDIRECT
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -j ISTIO_REDIRECT
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001

The output above clearly shows that all the incoming traffic to port 80, the port our demo-red application is listening on, is now REDIRECTED to port 15001, the port that the istio-proxy, an Envoy proxy, is listening on. The same holds true for outgoing traffic.

This brings us to the end of this post. I hope it helped to demystify how Istio manages to inject the sidecar proxies into an existing deployment and how Istio routes the traffic to the proxy.

]]>
Thu, 31 Jan 2019 00:00:00 +0000/v1.24//blog/2019/data-plane-setup//v1.24//blog/2019/data-plane-setup/kubernetessidecar-injectiontraffic-management
Sidestepping Dependency Ordering with AppSwitchWe are going through an interesting cycle of application decomposition and recomposition. While the microservice paradigm is driving monolithic applications to be broken into separate individual services, the service mesh approach is helping them to be connected back together into well-structured applications. As such, microservices are logically separate but not independent. They are usually closely interdependent and taking them apart introduces many new concerns such as need for mutual authentication between services. Istio directly addresses most of those issues.

Dependency ordering problem

An issue that arises due to application decomposition and one that Istio doesn’t address is dependency ordering – bringing up individual services of an application in an order that guarantees that the application as a whole comes up quickly and correctly. In a monolithic application, with all its components built-in, dependency ordering between the components is enforced by internal locking mechanisms. But with individual services potentially scattered across the cluster in a service mesh, starting a service first requires checking that the services it depends on are up and available.

Dependency ordering is deceptively nuanced with a host of interrelated problems. Ordering individual services requires having the dependency graph of the services so that they can be brought up starting from leaf nodes back to the root nodes. It is not easy to construct such a graph and keep it updated over time as interdependencies evolve with the behavior of the application. Even if the dependency graph is somehow provided, enforcing the ordering itself is not easy. Simply starting the services in the specified order obviously won’t do. A service may have started but not be ready to accept connections yet. This is the problem with docker-compose’s depends_on option, for example.

Apart from introducing sufficiently long sleeps between service startups, a common pattern that is often used is to check for readiness of dependencies before starting a service. In Kubernetes, this could be done with a wait script as part of the init container of the pod. However that means that the entire application would be held up until all its dependencies come alive. Sometimes applications spend several minutes initializing themselves on startup before making their first outbound connection. Not allowing a service to start at all adds substantial overhead to overall startup time of the application. Also, the strategy of waiting on the init container won’t work for the case of multiple interdependent services within the same pod.
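
For illustration, here is a minimal sketch of that wait-script pattern in Kubernetes, assuming a hypothetical dependency service named dmgr-service listening on port 9043; the pod’s main containers are held back until the dependency accepts TCP connections:

initContainers:
- name: wait-for-dmgr
  image: busybox
  # block pod startup until the hypothetical dmgr-service dependency is reachable on port 9043
  command: ['sh', '-c', 'until nc -z dmgr-service 9043; do echo waiting for dmgr; sleep 2; done']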

Example scenario: IBM WebSphere ND

Let us consider IBM WebSphere ND – a widely deployed application middleware – to grok these problems more closely. It is a fairly complex framework in itself and consists of a central component called the deployment manager (dmgr) that manages a set of node instances. It uses UDP to negotiate cluster membership among the nodes and requires that the deployment manager be up and operational before any of the node instances can come up and join the cluster.

Why are we talking about a traditional application in the modern cloud-native context? It turns out that there are significant gains to be had by enabling them to run on the Kubernetes and Istio platforms. Essentially it’s a part of the modernization journey that allows running traditional apps alongside green-field apps on the same modern platform to facilitate interoperation between the two. In fact, WebSphere ND is a demanding application. It expects a consistent network environment with specific network interface attributes etc. AppSwitch is equipped to take care of those requirements. For the purpose of this blog however, I’ll focus on the dependency ordering requirement and how AppSwitch addresses it.

Simply deploying dmgr and node instances as pods on a Kubernetes cluster does not work. dmgr and the node instances happen to have a lengthy initialization process that can take several minutes. If they are all co-scheduled, the application typically ends up in a funny state. When a node instance comes up and finds that dmgr is missing, it takes an alternate startup path. If instead it had exited immediately, the Kubernetes crash loop would have taken over and perhaps the application would have come up. But even in that case, it turns out that a timely startup is not guaranteed.

One dmgr along with its node instances is a basic deployment configuration for WebSphere ND. Applications like IBM Business Process Manager that are built on top of WebSphere ND running in production environments include several other services. In those configurations, there could be a chain of interdependencies. Depending on the applications hosted by the node instances, there may be an ordering requirement among them as well. With long service initialization times and crash-loop restarts, there is little chance for the application to start in any reasonable length of time.

Sidecar dependency in Istio

Istio itself is affected by a version of the dependency ordering problem. Since connections into and out of a service running under Istio are redirected through its sidecar proxy, an implicit dependency is created between the application service and its sidecar. Unless the sidecar is fully operational, all requests from and to the service get dropped.

Dependency ordering with AppSwitch

So how do we go about addressing these issues? One way is to defer it to the applications and say that they are supposed to be “well behaved” and implement appropriate logic to make themselves immune to startup order issues. However, many applications (especially traditional ones) either time out or deadlock if misordered. Even for new applications, implementing one-off logic for each service is a substantial additional burden that is best avoided. Service mesh needs to provide adequate support around these problems. After all, factoring out common patterns into an underlying framework is really the point of service mesh.

AppSwitch explicitly addresses dependency ordering. It sits on the control path of the application’s network interactions between clients and services in a cluster and knows precisely when a service becomes a client by making the connect call and when a particular service becomes ready to accept connections by making the listen call. Its service router component disseminates information about these events across the cluster and arbitrates interactions among clients and servers. That is how AppSwitch implements functionality such as load balancing and isolation in a simple and efficient manner. Leveraging the same strategic location on the application’s network control path, it is conceivable that the connect and listen calls made by those services can be lined up at a finer granularity rather than coarsely sequencing entire services as per a dependency graph. That would effectively solve the multilevel dependency problem and speed up application startup.

But that still requires a dependency graph. A number of products and tools exist to help with discovering service dependencies. But they are typically based on passive monitoring of network traffic and cannot provide the information beforehand for any arbitrary application. Network level obfuscation due to encryption and tunneling also makes them unreliable. The burden of discovering and specifying the dependencies ultimately falls to the developer or the operator of the application. As it is, even consistency checking a dependency specification is itself quite complex and any way to avoid requiring a dependency graph would be most desirable.

The point of a dependency graph is to know which clients depend on a particular service so that those clients can then be made to wait for the respective service to become live. But does it really matter which specific clients? Ultimately one tautology that always holds is that all clients of a service have an implicit dependency on the service. That’s what AppSwitch leverages to get around the requirement. In fact, that sidesteps dependency ordering altogether. All services of the application can be co-scheduled without regard to any startup order. Interdependencies among them automatically work themselves out at the granularity of individual requests and responses, resulting in quick and correct application startups.

AppSwitch model and constructs

Now that we have a conceptual understanding of AppSwitch’s high-level approach, let’s look at the constructs involved. But first a quick summary of the usage model is in order. Even though it is written for a different context, reviewing my earlier blog on this topic would be useful as well. For completeness, let me also note AppSwitch doesn’t bother with non-network dependencies. For example it may be possible for two services to interact using IPC mechanisms or through the shared file system. Processes with deep ties like that are typically part of the same service anyway and don’t require framework’s intervention for ordering.

At its core, AppSwitch is built on a mechanism that allows instrumenting the BSD socket API and other related calls like fcntl and ioctl that deal with sockets. As interesting as the details of its implementation are, they would distract us from the main topic, so I’ll just summarize the key properties that distinguish it from other implementations:

  • It’s fast. It uses a combination of seccomp filtering and binary instrumentation to aggressively limit intervening with the application’s normal execution. AppSwitch is particularly suited for service mesh and application networking use cases, given that it implements those features without ever having to actually touch the data. In contrast, network-level approaches incur a per-packet cost. Take a look at this blog for some of the performance measurements.
  • It doesn’t require any kernel support, kernel module or patch, and works on standard distro kernels.
  • It can run as a regular user (no root). In fact, the mechanism can even make it possible to run the Docker daemon without root by removing the root requirement for networking containers.
  • It doesn’t require any changes to the applications whatsoever and works for any type of application – from WebSphere ND and SAP to custom C apps to statically linked Go apps. The only requirement at this point is Linux/x86.

Decoupling services from their references

AppSwitch is built on the fundamental premise that applications should be decoupled from their references. The identity of applications is traditionally derived from the identity of the host on which they run. However, applications and hosts are very different objects that need to be referenced independently. Detailed discussion around this topic along with a conceptual foundation of AppSwitch is presented in this research paper.

The central AppSwitch construct that achieves the decoupling between service objects and their identities is the service reference (reference, for short). AppSwitch implements service references based on the API instrumentation mechanism outlined above. A service reference consists of an IP:port pair (and optionally a DNS name) and a label-selector that selects the service represented by the reference and the clients to which this reference applies. A reference supports a few key properties:

  1. It can be named independently of the name of the object it refers to. That is, a service may be listening on an IP and port, but a reference allows that service to be reached on any other IP and port chosen by the user. This is what allows AppSwitch to take traditional applications captured from their source environments with static IP configurations and run them on Kubernetes, by providing them with the necessary IP addresses and ports regardless of the target network environment.
  2. It remains unchanged even if the location of the target service changes. A reference automatically redirects itself as its label-selector resolves to the new instance of the service.
  3. Most important for this discussion, a reference remains valid even as the target service is coming up.

To facilitate discovering services that can be accessed through service references, AppSwitch provides an auto-curated service registry. The registry is automatically kept up to date as services come and go across the cluster, based on the network API that AppSwitch tracks. Each entry in the registry consists of the IP and port where the respective service is bound. Along with that, it includes a set of labels indicating the application to which this service belongs, the IP and port that the application passed through the socket API when creating the service, the IP and port where AppSwitch actually bound the service on the underlying host on behalf of the application, and so on. In addition, applications created under AppSwitch carry a set of labels passed by the user that describe the application, together with a few default system labels indicating the user that created the application, the host where the application is running, and so on. These labels are all available to be expressed in the label-selector carried by a service reference. A service in the registry can be made accessible to clients by creating a service reference. A client would then be able to reach the service at the reference’s name (IP:port). Now let’s look at how AppSwitch guarantees that the reference remains valid even when the target service has not yet come up.

Non-blocking requests

AppSwitch leverages the semantics of the BSD socket API to ensure that service references appear valid from the perspective of clients as the corresponding services come up. When a client makes a blocking connect call to another service that has not yet come up, AppSwitch blocks the call for a certain time, waiting for the target service to become live. Since it is known that the target service is a part of the application and is expected to come up shortly, making the client block rather than returning an error such as ECONNREFUSED prevents the application from failing to start. If the service doesn’t come up within that time, an error is returned to the application so that framework-level mechanisms like the Kubernetes crash loop can kick in.

If the client request is marked as non-blocking, AppSwitch handles that by returning EAGAIN to inform the application to retry rather than give up. Once again, that is in-line with the semantics of socket API and prevents failures due to startup races. AppSwitch essentially enables the retry logic already built into applications in support of the BSD socket API to be transparently repurposed for dependency ordering.

Application timeouts

What if the application times out based on its own internal timer? Truth be told, AppSwitch can also fake the application’s perception of time if wanted, but that would be overstepping and is actually unnecessary. The application decides, and knows best, how long it should wait, and it’s not appropriate for AppSwitch to interfere with that. Application timeouts are conservatively long, and if the target service still hasn’t come up in time, it is unlikely to be a dependency ordering issue. There must be something else going on that should not be masked.

Wildcard service references for sidecar dependency

Service references can be used to address the Istio sidecar dependency issue mentioned earlier. AppSwitch allows the IP:port specified as part of a service reference to be a wildcard. That is, the service reference IP address can be a netmask indicating the IP address range to be captured. If the label selector of the service reference points to the sidecar service, then all outgoing connections of any application to which this service reference applies will be transparently redirected to the sidecar. And of course, the service reference remains valid while the sidecar is still coming up, so the race is removed.

Using service references for sidecar dependency ordering also implicitly redirects the application’s connections to the sidecar without requiring iptables and the attendant privilege issues. Essentially it works as if the application is directly making connections to the sidecar rather than the target destination, leaving the sidecar in charge of what to do. AppSwitch would inject metadata about the original destination into the data stream of the connection using the proxy protocol, which the sidecar could decode before passing the connection through to the application. Some of these details were discussed here. That takes care of outbound connections, but what about incoming connections? With all services and their sidecars running under AppSwitch, any incoming connections that would have come from remote nodes would be redirected to their respective remote sidecars. So nothing special needs to be done for incoming connections.

Summary

Dependency ordering is a pesky problem. This is mostly due to the lack of access to fine-grained application-level events around inter-service interactions. Addressing this problem would normally have required applications to implement their own internal logic. But AppSwitch allows those internal application events to be instrumented without requiring application changes. AppSwitch then leverages the ubiquitous support for the BSD socket API to sidestep the requirement of ordering dependencies.

Acknowledgements

Thanks to Eric Herness and team for their insights and support with IBM WebSphere and BPM products as we modernized them onto the Kubernetes platform and to Mandar Jog, Martin Taillefer and Shriram Rajagopalan for reviewing early drafts of this blog.

]]>
Mon, 14 Jan 2019 00:00:00 +0000/v1.24//blog/2019/appswitch//v1.24//blog/2019/appswitch/appswitchperformance
Deploy a Custom Ingress Gateway Using Cert-ManagerThis post provides instructions to manually create a custom ingress gateway with automatic provisioning of certificates based on cert-manager.

Creating a custom ingress gateway can be useful, for example, when you want a separate load balancer in order to isolate traffic.

Before you begin

  • Set up Istio by following the instructions in the Installation guide.
  • Set up cert-manager with its Helm chart (a sample command is shown after this list)
  • We will use demo.mydemo.com for our example; it must resolve through your DNS
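
If cert-manager is not installed yet, one possible Helm v2 invocation from the era of this post is sketched below; check the cert-manager documentation for the chart source and version that match your environment:

$ helm install --name cert-manager --namespace istio-system stable/cert-manager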

Configuring the custom ingress gateway

  1. Check if cert-manager was installed using Helm with the following command:

    $ helm ls

    The output should be similar to the example below and show cert-manager with a STATUS of DEPLOYED:

    NAME   REVISION UPDATED                  STATUS   CHART                     APP VERSION   NAMESPACE
    istio     1     Thu Oct 11 13:34:24 2018 DEPLOYED istio-1.0.X               1.0.X         istio-system
    cert      1     Wed Oct 24 14:08:36 2018 DEPLOYED cert-manager-v0.6.0-dev.2 v0.6.0-dev.2  istio-system
  2. To create the cluster’s issuer, apply the following configuration:

    apiVersion: certmanager.k8s.io/v1alpha1
    kind: ClusterIssuer
    metadata:
      name: letsencrypt-demo
      namespace: kube-system
    spec:
      acme:
        # The ACME server URL
        server: https://acme-v02.api.letsencrypt.org/directory
        # Email address used for ACME registration
        email: <REDACTED>
        # Name of a secret used to store the ACME account private key
        privateKeySecretRef:
          name: letsencrypt-demo
        dns01:
          # Here we define a list of DNS-01 providers that can solve DNS challenges
          providers:
          - name: your-dns
            route53:
              accessKeyID: <REDACTED>
              region: eu-central-1
              secretAccessKeySecretRef:
                name: prod-route53-credentials-secret
                key: secret-access-key
  3. If you use the route53 provider, you must provide a secret to perform DNS ACME Validation. To create the secret, apply the following configuration file:

    apiVersion: v1
    kind: Secret
    metadata:
      name: prod-route53-credentials-secret
    type: Opaque
    data:
      secret-access-key: <REDACTED BASE64>
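
    As an alternative that handles the base64 encoding for you, the same secret can be created with kubectl (a sketch; the secret must live in the namespace where cert-manager looks for cluster issuer secrets, which is its own namespace by default, istio-system in this setup):

    $ kubectl create secret generic prod-route53-credentials-secret \
      --from-literal=secret-access-key=<YOUR AWS SECRET ACCESS KEY> \
      -n istio-system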
  4. Create your own certificate:

    apiVersion: certmanager.k8s.io/v1alpha1
    kind: Certificate
    metadata:
      name: demo-certificate
      namespace: istio-system
    spec:
      acme:
        config:
        - dns01:
            provider: your-dns
          domains:
          - '*.mydemo.com'
      commonName: '*.mydemo.com'
      dnsNames:
      - '*.mydemo.com'
      issuerRef:
        kind: ClusterIssuer
        name: letsencrypt-demo
      secretName: istio-customingressgateway-certs

    Make a note of the value of secretName since a future step requires it.

  5. To scale automatically, declare a new horizontal pod autoscaler with the following configuration:

    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      name: my-ingressgateway
      namespace: istio-system
    spec:
      maxReplicas: 5
      minReplicas: 1
      scaleTargetRef:
        apiVersion: apps/v1beta1
        kind: Deployment
        name: my-ingressgateway
      targetCPUUtilizationPercentage: 80
    status:
      currentCPUUtilizationPercentage: 0
      currentReplicas: 1
      desiredReplicas: 1
  6. Apply your deployment using the declaration provided in the yaml definition.
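
    The deployment itself is provided in an external yaml definition. If you need a starting point, one approach (an assumption, not the author’s exact file) is to copy the stock istio-ingressgateway deployment, rename and relabel it as my-ingressgateway, and mount the istio-customingressgateway-certs secret noted above at /etc/istio/ingressgateway-certs:

    $ kubectl get deployment istio-ingressgateway -n istio-system -o yaml > my-ingressgateway.yaml
    # edit my-ingressgateway.yaml: set the name and the app/istio labels to my-ingressgateway,
    # and reference the istio-customingressgateway-certs secret in the ingressgateway-certs volume
    $ kubectl apply -f my-ingressgateway.yaml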

  7. Create your service:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-ingressgateway
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: nlb
      labels:
        app: my-ingressgateway
        istio: my-ingressgateway
    spec:
      type: LoadBalancer
      selector:
        app: my-ingressgateway
        istio: my-ingressgateway
      ports:
        -
          name: http2
          nodePort: 32380
          port: 80
          targetPort: 80
        -
          name: https
          nodePort: 32390
          port: 443
        -
          name: tcp
          nodePort: 32400
          port: 31400
  8. Create your Istio custom gateway configuration object:

    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      annotations:
      name: istio-custom-gateway
      namespace: default
    spec:
      selector:
        istio: my-ingressgateway
      servers:
      - hosts:
        - '*.mydemo.com'
        port:
          name: http
          number: 80
          protocol: HTTP
        tls:
          httpsRedirect: true
      - hosts:
        - '*.mydemo.com'
        port:
          name: https
          number: 443
          protocol: HTTPS
        tls:
          mode: SIMPLE
          privateKey: /etc/istio/ingressgateway-certs/tls.key
          serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
  9. Link your istio-custom-gateway with your VirtualService:

    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: my-virtualservice
    spec:
      hosts:
      - "demo.mydemo.com"
      gateways:
      - istio-custom-gateway
      http:
      - route:
        - destination:
            host: my-demoapp
  10. Verify that the correct certificate is returned by the server and that it is successfully verified (SSL certificate verify ok is printed):

    $ curl -v https://demo.mydemo.com
    Server certificate:
      SSL certificate verify ok.

Congratulations! You can now use your custom istio-custom-gateway gateway configuration object.

]]>
Thu, 10 Jan 2019 00:00:00 +0000/v1.24//blog/2019/custom-ingress-gateway//v1.24//blog/2019/custom-ingress-gateway/ingresstraffic-management
Announcing discuss.istio.ioWe in the Istio community have been working to find the right medium for users to engage with other members of the community – to ask questions, to get help from other users, and to engage with developers working on the project.

We’ve tried several different avenues, but each has had some downsides. RocketChat was our most recent endeavor, but the lack of certain features (for example, threading) meant it wasn’t ideal for any longer discussions around a single issue. It also led to a dilemma for some users – when should I email istio-users@googlegroups.com and when should I use RocketChat?

We think we’ve found the right balance of features in a single platform, and we’re happy to announce discuss.istio.io. It’s a full-featured forum where we will have discussions about Istio from here on out. It will allow you to ask a question and get threaded replies! As a real bonus, you can use your GitHub identity.

If you prefer emails, you can configure it to send emails just like Google groups did.

We will be marking our Google groups “read only” so that the content remains, but we ask you to send further questions over to discuss.istio.io. If you have any outstanding questions or discussions in the groups, please move the conversation over.

Happy meshing!

]]>
Thu, 10 Jan 2019 00:00:00 +0000/v1.24//blog/2019/announcing-discuss.istio.io//v1.24//blog/2019/announcing-discuss.istio.io/
Incremental Istio Part 1, Traffic ManagementTraffic management is one of the critical benefits provided by Istio. At the heart of Istio’s traffic management is the ability to decouple traffic flow and infrastructure scaling. This lets you control your traffic in ways that aren’t possible without a service mesh like Istio.

For example, let’s say you want to execute a canary deployment. With Istio, you can specify that v1 of a service receives 90% of incoming traffic, while v2 of that service only receives 10%. With standard Kubernetes deployments, the only way to achieve this is to manually control the number of available Pods for each version, for example 9 Pods running v1 and 1 Pod running v2. This type of manual control is hard to implement, and over time may have trouble scaling. For more information, check out Canary Deployments using Istio.

The same issue exists when deploying updates to existing services. While you can update deployments with Kubernetes, it requires replacing v1 Pods with v2 Pods. Using Istio, you can deploy v2 of your service and use built-in traffic management mechanisms to shift traffic to your updated services at a network level, then remove the v1 Pods.

In addition to canary deployments and general traffic shifting, Istio also gives you the ability to implement dynamic request routing (based on HTTP headers), failure recovery, retries, circuit breakers, and fault injection. For more information, check out the Traffic Management documentation.

This post walks through a technique that highlights a particularly useful way that you can implement Istio incrementally – in this case, only the traffic management features – without having to individually update each of your Pods.

Setup: why implement Istio traffic management features?

Of course, the first question is: Why would you want to do this?

If you’re part of one of the many organizations out there that have a large cluster with lots of teams deploying, the answer is pretty clear. Let’s say Team A is getting started with Istio and wants to start some canary deployments on Service A, but Team B hasn’t started using Istio, so they don’t have sidecars deployed.

With Istio, Team A can still implement their canaries by having Service B call Service A through Istio’s ingress gateway.

Background: traffic routing in an Istio mesh

But how can you use Istio’s traffic management capabilities without updating each of your applications’ Pods to include the Istio sidecar? Before answering that question, let’s take a quick high-level look at how traffic enters an Istio mesh and how it’s routed.

Pods that are part of the Istio mesh contain a sidecar proxy that is responsible for mediating all inbound and outbound traffic to the Pod. Within an Istio mesh, Pilot is responsible for converting high-level routing rules into configurations and propagating them to the sidecar proxies. That means when services communicate with one another, their routing decisions are determined from the client side.

Let’s say you have two services that are part of the Istio mesh, Service A and Service B. When A wants to communicate with B, the sidecar proxy of Pod A is responsible for directing traffic to Service B. For example, if you wanted to split traffic 50/50 across Service B v1 and v2, the traffic would flow as follows:

50/50 Traffic Split

If Services A and B are not part of the Istio mesh, there is no sidecar proxy that knows how to route traffic to different versions of Service B. In that case you need to use another approach to get traffic from Service A to Service B, following the 50/50 rules you’ve set up.

Fortunately, a standard Istio deployment already includes a Gateway that specifically deals with ingress traffic outside of the Istio mesh. This Gateway is used to allow ingress traffic from outside the cluster via an external load balancer, or to allow ingress traffic from within the Kubernetes cluster but outside the service mesh. It can be configured to proxy incoming ingress traffic to the appropriate Pods, even if they don’t have a sidecar proxy. While this approach allows you to leverage Istio’s traffic management features, it does mean that traffic going through the ingress gateway will incur an extra hop.

50/50 Traffic Split using Ingress Gateway

In action: traffic routing with Istio

A simple way to see this type of approach in action is to first set up your Kubernetes environment using the Platform Setup instructions, and then install the minimal Istio profile using Helm, including only the traffic management components (ingress gateway, egress gateway, Pilot). The following example uses Google Kubernetes Engine.

First, set up and configure GKE:

$ gcloud container clusters create istio-inc --zone us-central1-f
$ gcloud container clusters get-credentials istio-inc
$ kubectl create clusterrolebinding cluster-admin-binding \
   --clusterrole=cluster-admin \
   --user=$(gcloud config get-value core/account)

Next, install Helm and generate a minimal Istio install – only traffic management components:

$ helm template install/kubernetes/helm/istio \
  --name istio \
  --namespace istio-system \
  --set security.enabled=false \
  --set galley.enabled=false \
  --set sidecarInjectorWebhook.enabled=false \
  --set mixer.enabled=false \
  --set prometheus.enabled=false \
  --set pilot.sidecar=false > istio-minimal.yaml

Then create the istio-system namespace and deploy Istio:

$ kubectl create namespace istio-system
$ kubectl apply -f istio-minimal.yaml

Next, deploy the Bookinfo sample without the Istio sidecar containers:

Zip
$ kubectl apply -f @samples/bookinfo/platform/kube/bookinfo.yaml@

Now, configure a new Gateway that allows access to the reviews service from outside the Istio mesh, a new VirtualService that splits traffic evenly between v1 and v2 of the reviews service, and a set of new DestinationRule resources that match destination subsets to service versions:

$ cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: reviews-gateway
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - "*"
  gateways:
  - reviews-gateway
  http:
  - match:
    - uri:
        prefix: /reviews
    route:
    - destination:
        host: reviews
        subset: v1
      weight: 50
    - destination:
        host: reviews
        subset: v2
      weight: 50
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
  - name: v3
    labels:
      version: v3
EOF

Finally, deploy a pod that you can use for testing with curl (and without the Istio sidecar container):

Zip
$ kubectl apply -f @samples/sleep/sleep.yaml@

Testing your deployment

Now, you can test different behaviors using the curl commands via the sleep Pod.

The first example is to issue requests to the reviews service using standard Kubernetes service DNS behavior (note: jq is used in the examples below to filter the output from curl):

$ export SLEEP_POD=$(kubectl get pod -l app=sleep \
  -o jsonpath={.items..metadata.name})
$ for i in `seq 3`; do \
  kubectl exec -it $SLEEP_POD curl http://reviews:9080/reviews/0 | \
  jq '.reviews|.[]|.rating?'; \
  done
{
  "stars": 5,
  "color": "black"
}
{
  "stars": 4,
  "color": "black"
}
null
null
{
  "stars": 5,
  "color": "red"
}
{
  "stars": 4,
  "color": "red"
}

Notice how we’re getting responses from all three versions of the reviews service (null is from reviews v1, which doesn’t have ratings) and not getting the even split across v1 and v2. This is expected behavior, because the curl command is using Kubernetes service load balancing across all three versions of the reviews service. To get the 50/50 split across reviews v1 and v2, we need to access the service via the ingress Gateway:

$ for i in `seq 4`; do \
  kubectl exec -it $SLEEP_POD curl http://istio-ingressgateway.istio-system/reviews/0 | \
  jq '.reviews|.[]|.rating?'; \
  done
{
  "stars": 5,
  "color": "black"
}
{
  "stars": 4,
  "color": "black"
}
null
null
{
  "stars": 5,
  "color": "black"
}
{
  "stars": 4,
  "color": "black"
}
null
null

Mission accomplished! This post showed how to deploy a minimal installation of Istio that only contains the traffic management components (Pilot, ingress Gateway), and then use those components to direct traffic to specific versions of the reviews service. And it wasn’t necessary to deploy the Istio sidecar proxy to gain these capabilities, so there was little to no interruption of existing workloads or applications.

Using the built-in ingress gateway (along with some VirtualService and DestinationRule resources) this post showed how you can easily leverage Istio’s traffic management for cluster-external ingress traffic and cluster-internal service-to-service traffic. This technique is a great example of an incremental approach to adopting Istio, and can be especially useful in real-world cases where Pods are owned by different teams or deployed to different namespaces.

]]>
Wed, 21 Nov 2018 00:00:00 +0000/v1.24//blog/2018/incremental-traffic-management//v1.24//blog/2018/incremental-traffic-management/traffic-managementgateway
Consuming External MongoDB ServicesIn the Consuming External TCP Services blog post, I described how external services can be consumed by in-mesh Istio applications via TCP. In this post, I demonstrate consuming external MongoDB services. You use the Istio Bookinfo sample application, the version in which the book ratings data is persisted in a MongoDB database. You deploy this database outside the cluster and configure the ratings microservice to use it. You will learn multiple options of controlling traffic to external MongoDB services and their pros and cons.

Bookinfo with external ratings database

First, you set up a MongoDB database instance to hold book ratings data outside of your Kubernetes cluster. Then you modify the Bookinfo sample application to use your database.

Setting up the ratings database

For this task you set up an instance of MongoDB. You can use any MongoDB instance; I used Compose for MongoDB.

  1. Set an environment variable for the password of your admin user. To prevent the password from being preserved in the Bash history, remove the command from the history immediately after running the command, using history -d.

    $ export MONGO_ADMIN_PASSWORD=<your MongoDB admin password>
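
    For example, in bash you can look up the history entry number of the export command and delete it (a sketch):

    $ history | tail -n 2   # note the number of the export line
    $ history -d <number of the export line>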
  2. Set an environment variable for the password of the new user you will create, namely bookinfo. Remove the command from the history using history -d.

    $ export BOOKINFO_PASSWORD=<password>
  3. Set environment variables for your MongoDB service, MONGODB_HOST and MONGODB_PORT.

  4. Create the bookinfo user:

    $ cat <<EOF | mongo --ssl --sslAllowInvalidCertificates $MONGODB_HOST:$MONGODB_PORT -u admin -p $MONGO_ADMIN_PASSWORD --authenticationDatabase admin
    use test
    db.createUser(
       {
         user: "bookinfo",
         pwd: "$BOOKINFO_PASSWORD",
         roles: [ "read"]
       }
    );
    EOF
  5. Create a collection to hold ratings. The following command sets both ratings to 1 to provide a visual clue when your database is used by the Bookinfo ratings service (the default Bookinfo ratings are 4 and 5).

    $ cat <<EOF | mongo --ssl --sslAllowInvalidCertificates $MONGODB_HOST:$MONGODB_PORT -u admin -p $MONGO_ADMIN_PASSWORD --authenticationDatabase admin
    use test
    db.createCollection("ratings");
    db.ratings.insert(
      [{rating: 1},
       {rating: 1}]
    );
    EOF
  6. Check that the bookinfo user can get ratings:

    $ cat <<EOF | mongo --ssl --sslAllowInvalidCertificates $MONGODB_HOST:$MONGODB_PORT -u bookinfo -p $BOOKINFO_PASSWORD --authenticationDatabase test
    use test
    db.ratings.find({});
    EOF

    The output should be similar to:

    MongoDB server version: 3.4.10
    switched to db test
    { "_id" : ObjectId("5b7c29efd7596e65b6ed2572"), "rating" : 1 }
    { "_id" : ObjectId("5b7c29efd7596e65b6ed2573"), "rating" : 1 }
    bye

Initial setting of Bookinfo application

To demonstrate the scenario of using an external database, you start with a Kubernetes cluster with Istio installed. Then you deploy the Istio Bookinfo sample application, apply the default destination rules, and change Istio to the blocking-egress-by-default policy.
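
For reference, on an Istio 1.0 installation one way to switch to the blocking-by-default egress policy was to change global.outboundTrafficPolicy.mode from ALLOW_ANY to REGISTRY_ONLY in the istio configmap. A sketch follows; verify it against the installation options you actually used:

$ kubectl get configmap istio -n istio-system -o yaml \
  | sed 's/mode: ALLOW_ANY/mode: REGISTRY_ONLY/g' \
  | kubectl replace -n istio-system -f -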

This application uses the ratings microservice to fetch book ratings, a number between 1 and 5. The ratings are displayed as stars for each review. There are several versions of the ratings microservice. You will deploy the version that uses MongoDB as the ratings database in the next subsection.

The example commands in this blog post work with Istio 1.0.

As a reminder, here is the end-to-end architecture of the application from the Bookinfo sample application.

The original Bookinfo application

Use the external database in Bookinfo application

  1. Deploy the spec of the ratings microservice that uses a MongoDB database (ratings v2):

    Zip
    $ kubectl apply -f @samples/bookinfo/platform/kube/bookinfo-ratings-v2.yaml@
    serviceaccount "bookinfo-ratings-v2" created
    deployment "ratings-v2" created
  2. Update the MONGO_DB_URL environment variable to the value of your MongoDB:

    $ kubectl set env deployment/ratings-v2 "MONGO_DB_URL=mongodb://bookinfo:$BOOKINFO_PASSWORD@$MONGODB_HOST:$MONGODB_PORT/test?authSource=test&ssl=true"
    deployment.extensions/ratings-v2 env updated
  3. Route all the traffic destined to the reviews service to its v3 version. You do this to ensure that the reviews service always calls the ratings service. In addition, route all the traffic destined to the ratings service to ratings v2 that uses your database.

    Specify the routing for both services above by adding two virtual services. These virtual services are specified in samples/bookinfo/networking/virtual-service-ratings-db.yaml of an Istio release archive. Important: make sure you applied the default destination rules before running the following command.

    Zip
    $ kubectl apply -f @samples/bookinfo/networking/virtual-service-ratings-db.yaml@

The updated architecture appears below. Note that the blue arrows inside the mesh mark the traffic configured according to the virtual services we added. According to the virtual services, the traffic is sent to reviews v3 and ratings v2.

The Bookinfo application with ratings v2 and an external MongoDB database

Note that the MongoDB database is outside the Istio service mesh, or more precisely outside the Kubernetes cluster. The boundary of the service mesh is marked by a dashed line.

Access the webpage

Access the webpage of the application, after determining the ingress IP and port.

Since you have not configured egress traffic control yet, access to the MongoDB service is blocked by Istio. This is why, instead of the rating stars, the message “Ratings service is currently unavailable” is displayed below each review:

The Ratings service error messages

In the following sections you will configure egress access to the external MongoDB service, using different options for egress control in Istio.

Egress control for TCP

Since MongoDB Wire Protocol runs on top of TCP, you can control the egress traffic to your MongoDB as traffic to any other external TCP service. To control TCP traffic, a block of IPs in the CIDR notation that includes the IP address of your MongoDB host must be specified. The caveat here is that sometimes the IP of the MongoDB host is not stable or known in advance.

In cases where the IP of the MongoDB host is not stable, the egress traffic can either be controlled as TLS traffic, or routed directly, bypassing the Istio sidecar proxies (a sketch of the bypass option is shown below).
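
The bypass option can be achieved with the traffic.sidecar.istio.io/includeOutboundIPRanges pod annotation, which limits the outbound IP ranges that the sidecar intercepts. A minimal sketch, assuming you restrict interception to your cluster-internal ranges (the CIDR value is a placeholder you must replace with your cluster’s service and pod ranges):

$ kubectl patch deployment ratings-v2 --type merge -p \
  '{"spec":{"template":{"metadata":{"annotations":{"traffic.sidecar.istio.io/includeOutboundIPRanges":"<your cluster-internal CIDRs>"}}}}}'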

Get the IP address of your MongoDB database instance. As an option, you can use the host command:

$ export MONGODB_IP=$(host $MONGODB_HOST | grep " has address " | cut -d" " -f4)

Control TCP egress traffic without a gateway

In case you do not need to direct the traffic through an egress gateway, for example if you do not have a requirement that all the traffic that exits your mesh must exit through the gateway, follow the instructions in this section. Alternatively, if you do want to direct your traffic through an egress gateway, proceed to Direct TCP egress traffic through an egress gateway.

  1. Define a TCP mesh-external service entry:

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: mongo
    spec:
      hosts:
      - my-mongo.tcp.svc
      addresses:
      - $MONGODB_IP/32
      ports:
      - number: $MONGODB_PORT
        name: tcp
        protocol: TCP
      location: MESH_EXTERNAL
      resolution: STATIC
      endpoints:
      - address: $MONGODB_IP
    EOF

    Note that the protocol TCP is specified instead of MONGO because the traffic can be encrypted if the MongoDB protocol runs on top of TLS. If the traffic is encrypted, the encrypted MongoDB protocol cannot be parsed by the Istio proxy.

    If you know that the plain MongoDB protocol is used, without encryption, you can specify the protocol as MONGO and let the Istio proxy produce MongoDB related statistics. Also note that when the protocol TCP is specified, the configuration is not specific for MongoDB, but is the same for any other database with the protocol on top of TCP.

    Note that the host of your MongoDB is not used in TCP routing, so you can use any host, for example my-mongo.tcp.svc. Notice the STATIC resolution and the endpoint with the IP of your MongoDB service. Once you define such an endpoint, you can access MongoDB services that do not have a domain name.
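
    For illustration, if you are certain that the plain (unencrypted) MongoDB protocol is used, a sketch of the same service entry with the MONGO protocol might look like this (only the port name and protocol change):

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: mongo
    spec:
      hosts:
      - my-mongo.tcp.svc
      addresses:
      - $MONGODB_IP/32
      ports:
      - number: $MONGODB_PORT
        name: mongo
        protocol: MONGO
      location: MESH_EXTERNAL
      resolution: STATIC
      endpoints:
      - address: $MONGODB_IP
    EOF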

  2. Refresh the web page of the application. Now the application should display the ratings without error:

    Book Ratings Displayed Correctly

    Note that you see a one-star rating for both displayed reviews, as expected. You set the ratings to be one star to provide yourself with a visual clue that your external database is indeed being used.

  3. If you want to direct the traffic through an egress gateway, proceed to the next section. Otherwise, perform cleanup.

Direct TCP Egress traffic through an egress gateway

In this section you handle the case when you need to direct the traffic through an egress gateway. The sidecar proxy routes TCP connections from the MongoDB client to the egress gateway, by matching the IP of the MongoDB host (a CIDR block of length 32). The egress gateway forwards the traffic to the MongoDB host, by its hostname.

  1. Deploy Istio egress gateway.

  2. If you did not perform the steps in the previous section, perform them now.

  3. You may want to enable mutual TLS Authentication between the sidecar proxies of your MongoDB clients and the egress gateway to let the egress gateway monitor the identity of the source pods and to enable Mixer policy enforcement based on that identity. Enabling mutual TLS also encrypts the traffic. If you do not want to enable mutual TLS, proceed to the following section, Configure TCP traffic from sidecars to the egress gateway. Otherwise, proceed to the Mutual TLS between the sidecar proxies and the egress gateway section.

Configure TCP traffic from sidecars to the egress gateway

  1. Define the EGRESS_GATEWAY_MONGODB_PORT environment variable to hold some port for directing traffic through the egress gateway, e.g. 7777. You must select a port that is not used for any other service in the mesh.

    $ export EGRESS_GATEWAY_MONGODB_PORT=7777
  2. Add the selected port to the istio-egressgateway service. You should use the same values you used for installing Istio, in particular you have to specify all the ports of the istio-egressgateway service that you previously configured.

    $ helm template install/kubernetes/helm/istio/ --name istio-egressgateway --namespace istio-system -x charts/gateways/templates/deployment.yaml -x charts/gateways/templates/service.yaml --set gateways.istio-ingressgateway.enabled=false --set gateways.istio-egressgateway.enabled=true --set gateways.istio-egressgateway.ports[0].port=80 --set gateways.istio-egressgateway.ports[0].name=http --set gateways.istio-egressgateway.ports[1].port=443 --set gateways.istio-egressgateway.ports[1].name=https --set gateways.istio-egressgateway.ports[2].port=$EGRESS_GATEWAY_MONGODB_PORT --set gateways.istio-egressgateway.ports[2].name=mongo | kubectl apply -f -
  3. Check that the istio-egressgateway service indeed has the selected port:

    $ kubectl get svc istio-egressgateway -n istio-system
    NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                   AGE
    istio-egressgateway   ClusterIP   172.21.202.204   <none>        80/TCP,443/TCP,7777/TCP   34d
  4. Disable mutual TLS authentication for the istio-egressgateway service:

    $ kubectl apply -f - <<EOF
    apiVersion: authentication.istio.io/v1alpha1
    kind: Policy
    metadata:
      name: istio-egressgateway
      namespace: istio-system
    spec:
      targets:
      - name: istio-egressgateway
    EOF
  5. Create an egress Gateway for your MongoDB service, and destination rules and a virtual service to direct the traffic through the egress gateway and from the egress gateway to the external service.

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: istio-egressgateway
    spec:
      selector:
        istio: egressgateway
      servers:
      - port:
          number: $EGRESS_GATEWAY_MONGODB_PORT
          name: tcp
          protocol: TCP
        hosts:
        - my-mongo.tcp.svc
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: egressgateway-for-mongo
    spec:
      host: istio-egressgateway.istio-system.svc.cluster.local
      subsets:
      - name: mongo
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: mongo
    spec:
      host: my-mongo.tcp.svc
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: direct-mongo-through-egress-gateway
    spec:
      hosts:
      - my-mongo.tcp.svc
      gateways:
      - mesh
      - istio-egressgateway
      tcp:
      - match:
        - gateways:
          - mesh
          destinationSubnets:
          - $MONGODB_IP/32
          port: $MONGODB_PORT
        route:
        - destination:
            host: istio-egressgateway.istio-system.svc.cluster.local
            subset: mongo
            port:
              number: $EGRESS_GATEWAY_MONGODB_PORT
      - match:
        - gateways:
          - istio-egressgateway
          port: $EGRESS_GATEWAY_MONGODB_PORT
        route:
        - destination:
            host: my-mongo.tcp.svc
            port:
              number: $MONGODB_PORT
          weight: 100
    EOF
  6. Verify that egress traffic is directed through the egress gateway.

Mutual TLS between the sidecar proxies and the egress gateway

  1. Delete the previous configuration:

    $ kubectl delete gateway istio-egressgateway --ignore-not-found=true
    $ kubectl delete virtualservice direct-mongo-through-egress-gateway --ignore-not-found=true
    $ kubectl delete destinationrule egressgateway-for-mongo mongo --ignore-not-found=true
    $ kubectl delete policy istio-egressgateway -n istio-system --ignore-not-found=true
  2. Enforce mutual TLS authentication for the istio-egressgateway service:

    $ kubectl apply -f - <<EOF
    apiVersion: authentication.istio.io/v1alpha1
    kind: Policy
    metadata:
      name: istio-egressgateway
      namespace: istio-system
    spec:
      targets:
      - name: istio-egressgateway
      peers:
      - mtls: {}
    EOF
  3. Create an egress Gateway for your MongoDB service, and destination rules and a virtual service to direct the traffic through the egress gateway and from the egress gateway to the external service.

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: istio-egressgateway
    spec:
      selector:
        istio: egressgateway
      servers:
      - port:
          number: 443
          name: tls
          protocol: TLS
        hosts:
        - my-mongo.tcp.svc
        tls:
          mode: MUTUAL
          serverCertificate: /etc/certs/cert-chain.pem
          privateKey: /etc/certs/key.pem
          caCertificates: /etc/certs/root-cert.pem
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: egressgateway-for-mongo
    spec:
      host: istio-egressgateway.istio-system.svc.cluster.local
      subsets:
      - name: mongo
        trafficPolicy:
          loadBalancer:
            simple: ROUND_ROBIN
          portLevelSettings:
          - port:
              number: 443
            tls:
              mode: ISTIO_MUTUAL
              sni: my-mongo.tcp.svc
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: mongo
    spec:
      host: my-mongo.tcp.svc
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: direct-mongo-through-egress-gateway
    spec:
      hosts:
      - my-mongo.tcp.svc
      gateways:
      - mesh
      - istio-egressgateway
      tcp:
      - match:
        - gateways:
          - mesh
          destinationSubnets:
          - $MONGODB_IP/32
          port: $MONGODB_PORT
        route:
        - destination:
            host: istio-egressgateway.istio-system.svc.cluster.local
            subset: mongo
            port:
              number: 443
      - match:
        - gateways:
          - istio-egressgateway
          port: 443
        route:
        - destination:
            host: my-mongo.tcp.svc
            port:
              number: $MONGODB_PORT
          weight: 100
    EOF
  4. Proceed to the next section.

Verify that egress traffic is directed through the egress gateway

  1. Refresh the web page of the application again and verify that the ratings are still displayed correctly.

  2. Enable Envoy’s access logging

  3. Check the log of the egress gateway’s Envoy and see a line that corresponds to your requests to the MongoDB service. If Istio is deployed in the istio-system namespace, the command to print the log is:

    $ kubectl logs -l istio=egressgateway -n istio-system
    [2019-04-14T06:12:07.636Z] "- - -" 0 - "-" 1591 4393 94 - "-" "-" "-" "-" "<Your MongoDB IP>:<your MongoDB port>" outbound|<your MongoDB port>||my-mongo.tcp.svc 172.30.146.119:59924 172.30.146.119:443 172.30.230.1:59206 -

Cleanup of TCP egress traffic control

$ kubectl delete serviceentry mongo
$ kubectl delete gateway istio-egressgateway --ignore-not-found=true
$ kubectl delete virtualservice direct-mongo-through-egress-gateway --ignore-not-found=true
$ kubectl delete destinationrule egressgateway-for-mongo mongo --ignore-not-found=true
$ kubectl delete policy istio-egressgateway -n istio-system --ignore-not-found=true

Egress control for TLS

In real life, most communication to external services must be encrypted, and the MongoDB protocol runs on top of TLS. Also, TLS clients usually send Server Name Indication (SNI) as part of their handshake. If your MongoDB server runs TLS and your MongoDB client sends SNI as part of the handshake, you can control your MongoDB egress traffic as any other TLS-with-SNI traffic. With TLS and SNI, you do not need to specify the IP addresses of your MongoDB servers. You specify their host names instead, which is more convenient since you do not have to rely on the stability of the IP addresses. You can also specify wildcards as a prefix of the host names, for example allowing access to any server from the *.com domain.

To check if your MongoDB server supports TLS, run:

$ openssl s_client -connect $MONGODB_HOST:$MONGODB_PORT -servername $MONGODB_HOST

If the command above prints a certificate returned by the server, the server supports TLS. If not, you have to control your MongoDB egress traffic on the TCP level, as described in the previous sections.

Control TLS egress traffic without a gateway

In case you do not need an egress gateway, follow the instructions in this section. If you want to direct your traffic through an egress gateway, proceed to Direct TLS Egress traffic through an egress gateway.

  1. Create a ServiceEntry for the MongoDB service:

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: mongo
    spec:
      hosts:
      - $MONGODB_HOST
      ports:
      - number: $MONGODB_PORT
        name: tls
        protocol: TLS
      resolution: DNS
    EOF
  2. Refresh the web page of the application. The application should display the ratings without error.

Cleanup of the egress configuration for TLS

$ kubectl delete serviceentry mongo

Direct TLS Egress traffic through an egress gateway

In this section you handle the case when you need to direct the traffic through an egress gateway. The sidecar proxy routes TLS connections from the MongoDB client to the egress gateway, by matching the SNI of the MongoDB host. The egress gateway forwards the traffic to the MongoDB host. Note that the sidecar proxy rewrites the destination port to 443. The egress gateway accepts the MongoDB traffic on port 443, matches the MongoDB host by SNI, and rewrites the port again to be the port of the MongoDB server.

  1. Deploy Istio egress gateway.

  2. Create a ServiceEntry for the MongoDB service:

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: mongo
    spec:
      hosts:
      - $MONGODB_HOST
      ports:
      - number: $MONGODB_PORT
        name: tls
        protocol: TLS
      - number: 443
        name: tls-port-for-egress-gateway
        protocol: TLS
      resolution: DNS
      location: MESH_EXTERNAL
    EOF
  3. Refresh the web page of the application and verify that the ratings are displayed correctly.

  4. Create an egress Gateway for your MongoDB service, and destination rules and virtual services to direct the traffic through the egress gateway and from the egress gateway to the external service.

    If you want to enable mutual TLS Authentication between the sidecar proxies of your application pods and the egress gateway, use the following command. (You may want to enable mutual TLS to let the egress gateway monitor the identity of the source pods and to enable Mixer policy enforcement based on that identity.)

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: istio-egressgateway
    spec:
      selector:
        istio: egressgateway
      servers:
      - port:
          number: 443
          name: tls
          protocol: TLS
        hosts:
        - $MONGODB_HOST
        tls:
          mode: MUTUAL
          serverCertificate: /etc/certs/cert-chain.pem
          privateKey: /etc/certs/key.pem
          caCertificates: /etc/certs/root-cert.pem
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: egressgateway-for-mongo
    spec:
      host: istio-egressgateway.istio-system.svc.cluster.local
      subsets:
      - name: mongo
        trafficPolicy:
          loadBalancer:
            simple: ROUND_ROBIN
          portLevelSettings:
          - port:
              number: 443
            tls:
              mode: ISTIO_MUTUAL
              sni: $MONGODB_HOST
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: direct-mongo-through-egress-gateway
    spec:
      hosts:
      - $MONGODB_HOST
      gateways:
      - mesh
      - istio-egressgateway
      tls:
      - match:
        - gateways:
          - mesh
          port: $MONGODB_PORT
          sni_hosts:
          - $MONGODB_HOST
        route:
        - destination:
            host: istio-egressgateway.istio-system.svc.cluster.local
            subset: mongo
            port:
              number: 443
      tcp:
      - match:
        - gateways:
          - istio-egressgateway
          port: 443
        route:
        - destination:
            host: $MONGODB_HOST
            port:
              number: $MONGODB_PORT
          weight: 100
    EOF
  5. Verify that the traffic is directed through the egress gateway.

Cleanup directing TLS egress traffic through an egress gateway

$ kubectl delete serviceentry mongo
$ kubectl delete gateway istio-egressgateway
$ kubectl delete virtualservice direct-mongo-through-egress-gateway
$ kubectl delete destinationrule egressgateway-for-mongo

Enable MongoDB TLS egress traffic to arbitrary wildcarded domains

Sometimes you want to configure egress traffic to multiple hostnames from the same domain, for example traffic to all MongoDB services from *.<your company domain>.com. You do not want to create multiple configuration items, one for each and every MongoDB service in your company. To configure access to all the external services from the same domain by a single configuration, you use wildcarded hosts.

In this section you configure egress traffic for a wildcarded domain. I used a MongoDB instance at composedb.com domain, so configuring egress traffic for *.com worked for me (I could have used *.composedb.com as well). You can pick a wildcarded domain according to your MongoDB host.

To configure egress gateway traffic for a wildcarded domain, you will first need to deploy a custom egress gateway with an additional SNI proxy. This is needed due to current limitations of Envoy, the proxy used by the standard Istio egress gateway.

Prepare a new egress gateway with an SNI proxy

In this subsection you deploy an egress gateway with an SNI proxy, in addition to the standard Istio Envoy proxy. You can use any SNI proxy that is capable of routing traffic according to arbitrary, not-preconfigured SNI values; we used Nginx to achieve this functionality.

  1. Create a configuration file for the Nginx SNI proxy. You may want to edit the file to specify additional Nginx settings, if required.

    $ cat <<EOF > ./sni-proxy.conf
    user www-data;
    
    events {
    }
    
    stream {
      log_format log_stream '\$remote_addr [\$time_local] \$protocol [\$ssl_preread_server_name]'
      '\$status \$bytes_sent \$bytes_received \$session_time';
    
      access_log /var/log/nginx/access.log log_stream;
      error_log  /var/log/nginx/error.log;
    
      # tcp forward proxy by SNI
      server {
        resolver 8.8.8.8 ipv6=off;
        listen       127.0.0.1:$MONGODB_PORT;
        proxy_pass   \$ssl_preread_server_name:$MONGODB_PORT;
        ssl_preread  on;
      }
    }
    EOF
  2. Create a Kubernetes ConfigMap to hold the configuration of the Nginx SNI proxy:

    $ kubectl create configmap egress-sni-proxy-configmap -n istio-system --from-file=nginx.conf=./sni-proxy.conf
  3. The following command will generate istio-egressgateway-with-sni-proxy.yaml to edit and deploy.

    $ cat <<EOF | helm template install/kubernetes/helm/istio/ --name istio-egressgateway-with-sni-proxy --namespace istio-system -x charts/gateways/templates/deployment.yaml -x charts/gateways/templates/service.yaml -x charts/gateways/templates/serviceaccount.yaml -x charts/gateways/templates/autoscale.yaml -x charts/gateways/templates/role.yaml -x charts/gateways/templates/rolebindings.yaml --set global.mtls.enabled=true --set global.istioNamespace=istio-system -f - > ./istio-egressgateway-with-sni-proxy.yaml
    gateways:
      enabled: true
      istio-ingressgateway:
        enabled: false
      istio-egressgateway:
        enabled: false
      istio-egressgateway-with-sni-proxy:
        enabled: true
        labels:
          app: istio-egressgateway-with-sni-proxy
          istio: egressgateway-with-sni-proxy
        replicaCount: 1
        autoscaleMin: 1
        autoscaleMax: 5
        cpu:
          targetAverageUtilization: 80
        serviceAnnotations: {}
        type: ClusterIP
        ports:
          - port: 443
            name: https
        secretVolumes:
          - name: egressgateway-certs
            secretName: istio-egressgateway-certs
            mountPath: /etc/istio/egressgateway-certs
          - name: egressgateway-ca-certs
            secretName: istio-egressgateway-ca-certs
            mountPath: /etc/istio/egressgateway-ca-certs
        configVolumes:
          - name: sni-proxy-config
            configMapName: egress-sni-proxy-configmap
        additionalContainers:
        - name: sni-proxy
          image: nginx
          volumeMounts:
          - name: sni-proxy-config
            mountPath: /etc/nginx
            readOnly: true
    EOF
  4. Deploy the new egress gateway:

    $ kubectl apply -f ./istio-egressgateway-with-sni-proxy.yaml
    serviceaccount "istio-egressgateway-with-sni-proxy-service-account" created
    role "istio-egressgateway-with-sni-proxy-istio-system" created
    rolebinding "istio-egressgateway-with-sni-proxy-istio-system" created
    service "istio-egressgateway-with-sni-proxy" created
    deployment "istio-egressgateway-with-sni-proxy" created
    horizontalpodautoscaler "istio-egressgateway-with-sni-proxy" created
  5. Verify that the new egress gateway is running. Note that the pod has two containers (one is the Envoy proxy and the second one is the SNI proxy).

    $ kubectl get pod -l istio=egressgateway-with-sni-proxy -n istio-system
    NAME                                                  READY     STATUS    RESTARTS   AGE
    istio-egressgateway-with-sni-proxy-79f6744569-pf9t2   2/2       Running   0          17s
  6. Create a service entry with a static address equal to 127.0.0.1 (localhost), and disable mutual TLS on the traffic directed to the new service entry:

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: sni-proxy
    spec:
      hosts:
      - sni-proxy.local
      location: MESH_EXTERNAL
      ports:
      - number: $MONGODB_PORT
        name: tcp
        protocol: TCP
      resolution: STATIC
      endpoints:
      - address: 127.0.0.1
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: disable-mtls-for-sni-proxy
    spec:
      host: sni-proxy.local
      trafficPolicy:
        tls:
          mode: DISABLE
    EOF

Configure access to *.com using the new egress gateway

  1. Define a ServiceEntry for *.com:

    $ cat <<EOF | kubectl create -f -
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: mongo
    spec:
      hosts:
      - "*.com"
      ports:
      - number: 443
        name: tls
        protocol: TLS
      - number: $MONGODB_PORT
        name: tls-mongodb
        protocol: TLS
      location: MESH_EXTERNAL
    EOF
  2. Create an egress Gateway for *.com, port 443, protocol TLS, a destination rule to set the SNI for the gateway, and Envoy filters to prevent tampering with SNI by a malicious application (the filters verify that the SNI issued by the application is the SNI reported to Mixer).

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: istio-egressgateway-with-sni-proxy
    spec:
      selector:
        istio: egressgateway-with-sni-proxy
      servers:
      - port:
          number: 443
          name: tls
          protocol: TLS
        hosts:
        - "*.com"
        tls:
          mode: MUTUAL
          serverCertificate: /etc/certs/cert-chain.pem
          privateKey: /etc/certs/key.pem
          caCertificates: /etc/certs/root-cert.pem
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: mtls-for-egress-gateway
    spec:
      host: istio-egressgateway-with-sni-proxy.istio-system.svc.cluster.local
      subsets:
        - name: mongo
          trafficPolicy:
            loadBalancer:
              simple: ROUND_ROBIN
            portLevelSettings:
            - port:
                number: 443
              tls:
                mode: ISTIO_MUTUAL
    ---
    # The following filter is used to forward the original SNI (sent by the application) as the SNI of the mutual TLS
    # connection.
    # The forwarded SNI will be reported to Mixer so that policies will be enforced based on the original SNI value.
    apiVersion: networking.istio.io/v1alpha3
    kind: EnvoyFilter
    metadata:
      name: forward-downstream-sni
    spec:
      filters:
      - listenerMatch:
          portNumber: $MONGODB_PORT
          listenerType: SIDECAR_OUTBOUND
        filterName: forward_downstream_sni
        filterType: NETWORK
        filterConfig: {}
    ---
    # The following filter verifies that the SNI of the mutual TLS connection (the SNI reported to Mixer) is
    # identical to the original SNI issued by the application (the SNI used for routing by the SNI proxy).
    # The filter prevents Mixer from being deceived by a malicious application: routing to one SNI while
    # reporting some other value of SNI. If the original SNI does not match the SNI of the mutual TLS connection, the
    # filter will block the connection to the external service.
    apiVersion: networking.istio.io/v1alpha3
    kind: EnvoyFilter
    metadata:
      name: egress-gateway-sni-verifier
    spec:
      workloadLabels:
        app: istio-egressgateway-with-sni-proxy
      filters:
      - listenerMatch:
          portNumber: 443
          listenerType: GATEWAY
        filterName: sni_verifier
        filterType: NETWORK
        filterConfig: {}
    EOF
  3. Route the traffic destined for *.com to the egress gateway and from the egress gateway to the SNI proxy.

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: direct-mongo-through-egress-gateway
    spec:
      hosts:
      - "*.com"
      gateways:
      - mesh
      - istio-egressgateway-with-sni-proxy
      tls:
      - match:
        - gateways:
          - mesh
          port: $MONGODB_PORT
          sni_hosts:
          - "*.com"
        route:
        - destination:
            host: istio-egressgateway-with-sni-proxy.istio-system.svc.cluster.local
            subset: mongo
            port:
              number: 443
          weight: 100
      tcp:
      - match:
        - gateways:
          - istio-egressgateway-with-sni-proxy
          port: 443
        route:
        - destination:
            host: sni-proxy.local
            port:
              number: $MONGODB_PORT
          weight: 100
    EOF
  4. Refresh the web page of the application again and verify that the ratings are still displayed correctly.

  5. Enable Envoy’s access logging

  6. Check the log of the egress gateway’s Envoy proxy. If Istio is deployed in the istio-system namespace, the command to print the log is:

    $ kubectl logs -l istio=egressgateway-with-sni-proxy -c istio-proxy -n istio-system

    You should see lines similar to the following:

    [2019-01-02T17:22:04.602Z] "- - -" 0 - 768 1863 88 - "-" "-" "-" "-" "127.0.0.1:28543" outbound|28543||sni-proxy.local 127.0.0.1:49976 172.30.146.115:443 172.30.146.118:58510 <your MongoDB host>
    [2019-01-02T17:22:04.713Z] "- - -" 0 - 1534 2590 85 - "-" "-" "-" "-" "127.0.0.1:28543" outbound|28543||sni-proxy.local 127.0.0.1:49988 172.30.146.115:443 172.30.146.118:58522 <your MongoDB host>
  7. Check the logs of the SNI proxy. If Istio is deployed in the istio-system namespace, the command to print the log is:

    $ kubectl logs -l istio=egressgateway-with-sni-proxy -n istio-system -c sni-proxy
    127.0.0.1 [23/Aug/2018:03:28:18 +0000] TCP [<your MongoDB host>]200 1863 482 0.089
    127.0.0.1 [23/Aug/2018:03:28:18 +0000] TCP [<your MongoDB host>]200 2590 1248 0.095

Understanding what happened

In this section you configured egress traffic to your MongoDB host using a wildcarded domain. While wildcarded domains provide no benefit for a single MongoDB host (an exact hostname can be specified instead), they are useful when the applications in the cluster access multiple MongoDB hosts that match a wildcarded domain. For example, if the applications access mongodb1.composedb.com, mongodb2.composedb.com and mongodb3.composedb.com, the egress traffic can be configured by a single configuration for the wildcarded domain *.composedb.com.
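As a minimal sketch, assuming the hypothetical case where your MongoDB instances all live under *.composedb.com, only the hosts field of the ServiceEntry from step 1 needs to be narrowed; the Gateway server hosts and the VirtualService hosts and sni_hosts would be changed from *.com to *.composedb.com in the same way:

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: mongo
    spec:
      hosts:
      - "*.composedb.com"
      ports:
      - number: 443
        name: tls
        protocol: TLS
      - number: $MONGODB_PORT
        name: tls-mongodb
        protocol: TLS
      location: MESH_EXTERNAL
    EOF

Restricting the wildcard in this way keeps the egress configuration to a single set of resources while avoiding the overly broad *.com match used for illustration in this section.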

I will leave it as an exercise for the reader to verify that no additional Istio configuration is required when you configure an app to use another instance of MongoDB with a hostname that matches the wildcarded domain used in this section.

Cleanup of configuration for MongoDB TLS egress traffic to arbitrary wildcarded domains

  1. Delete the configuration items for *.com:

    $ kubectl delete serviceentry mongo
    $ kubectl delete gateway istio-egressgateway-with-sni-proxy
    $ kubectl delete virtualservice direct-mongo-through-egress-gateway
    $ kubectl delete destinationrule mtls-for-egress-gateway
    $ kubectl delete envoyfilter forward-downstream-sni egress-gateway-sni-verifier
  2. Delete the configuration items for the egressgateway-with-sni-proxy deployment:

    $ kubectl delete serviceentry sni-proxy
    $ kubectl delete destinationrule disable-mtls-for-sni-proxy
    $ kubectl delete -f ./istio-egressgateway-with-sni-proxy.yaml
    $ kubectl delete configmap egress-sni-proxy-configmap -n istio-system
  3. Remove the configuration files you created:

    $ rm ./istio-egressgateway-with-sni-proxy.yaml
    $ rm ./nginx-sni-proxy.conf

Cleanup

  1. Drop the bookinfo user:

    $ cat <<EOF | mongo --ssl --sslAllowInvalidCertificates $MONGODB_HOST:$MONGODB_PORT -u admin -p $MONGO_ADMIN_PASSWORD --authenticationDatabase admin
    use test
    db.dropUser("bookinfo");
    EOF
  2. Drop the ratings collection:

    $ cat <<EOF | mongo --ssl --sslAllowInvalidCertificates $MONGODB_HOST:$MONGODB_PORT -u admin -p $MONGO_ADMIN_PASSWORD --authenticationDatabase admin
    use test
    db.ratings.drop();
    EOF
  3. Unset the environment variables you used:

    $ unset MONGO_ADMIN_PASSWORD BOOKINFO_PASSWORD MONGODB_HOST MONGODB_PORT MONGODB_IP
  4. Remove the virtual services:

    $ kubectl delete -f samples/bookinfo/networking/virtual-service-ratings-db.yaml
    Deleted config: virtual-service/default/reviews
    Deleted config: virtual-service/default/ratings
  5. Undeploy ratings v2-mongodb:

    $ kubectl delete -f samples/bookinfo/platform/kube/bookinfo-ratings-v2.yaml
    deployment "ratings-v2" deleted

Conclusion

In this blog post I demonstrated various options for MongoDB egress traffic control. You can control the MongoDB egress traffic on a TCP or TLS level where applicable. In both TCP and TLS cases, you can direct the traffic from the sidecar proxies directly to the external MongoDB host, or direct the traffic through an egress gateway, according to your organization’s security requirements. In the latter case, you can also decide to apply or disable mutual TLS authentication between the sidecar proxies and the egress gateway. If you want to control MongoDB egress traffic on the TLS level by specifying wildcarded domains like *.com and you need to direct the traffic through the egress gateway, you must deploy a custom egress gateway with an SNI proxy.

Note that the configuration and considerations described in this blog post for MongoDB are much the same for other non-HTTP protocols that run on top of TCP/TLS.

]]>
Fri, 16 Nov 2018 00:00:00 +0000/v1.24//blog/2018/egress-mongo//v1.24//blog/2018/egress-mongo/traffic-managementegresstcpmongo
All Day Istio Twitch Stream

To celebrate the 1.0 release and to promote the software to a wider audience, the Istio community is hosting an all day live stream on Twitch on August 17th.

What is Twitch?

Twitch is a popular video gaming live streaming platform that has recently seen a lot of coding content. The IBM Advocates have been doing live coding and presentations there and it’s been fun. While the site is mostly used for gaming content, there is a growing community sharing and watching programming content there.

What does this have to do with Istio?

The stream is going to be a full day of Istio content. Hopefully we’ll have a good mix of deep technical content, beginner content and line-of-business content for our audience. We’ll have developers, users, and evangelists on throughout the day to share their demos and stories. Expect live coding, Q&A, and some surprises. We have stellar guests lined up from IBM, Google, Datadog, Pivotal, and more!

Recordings

Recordings are available here.

Schedule

All times are PDT.

Time Speaker Affiliation
10:00 - 10:30 Spencer Krum + Lisa-Marie Namphy IBM / Portworx
10:30 - 11:00 Lin Sun / Spencer Krum / Sven Mawson IBM / Google
11:00 - 11:10 Lin Sun / Spencer Krum IBM
11:10 - 11:30 Jason Yee / Ilan Rabinovich Datadog
11:30 - 11:50 April Nassl Google
11:50 - 12:10 Spike Curtis Tigera
12:10 - 12:30 Shannon Coen Pivotal
12:30 - 1:00 Matt Klein Lyft
1:00 - 1:20 Zach Jory F5/Aspen Mesh
1:20 - 1:40 Dan Ciruli Google
1:40 - 2:00 Isaiah Snell-Feikema / Greg Hanson IBM
2:00 - 2:20 Zach Butcher Tetrate
2:20 - 2:40 Ray Hudaihed American Airlines
2:40 - 3:00 Christian Posta Red Hat
3:00 - 3:20 Google/IBM China Google / IBM
3:20 - 3:40 Colby Dyess Tufin
3:40 - 4:00 Rohit Agarwalla Cisco
]]>
Fri, 03 Aug 2018 00:00:00 +0000/v1.24//blog/2018/istio-twitch-stream//v1.24//blog/2018/istio-twitch-stream/
Istio a Game Changer for HP's FitStation Platform

The FitStation team at HP strongly believes in the future of Kubernetes, BPF and service-mesh as the next standards in cloud infrastructure. We are also very happy to see Istio reach its official 1.0 release – thanks to the joint collaboration that started at Google, IBM and Lyft beginning in May 2017.

Throughout the development of FitStation’s large scale and progressive cloud platform, Istio, Cilium and Kubernetes technologies have delivered a multitude of opportunities to make our systems more robust and scalable. Istio was a game changer in creating reliable and dynamic network communication.

FitStation powered by HP is a technology platform that captures 3D biometric data to design personalized footwear to perfectly fit individual foot size and shape as well as gait profile. It uses 3D scanning, pressure sensing, 3D printing and variable density injection molding to create unique footwear. Footwear brands such as Brooks, Steitz Secura or Superfeet are connecting to FitStation to build their next generation of high performance sports, professional and medical shoes.

FitStation is built on the promise of ultimate security and privacy for users’ biometric data. Istio is the cornerstone that makes this possible for data in flight within our cloud. By managing these aspects at the infrastructure level, we focused on solving business problems instead of spending time on individual implementations of secure service communication. Using Istio allowed us to dramatically reduce the complexity of maintaining a multitude of libraries and services to provide secure service communication.

As a bonus benefit of Istio 1.0, we gained network visibility, metrics and tracing out of the box. This radically improved decision-making and response quality for our development and devops teams. The team got in-depth insight in the network communication across the entire platform, both for new as well as legacy applications. The integration of Cilium with Envoy delivered a remarkable performance benefit on Istio service mesh communication, combined with a fine-grained kernel driven L7 network security layer. This was due to the powers of BPF brought to Istio by Cilium. We believe this will drive the future of Linux kernel security.

It has been very exciting to follow Istio’s growth. We have been able to see clear improvements in performance and stability over the different development versions. The improvements between version 0.7 and 0.8 made our teams comfortable with version 1.0, and we can state that Istio is now ready for real production usage.

We are looking forward to the promising roadmaps of Istio, Envoy, Cilium and CNCF.

]]>
Tue, 31 Jul 2018 00:00:00 +0000/v1.24//blog/2018/hp//v1.24//blog/2018/hp/
Delayering Istio with AppSwitch

The sidecar proxy approach enables a lot of awesomeness. Squarely in the datapath between microservices, the sidecar can precisely tell what the application is trying to do. It can monitor and instrument protocol traffic, not in the bowels of the networking layers but at the application level, to enable deep visibility, access controls and traffic management.

If we look closely however, there are many intermediate layers that the data has to pass through before the high-value analysis of application-traffic can be performed. Most of those layers are part of the base plumbing infrastructure that are there just to push the data along. In doing so, they add latency to communication and complexity to the overall system.

Over the years, there has been much collective effort in implementing aggressive fine-grained optimizations within the layers of the network datapath. Each iteration may shave another few microseconds. But then the true necessity of those layers itself has not been questioned.

Don’t optimize layers, remove them

In my belief, optimizing something is a poor fallback to removing its requirement altogether. That was the goal of my initial work (https://apporbit.com/a-brief-history-of-containers-from-reality-to-hype/) on OS-level virtualization that led to Linux containers, which effectively removed virtual machines by running applications directly on the host operating system without requiring an intermediate guest. For a long time the industry was fighting the wrong battle, distracted by optimizing VMs rather than removing the additional layer altogether.

I see the same pattern repeat itself with the connectivity of microservices, and networking in general. The network has been going through the changes that physical servers went through a decade earlier. A new set of layers and constructs is being introduced. They are being baked deep into the protocol stack and even silicon without adequately considering low-touch alternatives. Perhaps there is a way to remove those additional layers altogether.

I have been thinking about these problems for some time and believe that an approach similar in concept to containers can be applied to the network stack that would fundamentally simplify how application endpoints are connected across the complexity of many intermediate layers. I have reapplied the same principles from the original work on containers to create AppSwitch. Similar to the way containers provide an interface that applications can directly consume, AppSwitch plugs directly into well-defined and ubiquitous network API that applications currently use and directly connects application clients to appropriate servers, skipping all intermediate layers. In the end, that’s what networking is all about.

Before going into the details of how AppSwitch promises to remove unnecessary layers from the Istio stack, let me give a very brief introduction to its architecture. Further details are available at the documentation page.

AppSwitch

Not unlike the container runtime, AppSwitch consists of a client and a daemon that speak over HTTP via a REST API. Both the client and the daemon are built as one self-contained binary, ax. The client transparently plugs into the application and tracks its system calls related to network connectivity and notifies the daemon about their occurrences. As an example, let’s say an application makes the connect(2) system call to the service IP of a Kubernetes service. The AppSwitch client intercepts the connect call, nullifies it and notifies the daemon about its occurrence along with some context that includes the system call arguments. The daemon would then handle the system call, potentially by directly connecting to the Pod IP of the upstream server on behalf of the application.

It is important to note that no data is forwarded between the AppSwitch client and daemon. They are designed to exchange file descriptors (FDs) over a Unix domain socket to avoid having to copy data. Note also that the client is not a separate process. Rather, it runs directly in the context of the application itself. There is no data copy between the application and the AppSwitch client either.

Delayering the stack

Now that we have an idea about what AppSwitch does, let’s look at the layers that it optimizes away from a standard service mesh.

Network devirtualization

Kubernetes offers simple and well-defined network constructs to the microservice applications it runs. In order to support them however, it imposes specific requirements on the underlying network. Meeting those requirements is often not easy. The go-to solution of adding another layer is typically adopted to satisfy the requirements. In most cases the additional layer consists of a network overlay that sits between Kubernetes and underlying network. Traffic produced by the applications is encapsulated at the source and decapsulated at the target, which not only costs network resources but also takes up compute cores.

Because AppSwitch arbitrates what the application sees through its touchpoints with the platform, it projects a consistent virtual view of the underlying network to the application similar to an overlay but without introducing an additional layer of processing along the datapath. Just to draw a parallel to containers, the inside of a container looks and feels like a VM. However the underlying implementation does not intervene along the high-incidence control paths of low-level interrupts etc.

AppSwitch can be injected into a standard Kubernetes manifest (similar to Istio injection) such that the application’s network is directly handled by AppSwitch bypassing any network overlay underneath. More details to follow in just a bit.

Artifacts of container networking

Extending network connectivity from host into the container has been a major challenge. New layers of network plumbing were invented explicitly for that purpose. As such, an application running in a container is simply a process on the host. However due to a fundamental misalignment between the network abstraction expected by the application and the abstraction exposed by container network namespace, the process cannot directly access the host network. Applications think of networking in terms of sockets or sessions whereas network namespaces expose a device abstraction. Once placed in a network namespace, the process suddenly loses all connectivity. The notion of veth-pair and corresponding tooling were invented just to close that gap. The data would now have to go from a host interface into a virtual switch and then through a veth-pair to the virtual network interface of the container network namespace.

AppSwitch can effectively remove both the virtual switch and veth-pair layers on both ends of the connection. Since the connections are established by the daemon running on the host using the network that’s already available on the host, there is no need for additional plumbing to bridge host network into the container. The socket FDs created on the host are passed to the application running within the pod’s network namespace. By the time the application receives the FD, all control path work (security checks, connection establishment) is already done and the FD is ready for actual IO.

Skip TCP/IP for colocated endpoints

TCP/IP is the universal protocol medium over which pretty much all communication occurs. But if application endpoints happen to be on the same host, is TCP/IP really required? After all, it does do quite a bit of work and it is quite complex. Unix sockets are explicitly designed for intrahost communication and AppSwitch can transparently switch the communication to occur over a Unix socket for colocated endpoints.

For each listening socket of an application, AppSwitch maintains two listening sockets, one each for TCP and Unix. When a client tries to connect to a server that happens to be colocated, AppSwitch daemon would choose to connect to the Unix listening socket of the server. The resulting Unix sockets on each end are passed into respective applications. Once a fully connected FD is returned, the application would simply treat it as a bit pipe. The protocol doesn’t really matter. The application may occasionally make protocol specific calls such as getsockname(2) and AppSwitch would handle them in kind. It would present consistent responses such that the application would continue to run on.

Data pushing proxy

As we continue to look for layers to remove, let us also reconsider the requirement of the proxy layer itself. There are times when the role of the proxy may degenerate into a plain data pusher:

  • There may not be a need for any protocol decoding
  • The protocol may not be recognized by the proxy
  • The communication may be encrypted and the proxy cannot access relevant headers
  • The application (redis, memcached etc.) may be too latency-sensitive and cannot afford the cost of an intermediate proxy

In all these cases, the proxy is not different from any low-level plumbing layer. In fact, the latency introduced can be far higher because the same level of optimizations won’t be available to a proxy.

To illustrate this with an example, consider the application shown below. It consists of a Python app and a set of memcached servers behind it. An upstream memcached server is selected based on connection time routing. Speed is the primary concern here.

Latency-sensitive application scenario

If we look at the data flow in this setup, the Python app makes a connection to the service IP of memcached. It is redirected to the client-side sidecar. The sidecar routes the connection to one of the memcached servers and copies the data between the two sockets – one connected to the app and another connected to memcached. And the same also occurs on the server side between the server-side sidecar and memcached. The role of proxy at that point is just boring shoveling of bits between the two sockets. However, it ends up adding substantial latency to the end-to-end connection.

Now let us imagine that the app is somehow made to connect directly to memcached, then the two intermediate proxies could be skipped. The data would flow directly between the app and memcached without any intermediate hops. AppSwitch can arrange for that by transparently tweaking the target address passed by the Python app when it makes the connect(2) system call.

Proxyless protocol decoding

Things are going to get a bit strange here. We have seen that the proxy can be bypassed for cases that don’t involve looking into application traffic. But is there anything we can do even for those other cases? It turns out, yes.

In a typical communication between microservices, much of the interesting information is exchanged in the initial headers. Headers are followed by body or payload which typically represents bulk of the communication. And once again the proxy degenerates into a data pusher for this part of communication. AppSwitch provides a nifty mechanism to skip proxy for these cases.

Even though AppSwitch is not a proxy, it does arbitrate connections between application endpoints and it does have access to corresponding socket FDs. Normally, AppSwitch simply passes those FDs to the application. But it can also peek into the initial message received on the connection using the MSG_PEEK option of the recvfrom(2) system call on the socket. It allows AppSwitch to examine application traffic without actually removing it from the socket buffers. When AppSwitch returns the FD to the application and steps out of the datapath, the application would do an actual read on the connection. AppSwitch uses this technique to perform deeper analysis of application-level traffic and implement sophisticated network functions as discussed in the next section, all without getting into the datapath.

Zero-cost load balancer, firewall and network analyzer

Typical implementations of network functions such as load balancers and firewalls require an intermediate layer that needs to tap into data/packet stream. Kubernetes’ implementation of load balancer (kube-proxy) for example introduces a probe into the packet stream through iptables and Istio implements the same at the proxy layer. But if all that is required is to redirect or drop connections based on policy, it is not really necessary to stay in the datapath during the entire course of the connection. AppSwitch can take care of that much more efficiently by simply manipulating the control path at the API level. Given its intimate proximity to the application, AppSwitch also has easy access to various pieces of application level metrics such as dynamics of stack and heap usage, precisely when a service comes alive, attributes of active connections etc., all of which could potentially form a rich signal for monitoring and analytics.

To go a step further, AppSwitch can also perform L7 load balancing and firewall functions based on the protocol data that it obtains from the socket buffers. It can synthesize the protocol data and various other signals with the policy information acquired from Pilot to implement a highly efficient form of routing and access control enforcement. It can essentially “influence” the application to connect to the right backend server without requiring any changes to the application or its configuration. It is as if the application itself is infused with policy and traffic-management intelligence. Except in this case, the application can’t escape the influence.

There is some more black-magic possible that would actually allow modifying the application data stream without getting into the datapath but I am going to save that for a later post. Current implementation of AppSwitch uses a proxy if the use case requires application protocol traffic to be modified. For those cases, AppSwitch provides a highly optimal mechanism to attract traffic to the proxy as discussed in the next section.

Traffic redirection

Before the sidecar proxy can look into application protocol traffic, it needs to first receive the connections. Redirection of connections coming into and going out of the application is currently done by a layer of packet filtering that rewrites packets such that they go to the respective sidecars. Creating the potentially large number of rules required to represent the redirection policy is tedious. And the process of applying the rules and updating them, as the target subnets to be captured by the sidecar change, is expensive.

While some of the performance concerns are being addressed by the Linux community, there is another concern related to privilege: iptables rules need to be updated whenever the policy changes. Given the current architecture, all privileged operations are performed in an init container that runs just once at the very beginning before privileges are dropped for the actual application. Since updating iptables rules requires root privileges, there is no way to do that without restarting the application.

AppSwitch provides a way to redirect application connections without root privilege. As such, an unprivileged application is already able to connect to any host (modulo firewall rules etc.) and the owner of the application should be allowed to change the host address passed by its application via connect(2) without requiring additional privilege.

Socket delegation

Let’s see how AppSwitch could help redirect connections without using iptables. Imagine that the application somehow voluntarily passes the socket FDs that it uses for its communication to the sidecar, then there would be no need for iptables. AppSwitch provides a feature called socket delegation that does exactly that. It allows the sidecar to transparently gain access to copies of socket FDs that the application uses for its communication without any changes to the application itself.

Here is the sequence of steps that would achieve this in the context of the Python application example.

  1. The application initiates a connection request to the service IP of memcached service.
  2. The connection request from client is forwarded to the daemon.
  3. The daemon creates a pair of pre-connected Unix sockets (using socketpair(2) system call).
  4. It passes one end of the socket pair into the application such that the application would use that socket FD for read/write. It also ensures that the application consistently sees it as a legitimate TCP socket as it expects by interposing all calls that query connection properties.
  5. The other end is passed to sidecar over a different Unix socket where the daemon exposes its API. Information such as the original destination that the application was connecting to is also conveyed over the same interface.
Socket delegation based connection redirection

Once the application and sidecar are connected, the rest happens as usual. Sidecar would initiate a connection to upstream server and proxy data between the socket received from the daemon and the socket connected to upstream server. The main difference here is that sidecar would get the connection, not through the accept(2) system call as it is in the normal case, but from the daemon over the Unix socket. In addition to listening for connections from applications through the normal accept(2) channel, the sidecar proxy would connect to the AppSwitch daemon’s REST endpoint and receive sockets that way.

For completeness, here is the sequence of steps that would occur on the server side:

  1. The application receives a connection
  2. AppSwitch daemon accepts the connection on behalf of the application
  3. It creates a pair of pre-connected Unix sockets using socketpair(2) system call
  4. One end of the socket pair is returned to the application through the accept(2) system call
  5. The other end of the socket pair along with the socket originally accepted by the daemon on behalf of the application is sent to sidecar
  6. Sidecar would extract the two socket FDs – a Unix socket FD connected to the application and a TCP socket FD connected to the remote client
  7. Sidecar would read the metadata supplied by the daemon about the remote client and perform its usual operations

“Sidecar-aware” applications

Socket delegation feature can be very useful for applications that are explicitly aware of the sidecar and wish to take advantage of its features. They can voluntarily delegate their network interactions by passing their sockets to the sidecar using the same feature. In a way, AppSwitch transparently turns every application into a sidecar-aware application.

How does it all come together?

Just to step back, Istio offloads common connectivity concerns from applications to a sidecar proxy that performs those functions on behalf of the application. And AppSwitch simplifies and optimizes the service mesh by sidestepping intermediate layers and invoking the proxy only for cases where it is truly necessary.

In the rest of this section, I outline how AppSwitch may be integrated with Istio based on a very cursory initial implementation. This is not intended to be anything like a design doc – not every possible way of integration is explored and not every detail is worked out. The intent is to discuss high-level aspects of the implementation to present a rough idea of how the two systems may come together. The key is that AppSwitch would act as a cushion between Istio and a real proxy. It would serve as the “fast-path” for cases that can be performed more efficiently without invoking the sidecar proxy. And for the cases where the proxy is used, it would shorten the datapath by cutting through unnecessary layers. Look at this blog for a more detailed walk through of the integration.

AppSwitch client injection

Similar to Istio’s sidecar-injector, a simple tool called ax-injector injects the AppSwitch client into a standard Kubernetes manifest. The injected client transparently monitors the application and notifies the AppSwitch daemon of the control path network API events that the application produces.

It is possible to avoid the injection and work with standard Kubernetes manifests if the AppSwitch CNI plugin is used. In that case, the CNI plugin performs the necessary injection when it gets the initialization callback. Using the injector does have some advantages, however: (1) it works in tightly controlled environments like GKE; (2) it can be easily extended to support other frameworks such as Mesos; (3) the same cluster can run standard applications alongside "AppSwitch-enabled" applications.

AppSwitch DaemonSet

AppSwitch daemon can be configured to run as a DaemonSet or as an extension to the application that is directly injected into application manifest. In either case it handles network events coming in from the applications that it supports.

Agent for policy acquisition

This is the component that conveys policy and configuration dictated by Istio to AppSwitch. It implements xDS API to listen from Pilot and calls appropriate AppSwitch APIs to program the daemon. For example, it allows the load balancing strategy, as specified by istioctl, to be translated into equivalent AppSwitch capability.

Platform adapter for AppSwitch “Auto-Curated” service registry

Given that AppSwitch is in the control path of applications’ network APIs, it has ready access to the topology of services across the cluster. AppSwitch exposes that information in the form of a service registry that is automatically and (almost) synchronously updated as applications and their services come and go. A new platform adapter for AppSwitch alongside Kubernetes, Eureka etc. would provide the details of upstream services to Istio. This is not strictly necessary but it does make it easier to correlate service endpoints received from Pilot by AppSwitch agent above.

Proxy integration and chaining

Connections that do require deep scanning and mutation of application traffic are handed off to an external proxy through the socket delegation mechanism discussed earlier. It uses an extended version of proxy protocol. In addition to the simple parameters supported by the proxy protocol, a variety of other metadata (including the initial protocol headers obtained from the socket buffers) and live socket FDs (representing application connections) are forwarded to the proxy.

The proxy can look at the metadata and decide how to proceed. It could respond by accepting the connection to do the proxying or by directing AppSwitch to allow the connection and use the fast-path or to just drop the connection.

One of the interesting aspects of the mechanism is that, when the proxy accepts a socket from AppSwitch, it can in turn delegate the socket to another proxy. In fact that is how AppSwitch currently works. It uses a simple built-in proxy to examine the metadata and decide whether to handle the connection internally or to hand it off to an external proxy (Envoy). The same mechanism can be potentially extended to allow for a chain of plugins, each looking for a specific signature, with the last one in the chain doing the real proxy work.

It’s not just about performance

Removing intermediate layers along the datapath is not just about improving performance. Performance is a great side effect, but it is a side effect. There are a number of important advantages to an API level approach.

Automatic application onboarding and policy authoring

Before microservices and service mesh, traffic management was done by load balancers and access controls were enforced by firewalls. Applications were identified by IP addresses and DNS names, which were relatively static. In fact, that’s still the status quo in most environments. Such environments stand to benefit immensely from service mesh. However, a practical and scalable bridge to the new world needs to be provided. The difficulty in transformation is not so much due to a lack of features and functionality as the investment required to rethink and reimplement the entire application infrastructure. Currently most of the policy and configuration exists in the form of load balancer and firewall rules. Somehow that existing context needs to be leveraged in providing a scalable path to adopting the service mesh model.

AppSwitch can substantially ease the onboarding process. It can project the same network environment to the application at the target as its current source environment. Not having any assistance here is typically a non-starter in case of traditional applications which have complex configuration files with static IP addresses or specific DNS names hard-coded in them. AppSwitch could help capture those applications along with their existing configuration and connect them over a service mesh without requiring any changes.

Broader application and protocol support

HTTP clearly dominates the modern application landscapes but once we talk about traditional applications and environments, we’d encounter all kinds of protocols and transports. Particularly, support for UDP becomes unavoidable. Traditional application servers such as IBM WebSphere rely extensively on UDP. Most multimedia applications use UDP media streams. Of course DNS is probably the most widely used UDP “application”. AppSwitch supports UDP at the API level much the same way as TCP and when it detects a UDP connection, it can transparently handle it in its “fast-path” rather than delegating it to the proxy.

Client IP preservation and end-to-end principle

The same mechanism that preserves the source network environment can also preserve client IP addresses as seen by the servers. With a sidecar proxy in place, connection requests come from the proxy rather than the client. As a result, the peer address (IP:port) of the connection as seen by the server would be that of the proxy rather than the client. AppSwitch ensures that the server sees correct address of the client, logs it correctly and any decisions made based on the client address remain valid. More generally, AppSwitch preserves the end-to-end principle which is otherwise broken by intermediate layers that obfuscate the true underlying context.

Enhanced application signal with access to encrypted headers

Encrypted traffic completely undermines the ability of the service mesh to analyze application traffic. API level interposition could potentially offer a way around it. The current implementation of AppSwitch gains access to the application’s network API at the system call level. However, it is possible in principle to influence the application at an API boundary higher in the stack, where application data is not yet encrypted or already decrypted. Ultimately the data is always produced in the clear by the application and then encrypted at some point before it goes out. Since AppSwitch runs directly within the memory context of the application, it is possible to tap into the data higher on the stack where it is still held in the clear. The only requirement for this to work is that the API used for encryption should be well-defined and amenable to interposition. In particular, it requires access to the symbol table of the application binaries. Just to be clear, AppSwitch doesn’t implement this today.

So what’s the net?

AppSwitch removes a number of layers and processing from the standard service mesh stack. What does all that translate to in terms of performance?

We ran some initial experiments to characterize the extent of the opportunity for optimization based on the initial integration of AppSwitch discussed earlier. The experiments were run on GKE using fortio-0.11.0, istio-0.8.0 and appswitch-0.4.0-2. In case of the proxyless test, AppSwitch daemon was run as a DaemonSet on the Kubernetes cluster and the Fortio pod spec was modified to inject AppSwitch client. These were the only two changes made to the setup. The test was configured to measure the latency of GRPC requests across 100 concurrent connections.
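For reference, a load of this general shape can be generated with a fortio command along the following lines. This is only a sketch: the exact flags, duration and target used in the original experiment are not given here, and fortio-server:8079 is a hypothetical gRPC target address.

    $ fortio load -grpc -c 100 -t 60s fortio-server:8079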

Latency with and without AppSwitch

Initial results indicate a difference of over 18x in p50 latency with and without AppSwitch (3.99ms vs 72.96ms). The difference was around 8x when mixer and access logs were disabled. Clearly the difference was due to sidestepping all those intermediate layers along the datapath. Unix socket optimization wasn’t triggered in case of AppSwitch because client and server pods were scheduled to separate hosts. End-to-end latency of AppSwitch case would have been even lower if the client and server happened to be colocated. Essentially the client and server running in their respective pods of the Kubernetes cluster are directly connected over a TCP socket going over the GKE network – no tunneling, bridge or proxies.

Net Net

I started out with David Wheeler’s seemingly reasonable quote that says adding another layer is not a solution for the problem of too many layers. And I argued through most of the blog that current network stack already has too many layers and that they should be removed. But isn’t AppSwitch itself a layer?

Yes, AppSwitch is clearly another layer. However it is one that can remove multiple other layers. In doing so, it seamlessly glues the new service mesh layer with existing layers of traditional network environments. It offsets the cost of sidecar proxy and as Istio graduates to 1.0, it provides a bridge for existing applications and their network environments to transition to the new world of service mesh.

Perhaps Wheeler’s quote should read:

Acknowledgements

Thanks to Mandar Jog (Google) for several discussions about the value of AppSwitch for Istio and to the following individuals (in alphabetical order) for their review of early drafts of this blog.

  • Frank Budinsky (IBM)
  • Lin Sun (IBM)
  • Shriram Rajagopalan (VMware)
]]>
Mon, 30 Jul 2018 00:00:00 +0000/v1.24//blog/2018/delayering-istio//v1.24//blog/2018/delayering-istio/appswitchperformance
Micro-Segmentation with Istio Authorization

Micro-segmentation is a security technique that creates secure zones in cloud deployments and allows organizations to isolate workloads from one another and secure them individually. Istio’s authorization feature, also known as Istio Role Based Access Control, provides micro-segmentation for services in an Istio mesh. It features:

  • Authorization at different levels of granularity, including namespace level, service level, and method level.
  • Service-to-service and end-user-to-service authorization.
  • High performance, as it is enforced natively on Envoy.
  • Role-based semantics, which makes it easy to use.
  • High flexibility as it allows users to define conditions using combinations of attributes.

In this blog post, you’ll learn about the main authorization features and how to use them in different situations.

Characteristics

RPC level authorization

Authorization is performed at the level of individual RPCs. Specifically, it controls “who can access my bookstore service”, or “who can access method getBook in my bookstore service”. It is not designed to control access to application-specific resource instances, like access to “storage bucket X” or access to “3rd book on 2nd shelf”. Today this kind of application specific access control logic needs to be handled by the application itself.

Role-based access control with conditions

Authorization is a role-based access control (RBAC) system; contrast this with an attribute-based access control (ABAC) system. Compared to ABAC, RBAC has the following advantages:

  • Roles allow grouping of attributes. Roles are groups of permissions, which specify the actions you are allowed to perform on a system. Users are grouped based on their roles within an organization. You can define the roles and reuse them for different cases.

  • It is easier to understand and reason about who has access. The RBAC concepts map naturally to business concepts. For example, a DB admin may have all access to DB backend services, while a web client may only be able to view the frontend service.

  • It reduces unintentional errors. RBAC policies make otherwise complex security changes easier. You won’t have duplicate configurations in multiple places and later forget to update some of them when you need to make changes.

On the other hand, Istio’s authorization system is not a traditional RBAC system. It also allows users to define conditions using combinations of attributes. This gives Istio the flexibility to express complex access control policies. In fact, the “RBAC + conditions” model that Istio authorization adopts has all the benefits of an RBAC system, while supporting the level of flexibility that an ABAC system normally provides. You’ll see some examples below.

High performance

Because of its simple semantics, Istio authorization is enforced natively in Envoy. At runtime, the authorization decision is made completely locally inside an Envoy filter, without any dependency on an external module. This allows Istio authorization to achieve high performance and availability.

Work with/without primary identities

Like any other RBAC system, Istio authorization is identity aware. In Istio authorization policy, there is a primary identity called user, which represents the principal of the client.

In addition to the primary identity, you can also specify any conditions that define the identities. For example, you can specify the client identity as “user Alice calling from Bookstore frontend service”, in which case, you have a combined identity of the calling service (Bookstore frontend) and the end user (Alice).

To improve security, you should enable authentication features, and use authenticated identities in authorization policies. However, strongly authenticated identity is not required for using authorization. Istio authorization works with or without identities. If you are working with a legacy system, you may not have mutual TLS or JWT authentication setup for your mesh. In this case, the only way to identify the client is, for example, through IP. You can still use Istio authorization to control which IP addresses or IP ranges are allowed to access your service.

Examples

The authorization task shows you how to use Istio’s authorization feature to control namespace level and service level access using the Bookinfo application. In this section, you’ll see more examples on how to achieve micro-segmentation with Istio authorization.

Namespace level segmentation via RBAC + conditions

Suppose you have services in the frontend and backend namespaces. You would like to allow all your services in the frontend namespace to access all services that are marked external in the backend namespace.

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: external-api-caller
  namespace: backend
spec:
  rules:
  - services: ["*"]
    methods: ["*”]
    constraints:
    - key: "destination.labels[visibility]”
      values: ["external"]
---
apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: external-api-caller
  namespace: backend
spec:
  subjects:
  - properties:
      source.namespace: "frontend"
  roleRef:
    kind: ServiceRole
    name: "external-api-caller"

The ServiceRole and ServiceRoleBinding above expressed “who is allowed to do what under which conditions” (RBAC + conditions). Specifically:

  • “who” are the services in the frontend namespace.
  • “what” is to call services in backend namespace.
  • “conditions” is the visibility label of the destination service having the value external.

Service/method level isolation with/without primary identities

Here is another example that demonstrates finer grained access control at service/method level. The first step is to define a book-reader service role that allows READ access to /books/* resource in bookstore service.

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: book-reader
  namespace: default
spec:
  rules:
  - services: ["bookstore.default.svc.cluster.local"]
    paths: ["/books/*”]
    methods: ["GET”]

Using authenticated client identities

Suppose you want to grant this book-reader role to your bookstore-frontend service. If you have enabled mutual TLS authentication for your mesh, you can use a service account to identify your bookstore-frontend service. Granting the book-reader role to the bookstore-frontend service can be done by creating a ServiceRoleBinding as shown below:

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: book-reader
  namespace: default
spec:
  subjects:
  - user: "cluster.local/ns/default/sa/bookstore-frontend”
  roleRef:
    kind: ServiceRole
    name: "book-reader"

You may want to restrict this further by adding a condition that “only users who belong to the qualified-reviewer group are allowed to read books”. The qualified-reviewer group is the end user identity that is authenticated by JWT authentication. In this case, the combination of the client service identity (bookstore-frontend) and the end user identity (qualified-reviewer) is used in the authorization policy.

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: book-reader
  namespace: default
spec:
  subjects:
  - user: "cluster.local/ns/default/sa/bookstore-frontend"
    properties:
      request.auth.claims[group]: "qualified-reviewer"
  roleRef:
    kind: ServiceRole
    name: "book-reader"

Client does not have identity

Using authenticated identities in authorization policies is strongly recommended for security. However, if you have a legacy system that does not support authentication, you may not have authenticated identities for your services. You can still use Istio authorization to protect your services even without authenticated identities. The example below shows that you can specify allowed source IP range in your authorization policy.

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: book-reader
  namespace: default
spec:
  subjects:
  - properties:
      source.ip: 10.20.0.0/9
  roleRef:
    kind: ServiceRole
    name: "book-reader"

Summary

Istio’s authorization feature provides authorization at namespace-level, service-level, and method-level granularity. It adopts “RBAC + conditions” model, which makes it easy to use and understand as an RBAC system, while providing the level of flexibility that an ABAC system normally provides. Istio authorization achieves high performance as it is enforced natively on Envoy. While it provides the best security by working together with Istio authentication features, Istio authorization can also be used to provide access control for legacy systems that do not have authentication.

]]>
Fri, 20 Jul 2018 00:00:00 +0000/v1.24//blog/2018/istio-authorization//v1.24//blog/2018/istio-authorization/authorizationrbacsecurity
Exporting Logs to BigQuery, GCS, Pub/Sub through Stackdriver

This post shows how to direct Istio logs to Stackdriver and export those logs to various configured sinks such as BigQuery, Google Cloud Storage or Cloud Pub/Sub. At the end of this post you can perform analytics on Istio data from your favorite places such as BigQuery, GCS or Cloud Pub/Sub.

The Bookinfo sample application is used as the example application throughout this task.

Before you begin

Install Istio in your cluster and deploy an application.

Configuring Istio to export logs

Istio exports logs using the logentry template. This specifies all the variables that are available for analysis. It contains information like source service, destination service, auth metrics (coming..) among others. Following is a diagram of the pipeline:

Exporting logs from Istio to Stackdriver for analysis

Istio supports exporting logs to Stackdriver which can in turn be configured to export logs to your favorite sink like BigQuery, Pub/Sub or GCS. Please follow the steps below to set up your favorite sink for exporting logs first and then Stackdriver in Istio.

Setting up various log sinks

Common setup for all sinks:

  1. Enable Stackdriver Monitoring API for the project.
  2. Make sure the principalEmail that will be setting up the sink has write access to the project and Logging Admin role permissions.
  3. Make sure the GOOGLE_APPLICATION_CREDENTIALS environment variable is set. Please follow instructions here to set it up.
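For example, on the machine where you perform these setup steps, pointing that variable at a downloaded service account key might look like the following (the path is a placeholder):

    $ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json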

BigQuery

  1. Create a BigQuery dataset as a destination for the logs export.
  2. Record the ID of the dataset. It will be needed to configure the Stackdriver handler. It would be of the form bigquery.googleapis.com/projects/[PROJECT_ID]/datasets/[DATASET_ID]
  3. Give the sink’s writer identity, cloud-logs@system.gserviceaccount.com, the BigQuery Data Editor role in IAM.
  4. If using Google Kubernetes Engine, make sure the bigquery scope is enabled on the cluster.
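If you prefer the command line, the dataset can be created with the bq tool, for example (the project and dataset IDs are placeholders):

    $ bq mk --dataset [PROJECT_ID]:[DATASET_ID]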

Google Cloud Storage (GCS)

  1. Create a GCS bucket in which you would like the logs to be exported.
  2. Record the ID of the bucket. It will be needed to configure Stackdriver. It would be of the form storage.googleapis.com/[BUCKET_ID]
  3. Give the sink’s writer identity, cloud-logs@system.gserviceaccount.com, the Storage Object Creator role in IAM.
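The bucket can likewise be created from the command line with gsutil (the bucket name is a placeholder):

    $ gsutil mb gs://[BUCKET_ID]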

Google Cloud Pub/Sub

  1. Create a topic in Google Cloud Pub/Sub to which you would like the logs to be exported.
  2. Record the ID of the topic. It will be needed to configure Stackdriver. It would be of the form pubsub.googleapis.com/projects/[PROJECT_ID]/topics/[TOPIC_ID]
  3. Give the sink’s writer identity, cloud-logs@system.gserviceaccount.com, the Pub/Sub Publisher role in IAM.
  4. If using Google Kubernetes Engine, make sure the pubsub scope is enabled on the cluster.
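The topic can be created from the command line with gcloud (the topic ID is a placeholder):

    $ gcloud pubsub topics create [TOPIC_ID]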

Setting up Stackdriver

A Stackdriver handler must be created to export data to Stackdriver. The configuration for a Stackdriver handler is described here.

  1. Save the following yaml file as stackdriver.yaml. Replace <project_id>, <sink_id>, <sink_destination>, <log_filter> with their specific values.

    apiVersion: "config.istio.io/v1alpha2"
    kind: stackdriver
    metadata:
      name: handler
      namespace: istio-system
    spec:
      # We'll use the default value from the adapter, once per minute, so we don't need to supply a value.
      # pushInterval: 1m
      # Must be supplied for the Stackdriver adapter to work
      project_id: "<project_id>"
      # One of the following must be set; the preferred method is `appCredentials`, which corresponds to
      # Google Application Default Credentials.
      # If none is provided we default to app credentials.
      # appCredentials:
      # apiKey:
      # serviceAccountPath:
      # Describes how to map Istio logs into Stackdriver.
      logInfo:
        accesslog.logentry.istio-system:
          payloadTemplate: '{{or (.sourceIp) "-"}} - {{or (.sourceUser) "-"}} [{{or (.timestamp.Format "02/Jan/2006:15:04:05 -0700") "-"}}] "{{or (.method) "-"}} {{or (.url) "-"}} {{or (.protocol) "-"}}" {{or (.responseCode) "-"}} {{or (.responseSize) "-"}}'
          httpMapping:
            url: url
            status: responseCode
            requestSize: requestSize
            responseSize: responseSize
            latency: latency
            localIp: sourceIp
            remoteIp: destinationIp
            method: method
            userAgent: userAgent
            referer: referer
          labelNames:
          - sourceIp
          - destinationIp
          - sourceService
          - sourceUser
          - sourceNamespace
          - destinationIp
          - destinationService
          - destinationNamespace
          - apiClaims
          - apiKey
          - protocol
          - method
          - url
          - responseCode
          - responseSize
          - requestSize
          - latency
          - connectionMtls
          - userAgent
          - responseTimestamp
          - receivedBytes
          - sentBytes
          - referer
          sinkInfo:
            id: '<sink_id>'
            destination: '<sink_destination>'
            filter: '<log_filter>'
    ---
    apiVersion: "config.istio.io/v1alpha2"
    kind: rule
    metadata:
      name: stackdriver
      namespace: istio-system
    spec:
      match: "true" # If omitted match is true.
      actions:
      - handler: handler.stackdriver
        instances:
        - accesslog.logentry
    ---
  2. Push the configuration

    $ kubectl apply -f stackdriver.yaml
    stackdriver "handler" created
    rule "stackdriver" created
    logentry "stackdriverglobalmr" created
    metric "stackdriverrequestcount" created
    metric "stackdriverrequestduration" created
    metric "stackdriverrequestsize" created
    metric "stackdriverresponsesize" created
  3. Send traffic to the sample application.

    For the Bookinfo sample, visit http://$GATEWAY_URL/productpage in your web browser or issue the following command:

    $ curl http://$GATEWAY_URL/productpage
  4. Verify that logs are flowing through Stackdriver to the configured sink.

    • Stackdriver: Navigate to the Stackdriver Logs Viewer for your project and look under “GKE Container” -> “Cluster Name” -> “Namespace Id” for Istio Access logs.
    • BigQuery: Navigate to the BigQuery Interface for your project and you should find a table with prefix accesslog_logentry_istio in your sink dataset.
    • GCS: Navigate to the Storage Browser for your project and you should find a bucket named accesslog.logentry.istio-system in your sink bucket.
    • Pub/Sub: Navigate to the Pub/Sub Topic List for your project and you should find a topic for accesslog in your sink topic.
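    For example, if you exported to BigQuery, you can spot-check the exported entries with a query like the following. This is only a sketch: replace the project, dataset and table names with the ones that actually appear in your sink dataset.

    $ bq query --use_legacy_sql=false 'SELECT * FROM `[PROJECT_ID].[DATASET_ID].[TABLE_ID]` LIMIT 10'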

Understanding what happened

The stackdriver.yaml file above configured Istio to send access logs to Stackdriver and then added a sink configuration describing where those logs should be exported. In detail:

  1. Added a handler of kind stackdriver

    apiVersion: "config.istio.io/v1alpha2"
    kind: stackdriver
    metadata:
      name: handler
      namespace: <your defined namespace>
  2. Added logInfo in spec

    spec:
      logInfo:
        accesslog.logentry.istio-system:
          labelNames:
          - sourceIp
          - destinationIp
          ...
          ...
          sinkInfo:
            id: '<sink_id>'
            destination: '<sink_destination>'
            filter: '<log_filter>'

    In the above configuration, sinkInfo contains information about the sink to which you want the logs exported. For more information on how this is filled in for different sinks, please refer here; a filled-in example for a BigQuery sink is sketched after this list.

  3. Added a rule for Stackdriver

    apiVersion: "config.istio.io/v1alpha2"
    kind: rule
    metadata:
      name: stackdriver
      namespace: istio-system spec:
      match: "true" # If omitted match is true
    actions:
    - handler: handler.stackdriver
      instances:
      - accesslog.logentry

Cleanup

  • Remove the new Stackdriver configuration:

    $ kubectl delete -f stackdriver.yaml
  • If you are not planning to explore any follow-on tasks, refer to the Bookinfo cleanup instructions to shutdown the application.

Availability of logs in export sinks

Export to BigQuery happens within minutes (we see it to be almost instant), export to GCS can have a delay of 2 to 12 hours, and export to Pub/Sub is almost immediate.

]]>
Mon, 09 Jul 2018 00:00:00 +0000/v1.24//blog/2018/export-logs-through-stackdriver//v1.24//blog/2018/export-logs-through-stackdriver/
Monitoring and Access Policies for HTTP Egress Traffic

While Istio’s main focus is management of traffic between microservices inside a service mesh, Istio can also manage ingress (from outside into the mesh) and egress (from the mesh outwards) traffic. Istio can uniformly enforce access policies and aggregate telemetry data for mesh-internal, ingress and egress traffic.

In this blog post, we show how to apply monitoring and access policies to HTTP egress traffic with Istio.

Use case

Consider an organization that runs applications that process content from cnn.com. The applications are decomposed into microservices deployed in an Istio service mesh. The applications access pages of various topics from cnn.com: edition.cnn.com/politics, edition.cnn.com/sport and edition.cnn.com/health. The organization configures Istio to allow access to edition.cnn.com and everything works fine. However, at some point in time, the organization decides to banish politics. Practically, it means blocking access to edition.cnn.com/politics and allowing access to edition.cnn.com/sport and edition.cnn.com/health only. The organization will grant permissions to individual applications and to particular users to access edition.cnn.com/politics, on a case-by-case basis.

To achieve that goal, the organization’s operations people monitor access to the external services and analyze Istio logs to verify that no unauthorized request was sent to edition.cnn.com/politics. They also configure Istio to prevent access to edition.cnn.com/politics automatically.

The organization is resolved to prevent any tampering with the new policy. It decides to put mechanisms in place that will prevent any possibility for a malicious application to access the forbidden topic.

As opposed to the observability and security tasks above, this blog post describes Istio’s monitoring and access policies applied exclusively to the egress traffic.

Before you begin

Follow the steps in the Egress Gateway with TLS Origination example, with mutual TLS authentication enabled, without the Cleanup step. After completing that example, you can access edition.cnn.com/politics from an in-mesh container with curl installed. This blog post assumes that the SOURCE_POD environment variable contains the source pod’s name and that the container’s name is sleep.
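For example, assuming the sleep sample pod carries the app=sleep label, one way to set the variable is:

$ export SOURCE_POD=$(kubectl get pod -l app=sleep -o jsonpath={.items..metadata.name})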

Configure monitoring and access policies

Since you want to accomplish your tasks in a secure way, you should direct egress traffic through an egress gateway, as described in the Egress Gateway with TLS Origination task. Secure here means that you want to prevent malicious applications from bypassing Istio monitoring and policy enforcement.

According to our scenario, the organization performed the instructions in the Before you begin section, enabled HTTP traffic to edition.cnn.com, and configured that traffic to pass through the egress gateway. The egress gateway performs TLS origination to edition.cnn.com, so the traffic leaves the mesh encrypted. At this point, the organization is ready to configure Istio to monitor and apply access policies for the traffic to edition.cnn.com.

Logging

Configure Istio to log access to *.cnn.com. You create a logentry and two stdio handlers, one for logging forbidden access (error log level) and another one for logging all access to *.cnn.com (info log level). Then you create rules to direct your logentry instances to your handlers. One rule directs access to *.cnn.com/politics to the handler for logging forbidden access, another rule directs log entries to the handler that outputs each access to *.cnn.com as an info log entry. To understand the Istio logentries, rules, and handlers, see Istio Adapter Model. A diagram with the involved entities and dependencies between them appears below:

Instances, rules and handlers for egress monitoring
  1. Create the logentry, rules and handlers. Note that you specify context.reporter.uid as kubernetes://istio-egressgateway in the rules to get logs from the egress gateway only.

    $ cat <<EOF | kubectl apply -f -
    # Log entry for egress access
    apiVersion: "config.istio.io/v1alpha2"
    kind: logentry
    metadata:
      name: egress-access
      namespace: istio-system
    spec:
      severity: '"info"'
      timestamp: request.time
      variables:
        destination: request.host | "unknown"
        path: request.path | "unknown"
        responseCode: response.code | 0
        responseSize: response.size | 0
        reporterUID: context.reporter.uid | "unknown"
        sourcePrincipal: source.principal | "unknown"
      monitored_resource_type: '"UNSPECIFIED"'
    ---
    # Handler for error egress access entries
    apiVersion: "config.istio.io/v1alpha2"
    kind: stdio
    metadata:
      name: egress-error-logger
      namespace: istio-system
    spec:
      severity_levels:
        info: 2 # output log level as error
      outputAsJson: true
    ---
    # Rule to handle access to *.cnn.com/politics
    apiVersion: "config.istio.io/v1alpha2"
    kind: rule
    metadata:
      name: handle-politics
      namespace: istio-system
    spec:
      match: request.host.endsWith("cnn.com") && request.path.startsWith("/politics") && context.reporter.uid.startsWith("kubernetes://istio-egressgateway")
      actions:
      - handler: egress-error-logger.stdio
        instances:
        - egress-access.logentry
    ---
    # Handler for info egress access entries
    apiVersion: "config.istio.io/v1alpha2"
    kind: stdio
    metadata:
      name: egress-access-logger
      namespace: istio-system
    spec:
      severity_levels:
        info: 0 # output log level as info
      outputAsJson: true
    ---
    # Rule to handle access to *.cnn.com
    apiVersion: "config.istio.io/v1alpha2"
    kind: rule
    metadata:
      name: handle-cnn-access
      namespace: istio-system
    spec:
      match: request.host.endsWith(".cnn.com") && context.reporter.uid.startsWith("kubernetes://istio-egressgateway")
      actions:
      - handler: egress-access-logger.stdio
        instances:
          - egress-access.logentry
    EOF
  2. Send three HTTP requests to cnn.com, to edition.cnn.com/politics, edition.cnn.com/sport and edition.cnn.com/health. All three should return 200 OK.

    $ kubectl exec -it $SOURCE_POD -c sleep -- sh -c 'curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/politics; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/sport; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/health'
    200
    200
    200
  3. Query the Mixer log and see that the information about the requests appears in the log:

    $ kubectl -n istio-system logs -l istio-mixer-type=telemetry -c mixer | grep egress-access | grep cnn | tail -4
    {"level":"info","time":"2019-01-29T07:43:24.611462Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/politics","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":200,"responseSize":1883355,"sourcePrincipal":"cluster.local/ns/default/sa/sleep"}
    {"level":"info","time":"2019-01-29T07:43:24.886316Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/sport","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":200,"responseSize":2094561,"sourcePrincipal":"cluster.local/ns/default/sa/sleep"}
    {"level":"info","time":"2019-01-29T07:43:25.369663Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/health","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":200,"responseSize":2157009,"sourcePrincipal":"cluster.local/ns/default/sa/sleep"}
    {"level":"error","time":"2019-01-29T07:43:24.611462Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/politics","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":200,"responseSize":1883355,"sourcePrincipal":"cluster.local/ns/default/sa/sleep"}

    You see four log entries related to your three requests. Three info entries about the access to edition.cnn.com and one error entry about the access to edition.cnn.com/politics. The service mesh operators can see all the access instances, and can also search the log for error log entries that represent forbidden accesses. This is the first security measure the organization can apply before blocking the forbidden accesses automatically, namely logging all the forbidden access instances as errors. In some settings this can be a sufficient security measure.

    Note the attributes:

    • destination, path, responseCode, responseSize are related to HTTP parameters of the requests
    • sourcePrincipal:cluster.local/ns/default/sa/sleep - a string that represents the sleep service account in the default namespace
    • reporterUID: kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system - a UID of the reporting pod, in this case istio-egressgateway-747b6764b8-44rrh in the istio-system namespace

Access control by routing

After enabling logging of access to edition.cnn.com, automatically enforce an access policy, namely allow accessing /health and /sport URL paths only. Such a simple policy control can be implemented with Istio routing.

  1. Redefine your VirtualService for edition.cnn.com:

    $ cat <<EOF | kubectl apply -f -
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: direct-cnn-through-egress-gateway
    spec:
      hosts:
      - edition.cnn.com
      gateways:
      - istio-egressgateway
      - mesh
      http:
      - match:
        - gateways:
          - mesh
          port: 80
        route:
        - destination:
            host: istio-egressgateway.istio-system.svc.cluster.local
            subset: cnn
            port:
              number: 443
          weight: 100
      - match:
        - gateways:
          - istio-egressgateway
          port: 443
          uri:
            regex: "/health|/sport"
        route:
        - destination:
            host: edition.cnn.com
            port:
              number: 443
          weight: 100
    EOF

    Note that you added a match by uri condition that checks that the URL path is either /health or /sport. Also note that this condition is added to the istio-egressgateway section of the VirtualService, since the egress gateway is a hardened component in terms of security (see [egress gateway security considerations](/docs/tasks/traffic-management/egress/egress-gateway/#additional-security-considerations)). You don’t want any tampering with your policies.

  2. Send the previous three HTTP requests to cnn.com:

    $ kubectl exec -it $SOURCE_POD -c sleep -- sh -c 'curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/politics; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/sport; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/health'
    404
    200
    200

    The request to edition.cnn.com/politics returned 404 Not Found, while requests to edition.cnn.com/sport and edition.cnn.com/health returned 200 OK, as expected.

  3. Query the Mixer log and see that the information about the requests appears again in the log:

    $ kubectl -n istio-system logs -l istio-mixer-type=telemetry -c mixer | grep egress-access | grep cnn | tail -4
    {"level":"info","time":"2019-01-29T07:55:59.686082Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/politics","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":404,"responseSize":0,"sourcePrincipal":"cluster.local/ns/default/sa/sleep"}
    {"level":"info","time":"2019-01-29T07:55:59.697565Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/sport","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":200,"responseSize":2094561,"sourcePrincipal":"cluster.local/ns/default/sa/sleep"}
    {"level":"info","time":"2019-01-29T07:56:00.264498Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/health","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":200,"responseSize":2157009,"sourcePrincipal":"cluster.local/ns/default/sa/sleep"}
    {"level":"error","time":"2019-01-29T07:55:59.686082Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/politics","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":404,"responseSize":0,"sourcePrincipal":"cluster.local/ns/default/sa/sleep"}

    You still get info and error messages regarding accesses to edition.cnn.com/politics, however this time the responseCode is 404, as expected.

While implementing access control using Istio routing worked for us in this simple case, it would not suffice for more complex cases. For example, the organization may want to allow access to edition.cnn.com/politics under certain conditions, so more complex policy logic than just filtering by URL paths will be required. You may want to apply Istio Mixer Adapters, for example white lists or black lists of allowed and forbidden URL paths, respectively. Policy rules allow you to specify complex conditions in a rich expression language, which includes the AND and OR logical operators. The rules can be reused for both logging and policy checks. More advanced users may want to apply Istio Role-Based Access Control.
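As an illustration only, here is a hedged sketch of a single rule whose match expression combines several conditions with logical operators; the rule name and the /opinion path are hypothetical, while the handler and instance names reuse the ones defined earlier in this post:

# Hypothetical rule: log egress requests to either /politics or /opinion as errors
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: handle-politics-or-opinion   # hypothetical name, for illustration only
  namespace: istio-system
spec:
  match: request.host.endsWith(".cnn.com") && context.reporter.uid.startsWith("kubernetes://istio-egressgateway") && (request.path.startsWith("/politics") || request.path.startsWith("/opinion"))
  actions:
  - handler: egress-error-logger.stdio
    instances:
    - egress-access.logentry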

An additional aspect is integration with remote access policy systems. If the organization in our use case operates some Identity and Access Management system, you may want to configure Istio to use access policy information from such a system. You implement this integration by applying Istio Mixer Adapters.
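As a hedged sketch of such an integration, the listchecker handler introduced in the next section could fetch its list from a provider URL instead of a static overrides list; the URL, handler name and refresh interval below are assumptions for illustration only:

apiVersion: "config.istio.io/v1alpha2"
kind: listchecker
metadata:
  name: path-checker-remote                           # hypothetical handler name
  namespace: istio-system
spec:
  providerUrl: http://iam.example.com/allowed-paths   # hypothetical external list provider
  refreshInterval: 60s                                # how often to re-fetch the list
  blacklist: false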

Cancel the access control by routing you used in this section and implement access control by Mixer policy checks in the next section.

  1. Replace the VirtualService for edition.cnn.com with your previous version from the Configure an Egress Gateway example:

    $ cat <<EOF | kubectl apply -f -
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: direct-cnn-through-egress-gateway
    spec:
      hosts:
      - edition.cnn.com
      gateways:
      - istio-egressgateway
      - mesh
      http:
      - match:
        - gateways:
          - mesh
          port: 80
        route:
        - destination:
            host: istio-egressgateway.istio-system.svc.cluster.local
            subset: cnn
            port:
              number: 443
          weight: 100
      - match:
        - gateways:
          - istio-egressgateway
          port: 443
        route:
        - destination:
            host: edition.cnn.com
            port:
              number: 443
          weight: 100
    EOF
  2. Send the previous three HTTP requests to cnn.com, this time you should get three 200 OK responses as previously:

    $ kubectl exec -it $SOURCE_POD -c sleep -- sh -c 'curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/politics; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/sport; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/health'
    200
    200
    200

Access control by Mixer policy checks

In this step you use the whitelist variety of the Mixer listchecker adapter. You define a listentry with the URL path of the request and a listchecker to check the listentry using a static list of allowed URL paths, specified by the overrides field. For an external Identity and Access Management system, use the providerUrl field instead. The updated diagram of the instances, rules and handlers appears below. Note that you reuse the same policy rule, handle-cnn-access, both for logging and for access policy checks.

Instances, rules and handlers for egress monitoring and access policies
  1. Define path-checker and request-path:

    $ cat <<EOF | kubectl create -f -
    apiVersion: "config.istio.io/v1alpha2"
    kind: listchecker
    metadata:
      name: path-checker
      namespace: istio-system
    spec:
      overrides: ["/health", "/sport"]  # overrides provide a static list
      blacklist: false
    ---
    apiVersion: "config.istio.io/v1alpha2"
    kind: listentry
    metadata:
      name: request-path
      namespace: istio-system
    spec:
      value: request.path
    EOF
  2. Modify the handle-cnn-access policy rule to send request-path instances to the path-checker:

    $ cat <<EOF | kubectl apply -f -
    # Rule handle egress access to cnn.com
    apiVersion: "config.istio.io/v1alpha2"
    kind: rule
    metadata:
      name: handle-cnn-access
      namespace: istio-system
    spec:
      match: request.host.endsWith(".cnn.com") && context.reporter.uid.startsWith("kubernetes://istio-egressgateway")
      actions:
      - handler: egress-access-logger.stdio
        instances:
          - egress-access.logentry
      - handler: path-checker.listchecker
        instances:
          - request-path.listentry
    EOF
  3. Perform your usual test by sending HTTP requests to edition.cnn.com/politics, edition.cnn.com/sport and edition.cnn.com/health. As expected, the request to edition.cnn.com/politics returns 403 (Forbidden).

    $ kubectl exec -it $SOURCE_POD -c sleep -- sh -c 'curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/politics; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/sport; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/health'
    403
    200
    200

Access control by Mixer policy checks, part 2

After the organization in our use case managed to configure logging and access control, it decided to extend its access policy by allowing the applications with a special Service Account to access any topic of cnn.com, without being monitored. You’ll see how this requirement can be configured in Istio.

  1. Start the sleep sample with the politics service account.

    $  sed 's/: sleep/: politics/g' @samples/sleep/sleep.yaml@ | kubectl create -f -
    serviceaccount "politics" created
    service "politics" created
    deployment "politics" created
  2. Define the SOURCE_POD_POLITICS shell variable to hold the name of the source pod with the politics service account, for sending requests to external services.

    $ export SOURCE_POD_POLITICS=$(kubectl get pod -l app=politics -o jsonpath={.items..metadata.name})
  3. Perform your usual test of sending three HTTP requests, this time from SOURCE_POD_POLITICS. The request to edition.cnn.com/politics returns 403, since you did not configure an exception for the politics service account.

    $ kubectl exec -it $SOURCE_POD_POLITICS -c politics -- sh -c 'curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/politics; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/sport; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/health'
    403
    200
    200
  4. Query the Mixer log and see that the information about the requests made with the politics service account appears in the log:

    $ kubectl -n istio-system logs -l istio-mixer-type=telemetry -c mixer | grep egress-access | grep cnn | tail -4
    {"level":"info","time":"2019-01-29T08:04:42.559812Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/politics","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":403,"responseSize":84,"sourcePrincipal":"cluster.local/ns/default/sa/politics"}
    {"level":"info","time":"2019-01-29T08:04:42.568424Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/sport","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":200,"responseSize":2094561,"sourcePrincipal":"cluster.local/ns/default/sa/politics"}
    {"level":"error","time":"2019-01-29T08:04:42.559812Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/politics","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":403,"responseSize":84,"sourcePrincipal":"cluster.local/ns/default/sa/politics"}
    {"level":"info","time":"2019-01-29T08:04:42.615641Z","instance":"egress-access.logentry.istio-system","destination":"edition.cnn.com","path":"/health","reporterUID":"kubernetes://istio-egressgateway-747b6764b8-44rrh.istio-system","responseCode":200,"responseSize":2157009,"sourcePrincipal":"cluster.local/ns/default/sa/politics"}

    Note that sourcePrincipal is cluster.local/ns/default/sa/politics which represents the politics service account in the default namespace.

  5. Redefine the handle-cnn-access and handle-politics policy rules, to make the applications that use the politics service account exempt from monitoring and policy enforcement.

    $ cat <<EOF | kubectl apply -f -
    # Rule to handle access to *.cnn.com/politics
    apiVersion: "config.istio.io/v1alpha2"
    kind: rule
    metadata:
      name: handle-politics
      namespace: istio-system
    spec:
      match: request.host.endsWith("cnn.com") && context.reporter.uid.startsWith("kubernetes://istio-egressgateway") && request.path.startsWith("/politics") && source.principal != "cluster.local/ns/default/sa/politics"
      actions:
      - handler: egress-error-logger.stdio
        instances:
        - egress-access.logentry
    ---
    # Rule handle egress access to cnn.com
    apiVersion: "config.istio.io/v1alpha2"
    kind: rule
    metadata:
      name: handle-cnn-access
      namespace: istio-system
    spec:
      match: request.host.endsWith(".cnn.com") && context.reporter.uid.startsWith("kubernetes://istio-egressgateway") && source.principal != "cluster.local/ns/default/sa/politics"
      actions:
      - handler: egress-access-logger.stdio
        instances:
          - egress-access.logentry
      - handler: path-checker.listchecker
        instances:
          - request-path.listentry
    EOF
  6. Perform your usual test from SOURCE_POD:

    $ kubectl exec -it $SOURCE_POD -c sleep -- sh -c 'curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/politics; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/sport; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/health'
    403
    200
    200

    Since SOURCE_POD does not use the politics service account, access to edition.cnn.com/politics is forbidden, as previously.

  7. Perform the previous test from SOURCE_POD_POLITICS:

    $ kubectl exec -it $SOURCE_POD_POLITICS -c politics -- sh -c 'curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/politics; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/sport; curl -sL -o /dev/null -w "%{http_code}\n" http://edition.cnn.com/health'
    200
    200
    200

    Access to all the topics of edition.cnn.com is allowed.

  8. Examine the Mixer log and see that no more requests with sourcePrincipal equal cluster.local/ns/default/sa/politics appear in the log.

    $  kubectl -n istio-system logs -l istio-mixer-type=telemetry -c mixer | grep egress-access | grep cnn | tail -4

Comparison with HTTPS egress traffic control

In this use case the applications use HTTP and Istio Egress Gateway performs TLS origination for them. Alternatively, the applications could originate TLS themselves by issuing HTTPS requests to edition.cnn.com. In this section we describe both approaches and their pros and cons.

In the HTTP approach, the requests are sent unencrypted on the local host, intercepted by the Istio sidecar proxy and forwarded to the egress gateway. Since you configure Istio to use mutual TLS between the sidecar proxy and the egress gateway, the traffic leaves the pod encrypted. The egress gateway decrypts the traffic, inspects the URL path, the HTTP method and headers, reports telemetry and performs policy checks. If the request is not blocked by some policy check, the egress gateway performs TLS origination to the external destination (cnn.com in our case), so the request is encrypted again and sent encrypted to the external destination. The diagram below demonstrates the network flow of this approach. The HTTP protocol inside the gateway designates the protocol as seen by the gateway after decryption.

HTTP egress traffic through an egress gateway

The drawback of this approach is that the requests are sent unencrypted inside the pod, which may be against security policies in some organizations. Also some SDKs have external service URLs hard-coded, including the protocol, so sending HTTP requests could be impossible. The advantage of this approach is the ability to inspect HTTP methods, headers and URL paths, and to apply policies based on them.

In the HTTPS approach, the requests are encrypted end-to-end, from the application to the external destination. The diagram below demonstrates the network flow of this approach. The HTTPS protocol inside the gateway designates the protocol as seen by the gateway.

HTTPS egress traffic through an egress gateway

The end-to-end HTTPS is considered a better approach from the security point of view. However, since the traffic is encrypted the Istio proxies and the egress gateway can only see the source and destination IPs and the SNI of the destination. Since you configure Istio to use mutual TLS between the sidecar proxy and the egress gateway, the identity of the source is also known. The gateway is unable to inspect the URL path, the HTTP method and the headers of the requests, so no monitoring and policies based on the HTTP information can be possible. In our use case, the organization would be able to allow access to edition.cnn.com and to specify which applications are allowed to access edition.cnn.com. However, it will not be possible to allow or block access to specific URL paths of edition.cnn.com. Neither blocking access to edition.cnn.com/politics nor monitoring such access are possible with the HTTPS approach.

We expect that each organization will consider the pros and cons of the two approaches and choose the one most appropriate to its needs.

Summary

In this blog post we showed how different monitoring and policy mechanisms of Istio can be applied to HTTP egress traffic. Monitoring can be implemented by configuring a logging adapter. Access policies can be implemented by configuring VirtualServices or by configuring various policy check adapters. We demonstrated a simple policy that allowed certain URL paths only. We also showed a more complex policy that extended the simple policy by making an exemption to the applications with a certain service account. Finally, we compared HTTP-with-TLS-origination egress traffic with HTTPS egress traffic, in terms of control possibilities by Istio.

Cleanup

  1. Perform the instructions in Cleanup section of the Configure an Egress Gateway example.

  2. Delete the logging and policy checks configuration:

    $ kubectl delete logentry egress-access -n istio-system
    $ kubectl delete stdio egress-error-logger -n istio-system
    $ kubectl delete stdio egress-access-logger -n istio-system
    $ kubectl delete rule handle-politics -n istio-system
    $ kubectl delete rule handle-cnn-access -n istio-system
    $ kubectl delete -n istio-system listchecker path-checker
    $ kubectl delete -n istio-system listentry request-path
  3. Delete the politics source pod:

    $ sed 's/: sleep/: politics/g' @samples/sleep/sleep.yaml@ | kubectl delete -f -
    serviceaccount "politics" deleted
    service "politics" deleted
    deployment "politics" deleted
]]>
Fri, 22 Jun 2018 00:00:00 +0000/v1.24//blog/2018/egress-monitoring-access-control//v1.24//blog/2018/egress-monitoring-access-control/egresstraffic-managementaccess-controlmonitoring
Introducing the Istio v1alpha3 routing API

Up until now, Istio has provided a simple API for traffic management using four configuration resources: RouteRule, DestinationPolicy, EgressRule, and (Kubernetes) Ingress. With this API, users have been able to easily manage the flow of traffic in an Istio service mesh. The API has allowed users to route requests to specific versions of services, inject delays and failures for resilience testing, add timeouts and circuit breakers, and more, all without changing the application code itself.

While this functionality has proven to be a very compelling part of Istio, user feedback has also shown that this API does have some shortcomings, specifically when using it to manage very large applications containing thousands of services, and when working with protocols other than HTTP. Furthermore, the use of Kubernetes Ingress resources to configure external traffic has proven to be woefully insufficient for our needs.

To address these, and other concerns, a new traffic management API, a.k.a. v1alpha3, is being introduced, which will completely replace the previous API going forward. Although the v1alpha3 model is fundamentally the same, it is not backward compatible and will require manual conversion from the old API.

To justify this disruption, the v1alpha3 API has gone through a long and painstaking community review process that has hopefully resulted in a greatly improved API that will stand the test of time. In this article, we will introduce the new configuration model and attempt to explain some of the motivation and design principles that influenced it.

Design principles

A few key design principles played a role in the routing model redesign:

  • Explicitly model infrastructure as well as intent. For example, in addition to configuring an ingress gateway, the component (controller) implementing it can also be specified.
  • The authoring model should be “producer oriented” and “host centric” as opposed to compositional. For example, all rules associated with a particular host are configured together, instead of individually.
  • Clear separation of routing from post-routing behaviors.

Configuration resources in v1alpha3

A typical mesh will have one or more load balancers (we call them gateways) that terminate TLS from external networks and allow traffic into the mesh. Traffic then flows through internal services via sidecar proxies. It is also common for applications to consume external services (e.g., Google Maps API). These may be called directly or, in certain deployments, all traffic exiting the mesh may be forced through dedicated egress gateways. The following diagram depicts this mental model.

Gateways in an Istio service mesh

With the above setup in mind, v1alpha3 introduces the following new configuration resources to control traffic routing into, within, and out of the mesh.

  1. Gateway
  2. VirtualService
  3. DestinationRule
  4. ServiceEntry

VirtualService, DestinationRule, and ServiceEntry replace RouteRule, DestinationPolicy, and EgressRule respectively. The Gateway is a platform independent abstraction to model the traffic flowing into dedicated middleboxes.

The figure below depicts the flow of control across configuration resources.

Relationship between different v1alpha3 elements

Gateway

A Gateway configures a load balancer for HTTP/TCP traffic, regardless of where it will be running. Any number of gateways can exist within the mesh and multiple different gateway implementations can co-exist. In fact, a gateway configuration can be bound to a particular workload by specifying the set of workload (pod) labels as part of the configuration, allowing users to reuse off the shelf network appliances by writing a simple gateway controller.
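For illustration, a hedged sketch of such a workload binding, assuming the target gateway pods carry an istio: ingressgateway label; the gateway name and host are hypothetical:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-custom-gateway        # hypothetical name
spec:
  selector:
    istio: ingressgateway        # bind this configuration to workloads (pods) with this label
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "gateway.example.com"      # hypothetical host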

For ingress traffic management, you might ask: Why not reuse Kubernetes Ingress APIs? The Ingress APIs proved to be incapable of expressing Istio’s routing needs. By trying to draw a common denominator across different HTTP proxies, the Ingress is only able to support the most basic HTTP routing and ends up pushing every other feature of modern proxies into non-portable annotations.

Istio Gateway overcomes the Ingress shortcomings by separating the L4-L6 spec from L7. It only configures the L4-L6 functions (e.g., ports to expose, TLS configuration) that are uniformly implemented by all good L7 proxies. Users can then use standard Istio rules to control HTTP requests as well as TCP traffic entering a Gateway by binding a VirtualService to it.

For example, the following simple Gateway configures a load balancer to allow external https traffic for host bookinfo.com into the mesh:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: bookinfo-gateway
spec:
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    hosts:
    - bookinfo.com
    tls:
      mode: SIMPLE
      serverCertificate: /tmp/tls.crt
      privateKey: /tmp/tls.key

To configure the corresponding routes, a VirtualService (described in the following section) must be defined for the same host and bound to the Gateway using the gateways field in the configuration:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: bookinfo
spec:
  hosts:
    - bookinfo.com
  gateways:
  - bookinfo-gateway # <---- bind to gateway
  http:
  - match:
    - uri:
        prefix: /reviews
    route:
    ...

The Gateway can be used to model an edge-proxy or a purely internal proxy as shown in the first figure. Irrespective of the location, all gateways can be configured and controlled in the same way.

VirtualService

Replacing route rules with something called “virtual services” might seem peculiar at first, but in reality it’s fundamentally a much better name for what is being configured, especially after redesigning the API to address the scalability issues with the previous model.

In effect, what has changed is that instead of configuring routing using a set of individual configuration resources (rules) for a particular destination service, each containing a precedence field to control the order of evaluation, we now configure the (virtual) destination itself, with all of its rules in an ordered list within a corresponding VirtualService resource. For example, where previously we had two RouteRule resources for the Bookinfo application’s reviews service, like this:

apiVersion: config.istio.io/v1alpha2
kind: RouteRule
metadata:
  name: reviews-default
spec:
  destination:
    name: reviews
  precedence: 1
  route:
  - labels:
      version: v1
---
apiVersion: config.istio.io/v1alpha2
kind: RouteRule
metadata:
  name: reviews-test-v2
spec:
  destination:
    name: reviews
  precedence: 2
  match:
    request:
      headers:
        cookie:
          regex: "^(.*?;)?(user=jason)(;.*)?$"
  route:
  - labels:
      version: v2

In v1alpha3, we provide the same configuration in a single VirtualService resource:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews
  http:
  - match:
    - headers:
        cookie:
          regex: "^(.*?;)?(user=jason)(;.*)?$"
    route:
    - destination:
        host: reviews
        subset: v2
  - route:
    - destination:
        host: reviews
        subset: v1

As you can see, both of the rules for the reviews service are consolidated in one place, which at first may or may not seem preferable. However, if you look closer at this new model, you’ll see there are fundamental differences that make v1alpha3 vastly more functional.

First of all, notice that the destination service for the VirtualService is specified using a hosts field (repeated field, in fact) and is then again specified in a destination field of each of the route specifications. This is a very important difference from the previous model.

A VirtualService describes the mapping between one or more user-addressable destinations to the actual destination workloads inside the mesh. In our example, they are the same, however, the user-addressed hosts can be any DNS names with optional wildcard prefix or CIDR prefix that will be used to address the service. This can be particularly useful in facilitating turning monoliths into a composite service built out of distinct microservices without requiring the consumers of the service to adapt to the transition.

For example, the following rule allows users to address both the reviews and ratings services of the Bookinfo application as if they are parts of a bigger (virtual) service at http://bookinfo.com/:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: bookinfo
spec:
  hosts:
    - bookinfo.com
  http:
  - match:
    - uri:
        prefix: /reviews
    route:
    - destination:
        host: reviews
  - match:
    - uri:
        prefix: /ratings
    route:
    - destination:
        host: ratings
  ...

The hosts of a VirtualService do not actually have to be part of the service registry, they are simply virtual destinations. This allows users to model traffic for virtual hosts that do not have routable entries inside the mesh. These hosts can be exposed outside the mesh by binding the VirtualService to a Gateway configuration for the same host (as described in the previous section).

In addition to this fundamental restructuring, VirtualService includes several other important changes:

  1. Multiple match conditions can be expressed inside the VirtualService configuration, reducing the need for redundant rules.

  2. Each service version has a name (called a service subset). The set of pods/VMs belonging to a subset is defined in a DestinationRule, described in the following section.

  3. VirtualService hosts can be specified using wildcard DNS prefixes to create a single rule for all matching services. For example, in Kubernetes, to apply the same rewrite rule for all services in the foo namespace, the VirtualService would use *.foo.svc.cluster.local as the host.

DestinationRule

A DestinationRule configures the set of policies to be applied while forwarding traffic to a service. They are intended to be authored by service owners, describing the circuit breakers, load balancer settings, TLS settings, etc.. DestinationRule is more or less the same as its predecessor, DestinationPolicy, with the following exceptions:

  1. The host of a DestinationRule can include wildcard prefixes, allowing a single rule to be specified for many actual services.
  2. A DestinationRule defines addressable subsets (i.e., named versions) of the corresponding destination host. These subsets are used in VirtualService route specifications when sending traffic to specific versions of the service. Naming versions this way allows us to cleanly refer to them across different virtual services, simplify the stats that Istio proxies emit, and to encode subsets in SNI headers.

A DestinationRule that configures policies and subsets for the reviews service might look something like this:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: RANDOM
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN
  - name: v3
    labels:
      version: v3

Notice that, unlike DestinationPolicy, multiple policies (e.g., default and v2-specific) are specified in a single DestinationRule configuration.

ServiceEntry

ServiceEntry is used to add additional entries into the service registry that Istio maintains internally. It is most commonly used to allow one to model traffic to external dependencies of the mesh such as APIs consumed from the web or traffic to services in legacy infrastructure.

Everything you could previously configure using an EgressRule can just as easily be done with a ServiceEntry. For example, access to a simple external service from inside the mesh can be enabled using a configuration something like this:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: foo-ext
spec:
  hosts:
  - foo.com
  ports:
  - number: 80
    name: http
    protocol: HTTP

That said, ServiceEntry has significantly more functionality than its predecessor. First of all, a ServiceEntry is not limited to external service configuration, it can be of two types: mesh-internal or mesh-external. Mesh-internal entries are like all other internal services but are used to explicitly add services to the mesh. They can be used to add services as part of expanding the service mesh to include unmanaged infrastructure (e.g., VMs added to a Kubernetes-based service mesh). Mesh-external entries represent services external to the mesh. For them, mutual TLS authentication is disabled and policy enforcement is performed on the client-side, instead of on the usual server-side for internal service requests.
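For example, a hedged sketch of a mesh-internal entry for a service running on a VM; the host, port and address are hypothetical:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: vm-legacy-service        # hypothetical name
spec:
  hosts:
  - legacy.example.internal      # hypothetical host used to address the VM workload
  location: MESH_INTERNAL        # treat the endpoints as part of the mesh
  ports:
  - number: 8080
    name: http
    protocol: HTTP
  resolution: STATIC
  endpoints:
  - address: 10.1.2.3            # hypothetical VM IP address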

Because a ServiceEntry configuration simply adds a destination to the internal service registry, it can be used in conjunction with a VirtualService and/or DestinationRule, just like any other service in the registry. The following DestinationRule, for example, can be used to initiate mutual TLS connections for an external service:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: foo-ext
spec:
  host: foo.com
  trafficPolicy:
    tls:
      mode: MUTUAL
      clientCertificate: /etc/certs/myclientcert.pem
      privateKey: /etc/certs/client_private_key.pem
      caCertificates: /etc/certs/rootcacerts.pem

In addition to its expanded generality, ServiceEntry provides several other improvements over EgressRule including the following:

  1. A single ServiceEntry can configure multiple service endpoints, which previously would have required multiple EgressRules.
  2. The resolution mode for the endpoints is now configurable (NONE, STATIC, or DNS).
  3. Additionally, we are working on addressing another pain point: the need to access secure external services over plain text ports (e.g., http://google.com:443). This should be fixed in the coming weeks, allowing you to directly access https://google.com from your application. Stay tuned for an Istio patch release (0.8.x) that addresses this limitation.

Creating and deleting v1alpha3 route rules

Because all route rules for a given destination are now stored together as an ordered list in a single VirtualService resource, adding a second and subsequent rules for a particular destination is no longer done by creating a new (RouteRule) resource, but instead by updating the one-and-only VirtualService resource for the destination.

old routing rules:

$ kubectl apply -f my-second-rule-for-destination-abc.yaml

v1alpha3 routing rules:

$ kubectl apply -f my-updated-rules-for-destination-abc.yaml

Deleting route rules other than the last one for a particular destination is also done by updating the existing resource using kubectl apply.

When adding or removing routes that refer to service versions, the subsets will need to be updated in the service’s corresponding DestinationRule. As you might have guessed, this is also done using kubectl apply.

Summary

The Istio v1alpha3 routing API has significantly more functionality than its predecessor, but unfortunately is not backwards compatible, requiring a one time manual conversion. The previous configuration resources, RouteRule, DestinationPolicy, and EgressRule, will not be supported from Istio 0.9 onwards. Kubernetes users can continue to use Ingress to configure their edge load balancers for basic routing. However, advanced routing features (e.g., traffic split across two versions) will require use of Gateway, a significantly more functional and highly recommended Ingress replacement.

Acknowledgments

Credit for the routing model redesign and implementation work goes to the following people (in alphabetical order):

  • Frank Budinsky (IBM)
  • Zack Butcher (Google)
  • Greg Hanson (IBM)
  • Costin Manolache (Google)
  • Martin Ostrowski (Google)
  • Shriram Rajagopalan (VMware)
  • Louis Ryan (Google)
  • Isaiah Snell-Feikema (IBM)
  • Kuat Yessenov (Google)
]]>
Wed, 25 Apr 2018 00:00:00 +0000/v1.24//blog/2018/v1alpha3-routing//v1.24//blog/2018/v1alpha3-routing/traffic-management
Configuring Istio Ingress with AWS NLB

This post provides instructions to use and configure Istio ingress with an AWS Network Load Balancer.

A Network Load Balancer (NLB) can be used instead of a Classic Load Balancer. You can see the comparison between the different AWS load balancers for more explanation.

Prerequisites

The following instructions require a Kubernetes 1.9.0 or newer cluster.

IAM policy

You need to apply a policy to the master role in order to be able to provision a network load balancer.

  1. In the AWS IAM console, click on Policies and then click Create policy:

    Create a new policy
  2. Select json:

    Select json
  3. Copy/paste text below:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "kopsK8sNLBMasterPermsRestrictive",
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeVpcs",
                    "elasticloadbalancing:AddTags",
                    "elasticloadbalancing:CreateListener",
                    "elasticloadbalancing:CreateTargetGroup",
                    "elasticloadbalancing:DeleteListener",
                    "elasticloadbalancing:DeleteTargetGroup",
                    "elasticloadbalancing:DescribeListeners",
                    "elasticloadbalancing:DescribeLoadBalancerPolicies",
                    "elasticloadbalancing:DescribeTargetGroups",
                    "elasticloadbalancing:DescribeTargetHealth",
                    "elasticloadbalancing:ModifyListener",
                    "elasticloadbalancing:ModifyTargetGroup",
                    "elasticloadbalancing:RegisterTargets",
                    "elasticloadbalancing:SetLoadBalancerPoliciesOfListener"
                ],
                "Resource": [
                    "*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeVpcs",
                    "ec2:DescribeRegions"
                ],
                "Resource": "*"
            }
        ]
    }
  4. Click Review policy, fill in all the fields, and click Create policy:

    Validate policy
  5. Click on Roles, select the role of your master nodes, and click Attach policy:

    Attach policy
  6. Your policy is now attached to your master role.

Generate the Istio manifest

To use an AWS NLB, it is necessary to add an AWS-specific annotation to the Istio installation. These instructions explain how to add the annotation.

Save this as the file override.yaml:

gateways:
  istio-ingressgateway:
    serviceAnnotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

Generate a manifest with Helm:

$ helm template install/kubernetes/helm/istio --namespace istio -f override.yaml > $HOME/istio.yaml
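Presumably, the generated manifest is then applied to the cluster; a minimal sketch, assuming the istio namespace used above does not exist yet:

$ kubectl create namespace istio
$ kubectl apply -f $HOME/istio.yaml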
]]>
Fri, 20 Apr 2018 00:00:00 +0000/v1.24//blog/2018/aws-nlb//v1.24//blog/2018/aws-nlb/ingresstraffic-managementaws
Istio Soft Multi-Tenancy Support

Multi-tenancy is commonly used in many environments across many different applications, but the implementation details and functionality provided on a per tenant basis do not follow one model in all environments. The Kubernetes multi-tenancy working group is working to define the multi-tenant use cases and functionality that should be available within Kubernetes. However, from their work so far it is clear that only “soft multi-tenancy” is possible due to the inability to fully protect against malicious containers or workloads gaining access to other tenant’s pods or kernel resources.

Soft multi-tenancy

For this blog, “soft multi-tenancy” is defined as having a single Kubernetes control plane with multiple Istio control planes and multiple meshes, one control plane and one mesh per tenant. The cluster administrator gets control and visibility across all the Istio control planes, while the tenant administrator only gets control of a specific Istio instance. Separation between the tenants is provided by Kubernetes namespaces and RBAC.

One use case for this deployment model is a shared corporate infrastructure where malicious actions are not expected, but a clean separation of the tenants is still required.

Potential future Istio multi-tenant deployment models are described at the bottom of this blog.

Deployment

Multiple Istio control planes

Deploying multiple Istio control planes starts by replacing all namespace references in a manifest file with the desired namespace. Using istio.yaml as an example, if two tenant-level Istio control planes are required, the first can use the istio.yaml default namespace of istio-system and a second control plane can be created by generating a new yaml file with a different namespace. As an example, the following command creates a yaml file with the Istio namespace of istio-system1.

$ cat istio.yaml | sed s/istio-system/istio-system1/g > istio-system1.yaml

The istio.yaml file contains the details of the Istio control plane deployment, including the pods that make up the control plane (Mixer, Pilot, Ingress, Galley, CA). Deploying the two Istio control plane yaml files:

$ kubectl apply -f install/kubernetes/istio.yaml
$ kubectl apply -f install/kubernetes/istio-system1.yaml

Results in two Istio control planes running in two namespaces.

$ kubectl get pods --all-namespaces
NAMESPACE       NAME                                       READY     STATUS    RESTARTS   AGE
istio-system    istio-ca-ffbb75c6f-98w6x                   1/1       Running   0          15d
istio-system    istio-ingress-68d65fc5c6-dnvfl             1/1       Running   0          15d
istio-system    istio-mixer-5b9f8dffb5-8875r               3/3       Running   0          15d
istio-system    istio-pilot-678fc976c8-b8tv6               2/2       Running   0          15d
istio-system1   istio-ca-5f496fdbcd-lqhlk                  1/1       Running   0          15d
istio-system1   istio-ingress-68d65fc5c6-2vldg             1/1       Running   0          15d
istio-system1   istio-mixer-7d4f7b9968-66z44               3/3       Running   0          15d
istio-system1   istio-pilot-5bb6b7669c-779vb               2/2       Running   0          15d

The Istio sidecar and addon manifests, if required, must also be deployed to match the namespace configured for the tenant’s Istio control plane.
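A hedged sketch of the same namespace substitution applied to an addon, assuming the addon manifests of that release live under install/kubernetes/addons/:

$ cat install/kubernetes/addons/prometheus.yaml | sed s/istio-system/istio-system1/g | kubectl apply -f -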

The execution of these two yaml files is the responsibility of the cluster administrator, not the tenant level administrator. Additional RBAC restrictions will also need to be configured and applied by the cluster administrator, limiting the tenant administrator to only the assigned namespace.

Split common and namespace specific resources

The manifest files in the Istio repositories create both common resources that would be used by all Istio control planes as well as resources that are replicated per control plane. Although it is a simple matter to deploy multiple control planes by replacing the istio-system namespace references as described above, a better approach is to split the manifests into a common part that is deployed once for all tenants and a tenant specific part. For the Custom Resource Definitions, the roles and the role bindings should be separated out from the provided Istio manifests. Additionally, the roles and role bindings in the provided Istio manifests are probably unsuitable for a multi-tenant environment and should be modified or augmented as described in the next section.

Kubernetes RBAC for Istio control plane resources

To restrict a tenant administrator to a single Istio namespace, the cluster administrator would create a manifest containing, at a minimum, a Role and RoleBinding similar to the one below. In this example, a tenant administrator named sales-admin is limited to the namespace istio-system1. A completed manifest would contain many more apiGroups under the Role providing resource access to the tenant administrator.

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: istio-system1
  name: ns-access-for-sales-admin-istio-system1
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["*"]
  verbs: ["*"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: access-all-istio-system1
  namespace: istio-system1
subjects:
- kind: User
  name: sales-admin
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: ns-access-for-sales-admin-istio-system1
  apiGroup: rbac.authorization.k8s.io

Watching specific namespaces for service discovery

In addition to creating RBAC rules limiting the tenant administrator’s access to a specific Istio control plane, the Istio manifest must be updated to specify the application namespace that Pilot should watch for creation of its xDS cache. This is done by starting the Pilot component with the additional command line arguments --appNamespace, ns-1. Where ns-1 is the namespace that the tenant’s application will be deployed in. An example snippet from the istio-system1.yaml file is shown below.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: istio-pilot
  namespace: istio-system1
  annotations:
    sidecar.istio.io/inject: "false"
spec:
  replicas: 1
  template:
    metadata:
      labels:
        istio: pilot
    spec:
      serviceAccountName: istio-pilot-service-account
      containers:
      - name: discovery
        image: docker.io/<user ID>/pilot:<tag>
        imagePullPolicy: IfNotPresent
        args: ["discovery", "-v", "2", "--admission-service", "istio-pilot", "--appNamespace", "ns-1"]
        ports:
        - containerPort: 8080
        - containerPort: 443

Deploying the tenant application in a namespace

Now that the cluster administrator has created the tenant’s namespace (ex. istio-system1) and Pilot’s service discovery has been configured to watch for a specific application namespace (ex. ns-1), create the application manifests to deploy in that tenant’s specific namespace. For example:

apiVersion: v1
kind: Namespace
metadata:
  name: ns-1

And add the namespace reference to each resource type included in the application’s manifest file. For example:

apiVersion: v1
kind: Service
metadata:
  name: details
  labels:
    app: details
  namespace: ns-1

Although not shown, the application namespaces will also have RBAC settings limiting access to certain resources. These RBAC settings could be set by the cluster administrator and/or the tenant administrator.
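A hedged sketch of what such application-namespace RBAC could look like, mirroring the control plane example above and assuming the same sales-admin user and the ns-1 application namespace:

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: ns-1
  name: ns-access-for-sales-admin-ns-1    # hypothetical role name
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["*"]
  verbs: ["*"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: access-all-ns-1                   # hypothetical binding name
  namespace: ns-1
subjects:
- kind: User
  name: sales-admin
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: ns-access-for-sales-admin-ns-1
  apiGroup: rbac.authorization.k8s.io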

Using istioctl in a multi-tenant environment

When defining route rules or destination policies, it is necessary to ensure that the istioctl command is scoped to the namespace the Istio control plane is running in to ensure the resource is created in the proper namespace. Additionally, the rule itself must be scoped to the tenant’s namespace so that it will be applied properly to that tenant’s mesh. The -i option is used to create (or get or describe) the rule in the namespace that the Istio control plane is deployed in. The -n option will scope the rule to the tenant’s mesh and should be set to the namespace that the tenant’s app is deployed in. Note that the -n option can be skipped on the command line if the .yaml file for the resource scopes it properly instead.

For example, the following command would be required to add a route rule to the istio-system1 namespace:

$ istioctl -i istio-system1 create -n ns-1 -f route_rule_v2.yaml

And can be displayed using the command:

$ istioctl -i istio-system1 -n ns-1 get routerule
NAME                  KIND                                  NAMESPACE
details-Default       RouteRule.v1alpha2.config.istio.io    ns-1
productpage-default   RouteRule.v1alpha2.config.istio.io    ns-1
ratings-default       RouteRule.v1alpha2.config.istio.io    ns-1
reviews-default       RouteRule.v1alpha2.config.istio.io    ns-1

See the Multiple Istio control planes section of this document for more details on namespace requirements in a multi-tenant environment.

Test results

Following the instructions above, a cluster administrator can create an environment limiting, via RBAC and namespaces, what a tenant administrator can deploy.

After deployment, accessing the Istio control plane pods assigned to a specific tenant administrator is permitted:

$ kubectl get pods -n istio-system
NAME                                      READY     STATUS    RESTARTS   AGE
grafana-78d649479f-8pqk9                  1/1       Running   0          1d
istio-ca-ffbb75c6f-98w6x                  1/1       Running   0          1d
istio-ingress-68d65fc5c6-dnvfl            1/1       Running   0          1d
istio-mixer-5b9f8dffb5-8875r              3/3       Running   0          1d
istio-pilot-678fc976c8-b8tv6              2/2       Running   0          1d
istio-sidecar-injector-7587bd559d-5tgk6   1/1       Running   0          1d
prometheus-cf8456855-hdcq7                1/1       Running   0          1d

However, accessing all the cluster’s pods is not permitted:

$ kubectl get pods --all-namespaces
Error from server (Forbidden): pods is forbidden: User "dev-admin" cannot list pods at the cluster scope

And neither is accessing another tenant’s namespace:

$ kubectl get pods -n istio-system1
Error from server (Forbidden): pods is forbidden: User "dev-admin" cannot list pods in the namespace "istio-system1"

The tenant administrator can deploy applications in the application namespace configured for that tenant. As an example, updating the Bookinfo manifests and then deploying under the tenant’s application namespace of ns-0, listing the pods in use by this tenant’s namespace is permitted:

$ kubectl get pods -n ns-0
NAME                              READY     STATUS    RESTARTS   AGE
details-v1-64b86cd49-b7rkr        2/2       Running   0          1d
productpage-v1-84f77f8747-rf2mt   2/2       Running   0          1d
ratings-v1-5f46655b57-5b4c5       2/2       Running   0          1d
reviews-v1-ff6bdb95b-pm5lb        2/2       Running   0          1d
reviews-v2-5799558d68-b989t       2/2       Running   0          1d
reviews-v3-58ff7d665b-lw5j9       2/2       Running   0          1d

But accessing another tenant’s application namespace is not:

$ kubectl get pods -n ns-1
Error from server (Forbidden): pods is forbidden: User "dev-admin" cannot list pods in the namespace "ns-1"

If add-on tools, for example Prometheus, are deployed (also limited to an Istio namespace), the statistical results returned would represent only the traffic seen from that tenant’s application namespace.

Conclusion

The evaluation performed indicates Istio has sufficient capabilities and security to meet a small number of multi-tenant use cases. It also shows that Istio and Kubernetes cannot provide sufficient capabilities and security for other use cases, especially those that require complete security and isolation between untrusted tenants. The improvements required to reach a stronger model of security and isolation require work in container technology, e.g. Kubernetes, rather than improvements in Istio capabilities.

Issues

  • The CA (Certificate Authority) and Mixer pod logs from one tenant’s Istio control plane (e.g. istio-system namespace) contained ‘info’ messages from a second tenant’s Istio control plane (e.g. istio-system1 namespace).

Challenges with other multi-tenancy models

Other multi-tenancy deployment models were considered:

  1. A single mesh with multiple applications, one for each tenant on the mesh. The cluster administrator gets control and visibility mesh wide and across all applications, while the tenant administrator only gets control of a specific application.

  2. A single Istio control plane with multiple meshes, one mesh per tenant. The cluster administrator gets control and visibility across the entire Istio control plane and all meshes, while the tenant administrator only gets control of a specific mesh.

  3. A single cloud environment (cluster controlled), but multiple Kubernetes control planes (tenant controlled).

These options either can’t be properly supported without code changes or don’t fully address the use cases.

Current Istio capabilities are poorly suited to support the first model, as Istio lacks sufficient RBAC capabilities to distinguish cluster operations from tenant operations. Additionally, having multiple tenants under one mesh is too insecure with the current mesh model and the way Istio drives configuration to the Envoy proxies.

Regarding the second option, the current Istio paradigm assumes a single mesh per Istio control plane. The changes needed to support this model are substantial: they would require finer-grained scoping of resources and security domains based on namespaces, as well as additional Istio RBAC changes. This model will likely be addressed by future work, but it is not currently possible.

The third model doesn’t satisfy most use cases, as most cluster administrators prefer a common Kubernetes control plane which they provide as a PaaS to their tenants.

Future work

Allowing a single Istio control plane to control multiple meshes would be an obvious next feature. An additional improvement is to provide a single mesh that can host different tenants with some level of isolation and security between the tenants. This could be done by partitioning within a single control plane using the same logical notion of namespace as Kubernetes. A document has been started within the Istio community to define additional use cases and the Istio functionality required to support those use cases.

]]>
Thu, 19 Apr 2018 00:00:00 +0000/v1.24//blog/2018/soft-multitenancy//v1.24//blog/2018/soft-multitenancy/tenancy
Traffic Mirroring with Istio for Testing in Production

Trying to enumerate all the possible combinations of test cases for testing services in non-production/test environments can be daunting. In some cases, you’ll find that all of the effort that goes into cataloging these use cases doesn’t match up to real production use cases. Ideally, we could use live production use cases and traffic to help illuminate all of the feature areas of the service under test that we might miss in more contrived testing environments.

Istio can help here. With the release of Istio 0.5, Istio can mirror traffic to help test your services. You can write route rules similar to the following to enable traffic mirroring:

apiVersion: config.istio.io/v1alpha2
kind: RouteRule
metadata:
  name: mirror-traffic-to-httpbin-v2
spec:
  destination:
    name: httpbin
  precedence: 11
  route:
  - labels:
      version: v1
    weight: 100
  - labels:
      version: v2
    weight: 0
  mirror:
    name: httpbin
    labels:
      version: v2

A few things to note here:

  • When traffic gets mirrored to a different service, that happens outside the critical path of the request
  • Responses to any mirrored traffic are ignored; traffic is mirrored as “fire-and-forget”
  • You’ll need to have the 0-weighted route to hint to Istio to create the proper Envoy cluster under the covers; this should be ironed out in future releases.

Learn more about mirroring by visiting the Mirroring Task and see a more comprehensive treatment of this scenario on my blog.

]]>
Thu, 08 Feb 2018 00:00:00 +0000/v1.24//blog/2018/traffic-mirroring//v1.24//blog/2018/traffic-mirroring/traffic-managementmirroring
Consuming External TCP Services

In my previous blog post, Consuming External Web Services, I described how external services can be consumed by in-mesh Istio applications via HTTPS. In this post, I demonstrate consuming external services over TCP. You will use the Istio Bookinfo sample application, the version in which the book ratings data is persisted in a MySQL database. You deploy this database outside the cluster and configure the ratings microservice to use it. You define a Service Entry to allow the in-mesh applications to access the external database.

Bookinfo sample application with external ratings database

First, you set up a MySQL database instance to hold book ratings data outside of your Kubernetes cluster. Then you modify the Bookinfo sample application to use your database.

Setting up the database for ratings data

For this task you set up an instance of MySQL. You can use any MySQL instance; I used Compose for MySQL. I used mysqlsh (MySQL Shell) as a MySQL client to feed the ratings data.

  1. Set the MYSQL_DB_HOST and MYSQL_DB_PORT environment variables:

    $ export MYSQL_DB_HOST=<your MySQL database host>
    $ export MYSQL_DB_PORT=<your MySQL database port>

    In case of a local MySQL database with the default port, the values are localhost and 3306, respectively.

  2. To initialize the database, run the following command entering the password when prompted. The command is performed with the credentials of the admin user, created by default by Compose for MySQL.

    $ curl -s https://raw.githubusercontent.com/istio/istio/release-1.24/samples/bookinfo/src/mysql/mysqldb-init.sql | mysqlsh --sql --ssl-mode=REQUIRED -u admin -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT

    OR

    When using the mysql client and a local MySQL database, run:

    $ curl -s https://raw.githubusercontent.com/istio/istio/release-1.24/samples/bookinfo/src/mysql/mysqldb-init.sql | mysql -u root -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT
  3. Create a user with the name bookinfo and grant it SELECT privilege on the test.ratings table:

    $ mysqlsh --sql --ssl-mode=REQUIRED -u admin -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT -e "CREATE USER 'bookinfo' IDENTIFIED BY '<password you choose>'; GRANT SELECT ON test.ratings to 'bookinfo';"

    OR

    For mysql and the local database, the command is:

    $ mysql -u root -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT -e "CREATE USER 'bookinfo' IDENTIFIED BY '<password you choose>'; GRANT SELECT ON test.ratings to 'bookinfo';"

    Here you apply the principle of least privilege. This means that you do not use your admin user in the Bookinfo application. Instead, you create a special user for the Bookinfo application, bookinfo, with minimal privileges. In this case, the bookinfo user only has the SELECT privilege on a single table.

    After running the command to create the user, you may want to clean your bash history by checking the number of the last command and running history -d <the number of the command that created the user>. You don’t want the password of the new user to be stored in the bash history. If you’re using mysql, remove the last command from the ~/.mysql_history file as well. Read more about password protection of the newly created user in the MySQL documentation.

  4. Inspect the created ratings to see that everything worked as expected:

    $ mysqlsh --sql --ssl-mode=REQUIRED -u bookinfo -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT -e "select * from test.ratings;"
    Enter password:
    +----------+--------+
    | ReviewID | Rating |
    +----------+--------+
    |        1 |      5 |
    |        2 |      4 |
    +----------+--------+

    OR

    For mysql and the local database:

    $ mysql -u bookinfo -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT -e "select * from test.ratings;"
    Enter password:
    +----------+--------+
    | ReviewID | Rating |
    +----------+--------+
    |        1 |      5 |
    |        2 |      4 |
    +----------+--------+
  5. Set the ratings temporarily to 1 to provide a visual clue when our database is used by the Bookinfo ratings service:

    $ mysqlsh --sql --ssl-mode=REQUIRED -u admin -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT -e "update test.ratings set rating=1; select * from test.ratings;"
    Enter password:
    
    Rows matched: 2  Changed: 2  Warnings: 0
    +----------+--------+
    | ReviewID | Rating |
    +----------+--------+
    |        1 |      1 |
    |        2 |      1 |
    +----------+--------+

    OR

    For mysql and the local database:

    $ mysql -u root -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT -e "update test.ratings set rating=1; select * from test.ratings;"
    Enter password:
    +----------+--------+
    | ReviewID | Rating |
    +----------+--------+
    |        1 |      1 |
    |        2 |      1 |
    +----------+--------+

    You used the admin user (and root for the local database) in the last command since the bookinfo user does not have the UPDATE privilege on the test.ratings table.

Now you are ready to deploy a version of the Bookinfo application that will use your database.

Initial setting of Bookinfo application

To demonstrate the scenario of using an external database, you start with a Kubernetes cluster with Istio installed. Then you deploy the Istio Bookinfo sample application, apply the default destination rules, and change Istio to the blocking-egress-by-default policy.

This application uses the ratings microservice to fetch book ratings, a number between 1 and 5. The ratings are displayed as stars for each review. There are several versions of the ratings microservice. Some use MongoDB, others use MySQL as their database.

The example commands in this blog post work with Istio 0.8+, with or without mutual TLS enabled.

As a reminder, here is the end-to-end architecture of the application from the Bookinfo sample application.

The original Bookinfo application

Use the database for ratings data in Bookinfo application

  1. Modify the deployment spec of a version of the ratings microservice that uses a MySQL database, to use your database instance. The spec is in samples/bookinfo/platform/kube/bookinfo-ratings-v2-mysql.yaml of an Istio release archive. Edit the following lines:

    - name: MYSQL_DB_HOST
      value: mysqldb
    - name: MYSQL_DB_PORT
      value: "3306"
    - name: MYSQL_DB_USER
      value: root
    - name: MYSQL_DB_PASSWORD
      value: password

    Replace the values in the snippet above, specifying the database host, port, user, and password. Note that the correct way to work with passwords in a container’s environment variables in Kubernetes is to use secrets. For this example task only, you may want to write the password directly in the deployment spec. Do not do it in a real environment! I also assume everyone realizes that "password" should not be used as a password…

  2. Apply the modified spec to deploy the version of the ratings microservice, v2-mysql, that will use your database.

    $ kubectl apply -f samples/bookinfo/platform/kube/bookinfo-ratings-v2-mysql.yaml
    deployment "ratings-v2-mysql" created
  3. Route all the traffic destined to the reviews service to its v3 version. You do this to ensure that the reviews service always calls the ratings service. In addition, route all the traffic destined to the ratings service to ratings v2-mysql that uses your database.

    Specify the routing for both services above by adding two virtual services. These virtual services are specified in samples/bookinfo/networking/virtual-service-ratings-mysql.yaml of an Istio release archive. Important: make sure you applied the default destination rules before running the following command.

    $ kubectl apply -f samples/bookinfo/networking/virtual-service-ratings-mysql.yaml

The updated architecture appears below. Note that the blue arrows inside the mesh mark the traffic configured according to the virtual services we added. According to the virtual services, the traffic is sent to reviews v3 and ratings v2-mysql.

The Bookinfo application with ratings v2-mysql and an external MySQL database

Note that the MySQL database is outside the Istio service mesh, or more precisely outside the Kubernetes cluster. The boundary of the service mesh is marked by a dashed line.

Access the webpage

Access the webpage of the application, after determining the ingress IP and port.
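
If you need a reminder of how to do that, the following is one common approach, assuming a LoadBalancer-type istio-ingressgateway service exposing a port named http2; the service name and port name may differ in your environment:

$ export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
$ curl -s "http://${INGRESS_HOST}:${INGRESS_PORT}/productpage" | grep -o "<title>.*</title>"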

You have a problem… Instead of the rating stars, the message “Ratings service is currently unavailable” is displayed below each review:

The Ratings service error messages

As in Consuming External Web Services, you experience graceful service degradation, which is good. The application did not crash due to the error in the ratings microservice. The webpage of the application correctly displayed the book information, the details, and the reviews, just without the rating stars.

You have the same problem as in Consuming External Web Services, namely all the traffic outside the Kubernetes cluster, both TCP and HTTP, is blocked by default by the sidecar proxies. To enable such traffic for TCP, a mesh-external service entry for TCP must be defined.

Mesh-external service entry for an external MySQL instance

TCP mesh-external service entries come to our rescue.

  1. Get the IP address of your MySQL database instance. As an option, you can use the host command:

    $ export MYSQL_DB_IP=$(host $MYSQL_DB_HOST | grep " has address " | cut -d" " -f4)

    For a local database, set MYSQL_DB_IP to contain the IP of your machine, accessible from your cluster.

  2. Define a TCP mesh-external service entry:

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: mysql-external
    spec:
      hosts:
      - $MYSQL_DB_HOST
      addresses:
      - $MYSQL_DB_IP/32
      ports:
      - name: tcp
        number: $MYSQL_DB_PORT
        protocol: tcp
      location: MESH_EXTERNAL
    EOF
  3. Review the service entry you just created and check that it contains the correct values:

    $ kubectl get serviceentry mysql-external -o yaml
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
    ...

Note that for a TCP service entry, you specify tcp as the protocol of the entry’s port. Also note that you have to specify the IP of the external service in the list of addresses, as a CIDR block with a /32 suffix.

I will talk more about TCP service entries below. For now, verify that the service entry we added fixed the problem. Access the webpage and see if the stars are back.

It worked! Accessing the web page of the application displays the ratings without error:

Book Ratings Displayed Correctly

Note that you see a one-star rating for both displayed reviews, as expected. You changed the ratings to be one star to provide us with a visual clue that our external database is indeed being used.

As with service entries for HTTP/HTTPS, you can delete and create service entries for TCP using kubectl, dynamically.

Motivation for egress TCP traffic control

Some in-mesh Istio applications must access external services, for example legacy systems. In many cases, the access is not performed over HTTP or HTTPS protocols. Other TCP protocols are used, such as database-specific protocols like MongoDB Wire Protocol and MySQL Client/Server Protocol to communicate with external databases.

Next let me provide more details about the service entries for TCP traffic.

Service entries for TCP traffic

The service entries for enabling TCP traffic to a specific port must specify TCP as the protocol of the port. Additionally, for the MongoDB Wire Protocol, the protocol can be specified as MONGO, instead of TCP.

For the addresses field of the entry, a block of IPs in CIDR notation must be used. Note that the hosts field is ignored for TCP service entries.

To enable TCP traffic to an external service by its hostname, all the IPs of the hostname must be specified. Each IP must be specified by a CIDR block.

Note that you do not always know all of the IPs of an external service. To enable egress TCP traffic, you only need to specify the IPs that are actually used by your applications.

Also note that the IPs of an external service are not always static, for example in the case of CDNs. Sometimes the IPs are static most of the time, but can be changed from time to time, for example due to infrastructure changes. In these cases, if the range of the possible IPs is known, you should specify the range by CIDR blocks. If the range of the possible IPs is not known, service entries for TCP cannot be used and the external services must be called directly, bypassing the sidecar proxies.
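
As an illustration, a service entry for a hostname that resolves to a couple of known address ranges might look like the sketch below; the hostname, CIDR blocks, and port are made-up values rather than part of the MySQL example above:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-db-ranges        # hypothetical name
spec:
  hosts:
  - db.example.com                # informational for TCP; routing is done by address
  addresses:
  - 192.0.2.0/24                  # a known address range of the external service
  - 198.51.100.17/32              # an additional individual IP
  ports:
  - name: tcp
    number: 5432
    protocol: tcp
  location: MESH_EXTERNAL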

Relation to virtual machines support

Note that the scenario described in this post is different from the Bookinfo with Virtual Machines example. In that scenario, a MySQL instance runs on an external (outside the cluster) machine (a bare metal or a VM), integrated with the Istio service mesh. The MySQL service becomes a first-class citizen of the mesh with all the beneficial features of Istio applicable. Among other things, the service becomes addressable by a local cluster domain name, for example by mysqldb.vm.svc.cluster.local, and the communication to it can be secured by mutual TLS authentication. There is no need to create a service entry to access this service; however, the service must be registered with Istio. To enable such integration, Istio components (Envoy proxy, node-agent, istio-agent) must be installed on the machine and the Istio control plane (Pilot, Mixer, Citadel) must be accessible from it. See the Istio VM-related tasks for more details.

In our case, the MySQL instance can run on any machine or can be provisioned as a service by a cloud provider. There is no requirement to integrate the machine with Istio. The Istio control plane does not have to be accessible from the machine. In the case of MySQL as a service, the machine which MySQL runs on may not be accessible, and installing the required components on it may be impossible. In our case, the MySQL instance is addressable by its global domain name, which could be beneficial if the consuming applications expect to use that domain name. This is especially relevant when that expected domain name cannot be changed in the deployment configuration of the consuming applications.

Cleanup

  1. Drop the test database and the bookinfo user:

    $ mysqlsh --sql --ssl-mode=REQUIRED -u admin -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT -e "drop database test; drop user bookinfo;"

    OR

    For mysql and the local database:

    $ mysql -u root -p --host $MYSQL_DB_HOST --port $MYSQL_DB_PORT -e "drop database test; drop user bookinfo;"
  2. Remove the virtual services:

    $ kubectl delete -f samples/bookinfo/networking/virtual-service-ratings-mysql.yaml
    Deleted config: virtual-service/default/reviews
    Deleted config: virtual-service/default/ratings
  3. Undeploy ratings v2-mysql:

    $ kubectl delete -f samples/bookinfo/platform/kube/bookinfo-ratings-v2-mysql.yaml
    deployment "ratings-v2-mysql" deleted
  4. Delete the service entry:

    $ kubectl delete serviceentry mysql-external -n default
    Deleted config: serviceentry mysql-external

Conclusion

In this blog post, I demonstrated how the microservices in an Istio service mesh can consume external services via TCP. By default, Istio blocks all the traffic, TCP and HTTP, to the hosts outside the cluster. To enable such traffic for TCP, TCP mesh-external service entries must be created for the service mesh.

]]>
Tue, 06 Feb 2018 00:00:00 +0000/v1.24//blog/2018/egress-tcp//v1.24//blog/2018/egress-tcp/traffic-managementegresstcp
Consuming External Web Services

In many cases, not all the parts of a microservices-based application reside in a service mesh. Sometimes, the microservices-based applications use functionality provided by legacy systems that reside outside the mesh. You may want to migrate these systems to the service mesh gradually. Until these systems are migrated, they must be accessed by the applications inside the mesh. In other cases, the applications use web services provided by third parties.

In this blog post, I modify the Istio Bookinfo Sample Application to fetch book details from an external web service (Google Books APIs). I show how to enable egress HTTPS traffic in Istio by using mesh-external service entries. I provide two options for egress HTTPS traffic and describe the pros and cons of each of the options.

Initial setting

To demonstrate the scenario of consuming an external web service, I start with a Kubernetes cluster with Istio installed. Then I deploy Istio Bookinfo Sample Application. This application uses the details microservice to fetch book details, such as the number of pages and the publisher. The original details microservice provides the book details without consulting any external service.

The example commands in this blog post work with Istio 1.0+, with or without mutual TLS enabled. The Bookinfo configuration files reside in the samples/bookinfo directory of the Istio release archive.

Here is a copy of the end-to-end architecture of the application from the original Bookinfo sample application.

The Original Bookinfo Application

Perform the steps in the Deploying the application, Confirm the app is running, Apply default destination rules sections, and change Istio to the blocking-egress-by-default policy.

Bookinfo with HTTPS access to a Google Books web service

Deploy a new version of the details microservice, v2, that fetches the book details from Google Books APIs. Run the following command; it sets the DO_NOT_ENCRYPT environment variable of the service’s container to false. This setting will instruct the deployed service to use HTTPS (instead of HTTP) to access the external service.

$ kubectl apply -f samples/bookinfo/platform/kube/bookinfo-details-v2.yaml --dry-run -o yaml | kubectl set env --local -f - 'DO_NOT_ENCRYPT=false' -o yaml | kubectl apply -f -

The updated architecture of the application now looks as follows:

The Bookinfo Application with details V2

Note that the Google Books web service is outside the Istio service mesh, the boundary of which is marked by a dashed line.

Now direct all the traffic destined to the details microservice, to details version v2.

$ kubectl apply -f samples/bookinfo/networking/virtual-service-details-v2.yaml

Note that the virtual service relies on a destination rule that you created in the Apply default destination rules section.

Access the web page of the application, after determining the ingress IP and port.

Oops… Instead of the book details you have the Error fetching product details message displayed:

The Error Fetching Product Details Message

The good news is that your application did not crash. With a good microservice design, you do not have failure propagation. In your case, the failing details microservice does not cause the productpage microservice to fail. Most of the functionality of the application is still provided, despite the failure in the details microservice. You have graceful service degradation: as you can see, the reviews and the ratings are displayed correctly, and the application is still useful.

So what might have gone wrong? Ah… The answer is that I forgot to tell you to enable traffic from inside the mesh to an external service, in this case to the Google Books web service. By default, the Istio sidecar proxies (Envoy proxies) block all the traffic to destinations outside the cluster. To enable such traffic, you must define a mesh-external service entry.

Enable HTTPS access to a Google Books web service

No worries, define a mesh-external service entry and fix your application. You must also define a virtual service to perform routing by SNI to the external service.

$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: googleapis
spec:
  hosts:
  - www.googleapis.com
  ports:
  - number: 443
    name: https
    protocol: HTTPS
  location: MESH_EXTERNAL
  resolution: DNS
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: googleapis
spec:
  hosts:
  - www.googleapis.com
  tls:
  - match:
    - port: 443
      sni_hosts:
      - www.googleapis.com
    route:
    - destination:
        host: www.googleapis.com
        port:
          number: 443
      weight: 100
EOF

Now accessing the web page of the application displays the book details without error:

Book Details Displayed Correctly

You can query your service entries:

$ kubectl get serviceentries
NAME         AGE
googleapis   8m

You can delete your service entry:

$ kubectl delete serviceentry googleapis
serviceentry "googleapis" deleted

and see in the output that the service entry is deleted.

Accessing the web page after deleting the service entry produces the same error that you experienced before, namely Error fetching product details. As you can see, the service entries are defined dynamically, as are many other Istio configuration artifacts. The Istio operators can decide dynamically which domains they allow the microservices to access. They can enable and disable traffic to the external domains on the fly, without redeploying the microservices.

Cleanup of HTTPS access to a Google Books web service

$ kubectl delete serviceentry googleapis
$ kubectl delete virtualservice googleapis
$ kubectl delete -f samples/bookinfo/networking/virtual-service-details-v2.yaml
$ kubectl delete -f samples/bookinfo/platform/kube/bookinfo-details-v2.yaml

TLS origination by Istio

There is a caveat to this story. Suppose you want to monitor which specific set of Google APIs your microservices use (Books, Calendar, Tasks etc.) Suppose you want to enforce a policy that using only Books APIs is allowed. Suppose you want to monitor the book identifiers that your microservices access. For these monitoring and policy tasks you need to know the URL path. Consider for example the URL www.googleapis.com/books/v1/volumes?q=isbn:0486424618. In that URL, Books APIs is specified by the path segment /books, and the ISBN number by the path segment /volumes?q=isbn:0486424618. However, in HTTPS, all the HTTP details (hostname, path, headers etc.) are encrypted and such monitoring and policy enforcement by the sidecar proxies is not possible. Istio can only know the server name of the encrypted requests by the SNI (Server Name Indication) field, in this case www.googleapis.com.

To allow Istio to perform monitoring and policy enforcement of egress requests based on HTTP details, the microservices must issue HTTP requests. Istio then opens an HTTPS connection to the destination (performs TLS origination). The code of the microservices must be written differently or configured differently, according to whether the microservice runs inside or outside an Istio service mesh. This contradicts the Istio design goal of maximizing transparency. Sometimes you need to compromise…

The diagram below shows two options for sending HTTPS traffic to external services. On the top, a microservice sends regular HTTPS requests, encrypted end-to-end. On the bottom, the same microservice sends unencrypted HTTP requests inside a pod, which are intercepted by the sidecar Envoy proxy. The sidecar proxy performs TLS origination, so the traffic between the pod and the external service is encrypted.

HTTPS traffic to external services, with TLS originated by the microservice vs. by the sidecar proxy

Here is how both patterns are supported in the Bookinfo details microservice code, using the Ruby net/http module:

uri = URI.parse('https://www.googleapis.com/books/v1/volumes?q=isbn:' + isbn)
# Use port 80 (plain HTTP) when DO_NOT_ENCRYPT is set, so the sidecar can originate TLS;
# otherwise use port 443 for end-to-end HTTPS.
http = Net::HTTP.new(uri.host, ENV['DO_NOT_ENCRYPT'] === 'true' ? 80:443)
...
# Enable TLS in the microservice only when it is expected to encrypt the traffic itself.
unless ENV['DO_NOT_ENCRYPT'] === 'true' then
     http.use_ssl = true
end

When the DO_NOT_ENCRYPT environment variable is defined, the request is performed without SSL (plain HTTP) to port 80.

You can set the DO_NOT_ENCRYPT environment variable to “true” in the Kubernetes deployment spec of details v2, the container section:

env:
- name: DO_NOT_ENCRYPT
  value: "true"

In the next section you will configure TLS origination for accessing an external web service.

Bookinfo with TLS origination to a Google Books web service

  1. Deploy a version of details v2 that sends an HTTP request to Google Books APIs. The DO_NOT_ENCRYPT variable is set to true in bookinfo-details-v2.yaml.

    $ kubectl apply -f samples/bookinfo/platform/kube/bookinfo-details-v2.yaml
  2. Direct the traffic destined to the details microservice, to details version v2.

    $ kubectl apply -f samples/bookinfo/networking/virtual-service-details-v2.yaml
  3. Create a mesh-external service entry for www.googleapis.com, a virtual service to rewrite the destination port from 80 to 443, and a destination rule to perform TLS origination.

    $ kubectl apply -f - <<EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: googleapis
    spec:
      hosts:
      - www.googleapis.com
      ports:
      - number: 80
        name: http
        protocol: HTTP
      - number: 443
        name: https
        protocol: HTTPS
      resolution: DNS
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: rewrite-port-for-googleapis
    spec:
      hosts:
      - www.googleapis.com
      http:
      - match:
        - port: 80
        route:
        - destination:
            host: www.googleapis.com
            port:
              number: 443
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: originate-tls-for-googleapis
    spec:
      host: www.googleapis.com
      trafficPolicy:
        loadBalancer:
          simple: ROUND_ROBIN
        portLevelSettings:
        - port:
            number: 443
          tls:
            mode: SIMPLE # initiates HTTPS when accessing www.googleapis.com
    EOF
  4. Access the web page of the application and verify that the book details are displayed without errors.

  5. Enable Envoy’s access logging

  6. Check the log of the sidecar proxy of details v2 and see the HTTP request.

    $ kubectl logs $(kubectl get pods -l app=details -l version=v2 -o jsonpath='{.items[0].metadata.name}') istio-proxy | grep googleapis
    [2018-08-09T11:32:58.171Z] "GET /books/v1/volumes?q=isbn:0486424618 HTTP/1.1" 200 - 0 1050 264 264 "-" "Ruby" "b993bae7-4288-9241-81a5-4cde93b2e3a6" "www.googleapis.com:80" "172.217.20.74:80"

    Note the URL path in the log; the path can be monitored and access policies can be applied based on it. To read more about monitoring and access policies for HTTP egress traffic, check out this blog post.

Cleanup of TLS origination to a Google Books web service

$ kubectl delete serviceentry googleapis
$ kubectl delete virtualservice rewrite-port-for-googleapis
$ kubectl delete destinationrule originate-tls-for-googleapis
$ kubectl delete -f samples/bookinfo/networking/virtual-service-details-v2.yaml
$ kubectl delete -f samples/bookinfo/platform/kube/bookinfo-details-v2.yaml

Relation to Istio mutual TLS

Note that the TLS origination in this case is unrelated to the mutual TLS applied by Istio. The TLS origination for the external services will work, whether the Istio mutual TLS is enabled or not. The mutual TLS secures service-to-service communication inside the service mesh and provides each service with a strong identity. The external services in this blog post were accessed using one-way TLS, the same mechanism used to secure communication between a web browser and a web server. TLS is applied to the communication with external services to verify the identity of the external server and to encrypt the traffic.

Conclusion

In this blog post I demonstrated how microservices in an Istio service mesh can consume external web services by HTTPS. By default, Istio blocks all the traffic to the hosts outside the cluster. To enable such traffic, mesh-external service entries must be created for the service mesh. It is possible to access the external sites either by issuing HTTPS requests, or by issuing HTTP requests with Istio performing TLS origination. When the microservices issue HTTPS requests, the traffic is encrypted end-to-end; however, Istio cannot monitor HTTP details like the URL paths of the requests. When the microservices issue HTTP requests, Istio can monitor the HTTP details of the requests and enforce HTTP-based access policies. However, in that case the traffic between the microservice and the sidecar proxy is unencrypted. Having part of the traffic unencrypted can be forbidden in organizations with very strict security requirements.

]]>
Wed, 31 Jan 2018 00:00:00 +0000/v1.24//blog/2018/egress-https//v1.24//blog/2018/egress-https/traffic-managementegresshttps
Mixer and the SPOF Myth

As Mixer is in the request path, it is natural to question how it impacts overall system availability and latency. A common refrain we hear when people first glance at Istio architecture diagrams is “Isn’t this just introducing a single point of failure?”

In this post, we’ll dig deeper and cover the design principles that underpin Mixer and the surprising fact Mixer actually increases overall mesh availability and reduces average request latency.

Istio’s use of Mixer has two main benefits in terms of overall system availability and latency:

  • Increased SLO. Mixer insulates proxies and services from infrastructure backend failures, enabling higher effective mesh availability. The mesh as a whole tends to experience a lower rate of failure when interacting with the infrastructure backends than if Mixer were not present.

  • Reduced Latency. Through aggressive use of shared multi-level caches and sharding, Mixer reduces average observed latencies across the mesh.

We’ll explain this in more detail below.

How we got here

For many years at Google, we’ve been using an internal API & service management system to handle the many APIs exposed by Google. This system has been fronting the world’s biggest services (Google Maps, YouTube, Gmail, etc) and sustains a peak rate of hundreds of millions of QPS. Although this system has served us well, it had problems keeping up with Google’s rapid growth, and it became clear that a new architecture was needed in order to tamp down ballooning operational costs.

In 2014, we started an initiative to create a replacement architecture that would scale better. The result has proven extremely successful and has been gradually deployed throughout Google, saving in the process millions of dollars a month in ops costs.

The older system was built around a centralized fleet of fairly heavy proxies into which all incoming traffic would flow, before being forwarded to the services where the real work was done. The newer architecture jettisons the shared proxy design and instead consists of a very lean and efficient distributed sidecar proxy sitting next to service instances, along with a shared fleet of sharded control plane intermediaries:

Google's API & Service Management System

Look familiar? Of course: it’s just like Istio! Istio was conceived as a second generation of this distributed proxy architecture. We took the core lessons from this internal system, generalized many of the concepts by working with our partners, and created Istio.

Architecture recap

As shown in the diagram below, Mixer sits between the mesh and the infrastructure backends that support it:

Istio Topology

The Envoy sidecar logically calls Mixer before each request to perform precondition checks, and after each request to report telemetry. The sidecar has local caching such that a relatively large percentage of precondition checks can be performed from cache. Additionally, the sidecar buffers outgoing telemetry such that it only actually needs to call Mixer once for every several thousand requests. Whereas precondition checks are synchronous to request processing, telemetry reports are done asynchronously with a fire-and-forget pattern.

At a high level, Mixer provides:

  • Backend Abstraction. Mixer insulates the Istio components and services within the mesh from the implementation details of individual infrastructure backends.

  • Intermediation. Mixer allows operators to have fine-grained control over all interactions between the mesh and the infrastructure backends.

However, even beyond these purely functional aspects, Mixer has other characteristics that provide the system with additional benefits.

Mixer: SLO booster

Contrary to the claim that Mixer is a SPOF and can therefore lead to mesh outages, we believe it in fact improves the effective availability of a mesh. How can that be? There are three basic characteristics at play:

  • Statelessness. Mixer is stateless in that it doesn’t manage any persistent storage of its own.

  • Hardening. Mixer proper is designed to be a highly reliable component. The design intent is to achieve > 99.999% uptime for any individual Mixer instance.

  • Caching and Buffering. Mixer is designed to accumulate a large amount of transient ephemeral state.

The sidecar proxies that sit next to each service instance in the mesh must necessarily be frugal in terms of memory consumption, which constrains the possible amount of local caching and buffering. Mixer, however, lives independently and can use considerably larger caches and output buffers. Mixer thus acts as a highly-scaled and highly-available second-level cache for the sidecars.

Mixer’s expected availability is considerably higher than most infrastructure backends (those often have availability of perhaps 99.9%). Its local caches and buffers help mask infrastructure backend failures by being able to continue operating even when a backend has become unresponsive.

Mixer: Latency slasher

As we explained above, the Istio sidecars generally have fairly effective first-level caching. They can serve the majority of their traffic from cache. Mixer provides a much greater shared pool of second-level cache, which helps Mixer contribute to a lower average per-request latency.

While it’s busy cutting down latency, Mixer is also inherently cutting down the number of calls your mesh makes to infrastructure backends. Depending on how you’re paying for these backends, this might end up saving you some cash by cutting down the effective QPS to the backends.

Work ahead

We have opportunities ahead to continue improving the system in many ways.

Configuration canaries

Mixer is highly scaled so it is generally resistant to individual instance failures. However, Mixer is still susceptible to cascading failures in the case when a poison configuration is deployed which causes all Mixer instances to crash basically at the same time (yeah, that would be a bad day). To prevent this from happening, configuration changes can be canaried to a small set of Mixer instances, and then more broadly rolled out.

Mixer doesn’t yet do canarying of configuration changes, but we expect this to come online as part of Istio’s ongoing work on reliable configuration distribution.

Cache tuning

We have yet to fine-tune the sizes of the sidecar and Mixer caches. This work will focus on achieving the highest performance possible using the least amount of resources.

Cache sharing

At the moment, each Mixer instance operates independently of all other instances. A request handled by one Mixer instance will not leverage data cached in a different instance. We will eventually experiment with a distributed cache such as memcached or Redis in order to provide a much larger mesh-wide shared cache, and further reduce the number of calls to infrastructure backends.

Sharding

In very large meshes, the load on Mixer can be great. There can be a large number of Mixer instances, each straining to keep caches primed to satisfy incoming traffic. We expect to eventually introduce intelligent sharding such that Mixer instances become slightly specialized in handling particular data streams in order to increase the likelihood of cache hits. In other words, sharding helps improve cache efficiency by routing related traffic to the same Mixer instance over time, rather than randomly dispatching to any available Mixer instance.

Conclusion

Practical experience at Google showed that the model of a slim sidecar proxy and a large shared caching control plane intermediary hits a sweet spot, delivering excellent perceived availability and latency. We’ve taken the lessons learned there and applied them to create more sophisticated and effective caching, prefetching, and buffering strategies in Istio. We’ve also optimized the communication protocols to reduce overhead when a cache miss does occur.

Mixer is still young. As of Istio 0.3, we haven’t really done significant performance work within Mixer itself. This means when a request misses the sidecar cache, we spend more time in Mixer to respond to requests than we should. We’re doing a lot of work to improve this in coming months to reduce the overhead that Mixer imparts in the synchronous precondition check case.

We hope this post makes you appreciate the inherent benefits that Mixer brings to Istio. Don’t hesitate to post comments or questions to istio-policies-and-telemetry@.

]]>
Thu, 07 Dec 2017 00:00:00 +0000/v1.24//blog/2017/mixer-spof-myth//v1.24//blog/2017/mixer-spof-myth/adaptersmixerpoliciestelemetryavailabilitylatency
Mixer Adapter Model

Istio 0.2 introduced a new Mixer adapter model which is intended to increase Mixer’s flexibility to address a varied set of infrastructure backends. This post intends to put the adapter model in context and explain how it works.

Why adapters?

Infrastructure backends provide support functionality used to build services. They include such things as access control systems, telemetry capturing systems, quota enforcement systems, billing systems, and so forth. Services traditionally directly integrate with these backend systems, creating a hard coupling and baking-in specific semantics and usage options.

Mixer serves as an abstraction layer between Istio and an open-ended set of infrastructure backends. The Istio components and services that run within the mesh can interact with these backends, while not being coupled to the backends’ specific interfaces.

In addition to insulating application-level code from the details of infrastructure backends, Mixer provides an intermediation model that allows operators to inject and control policies between application code and backends. Operators can control which data is reported to which backend, which backend to consult for authorization, and much more.

Given that individual infrastructure backends each have different interfaces and operational models, Mixer needs custom code to deal with each and we call these custom bundles of code adapters.

Adapters are Go packages that are directly linked into the Mixer binary. It’s fairly simple to create custom Mixer binaries linked with specialized sets of adapters, in case the default set of adapters is not sufficient for specific use cases.

Philosophy

Mixer is essentially an attribute processing and routing machine. The proxy sends it attributes as part of doing precondition checks and telemetry reports, which it turns into a series of calls into adapters. The operator supplies configuration which describes how to map incoming attributes to inputs for the adapters.

Attribute Machine

Configuration is a complex task. In fact, evidence shows that the overwhelming majority of service outages are caused by configuration errors. To help combat this, Mixer’s configuration model enforces a number of constraints designed to avoid errors. For example, the configuration model uses strong typing to ensure that only meaningful attributes or attribute expressions are used in any given context.

Handlers: configuring adapters

Each adapter that Mixer uses requires some configuration to operate. Typically, adapters need things like the URL to their backend, credentials, caching options, and so forth. Each adapter defines the exact configuration data it needs via a protobuf message.

You configure each adapter by creating handlers for them. A handler is a configuration resource which represents a fully configured adapter ready for use. There can be any number of handlers for a single adapter, making it possible to reuse an adapter in different scenarios.

Templates: adapter input schema

Mixer is typically invoked twice for every incoming request to a mesh service, once for precondition checks and once for telemetry reporting. For every such call, Mixer invokes one or more adapters. Different adapters need different pieces of data as input in order to do their work. A logging adapter needs a log entry, a metric adapter needs a metric, an authorization adapter needs credentials, etc. Mixer templates are used to describe the exact data that an adapter consumes at request time.

Each template is specified as a protobuf message. A single template describes a bundle of data that is delivered to one or more adapters at runtime. Any given adapter can be designed to support any number of templates; the specific templates an adapter supports are determined by the adapter developer.

metric and logentry are two of the most essential templates used within Istio. They represent respectively the payload to report a single metric and a single log entry to appropriate backends.

Instances: attribute mapping

You control which data is delivered to individual adapters by creating instances. Instances control how Mixer transforms the attributes delivered by the proxy into individual bundles of data that can be routed to different adapters.

Creating instances generally requires using attribute expressions. The point of these expressions is to use any attribute or literal value in order to produce a result that can be assigned to an instance’s field.

Every instance field has a type, as defined in the template, every attribute has a type, and every attribute expression has a type. You can only assign type-compatible expressions to any given instance fields. For example, you can’t assign an integer expression to a string field. This kind of strong typing is designed to minimize the risk of creating bogus configurations.

Rules: delivering data to adapters

The last piece to the puzzle is telling Mixer which instances to send to which handler and when. This is done by creating rules. Each rule identifies a specific handler and the set of instances to send to that handler. Whenever Mixer processes an incoming call, it invokes the indicated handler and gives it the specific set of instances for processing.

Rules contain matching predicates. A predicate is an attribute expression which returns a true/false value. A rule only takes effect if its predicate expression returns true. Otherwise, it’s like the rule didn’t exist and the indicated handler isn’t invoked.
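
To make these concepts concrete, here is a hedged sketch of how a handler, an instance, and a rule might fit together for a simple logging scenario. It is loosely modeled on the stdio adapter and the logentry template; the resource names and attribute expressions are illustrative, not a canonical configuration:

apiVersion: config.istio.io/v1alpha2
kind: stdio
metadata:
  name: newhandler          # a handler: a configured instance of the stdio adapter
  namespace: istio-system
spec:
  severity_levels:
    warning: 1              # map the "warning" severity to stdio level 1
  outputAsJson: true
---
apiVersion: config.istio.io/v1alpha2
kind: logentry
metadata:
  name: newlog              # an instance: maps attributes into the logentry template
  namespace: istio-system
spec:
  severity: '"warning"'
  timestamp: request.time
  variables:
    source: source.labels["app"] | "unknown"
    destination: destination.service | "unknown"
    responseCode: response.code | 0
---
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: newlogstdio         # a rule: routes the instance to the handler
  namespace: istio-system
spec:
  match: "true"             # matching predicate; always true here
  actions:
  - handler: newhandler.stdio
    instances:
    - newlog.logentry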

Future

We are working to improve the end to end experience of using and developing adapters. For example, several new features are planned to make templates more expressive. Additionally, the expression language is being substantially enhanced to be more powerful and well-rounded.

Longer term, we are evaluating ways to support adapters which aren’t directly linked into the main Mixer binary. This would simplify deployment and composition.

Conclusion

The refreshed Mixer adapter model is designed to provide a flexible framework to support an open-ended set of infrastructure backends.

Handlers provide configuration data for individual adapters, templates determine exactly what kind of data different adapters want to consume at runtime, instances let operators prepare this data, rules direct the data to one or more handlers.

You can learn more about Mixer’s overall architecture here, and learn the specifics of templates, handlers, and rules here. You can find many examples of Mixer configuration resources in the Bookinfo sample here.

]]>
Fri, 03 Nov 2017 00:00:00 +0000/v1.24//blog/2017/adapter-model//v1.24//blog/2017/adapter-model/adaptersmixerpoliciestelemetry
Using Network Policy with Istio

The use of Network Policy to secure applications running on Kubernetes is now a widely accepted industry best practice. Given that Istio also supports policy, we want to spend some time explaining how Istio policy and Kubernetes Network Policy interact and support each other to deliver your application securely.

Let’s start with the basics: why might you want to use both Istio and Kubernetes Network Policy? The short answer is that they are good at different things. Consider the main differences between Istio and Network Policy (we will describe “typical” implementations, e.g. Calico, but implementation details can vary with different network providers):

                    Istio Policy       Network Policy
Layer               “Service” — L7     “Network” — L3-4
Implementation      User space         Kernel
Enforcement Point   Pod                Node

Layer

Istio policy operates at the “service” layer of your network application. This is Layer 7 (Application) from the perspective of the OSI model, but the de facto model of cloud native applications is that Layer 7 actually consists of at least two layers: a service layer and a content layer. The service layer is typically HTTP, which encapsulates the actual application data (the content layer). It is at this service layer of HTTP that the Istio’s Envoy proxy operates. In contrast, Network Policy operates at Layers 3 (Network) and 4 (Transport) in the OSI model.

Operating at the service layer gives the Envoy proxy a rich set of attributes to base policy decisions on, for protocols it understands, which at present includes HTTP/1.1 & HTTP/2 (gRPC operates over HTTP/2). So, you can apply policy based on virtual host, URL, or other HTTP headers. In the future, Istio will support a wide range of Layer 7 protocols, as well as generic TCP and UDP transport.

In contrast, operating at the network layer has the advantage of being universal, since all network applications use IP. At the network layer you can apply policy regardless of the layer 7 protocol: DNS, SQL databases, real-time streaming, and a plethora of other services that do not use HTTP can be secured. Network Policy isn’t limited to a classic firewall’s tuple of IP addresses, proto, and ports. Both Istio and Network Policy are aware of rich Kubernetes labels to describe pod endpoints.

Implementation

Istio’s proxy is based on Envoy, which is implemented as a user space daemon in the data plane that interacts with the network layer using standard sockets. This gives it a large amount of flexibility in processing, and allows it to be distributed (and upgraded!) in a container.

Network Policy data plane is typically implemented in kernel space (e.g. using iptables, eBPF filters, or even custom kernel modules). Being in kernel space allows them to be extremely fast, but not as flexible as the Envoy proxy.

Enforcement point

Policy enforcement using the Envoy proxy is implemented inside the pod, as a sidecar container in the same network namespace. This allows a simple deployment model. Some containers are given permission to reconfigure the networking inside their pod (CAP_NET_ADMIN). If such a service instance is compromised, or misbehaves (as in a malicious tenant) the proxy can be bypassed.

While this won’t let an attacker access other Istio-enabled pods, so long as they are correctly configured, it opens several attack vectors:

  • Attacking unprotected pods
  • Attempting to deny service to protected pods by sending lots of traffic
  • Exfiltrating data collected in the pod
  • Attacking the cluster infrastructure (servers or Kubernetes services)
  • Attacking services outside the mesh, like databases, storage arrays, or legacy systems.

Network Policy is typically enforced at the host node, outside the network namespace of the guest pods. This means that compromised or misbehaving pods must break into the root namespace to avoid enforcement. With the addition of egress policy due in Kubernetes 1.8, this difference makes Network Policy a key part of protecting your infrastructure from compromised workloads.
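
As a simple illustration of such an egress policy, the sketch below restricts a hypothetical pod to reaching only a single database subnet; the labels, CIDR block, and port are assumptions for the sake of the example and are unrelated to the Bookinfo walkthrough that follows:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ratings-egress-lockdown   # hypothetical name
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: ratings                # hypothetical workload to restrict
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 192.0.2.0/24        # hypothetical database subnet
    ports:
    - protocol: TCP
      port: 3306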

Examples

Let’s walk through a few examples of what you might want to do with Kubernetes Network Policy for an Istio-enabled application. Consider the Bookinfo sample application. We’re going to cover the following use cases for Network Policy:

  • Reduce attack surface of the application ingress
  • Enforce fine-grained isolation within the application

Reduce attack surface of the application ingress

Our application ingress controller is the main entry-point to our application from the outside world. A quick peek at istio.yaml (used to install Istio) shows that it defines the Istio ingress like this:

apiVersion: v1
kind: Service
metadata:
  name: istio-ingress
  labels:
    istio: ingress
spec:
  type: LoadBalancer
  ports:
  - port: 80
    name: http
  - port: 443
    name: https
  selector:
    istio: ingress

The istio-ingress exposes ports 80 and 443. Let’s limit incoming traffic to just these two ports. Envoy has a built-in administrative interface, and we don’t want a misconfigured istio-ingress image to accidentally expose our admin interface to the outside world. This is an example of defense in depth: a properly configured image should not expose the interface, and a properly configured Network Policy will prevent anyone from connecting to it. Either can fail or be misconfigured and we are still protected.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: istio-ingress-lockdown
  namespace: default
spec:
  podSelector:
    matchLabels:
      istio: ingress
  ingress:
  - ports:
    - protocol: TCP
      port: 80
    - protocol: TCP
      port: 443

Enforce fine-grained isolation within the application

Here is the service graph for the Bookinfo application.

Bookinfo Service Graph

This graph shows every connection that a correctly functioning application should be allowed to make. All other connections, say from the Istio Ingress directly to the Rating service, are not part of the application. Let’s lock out those extraneous connections so they cannot be used by an attacker. Imagine, for example, that the Ingress pod is compromised by an exploit that allows an attacker to run arbitrary code. If Network Policy only allows the Ingress to connect to the Product Page pods, the attacker gains no additional access to our application backends even though they have compromised a member of the service mesh.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: product-page-ingress
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: productpage
  ingress:
  - ports:
    - protocol: TCP
      port: 9080
    from:
    - podSelector:
        matchLabels:
          istio: ingress

You can and should write a similar policy for each service to enforce which other pods are allowed to access it.
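
For example, a sketch of the corresponding policy for the Reviews service, which should only accept traffic from the Product Page pods on port 9080 (the labels follow the Bookinfo sample’s conventions), looks like this:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: reviews-ingress
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: reviews
  ingress:
  - ports:
    - protocol: TCP
      port: 9080
    from:
    - podSelector:
        matchLabels:
          app: productpage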

Summary

Our take is that Istio and Network Policy have different strengths in applying policy. Istio is application-protocol aware and highly flexible, making it ideal for applying policy in support of operational goals, like service routing, retries, circuit-breaking, etc, and for security that operates at the application layer, such as token validation. Network Policy is universal, highly efficient, and isolated from the pods, making it ideal for applying policy in support of network security goals. Furthermore, having policy that operates at different layers of the network stack is a really good thing as it gives each layer specific context without commingling of state and allows separation of responsibility.

This post is based on the three part blog series by Spike Curtis, one of the Istio team members at Tigera. The full series can be found here: https://www.projectcalico.org/using-network-policy-in-concert-with-istio/

]]>
Thu, 10 Aug 2017 00:00:00 +0000/v1.24//blog/2017/0.1-using-network-policy//v1.24//blog/2017/0.1-using-network-policy/
Canary Deployments using Istio

One of the benefits of the Istio project is that it provides the control needed to deploy canary services. The idea behind canary deployment (or rollout) is to introduce a new version of a service by first testing it using a small percentage of user traffic, and then if all goes well, increase, possibly gradually in increments, the percentage while simultaneously phasing out the old version. If anything goes wrong along the way, we abort and roll back to the previous version. In its simplest form, the traffic sent to the canary version is a randomly selected percentage of requests, but in more sophisticated schemes it can be based on the region, user, or other properties of the request.

Depending on your level of expertise in this area, you may wonder why Istio’s support for canary deployment is even needed, given that platforms like Kubernetes already provide a way to do version rollout and canary deployment. Problem solved, right? Well, not exactly. Although doing a rollout this way works in simple cases, it’s very limited, especially in large scale cloud environments receiving lots of (and especially varying amounts of) traffic, where autoscaling is needed.

Canary deployment in Kubernetes

As an example, let’s say we have a deployed service, helloworld version v1, for which we would like to test (or simply roll out) a new version, v2. Using Kubernetes, you can roll out a new version of the helloworld service by simply updating the image in the service’s corresponding Deployment and letting the rollout happen automatically. If we take particular care to ensure that there are enough v1 replicas running when we start, and pause the rollout after only one or two v2 replicas have been started, we can keep the canary’s effect on the system very small. We can then observe the effect before deciding to proceed or, if necessary, roll back. Best of all, we can even attach a horizontal pod autoscaler to the Deployment and it will keep the replica ratios consistent if, during the rollout process, it also needs to scale replicas up or down to handle traffic load.
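
For instance, a paused rollout along these lines might look something like the following sketch (the Deployment, container, and image names are illustrative):

$ kubectl set image deployment/helloworld helloworld=helloworld-v2   # start rolling out v2
$ kubectl rollout pause deployment/helloworld                        # pause once the first v2 replicas are up
$ kubectl get pods -l app=helloworld                                 # observe the mixed v1/v2 set
$ kubectl rollout resume deployment/helloworld                       # continue the rollout, or
$ kubectl rollout undo deployment/helloworld                         # roll back to v1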

Although fine for what it does, this approach is only useful when we have a properly tested version that we want to deploy, i.e., more of a blue/green, a.k.a. red/black, kind of upgrade than a “dip your feet in the water” kind of canary deployment. In fact, for the latter (for example, testing a canary version that may not even be ready or intended for wider exposure), the canary deployment in Kubernetes would be done using two Deployments with common pod labels. In this case, we can no longer rely on autoscaling to maintain the traffic split, because scaling is now done by two independent autoscalers, one for each Deployment, so the replica ratios (percentages) may drift from the desired ratio, depending purely on load.

Whether we use one deployment or two, canary management using deployment features of container orchestration platforms like Docker, Mesos/Marathon, or Kubernetes has a fundamental problem: it uses instance scaling to manage the traffic, so traffic distribution across versions and replica deployment are not independent in these systems. All replica pods, regardless of version, are treated the same in the kube-proxy round-robin pool, so the only way to manage the amount of traffic that a particular version receives is by controlling the replica ratio. Maintaining canary traffic at small percentages requires many replicas (e.g., 1% would require a minimum of 100 replicas). Even if we ignore this problem, the deployment approach is still very limited in that it only supports the simple (random percentage) canary approach. If, instead, we wanted to limit the visibility of the canary to requests based on some specific criteria, we still need another solution.

Enter Istio

With Istio, traffic routing and replica deployment are two completely independent functions. The number of pods implementing services are free to scale up and down based on traffic load, completely orthogonal to the control of version traffic routing. This makes managing a canary version in the presence of autoscaling a much simpler problem. Autoscalers may, in fact, respond to load variations resulting from traffic routing changes, but they are nevertheless functioning independently and no differently than when loads change for other reasons.

Istio’s routing rules also provide other important advantages; you can easily control fine-grained traffic percentages (e.g., route 1% of traffic without requiring 100 pods) and you can control traffic using other criteria (e.g., route traffic for specific users to the canary version). To illustrate, let’s look at deploying the helloworld service and see how simple the problem becomes.

We begin by defining the helloworld Service, just like any other Kubernetes service, something like this:

apiVersion: v1
kind: Service
metadata:
  name: helloworld
  labels:
    app: helloworld
spec:
  selector:
    app: helloworld
  ...

We then add 2 Deployments, one for each version (v1 and v2), both of which include the service selector’s app: helloworld label:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloworld
      version: v1
  template:
    metadata:
      labels:
        app: helloworld
        version: v1
    spec:
      containers:
      - image: helloworld-v1
        ...
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloworld
      version: v2
  template:
    metadata:
      labels:
        app: helloworld
        version: v2
    spec:
      containers:
      - image: helloworld-v2
        ...

Note that this is exactly the same way we would do a canary deployment using plain Kubernetes, but in that case we would need to adjust the number of replicas of each Deployment to control the distribution of traffic. For example, to send 10% of the traffic to the canary version (v2), the replicas for v1 and v2 could be set to 9 and 1, respectively.

However, since we are going to deploy the service in an Istio enabled cluster, all we need to do is set a routing rule to control the traffic distribution. For example, if we want to send 10% of the traffic to the canary, we could use kubectl to set a routing rule something like this:

$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
    - helloworld
  http:
  - route:
    - destination:
        host: helloworld
        subset: v1
      weight: 90
    - destination:
        host: helloworld
        subset: v2
      weight: 10
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: helloworld
spec:
  host: helloworld
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
EOF

After setting this rule, Istio will ensure that only one tenth of the requests will be sent to the canary version, regardless of how many replicas of each version are running.
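
A quick way to sanity check the split (a sketch that assumes the Istio helloworld sample, which serves /hello on port 5000 and reports its version in the response; run it from a sidecar-injected pod so the routing rule is applied):

$ for i in $(seq 1 100); do curl -s http://helloworld:5000/hello; done | grep -o "version: v2" | wc -l

Roughly 10 of the 100 responses should come from v2.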

Autoscaling the deployments

Because we don’t need to maintain replica ratios anymore, we can safely add Kubernetes horizontal pod autoscalers to manage the replicas for both version Deployments:

$ kubectl autoscale deployment helloworld-v1 --cpu-percent=50 --min=1 --max=10
deployment "helloworld-v1" autoscaled
$ kubectl autoscale deployment helloworld-v2 --cpu-percent=50 --min=1 --max=10
deployment "helloworld-v2" autoscaled
$ kubectl get hpa
NAME           REFERENCE                 TARGET  CURRENT  MINPODS  MAXPODS  AGE
helloworld-v1  Deployment/helloworld-v1  50%     47%      1        10       17s
helloworld-v2  Deployment/helloworld-v2  50%     40%      1        10       15s

If we now generate some load on the helloworld service, we would notice that when scaling begins, the v1 autoscaler scales up its replicas significantly more than the v2 autoscaler does, because the v1 pods are handling 90% of the load.

$ kubectl get pods | grep helloworld
helloworld-v1-3523621687-3q5wh   0/2       Pending   0          15m
helloworld-v1-3523621687-73642   2/2       Running   0          11m
helloworld-v1-3523621687-7hs31   2/2       Running   0          19m
helloworld-v1-3523621687-dt7n7   2/2       Running   0          50m
helloworld-v1-3523621687-gdhq9   2/2       Running   0          11m
helloworld-v1-3523621687-jxs4t   0/2       Pending   0          15m
helloworld-v1-3523621687-l8rjn   2/2       Running   0          19m
helloworld-v1-3523621687-wwddw   2/2       Running   0          15m
helloworld-v1-3523621687-xlt26   0/2       Pending   0          19m
helloworld-v2-4095161145-963wt   2/2       Running   0          50m

If we then change the routing rule to send 50% of the traffic to v2, we should, after a short delay, notice that the v1 autoscaler will scale down the replicas of v1 while the v2 autoscaler will perform a corresponding scale up.
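
For reference, the 50/50 rule is just the earlier VirtualService with adjusted weights (the DestinationRule is unchanged):

$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
    - helloworld
  http:
  - route:
    - destination:
        host: helloworld
        subset: v1
      weight: 50
    - destination:
        host: helloworld
        subset: v2
      weight: 50
EOF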

$ kubectl get pods | grep helloworld
helloworld-v1-3523621687-73642   2/2       Running   0          35m
helloworld-v1-3523621687-7hs31   2/2       Running   0          43m
helloworld-v1-3523621687-dt7n7   2/2       Running   0          1h
helloworld-v1-3523621687-gdhq9   2/2       Running   0          35m
helloworld-v1-3523621687-l8rjn   2/2       Running   0          43m
helloworld-v2-4095161145-57537   0/2       Pending   0          21m
helloworld-v2-4095161145-9322m   2/2       Running   0          21m
helloworld-v2-4095161145-963wt   2/2       Running   0          1h
helloworld-v2-4095161145-c3dpj   0/2       Pending   0          21m
helloworld-v2-4095161145-t2ccm   0/2       Pending   0          17m
helloworld-v2-4095161145-v3v9n   0/2       Pending   0          13m

The end result is very similar to the simple Kubernetes Deployment rollout, only now the whole process is not being orchestrated and managed in one place. Instead, we’re seeing several components doing their jobs independently, albeit in a cause and effect manner. What’s different, however, is that if we now stop generating load, the replicas of both versions will eventually scale down to their minimum (1), regardless of what routing rule we set.

$ kubectl get pods | grep helloworld
helloworld-v1-3523621687-dt7n7   2/2       Running   0          1h
helloworld-v2-4095161145-963wt   2/2       Running   0          1h

Focused canary testing

As mentioned above, the Istio routing rules can be used to route traffic based on specific criteria, allowing more sophisticated canary deployment scenarios. Say, for example, instead of exposing the canary to an arbitrary percentage of users, we want to try it out on internal users, maybe even just a percentage of them. The following command could be used to send 50% of traffic from users at some-company-name.com to the canary version, leaving all other users unaffected:

$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
    - helloworld
  http:
  - match:
    - headers:
        cookie:
          regex: "^(.*?;)?(email=[^;]*@some-company-name.com)(;.*)?$"
    route:
    - destination:
        host: helloworld
        subset: v1
      weight: 50
    - destination:
        host: helloworld
        subset: v2
      weight: 50
  - route:
    - destination:
        host: helloworld
        subset: v1
EOF

As before, the autoscalers bound to the two version Deployments will automatically scale the replicas accordingly, but that will have no effect on the traffic distribution.

Summary

In this article we’ve seen how Istio supports general scalable canary deployments, and how this differs from the basic deployment support in Kubernetes. Istio’s service mesh provides the control necessary to manage traffic distribution with complete independence from deployment scaling. This allows for a simpler, yet significantly more functional, way to do canary testing and rollout.

Intelligent routing in support of canary deployment is just one of the many features of Istio that will make the production deployment of large-scale microservices-based applications much simpler. Check out istio.io for more information and to try it out. The sample code used in this article can be found here.

]]>
Wed, 14 Jun 2017 00:00:00 +0000/v1.24//blog/2017/0.1-canary//v1.24//blog/2017/0.1-canary/traffic-managementcanary
Using Istio to Improve End-to-End Security

Conventional network security approaches fail to address security threats to distributed applications deployed in dynamic production environments. Today, we describe how Istio authentication enables enterprises to transform their security posture from just protecting the edge to consistently securing all inter-service communications deep within their applications. With Istio authentication, developers and operators can protect services with sensitive data against unauthorized insider access, and they can achieve this without any changes to the application code!

Istio authentication is the security component of the broader Istio platform. It incorporates the learnings of securing millions of microservice endpoints in Google’s production environment.

Background

Modern application architectures are increasingly based on shared services that are deployed and scaled dynamically on cloud platforms. Traditional network edge security (e.g. firewall) is too coarse-grained and allows access from unintended clients. An example of a security risk is stolen authentication tokens that can be replayed from another client. This is a major risk for companies with sensitive data that are concerned about insider threats. Other network security approaches like IP whitelists have to be statically defined, are hard to manage at scale, and are unsuitable for dynamic production environments.

Thus, security administrators need a tool that enables them to consistently, and by default, secure all communication between services across diverse production environments.

Solution: strong service identity and authentication

Google has, over the years, developed architecture and technology to uniformly secure millions of microservice endpoints in its production environment against external attacks and insider threats. Key security principles include trusting the endpoints and not the network, strong mutual authentication based on service identity, and service-level authorization. Istio authentication is based on the same principles.

The version 0.1 release of Istio authentication runs on Kubernetes and provides the following features:

  • Strong identity assertion between services

  • Access control to limit the identities that can access a service (and its data)

  • Automatic encryption of data in transit

  • Management of keys and certificates at scale

Istio authentication is based on industry standards like mutual TLS and X.509. Furthermore, Google is actively contributing to an open, community-driven service security framework called SPIFFE. As the SPIFFE specifications mature, we intend for Istio authentication to become a reference implementation of the same.

The diagram below provides an overview of the Istio’s service authentication architecture on Kubernetes.

Istio Authentication Overview

The above diagram illustrates three key security features:

Strong identity

Istio authentication uses Kubernetes service accounts to identify who the service runs as. The identity is used to establish trust and define service level access policies. The identity is assigned at service deployment time and encoded in the SAN (Subject Alternative Name) field of an X.509 certificate. Using a service account as the identity has the following advantages:

  • Administrators can configure who has access to a Service Account by using the RBAC feature introduced in Kubernetes 1.6

  • Flexibility to identify a human user, a service, or a group of services

  • Stability of the service identity for dynamically placed and auto-scaled workloads
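
As an illustration, assigning a dedicated service account to a workload is an ordinary Kubernetes operation; a minimal sketch (all names here are illustrative) looks like this, and Istio then encodes that service account in the SAN field of the workload’s certificate as described above:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: bookinfo-productpage
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: productpage-v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: productpage
  template:
    metadata:
      labels:
        app: productpage
    spec:
      # the pod runs as this service account, which becomes its Istio identity
      serviceAccountName: bookinfo-productpage
      containers:
      - name: productpage
        image: productpage-v1
        ...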

Communication security

Service-to-service communication is tunneled through high performance client-side and server-side Envoy proxies. The communication between the proxies is secured using mutual TLS. The benefit of using mutual TLS is that the service identity is not expressed as a bearer token that can be stolen or replayed from another source. Istio authentication also introduces the concept of Secure Naming to protect from server spoofing attacks: the client-side proxy verifies that the authenticated server’s service account is allowed to run the named service.

Key management and distribution

Istio authentication provides a per-cluster CA (Certificate Authority) and automated key & certificate management. In this context, Istio authentication:

  • Generates a key and certificate pair for each service account.

  • Distributes keys and certificates to the appropriate pods using Kubernetes Secrets.

  • Rotates keys and certificates periodically.

  • Revokes a specific key and certificate pair when necessary (future).
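
For example, the distributed material can be inspected directly in the cluster (a sketch that assumes the per-service-account secret naming used by early Istio releases, istio.<service-account-name>):

$ kubectl get secrets | grep "istio\."     # one secret per service account, e.g. istio.default
$ kubectl describe secret istio.default    # holds the private key, certificate chain, and root certificate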

The following diagram explains the end to end Istio authentication workflow on Kubernetes:

Istio Authentication Workflow

Istio authentication is part of the broader security story for containers. Red Hat, a partner on the development of Kubernetes, has identified 10 Layers of container security. Istio addresses two of these layers: “Network Isolation” and “API and Service Endpoint Management”. As cluster federation evolves on Kubernetes and other platforms, our intent is for Istio to secure communications across services spanning multiple federated clusters.

Benefits of Istio authentication

Defense in depth: When used in conjunction with Kubernetes (or infrastructure) network policies, users achieve higher levels of confidence, knowing that pod-to-pod or service-to-service communication is secured both at network and application layers.

Secure by default: When used with Istio’s proxy and centralized policy engine, Istio authentication can be configured during deployment with minimal or no application change. Administrators and operators can thus ensure that service communications are secured by default and that they can enforce these policies consistently across diverse protocols and runtimes.

Strong service authentication: Istio authentication secures service communication using mutual TLS to ensure that the service identity is not expressed as a bearer token that can be stolen or replayed from another source. This ensures that services with sensitive data can only be accessed from strongly authenticated and authorized clients.

Join us in this journey

Istio authentication is the first step towards providing a full stack of capabilities to protect services with sensitive data from external attacks and insider threats. While the initial version runs on Kubernetes, our goal is to enable Istio authentication to secure services across diverse production environments. We encourage the community to join us in making robust service security easy and ubiquitous across different application stacks and runtime platforms.

]]>
Thu, 25 May 2017 00:00:00 +0000/v1.24//blog/2017/0.1-auth//v1.24//blog/2017/0.1-auth/