From dbb23e1fdb41423dc9d87d723112a3b0fd282c83 Mon Sep 17 00:00:00 2001
From: Vadim Eisenberg <vadime@il.ibm.com>
Date: Wed, 2 Oct 2019 16:40:25 +0300
Subject: [PATCH] Blog post about using Istio multi-mesh for isolation and
 boundary protection (#4776)

* initial version

* add structure and certificate generation

* remove redundant article

* create the reviews service and later delete it

required for pods to start

* kubernetes -> kubectl

* complete creating the egress gateway section

* add deployment of an ingress gateway

* use LoadBalancer type for the private ingress gateway

* expand the cleanup section

* add "Expose reviews v2" section

* use hostnames in CN so it can be verified by curl

* use a single slash in HTTPRewrite uri field

* fix the virtual service and the curl call

* add a troubleshooting section

* use port 80 in the egress gateway's deployment

* implement the consume section for reviews v2

* expand the troubleshooting section

* split a virtual service, use port 443

* unite two virtual services for reviews

* add namespace to the gateway reference

* complete the cleaning instructions

* fix prefix match and rewrite in consuming reviews v2

* rename the gateway, destination rule, rewrite authority in ingress cluster2

* split the virtual service in cluster1 into two parts

* set access log format to print both the path and the rewritten path

* extend the cleanup section

* add load balancing between the local and remote versions of reviews

* remove usi

* change consume/expose details to ratings

* add diagrams

* canary release the remote version

* fix the subtitle and the publish date

* add subset v1 to the routing to the local version

* use local name (reviews) for a virtual service in the default namespace

* add the 'Deploy reviews v2 locally and retire reviews v1' section

* a Gateway -> an ingress Gateway

* virtualservice myreviews-bookinfo-v2 -> virtualservice privately-exposed-services

* add the "Expose ratings and reviews v3" section

* add printing response code to curl commands

* add a step to delete the consumption of the remote service from `cluster2`

* add a section "Consume ratings and reviews v3"

* add a section about Istio RBAC

* rewrite certificate creation - add spiffe SAN

* add a section about RBAC on ingress gateway

* remove redundant quote

* add extended key usage and critical to subjectAltName

* add generation of certificate and key for cluster3

* rewrite ingress RBAC in cluster2 to use EnvoyFilter for RBAC

Istio RBAC currently does not support getting principal for
MUTUAL TLS, only for ISTIO_MUTUAL

* fix MeshFederation5, the local version of reviews must be v2

* fix a typo

* add the "Cancel exposure of ratings" section

* add checking Istio configuration artifacts

* rewrite the introduction, add requirements and the proposed implementation section

* to base implementation -> to base the implementation

* split a long line

* web page -> webpage

* fix indentation

* of deploying -> after deploying

* add an explanation about openssl

* extend the explanation about `cluster3`

* add an explanation about deploying gateways

* create the certificates -> create the certificates and keys

* remove "the" from "to generate the certificates and the keys"

* minor changes in gateway deployment

* mount volumes from secrets -> mount secrets as data volumes

* add explanation about private gateways

* cluster1 and cluster2 -> both clusters

* add an explanation about exposure/consumption

* add an explanation about c1,c2,c3.example.com hostnames

* real URL -> existing hostname

* port 80 -> port 443 (the egress gateway)

* remove the non-mTLS options

* VirtualService -> virtual service

* fix indentation

* remove back ticks from reviews v1 and v2

* in remote cluster -> is in remote cluster

* add explanation about expose-nothing behavior by default

* add a separating empty line

* port 80 -> port 443

* VirtualService -> virtual service, part 2

* your Kubernetes cluster -> your second cluster

* add "in case you have a load balancer"

* add "in case you have a load balancer... otherwise..."

* fix the pod of reviews-v2 in the first cluster

mention the new pod

* web page -> webpage

* cluster1 -> the first cluster

* make multiple tests a sublist

* rewrite the sentence "Let's change the RBAC policy"

remove let's
remove passive voice

* rewrite the series of the tests to check RBAC

* issues requests -> sends requests

* Let's consider -> consider

* split a long line

* add "locally" to has access to ratings

* the ratings -> ratings

* use first/second cluster instead of cluster1/cluster2 in headings

* add a subsection to remove certificate and key files

* extend the sentence about role binding

* extend the sentence about enabling Istio RBAC on bookinfo

* rewrite the sentence about accessing the webpage of the bookinfo app

* add an explanation about the EnvoyFilter

* other 50% -> the other 50%

* 50% of time -> 50% of the time

* at cluster -> in cluster

* rewrite the sentence about cleaning Istio RBAC

* add summary

* in the subtitle: traffic control -> strict access control

* for the many different reasons -> for different reasons

* special certificates -> dedicated certificates, add dots

* add a sentence about defense in depth and PCI compliance

* fix typos

* through their gateways -> through corresponding gateways

* _v1_ -> `v1`

* ad-hoc -> ad hoc

* put EnvoyFilter and the name of the Envoy's filter in backticks

* instructions for NodePort Ingress -> instructions for using node port for ingress

* add "hoc" to .spelling, for "ad hoc" expression

* fix a link

* remove unneeded single bullet

* fix a link for Defense-in-depth

* rewrite the list of reasons for split applications between multiple clusters

* add a clause about boundary protection

* expand on non-uniform naming

* rewrite the bullet about boundary protection

* expand on the lack of common trust

* fix division into paragraphs in the introduction

* different as -> different than

* in different namespaces in a cluster -> in the clusters

* to the ratings -> to the ratings service

* rewrite the explanation about DNS and routing

* add a comma after "destined to ratings"

* split a long line

* replace PCI DSS with boundary protection

* remove an unneeded empty line

* split long lines in the summary

* simplify the sentence in the summary about explicit exposure of the clusters

* put "paired" in italics

* split a long line

* change the publish date to 12-th of August

* split a long line

* add the "Isolation of system components and boundary protection" subsection

* rephrase a sentence to remove passive voice

* add cyber and subnetworks to .spelling

used by NIST Special Publication 800-53, Revision 4, Security and Privacy
Controls for Federal Information Systems and Organizations:

This type of enhanced protection limits the potential harm from cyber attacks...

... routers, gateways, and firewalls separating system components into physically separate networks or
subnetworks

* rephrase and reformat the section about boundary protection and isolation

* rewrite the section about isolation and boundary protection

* Kubernetes community -> the Kubernetes community

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* three patterns -> three documented patterns

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* three patterns differ -> the differences between the patterns

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* add "where none of the multi cluster patterns apply" to "there are cases when you want to"

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* didn't establish -> have not established

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* rewrite the sentence about the best solution and the goal

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* Payment Card Industry Data Security Standard -> the ..

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* move "in my opinion" to the beginning of the sentence

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* move "in my opinion" to the beginning of the sentence, part 2

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* Add "the" to PCI DSS

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* add "approach" after "the proposed mesh federation"

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* add "the" before NIST

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* uniform identical naming -> uniform naming

* common indentity and common trust -> common identity and trust

* mesh-federation -> isolated-clusters

* rewrite the blog post, removing mesh federation and multicluster mesh mentioning

* add the "Testing the certificates in the chain of calls" section

* Revert "add the "Testing the certificates in the chain of calls" section"

This reverts commit 6ada5903e58139dff3d6018950f61a9e3df8edf6.

* remove redundant parenthesis around the first link to PCI DSS

* fix a typo (though -> through)

* remove the last '/' which seems to confuse lint

* remove namespace qualifier for gateways in virtual services

since the virtual services are in the same namespace

* extend the explanation about RBAC

* try another link for gdpr

* add `&nbsp;` to try to make lint happy

* Revert "add `&nbsp;` to try to make lint happy"

This reverts commit 552806883f1e9742af640ae638f53841563880ec.

* rewrite the list of standards as a table, add links to the paragraph below

* put full service name in backticks

* fix a typo (localtion -> location)

* fix the level of the first section

* rename the ca-example-com-certs secrets into c1/c2-trusted-certs secrets

to enable running commands in a single cluster

* use kubectl apply to create a namespace in case it already exists

for the single cluster scenario

* add deleting of the ratings service in the first cluster

during the initial setting

* change the error in case ratings is not found

* remove istio-private-gateways from the list of RBAC-included namespaces

* add '--ignore-not-found=true' to the kubectl delete commands

to support the case of a single cluster

* credit card -> payment card

* add running the blog post in a single cluster

* add unsetting environment variables to the cleanup section

* fix internal links

* The approach I propose - The approach I use

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* features of the proposed approach -> features of the approach

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* I propose -> I use

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* I propose to base connecting clusters on  -> I connect clusters based on

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* add "some of the process could clearly benefit from automation..."

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* similar the pattern -> similar to the pattern

* the proposed implementation -> the implementation pattern

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* added a comment that my approach is different from multicluster meshes

* fix a link

* add a multi-mesh section to examples

* move the blog post about cluster isolation to examples

* rewrite the blog post as example

* add a missing period in the description

* Revert "add a missing period in the description"

This reverts commit 14f656280f04dab83bfb62dbf1713e15fdca3aa5.

* Revert "rewrite the blog post as example"

This reverts commit 875a4f55f0c3d4782fb4b46cd9d11ae86b31bc37.

* Revert "move the blog post about cluster isolation to examples"

This reverts commit 17b20a1cb5c9b26de7a293d7c5bb68252083e321.

* Revert "add a multi-mesh section to examples"

This reverts commit 9d9365eee79342590acce90cd672be067916cb7f.

* rewrite the blog post to not contain the same service (reviews) in two meshes

per comments of Sven Mawson
using ratings and httpbin to show exposure of two services

* fix the link to Envoy's RBAC filter

* fix an internal link

* fix spelling

* remove redundant empty line

* remove "no common trust" from the single cluster

* initial version after moving the example to istio-ecosystem

* fix list formatting

* additional touches

replace cluster with mesh everywhere
add monitoring at the boundary

* describe -> outline, report

* put all mesh-federation and multi-mesh instances into the glossary markup

* update the publish date

* call "service location transparency" an optional feature

* rewrote "Service location transparency is important" to "Service location transparency is useful in the cases when you want"

* the istio-ecosystem repository -> Istio ecosystem

* rewrite subtitle

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* Rewrite the title

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* rewrite the sentence about isolation

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* rewrite the sentence about separate service meshes on separate networks

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* Remove "Istio to connect applications in the meshes with different compliance requirements"

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* remove the glossary item from mesh federation and add "support and automation work under way"

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* remove glossary reference

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* remove glossary reference, 2

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* add comparison with multi-cluster (single mesh)

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* remove glossary reference, 3

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* remove glossary reference, 4

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* remove glossary reference, 5

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* remove glossary reference, 5

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* remove glossary reference, 6

Co-Authored-By: Frank Budinsky <frankb@ca.ibm.com>

* remove glossary reference, 7

* report -> touch on

* update the date of the blog
---
 .spelling                                     |   6 +-
 .../en/blog/2019/isolated-clusters/index.md   | 146 ++++++++++++++++++
 2 files changed, 151 insertions(+), 1 deletion(-)
 create mode 100644 content/en/blog/2019/isolated-clusters/index.md

diff --git a/.spelling b/.spelling
index 8df075c6f3..81f0a2faf2 100644
--- a/.spelling
+++ b/.spelling
@@ -154,9 +154,10 @@ CVE-2019-9513
 CVE-2019-9514
 CVE-2019-9515
 CVE-2019-9518
+CVEs
+cyber
 Datadog
 datapath
-CVEs
 dataset
 datastore
 Datawire
@@ -201,6 +202,7 @@ faq.md
 Fawad
 fcm.googleapis.com
 FDs
+FedRAMP
 filename
 filenames
 fine-grained
@@ -236,6 +238,7 @@ gRPC
 grpc
 helloworld
 Herness
+hoc
 hostIP
 hostname
 hostnames
@@ -506,6 +509,7 @@ subdomain
 subdomains
 subnet
 subnets
+subnetworks
 subresources
 substring
 Superfeet
diff --git a/content/en/blog/2019/isolated-clusters/index.md b/content/en/blog/2019/isolated-clusters/index.md
new file mode 100644
index 0000000000..20bc7ca9e8
--- /dev/null
+++ b/content/en/blog/2019/isolated-clusters/index.md
@@ -0,0 +1,146 @@
+---
+title: Multi-mesh deployments for isolation and boundary protection
+subtitle: Separate applications that require isolation into multiple meshes using mesh federation to enable inter-mesh communication
+description: Deploy environments that require isolation into separate meshes and enable inter-mesh communication by mesh federation.
+publishdate: 2019-10-02
+attribution: Vadim Eisenberg (IBM)
+keywords: [traffic-management,multicluster,security,gateway,tls]
+---
+Various compliance standards require protection of sensitive data environments. Some of the important standards and the
+types of sensitive data they protect appear in the following table:
+
+|Standard|Sensitive data|
+| --- | --- |
+|[PCI DSS](https://www.pcisecuritystandards.org/pci_security)|payment card data|
+|[FedRAMP](https://www.fedramp.gov)|federal information, data and metadata|
+|[HIPAA](http://www.gpo.gov/fdsys/search/pagedetails.action?granuleId=CRPT-104hrpt736&packageId=CRPT-104hrpt736)|personal health data|
+|[GDPR](https://gdpr-info.eu)|personal data|
+
+[PCI DSS](https://www.pcisecuritystandards.org/pci_security), for example, recommends putting the cardholder data
+environment on a network separate from the rest of the system. It also requires using a [DMZ](https://en.wikipedia.org/wiki/DMZ_(computing))
+and setting up firewalls between the public Internet and the DMZ, and between the DMZ and the internal network.
+
+Isolation of sensitive data environments from other information systems can reduce the scope of the compliance checks
+and improve the security of the sensitive data. Reducing the scope reduces the risk of failing a compliance check and
+reduces the cost of compliance, since there are fewer components to check and secure according to the compliance
+requirements.
+
+You can achieve isolation of sensitive data by separating the parts of the application that process that data
+into a separate service mesh, preferably on a separate network, and then connecting the meshes with different
+compliance requirements together in a {{< gloss >}}multi-mesh{{< /gloss >}} deployment.
+The process of connecting inter-mesh
+applications is called {{< gloss >}}mesh federation{{< /gloss >}}.
+
+Note that using mesh federation to create a multi-mesh deployment is very different than creating a
+{{< gloss >}}multi-cluster{{< /gloss >}} deployment, which defines a single service mesh composed of services spanning more than one cluster. Unlike multi-mesh, a multi-cluster deployment is not suitable for
+applications that require isolation and boundary protection.
+
+In this blog post I describe the requirements for isolation and boundary protection, and outline the principles of
+multi-mesh deployments. Finally, I touch on the current state of mesh-federation support and automation work under way for
+Istio.
+
+## Isolation and boundary protection
+
+Isolation and boundary protection mechanisms are explained in the
+[NIST Special Publication 800-53, Revision 4, Security and Privacy Controls for Federal Information Systems and Organizations](http://dx.doi.org/10.6028/NIST.SP.800-53r4),
+_Appendix F, Security Control Catalog, SC-7 Boundary Protection_.
+
+In particular, the _Boundary protection, isolation of information system components_ control enhancement states:
+
+{{< quote >}}
+Organizations can isolate information system components performing different missions and/or business functions.
+Such isolation limits unauthorized information flows among system components and also provides the opportunity to deploy
+greater levels of protection for selected components. Separating system components with boundary protection mechanisms
+provides the capability for increased protection of individual components and to more effectively control information
+flows between those components. This type of enhanced protection limits the potential harm from cyber attacks and
+errors. The degree of separation provided varies depending upon the mechanisms chosen. Boundary protection mechanisms
+include, for example, routers, gateways, and firewalls separating system components into physically separate networks or
+subnetworks, cross-domain devices separating subnetworks, virtualization techniques, and encrypting information flows
+among system components using distinct encryption keys.
+{{< /quote >}}
+
+Various compliance standards recommend isolating environments that process sensitive data from the rest of the
+organization.
+The [Payment Card Industry (PCI) Data Security Standard](https://www.pcisecuritystandards.org/pci_security/)
+recommends implementing network isolation for the _cardholder data_ environment and requires isolating this environment from
+the [DMZ](https://en.wikipedia.org/wiki/DMZ_(computing)).
+[FedRAMP Authorization Boundary Guidance](https://www.fedramp.gov/assets/resources/documents/CSP_A_FedRAMP_Authorization_Boundary_Guidance.pdf)
+describes the _authorization boundary_ for federal information and data, while
+[NIST Special Publication 800-37, Revision 2, Risk Management Framework for Information Systems and Organizations: A System Life Cycle Approach for Security and Privacy](https://doi.org/10.6028/NIST.SP.800-37r2)
+recommends protecting such a boundary in _Appendix G, Authorization Boundary Considerations_:
+
+{{< quote >}}
+Dividing a system into subsystems (i.e., divide and conquer) facilitates a targeted application of controls to achieve
+adequate security, protection of individual privacy, and a cost-effective risk management process. Dividing complex
+systems into subsystems also supports the important security concepts of domain separation and network segmentation,
+which can be significant when dealing with high value assets. When systems are divided into subsystems, organizations
+may choose to develop individual subsystem security and privacy plans or address the system and subsystems in the same
+security and privacy plans.
+Information security and privacy architectures play a key part in the process of dividing complex systems into
+subsystems. This includes monitoring and controlling communications at internal boundaries among subsystems and
+selecting, allocating, and implementing controls that meet or exceed the security and privacy requirements of the
+constituent subsystems.
+{{< /quote >}}
+
+Boundary protection, in particular, means that you:
+
+- put an access control mechanism at the boundary (firewall, gateway, etc.)
+- monitor the incoming and outgoing traffic at the boundary
+- make all the access control mechanisms _deny-all_ by default
+- do not expose private IP addresses from the boundary
+- do not let components from outside the boundary impact security inside the boundary
+
+Multi-mesh deployments facilitate division of a system into subsystems with different
+security and compliance requirements, and facilitate boundary protection.
+You put each subsystem into a separate service mesh, preferably on a separate network.
+You connect the Istio meshes using gateways. The gateways monitor and control cross-mesh traffic at the boundary of
+each mesh.
+
+## Features of multi-mesh deployments
+
+- **non-uniform naming**. The `withdraw` service in the `accounts` namespace in one mesh might have
+different functionality and API than the `withdraw` services in the `accounts` namespace in other meshes.
+Such a situation could happen in an organization that has no uniform policy on naming namespaces and services, or
+when the meshes belong to different organizations.
+- **expose-nothing by default**. None of the services in a mesh are exposed by default; the mesh owners must
+explicitly specify which services are exposed.
+- **boundary protection**. The access control of the traffic must be enforced at the ingress gateway, which stops
+forbidden traffic from entering the mesh. This requirement implements the
+[defense-in-depth principle](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)) and is part of some compliance
+standards, such as the
+[Payment Card Industry (PCI) Data Security Standard](https://www.pcisecuritystandards.org/pci_security/).
+- **common trust may not exist**. The Istio sidecars in one mesh may not trust the Citadel certificates in other
+meshes, due to some security requirement or due to the fact that the mesh owners did not initially plan to federate
+the meshes.
+
+While **expose-nothing by default** and **boundary protection** are required to facilitate compliance and improve
+security, **non-uniform naming** and **common trust may not exist** are required when connecting
+meshes of different organizations, or of an organization that cannot enforce uniform naming, or that cannot or may not
+establish common trust between the meshes.
+
+An optional feature that you may want to use is **service location transparency**: consuming services send requests
+to the exposed services in remote meshes using local service names. The consuming services are oblivious to the fact
+that some of the destinations are in remote meshes and some are local services. The access is uniform, using the local
+service names, for example, in Kubernetes, `reviews.default.svc.cluster.local`.
+**Service location transparency** is useful in cases where you want to be able to change the location of the
+consumed services, for example when some service is migrated from a private cloud to a public cloud, without changing the
+code of your applications.
+
+## The current mesh-federation work
+
+While you can already perform mesh federation today using standard Istio configurations,
+doing so requires writing a lot of boilerplate YAML files and can be error-prone. There is an effort under way to
+automate the mesh federation process.
+Until the automation of mesh federation is released, if you are curious, you
+can check the [multi-mesh deployment examples](https://github.com/istio-ecosystem/multi-mesh-examples) in the
+[Istio ecosystem](https://github.com/istio-ecosystem).
+
+## Summary
+
+In this blog post I described the requirements for isolation and boundary protection of sensitive data environments
+and how Istio multi-mesh deployments can meet them. I outlined the principles of Istio
+multi-mesh deployments and touched on the current work on
+mesh federation in Istio.
+
+I will be happy to hear your opinion about {{< gloss >}}multi-mesh{{< /gloss >}} and
+{{< gloss >}}multi-cluster{{< /gloss >}} at [discuss.istio.io](https://discuss.istio.io).