Update overview.md (#2932)

Multi-user isolation has evolved in the last few releases. These updates reflect the changes and update related content.
Josh Bottum 2021-09-13 09:27:01 -05:00 committed by GitHub
parent 101eb164fc
commit e95b3174a3
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 19 additions and 43 deletions

@ -1,65 +1,41 @@
+++
title = "Introduction to Multi-user Isolation"
description = "What does multi-user isolation mean?"
description = "Why do Kubeflow administrators need multi-user isolation?"
weight = 10
+++
{{% stable-status %}}
In a production environment, it is often necessary to share the same pool
of resources across different teams and users. These different users need
a reliable way to isolate and protect their own resources, without accidentally
viewing or changing each other's resources.
Kubeflow {{% kf-latest-version %}} supports multi-user isolation, which applies
access control over namespaces and user-created
resources in a deployment. This feature provides the users with the
convenience of clutter-free browsing of notebooks, training jobs, serving
deployments and other resources. The isolation mechanisms also prevent
accidental deletion/modification of resources of other users in the deployment.
Note that the isolation support in Kubeflow doesn't provide any hard security
guarantees against malicious attempts by users to infiltrate other user's
profiles.
In Kubeflow clusters, users often need to be isolated into a group, where a group includes one or more users. Additionally, a user may need to belong to multiple groups. Kubeflow's multi-user isolation simplifies user operations because each user only views and edits the Kubeflow components and model artifacts defined in their configuration. A user's view is not cluttered by components or model artifacts that are not in their configuration. This isolation also makes infrastructure and operations more efficient: a single cluster supports multiple isolated users, so the administrator does not need to operate separate clusters to isolate users.
## Key concepts
**Administrator**: An administrator is someone who creates and maintains the Kubeflow cluster.
This person has the permission to grant access permissions to others.
**Administrator**: An Administrator is someone who creates and maintains the Kubeflow cluster. This person configures permissions (i.e. view, edit) for other users.
**User**: A user is someone who has access to some set of resources in the cluster. A user
needs to be granted access permissions by the administrator.
**User**: A User is someone who has access to some set of resources in the cluster. A user needs to be granted access permissions by the administrator.
**Profile**: A profile is a grouping of all Kubernetes clusters owned by a user.
**Profile**: A Profile is a unique configuration for a user, which determines their access privileges and is defined by the Administrator.
## Current integration and limitations
**Isolation**: Isolation uses Kubernetes Namespaces. A namespace isolates a single user or a group of users, for example Bob's namespace, or an ML Eng namespace that is shared by Bob and Sara.
The Jupyter notebooks service is the first application to be fully integrated with
multi-user isolation. Access to the notebooks and the creation of notebooks is
controlled by the profile access policies set by the administrator or the owners
of the profiles. Resources created by the notebooks (for example, training jobs and
deployments) also inherit the same access.
**Authentication**: Authentication is provided by an integration of Istio and OIDC and is secured by mTLS. More details can be found [here](https://journal.arrikto.com/kubeflow-authentication-with-istio-dex-5eafdfac4782).
Kubeflow Pipelines is partially integrated with multi-user isolation starting
from Kubeflow v1.1. You can find more information on [Multi-user Isolation for
Pipelines](https://www.kubeflow.org/docs/components/pipelines/multi-user/).
**Authorization**: Authorization is provided by an integration with Kubernetes RBAC.
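To make the RBAC integration concrete, the following is a minimal sketch of the kind of RoleBinding that grants one user access inside one namespace. The helper `build_role_binding`, the binding name, and the ClusterRole names `kubeflow-edit` and `kubeflow-view` are assumptions for illustration; check your distribution for the roles it actually ships.

```python
import json


def build_role_binding(namespace: str, user_email: str, role: str = "edit") -> dict:
    """Return a Kubernetes RoleBinding manifest as a plain dict.

    Hypothetical helper: it only builds the manifest, it does not talk
    to a cluster.
    """
    if role not in ("edit", "view"):
        raise ValueError(f"unsupported role: {role}")
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {
            # Illustrative name; any valid, unique name works.
            "name": f"{role}-binding-example",
            "namespace": namespace,
        },
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            # Assumption: ClusterRoles named kubeflow-edit / kubeflow-view exist.
            "name": f"kubeflow-{role}",
        },
        "subjects": [
            {
                "apiGroup": "rbac.authorization.k8s.io",
                "kind": "User",
                "name": user_email,
            }
        ],
    }


binding = build_role_binding("ml-eng", "sara@example.com", role="view")
print(json.dumps(binding, indent=2))
```

Because the binding is namespaced, the same user can hold `view` in one namespace and `edit` in another, which is how a user can belong to several groups.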
Metadata or any other applications currently don't have full
fledged integration with isolation, though they do have access to the user
identity through the headers of the incoming requests. It's up to the individual
applications to use the available identity and isolation features
in a way that makes sense for each application.
Kubeflow multi-user isolation is configured by Kubeflow administrators. Administrators configure Kubeflow User Profiles for each user. After the configuration is created and applied, a User can only access the Kubeflow components that the Administrator has configured for them. The configuration limits non-authorized UI users from viewing or accidentally deleting model artifacts.
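As an illustration of what an administrator applies for each user, here is a minimal sketch that builds a Profile manifest as a Python dict and prints it as JSON. The helper `build_profile`, the sample names, the quota values, and the `kubeflow.org/v1beta1` API version are assumptions; consult the profile-controller documentation for the fields your release supports.

```python
import json


def build_profile(name: str, owner_email: str, cpu: str = "2", memory: str = "2Gi") -> dict:
    """Return a Kubeflow Profile manifest as a plain dict (hypothetical helper)."""
    return {
        # Assumption: the API version depends on the Kubeflow release.
        "apiVersion": "kubeflow.org/v1beta1",
        "kind": "Profile",
        # Assumption: the profile name is also used as the namespace name.
        "metadata": {"name": name},
        "spec": {
            # The owner is the user who gets access to the profile's namespace.
            "owner": {"kind": "User", "name": owner_email},
            # Optional resource quota applied to the namespace (sample values).
            "resourceQuotaSpec": {"hard": {"cpu": cpu, "memory": memory}},
        },
    }


profile = build_profile("ml-eng", "bob@example.com")
print(json.dumps(profile, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.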
On Google Cloud Platform (GCP), the authentication and identity token is generated by GCP IAM and carried
through the requests as a JWT Token in the request header. Other cloud providers can have a
similar header to provide identity information.
With multi-user isolation, Users are authenticated and authorized, and then provided with a time-limited token, a JSON Web Token (JWT). The access token is carried as a web header in user requests, and authorizes the user to access the resources configured in their Profile. The Profile configures several items, including the User's namespace(s), RBAC RoleBinding, Istio ServiceRole and ServiceRoleBindings, Resource Quotas, and Custom Plug-ins. More information on the Profile definition and related CRD can be found [here](https://github.com/kubeflow/kubeflow/blob/master/components/profile-controller/README.md).
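The time-based nature of the token can be sketched with the Python standard library alone. The example below builds an unsigned sample token, reads its claims, and checks the `exp` (expiry) claim. The helper names, the `email` claim, and the sample values are assumptions for illustration; the sketch deliberately does not verify a signature, which a real deployment must do.

```python
import base64
import json
import time


def b64url_encode(raw: bytes) -> str:
    """Base64url-encode without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def read_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


def is_expired(claims: dict, now=None) -> bool:
    """True once the current time reaches the token's exp claim."""
    current = time.time() if now is None else now
    return current >= claims["exp"]


# Build an unsigned sample token (hypothetical claims, empty signature).
header = b64url_encode(json.dumps({"alg": "none", "typ": "JWT"}).encode())
payload = b64url_encode(
    json.dumps({"email": "bob@example.com", "exp": 1700000000}).encode()
)
sample_token = f"{header}.{payload}."

claims = read_claims(sample_token)
print(claims["email"], is_expired(claims, now=1699999999))
```

Once the expiry time has passed, the user has to authenticate again to obtain a fresh token.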
For on-premises deployments, Kubeflow uses Dex as a federated OpenID connection
provider and can be integrated with LDAP or Active Directory to provide authentication
and identity services.
## Current integration
These Kubeflow Components can support multi-user isolation: Central Dashboard, Notebooks, Pipelines, AutoML (Katib), KFServing. Furthermore, resources created by the notebooks (for example, training jobs and deployments) also inherit the same access.
Important notes: Multi-user isolation has several configurable dependencies, especially those related to how Kubeflow is configured with the underlying Kubernetes cluster's identity management system. Additionally, Kubeflow multi-user isolation doesn't provide hard security guarantees against malicious attempts to infiltrate another user's profile.
When configuring multi-user isolation along with your security and identity management requirements, it is recommended that you consult with your [distribution provider](https://www.kubeflow.org/docs/distributions/). This KubeCon [presentation](https://www.youtube.com/watch?v=U8yWOKOhzes) provides a detailed review of the architecture and implementation. For on-premises deployments, Kubeflow uses Dex as a federated OpenID connection provider and can be integrated with LDAP or Active Directory to provide authentication and identity services. This can be an advanced configuration, and it is recommended that you consult with a distribution provider or a team that provides advanced technical support for on-premises Kubeflow.
## Next steps
* Understand the [detailed design](/docs/components/multi-tenancy/design/) of Kubeflow's multi-user isolation feature.
* Learn [how to use multi-user isolation and profiles](/docs/components/multi-tenancy/getting-started/).
* Learn [more about multi-user isolation and profiles](/docs/components/multi-tenancy/getting-started/).