diff --git a/events/2014/contributor-conference/README.md b/events/2014/contributor-conference/README.md
new file mode 100644
index 000000000..e100ed828
--- /dev/null
+++ b/events/2014/contributor-conference/README.md
@@ -0,0 +1,200 @@
# Kubernetes Contributor Conference, 2014-12-03 to 12-05

**Full notes:** https://docs.google.com/document/d/1cQLY9yeFgxlr_SRgaBZYGcJ4UtNhLAjJNwJa8424JMA/edit?usp=sharing (has pictures; shared with the k-dev mailing list)
**Organizers:** thockin and bburns
**26 Attendees from:** Google, Red Hat, CoreOS, Box

**This is a historical document. No typo or grammar correction PRs needed.**

Last modified: Dec. 8, 2014

# Clustering and Cluster Formation

Goal: Decide how clusters should be formed and resized over time.

Models for building clusters:
* Master in charge - asset DB
  * Dynamic join - ask to join
* How Kelsey Hightower has seen this done on bare metal:
  * Use Fleet as a machine database
  * A Fleet agent is run on each node
  * Each node registers its information in etcd when it comes up
  * Only security is that etcd expects the node to have a cert signed by a specific CA
  * Run an etcd proxy on each node
  * Don't run any Salt scripts; everything is declarative
  * Just put a daemon (kube-register) on a machine to make it part of the cluster
  * brendanburns: basically using Fleet as the cloud provider
* Puppet model - whitelist some cert and/or subnet that you want to trust everything in
  * One problem - if the CA leaks, you have to replace certs on all nodes
* briangrant: we may want to support adding nodes that aren't trusted, only scheduling work from the nodes' owner on them
* lavalamp: we need to differentiate between node states:
  * In the cluster
  * Ready to accept work
  * Trusted to accept work
* Proposal (a minimal registration sketch follows this section):
  * New nodes initiate contact with the master
  * Allow multiple config options for how trust can be established - IP, cert, etc.
  * Each new node only needs one piece of information - how to find the master
  * Can support many different auth modes - let anyone in, whitelist IPs, a particular signed cert, queue up requests for an admin to approve, etc.
  * Default should be auto-register with no auth/approval needed
  * Auth-ing is separate from registering
  * Support switching between permissive and strict auth modes:
    * Each node should register a public key, so that if the auth mode is later changed to require a cert upon registration, old nodes won't break
  * kelseyhightower: let the minion do the same thing that kube-register currently does
  * Separate adding a node to the cluster from declaring it schedulable
* Use cases:
  * Kick the tires - everything should be automagic
  * Professional that needs security
* Working group for later: Joe, Kelsey, Quintin, Eric Paris
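To make the proposal above concrete, here is a minimal sketch of a self-registering node agent, written in Go. It assumes a hypothetical HTTP registration endpoint on the master; the `/register` path, the payload fields, and the retry behavior are illustrative assumptions, not how kube-register or the kubelet actually work.

```go
// Hypothetical sketch of the proposed node self-registration flow.
// Assumptions (not actual Kubernetes/kube-register behavior): the master
// exposes an HTTP endpoint that accepts node registrations, and the only
// configuration a node needs is the master's address.
package main

import (
	"bytes"
	"encoding/json"
	"log"
	"net/http"
	"os"
	"time"
)

// nodeRegistration is the information a new node announces to the master.
// A public key is included so the cluster can later switch from a permissive
// auth mode to one requiring certs without breaking already-registered nodes.
type nodeRegistration struct {
	HostIP    string `json:"hostIP"`
	PublicKey string `json:"publicKey"`
}

func main() {
	// "How to find the master" is the single piece of information a node needs.
	master := os.Getenv("KUBE_MASTER")
	if master == "" {
		master = "http://127.0.0.1:8080" // illustrative default
	}

	reg := nodeRegistration{
		HostIP:    "10.240.0.7",        // placeholder node address
		PublicKey: "<node public key>", // placeholder key material
	}
	body, _ := json.Marshal(reg)

	// Keep retrying until the master accepts (or an admin approves) the
	// registration. Registering is deliberately separate from being marked
	// schedulable or trusted.
	for {
		resp, err := http.Post(master+"/register", "application/json", bytes.NewReader(body))
		if err != nil {
			log.Printf("master unreachable, retrying: %v", err)
		} else if resp.StatusCode == http.StatusOK {
			resp.Body.Close()
			log.Print("registered; waiting to be marked schedulable")
			return
		} else {
			log.Printf("registration pending (status %d), retrying", resp.StatusCode)
			resp.Body.Close()
		}
		time.Sleep(10 * time.Second)
	}
}
```

The point of the sketch is the separation the group settled on: registering announces the node (including a public key for a later cert-based auth mode), while marking it schedulable or trusted is a separate, possibly admin-gated step.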
# Usability

* Getting started
  * Want easy entry for Docker users
  * Library/registry of pod templates
* GUI - visualization of relationships and dependencies, workflows, dashboards, ways to learn, first impressions
  * Will be easiest to start with a read-only UI before worrying about read-write workflows
* Docs
  * Need to refactor the getting-started guides so that there's one common guide
  * Each cloud provider will just have its own short guide on how to create a cluster
  * Need a simple test that can verify whether your cluster is healthy, or diagnose why it isn't
  * Make it easier to get to the architecture/design doc from the front page of the GitHub project
  * Table of contents for docs?
  * Realistic examples
    * Kelsey has found that doing a tutorial of deploying with a canary helped make the value of labels clear
* CLI
  * Annoying when local auth files and config get overwritten when trying to work with multiple clusters
    * Like when running e2e tests
* Common friction points
  * External IPs
  * Image registry
  * Secrets
  * Deployment
  * Stateful services
  * Scheduling
  * Events/status
  * Log access
* Working groups
  * GUI - Jordan, Brian, Max, Satnam
  * CLI - Jeff, Sam, Derek
  * Docs - Proppy, Kelsey, TJ, Satnam, Jeff
  * Features/Experience - Dawn, Rohit, Kelsey, Proppy, Clayton: https://docs.google.com/document/d/1hqn6FtBNMe0sThbciq2PbE_P5BONBgCzHST4gz2zj50/edit

# v1beta3 discussion

12-04-2014

Network -- breakout
* Dynamic IP
  * Once we support live migration, the IP assigned to a pod has to move with it, which might break the underlying network.
  * We don't have introspection, which makes supporting various network topologies harder.
  * External IPs are an important part.
  * There's a kick-the-tires mode and a full-on mode (for GCE, AWS - fully featured).
  * How do we select the kick-the-tires implementation? Weave, Flannel, Calico: pick one.
    * Someone should do a comparison. thockin@ would like help evaluating these technologies against some benchmarks. Eric Paris can help - has a bare-metal setup. We'll have a benchmark setup for evaluation.
    * We need at least two real use cases - a webserver example; can 10 pods find each other? lavalamp@ is working on a test.
  * If Docker picks up a plugin model, we can use that.
  * Clusters will change dynamically, so we need to design a flexible network plugin API to accommodate this.
  * Flannel does two things: network allocation through etcd, and traffic routing with overlays. It also programs underlay networks (like GCE). Flannel will do IP allocation, not hard-coded.
  * One special use case: on some nodes only 20 IPs can be allocated. The scheduler might need to know about that limitation: OUT-OF-IP(?)
  * Different cloud providers, but OVS is a common mechanism.
  * We might need network grids in the end.
  * ACTIONS: better docs, tests.
* Public Services
  * Hard problem: have to scale on GCE, and the GCE load balancer cannot target an arbitrary IP - it can only target a VM for now.
  * Until we have an external IP, you cannot build an HA public service.
  * We can run Digital Ocean on top of Kubernetes.
  * Issue: when a public service is started, it is assigned an internal IP. That IP is accessible from nodes within the cluster, but not from outside. Given a 3-tier service, how do we access one of its services from outside? The issue is how to externalize this internally accessible service. General solution: forward outside traffic to the internal IP (see the sketch after this section). First action: teach Kubernetes the mapping.
  * We need a registry of those public IPs. All traffic coming to such an IP will be forwarded to the proper internal IP.
  * A public service can register with DNS and do intermediate load balancing outside the cluster / Kubernetes. A label query identifies the endpoints.
  * The k8s proxy can act as an L3 LB listening on the external IPs; it also talks to the k8s service DB to find internal services. Traffic then goes to an L7 LB, which could be HAProxy scheduled as a pod; it talks to the pods DB and finds a set of pods to forward the traffic to.
  * Two types of services: mapping external IPs, and an L3 LB that maps to pods. An L7 LB can access the IPs assigned to pods.
  * Policy: add more nodes, and more external IPs can be used.
  * Issue 1: how to map an external IP to a list of pods - the L3 LB part.
  * Issue 2: how to slice those external IPs: general pool vs. private pools.
* IP-per-service, visibility, segmenting
* Scale
* MAC
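The "forward outside traffic to the internal IP" idea from the public-services discussion can be illustrated with a minimal sketch in Go. This is not kube-proxy or any existing Kubernetes component - just a bare TCP forwarder, and the external and internal addresses are placeholder assumptions.

```go
// Minimal sketch of forwarding traffic that arrives on an externally
// reachable address to a service's internal (cluster-only) IP. Illustration
// of the idea discussed above, not an actual Kubernetes component; the
// addresses below are placeholders.
package main

import (
	"io"
	"log"
	"net"
)

func main() {
	externalAddr := "0.0.0.0:80"       // externally reachable address on this node
	internalAddr := "10.0.150.20:9376" // internal service IP:port, routable only in-cluster

	ln, err := net.Listen("tcp", externalAddr)
	if err != nil {
		log.Fatalf("listen on external address: %v", err)
	}
	for {
		client, err := ln.Accept()
		if err != nil {
			log.Printf("accept: %v", err)
			continue
		}
		go func(client net.Conn) {
			defer client.Close()
			backend, err := net.Dial("tcp", internalAddr)
			if err != nil {
				log.Printf("dial internal service: %v", err)
				return
			}
			defer backend.Close()
			// Copy bytes in both directions until either side closes.
			go io.Copy(backend, client)
			io.Copy(client, backend)
		}(client)
	}
}
```

An L7 load balancer (e.g. HAProxy scheduled as a pod) could then sit behind such a forwarder and spread the traffic across the pods selected by a label query, as described above.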
# Roadmap

* Should be driven by scenarios / use cases -- breakout
* Storage / stateful services -- breakout
  * Clustered databases / kv stores
    * Mongo
    * MySQL master/slave
    * Cassandra
    * etcd
    * zookeeper
    * redis
    * ldap
  * Alternatives
    * local storage
    * durable volumes
    * identity associated with volumes
    * lifecycle management
    * network storage (ceph, nfs, gluster, hdfs)
    * volume plugin
    * flocker - volume migration
    * "durable" data (as reliable as the host)
* Upgrading Kubernetes
  * master components
  * kubelets
  * OS + kernel + Docker
* Usability
  * Easy cluster startup
    * Minion registration
    * Configuring k8s
      * move away from flags in master
      * node config distribution
        * kubelet config
        * dockercfg
  * Cluster scaling
  * CLI + config + deployment / rolling updates
  * Selected workloads
* Networking
  * External IPs
  * DNS
  * Kick-the-tires networking implementation
* Admission control not required for 1.0
* v1 API + deprecation policy
* Kubelet API well defined and versioned
* Basic resource-aware scheduling -- breakout
  * require limits?
  * auto-sizing
* Registry
  * Predictable deployment (config-time image resolution)
  * Easy code->k8s
  * Simple out-of-the-box setup
    * One or many?
    * Proxy?
    * Service?
  * Configurable .dockercfg
* Productionization
  * Scalability
    * 100 nodes for 1.0
    * 1000 nodes by summer 2015
  * HA master -- not gating 1.0
    * Master election
    * Eliminate global in-memory state
      * IP allocator
      * Operations
    * Sharding
      * Pod getter
  * Kubelets need to coast when the master is down
    * Don't blow away pods when the master is down
  * Testing
    * More/better/easier E2E
    * E2E integration testing w/ OpenShift
    * More non-E2E integration tests
    * Long-term soaking / stress test
    * Backward compatibility
  * Release cadence and artifacts
  * Export monitoring metrics (instrumentation)
  * Bounded disk space on master and kubelets
    * GC of unused images
* Docs
  * Reference architecture
* Auth[nz]
  * plugins + policy
  * admin
  * user->master
  * master component->component: localhost in 1.0
  * kubelet->master

diff --git a/hack/.spelling_failures b/hack/.spelling_failures
index bf3b4eb3d..6f00db630 100644
--- a/hack/.spelling_failures
+++ b/hack/.spelling_failures
@@ -1,4 +1,4 @@
 events/elections/2017/
 vendor/
 sig-contributor-experience/contribex-survey-2018.csv
-
+events/2014