
Documentation: sync with etcd master

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
Gyuho Lee 8 years ago
commit f320348682

+ 6 - 0
Documentation/faq.md

@@ -102,6 +102,12 @@ To recover from the low space quota alarm:
 2. [Defragment][maintenance-defragment] every etcd endpoint.
 3. [Disarm][maintenance-disarm] the alarm.
 
+### What does the etcd warning "etcdserver/api/v3rpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:43020: read: connection reset by peer" mean?
+
+This is a gRPC-side warning logged when a server receives a TCP RST flag because client-side streams were closed prematurely. For example, a client closes its connection while the gRPC server has not yet processed all HTTP/2 frames in the TCP queue. Some data may have been lost on the server side, but this is fine as long as the client connection has already been closed.
+
+Only [old versions of gRPC](https://github.com/grpc/grpc-go/issues/1362) log this. etcd [>=v3.2.13 logs this at DEBUG level by default](https://github.com/coreos/etcd/pull/9080), so it is only visible with the `--debug` flag enabled.
+
 ## Performance
 
 ### How should I benchmark etcd?

+ 0 - 1
Documentation/integrations.md

@@ -152,7 +152,6 @@
 - [mattn/etcdenv](https://github.com/mattn/etcdenv) - "env" shebang with etcd integration
 - [kelseyhightower/confd](https://github.com/kelseyhightower/confd) - Manage local app config files using templates and data from etcd
 - [configdb](https://git.autistici.org/ai/configdb/tree/master) - A REST relational abstraction on top of arbitrary database backends, aimed at storing configs and inventories.
-- [fleet](https://github.com/coreos/fleet) - Distributed init system
 - [kubernetes/kubernetes](https://github.com/kubernetes/kubernetes) - Container cluster manager introduced by Google.
 - [mailgun/vulcand](https://github.com/mailgun/vulcand) - HTTP proxy that uses etcd as a configuration backend.
 - [duedil-ltd/discodns](https://github.com/duedil-ltd/discodns) - Simple DNS nameserver using etcd as a database for names and records.

+ 7 - 0
Documentation/op-guide/clustering.md

@@ -359,6 +359,13 @@ If `_etcd-client-ssl._tcp.example.com` is found, clients will attempt to communi
 
 If etcd is using TLS without a custom certificate authority, the discovery domain (e.g., example.com) must match the SRV record domain (e.g., infra1.example.com). This is to mitigate attacks that forge SRV records to point to a different domain; the domain would have a valid certificate under PKI but be controlled by an unknown third party.
 
+The `-discovery-srv-name` flag additionally configures a suffix to the SRV name that is queried during discovery.
+Use this flag to differentiate between multiple etcd clusters under the same domain.
+For example, if `-discovery-srv=example.com` and `-discovery-srv-name=foo` are set, the following DNS SRV queries are made:
+
+* _etcd-server-ssl-foo._tcp.example.com
+* _etcd-server-foo._tcp.example.com
+
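A minimal sketch of how the two queried names in the example are composed. `srvServerNames` is a hypothetical helper written for illustration, not an etcd function, and the behavior for an empty suffix (names left without a suffix) is an assumption.

```go
package main

import "fmt"

// srvServerNames returns the server SRV names queried for a discovery domain.
// Hypothetical helper: a non-empty suffix is appended to the service label
// with a dash, as in the "foo" example from the documentation text.
func srvServerNames(domain, suffix string) []string {
	services := []string{"etcd-server-ssl", "etcd-server"}
	names := make([]string, 0, len(services))
	for _, svc := range services {
		if suffix != "" {
			svc += "-" + suffix
		}
		names = append(names, fmt.Sprintf("_%s._tcp.%s", svc, domain))
	}
	return names
}

func main() {
	// Mirrors -discovery-srv=example.com with -discovery-srv-name=foo.
	for _, name := range srvServerNames("example.com", "foo") {
		fmt.Println(name)
	}
}
```

Giving each cluster its own suffix keeps their SRV records distinct even though they share `example.com`.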
 #### Create DNS SRV records
 
 ```

+ 15 - 3
Documentation/op-guide/configuration.md

@@ -1,6 +1,11 @@
 # Configuration flags
 
-etcd is configurable through command-line flags and environment variables. Options set on the command line take precedence over those from the environment.
+etcd is configurable through a configuration file, command-line flags, and environment variables.
+
+A reusable configuration file is a YAML file containing the names and values of one or more of the command-line flags described below. To use this file, pass its path as the value of the `--config-file` flag. The [sample configuration file][sample-config-file] can be used as a starting point for creating a new configuration file.
+
+Options set on the command line take precedence over those from the environment. If a configuration file is provided, all other command-line flags and environment variables are ignored.
+For example, `etcd --config-file etcd.conf.yml.sample --data-dir /tmp` will ignore the `--data-dir` flag.
 
 The format of environment variable for flag `--my-flag` is `ETCD_MY_FLAG`. It applies to all flags.
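The naming rule and the precedence described in this hunk can be sketched as follows. This is an illustration under stated assumptions, not etcd's implementation: `envName` and `resolve` are hypothetical helpers, and the maps keyed by flag name stand in for the parsed configuration file and command line.

```go
package main

import (
	"fmt"
	"strings"
)

// envName maps a flag such as "--my-flag" to "ETCD_MY_FLAG".
func envName(flag string) string {
	name := strings.TrimLeft(flag, "-")
	return "ETCD_" + strings.ToUpper(strings.ReplaceAll(name, "-", "_"))
}

// resolve picks the effective value for one flag (hypothetical helper).
// A non-nil file map means a configuration file was given, in which case
// command-line flags and environment variables are ignored. Otherwise the
// command line takes precedence over the environment.
func resolve(flag string, file, flags, env map[string]string) (string, bool) {
	if file != nil {
		v, ok := file[flag]
		return v, ok
	}
	if v, ok := flags[flag]; ok {
		return v, true
	}
	v, ok := env[envName(flag)]
	return v, ok
}

func main() {
	fmt.Println(envName("--my-flag"))

	// With a config file present, the --data-dir flag is ignored.
	file := map[string]string{"--data-dir": "/var/lib/etcd"}
	flags := map[string]string{"--data-dir": "/tmp"}
	v, _ := resolve("--data-dir", file, flags, nil)
	fmt.Println(v)
}
```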
 
 
@@ -150,6 +155,11 @@ To start etcd automatically using custom settings at startup in Linux, using a [
 + default: ""
 + env variable: ETCD_DISCOVERY_SRV
 
+### --discovery-srv-name
++ Suffix to the DNS SRV name queried when bootstrapping using DNS.
++ default: ""
++ env variable: ETCD_DISCOVERY_SRV_NAME
+
 ### --discovery-fallback
 + Expected behavior ("exit" or "proxy") when discovery services fails. "proxy" supports v2 API only.
 + default: "proxy"
@@ -266,12 +276,12 @@ The security flags help to [build a secure etcd cluster][security].
 + env variable: ETCD_PEER_CA_FILE
 
 ### --peer-cert-file
-+ Path to the peer server TLS cert file.
++ Path to the peer server TLS cert file. This is the cert for peer-to-peer traffic, used both for server and client.
 + default: ""
 + env variable: ETCD_PEER_CERT_FILE
 
 ### --peer-key-file
-+ Path to the peer server TLS key file.
++ Path to the peer server TLS key file. This is the key for peer-to-peer traffic, used both for server and client.
 + default: ""
 + env variable: ETCD_PEER_KEY_FILE
 
@@ -332,6 +342,7 @@ Follow the instructions when using these flags.
 ### --config-file
 + Load server configuration from a file.
 + default: ""
++ example: [sample configuration file][sample-config-file]
 
 ## Profiling flags
 
@@ -369,3 +380,4 @@ Follow the instructions when using these flags.
 [security]: security.md
 [systemd-intro]: http://freedesktop.org/wiki/Software/systemd/
 [tuning]: ../tuning.md#time-parameters
+[sample-config-file]: ../../etcd.conf.yml.sample

+ 6 - 6
Documentation/op-guide/container.md

@@ -17,14 +17,14 @@ export NODE1=192.168.1.21
 Trust the CoreOS [App Signing Key](https://coreos.com/security/app-signing-key/).
 
 ```
-sudo rkt trust --prefix coreos.com/etcd
+sudo rkt trust --prefix quay.io/coreos/etcd
 # gpg key fingerprint is: 18AD 5014 C99E F7E3 BA5F  6CE9 50BD D3E0 FC8A 365E
 ```
 
-Run the `v3.1.2` version of etcd or specify another release version.
+Run the `v3.2` version of etcd or specify another release version.
 
 ```
-sudo rkt run --net=default:IP=${NODE1} coreos.com/etcd:v3.1.2 -- -name=node1 -advertise-client-urls=http://${NODE1}:2379 -initial-advertise-peer-urls=http://${NODE1}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE1}:2380 -initial-cluster=node1=http://${NODE1}:2380
+sudo rkt run --net=default:IP=${NODE1} quay.io/coreos/etcd:v3.2 -- -name=node1 -advertise-client-urls=http://${NODE1}:2379 -initial-advertise-peer-urls=http://${NODE1}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE1}:2380 -initial-cluster=node1=http://${NODE1}:2380
 ```
 
 List the cluster member.
@@ -45,13 +45,13 @@ export NODE3=172.16.28.23
 
 ```
 # node 1
-sudo rkt run --net=default:IP=${NODE1} coreos.com/etcd:v3.1.2 -- -name=node1 -advertise-client-urls=http://${NODE1}:2379 -initial-advertise-peer-urls=http://${NODE1}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE1}:2380 -initial-cluster=node1=http://${NODE1}:2380,node2=http://${NODE2}:2380,node3=http://${NODE3}:2380
+sudo rkt run --net=default:IP=${NODE1} quay.io/coreos/etcd:v3.2 -- -name=node1 -advertise-client-urls=http://${NODE1}:2379 -initial-advertise-peer-urls=http://${NODE1}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE1}:2380 -initial-cluster=node1=http://${NODE1}:2380,node2=http://${NODE2}:2380,node3=http://${NODE3}:2380
 
 
 # node 2
-sudo rkt run --net=default:IP=${NODE2} coreos.com/etcd:v3.1.2 -- -name=node2 -advertise-client-urls=http://${NODE2}:2379 -initial-advertise-peer-urls=http://${NODE2}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE2}:2380 -initial-cluster=node1=http://${NODE1}:2380,node2=http://${NODE2}:2380,node3=http://${NODE3}:2380
+sudo rkt run --net=default:IP=${NODE2} quay.io/coreos/etcd:v3.2 -- -name=node2 -advertise-client-urls=http://${NODE2}:2379 -initial-advertise-peer-urls=http://${NODE2}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE2}:2380 -initial-cluster=node1=http://${NODE1}:2380,node2=http://${NODE2}:2380,node3=http://${NODE3}:2380
 
 
 # node 3
-sudo rkt run --net=default:IP=${NODE3} coreos.com/etcd:v3.1.2 -- -name=node3 -advertise-client-urls=http://${NODE3}:2379 -initial-advertise-peer-urls=http://${NODE3}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE3}:2380 -initial-cluster=node1=http://${NODE1}:2380,node2=http://${NODE2}:2380,node3=http://${NODE3}:2380
+sudo rkt run --net=default:IP=${NODE3} quay.io/coreos/etcd:v3.2 -- -name=node3 -advertise-client-urls=http://${NODE3}:2379 -initial-advertise-peer-urls=http://${NODE3}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE3}:2380 -initial-cluster=node1=http://${NODE1}:2380,node2=http://${NODE2}:2380,node3=http://${NODE3}:2380
 ```
 
 Verify the cluster is healthy and can be reached.

+ 8 - 8
Documentation/op-guide/etcd3_alert.rules

@@ -43,8 +43,8 @@ ANNOTATIONS {
 
 # alert if more than 1% of gRPC method calls have failed within the last 5 minutes
 ALERT HighNumberOfFailedGRPCRequests
-IF sum by(grpc_method) (rate(etcd_grpc_requests_failed_total{job="etcd"}[5m]))
-  / sum by(grpc_method) (rate(etcd_grpc_total{job="etcd"}[5m])) > 0.01
+IF 100 * (sum by(grpc_method) (rate(etcd_grpc_requests_failed_total{job="etcd"}[5m]))
+  / sum by(grpc_method) (rate(etcd_grpc_total{job="etcd"}[5m]))) > 1
 FOR 10m
 LABELS {
   severity = "warning"
@@ -56,8 +56,8 @@ ANNOTATIONS {
 
 # alert if more than 5% of gRPC method calls have failed within the last 5 minutes
 ALERT HighNumberOfFailedGRPCRequests
-IF sum by(grpc_method) (rate(etcd_grpc_requests_failed_total{job="etcd"}[5m]))
-  / sum by(grpc_method) (rate(etcd_grpc_total{job="etcd"}[5m])) > 0.05
+IF 100 * (sum by(grpc_method) (rate(etcd_grpc_requests_failed_total{job="etcd"}[5m]))
+  / sum by(grpc_method) (rate(etcd_grpc_total{job="etcd"}[5m]))) > 5
 FOR 5m
 LABELS {
   severity = "critical"
@@ -84,8 +84,8 @@ ANNOTATIONS {
 
 # alert if more than 1% of requests to an HTTP endpoint have failed within the last 5 minutes
 ALERT HighNumberOfFailedHTTPRequests
-IF sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
-  / sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.01
+IF 100 * (sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
+  / sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method)) > 1
 FOR 10m
 LABELS {
   severity = "warning"
@@ -97,8 +97,8 @@ ANNOTATIONS {
 
 # alert if more than 5% of requests to an HTTP endpoint have failed within the last 5 minutes
 ALERT HighNumberOfFailedHTTPRequests
-IF sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
-  / sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.05
+IF 100 * (sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
+  / sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method))  > 5
 FOR 5m
 LABELS {
   severity = "critical"

+ 8 - 8
Documentation/op-guide/etcd3_alert.rules.yml

@@ -26,8 +26,8 @@ groups:
         changes within the last hour
       summary: a high number of leader changes within the etcd cluster are happening
   - alert: HighNumberOfFailedGRPCRequests
-    expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
-      / sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.01
+    expr: 100 * (sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
+      / sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method)) > 1
     for: 10m
     labels:
       severity: warning
@@ -36,8 +36,8 @@ groups:
         on etcd instance {{ $labels.instance }}'
       summary: a high number of gRPC requests are failing
   - alert: HighNumberOfFailedGRPCRequests
-    expr: sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
-      / sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method) > 0.05
+    expr: 100 * (sum(rate(grpc_server_handled_total{grpc_code!="OK",job="etcd"}[5m])) BY (grpc_service, grpc_method)
+      / sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method)) > 5
     for: 5m
     labels:
       severity: critical
@@ -56,8 +56,8 @@ groups:
         }} are slow
       summary: slow gRPC requests
   - alert: HighNumberOfFailedHTTPRequests
-    expr: sum(rate(etcd_http_failed_total{job="etcd"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job="etcd"}[5m]))
-      BY (method) > 0.01
+    expr: 100 * (sum(rate(etcd_http_failed_total{job="etcd"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job="etcd"}[5m]))
+      BY (method)) > 1
     for: 10m
     labels:
       severity: warning
@@ -66,8 +66,8 @@ groups:
         instance {{ $labels.instance }}'
       summary: a high number of HTTP requests are failing
   - alert: HighNumberOfFailedHTTPRequests
-    expr: sum(rate(etcd_http_failed_total{job="etcd"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job="etcd"}[5m]))
-      BY (method) > 0.05
+    expr: 100 * (sum(rate(etcd_http_failed_total{job="etcd"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job="etcd"}[5m]))
+      BY (method)) > 5
     for: 5m
     labels:
       severity: critical
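
The threshold change in both rule files rescales the failure ratio to a percentage: `ratio > 0.01` becomes `100 * ratio > 1`. A small sketch of that arithmetic, using hypothetical function names and plain numbers in place of the Prometheus rate expressions:

```go
package main

import "fmt"

// failurePercent mirrors the new expression: 100 * (failed / total).
func failurePercent(failedRate, totalRate float64) float64 {
	return 100 * (failedRate / totalRate)
}

// firesNew is the new form of the condition, with the threshold in percent.
func firesNew(failedRate, totalRate, thresholdPercent float64) bool {
	return failurePercent(failedRate, totalRate) > thresholdPercent
}

// firesOld is the previous form, with the threshold expressed as a ratio.
func firesOld(failedRate, totalRate, thresholdRatio float64) bool {
	return failedRate/totalRate > thresholdRatio
}

func main() {
	// 3 failed out of 100 requests per second is 3%:
	// above the 1% warning threshold, below the 5% critical threshold.
	fmt.Println(firesNew(3, 100, 1), firesOld(3, 100, 0.01))
	fmt.Println(firesNew(3, 100, 5), firesOld(3, 100, 0.05))
}
```

Apart from floating-point rounding at the exact boundary, both forms fire under the same conditions. Only the scale of the value carried by the alert differs.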

+ 1 - 1
Documentation/op-guide/hardware.md

@@ -48,7 +48,7 @@ Example application workload: A 50-node Kubernetes cluster
 | Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
 |----------|------|-------|--------|------|----------------|
 | AWS | m4.large | 2 | 8 | 3600 | 56.25 |
-| GCE | n1-standard-1 + 50GB PD SSD | 2 | 7.5 | 1500 | 25 |
+| GCE | n1-standard-2 + 50GB PD SSD | 2 | 7.5 | 1500 | 25 |
 
 
 ### Medium cluster

+ 6 - 6
Documentation/op-guide/monitoring.md

@@ -20,14 +20,14 @@ Showing top 10 nodes out of 157 (cum >= 10ms)
     flat  flat%   sum%        cum   cum%
    130ms 27.08% 27.08%      130ms 27.08%  runtime.futex
     70ms 14.58% 41.67%       70ms 14.58%  syscall.Syscall
-    20ms  4.17% 45.83%       20ms  4.17%  github.com/coreos/etcd/cmd/vendor/golang.org/x/net/http2/hpack.huffmanDecode
+    20ms  4.17% 45.83%       20ms  4.17%  github.com/coreos/etcd/vendor/golang.org/x/net/http2/hpack.huffmanDecode
     20ms  4.17% 50.00%       30ms  6.25%  runtime.pcvalue
     20ms  4.17% 54.17%       50ms 10.42%  runtime.schedule
-    10ms  2.08% 56.25%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).AuthInfoFromCtx
-    10ms  2.08% 58.33%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).Lead
-    10ms  2.08% 60.42%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/wait.(*timeList).Trigger
-    10ms  2.08% 62.50%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/github.com/prometheus/client_golang/prometheus.(*MetricVec).hashLabelValues
-    10ms  2.08% 64.58%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/golang.org/x/net/http2.(*Framer).WriteHeaders
+    10ms  2.08% 56.25%       10ms  2.08%  github.com/coreos/etcd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).AuthInfoFromCtx
+    10ms  2.08% 58.33%       10ms  2.08%  github.com/coreos/etcd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).Lead
+    10ms  2.08% 60.42%       10ms  2.08%  github.com/coreos/etcd/vendor/github.com/coreos/etcd/pkg/wait.(*timeList).Trigger
+    10ms  2.08% 62.50%       10ms  2.08%  github.com/coreos/etcd/vendor/github.com/prometheus/client_golang/prometheus.(*MetricVec).hashLabelValues
+    10ms  2.08% 64.58%       10ms  2.08%  github.com/coreos/etcd/vendor/golang.org/x/net/http2.(*Framer).WriteHeaders
 ```
 
 The `/debug/requests` endpoint gives gRPC traces and performance statistics through a web browser. For example, here is a `Range` request for the key `abc`:

+ 65 - 5
Documentation/op-guide/security.md

@@ -195,9 +195,9 @@ When client authentication is enabled for an etcd member, the administrator must
 
 ## Notes for TLS authentication
 
-Since [v3.2.0](https://github.com/coreos/etcd/blob/master/CHANGELOG.md#v320-2017-06-09), [TLS certificates get reloaded on every client connection](https://github.com/coreos/etcd/pull/7829). This is useful when replacing expiry certs without stopping etcd servers; it can be done by overwriting old certs with new ones. Refreshing certs for every connection should not have too much overhead, but can be improved in the future, with caching layer. Example tests can be found [here](https://github.com/coreos/etcd/blob/b041ce5d514a4b4aaeefbffb008f0c7570a18986/integration/v3_grpc_test.go#L1601-L1757).
+Since [v3.2.0](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.2.md#v320-2017-06-09), [TLS certificates get reloaded on every client connection](https://github.com/coreos/etcd/pull/7829). This is useful when replacing expiry certs without stopping etcd servers; it can be done by overwriting old certs with new ones. Refreshing certs for every connection should not have too much overhead, but can be improved in the future, with caching layer. Example tests can be found [here](https://github.com/coreos/etcd/blob/b041ce5d514a4b4aaeefbffb008f0c7570a18986/integration/v3_grpc_test.go#L1601-L1757).
 
 
-Since [v3.2.0](https://github.com/coreos/etcd/blob/master/CHANGELOG.md#v320-2017-06-09), [server denies incoming peer certs with wrong IP `SAN`](https://github.com/coreos/etcd/pull/7687). For instance, if peer cert contains any IP addresses in Subject Alternative Name (SAN) field, server authenticates a peer only when the remote IP address matches one of those IP addresses. This is to prevent unauthorized endpoints from joining the cluster. For example, peer B's CSR (with `cfssl`) is:
+Since [v3.2.0](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.2.md#v320-2017-06-09), [server denies incoming peer certs with wrong IP `SAN`](https://github.com/coreos/etcd/pull/7687). For instance, if peer cert contains any IP addresses in Subject Alternative Name (SAN) field, server authenticates a peer only when the remote IP address matches one of those IP addresses. This is to prevent unauthorized endpoints from joining the cluster. For example, peer B's CSR (with `cfssl`) is:
 
 
 ```json
 {
@@ -223,7 +223,7 @@ Since [v3.2.0](https://github.com/coreos/etcd/blob/master/CHANGELOG.md#v320-2017
 
 
 when peer B's actual IP address is `10.138.0.2`, not `10.138.0.27`. When peer B tries to join the cluster, peer A will reject B with the error `x509: certificate is valid for 10.138.0.27, not 10.138.0.2`, because B's remote IP address does not match the one in Subject Alternative Name (SAN) field.
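
The IP check described above can be sketched as follows. `remoteIPInSAN` is a hypothetical helper for illustration only; it is not the function etcd uses, and it only covers the case where the cert carries IP addresses in its SAN field.

```go
package main

import (
	"fmt"
	"net"
)

// remoteIPInSAN reports whether the peer's remote IP equals one of the IP
// addresses listed in the cert's SAN field. Unparseable input never matches.
func remoteIPInSAN(sanIPs []string, remote string) bool {
	remoteIP := net.ParseIP(remote)
	if remoteIP == nil {
		return false
	}
	for _, s := range sanIPs {
		ip := net.ParseIP(s)
		if ip != nil && ip.Equal(remoteIP) {
			return true
		}
	}
	return false
}

func main() {
	san := []string{"10.138.0.27"}
	// Peer B connects from 10.138.0.2, which is not in the SAN list.
	fmt.Println(remoteIPInSAN(san, "10.138.0.2"))
	fmt.Println(remoteIPInSAN(san, "10.138.0.27"))
}
```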
 
 
-Since [v3.2.0](https://github.com/coreos/etcd/blob/master/CHANGELOG.md#v320-2017-06-09), [server resolves TLS `DNSNames` when checking `SAN`](https://github.com/coreos/etcd/pull/7767). For instance, if peer cert contains only DNS names (no IP addresses) in Subject Alternative Name (SAN) field, server authenticates a peer only when forward-lookups (`dig b.com`) on those DNS names have matching IP with the remote IP address. For example, peer B's CSR (with `cfssl`) is:
+Since [v3.2.0](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.2.md#v320-2017-06-09), [server resolves TLS `DNSNames` when checking `SAN`](https://github.com/coreos/etcd/pull/7767). For instance, if peer cert contains only DNS names (no IP addresses) in Subject Alternative Name (SAN) field, server authenticates a peer only when forward-lookups (`dig b.com`) on those DNS names have matching IP with the remote IP address. For example, peer B's CSR (with `cfssl`) is:
 
 
 ```json
 {
@@ -235,7 +235,7 @@ Since [v3.2.0](https://github.com/coreos/etcd/blob/master/CHANGELOG.md#v320-2017
 
 
 when peer B's remote IP address is `10.138.0.2`. When peer B tries to join the cluster, peer A looks up the incoming host `b.com` to get the list of IP addresses (e.g. `dig b.com`), and rejects B if the list does not contain the IP `10.138.0.2`, with the error `tls: 10.138.0.2 does not match any of DNSNames ["b.com"]`.
 
 
-Since [v3.2.2](https://github.com/coreos/etcd/blob/master/CHANGELOG.md#v322-2017-07-07), [server accepts connections if IP matches, without checking DNS entries](https://github.com/coreos/etcd/pull/8223). For instance, if peer cert contains IP addresses and DNS names in Subject Alternative Name (SAN) field, and the remote IP address matches one of those IP addresses, server just accepts connection without further checking the DNS names. For example, peer B's CSR (with `cfssl`) is:
+Since [v3.2.2](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.2.md#v322-2017-07-07), [server accepts connections if IP matches, without checking DNS entries](https://github.com/coreos/etcd/pull/8223). For instance, if peer cert contains IP addresses and DNS names in Subject Alternative Name (SAN) field, and the remote IP address matches one of those IP addresses, server just accepts connection without further checking the DNS names. For example, peer B's CSR (with `cfssl`) is:
 
 
 ```json
 {
@@ -248,7 +248,7 @@ Since [v3.2.2](https://github.com/coreos/etcd/blob/master/CHANGELOG.md#v322-2017
 
 
 when peer B's remote IP address is `10.138.0.2` and `invalid.domain` is an invalid host. When peer B tries to join the cluster, peer A successfully authenticates B, since the Subject Alternative Name (SAN) field has a valid matching IP address. See [issue#8206](https://github.com/coreos/etcd/issues/8206) for more detail.
 
 
-Since [v3.2.5](https://github.com/coreos/etcd/blob/master/CHANGELOG.md#v325-2017-08-04), [server supports reverse-lookup on wildcard DNS `SAN`](https://github.com/coreos/etcd/pull/8281). For instance, if peer cert contains only DNS names (no IP addresses) in Subject Alternative Name (SAN) field, server first reverse-lookups the remote IP address to get a list of names mapping to that address (e.g. `nslookup IPADDR`). Then accepts the connection if those names have a matching name with peer cert's DNS names (either by exact or wildcard match). If none is matched, server forward-lookups each DNS entry in peer cert (e.g. look up `example.default.svc` when the entry is `*.example.default.svc`), and accepts connection only when the host's resolved addresses have the matching IP address with the peer's remote IP address. For example, peer B's CSR (with `cfssl`) is:
+Since [v3.2.5](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.2.md#v325-2017-08-04), [server supports reverse-lookup on wildcard DNS `SAN`](https://github.com/coreos/etcd/pull/8281). For instance, if peer cert contains only DNS names (no IP addresses) in Subject Alternative Name (SAN) field, server first reverse-lookups the remote IP address to get a list of names mapping to that address (e.g. `nslookup IPADDR`). Then accepts the connection if those names have a matching name with peer cert's DNS names (either by exact or wildcard match). If none is matched, server forward-lookups each DNS entry in peer cert (e.g. look up `example.default.svc` when the entry is `*.example.default.svc`), and accepts connection only when the host's resolved addresses have the matching IP address with the peer's remote IP address. For example, peer B's CSR (with `cfssl`) is:
 
 
 ```json
 {
@@ -261,6 +261,66 @@ Since [v3.2.5](https://github.com/coreos/etcd/blob/master/CHANGELOG.md#v325-2017
 
 
 when peer B's remote IP address is `10.138.0.2`. When peer B tries to join the cluster, peer A reverse-looks up the IP `10.138.0.2` to get the list of host names, and matches them (exactly or by wildcard) against peer B's cert DNS names in the Subject Alternative Name (SAN) field. If neither the reverse nor the forward lookups match, it returns an error `"tls: "10.138.0.2" does not match any of DNSNames ["*.example.default.svc","*.example.default.svc.cluster.local"]`. See [issue#8268](https://github.com/coreos/etcd/issues/8268) for more detail.
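
The exact-or-wildcard comparison of reverse-lookup names against cert DNS names might look like the sketch below. The helper names are hypothetical, and two details are assumptions rather than documented etcd behavior: a leading `*.` stands for exactly one label, and matching is case-sensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// matchDNSName reports whether host matches a cert DNS name, either exactly
// or through a leading "*." wildcard covering exactly one label (assumption).
func matchDNSName(pattern, host string) bool {
	// Reverse lookups commonly return fully qualified names with a trailing dot.
	host = strings.TrimSuffix(host, ".")
	if pattern == host {
		return true
	}
	if !strings.HasPrefix(pattern, "*.") {
		return false
	}
	suffix := pattern[1:] // e.g. ".example.default.svc"
	if !strings.HasSuffix(host, suffix) {
		return false
	}
	label := strings.TrimSuffix(host, suffix)
	return label != "" && !strings.Contains(label, ".")
}

// anyNameMatches checks every reverse-lookup name against every cert DNS name.
func anyNameMatches(certDNSNames, reverseNames []string) bool {
	for _, host := range reverseNames {
		for _, pattern := range certDNSNames {
			if matchDNSName(pattern, host) {
				return true
			}
		}
	}
	return false
}

func main() {
	certNames := []string{"*.example.default.svc", "*.example.default.svc.cluster.local"}
	// Hypothetical result of a reverse lookup on the peer's remote IP.
	reverse := []string{"b.example.default.svc.cluster.local."}
	fmt.Println(anyNameMatches(certNames, reverse))
}
```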
 
 
+[v3.3.0](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.3.md) adds [`etcd --peer-cert-allowed-cn`](https://github.com/coreos/etcd/pull/8616) flag to support [CN(Common Name)-based auth for inter-peer connections](https://github.com/coreos/etcd/issues/8262). Kubernetes TLS bootstrapping involves generating dynamic certificates for etcd members and other system components (e.g. API server, kubelet, etc.). Maintaining different CAs for each component provides tighter access control to the etcd cluster but is often tedious. When the `--peer-cert-allowed-cn` flag is specified, a node can only join with a matching common name, even with shared CAs. For example, each member in a 3-node cluster is set up with CSRs (with `cfssl`) as below:
+
+```json
+{
+  "CN": "etcd.local",
+  "hosts": [
+    "m1.etcd.local",
+    "127.0.0.1",
+    "localhost"
+  ],
+```
+
+```json
+{
+  "CN": "etcd.local",
+  "hosts": [
+    "m2.etcd.local",
+    "127.0.0.1",
+    "localhost"
+  ],
+```
+
+```json
+{
+  "CN": "etcd.local",
+  "hosts": [
+    "m3.etcd.local",
+    "127.0.0.1",
+    "localhost"
+  ],
+```
+
+Then only peers with matching common names will be authenticated if `--peer-cert-allowed-cn etcd.local` is given. Nodes with different CNs in their CSRs or a different `--peer-cert-allowed-cn` will be rejected:
+
+```bash
+$ etcd --peer-cert-allowed-cn m1.etcd.local
+
+I | embed: rejected connection from "127.0.0.1:48044" (error "CommonName authentication failed", ServerName "m1.etcd.local")
+I | embed: rejected connection from "127.0.0.1:55702" (error "remote error: tls: bad certificate", ServerName "m3.etcd.local")
+```
+
+Each process should be started with:
+
+```bash
+etcd --peer-cert-allowed-cn etcd.local
+
+I | pkg/netutil: resolving m3.etcd.local:32380 to 127.0.0.1:32380
+I | pkg/netutil: resolving m2.etcd.local:22380 to 127.0.0.1:22380
+I | pkg/netutil: resolving m1.etcd.local:2380 to 127.0.0.1:2380
+I | etcdserver: published {Name:m3 ClientURLs:[https://m3.etcd.local:32379]} to cluster 9db03f09b20de32b
+I | embed: ready to serve client requests
+I | etcdserver: published {Name:m1 ClientURLs:[https://m1.etcd.local:2379]} to cluster 9db03f09b20de32b
+I | embed: ready to serve client requests
+I | etcdserver: published {Name:m2 ClientURLs:[https://m2.etcd.local:22379]} to cluster 9db03f09b20de32b
+I | embed: ready to serve client requests
+I | embed: serving client requests on 127.0.0.1:32379
+I | embed: serving client requests on 127.0.0.1:22379
+I | embed: serving client requests on 127.0.0.1:2379
+```
+
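The common-name comparison behind `--peer-cert-allowed-cn` can be illustrated with a short Go sketch. It assumes the check compares the peer certificate's `Subject.CommonName` with the configured value and that an empty value disables the check. `checkAllowedCN` and `certWithCN` are hypothetical helpers written for this example and are not taken from the etcd source; only the error text is borrowed from the log output above.

```go
package main

import (
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
)

// errCNAuthFailed reuses the message seen in the rejected-connection log.
var errCNAuthFailed = errors.New("CommonName authentication failed")

// checkAllowedCN returns nil when the peer certificate carries the allowed
// common name. Assumption: an empty allowedCN means no CN restriction.
func checkAllowedCN(allowedCN string, leaf *x509.Certificate) error {
	if allowedCN == "" {
		return nil
	}
	if leaf == nil || leaf.Subject.CommonName != allowedCN {
		return errCNAuthFailed
	}
	return nil
}

// certWithCN builds a bare certificate value that carries only a common name.
// It is sufficient for exercising the comparison, not for a real handshake.
func certWithCN(cn string) *x509.Certificate {
	return &x509.Certificate{Subject: pkix.Name{CommonName: cn}}
}

func main() {
	peer := certWithCN("etcd.local")
	// Member started with --peer-cert-allowed-cn etcd.local: accepted.
	fmt.Println(checkAllowedCN("etcd.local", peer))
	// Member started with --peer-cert-allowed-cn m1.etcd.local: rejected.
	fmt.Println(checkAllowedCN("m1.etcd.local", peer))
}
```

In a real server, a check like this would run after the TLS handshake has verified the certificate chain, for example from a `tls.Config.VerifyPeerCertificate` callback.
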
 ## Frequently asked questions
 
 ### I'm seeing a SSLv3 alert handshake failure when using TLS client authentication?

+ 7 - 7
Documentation/upgrades/upgrade_3_2.md

@@ -66,7 +66,7 @@ if err == context.DeadlineExceeded {
 
 
 #### Change in maximum request size limits (>=3.2.10)
 
 
-3.2.10 and 3.2.11 allow custom request size limits on the server side. >=3.2.12 allows custom request size limits for both the server and **client side**.
+3.2.10 and 3.2.11 allow custom request size limits on the server side. >=3.2.12 allows custom request size limits for both the server and **client side**. In previous versions (v3.2.10, v3.2.11), the client response size was limited to 4 MiB.
 
 
 Server-side request limits can be configured with `--max-request-bytes` flag:
 
 
@@ -160,12 +160,6 @@ Before and after
 +func NewWatchFromWatchClient(wc pb.WatchClient, c *Client) Watcher {
 ```
 ```
 
 
-#### Change in `--listen-peer-urls` and `--listen-client-urls`
-
-3.2 now rejects domain names for `--listen-peer-urls` and `--listen-client-urls` (3.1 only prints out warnings), since a domain name is invalid for network interface binding. Make sure that those URLs are properly formatted as `scheme://IP:port`.
-
-See [issue #6336](https://github.com/coreos/etcd/issues/6336) for more context.
-
 #### Change in `clientv3.Lease.TimeToLive` API
 
 Previously, `clientv3.Lease.TimeToLive` API returned `lease.ErrLeaseNotFound` on non-existent lease ID. 3.2 instead returns TTL=-1 in its response and no error (see [#7305](https://github.com/coreos/etcd/pull/7305)).
@@ -206,6 +200,12 @@ import clientv3yaml "github.com/coreos/etcd/clientv3/yaml"
 clientv3yaml.NewConfig
 ```
 ```
 
 
+#### Change in `--listen-peer-urls` and `--listen-client-urls`
+
+3.2 now rejects domain names for `--listen-peer-urls` and `--listen-client-urls` (3.1 only prints out warnings), since a domain name is invalid for network interface binding. Make sure that those URLs are properly formatted as `scheme://IP:port`.
+
+See [issue #6336](https://github.com/coreos/etcd/issues/6336) for more context.
+
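Before upgrading, listen URLs can be pre-checked against the `scheme://IP:port` shape with a small Go helper. The sketch below uses only the standard library. `checkListenURL` is a hypothetical name, and the rule it applies (host must parse as an IP, port must be present) is an assumption drawn from the paragraph above. etcd's own validation may treat some cases differently, such as `localhost` or unix sockets.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// checkListenURL returns an error unless raw has the form scheme://IP:port.
func checkListenURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	if u.Scheme == "" {
		return fmt.Errorf("%q has no scheme", raw)
	}
	// SplitHostPort fails when the port is missing and strips the brackets
	// around IPv6 literals.
	host, _, err := net.SplitHostPort(u.Host)
	if err != nil {
		return fmt.Errorf("%q must include a port: %v", raw, err)
	}
	if net.ParseIP(host) == nil {
		return fmt.Errorf("%q: host %q is not an IP address", raw, host)
	}
	return nil
}

func main() {
	// The domain name below is a made-up example.
	for _, raw := range []string{
		"https://10.0.0.1:2380",
		"https://[::1]:2380",
		"https://etcd.example.com:2380",
	} {
		fmt.Printf("%s -> %v\n", raw, checkListenURL(raw))
	}
}
```
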
 ### Server upgrade checklists
 
 #### Upgrade requirements

+ 4 - 14
Documentation/upgrades/upgrade_3_3.md

@@ -72,27 +72,17 @@ cfg.SetupLogging()
 
 
 Set `embed.Config.Debug` field to `true` to enable gRPC server logs.
 
 
-#### Change in `/health` endpoint response value
+#### Change in `/health` endpoint response
 
 
-Previously, `[endpoint]:[client-port]/health` returned a manually marshaled JSON value. 3.3 now defines the [`etcdhttp.Health`](https://godoc.org/github.com/coreos/etcd/etcdserver/api/etcdhttp#Health) struct and includes errors, if any.
+Previously, `[endpoint]:[client-port]/health` returned a manually marshaled JSON value. 3.3 now defines the [`etcdhttp.Health`](https://godoc.org/github.com/coreos/etcd/etcdserver/api/etcdhttp#Health) struct.
 
 
-Before
+Note that in v3.3.0-rc.0, v3.3.0-rc.1, and v3.3.0-rc.2, `etcdhttp.Health` has a boolean-typed `"health"` field and an `"errors"` field. For backward compatibility, the `"health"` field was reverted to `string` type and the `"errors"` field was removed. Further health information will be provided in separate APIs.
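
The string type matters to clients that decode the response strictly. As a minimal sketch, a Go client could decode a body like `{"health":"true"}` into a local struct and convert the string itself. `healthResponse` and `isHealthy` are illustrative names for this example, not part of the etcd API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// healthResponse is a local type for illustration; it mirrors only the
// string-typed "health" field described above.
type healthResponse struct {
	Health string `json:"health"`
}

// isHealthy decodes a /health response body and converts the string value
// ("true" or "false") to a bool.
func isHealthy(body []byte) (bool, error) {
	var h healthResponse
	if err := json.Unmarshal(body, &h); err != nil {
		return false, err
	}
	return strconv.ParseBool(h.Health)
}

func main() {
	ok, err := isHealthy([]byte(`{"health":"true"}`))
	fmt.Println(ok, err)

	// A boolean-typed value, as in the release candidates, does not decode
	// into a string field, so this call reports an error.
	_, err = isHealthy([]byte(`{"health":true}`))
	fmt.Println(err != nil)
}
```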
 
 
 ```bash
 $ curl http://localhost:2379/health
 {"health":"true"}
 ```
 
 
-After
-
-```bash
-$ curl http://localhost:2379/health
-{"health":"true"}
-
-# Or
-{"health":"false","errors":["NOSPACE"]}
-```
-
 #### Change in gRPC gateway HTTP endpoints (replaced `/v3alpha` with `/v3beta`)
 
 Before
@@ -113,7 +103,7 @@ Requests to `/v3alpha` endpoints will redirect to `/v3beta`, and `/v3alpha` will
 
 
 #### Change in maximum request size limits
 
 
-3.3 now allows custom request size limits for both server and **client side**.
+3.3 now allows custom request size limits for both server and **client side**. In previous versions (v3.2.10, v3.2.11), the client response size was limited to 4 MiB.
 
 
 Server-side request limits can be configured with `--max-request-bytes` flag: