16308 Commits

Author SHA1 Message Date
Joe Betz
76e769ce95
Merge pull request #12273 from ptabor/2020-09-07-fix-grpc-proxy-tests
testing/e2e,grpcproxy: Fix: go test --tags "cluster_proxy" -v ./tests/e2e/...
2020-09-09 12:03:09 -07:00
Joe Betz
e81cae77a3
Merge pull request #12274 from ptabor/20200907-fix-cov-e2e-tests
tests/e2e,etcdctl,etcdmain: Fix go test --tags cov -v ./tests/e2e
2020-09-09 11:14:18 -07:00
Piotr Tabor
9d5a840942 etcdmain/grpc_proxy: Remove superfluous logging line. 2020-09-09 20:04:25 +02:00
Piotr Tabor
7c880e5263 etcdctl: Rename Start / StartWithErrors to MustStart 2020-09-09 19:32:50 +02:00
Joe Betz
10fa9614e1
Merge pull request #12271 from jingyih/add_watch_notify_interval_flag_in_testing
integration: add WatchProgressNotifyInterval in integration test
2020-09-09 08:56:48 -07:00
Piotr Tabor
c32180d772 tests/e2e,etcdctl,etcdmain: Fix go test --tags cov -v ./tests/e2e
This CL fixes:
  COVERDIR=./coverage PASSES="build_cov" && go test --tags cov -v ./tests/e2e
and is part of the effort to make:
  COVERDIR=coverage PASSES="build_cov cov" ./test
fully pass.

The args passed to the ./bin/etcd_test and ./bin/etcdctl_test binaries were
mismatched. The protocol of passing the arguments using
environment variables has been replaced with proper passing of flags.

How the measurement of coverage by e2e tests works:
  1. COVERDIR=./coverage PASSES="build_cov" generates the
./bin/etcd_test and ./bin/etcdctl_test binaries.

  2. These binaries are tests (as coverage can be computed only for
tests) [see ./main_test.go ./etcdctl/main_test.go], but these tests
run the main logic of the server and upon termination (or SIGTERM
signal) write proper .coverprofile files in the $COVERDIR folder.
The binaries used to take arguments using env variables, but it's not
needed any longer. The binaries can consume any command line arguments
that either the test (so --test.fooo) or the original binary can consume.

  3. The tests/e2e (when compiled with the --tags cov) start the
_test binaries instead of the original binaries, so that the coverage
is collected.
2020-09-09 12:56:15 +02:00
jingyih
73817b53fd integration: add flag WatchProgressNotifyInterval in integration test 2020-09-07 08:32:54 -07:00
Piotr Tabor
093282f5ea tests/e2e: cluster_proxy tests use CN-less cert for etcd-server auth.
Change tests/e2e to use proper (client-nocn.crt) certificate when
running in tags="cluster_proxy" mode.

Thanks to this change (and the previous ones in this PR), the following test run
finally succeeds:
  ./build && go test --tags "cluster_proxy" -v ./tests/e2e/...
2020-09-07 12:55:01 +02:00
Piotr Tabor
2d0ce9de3d etcdmain: grpc-proxy should only require CN-less certificates for --cert flags.
We have the following communication schema:
client --- 1 ---> grpc-proxy --- 2 ---> etcd-server

There are 2 sets of flags/certs in grpc proxy [ https://github.com/etcd-io/etcd/blob/master/etcdmain/grpc_proxy.go#L140 ]:
 A. (cert-file, key-file, trusted-ca-file, auto-tls) - these control [1], i.e. the client-to-proxy connection, and in particular they describe the proxy's public identity.
 B. (cert, key, cacert) - these control [2], i.e. the identity that the proxy uses to make connections to the etcd-server.

If 2 (B.) contains a certificate with a CN and etcd-server is running with --client-cert-auth=true, the CN can be used as the identity of the 'client' from the service's perspective. This is a permission escalation that we should forbid.

If 1 (A.) contains a certificate with a CN, it should be considered perfectly valid. The server can (should) have a full identity.

So only the --cert flag (and not the --cert-file flag) should be validated for an empty CN.
2020-09-07 11:59:28 +02:00
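The rule in the commit above (the certificate behind --cert must have an empty CN) can be illustrated with Go's standard crypto/x509 package. This is a sketch under assumptions: checkEmptyCN and selfSigned are hypothetical names, the throwaway certificates are generated in memory only to exercise the check, and none of this is etcd's actual validation code.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// checkEmptyCN is a hypothetical validator: it rejects a certificate
// whose subject carries a non-empty Common Name.
func checkEmptyCN(cert *x509.Certificate) error {
	if cert.Subject.CommonName != "" {
		return errors.New("certificate has non-empty CN: " + cert.Subject.CommonName)
	}
	return nil
}

// selfSigned builds a throwaway self-signed certificate with the given CN
// (pass "" for a CN-less certificate). For demonstration only.
func selfSigned(cn string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		DNSNames:     []string{"localhost"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	for _, cn := range []string{"", "etcd-client"} {
		cert, err := selfSigned(cn)
		if err != nil {
			fmt.Println("generate:", err)
			return
		}
		fmt.Printf("CN=%q -> %v\n", cn, checkEmptyCN(cert))
	}
}
```

Following the commit's reasoning, such a check would apply only to the proxy-to-server certificate (set B), not to the proxy's own serving certificate (set A).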
Piotr Tabor
2c93127c7b integration: Regenerate certificates, add client-nocn.crt
Executed:
(cd ./integration/fixtures && ./gencerts.sh)

In particular, this created a new client-nocn.crt (and key) that can be
used for testing grpc-proxy -> etcd-server connections.
2020-09-07 11:48:38 +02:00
Piotr Tabor
966e8cecf0 integration: gencerts.sh cleanup and supports no-CN certs
integration/fixtures/gencerts.sh:
  - refactored common logic to a helper function
  - added definition for client-nocn certificate
    (used for grpc-proxy -> etcd-server communication).
2020-09-07 11:47:24 +02:00
Ben Cox
c20cc05fc5
mvcc: Export a "Last DB compaction" timestamp metric (#12176)
This is to aid with debugging the effectiveness of systems that
manually take care of cluster compaction, and to have greater visibility
into recent compactions.

It can be handy to alert on exactly how long it has been since a
compaction happened (and also to put that on dashboards).

---

Tested using a test cluster, the final result looks like this:

```
	root@etcd-1:~# ETCDCTL_API=3 /tmp/test-etcd/etcdctl --endpoints=192.168.232.10:2379 compact 1012
	compacted revision 1012

	root@etcd-1:~# curl -s 192.168.232.10:2379/metrics | grep last
	# HELP etcd_debugging_mvcc_db_compaction_last The unix time since the last db compaction.  Resets to 0 on start.
	# TYPE etcd_debugging_mvcc_db_compaction_last gauge
	etcd_debugging_mvcc_db_compaction_last 1.595873939e+09
```
2020-08-26 16:27:10 -07:00
Gyuho Lee
facd0c9460
Merge pull request #12252 from spzala/changelogfileperm34and33
CHANGELOG: file perm updates in 3.4 and 3.3
2020-08-24 12:30:21 -07:00
Sahdev P. Zala
1c0d73d248 CHANGELOG: file perm updates in 3.4 and 3.3
Updates for https://github.com/etcd-io/etcd/pull/12250 and
https://github.com/etcd-io/etcd/pull/12251
2020-08-24 11:58:19 -04:00
Sahdev Zala
ae66916226
pkg: file stat warning (#12242)
Provide warning and doc instead of enforcing file permission.
2020-08-23 17:20:16 -07:00
gaurav
c199d3d8c3
server: use buffered channel to avoid goroutine leak (#11941)
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
2020-08-21 18:46:28 -07:00
Sam Batschelet
261aa31dc5
Merge pull request #12243 from hexfusion/bump-x/text
vendor: bump golang.org/x/text
2020-08-21 10:03:01 -04:00
Sam Batschelet
100c443a10 vendor: bump golang.org/x/text
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2020-08-21 09:14:40 -04:00
Gyuho Lee
76993f1fc6
Merge pull request #12240 from avorima/fdusage_readdirnames
etcdserver: Use Readdirnames to count fds for FDUsage
2020-08-20 21:58:18 -07:00
Brandon Philips
ab9c14f477
Merge pull request #12241 from philips/add-asset-transparency-github-action
github: workflows: add asset-transparency release action
2020-08-20 14:03:19 -07:00
Brandon Philips
142358c13d github: workflows: add asset-transparency release action
From etcd-dev discussion:
https://groups.google.com/u/2/g/etcd-dev/c/oMGSBqs_7sc

I have been working on this system called Asset Transparency[1] which
helps users verify they have received the correct contents from a URL.
Are you familiar with the "download a file, download a SHA256SUM
file, run `sha256sum -c`, etc" process? This tool helps to automate
that for users into something like this[2]:

$ tl get https://github.com/etcd-io/etcd/releases/download/v3.4.12/etcd-v3.4.12-darwin-amd64.zip

And a best practice for this Asset Transparency system is that URLs
are registered with the log as soon as possible. Why? Well, the sooner
a URL is entered, the longer it can protect people consuming a URL from
unexpected content modification from, say, a GitHub credential
compromise.

To that end I have written a GitHub Action[3] that will automatically
do that on every release. It is easy to activate and should be hands
free after installation. So, before I enable it I want to see if there
are any concerns from maintainers. The only change to our repo will be
a new file in .github/workflows.

[1] https://www.transparencylog.com
[2] https://github.com/transparencylog/tl
[3] https://github.com/transparencylog/publish-releases-asset-transparency-action
2020-08-20 11:32:36 -07:00
Mario Valderrama
be70400fb5 etcdserver: Use Readdirnames to count fds for FDUsage
Readdir already calls Readdirnames, but continues to allocate
os.FileInfo with Lstat for each result.
2020-08-20 16:51:29 +02:00
Gyuho Lee
4b6a0eea49 CHANGELOG: update with server panic fix
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-19 20:02:52 -07:00
Gyuho Lee
44dea5df03
Merge pull request #12238 from liggitt/slow-v2-panic
etcdserver: Avoid panics logging slow v2 requests in integration tests
2020-08-19 09:20:47 -07:00
Jordan Liggitt
ad57fea4c5 etcdserver: Avoid panics logging slow v2 requests in integration tests 2020-08-19 11:30:15 -04:00
Gyuho Lee
cfdc296a3c CHANGELOG: update release-3.2 dates
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-18 09:43:45 -07:00
Gyuho Lee
edaac6e2a9 CHANGELOG: add v3.3.24 release dates
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-18 09:30:27 -07:00
Gyuho Lee
e37b28bd28 CHANGELOG: add v3.4.11
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-18 09:23:24 -07:00
OG
0526f461e1
Doc: Update curl command to fix 400 Bad Request (#11911) 2020-08-16 16:12:39 -07:00
Gyuho Lee
0e4ba37b6c
Merge pull request #12193 from mitake/integration_test
test: avoid non existing package for integration test
2020-08-15 23:26:34 -07:00
Gyuho Lee
d35933c351
Merge pull request #12221 from wenjiaswe/changelog_12215
CHANGELOG: update from 12215
2020-08-14 13:29:40 -07:00
Wenjia Zhang
32982ef469 CHANGELOG: update from 12215
Change-Id: I17e076554a56c95fabd95af111eccd8d7409966b
2020-08-14 13:18:02 -07:00
Gyuho Lee
92f9e6eba2
Merge pull request #12216 from jingyih/experimental_flag_for_watch_notify_interval
*: add experimental flag for watch notify interval
2020-08-14 12:47:43 -07:00
jingyih
799b16c2d1 CHANGELOG: update for PR12216 2020-08-14 12:06:38 -07:00
jingyih
9a698476bf *: add experimental flag for watch notify interval 2020-08-14 12:01:00 -07:00
Gyuho Lee
06f89cc4f8
Merge pull request #12212 from gyuho/logger
*: upgrade zap logger to 1.15, replace global logger
2020-08-13 09:46:44 -07:00
Gyuho Lee
93cf449205
Merge pull request #12214 from gyuho/fd
*: optimize runtime.FDUsage + add OS level FD metrics
2020-08-12 18:37:05 -07:00
Gyuho Lee
5678779665 CHANGELOG: update
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-12 10:32:27 -07:00
Gyuho Lee
421df2ecbb etcdserver: add OS level FD metrics
Similar counts are exposed via Prometheus.
This adds the ones that are perceived by the etcd server.

e.g.

os_fd_limit 120000
os_fd_used 14
process_cpu_seconds_total 0.31
process_max_fds 120000
process_open_fds 17

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-12 10:32:27 -07:00
Gyuho Lee
53fdcdc5a2 pkg/runtime: optimize FDUsage by removing sort
No need to sort when we just want the counts.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-12 10:32:24 -07:00
Gyuho Lee
d8ed233791 CHANGELOG: update
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-12 09:50:00 -07:00
Gyuho Lee
7eac6bd497 *: upgrade zap logger to 1.15, replace global logger
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-12 09:50:00 -07:00
Gyuho Lee
ed27d9d2de
Merge pull request #12198 from ptabor/20200803-int-to-string-test-fix
etcdserver, wal: Fix tests unintended CASTing of int->String.
2020-08-11 21:35:20 -07:00
Gyuho Lee
8c44d25f2a
Merge pull request #12211 from tangcong/ignore-errcompacted
etcdserver: ignore ErrCompacted error
2020-08-11 21:34:38 -07:00
Gyuho Lee
fe36be2251
Merge pull request #12195 from tangcong/optimize-healthcheck
*: check health by using v3 range request and its corresponding timeout
2020-08-11 21:32:44 -07:00
tangcong
8a4c7751d8 CHANGELOG: update for 12195 2020-08-12 08:10:13 +08:00
Gyuho Lee
18adf55c92
Merge pull request #12199 from ptabor/20200803-expect-replace-fix
tests/e2e: Update github.com/creack/pty v1.1.7 -> v1.1.11
2020-08-11 11:59:12 -07:00
Gyuho Lee
844091dda3
Merge pull request #12206 from ptabor/20200807-setLoggingDataRace
integration: Fix flakes due to .setupLogging race.
2020-08-11 11:58:47 -07:00
tangcong
afa0e8196c etcdserver: ignore mvcc.ErrCompacted error 2020-08-11 23:45:20 +08:00
Jingyi Hu
cd25d6c06e
Merge pull request #12130 from ptabor/master
functional/tester: Update cluster_test.go to reflect functional.yaml
2020-08-09 02:14:31 +08:00