We shouldn't fail the grpc-server (completely) by a not implemented RPC.
Failing whole server by remote request is anti-pattern and security
risk.
Refer to https://github.com/etcd-io/etcd/runs/7034342964?check_suite_focus=true#step:5:2284
```
=== RUN TestWatchRequestProgress/1-watcher
panic: not implemented
goroutine 83024 [running]:
go.etcd.io/etcd/proxy/grpcproxy.(*watchProxyStream).recvLoop(0xc009232f00, 0x4a73e1, 0xc00e2406e0)
/home/runner/work/etcd/etcd/proxy/grpcproxy/watch.go:265 +0xbf2
go.etcd.io/etcd/proxy/grpcproxy.(*watchProxy).Watch.func1(0xc0038a3bc0, 0xc009232f00)
/home/runner/work/etcd/etcd/proxy/grpcproxy/watch.go:125 +0x70
created by go.etcd.io/etcd/proxy/grpcproxy.(*watchProxy).Watch
/home/runner/work/etcd/etcd/proxy/grpcproxy/watch.go:123 +0x73b
FAIL go.etcd.io/etcd/clientv3/integration 222.813s
FAIL
```
Signed-off-by: Benjamin Wang <wachao@vmware.com>
This PR resolves an issue where the `/metrics` endpoints exposed by the proxy were not returning metrics of the etcd members servers but of the proxy itself.
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
Now returns errors from checkPermissionForWatch() via the CancelReason
field. This allows clients to understand why the watch was canceled.
Additionally, this change protects a watch from starting and that
otherwise might hang indefinitely.
Current etcd code uses the string literals ("token", "authorization")
as field names of grpc and swappger metadata for passing token. It is
difficult to maintain so this commit introduces new constants for the
purpose.
Like the previous commit 10f783efdd12, this commit lets grpcproxy
forward an auth token supplied by its client in an explicit
manner. snapshot is a stream RPC so this process is required like
watch.
This commit lets grpcproxy handle authed watch. The main changes are:
1. forwrading a token of a new broadcast client
2. checking permission of a new client that participates to an
existing broadcast
Problem Observed
----------------
When there is no etcd process behind the proxy,
clients repeat resending lease grant requests without delay.
This behavior can cause abnormal resource consumption on CPU/RAM and
network.
Problem Detail
--------------
`LeaseGrant()` uses a bare protobuf client to forward requests.
However, it doesn't use `grpc.FailFast(false)`, which means the method returns
an `Unavailable` error immediately when no etcd process is available.
In clientv3, `Unavailable` errors are not considered the "Halt" error,
and library retries the request without delay.
Both clients and the proxy consume much CPU cycles to process retry requests.
Resolution
----------
Add `grpc.FailFast(false))` to `LeaseGrant()` of the `leaseProxy`.
This makes the proxy not to return immediately when no etcd process is
available. Clients will simply timeout requests instead.