Gyuho Lee
b6d11019e0
rafthttp: probe all raft transports
...
This PR adds another probing routine to monitor the connection
for Raft message transports. Previously, we only monitored
snapshot transports.
In our production cluster, we found one TCP connection had >8-sec
latencies to a remote peer, but the "etcd_network_peer_round_trip_time_seconds"
metric showed a <1-sec latency distribution, which means the etcd server
was not sampling often enough when such latency spikes happened
outside of the snapshot pipeline connection.
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 18:17:16 -07:00
Wenjia
9452e5c1e5
Merge pull request #10042 from wenjiaswe/automated-cherry-pick-of-#9997-upstream-release-3.2
...
Automated cherry pick of #9997
2018-09-04 12:54:58 -07:00
Gyuho Lee
e2dfe0f5d9
etcdserver/api/rafthttp: add v3 snapshot send/receive metrics
...
Distribution would be:
0.1 second or more
...
25.6 seconds or more
51.2 seconds or more
etcd_network_snapshot_send_success
etcd_network_snapshot_send_failures
etcd_network_snapshot_send_total_duration_seconds
etcd_network_snapshot_receive_success
etcd_network_snapshot_receive_failures
etcd_network_snapshot_receive_total_duration_seconds
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:51:31 -07:00
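The bucket boundaries listed in the commit above (0.1 second up to 51.2 seconds) are consistent with a series that doubles at each step. A minimal sketch of how such boundaries could be generated follows. The helper name exponentialBuckets, the factor of 2, and the count of 10 are assumptions inferred from the "25.6" and "51.2" values, not taken from the commit itself.

```go
package main

import "fmt"

// exponentialBuckets is a hypothetical helper: it returns count boundaries,
// starting at start and multiplying by factor at each step.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

func main() {
	// Assumed parameters: start at 0.1s, double each time, 10 buckets.
	for _, b := range exponentialBuckets(0.1, 2, 10) {
		fmt.Printf("%.1f second(s) or more\n", b)
	}
}
```

With these assumed parameters the last two boundaries come out as 25.6 and 51.2, matching the values quoted in the commit message.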
Gyuho Lee
7f337ef13a
rafthttp: clarify "became inactive" warning
...
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:33:46 -07:00
Joe Betz
29185da0e0
Merge pull request #9502 from jpbetz/automated-cherry-pick-of-#9415-release-3.2
...
Automated cherry pick of #9415
2018-03-28 11:21:36 -07:00
Gyuho Lee
0a4560319f
rafthttp: add missing "peer_sent_failures_total" metrics call
...
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 10:09:50 -07:00
Gyuho Lee
431fd391da
rafthttp: add "ActivePeers" to "Transport"
...
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 10:06:28 -07:00
Anthony Romano
963339d265
rafthttp: permit very large v2 snapshots
...
v2 snapshots were hitting the 512MB message decode limit, so sending
snapshots to new members failed because they were too big.
2017-06-09 10:49:51 -07:00
Anthony Romano
9169ad0d7d
*: fix go tool vet -all -shadow errors
2017-06-06 09:47:06 -07:00
Anthony Romano
1153e1e7d9
Merge pull request #7687 from heyitsanthony/deny-tls-ipsan
...
transport: deny incoming peer certs with wrong IP SAN
2017-04-13 15:03:25 -07:00
Gyu-Ho Lee
56b111df0c
rafthttp: use 'transport.IsClosedConnError'
...
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 11:55:22 -07:00
Anthony Romano
cad1215b18
*: deny incoming peer certs with wrong IP SAN
2017-04-12 13:41:33 -07:00
Gyu-Ho Lee
8db8d01712
rafthttp: move test-only functions to '_test.go'
...
Not used in actual code base, only used in tests
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-10 16:07:31 -07:00
Gyu-Ho Lee
3d75395875
*: remove never-used vars, minor lint fix
...
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:59:12 -08:00
sharat
2656b594bb
rafthttp: use http.Request.WithContext instead of Cancel
2017-02-02 02:30:36 +05:30
Gyu-Ho Lee
fa9a78450c
rafthttp: add 3.2.0 stream type
2017-01-13 14:23:15 -08:00
Gyu-Ho Lee
d25f9feb19
rafthttp: bump up timeout in pipeline test
...
Fix https://github.com/coreos/etcd/issues/6283 .
The timeout is too short. It could take more than 10ms
to send when the buffer gets full after 'pipelineBufSize'
requests.
2016-12-30 09:46:16 -08:00
Gyu-Ho Lee
0626ee048e
rafthttp: fix gofmt issues with go tip
2016-10-20 16:32:56 -07:00
Gyu-Ho Lee
8827619f5b
rafthttp: add v3.x to supported streams
2016-09-16 20:49:00 +09:00
Xiang Li
fb760b4c53
Merge pull request #6403 from vimalk78/rafthttp-mertics-record-rw-failures
...
rafthttp/metrics.go:fixed TODO: record write/recv failures.
2016-09-15 02:46:20 -05:00
Vimal Kumar
64e1a327ee
rafthttp/metrics.go:fixed TODO: record write/recv failures.
2016-09-15 11:32:08 +05:30
Xiang Li
0d35ba9b94
rafthttp: fix TestPipelineExceedMaximumServing
...
The timeout is too short. It might take more than 10ms to send a
request over a blocking chan (buffer is full). Changing the timeout
to 1 second fixes this issue.
2016-09-13 19:06:11 +08:00
Anthony Romano
0250f0c984
rafthttp: log stream stopped message before closing channel
...
Was causing spurious goroutine leak failures in testing.
2016-09-09 12:47:06 -07:00
Anthony Romano
96ed856bca
Merge pull request #6345 from topecongiro/patch-1
...
rafthttp: remove unnecessary sendc from peer
2016-09-06 11:32:16 -07:00
Nikita Vetoshkin
da26e230a0
rafthttp: fix misprint in readBytesLimit value
...
and make test pass in restricted test environments
2016-09-05 11:06:08 +05:00
Gyu-Ho Lee
5c8ba23767
rafthttp: check decode size before buffer alloc
...
Fix https://github.com/coreos/etcd/issues/5386 .
2016-09-05 14:06:03 +09:00
topecongiro
ec9e77db96
rafthttp: remove unnecessary sendc from peer
2016-09-04 13:07:31 +09:00
Anthony Romano
784c4446d9
rafthttp: fix race in TestStreamWriterAttachOutgoingConn
...
Fixes #6230
2016-08-19 19:59:16 -07:00
Anthony Romano
da1e022890
rafthttp: remove WaitSchedule() from tests
...
Fixes #6187
2016-08-18 16:26:35 -07:00
Gyu-Ho Lee
bd450c1ba3
rafthttp: use reportCriticalError, fix typo
2016-08-15 10:40:58 -07:00
Anthony Romano
9eb6ea34bd
Merge pull request #6175 from heyitsanthony/fix-conn-race
...
rafthttp: fix race between streamReader.stop() and connection closer
2016-08-15 09:27:24 -07:00
Anthony Romano
911c8442b7
rafthttp: fix race between streamReader.stop() and connection closer
2016-08-15 01:36:09 -07:00
Gyu-Ho Lee
0503676bde
rafthttp: fix httputil.RequestCanceler
2016-08-14 14:36:51 -07:00
Gyu-Ho Lee
937ae658dd
rafthttp: add Transport.Cut/MendPeer
...
From https://github.com/coreos/etcd/pull/6140 .
2016-08-10 17:09:35 -07:00
Anthony Romano
59ac42ff38
Merge pull request #6073 from heyitsanthony/rafthttp-close-stream
...
rafthttp: close http socket when pipeline handler gets a raft error
2016-07-31 21:49:04 -07:00
Anthony Romano
911dcc9386
rafthttp: close http socket when pipeline handler gets a raft error
...
Otherwise the http stream remains open and keeps receiving raft messages.
This can lead to "raft: stopped" log spam on closing an embedded server.
Fixes #5981
2016-07-31 20:25:42 -07:00
Xiang Li
9311d7b77e
rafthttp: log health checking error early
2016-07-31 19:58:22 -07:00
Anthony Romano
3a080143a7
rafthttp: make health check meaning clearer
2016-07-06 10:31:13 -07:00
Nikita Vetoshkin
fd5bc21522
rafthttp: use pointers to avoid extra copies upon message encoding
2016-06-29 21:17:18 +05:00
Gyu-Ho Lee
e221699fd8
rafthttp: fix from go vet, go lint
2016-06-22 12:04:15 -07:00
Xiang Li
6af0917812
*: add peer prefix for network metrics between peers
2016-06-17 11:59:49 -07:00
Anthony Romano
dc91da50b5
rafthttp: snapshot tests
2016-06-06 11:38:11 -07:00
Xiang Li
5183631f17
rafthttp: report error to correct chan
2016-06-03 09:18:02 -07:00
Xiang Li
a047aa4a81
rafthttp: rename 'to' to 'peerID'
2016-06-01 22:12:47 -07:00
Xiang Li
c25c00fcf9
rafthttp: simplify initialization funcs
2016-06-01 21:47:46 -07:00
Xiang Li
8528c8c599
*: more logging on critical state change
...
Add more logging to make debugging easier.
2016-05-31 23:31:03 -07:00
Xiang Li
86269ab5bf
rafthttp: simplify streamReader initialization
2016-05-31 12:13:37 -07:00
Xiang Li
ba68d7bbe6
rafthttp: make newRemote simpler
2016-05-30 16:24:26 -07:00
Xiang Li
efe0ee7e59
rafthttp: remove the newPipeline func
...
Using a struct to initialize the pipeline is better when we have many
fields to fill in.
2016-05-30 16:19:50 -07:00
Gyu-Ho Lee
c9264c5e65
rafthttp: replace append with pre-allocated slice
2016-05-20 15:20:55 -07:00