From 4c8408f92fb7fc8ca3be9fb16285dc7f1861f735 Mon Sep 17 00:00:00 2001
From: Yicheng Qin
Date: Wed, 24 Jun 2015 16:04:46 -0700
Subject: [PATCH 1/2] docs: doc metrics used in rafthttp package

---
 Documentation/metrics.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/Documentation/metrics.md b/Documentation/metrics.md
index 0889de366..fb304c8bc 100644
--- a/Documentation/metrics.md
+++ b/Documentation/metrics.md
@@ -46,3 +46,15 @@ Abnormally high fsync duration (`fsync_durations_microseconds`) indicates disk i
 | snapshot_save_total_durations_microseconds | The total latency distributions of save called by snapshot | Summary |
 
 Abnormally high snapshot duration (`snapshot_save_total_durations_microseconds`) indicates disk issues and might cause the cluster to be unstable.
+
+### rafthttp
+
+| Name                              | Description                                | Type    |
+|-----------------------------------|--------------------------------------------|---------|
+| message_sent_latency_microseconds | The latency distributions of messages sent | Summary |
+| message_sent_failed_total         | The total number of failed messages sent   | Summary |
+
+
+Abnormally high message duration (`message_sent_latency_microseconds`) indicates network issues and might cause the cluster to be unstable.
+
+An increase in message failures (`message_sent_failed_total`) indicates more severe network issues and might cause the cluster to be unstable.
From fcdd9779e97ac0a4030243552dd16b5a91b8bbaa Mon Sep 17 00:00:00 2001
From: Yicheng Qin
Date: Thu, 25 Jun 2015 11:50:53 -0700
Subject: [PATCH 2/2] docs: explain label in rafthttp metrics

---
 Documentation/metrics.md | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/Documentation/metrics.md b/Documentation/metrics.md
index fb304c8bc..31c2aab42 100644
--- a/Documentation/metrics.md
+++ b/Documentation/metrics.md
@@ -49,12 +49,18 @@ Abnormally high snapshot duration (`snapshot_save_total_durations_microseconds`)
 
 ### rafthttp
 
-| Name                              | Description                                | Type    |
-|-----------------------------------|--------------------------------------------|---------|
-| message_sent_latency_microseconds | The latency distributions of messages sent | Summary |
-| message_sent_failed_total         | The total number of failed messages sent   | Summary |
+| Name                              | Description                                | Type    | Labels                     |
+|-----------------------------------|--------------------------------------------|---------|----------------------------|
+| message_sent_latency_microseconds | The latency distributions of messages sent | Summary | channel, msgType, remoteID |
+| message_sent_failed_total         | The total number of failed messages sent   | Summary | channel, msgType, remoteID |
 
 
 Abnormally high message duration (`message_sent_latency_microseconds`) indicates network issues and might cause the cluster to be unstable.
 
 An increase in message failures (`message_sent_failed_total`) indicates more severe network issues and might cause the cluster to be unstable.
+
+Label `channel` is the channel to send message. `message`, `msgapp` and `msgappv2` channels use HTTP streaming, while `pipeline` does HTTP request for each message.
+
+Label `msgType` is the type of raft message. `MsgApp` is log replication message; `MsgSnap` is snapshot install message; `MsgProp` is proposal forward message; the others are used to maintain raft internal status.
If you have a large snapshot, you would expect a long `MsgSnap` sending latency. For other types of messages, you would expect low latency, which is comparable to your ping latency if you have enough network bandwidth.
+
+Label `remoteID` is the member ID of the message destination.