This is for coreos#3859, switching from a slice to a map for synced
watchings. With a large number of synced watchings, the map
implementation performs better.
When putting 1 million watchers on the same key and canceling them one
by one, the original implementation takes 9m7.268221091s, while the
map-based one takes only 430.531637ms.
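A minimal sketch of why the map wins on cancel, using made-up types
rather than the actual watchableStore code: removing from a slice
needs a linear scan, while deleting from a map is constant time.

    package sketch

    type watcher struct{ id int64 }

    // cancelSlice must scan all synced watchers to find w: O(n) per
    // cancel, so canceling n watchers one by one is O(n^2) overall.
    func cancelSlice(synced []*watcher, w *watcher) []*watcher {
        for i := range synced {
            if synced[i] == w {
                return append(synced[:i], synced[i+1:]...)
            }
        }
        return synced
    }

    // cancelMap deletes the watcher directly: O(1) per cancel.
    func cancelMap(synced map[*watcher]struct{}, w *watcher) {
        delete(synced, w)
    }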
When a watch stream closes, both watcher.Chan and closec
will be closed.
If watcher.Chan is closed, we should not send out the empty event.
Sending the empty event is wrong and wastes a lot of CPU resources.
Instead, we should just return.
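A sketch of the intended receive loop, with illustrative names (Event,
closec, and send are stand-ins for the real stream fields): a receive
from the closed channel reports ok == false, and we return instead of
forwarding the zero-value event.

    package sketch

    type Event struct{ Key, Value []byte } // placeholder event type

    func sendLoop(ch <-chan Event, closec <-chan struct{}, send func(Event)) {
        for {
            select {
            case ev, ok := <-ch:
                if !ok {
                    // ch was closed together with the stream; returning
                    // here avoids spinning and sending empty events.
                    return
                }
                send(ev)
            case <-closec:
                return
            }
        }
    }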
Based on the configuration doc, it seems these two flags are missing
from the help output. Add them, with descriptions taken from config.go
in the same directory.
Signed-off-by: Yiqiao Pu <ypu@redhat.com>
The point is to clearly decouple the key-value storage layer and the
event notification layer. It gives the watchableKV the
flexibility to define whatever event structure it wants without
breaking the on-disk format at the key-value storage layer.
Changes:
1. Change the format of the key and value stored in the backend:
   store the KeyValue struct instead of the Event struct in the backend
   value for better abstraction, as xiang suggests, and record the
   corresponding action in the backend key (see the sketch after this
   list).
2. Remove the word 'event' from function names.
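A rough sketch of the first change, with an illustrative encoding and
names (the exact layout is up to the storage package): the backend
value carries only the marshaled KeyValue, and the action is a marker
on the backend key.

    package sketch

    import "encoding/binary"

    const markTombstone byte = 't' // marks a delete action on the key

    // backendKey encodes the revision plus an action marker; the value
    // stored under it is just the marshaled KeyValue, with no event
    // wrapper, so the on-disk format stays independent of the event
    // structure.
    func backendKey(rev int64, deleted bool) []byte {
        key := make([]byte, 8, 9)
        binary.BigEndian.PutUint64(key, uint64(rev))
        if deleted {
            key = append(key, markTombstone)
        }
        return key
    }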
State outright that etcd is used in production and is ready for more of
the same.
Supersedes #3884.
Adopt #3884 in spirit, but directly in README as jonboulle suggested.
Delete Documentation/production-ready.md.
We should wrap the blocking function in a closure and first create a
goroutine to execute the function. Otherwise, the inner function blocks
before the goroutine is created.
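A made-up example of the pattern (the names are not from the actual
code): the blocking call has to happen inside the goroutine, not before
it starts.

    package sketch

    // startBad evaluates blockingWork before any goroutine exists, so
    // the caller is stuck until it returns.
    func startBad(blockingWork func() error, report func(error)) {
        err := blockingWork()
        go report(err)
    }

    // startGood wraps the blocking call in a closure and creates the
    // goroutine first, so the caller returns immediately.
    func startGood(blockingWork func() error, report func(error)) {
        go func() {
            report(blockingWork())
        }()
    }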
We should open a real txn for applying txn requests. Otherwise, the
intermediate state might be observed by readers.
This also fixes #3803. Applying the same consistent (raft) index across
multiple independent operations confuses consistentStore.
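A hedged sketch of the idea against an illustrative interface (the real
KV txn API may differ): every operation of one txn request runs inside
a single storage txn, so readers never observe a half-applied request
and the consistent index is recorded once per request.

    package sketch

    // TxnKV is an illustrative stand-in for the storage txn API.
    type TxnKV interface {
        TxnBegin() int64
        TxnEnd(txnID int64) error
        TxnPut(txnID int64, key, val []byte) error
    }

    // applyTxn applies all puts of one txn request under a single
    // storage txn.
    func applyTxn(kv TxnKV, puts map[string][]byte) error {
        id := kv.TxnBegin()
        defer kv.TxnEnd(id)
        for k, v := range puts {
            if err := kv.TxnPut(id, []byte(k), v); err != nil {
                return err
            }
        }
        return nil
    }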
We need to be able to force an election (on one node) after creating a
new group (cockroachdb/cockroach#1384), but it is difficult to ensure
that our call to Campaign does not race with an election that may be
started by raft itself. A redundant call to Campaign should be a no-op
instead of a panic. (But the panic in becomeCandidate remains, because
we don't want to update the term or change the committed index in this
case.)
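A minimal sketch of the intended behavior with stand-in types (not the
raft package's actual fields): a Campaign on a node that is already
leader simply returns.

    package sketch

    type stateType int

    const stateLeader stateType = iota

    type node struct{ state stateType }

    func (n *node) campaign() { /* start an election */ }

    // Campaign is a no-op when the node is already leader, instead of
    // panicking; the panic in becomeCandidate itself stays in place.
    func (n *node) Campaign() {
        if n.state == stateLeader {
            return
        }
        n.campaign()
    }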
Rather than copying in .proto files, use the same symlink
trick we use for the actual etcd build.
Also check for the exact version of protoc early on.
Extend the timeout from 1s to defaultRequestTimeout (5s).
The 1s timeout may put an unwanted burden on the target member. If the
member is busy recovering, it has limited bandwidth for client requests.
A short timeout on the client side retries quickly while keeping the
ongoing connections open. Thus, etcd queues lots of requests and
connections and takes a long time to clear them, which finally causes
the member health check to time out.
This is part of the more general problem of how etcd should handle a
large number of concurrent requests gracefully. We don't plan to
address it at the current stage.
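A small sketch of the client side, assuming the standard context
package (the surrounding client code is omitted): each request is
bounded by the 5s defaultRequestTimeout instead of 1s.

    package sketch

    import (
        "context"
        "time"
    )

    const defaultRequestTimeout = 5 * time.Second

    // doRequest bounds one client request by defaultRequestTimeout so a
    // busy, recovering member is not hammered by rapid 1s retries.
    func doRequest(do func(ctx context.Context) error) error {
        ctx, cancel := context.WithTimeout(context.Background(), defaultRequestTimeout)
        defer cancel()
        return do(ctx)
    }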
This adds build constraints in order to pass memory-map flags to
bolt.Options. If the backend package passes the syscall.MAP_POPULATE
flag, boltdb does read-ahead, which speeds up reading the entire
database and therefore makes storage Restore faster. Benchmark results
show that a 4GB database opens and loads 6x faster on SSD and 12x
faster on HDD, and a 2GB database loads 1.6x faster on SSD with
MAP_POPULATE.
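A sketch of how the flag can be passed through a build-constrained
file, assuming bolt.Options' MmapFlags field (the file name is
illustrative):

    // boltoption_linux.go
    // +build linux

    package backend

    import (
        "syscall"

        "github.com/boltdb/bolt"
    )

    // MAP_POPULATE asks the kernel to read ahead the mmap'd file, which
    // speeds up reading the whole database during Restore.
    var boltOpenOptions = &bolt.Options{MmapFlags: syscall.MAP_POPULATE}

A companion file with the opposite constraint (// +build !linux) can
leave the options nil so non-Linux builds fall back to bolt's defaults.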
The primary goal of this doc is to confirm that the memory
consumption of watch is as expected. Each connection
consumes O(10kb) of memory. Each stream consumes O(10kb)
of memory. Each watching consumes < O(1kb) of memory.
So when there is a large number of watchings with a small
number of connections and streams, the average memory
consumption per watching will be O(1kb).
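As an illustrative back-of-the-envelope check (the counts below are
made up; only the per-object figures come from the doc):

    100 connections     * O(10kb) ≈ 1MB
    1,000 streams       * O(10kb) ≈ 10MB
    1,000,000 watchings * O(1kb)  ≈ 1GB

The per-watching term dominates, so the average cost per watching
stays around O(1kb).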