mirror of
https://github.com/bigchaindb/bigchaindb.git
synced 2024-10-13 13:34:05 +00:00
Deleted most Topic Guides pages
parent
73ed9c4f75
commit
f529e40385
@ -1,20 +0,0 @@
# How BigchainDB is Good for Asset Registrations & Transfers

BigchainDB can store data of any kind (within reason), but it's designed to be particularly good for storing asset registrations and transfers:

* The fundamental thing that one submits to a BigchainDB federation to be checked and stored (if valid) is a _transaction_, and there are two kinds: creation transactions and transfer transactions.
* A creation transaction can be used to register any kind of indivisible asset, along with arbitrary metadata.
* An asset can have zero, one, or several owners.
* The owners of an asset can specify (crypto-)conditions which must be satisfied by anyone wishing to transfer the asset to new owners. For example, a condition might be that at least 3 of the 5 current owners must cryptographically sign a transfer transaction.
* BigchainDB verifies that the conditions have been satisfied as part of checking the validity of transfer transactions. (Moreover, anyone can check that they were satisfied.)
* BigchainDB prevents double-spending of an asset.
* Validated transactions are strongly tamper-resistant; see [the section about immutability / tamper-resistance](immutable.html).

You can read more about the details of BigchainDB transactions in [the section on Transaction, Block and Vote Models (data models)](models.html).
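
As a rough illustration of the "3 of the 5 current owners" example, the sketch below counts how many distinct current owners have signed a transfer. It is not BigchainDB's implementation: the function name `threshold_satisfied` and the placeholder owner names are hypothetical, and the sketch assumes that each signature has already been cryptographically verified elsewhere.

```python
def threshold_satisfied(current_owners, verified_signers, threshold):
    """Return True if at least `threshold` distinct current owners signed.

    Assumes every entry in `verified_signers` has already passed signature
    verification; this sketch only does the counting.
    """
    valid = set(current_owners) & set(verified_signers)
    return len(valid) >= threshold


# Hypothetical placeholder names standing in for owners' public keys.
owners = ["alice", "bob", "carol", "dave", "erin"]

# Three distinct owners signed, so a 3-of-5 condition is met.
print(threshold_satisfied(owners, ["alice", "carol", "erin"], 3))

# A non-owner and a repeated signer do not count towards the threshold.
print(threshold_satisfied(owners, ["alice", "mallory", "alice"], 3))
```

Using sets means that a duplicate signature or a signature from a non-owner cannot inflate the count.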

## BigchainDB Integration with Other Blockchains

BigchainDB works with the [Interledger protocol](https://interledger.org/), enabling the transfer of assets between BigchainDB and other blockchains, ledgers, and payment systems.

We’re actively exploring ways that BigchainDB can be used with other blockchains and platforms.
@ -1,19 +0,0 @@
# BigchainDB and Byzantine Fault Tolerance

We have Byzantine fault tolerance (BFT) in our roadmap, as a switch that people can turn on. We anticipate that turning it on will cause a severe dropoff in performance (to gain some extra security). See [Issue #293](https://github.com/bigchaindb/bigchaindb/issues/293).

Among the big, industry-used distributed databases in production today (e.g. DynamoDB, Bigtable, MongoDB, Cassandra, Elasticsearch), none of them are BFT. Indeed, almost all wide-area distributed systems in production are not BFT, including military, banking, healthcare, and other security-sensitive systems.

There are many more practical things that nodes can do to increase security (e.g. firewalls, key management, access controls).

From a [recent essay by Ken Birman](http://sigops.org/sosp/sosp15/history/05-birman.pdf) (of Cornell):

> Oh, and with respect to the BFT point: Jim [Gray] felt that real systems fail by crashing [54]. Others have since done studies reinforcing this view, or finding that even crash-failure solutions can sometimes defend against application corruption. One interesting study, reported during a SOSP WIPS session by Ben Reed (one of the co-developers of Zookeeper), found that at Yahoo, Zookeeper itself had never experienced Byzantine faults in a one-year period that they studied closely.

> [54] Jim Gray. Why Do Computers Stop and What Can Be Done About It? SOSP, 1985.

Ben Reed never published those results, but Birman wrote more about them in the book *Guide to Reliable Distributed Systems: Building High-Assurance Applications*. From page 358 of that book:

> But the cloud community, led by Ben Reed and Flavio Junqueira at Yahoo, sees things differently (these are the two inventor’s [sic] of Yahoo’s ZooKeeper service). **They have described informal studies of how applications and machines at Yahoo failed, concluding that the frequency of Byzantine failures was extremely small relative to the frequency of crash failures** [emphasis added]. Sometimes they did see data corruption, but then they often saw it occur in a correlated way that impacted many replicas all at once. And very often they saw failures occur in the client layer, then propagate into the service. BFT techniques tend to be used only within a service, not in the client layer that talks to that service, hence offer no protection against malfunctioning clients. **All of this, Reed and Junqueira conclude, lead to the realization that BFT just does not match the real needs of a cloud computing company like Yahoo, even if the data being managed by a service really is of very high importance** [emphasis added]. Unfortunately, they have not published this study; it was reported at an “outrageous opinions” session at the ACM Symposium on Operating Systems Principles, in 2009.

> The practical use of the Byzantine protocol raises another concern: The timing assumptions built into the model [i.e. synchronous or partially-synchronous nodes] are not realizable in most computing environments…
@ -1,19 +0,0 @@
# How BigchainDB is Decentralized

Decentralization means that no one owns or controls everything, and there is no single point of failure.

Ideally, each node in a BigchainDB cluster is owned and controlled by a different person or organization. Even if the cluster lives within one organization, it's still preferable to have each node controlled by a different person or subdivision.

We use the phrase "BigchainDB federation" (or just "federation") to refer to the set of people and/or organizations who run the nodes of a BigchainDB cluster. A federation requires some form of governance to make decisions such as membership and policies. The exact details of the governance process are determined by each federation, but it can be very decentralized (e.g. purely vote-based, where each node gets a vote, and there are no special leadership roles).

The actual data is decentralized in that it doesn’t all get stored in one place. Each federation node stores the primary of one shard and replicas of some other shards. (A shard is a subset of the total set of documents.) Sharding and replication are handled by RethinkDB.

A federation can increase its decentralization (and its resilience) by increasing its jurisdictional diversity, geographic diversity, and other kinds of diversity. This idea is expanded upon in [the section on node diversity](diversity.html).

There’s no node that has a long-term special position in the federation. All nodes run the same software and perform the same duties.

RethinkDB has an “admin” user which can’t be deleted and which can make big changes to the database, such as dropping a table. Right now, that’s a big security vulnerability, but we have plans to mitigate it by:

1. Locking down the admin user as much as possible.
2. Having all nodes inspect RethinkDB admin-type requests before acting on them. Requests can be checked against an evolving whitelist of allowed actions (voted on by federation nodes).

It’s worth noting that the RethinkDB admin user can’t transfer assets, even today. The only way to create a valid transfer transaction is to fulfill the current (crypto) conditions on the asset, and the admin user can’t do that because the admin user doesn’t have the necessary private keys (or preimages, in the case of hashlock conditions). They’re not stored in the database.
@ -1,11 +0,0 @@
# Kinds of Node Diversity

Steps should be taken to make it difficult for any one actor or event to control or damage “enough” of the nodes. (“Enough” is usually a quorum.) There are many kinds of diversity to consider, listed below. It may be quite difficult to have high diversity of all kinds.

1. **Jurisdictional diversity.** The nodes should be controlled by entities within multiple legal jurisdictions, so that it becomes difficult to use legal means to compel enough of them to do something.
2. **Geographic diversity.** The servers should be physically located at multiple geographic locations, so that it becomes difficult for a natural disaster (such as a flood or earthquake) to damage enough of them to cause problems.
3. **Hosting diversity.** The servers should be hosted by multiple hosting providers (e.g. Amazon Web Services, Microsoft Azure, Digital Ocean, Rackspace), so that it becomes difficult for one hosting provider to influence enough of the nodes.
4. **Operating system diversity.** The servers should use a variety of operating systems, so that a security bug in one OS can’t be used to exploit enough of the nodes.
5. **Diversity in general.** In general, membership diversity (of all kinds) confers many advantages on a federation. For example, it provides the federation with a source of various ideas for addressing challenges.

Note: If all the nodes are running the same code, i.e. the same implementation of BigchainDB, then a bug in that code could be used to compromise all of the nodes. Ideally, there would be several different, well-maintained implementations of BigchainDB Server (e.g. one in Python, one in Go, etc.), so that a federation could also have a diversity of server implementations.
@ -1,17 +0,0 @@
# How BigchainDB is Immutable / Tamper-Resistant

The blockchain community often describes blockchains as “immutable.” If we interpret that word literally, it means that blockchain data is unchangeable or permanent, which is absurd. The data _can_ be changed. For example, a plague might drive humanity extinct; the data would then get corrupted over time due to water damage, thermal noise, and the general increase of entropy. In the case of Bitcoin, nothing so drastic is required: a 51% attack will suffice.

It’s true that blockchain data is more difficult to change than usual: it’s more tamper-resistant than a typical file system or database. Therefore, in the context of blockchains, we interpret the word “immutable” to mean tamper-resistant. (Linguists would say that the word “immutable” is a _term of art_ in the blockchain community.)

BigchainDB achieves strong tamper-resistance in the following ways:

1. **Replication.** All data is sharded and shards are replicated in several (different) places. The replication factor can be set by the federation. The higher the replication factor, the more difficult it becomes to change or delete all replicas.
2. **Internal watchdogs.** All nodes monitor all changes and if some unallowed change happens, then appropriate action is taken. For example, if a valid block is deleted, then it is put back.
3. **External watchdogs.** Federations may opt to have trusted third parties monitor and audit their data, looking for irregularities. For federations with publicly-readable data, the public can act as an auditor.
4. **Cryptographic signatures** are used throughout BigchainDB as a way to check if messages (transactions, blocks and votes) have been tampered with en route, and as a way to verify who signed the messages. Each block is signed by the node that created it. Each vote is signed by the node that cast it. A creation transaction is signed by the node that created it, although there are plans to improve that by adding signatures from the sending client and multiple nodes; see [Issue #347](https://github.com/bigchaindb/bigchaindb/issues/347). Transfer transactions can contain multiple fulfillments (one per asset transferred). Each fulfillment will typically contain one or more signatures from the owners (i.e. the owners before the transfer). Hashlock fulfillments are an exception; there’s an open issue ([#339](https://github.com/bigchaindb/bigchaindb/issues/339)) to address that.
5. **Full or partial backups** of the database may be recorded from time to time, possibly on magnetic tape storage, other blockchains, printouts, etc.
6. **Strong security.** Node owners can adopt and enforce strong security policies.
7. **Node diversity.** Diversity makes it so that no one thing (e.g. natural disaster or operating system bug) can compromise enough of the nodes. See [the section on the kinds of node diversity](diversity.html).

Some of these things come "for free" as part of the BigchainDB software, and others require some extra effort from the federation and node owners.
@ -1,3 +0,0 @@
# BigchainDB and Smart Contracts

BigchainDB isn’t intended for running smart contracts. That said, it can do many of the things that smart contracts are used to do. For example, the owners of an asset can impose conditions that must be fulfilled by anyone wishing to transfer the asset to new owners; see the [section on conditions](models.html#conditions-and-fulfillments). BigchainDB also [supports a form of escrow](../drivers-clients/python-server-api-examples.html#escrow).
@ -1,95 +0,0 @@
# Timestamps in BigchainDB

Each transaction, block and vote has an associated timestamp. Interpreting those timestamps is tricky, hence the need for this section.

## Timestamp Sources & Accuracy

A transaction's timestamp is provided by the client which created and submitted the transaction to a BigchainDB node. A block's timestamp is provided by the BigchainDB node which created the block. A vote's timestamp is provided by the BigchainDB node which created the vote.

When a BigchainDB client or node needs a timestamp, it calls a BigchainDB utility function named `timestamp()`. There's a detailed explanation of how that function works below, but the short version is that it gets the [Unix time](https://en.wikipedia.org/wiki/Unix_time) from its system clock, rounded to the nearest second.

We can't say anything about the accuracy of the system clock on clients. Timestamps from clients are still potentially useful, however, in a statistical sense. We say more about that below.

We advise BigchainDB nodes to run special software (an "NTP daemon") to keep their system clock in sync with standard time servers. (NTP stands for [Network Time Protocol](https://en.wikipedia.org/wiki/Network_Time_Protocol).)

## Converting Timestamps to UTC

To convert a BigchainDB timestamp (a Unix time) to UTC, you need to know how the node providing the timestamp was set up. That's because different setups will report a different "Unix time" value around leap seconds! There's [a nice Red Hat Developer Blog post about the various setup options](http://developers.redhat.com/blog/2015/06/01/five-different-ways-handle-leap-seconds-ntp/). If you want more details, see [David Mills' pages about leap seconds, NTP, etc.](https://www.eecis.udel.edu/~mills/leap.html) (David Mills designed NTP.)

We advise BigchainDB nodes to run an NTP daemon [with particular settings](../appendices/ntp-notes.html), so that their timestamps are consistent.

If a timestamp comes from a node that's set up as we advise, it can be converted to UTC as follows:

1. Use a standard "Unix time to UTC" converter to get a UTC timestamp.
2. Is the UTC timestamp a leap second, or the second before/after a leap second? There's [a list of all the leap seconds on Wikipedia](https://en.wikipedia.org/wiki/Leap_second).
3. If no, then you are done.
4. If yes, then it might not be possible to convert it to a single UTC timestamp. Even if it can't be converted to a single UTC timestamp, it _can_ be converted to a list of two possible UTC timestamps.
   Showing how to do that is beyond the scope of this documentation.
   In all likelihood, you will never have to worry about leap seconds because they are very rare.
   (There were only 26 between 1972 and the end of 2015.)
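
Steps 1 and 2 of the procedure above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the helper names `to_utc` and `near_leap_second` are hypothetical (they are not part of BigchainDB), and the leap-second list is deliberately partial; a real tool would need a complete, maintained table. The sketch only flags timestamps close to a listed leap second; it does not attempt the ambiguous conversion itself.

```python
from datetime import datetime, timedelta, timezone

# Partial, illustrative list: the first UTC second *after* some known leap
# seconds (leap seconds were inserted at the end of the preceding day).
AFTER_LEAP_SECONDS = [
    datetime(2009, 1, 1, tzinfo=timezone.utc),
    datetime(2012, 7, 1, tzinfo=timezone.utc),
    datetime(2015, 7, 1, tzinfo=timezone.utc),
]


def to_utc(timestamp):
    """Step 1: convert a timestamp string (Unix time in seconds) to UTC."""
    return datetime.fromtimestamp(int(timestamp), tz=timezone.utc)


def near_leap_second(utc_time, window_seconds=2):
    """Step 2: is utc_time within window_seconds of a listed leap second?"""
    window = timedelta(seconds=window_seconds)
    return any(abs(utc_time - after) <= window for after in AFTER_LEAP_SECONDS)


utc = to_utc("1435708800")
print(utc.isoformat(), near_leap_second(utc))
```

If `near_leap_second` returns `False`, the converted value can be used as-is (step 3). If it returns `True`, the timestamp needs the more careful treatment described in step 4.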

## Calculating Elapsed Time Between Two Timestamps

There's another gotcha with (Unix time) timestamps: you can't calculate the real-world elapsed time between two timestamps (correctly) by subtracting the smaller timestamp from the larger one. The result won't include any of the leap seconds that occurred between the two timestamps. You could look up how many leap seconds happened between the two timestamps and add that to the result. There are many library functions for working with timestamps; those are beyond the scope of this documentation.
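
The "look up the leap seconds and add them" idea can be sketched as below. The function name `elapsed_seconds` and the constant `LEAP_BOUNDARIES` are hypothetical, the leap-second table is intentionally partial, and the sketch assumes that neither timestamp falls on a leap second itself (where the Unix time value depends on how the node is set up).

```python
from datetime import datetime, timezone

# Partial, illustrative table: Unix time of the first second after some
# known leap seconds. A real tool would use a complete, maintained list.
LEAP_BOUNDARIES = [
    int(datetime(2009, 1, 1, tzinfo=timezone.utc).timestamp()),
    int(datetime(2012, 7, 1, tzinfo=timezone.utc).timestamp()),
    int(datetime(2015, 7, 1, tzinfo=timezone.utc).timestamp()),
]


def elapsed_seconds(earlier, later, leap_boundaries=LEAP_BOUNDARIES):
    """Real-world seconds between two Unix timestamps (strings or ints)."""
    earlier, later = int(earlier), int(later)
    # Count the leap seconds inserted after `earlier` and up to `later`.
    inserted = sum(1 for b in leap_boundaries if earlier < b <= later)
    return (later - earlier) + inserted


# From 2012-01-01 to 2016-01-01 UTC, two listed leap seconds were inserted,
# so the result is two seconds more than the plain difference.
start = int(datetime(2012, 1, 1, tzinfo=timezone.utc).timestamp())
end = int(datetime(2016, 1, 1, tzinfo=timezone.utc).timestamp())
print(elapsed_seconds(start, end) - (end - start))
```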

## Avoid Doing Transactions Around Leap Seconds

Because of the ambiguity and confusion that arises with Unix time around leap seconds, we advise users to avoid creating transactions around leap seconds.

## Interpreting Sets of Timestamps

You can look at many timestamps to get a statistical sense of when something happened. For example, a transaction in a decided-valid block has many associated timestamps:

* its own timestamp
* the timestamps of the other transactions in the block; there could be as many as 999 of those
* the timestamp of the block
* the timestamps of all the votes on the block

Those timestamps come from many sources, so you can look at all of them to get some statistical sense of when the transaction "actually happened." The timestamp of the block should always be after the timestamp of the transaction, and the timestamp of the votes should always be after the timestamp of the block.
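
One simple way to get such a statistical sense is to take the median of the available timestamps and check the expected ordering. The sketch below is an illustration only: `summarize_timestamps` and its dictionary keys are hypothetical names, and it assumes that equal timestamps count as consistent, since timestamps have a resolution of one second.

```python
from statistics import median


def summarize_timestamps(tx_timestamp, block_timestamp, vote_timestamps):
    """Summarize the timestamps associated with one transaction.

    All arguments are timestamp strings (Unix time in seconds);
    `vote_timestamps` is a list of them.
    """
    tx = int(tx_timestamp)
    block = int(block_timestamp)
    votes = [int(v) for v in vote_timestamps]
    return {
        # A robust central estimate of when the transaction "happened".
        "median": median([tx, block] + votes),
        # Expected order: transaction, then block, then votes.
        "ordering_ok": tx <= block and all(block <= v for v in votes),
    }


summary = summarize_timestamps("100", "102", ["103", "104", "105"])
print(summary)
```

The median is used rather than the mean so that a single client or node with a badly wrong clock has little influence on the estimate.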

## How BigchainDB Uses Timestamps

BigchainDB _doesn't_ use timestamps to determine the order of transactions or blocks. In particular, the order of blocks is determined by RethinkDB's changefeed on the bigchain table.

BigchainDB does use timestamps for some things. It uses them to determine if a transaction has been waiting in the backlog for too long (i.e. because the node assigned to it hasn't handled it yet). It also uses timestamps to determine the status of timeout conditions (used by escrow).

## Including Trusted Timestamps

If you want to create a transaction payload with a trusted timestamp, you can.

One way to do that would be to send a payload to a trusted timestamping service. They will send back a timestamp, a signature, and their public key. They should also explain how you can verify the signature. You can then include the original payload, the timestamp, the signature, and the service's public key in your transaction. That way, anyone with the verification instructions can verify that the original payload was signed by the trusted timestamping service.

## How the timestamp() Function Works

BigchainDB has a utility function named `timestamp()` which amounts to:

```python
import time

def timestamp():
    return str(round(time.time()))
```

In other words, it calls the `time()` function in Python's `time` module, [rounds](https://docs.python.org/3/library/functions.html#round) that to the nearest integer, and converts the result to a string.

It rounds the output of `time.time()` to the nearest second because, according to [the Python documentation for `time.time()`](https://docs.python.org/3.4/library/time.html#time.time), "...not all systems provide time with a better precision than 1 second."

How does `time.time()` work? If you look in the C source code, it calls `floattime()` and `floattime()` calls [clock_gettime()](https://www.cs.rutgers.edu/~pxk/416/notes/c-tutorials/gettime.html), if it's available.

```text
ret = clock_gettime(CLOCK_REALTIME, &tp);
```

With `CLOCK_REALTIME` as the first argument, it returns the "Unix time." ("Unix time" is in quotes because its value around leap seconds depends on how the system is set up; see above.)

## Why Not Use UTC, TAI or Some Other Time that Has Unambiguous Timestamps for Leap Seconds?

It would be nice to use UTC or TAI timestamps, but unfortunately there's no commonly-available, standard way to get always-accurate UTC or TAI timestamps from the operating system on typical computers today (i.e. accurate around leap seconds).

There _are_ commonly-available, standard ways to get the "Unix time," such as the `clock_gettime()` function available in C. That's what we use (indirectly via Python). ("Unix time" is in quotes because its value around leap seconds depends on how the system is set up; see above.)

The Unix-time-based timestamps we use are only ambiguous circa leap seconds, and those are very rare. Even for those timestamps, the extra uncertainty is only one second, and that's not bad considering that we only report timestamps to a precision of one second in the first place. All other timestamps can be converted to UTC with no ambiguity.