mirror of
https://github.com/bigchaindb/bigchaindb.git
synced 2024-10-13 13:34:05 +00:00
Merge branch 'master' into update-docs-on-clusters
This commit is contained in:
commit
53b4325d01
@ -145,6 +145,13 @@ Once you accept and submit the CLA, we'll email you with further instructions. (
|
||||
|
||||
Someone will then merge your branch or suggest changes. If we suggest changes, you won't have to open a new pull request, you can just push new code to the same branch (on `origin`) as you did before creating the pull request.
|
||||
|
||||
### Tip: Upgrading All BigchainDB Dependencies
|
||||
|
||||
Over time, your versions of the Python packages used by BigchainDB will get out of date. You can upgrade them using:
|
||||
```text
|
||||
pip install --upgrade -e .[dev]
|
||||
```
|
||||
|
||||
## Quick Links
|
||||
|
||||
* [BigchainDB Community links](https://www.bigchaindb.com/community)
|
||||
|
@ -1,21 +1,21 @@
|
||||
# Terminology
|
||||
|
||||
There is some specialized terminology associated with BigchainDB. To get started, you should at least know what what we mean by a BigchainDB *node*, *cluster* and *consortium*.
|
||||
There is some specialized terminology associated with BigchainDB. To get started, you should at least know the following:
|
||||
|
||||
|
||||
## Node
|
||||
## BigchainDB Node
|
||||
|
||||
A **BigchainDB node** is a machine or set of closely-linked machines running RethinkDB/MongoDB Server, BigchainDB Server, and related software. (A "machine" might be a bare-metal server, a virtual machine or a container.) Each node is controlled by one person or organization.
|
||||
A **BigchainDB node** is a machine or set of closely-linked machines running RethinkDB/MongoDB Server, BigchainDB Server, and related software. Each node is controlled by one person or organization.
|
||||
|
||||
|
||||
## Cluster
|
||||
## BigchainDB Cluster
|
||||
|
||||
A set of BigchainDB nodes can connect to each other to form a **cluster**. Each node in the cluster runs the same software. A cluster contains one logical RethinkDB datastore. A cluster may have additional machines to do things such as cluster monitoring.
|
||||
A set of BigchainDB nodes can connect to each other to form a **BigchainDB cluster**. Each node in the cluster runs the same software. A cluster contains one logical RethinkDB/MongoDB datastore. A cluster may have additional machines to do things such as cluster monitoring.
|
||||
|
||||
|
||||
## Consortium
|
||||
## BigchainDB Consortium
|
||||
|
||||
The people and organizations that run the nodes in a cluster belong to a **consortium** (i.e. another organization). A consortium must have some sort of governance structure to make decisions. If a cluster is run by a single company, then the "consortium" is just that company.
|
||||
The people and organizations that run the nodes in a cluster belong to a **BigchainDB consortium** (i.e. another organization). A consortium must have some sort of governance structure to make decisions. If a cluster is run by a single company, then the "consortium" is just that company.
|
||||
|
||||
**What's the Difference Between a Cluster and a Consortium?**
|
||||
|
||||
|
Binary file not shown.
Before Width: | Height: | Size: 82 KiB After Width: | Height: | Size: 38 KiB |
@ -1,25 +0,0 @@
|
||||
# Example RethinkDB Storage Setups
|
||||
|
||||
## Example Amazon EC2 Setups
|
||||
|
||||
We have some scripts for [deploying a _test_ BigchainDB cluster on AWS](../clusters-feds/aws-testing-cluster.html). Those scripts include command sequences to set up storage for RethinkDB.
|
||||
In particular, look in the file [/deploy-cluster-aws/fabfile.py](https://github.com/bigchaindb/bigchaindb/blob/master/deploy-cluster-aws/fabfile.py), under `def prep_rethinkdb_storage(USING_EBS)`. Note that there are two cases:
|
||||
|
||||
1. **Using EBS ([Amazon Elastic Block Store](https://aws.amazon.com/ebs/)).** This is always an option, and for some instance types ("EBS-only"), it's the only option.
|
||||
2. **Using an "instance store" volume provided with an Amazon EC2 instance.** Note that our scripts only use one of the (possibly many) volumes in the instance store.
|
||||
|
||||
There's some explanation of the steps in the [Amazon EC2 documentation about making an Amazon EBS volume available for use](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-using-volumes.html).
|
||||
|
||||
You shouldn't use an EC2 "instance store" to store RethinkDB data for a production node, because it's not replicated and it's only intended for temporary, ephemeral data. If the associated instance crashes, is stopped, or is terminated, the data in the instance store is lost forever. Amazon EBS storage is replicated, has incremental snapshots, and is low-latency.
|
||||
|
||||
|
||||
## Example Using Amazon EFS
|
||||
|
||||
TODO
|
||||
|
||||
|
||||
## Other Examples?
|
||||
|
||||
TODO
|
||||
|
||||
Maybe RAID, ZFS, ... (over EBS volumes, i.e. a DIY Amazon EFS)
|
@ -21,7 +21,7 @@ Appendices
|
||||
generate-key-pair-for-ssh
|
||||
firewall-notes
|
||||
ntp-notes
|
||||
example-rethinkdb-storage-setups
|
||||
rethinkdb-reqs
|
||||
rethinkdb-backup
|
||||
licenses
|
||||
install-with-lxd
|
||||
|
@ -1,20 +1,8 @@
|
||||
# Production Node Requirements
|
||||
# RethinkDB Requirements
|
||||
|
||||
Note: This section will be broken apart into several pages, e.g. NTP requirements, RethinkDB requirements, BigchainDB requirements, etc. and those pages will add more details.
|
||||
[The RethinkDB documentation](https://rethinkdb.com/docs/) should be your first source of information about its requirements. This page serves mostly to document some of its more obscure requirements.
|
||||
|
||||
|
||||
## OS Requirements
|
||||
|
||||
* RethinkDB Server [will run on any modern OS](https://www.rethinkdb.com/docs/install/). Note that the Fedora package isn't officially supported. Also, official support for Windows is fairly recent ([April 2016](https://rethinkdb.com/blog/2.3-release/)).
|
||||
* BigchainDB Server requires Python 3.4+ and Python 3.4+ [will run on any modern OS](https://docs.python.org/3.4/using/index.html).
|
||||
* BigchaindB Server uses the Python `multiprocessing` package and [some functionality in the `multiprocessing` package doesn't work on OS X](https://docs.python.org/3.4/library/multiprocessing.html#multiprocessing.Queue.qsize). You can still use Mac OS X if you use Docker or a virtual machine.
|
||||
|
||||
The BigchainDB core dev team uses recent LTS versions of Ubuntu and recent versions of Fedora.
|
||||
|
||||
We don't test BigchainDB on Windows or Mac OS X, but you can try.
|
||||
|
||||
* If you run into problems on Windows, then you may want to try using Vagrant. One of our community members ([@Mec-Is](https://github.com/Mec-iS)) wrote [a page about how to install BigchainDB on a VM with Vagrant](https://gist.github.com/Mec-iS/b84758397f1b21f21700).
|
||||
* If you have Mac OS X and want to experiment with BigchainDB, then you could do that [using Docker](../appendices/run-with-docker.html).
|
||||
RethinkDB Server [will run on any modern OS](https://www.rethinkdb.com/docs/install/). Note that the Fedora package isn't officially supported. Also, official support for Windows is fairly recent ([April 2016](https://rethinkdb.com/blog/2.3-release/)).
|
||||
|
||||
|
||||
## Storage Requirements
|
||||
@ -28,6 +16,20 @@ For RethinkDB's failover mechanisms to work, [every RethinkDB table must have at
|
||||
|
||||
As for the read & write rates, what do you expect those to be for your situation? It's not enough for the storage system alone to handle those rates: the interconnects between the nodes must also be able to handle them.
|
||||
|
||||
**Storage Notes Specific to RethinkDB**
|
||||
|
||||
* The RethinkDB storage engine has a number of SSD optimizations, so you _can_ benefit from using SSDs. ([source](https://www.rethinkdb.com/docs/architecture/))
|
||||
|
||||
* If you have an N-node RethinkDB cluster and 1) you want to use it to store an amount of data D (unique records, before replication), 2) you want the replication factor to be R (all tables), and 3) you want N shards (all tables), then each BigchainDB node must have storage space of at least R×D/N.
|
||||
|
||||
* RethinkDB tables can have [at most 64 shards](https://rethinkdb.com/limitations/). What does that imply? Suppose you only have one table, with 64 shards. How big could that table be? It depends on how much data can be stored in each node. If the maximum amount of data that a node can store is d, then the biggest-possible shard is d, and the biggest-possible table size is 64 times that. (All shard replicas would have to be stored on other nodes beyond the initial 64.) If there are two tables, the second table could also have 64 shards, stored on 64 other maxed-out nodes, so the total amount of unique data in the database would be (64 shards/table)×(2 tables)×d. In general, if you have T tables, the maximum amount of unique data that can be stored in the database (i.e. the amount of data before replication) is 64×T×d.
|
||||
|
||||
* When you set up storage for your RethinkDB data, you may have to select a filesystem. (Sometimes, the filesystem is already decided by the choice of storage.) We recommend using a filesystem that supports direct I/O (Input/Output). Many compressed or encrypted file systems don't support direct I/O. The ext4 filesystem supports direct I/O (but be careful: if you enable the data=journal mode, then direct I/O support will be disabled; the default is data=ordered). If your chosen filesystem supports direct I/O and you're using Linux, then you don't need to do anything to request or enable direct I/O. RethinkDB does that.
|
||||
|
||||
<p style="background-color: lightgrey;">What is direct I/O? It allows RethinkDB to write directly to the storage device (or use its own in-memory caching mechanisms), rather than relying on the operating system's file read and write caching mechanisms. (If you're using Linux, a write-to-file normally writes to the in-memory Page Cache first; only later does that Page Cache get flushed to disk. The Page Cache is also used when reading files.)</p>
|
||||
|
||||
* RethinkDB stores its data in a specific directory. You can tell RethinkDB _which_ directory using the RethinkDB config file, as explained below. In this documentation, we assume the directory is `/data`. If you set up a separate device (partition, RAID array, or logical volume) to store the RethinkDB data, then mount that device on `/data`.
|
||||
|
||||
|
||||
## Memory (RAM) Requirements
|
||||
|
@ -81,4 +81,4 @@ where, as before, `<key-name>` must be replaced.
|
||||
|
||||
## Next Steps
|
||||
|
||||
You could make changes to the Ansible playbook (and the resources it uses) to make the node more production-worthy. See [the section on production node assumptions, components and requirements](../nodes/index.html).
|
||||
You could make changes to the Ansible playbook (and the resources it uses) to make the node more production-worthy. See [the section on production node assumptions, components and requirements](../production-nodes/index.html).
|
||||
|
@ -8,7 +8,7 @@ BigchainDB Server Documentation
|
||||
introduction
|
||||
quickstart
|
||||
cloud-deployment-templates/index
|
||||
nodes/index
|
||||
production-nodes/index
|
||||
dev-and-test/index
|
||||
server-reference/index
|
||||
drivers-clients/index
|
||||
|
@ -1,10 +0,0 @@
|
||||
Production Node Assumptions, Components & Requirements
|
||||
======================================================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
node-assumptions
|
||||
node-components
|
||||
node-requirements
|
||||
setup-run-node
|
@ -1,13 +0,0 @@
|
||||
# Production Node Assumptions
|
||||
|
||||
If you're not sure what we mean by a BigchainDB *node*, *cluster*, *consortium*, or *production node*, then see [the section in the Introduction where we defined those terms](../introduction.html#some-basic-vocabulary).
|
||||
|
||||
We make some assumptions about production nodes:
|
||||
|
||||
1. **Each production node is set up and managed by an experienced professional system administrator (or a team of them).**
|
||||
|
||||
2. Each production node in a cluster is managed by a different person or team.
|
||||
|
||||
Because of the first assumption, we don't provide a detailed cookbook explaining how to secure a server, or other things that a sysadmin should know. (We do provide some [templates](../cloud-deployment-templates/index.html), but those are just a starting point.)
|
||||
|
||||
|
@ -1,23 +0,0 @@
|
||||
# Production Node Components
|
||||
|
||||
A BigchainDB node must include, at least:
|
||||
|
||||
* BigchainDB Server and
|
||||
* RethinkDB Server.
|
||||
|
||||
When doing development and testing, it's common to install both on the same machine, but in a production environment, it may make more sense to install them on separate machines.
|
||||
|
||||
In a production environment, a BigchainDB node should have several other components, including:
|
||||
|
||||
* nginx or similar, as a reverse proxy and/or load balancer for the Gunicorn server(s) inside the node
|
||||
* An NTP daemon running on all machines running BigchainDB code, and possibly other machines
|
||||
* A RethinkDB proxy server
|
||||
* A RethinkDB "wire protocol firewall" (in the future: this component doesn't exist yet)
|
||||
* Scalable storage for RethinkDB (e.g. using RAID)
|
||||
* Monitoring software, to monitor all the machines in the node
|
||||
* Configuration management agents (if you're using a configuration managment system that uses agents)
|
||||
* Maybe more
|
||||
|
||||
The relationship between these components is illustrated below.
|
||||
|
||||

|
@ -1,193 +0,0 @@
|
||||
# Set Up and Run a Cluster Node
|
||||
|
||||
This is a page of general guidelines for setting up a production node. It says nothing about how to upgrade software, storage, processing, etc. or other details of node management. It will be expanded more in the future.
|
||||
|
||||
|
||||
## Get a Server
|
||||
|
||||
The first step is to get a server (or equivalent) which meets [the requirements for a BigchainDB node](node-requirements.html).
|
||||
|
||||
|
||||
## Secure Your Server
|
||||
|
||||
The steps that you must take to secure your server depend on your server OS and where your server is physically located. There are many articles and books about how to secure a server. Here we just cover special considerations when securing a BigchainDB node.
|
||||
|
||||
There are some [notes on BigchainDB-specific firewall setup](../appendices/firewall-notes.html) in the Appendices.
|
||||
|
||||
|
||||
## Sync Your System Clock
|
||||
|
||||
A BigchainDB node uses its system clock to generate timestamps for blocks and votes, so that clock should be kept in sync with some standard clock(s). The standard way to do that is to run an NTP daemon (Network Time Protocol daemon) on the node. (You could also use tlsdate, which uses TLS timestamps rather than NTP, but don't: it's not very accurate and it will break with TLS 1.3, which removes the timestamp.)
|
||||
|
||||
NTP is a standard protocol. There are many NTP daemons implementing it. We don't recommend a particular one. On the contrary, we recommend that different nodes in a cluster run different NTP daemons, so that a problem with one daemon won't affect all nodes.
|
||||
|
||||
Please see the [notes on NTP daemon setup](../appendices/ntp-notes.html) in the Appendices.
|
||||
|
||||
|
||||
## Set Up Storage for RethinkDB Data
|
||||
|
||||
Below are some things to consider when setting up storage for the RethinkDB data. The Appendices have a [section with concrete examples](../appendices/example-rethinkdb-storage-setups.html).
|
||||
|
||||
We suggest you set up a separate storage "device" (partition, RAID array, or logical volume) to store the RethinkDB data. Here are some questions to ask:
|
||||
|
||||
* How easy will it be to add storage in the future? Will I have to shut down my server?
|
||||
* How big can the storage get? (Remember that [RAID](https://en.wikipedia.org/wiki/RAID) can be used to make several physical drives look like one.)
|
||||
* How fast can it read & write data? How many input/output operations per second (IOPS)?
|
||||
* How does IOPS scale as more physical hard drives are added?
|
||||
* What's the latency?
|
||||
* What's the reliability? Is there replication?
|
||||
* What's in the Service Level Agreement (SLA), if applicable?
|
||||
* What's the cost?
|
||||
|
||||
There are many options and tradeoffs. Don't forget to look into Amazon Elastic Block Store (EBS) and Amazon Elastic File System (EFS), or their equivalents from other providers.
|
||||
|
||||
**Storage Notes Specific to RethinkDB**
|
||||
|
||||
* The RethinkDB storage engine has a number of SSD optimizations, so you _can_ benefit from using SSDs. ([source](https://www.rethinkdb.com/docs/architecture/))
|
||||
|
||||
* If you want a RethinkDB cluster to store an amount of data D, with a replication factor of R (on every table), and the cluster has N nodes, then each node will need to be able to store R×D/N data.
|
||||
|
||||
* RethinkDB tables can have [at most 64 shards](https://rethinkdb.com/limitations/). For example, if you have only one table and more than 64 nodes, some nodes won't have the primary of any shard, i.e. they will have replicas only. In other words, once you pass 64 nodes, adding more nodes won't provide more storage space for new data. If the biggest single-node storage available is d, then the most you can store in a RethinkDB cluster is < 64×d: accomplished by putting one primary shard in each of 64 nodes, with all replica shards on other nodes. (This is assuming one table. If there are T tables, then the most you can store is < 64×d×T.)
|
||||
|
||||
* When you set up storage for your RethinkDB data, you may have to select a filesystem. (Sometimes, the filesystem is already decided by the choice of storage.) We recommend using a filesystem that supports direct I/O (Input/Output). Many compressed or encrypted file systems don't support direct I/O. The ext4 filesystem supports direct I/O (but be careful: if you enable the data=journal mode, then direct I/O support will be disabled; the default is data=ordered). If your chosen filesystem supports direct I/O and you're using Linux, then you don't need to do anything to request or enable direct I/O. RethinkDB does that.
|
||||
|
||||
<p style="background-color: lightgrey;">What is direct I/O? It allows RethinkDB to write directly to the storage device (or use its own in-memory caching mechanisms), rather than relying on the operating system's file read and write caching mechanisms. (If you're using Linux, a write-to-file normally writes to the in-memory Page Cache first; only later does that Page Cache get flushed to disk. The Page Cache is also used when reading files.)</p>
|
||||
|
||||
* RethinkDB stores its data in a specific directory. You can tell RethinkDB _which_ directory using the RethinkDB config file, as explained below. In this documentation, we assume the directory is `/data`. If you set up a separate device (partition, RAID array, or logical volume) to store the RethinkDB data, then mount that device on `/data`.
|
||||
|
||||
|
||||
## Install RethinkDB Server
|
||||
|
||||
If you don't already have RethinkDB Server installed, you must install it. The RethinkDB documentation has instructions for [how to install RethinkDB Server on a variety of operating systems](https://rethinkdb.com/docs/install/).
|
||||
|
||||
|
||||
## Configure RethinkDB Server
|
||||
|
||||
Create a RethinkDB configuration file (text file) named `instance1.conf` with the following contents (explained below):
|
||||
```text
|
||||
directory=/data
|
||||
bind=all
|
||||
direct-io
|
||||
# Replace node?_hostname with actual node hostnames below, e.g. rdb.examples.com
|
||||
join=node0_hostname:29015
|
||||
join=node1_hostname:29015
|
||||
join=node2_hostname:29015
|
||||
# continue until there's a join= line for each node in the cluster
|
||||
```
|
||||
|
||||
* `directory=/data` tells the RethinkDB node to store its share of the database data in `/data`.
|
||||
* `bind=all` binds RethinkDB to all local network interfaces (e.g. loopback, Ethernet, wireless, whatever is available), so it can communicate with the outside world. (The default is to bind only to local interfaces.)
|
||||
* `direct-io` tells RethinkDB to use direct I/O (explained earlier). Only include this line if your file system supports direct I/O.
|
||||
* `join=hostname:29015` lines: A cluster node needs to find out the hostnames of all the other nodes somehow. You _could_ designate one node to be the one that every other node asks, and put that node's hostname in the config file, but that wouldn't be very decentralized. Instead, we include _every_ node in the list of nodes-to-ask.
|
||||
|
||||
If you're curious about the RethinkDB config file, there's [a RethinkDB documentation page about it](https://www.rethinkdb.com/docs/config-file/). The [explanations of the RethinkDB command-line options](https://rethinkdb.com/docs/cli-options/) are another useful reference.
|
||||
|
||||
See the [RethinkDB documentation on securing your cluster](https://rethinkdb.com/docs/security/).
|
||||
|
||||
|
||||
## Install Python 3.4+
|
||||
|
||||
If you don't already have it, then you should [install Python 3.4+](https://www.python.org/downloads/).
|
||||
|
||||
If you're testing or developing BigchainDB on a stand-alone node, then you should probably create a Python 3.4+ virtual environment and activate it (e.g. using virtualenv or conda). Later we will install several Python packages and you probably only want those installed in the virtual environment.
|
||||
|
||||
|
||||
## Install BigchainDB Server
|
||||
|
||||
First, [install the OS-level dependencies of BigchainDB Server (link)](../appendices/install-os-level-deps.html).
|
||||
|
||||
With OS-level dependencies installed, you can install BigchainDB Server with `pip` or from source.
|
||||
|
||||
|
||||
### How to Install BigchainDB with pip
|
||||
|
||||
BigchainDB (i.e. both the Server and the officially-supported drivers) is distributed as a Python package on PyPI so you can install it using `pip`. First, make sure you have an up-to-date Python 3.4+ version of `pip` installed:
|
||||
```text
|
||||
pip -V
|
||||
```
|
||||
|
||||
If it says that `pip` isn't installed, or it says `pip` is associated with a Python version less than 3.4, then you must install a `pip` version associated with Python 3.4+. In the following instructions, we call it `pip3` but you may be able to use `pip` if that refers to the same thing. See [the `pip` installation instructions](https://pip.pypa.io/en/stable/installing/).
|
||||
|
||||
On Ubuntu 16.04, we found that this works:
|
||||
```text
|
||||
sudo apt-get install python3-pip
|
||||
```
|
||||
|
||||
That should install a Python 3 version of `pip` named `pip3`. If that didn't work, then another way to get `pip3` is to do `sudo apt-get install python3-setuptools` followed by `sudo easy_install3 pip`.
|
||||
|
||||
You can upgrade `pip` (`pip3`) and `setuptools` to the latest versions using:
|
||||
```text
|
||||
pip3 install --upgrade pip setuptools
|
||||
pip3 -V
|
||||
```
|
||||
|
||||
Now you can install BigchainDB Server (and officially-supported BigchainDB drivers) using:
|
||||
```text
|
||||
pip3 install bigchaindb
|
||||
```
|
||||
|
||||
(If you're not in a virtualenv and you want to install bigchaindb system-wide, then put `sudo` in front.)
|
||||
|
||||
Note: You can use `pip3` to upgrade the `bigchaindb` package to the latest version using `pip3 install --upgrade bigchaindb`.
|
||||
|
||||
|
||||
### How to Install BigchainDB from Source
|
||||
|
||||
If you want to install BitchainDB from source because you want to use the very latest bleeding-edge code, clone the public repository:
|
||||
```text
|
||||
git clone git@github.com:bigchaindb/bigchaindb.git
|
||||
python setup.py install
|
||||
```
|
||||
|
||||
|
||||
## Configure BigchainDB Server
|
||||
|
||||
Start by creating a default BigchainDB config file:
|
||||
```text
|
||||
bigchaindb -y configure rethinkdb
|
||||
```
|
||||
|
||||
(There's documentation for the `bigchaindb` command is in the section on [the BigchainDB Command Line Interface (CLI)](bigchaindb-cli.html).)
|
||||
|
||||
Edit the created config file:
|
||||
|
||||
* Open `$HOME/.bigchaindb` (the created config file) in your text editor.
|
||||
* Change `"server": {"bind": "localhost:9984", ... }` to `"server": {"bind": "0.0.0.0:9984", ... }`. This makes it so traffic can come from any IP address to port 9984 (the HTTP Client-Server API port).
|
||||
* Change `"keyring": []` to `"keyring": ["public_key_of_other_node_A", "public_key_of_other_node_B", "..."]` i.e. a list of the public keys of all the other nodes in the cluster. The keyring should _not_ include your node's public key.
|
||||
|
||||
For more information about the BigchainDB config file, see [Configuring a BigchainDB Node](configuration.html).
|
||||
|
||||
|
||||
## Run RethinkDB Server
|
||||
|
||||
Start RethinkDB using:
|
||||
```text
|
||||
rethinkdb --config-file path/to/instance1.conf
|
||||
```
|
||||
|
||||
except replace the path with the actual path to `instance1.conf`.
|
||||
|
||||
Note: It's possible to [make RethinkDB start at system startup](https://www.rethinkdb.com/docs/start-on-startup/).
|
||||
|
||||
You can verify that RethinkDB is running by opening the RethinkDB web interface in your web browser. It should be at `http://rethinkdb-hostname:8080/`. If you're running RethinkDB on localhost, that would be [http://localhost:8080/](http://localhost:8080/).
|
||||
|
||||
|
||||
## Run BigchainDB Server
|
||||
|
||||
After all node operators have started RethinkDB, but before they start BigchainDB, one designated node operator must configure the RethinkDB database by running the following commands:
|
||||
```text
|
||||
bigchaindb init
|
||||
bigchaindb set-shards numshards
|
||||
bigchaindb set-replicas numreplicas
|
||||
```
|
||||
|
||||
where:
|
||||
|
||||
* `bigchaindb init` creates the database within RethinkDB, the tables, the indexes, and the genesis block.
|
||||
* `numshards` should be set to the number of nodes in the initial cluster.
|
||||
* `numreplicas` should be set to the database replication factor decided by the consortium. It must be 3 or more for [RethinkDB failover](https://rethinkdb.com/docs/failover/) to work.
|
||||
|
||||
Once the RethinkDB database is configured, every node operator can start BigchainDB using:
|
||||
```text
|
||||
bigchaindb start
|
||||
```
|
10
docs/server/source/production-nodes/index.rst
Normal file
10
docs/server/source/production-nodes/index.rst
Normal file
@ -0,0 +1,10 @@
|
||||
Production Nodes
|
||||
================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
node-assumptions
|
||||
node-components
|
||||
node-requirements
|
||||
setup-run-node
|
16
docs/server/source/production-nodes/node-assumptions.md
Normal file
16
docs/server/source/production-nodes/node-assumptions.md
Normal file
@ -0,0 +1,16 @@
|
||||
# Production Node Assumptions
|
||||
|
||||
Be sure you know the key BigchainDB terminology:
|
||||
|
||||
* [BigchainDB node, BigchainDB cluster and BigchainDB consortum](https://docs.bigchaindb.com/en/latest/terminology.html)
|
||||
* [dev/test node, bare-bones node and production node](../introduction.html)
|
||||
|
||||
We make some assumptions about production nodes:
|
||||
|
||||
1. Production nodes use MongoDB, not RethinkDB.
|
||||
1. Each production node is set up and managed by an experienced professional system administrator or a team of them.
|
||||
1. Each production node in a cluster is managed by a different person or team.
|
||||
|
||||
You can use RethinkDB when building prototypes, but we don't advise or support using it in production.
|
||||
|
||||
We don't provide a detailed cookbook explaining how to secure a server, or other things that a sysadmin should know. (We do provide some [templates](../cloud-deployment-templates/index.html), but those are just a starting point.)
|
22
docs/server/source/production-nodes/node-components.md
Normal file
22
docs/server/source/production-nodes/node-components.md
Normal file
@ -0,0 +1,22 @@
|
||||
# Production Node Components
|
||||
|
||||
A production BigchainDB node must include:
|
||||
|
||||
* BigchainDB Server
|
||||
* MongoDB Server 3.4+ (mongod)
|
||||
* Scalable storage for MongoDB
|
||||
|
||||
It could also include several other components, including:
|
||||
|
||||
* NGINX or similar, to provide authentication, rate limiting, etc.
|
||||
* An NTP daemon running on all machines running BigchainDB Server or mongod, and possibly other machines
|
||||
* **Not** MongoDB Automation Agent. It's for automating the deployment of an entire MongoDB cluster, not just one MongoDB node within a cluster.
|
||||
* MongoDB Monitoring Agent
|
||||
* MongoDB Backup Agent
|
||||
* Log aggregation software
|
||||
* Monitoring software
|
||||
* Maybe more
|
||||
|
||||
The relationship between the main components is illustrated below. Note that BigchainDB Server must be able to communicate with the _primary_ MongoDB instance, and any of the MongoDB instances might be the primary, so BigchainDB Server must be able to communicate with all the MongoDB instances. Also, all MongoDB instances must be able to communicate with each other.
|
||||
|
||||

|
17
docs/server/source/production-nodes/node-requirements.md
Normal file
17
docs/server/source/production-nodes/node-requirements.md
Normal file
@ -0,0 +1,17 @@
|
||||
# Production Node Requirements
|
||||
|
||||
**This page is about the requirements of BigchainDB Server.** You can find the requirements of MongoDB, NGINX, your NTP daemon, your monitoring software, and other [production node components](node-components.html) in the documentation for that software.
|
||||
|
||||
|
||||
## OS Requirements
|
||||
|
||||
BigchainDB Server requires Python 3.4+ and Python 3.4+ [will run on any modern OS](https://docs.python.org/3.4/using/index.html), but we recommend using an LTS version of [Ubuntu Server](https://www.ubuntu.com/server) or a similarly server-grade Linux distribution.
|
||||
|
||||
_Don't use macOS_ (formerly OS X, formerly Mac OS X), because it's not a server-grade operating system. Also, BigchaindB Server uses the Python multiprocessing package and [some functionality in the multiprocessing package doesn't work on Mac OS X](https://docs.python.org/3.4/library/multiprocessing.html#multiprocessing.Queue.qsize).
|
||||
|
||||
|
||||
## General Considerations
|
||||
|
||||
BigchainDB Server runs many concurrent processes, so more RAM and more CPU cores is better.
|
||||
|
||||
As mentioned on the page about [production node components](node-components.html), every machine running BigchainDB Server should be running an NTP daemon.
|
137
docs/server/source/production-nodes/setup-run-node.md
Normal file
137
docs/server/source/production-nodes/setup-run-node.md
Normal file
@ -0,0 +1,137 @@
|
||||
# Set Up and Run a Cluster Node
|
||||
|
||||
This is a page of general guidelines for setting up a production BigchainDB node. Before continuing, make sure you've read the pages about production node [assumptions](node-assumptions.html), [components](node-components.html) and [requirements](node-requirements.html).
|
||||
|
||||
Note: These are just guidelines. You can modify them to suit your needs. For example, if you want to initialize the MongoDB replica set before installing BigchainDB, you _can_ do that. If you'd prefer to use Docker and Kubernetes, you can (and [we have a template](../cloud-deployment-templates/node-on-kubernetes.html)). We don't cover all possible setup procedures here.
|
||||
|
||||
|
||||
## Security Guidelines
|
||||
|
||||
There are many articles, websites and books about securing servers, virtual machines, networks, etc. Consult those.
|
||||
There are some [notes on BigchainDB-specific firewall setup](../appendices/firewall-notes.html) in the Appendices.
|
||||
|
||||
|
||||
## Sync Your System Clock
|
||||
|
||||
A BigchainDB node uses its system clock to generate timestamps for blocks and votes, so that clock should be kept in sync with some standard clock(s). The standard way to do that is to run an NTP daemon (Network Time Protocol daemon) on the node.
|
||||
|
||||
MongoDB also recommends having an NTP daemon running on all MongoDB nodes.
|
||||
|
||||
NTP is a standard protocol. There are many NTP daemons implementing it. We don't recommend a particular one. On the contrary, we recommend that different nodes in a cluster run different NTP daemons, so that a problem with one daemon won't affect all nodes.
|
||||
|
||||
Please see the [notes on NTP daemon setup](../appendices/ntp-notes.html) in the Appendices.
|
||||
|
||||
|
||||
## Set Up Storage for MongoDB
|
||||
|
||||
We suggest you set up a separate storage device (partition, RAID array, or logical volume) to store the data in the MongoDB database. Here are some questions to ask:
|
||||
|
||||
* How easy will it be to add storage in the future? Will I have to shut down my server?
|
||||
* How big can the storage get? (Remember that [RAID](https://en.wikipedia.org/wiki/RAID) can be used to make several physical drives look like one.)
|
||||
* How fast can it read & write data? How many input/output operations per second (IOPS)?
|
||||
* How does IOPS scale as more physical hard drives are added?
|
||||
* What's the latency?
|
||||
* What's the reliability? Is there replication?
|
||||
* What's in the Service Level Agreement (SLA), if applicable?
|
||||
* What's the cost?
|
||||
|
||||
There are many options and tradeoffs.
|
||||
|
||||
Consult the MongoDB documentation for its recommendations regarding storage hardware, software and settings, e.g. in the [MongoDB Production Notes](https://docs.mongodb.com/manual/administration/production-notes/).
|
||||
|
||||
|
||||
## Install and Run MongoDB
|
||||
|
||||
* [Install MongoDB 3.4+](https://docs.mongodb.com/manual/installation/). (BigchainDB only works with MongoDB 3.4+.)
|
||||
* [Run MongoDB (mongod)](https://docs.mongodb.com/manual/reference/program/mongod/)
|
||||
|
||||
|
||||
## Install BigchainDB Server
|
||||
|
||||
### Install BigchainDB Server Dependencies
|
||||
|
||||
Before you can install BigchainDB Server, you must [install its OS-level dependencies](../appendices/install-os-level-deps.html) and you may have to [install Python 3.4+](https://www.python.org/downloads/).
|
||||
|
||||
### How to Install BigchainDB Server with pip
|
||||
|
||||
BigchainDB is distributed as a Python package on PyPI so you can install it using `pip`. First, make sure you have an up-to-date Python 3.4+ version of `pip` installed:
|
||||
```text
|
||||
pip -V
|
||||
```
|
||||
|
||||
If it says that `pip` isn't installed, or it says `pip` is associated with a Python version less than 3.4, then you must install a `pip` version associated with Python 3.4+. In the following instructions, we call it `pip3` but you may be able to use `pip` if that refers to the same thing. See [the `pip` installation instructions](https://pip.pypa.io/en/stable/installing/).
|
||||
|
||||
On Ubuntu 16.04, we found that this works:
|
||||
```text
|
||||
sudo apt-get install python3-pip
|
||||
```
|
||||
|
||||
That should install a Python 3 version of `pip` named `pip3`. If that didn't work, then another way to get `pip3` is to do `sudo apt-get install python3-setuptools` followed by `sudo easy_install3 pip`.
|
||||
|
||||
You can upgrade `pip` (`pip3`) and `setuptools` to the latest versions using:
|
||||
```text
|
||||
pip3 install --upgrade pip setuptools
|
||||
pip3 -V
|
||||
```
|
||||
|
||||
Now you can install BigchainDB Server using:
|
||||
```text
|
||||
pip3 install bigchaindb
|
||||
```
|
||||
|
||||
(If you're not in a virtualenv and you want to install bigchaindb system-wide, then put `sudo` in front.)
|
||||
|
||||
Note: You can use `pip3` to upgrade the `bigchaindb` package to the latest version using `pip3 install --upgrade bigchaindb`.
|
||||
|
||||
|
||||
### How to Install BigchainDB Server from Source
|
||||
|
||||
If you want to install BitchainDB from source because you want to use the very latest bleeding-edge code, clone the public repository:
|
||||
```text
|
||||
git clone git@github.com:bigchaindb/bigchaindb.git
|
||||
cd bigchaindb
|
||||
python setup.py install
|
||||
```
|
||||
|
||||
|
||||
## Configure BigchainDB Server
|
||||
|
||||
Start by creating a default BigchainDB config file for a MongoDB backend:
|
||||
```text
|
||||
bigchaindb -y configure mongodb
|
||||
```
|
||||
|
||||
(There's documentation for the `bigchaindb` command is in the section on [the BigchainDB Command Line Interface (CLI)](../server-reference/bigchaindb-cli.html).)
|
||||
|
||||
Edit the created config file by opening `$HOME/.bigchaindb` (the created config file) in your text editor:
|
||||
|
||||
* Change `"server": {"bind": "localhost:9984", ... }` to `"server": {"bind": "0.0.0.0:9984", ... }`. This makes it so traffic can come from any IP address to port 9984 (the HTTP Client-Server API port).
|
||||
* Change `"keyring": []` to `"keyring": ["public_key_of_other_node_A", "public_key_of_other_node_B", "..."]` i.e. a list of the public keys of all the other nodes in the cluster. The keyring should _not_ include your node's public key.
|
||||
* Ensure that `database.host` and `database.port` are set to the hostname and port of your MongoDB instance. (The port is usually 27017, unless you changed it.)
|
||||
|
||||
For more information about the BigchainDB config file, see the page about the [BigchainDB configuration settings](../server-reference/configuration.html).
|
||||
|
||||
|
||||
## Get All Other Nodes to Update Their Keyring
|
||||
|
||||
All other BigchainDB nodes in the cluster must add your new node's public key to their BigchainDB keyring. Currently, the only way to get BigchainDB Server to "notice" a changed keyring is to shut it down and start it back up again (with the new keyring).
|
||||
|
||||
|
||||
## Maybe Update the MongoDB Replica Set
|
||||
|
||||
**If this isn't the first node in the BigchainDB cluster**, then someone with an existing BigchainDB node (not you) must add your MongoDB instance to the MongoDB replica set. They can do so (on their node) using:
|
||||
```text
|
||||
bigchaindb add-replicas your-mongod-hostname:27017
|
||||
```
|
||||
|
||||
where they must replace `your-mongod-hostname` with the actual hostname of your MongoDB instance, and they may have to replace `27017` with the actual port.
|
||||
|
||||
|
||||
## Start BigchainDB
|
||||
|
||||
**Warning: If you're not deploying the first node in the BigchainDB cluster, then don't start BigchainDB before your MongoDB instance has been added to the MongoDB replica set (as outlined above).**
|
||||
|
||||
```text
|
||||
# See warning above
|
||||
bigchaindb start
|
||||
```
|
Loading…
x
Reference in New Issue
Block a user