Merge branch 'develop' into feat/116/more-solid-webserver

vrde 2016-04-07 11:00:47 +02:00
commit b11cbce5cd
No known key found for this signature in database
GPG Key ID: 6581C7C39B3D397D
17 changed files with 931 additions and 4 deletions

6
.gitignore vendored

@ -65,3 +65,9 @@ target/
# pyenv
.python-version
# Private key files from AWS
*.pem
# Some files created when deploying a cluster on AWS
deploy-cluster-aws/conf/rethinkdb.conf
deploy-cluster-aws/hostlist.py


@ -19,6 +19,7 @@ Tag name: TBD
committed: TBD
### Added
- AWS deployment scripts: [Issue #151](https://github.com/bigchaindb/bigchaindb/issues/151)
- `CHANGELOG.md` (this file)
- Multisig support: [Pull Request #107](https://github.com/bigchaindb/bigchaindb/pull/107)
- API/Wire protocol (RESTful HTTP API): [Pull Request #102](https://github.com/bigchaindb/bigchaindb/pull/102)


@ -0,0 +1,51 @@
# How to Handle Pull Requests
This document is for whoever has the ability to merge pull requests in the Git repositories associated with BigchainDB.
If the pull request is from an employee of ascribe GmbH, then you can ignore this document.
If the pull request is from someone who is _not_ an employee of ascribe, then:
* Have they agreed to the Individual Contributor Agreement in the past? (Troy, Greg, and others have a list.) If yes, then you can merge the PR and ignore the rest of this document.
* Do they belong to a company or organization which agreed to the Entity Contributor Agreement in the past, and will they be contributing on behalf of that company or organization? (Troy, Greg, and others have a list.) If yes, then you can merge the PR and ignore the rest of this document.
* Otherwise, go to the pull request in question and post a comment using this template:
Hi @nameofuser
Before we can merge this pull request, which may contain your intellectual property in the form of copyright or patents, our lawyers say we need you or your organization to agree to one of our contributor agreements. If you are contributing on behalf of yourself (and not on behalf of your employer or another organization you are part of) then you should:
1. Go to: https://www.bigchaindb.com/cla/
2. Read the Individual Contributor Agreement
3. Fill in the form "For Individuals"
4. Check the box to agree
5. Click the SEND button
If you're contributing as an employee, and/or you want all employees of your employing organization to be covered by our contributor agreement, then someone in your organization with the authority to enter agreements on behalf of all employees must do the following:
1. Go to: https://www.bigchaindb.com/cla/
2. Read the Entity Contributor Agreement
3. Fill in the form "For Organizations"
4. Check the box to agree
5. Click the SEND button
We will email you (or your employer) with further instructions.
(END OF COMMENT)
Once they click SEND, we (ascribe) will get an email with the information in the form. (Troy gets those emails for sure, I'm not sure who else.) The next step is to send an email to the email address submitted in the form, saying something like (where the stuff in [square brackets] should be replaced):
Hi [NAME],
The next step is for you to copy the following block of text into the comments of Pull Request #[NN] on GitHub:
BEGIN BLOCK
This is to confirm that I agreed to and accepted the BigchainDB [Entity/Individual] Contributor Agreement at https://www.bigchaindb.com/cla/ and to represent and warrant that I have authority to do so.
[Insert long random string here. One good source of those is https://www.grc.com/passwords.htm ]
END BLOCK
(END OF EMAIL)
The next step is to wait for them to copy that comment into the comments of the indicated pull request. Once they do so, it's safe to merge the pull request.
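The "long random string" in the email template can also be generated locally instead of being copied from a website. The following is a minimal sketch, assuming Python 3.6+ (for the standard `secrets` module); the function name `make_confirmation_token` is hypothetical and not part of any BigchainDB tooling.

```python
import secrets


def make_confirmation_token(num_bytes=32):
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


# Paste the printed value into the [Insert long random string here] slot.
token = make_confirmation_token()
print(token)
```

Because the string is unpredictable, seeing it posted in the pull request comments is reasonable evidence that the commenter is the person who received the email.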

32
codecov.yml Normal file

@ -0,0 +1,32 @@
codecov:
  branch: develop # the branch to show by default
  # The help text for bot says:
  # "the username that will consume any oauth requests
  # must have previously logged into Codecov"
  # In GitHub - BigchainDB organization settings - Third-party access,
  # it says, for Codecov: "approval requested by r-marques"
  bot: r-marques
coverage:
  precision: 2
  round: down
  range: "70...100"
  status:
    project:
      target: auto
      if_no_uploads: error
    patch:
      target: "80%"
      if_no_uploads: error
  ignore: # files and folders that will be removed during processing
    - "deploy-cluster-aws/*"
    - "docs/*"
    - "tests/*"
comment:
  layout: "header, diff, changes, sunburst, suggestions"
  behavior: default


@ -0,0 +1,34 @@
# -*- coding: utf-8 -*-
"""Shared AWS-related global constants and functions.
"""
from __future__ import unicode_literals
# Global constants
# None yet
# Functions
def get_naeips(client0):
    """Get a list of (allocated) non-associated elastic IP addresses
    (NAEIPs) on EC2.
    Args:
        client0: A client created from an EC2 resource.
                 e.g. client0 = ec2.meta.client
                 See http://boto3.readthedocs.org/en/latest/guide/clients.html
    Returns:
        A list of NAEIPs in the EC2 account associated with the client.
        To interpret the contents, see http://tinyurl.com/hrnuy74
    """
    # response is a dict with 2 keys: Addresses and ResponseMetadata
    # See http://tinyurl.com/hrnuy74
    response = client0.describe_addresses()
    allocated_eips = response['Addresses']
    non_associated_eips = []
    for eip in allocated_eips:
        if 'InstanceId' not in eip:
            non_associated_eips.append(eip)
    return non_associated_eips


@ -0,0 +1,105 @@
#
# RethinkDB instance configuration sample
#
# - Give this file the extension .conf and put it in /etc/rethinkdb/instances.d in order to enable it.
# - See http://www.rethinkdb.com/docs/guides/startup/ for the complete documentation
# - Uncomment an option to change its value.
#
###############################
## RethinkDB configuration
###############################
### Process options
## User and group used to run rethinkdb
## Command line default: do not change user or group
## Init script default: rethinkdb user and group
# runuser=rethinkdb
# rungroup=rethinkdb
## Stash the pid in this file when the process is running
## Note for systemd users: Systemd uses its own internal mechanism. Do not set this parameter.
## Command line default: none
## Init script default: /var/run/rethinkdb/<name>/pid_file (where <name> is the name of this config file without the extension)
# pid-file=/var/run/rethinkdb/rethinkdb.pid
### File path options
## Directory to store data and metadata
## Command line default: ./rethinkdb_data
## Init script default: /var/lib/rethinkdb/<name>/ (where <name> is the name of this file without the extension)
directory=/data
## Log file options
## Default: <directory>/log_file
#log-file=/var/log/rethinkdb
### Network options
## Address of local interfaces to listen on when accepting connections
## May be 'all' or an IP address, loopback addresses are enabled by default
## Default: all local addresses
# bind=127.0.0.1
bind=all
## Address that other rethinkdb instances will use to connect to this server.
## It can be specified multiple times
# canonical-address=
## The port for rethinkdb protocol for client drivers
## Default: 28015 + port-offset
# driver-port=28015
## The port for receiving connections from other nodes
## Default: 29015 + port-offset
# cluster-port=29015
## The host:port of a node that rethinkdb will connect to
## This option can be specified multiple times.
## Default: none
# join=example.com:29015
## All ports used locally will have this value added
## Default: 0
# port-offset=0
## r.http(...) queries will use the given server as a web proxy
## Default: no proxy
# reql-http-proxy=socks5://example.com:1080
### Web options
## Port for the http admin console
## Default: 8080 + port-offset
# http-port=8080
## Disable web administration console
# no-http-admin
### CPU options
## The number of cores to use
## Default: total number of cores of the CPU
# cores=2
### Memory options
## Size of the cache in MB
## Default: Half of the available RAM on startup
# cache-size=1024
### Disk
## How many simultaneous I/O operations can happen at the same time
# io-threads=64
#io-threads=128
## Enable direct I/O
direct-io
### Meta
## The name for this server (as will appear in the metadata).
## If not specified, it will be randomly chosen from a short list of names.
# server-name=server1


@ -0,0 +1,43 @@
# -*- coding: utf-8 -*-
"""(Re)create the RethinkDB configuration file conf/rethinkdb.conf.
Start with conf/rethinkdb.conf.template
then append additional configuration settings (lines).
"""
from __future__ import unicode_literals
import os
import os.path
import shutil
from hostlist import hosts_dev
# cwd = current working directory
old_cwd = os.getcwd()
os.chdir('conf')
if os.path.isfile('rethinkdb.conf'):
    os.remove('rethinkdb.conf')
# Create the initial rethinkdb.conf using rethinkdb.conf.template
shutil.copy2('rethinkdb.conf.template', 'rethinkdb.conf')
# Append additional lines to rethinkdb.conf
with open('rethinkdb.conf', 'a') as f:
    f.write('## The host:port of a node that RethinkDB will connect to\n')
    for public_dns_name in hosts_dev:
        f.write('join=' + public_dns_name + ':29015\n')
os.chdir(old_cwd)
# Note: The original code by Andreas wrote a file with lines of the form
# join=public_dns_name_0:29015
# join=public_dns_name_1:29015
# but it stopped about halfway through the list of public_dns_names
# (publist). In principle, it's only strictly necessary to
# have one join= line.
# Maybe Andreas thought that more is better, but all is too much?
# Below is Andreas' original code. -Troy
# lfile = open('add2dbconf', 'w')
# before = 'join='
# after = ':29015'
# lfile.write('## The host:port of a node that rethinkdb will connect to\n')
# for entry in range(0,int(len(publist)/2)):
# lfile.write(before + publist[entry] + after + '\n')


@ -0,0 +1,27 @@
# -*- coding: utf-8 -*-
""" Generating genesis block
"""
from __future__ import with_statement, unicode_literals
from fabric import colors as c
from fabric.api import *
from fabric.api import local, puts, settings, hide, abort, lcd, prefix
from fabric.api import run, sudo, cd, get, local, lcd, env, hide
from fabric.api import task, parallel
from fabric.contrib import files
from fabric.contrib.files import append, exists
from fabric.contrib.console import confirm
from fabric.contrib.project import rsync_project
from fabric.operations import run, put
from fabric.context_managers import settings
from fabric.decorators import roles
from fabtools import *
env.user = 'ubuntu'
env.key_filename = 'pem/bigchaindb.pem'
@task
def init_bigchaindb():
    run('bigchaindb -y start &', pty=False)

197
deploy-cluster-aws/fabfile.py vendored Normal file

@ -0,0 +1,197 @@
# -*- coding: utf-8 -*-
"""A fabfile with functionality to prepare, install, and configure
bigchaindb, including its storage backend.
"""
from __future__ import with_statement, unicode_literals
import requests
from time import *
import os
from datetime import datetime, timedelta
import json
from pprint import pprint
from fabric import colors as c
from fabric.api import *
from fabric.api import local, puts, settings, hide, abort, lcd, prefix
from fabric.api import run, sudo, cd, get, local, lcd, env, hide
from fabric.api import task, parallel
from fabric.contrib import files
from fabric.contrib.files import append, exists
from fabric.contrib.console import confirm
from fabric.contrib.project import rsync_project
from fabric.operations import run, put
from fabric.context_managers import settings
from fabric.decorators import roles
from fabtools import *
from hostlist import hosts_dev
env.hosts = hosts_dev
env.roledefs = {
"role1": hosts_dev,
"role2": [hosts_dev[0]],
}
env.roles = ["role1"]
env.user = 'ubuntu'
env.key_filename = 'pem/bigchaindb.pem'
######################################################################
# base software rollout
@task
@parallel
def install_base_software():
# new from Troy April 5, 2016. Why? See http://tinyurl.com/lccfrsj
# sudo('rm -rf /var/lib/apt/lists/*')
# sudo('apt-get -y clean')
# from before:
sudo('apt-get -y update')
sudo('dpkg --configure -a')
sudo('apt-get -y -f install')
sudo('apt-get -y install build-essential wget bzip2 ca-certificates \
libglib2.0-0 libxext6 libsm6 libxrender1 libssl-dev \
git gcc g++ python-dev libboost-python-dev libffi-dev \
software-properties-common python-software-properties \
python3-pip ipython3 sysstat s3cmd')
# RethinkDB
@task
@parallel
def install_rethinkdb():
"""Installation of RethinkDB"""
with settings(warn_only=True):
# preparing filesystem
sudo("mkdir -p /data")
# Locally mounted storage (m3.2xlarge, but also c3.xxx)
try:
sudo("umount /mnt")
sudo("mkfs -t ext4 /dev/xvdb")
sudo("mount /dev/xvdb /data")
except:
pass
# persist settings to fstab
sudo("rm -rf /etc/fstab")
sudo("echo 'LABEL=cloudimg-rootfs / ext4 defaults,discard 0 0' >> /etc/fstab")
sudo("echo '/dev/xvdb /data ext4 defaults,noatime 0 0' >> /etc/fstab")
# activate deadline scheduler
with settings(sudo_user='root'):
sudo("echo deadline > /sys/block/xvdb/queue/scheduler")
# install rethinkdb
sudo("echo 'deb http://download.rethinkdb.com/apt trusty main' | sudo tee /etc/apt/sources.list.d/rethinkdb.list")
sudo("wget -qO- http://download.rethinkdb.com/apt/pubkey.gpg | sudo apt-key add -")
sudo("apt-get update")
sudo("apt-get -y install rethinkdb")
# change fs to user
sudo('chown -R rethinkdb:rethinkdb /data')
# copy config file to target system
put('conf/rethinkdb.conf',
'/etc/rethinkdb/instances.d/instance1.conf', mode=0600, use_sudo=True)
# initialize data-dir
sudo('rm -rf /data/*')
# finally restart instance
sudo('/etc/init.d/rethinkdb restart')
# bigchaindb deployment
@task
@parallel
def install_bigchaindb():
sudo('python3 -m pip install bigchaindb')
# startup all nodes of bigchaindb in cluster
@task
@parallel
def start_bigchaindb_nodes():
sudo('screen -d -m bigchaindb -y start &', pty=False)
@task
def install_newrelic():
with settings(warn_only=True):
sudo('echo deb http://apt.newrelic.com/debian/ newrelic non-free >> /etc/apt/sources.list')
# sudo('apt-key adv --keyserver hkp://subkeys.pgp.net --recv-keys 548C16BF')
sudo('apt-get update')
sudo('apt-get -y --force-yes install newrelic-sysmond')
sudo('nrsysmond-config --set license_key=c88af00c813983f8ee12e9b455aa13fde1cddaa8')
sudo('/etc/init.d/newrelic-sysmond restart')
###############################
# Security / FirewallStuff next
###############################
@task
def harden_sshd():
"""Security harden sshd."""
# Disable password authentication
sed('/etc/ssh/sshd_config',
'#PasswordAuthentication yes',
'PasswordAuthentication no',
use_sudo=True)
# Deny root login
sed('/etc/ssh/sshd_config',
'PermitRootLogin yes',
'PermitRootLogin no',
use_sudo=True)
@task
def disable_root_login():
"""Disable `root` login for even more security. Access to `root` account
is now possible by first connecting with your dedicated maintenance
account and then running ``sudo su -``."""
sudo('passwd --lock root')
@task
def set_fw():
# snmp
sudo('iptables -A INPUT -p tcp --dport 161 -j ACCEPT')
sudo('iptables -A INPUT -p udp --dport 161 -j ACCEPT')
# dns
sudo('iptables -A OUTPUT -p udp -o eth0 --dport 53 -j ACCEPT')
sudo('iptables -A INPUT -p udp -i eth0 --sport 53 -j ACCEPT')
# rethinkdb
sudo('iptables -A INPUT -p tcp --dport 28015 -j ACCEPT')
sudo('iptables -A INPUT -p udp --dport 28015 -j ACCEPT')
sudo('iptables -A INPUT -p tcp --dport 29015 -j ACCEPT')
sudo('iptables -A INPUT -p udp --dport 29015 -j ACCEPT')
sudo('iptables -A INPUT -p tcp --dport 8080 -j ACCEPT')
sudo('iptables -A INPUT -i eth0 -p tcp --dport 8080 -j DROP')
sudo('iptables -I INPUT -i eth0 -s 127.0.0.1 -p tcp --dport 8080 -j ACCEPT')
# save rules
sudo('iptables-save > /etc/sysconfig/iptables')
#########################################################
# some helper-functions to handle bad behavior of cluster
#########################################################
# rebuild indexes
@task
@parallel
def rebuild_indexes():
run('rethinkdb index-rebuild -n 2')
@task
def stopdb():
sudo('service rethinkdb stop')
@task
def startdb():
sudo('service rethinkdb start')
@task
def restartdb():
sudo('/etc/init.d/rethinkdb restart')


@ -0,0 +1,194 @@
# -*- coding: utf-8 -*-
"""This script:
0. allocates more elastic IP addresses if necessary,
1. launches the specified number of nodes (instances) on Amazon EC2,
2. tags them with the specified tag,
3. waits until those instances exist and are running,
4. for each instance, it associates an elastic IP address
with that instance,
5. writes the shellscript add2known_hosts.sh
6. (over)writes a file named hostlist.py
containing a list of all public DNS names.
"""
from __future__ import unicode_literals
import sys
import time
import argparse
import botocore
import boto3
from awscommon import (
get_naeips,
)
# First, ensure they're using Python 2.5-2.7
pyver = sys.version_info
major = pyver[0]
minor = pyver[1]
print('You are in an environment where "python" is Python {}.{}'.
format(major, minor))
if not ((major == 2) and (minor >= 5) and (minor <= 7)):
print('but Fabric only works with Python 2.5-2.7')
sys.exit(1)
# Parse the command-line arguments
parser = argparse.ArgumentParser()
parser.add_argument("--tag",
help="tag to add to all launched instances on AWS",
required=True)
parser.add_argument("--nodes",
help="number of nodes in the cluster",
required=True,
type=int)
args = parser.parse_args()
tag = args.tag
num_nodes = int(args.nodes)
# Get an AWS EC2 "resource"
# See http://boto3.readthedocs.org/en/latest/guide/resources.html
ec2 = boto3.resource(service_name='ec2')
# Create a client from the EC2 resource
# See http://boto3.readthedocs.org/en/latest/guide/clients.html
client = ec2.meta.client
# Ensure they don't already have some instances with the specified tag
# Get a list of all instances with the specified tag.
# (Technically, instances_with_tag is an ec2.instancesCollection.)
filters = [{'Name': 'tag:Name', 'Values': [tag]}]
instances_with_tag = ec2.instances.filter(Filters=filters)
# len() doesn't work on instances_with_tag. This does:
num_ins = 0
for instance in instances_with_tag:
num_ins += 1
if num_ins != 0:
print('You already have {} instances with the tag {} on EC2.'.
format(num_ins, tag))
print('You should either pick a different tag or '
'terminate all those instances and '
'wait until they vanish from your EC2 Console.')
sys.exit(1)
# Before launching any instances, make sure they have sufficient
# allocated-but-unassociated EC2 elastic IP addresses
print('Checking if you have enough allocated-but-unassociated ' +
'EC2 elastic IP addresses...')
non_associated_eips = get_naeips(client)
print('You have {} allocated elastic IPs which are '
'not already associated with instances'.
format(len(non_associated_eips)))
if num_nodes > len(non_associated_eips):
num_eips_to_allocate = num_nodes - len(non_associated_eips)
print('You want to launch {} instances'.
format(num_nodes))
print('so {} more elastic IPs must be allocated'.
format(num_eips_to_allocate))
for _ in range(num_eips_to_allocate):
try:
# Allocate an elastic IP address
# response is a dict. See http://tinyurl.com/z2n7u9k
response = client.allocate_address(DryRun=False, Domain='standard')
except botocore.exceptions.ClientError:
print('Something went wrong when allocating an '
'EC2 elastic IP address on EC2. '
'Maybe you are already at the maximum number allowed '
'by your AWS account? More details:')
raise
except:
print('Unexpected error:')
raise
print('Commencing launch of {} instances on Amazon EC2...'.
format(num_nodes))
for _ in range(num_nodes):
# Request the launch of one instance at a time
# (so list_of_instances should contain only one item)
list_of_instances = ec2.create_instances(
ImageId='ami-accff2b1', # ubuntu-image
# 'ami-596b7235', # ubuntu w/ iops storage
MinCount=1,
MaxCount=1,
KeyName='bigchaindb',
InstanceType='m3.2xlarge',
# 'c3.8xlarge',
# 'c4.8xlarge',
SecurityGroupIds=['bigchaindb']
)
# Tag the just-launched instances (should be just one)
for instance in list_of_instances:
time.sleep(5)
instance.create_tags(Tags=[{'Key': 'Name', 'Value': tag}])
# Get a list of all instances with the specified tag.
# (Technically, instances_with_tag is an ec2.instancesCollection.)
filters = [{'Name': 'tag:Name', 'Values': [tag]}]
instances_with_tag = ec2.instances.filter(Filters=filters)
print('The launched instances will have these ids:')
for instance in instances_with_tag:
print(instance.id)
print('Waiting until all those instances exist...')
for instance in instances_with_tag:
instance.wait_until_exists()
print('Waiting until all those instances are running...')
for instance in instances_with_tag:
instance.wait_until_running()
print('Associating allocated-but-unassociated elastic IPs ' +
'with the instances...')
# Get a list of elastic IPs which are allocated but
# not associated with any instances.
# There should be enough because we checked earlier and
# allocated more if necessary.
non_associated_eips_2 = get_naeips(client)
for i, instance in enumerate(instances_with_tag):
print('Grabbing an allocated but non-associated elastic IP...')
eip = non_associated_eips_2[i]
public_ip = eip['PublicIp']
print('The public IP address {}'.format(public_ip))
# Associate that Elastic IP address with an instance
response2 = client.associate_address(
DryRun=False,
InstanceId=instance.instance_id,
PublicIp=public_ip
)
print('was associated with the instance with id {}'.
format(instance.instance_id))
# Get a list of the public DNS names of the instances_with_tag
hosts_dev = []
for instance in instances_with_tag:
public_dns_name = getattr(instance, 'public_dns_name', None)
if public_dns_name is not None:
hosts_dev.append(public_dns_name)
# Write a shellscript to add remote keys to ~/.ssh/known_hosts
print('Preparing shellscript to add remote keys to known_hosts')
with open('add2known_hosts.sh', 'w') as f:
f.write('#!/bin/bash\n')
for public_dns_name in hosts_dev:
f.write('ssh-keyscan ' + public_dns_name + ' >> ~/.ssh/known_hosts\n')
# Create a file named hostlist.py containing hosts_dev.
# If a hostlist.py already exists, it will be overwritten.
print('Writing hostlist.py')
with open('hostlist.py', 'w') as f:
f.write('# -*- coding: utf-8 -*-\n')
f.write('from __future__ import unicode_literals\n')
f.write('hosts_dev = {}\n'.format(hosts_dev))
# Wait
wait_time = 45
print('Waiting {} seconds to make sure all instances are ready...'.
format(wait_time))
time.sleep(wait_time)

81
deploy-cluster-aws/startup.sh Executable file

@ -0,0 +1,81 @@
#! /bin/bash
# The set -e option instructs bash to immediately exit if any command has a non-zero exit status
set -e
function printErr()
{
echo "usage: ./startup.sh <tag> <number_of_nodes_in_cluster>"
echo "No argument $1 supplied"
}
if [ -z "$1" ]
then
printErr "<tag>"
exit 1
fi
if [ -z "$2" ]
then
printErr "<number_of_nodes_in_cluster>"
exit 1
fi
TAG=$1
NODES=$2
# Check for AWS private key file (.pem file)
if [ ! -f "pem/bigchaindb.pem" ]
then
echo "File pem/bigchaindb.pem (AWS private key) is missing"
exit 1
fi
# Change the file permissions on pem/bigchaindb.pem
# so that the owner can read it, but that's all
chmod 0400 pem/bigchaindb.pem
# The following Python script does these things:
# 0. allocates more elastic IP addresses if necessary,
# 1. launches the specified number of nodes (instances) on Amazon EC2,
# 2. tags them with the specified tag,
# 3. waits until those instances exist and are running,
# 4. for each instance, it associates an elastic IP address
# with that instance,
# 5. writes the shellscript add2known_hosts.sh
# 6. (over)writes a file named hostlist.py
# containing a list of all public DNS names.
python launch_ec2_nodes.py --tag $TAG --nodes $NODES
# Make add2known_hosts.sh executable then execute it.
# This adds remote keys to ~/.ssh/known_hosts
chmod +x add2known_hosts.sh
./add2known_hosts.sh
# (Re)create the RethinkDB configuration file conf/rethinkdb.conf
python create_rethinkdb_conf.py
# rollout base packages (dependencies) needed before
# storage backend (rethinkdb) and bigchaindb can be rolled out
fab install_base_software
# rollout storage backend (rethinkdb)
fab install_rethinkdb
# rollout bigchaindb
fab install_bigchaindb
# generate genesis block
# HORST is the last public_dns_name listed in conf/rethinkdb.conf
# For example:
# ec2-52-58-86-145.eu-central-1.compute.amazonaws.com
HORST=`tail -1 conf/rethinkdb.conf|cut -d: -f1|cut -d= -f2`
fab -H $HORST -f fab_prepare_chain.py init_bigchaindb
# initiate sharding
fab start_bigchaindb_nodes
# cleanup
rm add2known_hosts.sh
# DONE


@ -0,0 +1,153 @@
# Deploy a Cluster on AWS
This section explains a way to deploy a cluster of BigchainDB nodes on Amazon Web Services (AWS). We use some Bash and Python scripts to launch several instances (virtual servers) on Amazon Elastic Compute Cloud (EC2). Then we use Fabric to install RethinkDB and BigchainDB on all those instances.
**NOTE: At the time of writing, these scripts _do_ launch a bunch of EC2 instances, and they do install RethinkDB plus BigchainDB on each instance, but don't expect to be able to use the cluster for anything useful. There are several issues related to configuration, networking, and external clients that must be sorted out first. That said, you might find it useful to try out the AWS deployment scripts, because setting up to use them, and using them, will be very similar once those issues get sorted out.**
## Why?
You might ask why one would want to deploy a centrally-controlled BigchainDB cluster. Isn't BigchainDB supposed to be decentralized, where each node is controlled by a different person or organization?
That's true, but there are some reasons why one might want a centrally-controlled cluster: 1) for testing, and 2) for initial deployment. Afterwards, the control of each node can be handed over to a different entity.
## Python Setup
The instructions that follow have been tested on Ubuntu 14.04, but may also work on similar distros or operating systems.
**Note: Our Python scripts for deploying to AWS use Python 2 because Fabric doesn't work with Python 3.**
We suggest creating a Python 2 virtual environment and activating it. Then install the following Python packages (in that virtual environment):
```text
pip install fabric fabtools requests boto3 awscli
```
What did you just install?
* "[Fabric](http://www.fabfile.org/) is a Python (2.5-2.7) library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks."
* [fabtools](https://github.com/ronnix/fabtools) are "tools for writing awesome Fabric files"
* [requests](http://docs.python-requests.org/en/master/) is a Python package/library for sending HTTP requests
* "[Boto](https://boto3.readthedocs.org/en/latest/) is the Amazon Web Services (AWS) SDK for Python, which allows Python developers to write software that makes use of Amazon services like S3 and EC2." (`boto3` is the name of the latest Boto package.)
* [The aws-cli package](https://pypi.python.org/pypi/awscli), which is an AWS Command Line Interface (CLI).
## AWS Setup
Before you can deploy a BigchainDB cluster on AWS, you must have an AWS account. If you don't already have one, you can [sign up for one for free](https://aws.amazon.com/).
### Create an AWS Access Key
The next thing you'll need is an AWS access key. If you don't have one, you can create one using the [instructions in the AWS documentation](http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSGettingStartedGuide/AWSCredentials.html). You should get an access key ID (e.g. AKIAIOSFODNN7EXAMPLE) and a secret access key (e.g. wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY).
You should also pick a default AWS region name (e.g. `eu-central-1`). That's where your cluster will run. The AWS documentation has [a list of them](http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region).
Once you've got your AWS access key, and you've picked a default AWS region name, go to a terminal session and enter:
```text
aws configure
```
and answer the four questions. For example:
```text
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: eu-central-1
Default output format [None]: [Press Enter]
```
This writes two files:
* `~/.aws/credentials`
* `~/.aws/config`
AWS tools and packages look for those files.
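Both files are plain INI files with a `[default]` section, so they are easy to inspect. The following is a minimal sketch, assuming Python 3 (unlike the deployment scripts, which use Python 2) and using the example key values from above as in-memory sample text rather than reading your real files; the function name `read_default_profile` is hypothetical.

```python
import configparser

# Sample contents in the shape that `aws configure` writes.
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
"""

SAMPLE_CONFIG = """\
[default]
region = eu-central-1
"""


def read_default_profile(credentials_text, config_text):
    """Return the access key ID and region of the [default] profile."""
    creds = configparser.ConfigParser(interpolation=None)
    creds.read_string(credentials_text)
    conf = configparser.ConfigParser(interpolation=None)
    conf.read_string(config_text)
    return {
        'access_key_id': creds['default']['aws_access_key_id'],
        'region': conf['default']['region'],
    }


print(read_default_profile(SAMPLE_CREDENTIALS, SAMPLE_CONFIG))
```

To check your own setup, you could pass in the text of `~/.aws/credentials` and `~/.aws/config` instead of the samples.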
### Get Enough Amazon Elastic IP Addresses
Our AWS deployment scripts use elastic IP addresses (although that may change in the future). By default, AWS accounts get five elastic IP addresses. If you want to deploy a cluster with more than five nodes, then you will need more than five elastic IP addresses; you may have to apply for those; see [the AWS documentation on elastic IP addresses](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html).
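To see how many more elastic IP addresses a deployment would need, you can count the allocated addresses that are not yet associated with an instance. The sketch below mirrors the check in `get_naeips` (an address without an `InstanceId` key counts as unassociated), but it runs on a hand-written sample list instead of a real `describe_addresses()` response; the function name `count_eips_to_allocate`, the IP addresses, and the instance ID are made up for illustration.

```python
def count_eips_to_allocate(addresses, num_nodes):
    """Return how many more elastic IPs are needed for num_nodes instances.

    addresses is a list of dicts shaped like the 'Addresses' entries
    returned by the EC2 describe_addresses call.
    """
    non_associated = [a for a in addresses if 'InstanceId' not in a]
    return max(0, num_nodes - len(non_associated))


# Hypothetical sample data: one associated address, two free ones.
sample_addresses = [
    {'PublicIp': '203.0.113.10', 'InstanceId': 'i-0example1'},
    {'PublicIp': '203.0.113.11'},
    {'PublicIp': '203.0.113.12'},
]

# A 4-node cluster with 2 free addresses needs 2 more.
print(count_eips_to_allocate(sample_addresses, 4))
```

If the result pushes you past your account's limit, that is the point at which you would have to apply for more addresses.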
### Create an Amazon EC2 Key Pair
Go to the AWS EC2 Console and select "Key Pairs" in the left sidebar. Click the "Create Key Pair" button. Give it the name `bigchaindb`. You should be prompted to save a file named `bigchaindb.pem`. That file contains the RSA private key. (Amazon keeps the corresponding public key.) Save the file in `bigchaindb/deploy-cluster-aws/pem/bigchaindb.pem`.
You should not share your private key.
### Create an Amazon EC2 Security Group
Go to the AWS EC2 Console and select "Security Groups" in the left sidebar. Click the "Create Security Group" button. Give it the name `bigchaindb`. The description probably doesn't matter but we also put `bigchaindb` for that.
Add some rules for Inbound traffic:
* Type = All TCP, Protocol = TCP, Port Range = 0-65535, Source = 0.0.0.0/0
* Type = SSH, Protocol = TCP, Port Range = 22, Source = 0.0.0.0/0
* Type = All UDP, Protocol = UDP, Port Range = 0-65535, Source = 0.0.0.0/0
* Type = All ICMP, Protocol = ICMP, Port Range = 0-65535, Source = 0.0.0.0/0
**Note: These rules are extremely lax! They're meant to make testing easy.** You'll want to tighten them up if you intend to have a secure cluster. For example, Source = 0.0.0.0/0 is [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing) for "allow this traffic to come from _any_ IP address."
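To get a feel for what a given Source value admits, you can test addresses against a CIDR block with Python's standard `ipaddress` module. This is a minimal sketch, assuming Python 3; the helper name `source_allows` and the sample addresses (taken from documentation-only ranges) are illustrative.

```python
import ipaddress


def source_allows(source_cidr, client_ip):
    """Return True if client_ip falls inside the source_cidr block."""
    network = ipaddress.ip_network(source_cidr)
    return ipaddress.ip_address(client_ip) in network


# 0.0.0.0/0 matches every IPv4 address.
print(source_allows('0.0.0.0/0', '198.51.100.7'))
# A /24 block only matches the 256 addresses that share its first three octets.
print(source_allows('203.0.113.0/24', '198.51.100.7'))
print(source_allows('203.0.113.0/24', '203.0.113.42'))
```

A tighter rule would replace 0.0.0.0/0 with the smallest block that still covers the machines that need access.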
## Deployment
Here's an example of how one could launch a BigchainDB cluster of 4 nodes tagged `wrigley` on AWS:
```text
cd bigchaindb
cd deploy-cluster-aws
./startup.sh wrigley 4
```
`startup.sh` is a Bash script which calls some Python 2 and Fabric scripts. Here's what it does:
0. allocates more elastic IP addresses if necessary,
1. launches the specified number of nodes (instances) on Amazon EC2,
2. tags them with the specified tag,
3. waits until those instances exist and are running,
4. for each instance, it associates an elastic IP address with that instance,
5. adds remote keys to `~/.ssh/known_hosts`,
6. (re)creates the RethinkDB configuration file `conf/rethinkdb.conf`,
7. installs base (prerequisite) software on all instances,
8. installs RethinkDB on all instances,
9. installs BigchainDB on all instances,
10. generates the genesis block,
11. starts BigchainDB on all instances.
It should take a few minutes for the deployment to finish. If you run into problems, see the section on Known Deployment Issues below.
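Step 6 in the list above appends one `join=` line per node to the RethinkDB configuration. A minimal sketch of that text generation follows, assuming the cluster port 29015 used in the configuration template; the function name `build_join_lines` and the host names are hypothetical, and the sketch returns a string instead of writing `conf/rethinkdb.conf`.

```python
def build_join_lines(hosts, cluster_port=29015):
    """Return the block of join= lines for the given public DNS names."""
    lines = ['## The host:port of a node that RethinkDB will connect to']
    lines.extend('join={}:{}'.format(host, cluster_port) for host in hosts)
    return '\n'.join(lines) + '\n'


# Hypothetical public DNS names, in the shape hostlist.py would provide.
example_hosts = [
    'ec2-203-0-113-10.eu-central-1.compute.amazonaws.com',
    'ec2-203-0-113-11.eu-central-1.compute.amazonaws.com',
]
print(build_join_lines(example_hosts))
```

Every node gets the same list, so each instance can find the rest of the cluster through any of the listed hosts.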
The EC2 Console has a section where you can see all the instances you have running on EC2. You can `ssh` into a running instance using a command like:
```text
ssh -i pem/bigchaindb.pem ubuntu@ec2-52-29-197-211.eu-central-1.compute.amazonaws.com
```
except you'd replace the `ec2-52-29-197-211.eu-central-1.compute.amazonaws.com` with the public DNS name of the instance you want to `ssh` into. You can get that from the EC2 Console: just click on an instance and look in its details pane at the bottom of the screen. Some commands you might try:
```text
ip addr show
sudo service rethinkdb status
bigchaindb --help
bigchaindb show-config
```
There are fees associated with running instances on EC2, so if you're not using them, you should terminate them. You can do that from the AWS EC2 Console.
The same is true of your allocated elastic IP addresses. There's a small fee to keep them allocated if they're not associated with a running instance. You can release them from the AWS EC2 Console.
## Known Deployment Issues
### NetworkError
If you deploy several times in quick succession, you might run into an error message like this:
```text
NetworkError: Host key for ec2-xx-xx-xx-xx.eu-central-1.compute.amazonaws.com
did not match pre-existing key! Server's key was changed recently, or possible
man-in-the-middle attack.
```
If so, just clean up your `known_hosts` file and start again. For example, you might move your current `known_hosts` file to `old_known_hosts` like so:
```text
mv ~/.ssh/known_hosts ~/.ssh/old_known_hosts
```
Then terminate your instances and try deploying again with a different tag.
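If you would rather keep the rest of your `known_hosts` entries, you can drop only the lines for the affected hosts (the `ssh-keygen -R <hostname>` command does this for a single host). The sketch below shows the idea on in-memory text, assuming Python 3 and plain (non-hashed) host names in the first field; the function name `drop_host_entries` and the sample hosts and keys are made up, and the sketch does not touch your real file.

```python
def drop_host_entries(known_hosts_text, stale_hosts):
    """Return known_hosts_text without the entries for stale_hosts.

    Only plain host names are matched; hashed entries are left as they are.
    """
    stale = set(stale_hosts)
    kept = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        # The first field is a comma-separated list of names and addresses.
        names = set(fields[0].split(',')) if fields else set()
        if names & stale:
            continue
        kept.append(line)
    return '\n'.join(kept) + ('\n' if kept else '')


sample = (
    'ec2-a.example.com ssh-rsa AAAA1\n'
    'ec2-b.example.com,203.0.113.5 ssh-rsa AAAA2\n'
)
print(drop_host_entries(sample, ['ec2-a.example.com']))
```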
### Failure of sudo apt-get update
The first thing that's done on all the instances, once they're running, is basically [`sudo apt-get update`](http://askubuntu.com/questions/222348/what-does-sudo-apt-get-update-do). Sometimes that fails. If so, just terminate your instances and try deploying again with a different tag. (These problems seem to be time-bounded, so maybe wait a couple of hours before retrying.)
### Failure when Installing Base Software
If you get an error with installing the base software on the instances, then just terminate your instances and try deploying again with a different tag.


@ -20,6 +20,7 @@ Table of Contents
http-client-server-api
python-driver-api-examples
local-rethinkdb-cluster
deploy-on-aws
cryptography
models
json-serialization


@ -44,7 +44,7 @@ $ sudo dnf install libffi-devel gcc-c++ redhat-rpm-config python3-devel openssl-
With OS-level dependencies installed, you can install BigchainDB Server with `pip` or from source.
### How to Install BigchainDB with `pip`
### How to Install BigchainDB with pip
BigchainDB (i.e. both the Server and the officially-supported drivers) is distributed as a Python package on PyPI so you can install it using `pip`. First, make sure you have a version of `pip` installed for Python 3.4+:
```text


@ -40,8 +40,10 @@ At a high level, a "digital asset" is something which can be represented digital
In BigchainDB, only the federation nodes are allowed to create digital assets, by doing a special kind of transaction: a `CREATE` transaction.
```python
from bigchaindb import crypto
# create a test user
testuser1_priv, testuser1_pub = b.generate_keys()
testuser1_priv, testuser1_pub = crypto.generate_key_pair()
# define a digital asset data payload
digital_asset_payload = {'msg': 'Hello BigchainDB!'}


@ -26,7 +26,7 @@ You can also run all unit tests via `setup.py`, using:
$ python setup.py test
```
### Using `docker-compose` to Run the Tests
### Using docker-compose to Run the Tests
You can also use `docker-compose` to run the unit tests. (You don't have to start RethinkDB first: `docker-compose` does that on its own, when it reads the `docker-compose.yml` file.)


@ -71,7 +71,7 @@ setup(
'rethinkdb==2.2.0.post4',
'pysha3==0.3',
'pytz==2015.7',
'cryptography==1.2.1',
'cryptography==1.2.3',
'statsd==3.2.1',
'python-rapidjson==0.0.6',
'logstats==0.2.1',