Merge branch 'develop' into feat/116/more-solid-webserver

2024-10-13 13:34:05 +00:00 · 2016-04-07 11:00:47 +02:00 · 2016-04-07 11:00:47 +02:00 · b11cbce5cd
commit b11cbce5cd
parent b988b3f6f7 ebb6e1a882
17 changed files with 931 additions and 4 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -65,3 +65,9 @@ target/
 # pyenv
 .python-version

+# Private key files from AWS
+*.pem
+
+# Some files created when deploying a cluster on AWS
+deploy-cluster-aws/conf/rethinkdb.conf
+deploy-cluster-aws/hostlist.py
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -19,6 +19,7 @@ Tag name: TBD
 committed: TBD

 ### Added
+- AWS deployment scripts: [Issue #151](https://github.com/bigchaindb/bigchaindb/issues/151)
 - `CHANGELOG.md` (this file)
 - Multisig support: [Pull Request #107](https://github.com/bigchaindb/bigchaindb/pull/107)
 - API/Wire protocol (RESTful HTTP API): [Pull Request #102](https://github.com/bigchaindb/bigchaindb/pull/102)
--- a/HOW_TO_HANDLE_PULL_REQUESTS.md
+++ b/HOW_TO_HANDLE_PULL_REQUESTS.md
@@ -0,0 +1,51 @@
+# How to Handle Pull Requests
+
+This document is for whoever has the ability to merge pull requests in the Git repositories associated with BigchainDB.
+
+If the pull request is from an employee of ascribe GmbH, then you can ignore this document.
+
+If the pull request is from someone who is _not_ an employee of ascribe, then:
+
+* Have they agreed to the Individual Contributor Agreement in the past? (Troy, Greg, and others have a list.) If yes, then you can merge the PR and ignore the rest of this document.
+* Do they belong to a company or organization which agreed to the Entity Contributor Agreement in the past, and will they be contributing on behalf of that company or organization? (Troy, Greg, and others have a list.) If yes, then you can merge the PR and ignore the rest of this document.
+* Otherwise, go to the pull request in question and post a comment using this template:
+
+Hi @nameofuser
+
+Before we can merge this pull request, which may contain your intellectual property in the form of copyright or patents, our lawyers say we need you or your organization to agree to one of our contributor agreements. If you are contributing on behalf of yourself (and not on behalf of your employer or another organization you are part of) then you should:
+
+1. Go to: https://www.bigchaindb.com/cla/
+2. Read the Individual Contributor Agreement
+3. Fill in the form "For Individuals"
+4. Check the box to agree
+5. Click the SEND button
+
+If you're contributing as an employee, and/or you want all employees of your employing organization to be covered by our contributor agreement, then someone in your organization with the authority to enter agreements on behalf of all employees must do the following:
+
+1. Go to: https://www.bigchaindb.com/cla/
+2. Read the Entity Contributor Agreement
+3. Fill in the form "For Organizations"
+4. Check the box to agree
+5. Click the SEND button
+
+We will email you (or your employer) with further instructions.
+
+(END OF COMMENT)
+
+Once they click SEND, we (ascribe) will get an email with the information in the form. (Troy gets those emails for sure; I'm not sure who else does.) The next step is to send an email to the email address submitted in the form, saying something like the following (where the stuff in [square brackets] should be replaced):
+
+Hi [NAME],
+
+The next step is for you to copy the following block of text into the comments of Pull Request #[NN] on GitHub:
+
+BEGIN BLOCK
+
+This is to confirm that I agreed to and accepted the BigchainDB [Entity/Individual] Contributor Agreement at https://www.bigchaindb.com/cla/ and to represent and warrant that I have authority to do so.
+
+[Insert long random string here. One good source of those is https://www.grc.com/passwords.htm ]
+
+END BLOCK
+
+(END OF EMAIL)
+
+The next step is to wait for them to copy that comment into the comments of the indicated pull request. Once they do so, it's safe to merge the pull request.
--- a/codecov.yml
+++ b/codecov.yml
@@ -0,0 +1,32 @@
+codecov:
+  branch: develop    # the branch to show by default
+
+  # The help text for bot says:
+  # "the username that will consume any oauth requests
+  # must have previously logged into Codecov"
+  # In GitHub - BigchainDB organization settings - Third-party access,
+  # it says, for Codecov: "approval requested by r-marques"
+  bot: r-marques
+
+coverage:
+  precision: 2
+  round: down
+  range: "70...100"
+
+  status:
+    project:
+      target: auto
+      if_no_uploads: error
+
+    patch:
+      target: "80%"
+      if_no_uploads: error
+
+  ignore:          # files and folders that will be removed during processing
+    - "deploy-cluster-aws/*"
+    - "docs/*"
+    - "tests/*"
+
+comment:
+  layout: "header, diff, changes, sunburst, suggestions"
+  behavior: default
--- a/deploy-cluster-aws/awscommon.py
+++ b/deploy-cluster-aws/awscommon.py
@@ -0,0 +1,34 @@
+# -*- coding: utf-8 -*-
+"""Shared AWS-related global constants and functions.
+"""
+
+from __future__ import unicode_literals
+
+
+# Global constants
+# None yet
+
+
+# Functions
+def get_naeips(client0):
+    """Get a list of (allocated) non-associated elastic IP addresses
+       (NAEIPs) on EC2.
+
+    Args:
+        client0: A client created from an EC2 resource.
+                 e.g. client0 = ec2.meta.client
+                 See http://boto3.readthedocs.org/en/latest/guide/clients.html
+
+    Returns:
+        A list of NAEIPs in the EC2 account associated with the client.
+        To interpret the contents, see http://tinyurl.com/hrnuy74
+    """
+    # response is a dict with 2 keys: Addresses and ResponseMetadata
+    # See http://tinyurl.com/hrnuy74
+    response = client0.describe_addresses()
+    allocated_eips = response['Addresses']
+    non_associated_eips = []
+    for eip in allocated_eips:
+        if 'InstanceId' not in eip:
+            non_associated_eips.append(eip)
+    return non_associated_eips
--- a/deploy-cluster-aws/conf/rethinkdb.conf.template
+++ b/deploy-cluster-aws/conf/rethinkdb.conf.template
@@ -0,0 +1,105 @@
+#
+# RethinkDB instance configuration sample
+#
+# - Give this file the extension .conf and put it in /etc/rethinkdb/instances.d in order to enable it.
+# - See http://www.rethinkdb.com/docs/guides/startup/ for the complete documentation
+# - Uncomment an option to change its value.
+#
+
+###############################
+## RethinkDB configuration
+###############################
+
+### Process options
+
+## User and group used to run rethinkdb
+## Command line default: do not change user or group
+## Init script default: rethinkdb user and group
+# runuser=rethinkdb
+# rungroup=rethinkdb
+
+## Stash the pid in this file when the process is running
+## Note for systemd users: Systemd uses its own internal mechanism. Do not set this parameter.
+## Command line default: none
+## Init script default: /var/run/rethinkdb/<name>/pid_file (where <name> is the name of this config file without the extension)
+# pid-file=/var/run/rethinkdb/rethinkdb.pid
+
+### File path options
+
+## Directory to store data and metadata
+## Command line default: ./rethinkdb_data
+## Init script default: /var/lib/rethinkdb/<name>/ (where <name> is the name of this file without the extension)
+directory=/data
+
+## Log file options
+## Default: <directory>/log_file
+#log-file=/var/log/rethinkdb
+
+### Network options
+
+## Address of local interfaces to listen on when accepting connections
+## May be 'all' or an IP address, loopback addresses are enabled by default
+## Default: all local addresses
+# bind=127.0.0.1
+bind=all
+
+## Address that other rethinkdb instances will use to connect to this server.
+## It can be specified multiple times
+# canonical-address=
+
+## The port for rethinkdb protocol for client drivers
+## Default: 28015 + port-offset
+# driver-port=28015
+
+## The port for receiving connections from other nodes
+## Default: 29015 + port-offset
+# cluster-port=29015
+
+## The host:port of a node that rethinkdb will connect to
+## This option can be specified multiple times.
+## Default: none
+# join=example.com:29015
+
+## All ports used locally will have this value added
+## Default: 0
+# port-offset=0
+
+## r.http(...) queries will use the given server as a web proxy
+## Default: no proxy
+# reql-http-proxy=socks5://example.com:1080
+
+### Web options
+
+## Port for the http admin console
+## Default: 8080 + port-offset
+# http-port=8080
+
+## Disable web administration console
+# no-http-admin
+
+### CPU options
+
+## The number of cores to use
+## Default: total number of cores of the CPU
+# cores=2
+
+### Memory options
+
+## Size of the cache in MB
+## Default: Half of the available RAM on startup
+# cache-size=1024
+
+### Disk
+
+## How many simultaneous I/O operations can happen at the same time
+# io-threads=64
+#io-threads=128
+
+## Enable direct I/O
+direct-io
+
+### Meta
+
+## The name for this server (as will appear in the metadata).
+## If not specified, it will be randomly chosen from a short list of names.
+# server-name=server1
--- a/deploy-cluster-aws/create_rethinkdb_conf.py
+++ b/deploy-cluster-aws/create_rethinkdb_conf.py
@@ -0,0 +1,43 @@
+# -*- coding: utf-8 -*-
+"""(Re)create the RethinkDB configuration file conf/rethinkdb.conf.
+Start with conf/rethinkdb.conf.template
+then append additional configuration settings (lines).
+"""
+
+from __future__ import unicode_literals
+import os
+import os.path
+import shutil
+from hostlist import hosts_dev
+
+# cwd = current working directory
+old_cwd = os.getcwd()
+os.chdir('conf')
+if os.path.isfile('rethinkdb.conf'):
+    os.remove('rethinkdb.conf')
+
+# Create the initial rethinkdb.conf using rethinkdb.conf.template
+shutil.copy2('rethinkdb.conf.template', 'rethinkdb.conf')
+
+# Append additional lines to rethinkdb.conf
+with open('rethinkdb.conf', 'a') as f:
+    f.write('## The host:port of a node that RethinkDB will connect to\n')
+    for public_dns_name in hosts_dev:
+        f.write('join=' + public_dns_name + ':29015\n')
+
+os.chdir(old_cwd)
+
+# Note: The original code by Andreas wrote a file with lines of the form
+#       join=public_dns_name_0:29015
+#       join=public_dns_name_1:29015
+#       but it stopped about halfway through the list of public_dns_names
+#       (publist). In principle, it's only strictly necessary to
+#       have one join= line.
+#       Maybe Andreas thought that more is better, but all is too much?
+#       Below is Andreas' original code. -Troy
+# lfile = open('add2dbconf', 'w')
+# before = 'join='
+# after = ':29015'
+# lfile.write('## The host:port of a node that rethinkdb will connect to\n')
+# for entry in range(0,int(len(publist)/2)):
+#     lfile.write(before + publist[entry] + after + '\n')
--- a/deploy-cluster-aws/fab_prepare_chain.py
+++ b/deploy-cluster-aws/fab_prepare_chain.py
@@ -0,0 +1,27 @@
+# -*- coding: utf-8 -*-
+
+""" Generating genesis block
+"""
+
+from __future__ import with_statement, unicode_literals
+
+from fabric import colors as c
+from fabric.api import *
+from fabric.api import local, puts, settings, hide, abort, lcd, prefix
+from fabric.api import run, sudo, cd, get, local, lcd, env, hide
+from fabric.api import task, parallel
+from fabric.contrib import files
+from fabric.contrib.files import append, exists
+from fabric.contrib.console import confirm
+from fabric.contrib.project import rsync_project
+from fabric.operations import run, put
+from fabric.context_managers import settings
+from fabric.decorators import roles
+from fabtools import *
+
+env.user = 'ubuntu'
+env.key_filename = 'pem/bigchaindb.pem'
+
+@task
+def init_bigchaindb():
+    run('bigchaindb -y start &', pty=False)
--- a/deploy-cluster-aws/fabfile.py
+++ b/deploy-cluster-aws/fabfile.py
@@ -0,0 +1,197 @@
+# -*- coding: utf-8 -*-
+
+"""A fabfile with functionality to prepare, install, and configure
+bigchaindb, including its storage backend.
+"""
+
+from __future__ import with_statement, unicode_literals
+
+import requests
+from time import *
+import os
+from datetime import datetime, timedelta
+import json
+from pprint import pprint
+
+from fabric import colors as c
+from fabric.api import *
+from fabric.api import local, puts, settings, hide, abort, lcd, prefix
+from fabric.api import run, sudo, cd, get, local, lcd, env, hide
+from fabric.api import task, parallel
+from fabric.contrib import files
+from fabric.contrib.files import append, exists
+from fabric.contrib.console import confirm
+from fabric.contrib.project import rsync_project
+from fabric.operations import run, put
+from fabric.context_managers import settings
+from fabric.decorators import roles
+from fabtools import *
+
+from hostlist import hosts_dev
+
+env.hosts = hosts_dev
+env.roledefs = {
+    "role1": hosts_dev,
+    "role2": [hosts_dev[0]],
+    }
+env.roles = ["role1"]
+env.user = 'ubuntu'
+env.key_filename = 'pem/bigchaindb.pem'
+
+
+######################################################################
+
+# base software rollout
+@task
+@parallel
+def install_base_software():
+    # new from Troy April 5, 2016. Why? See http://tinyurl.com/lccfrsj
+    # sudo('rm -rf /var/lib/apt/lists/*')
+    # sudo('apt-get -y clean')
+    # from before:
+    sudo('apt-get -y update')
+    sudo('dpkg --configure -a')
+    sudo('apt-get -y -f install')
+    sudo('apt-get -y install build-essential wget bzip2 ca-certificates \
+                     libglib2.0-0 libxext6 libsm6 libxrender1 libssl-dev \
+                     git gcc g++ python-dev libboost-python-dev libffi-dev \
+                     software-properties-common python-software-properties \
+                     python3-pip ipython3 sysstat s3cmd')
+
+
+# RethinkDB
+@task
+@parallel
+def install_rethinkdb():
+    """Installation of RethinkDB"""
+    with settings(warn_only=True):
+        # preparing filesystem
+        sudo("mkdir -p /data")
+        # Locally mounted storage (m3.2xlarge, but also c3.xxx)
+        try:
+            sudo("umount /mnt")
+            sudo("mkfs -t ext4 /dev/xvdb")
+            sudo("mount /dev/xvdb /data")
+        except:
+            pass
+
+        # persist settings to fstab
+        sudo("rm -rf /etc/fstab")
+        sudo("echo 'LABEL=cloudimg-rootfs	/	 ext4     defaults,discard    0   0' >> /etc/fstab")
+        sudo("echo '/dev/xvdb  /data        ext4    defaults,noatime    0   0' >> /etc/fstab")
+        # activate deadline scheduler
+        with settings(sudo_user='root'):
+            sudo("echo deadline > /sys/block/xvdb/queue/scheduler")
+        # install rethinkdb
+        sudo("echo 'deb http://download.rethinkdb.com/apt trusty main' | sudo tee /etc/apt/sources.list.d/rethinkdb.list")
+        sudo("wget -qO- http://download.rethinkdb.com/apt/pubkey.gpg | sudo apt-key add -")
+        sudo("apt-get update")
+        sudo("apt-get -y install rethinkdb")
+        # change fs to user
+        sudo('chown -R rethinkdb:rethinkdb /data')
+        # copy config file to target system
+        put('conf/rethinkdb.conf',
+            '/etc/rethinkdb/instances.d/instance1.conf', mode=0600, use_sudo=True)
+        # initialize data-dir
+        sudo('rm -rf /data/*')
+        # finally restart instance
+        sudo('/etc/init.d/rethinkdb restart')
+
+
+# bigchaindb deployment
+@task
+@parallel
+def install_bigchaindb():
+    sudo('python3 -m pip install bigchaindb')
+
+
+# startup all nodes of bigchaindb in cluster
+@task
+@parallel
+def start_bigchaindb_nodes():
+    sudo('screen -d -m bigchaindb -y start &', pty=False)
+
+
+@task
+def install_newrelic():
+    with settings(warn_only=True):
+        sudo('echo deb http://apt.newrelic.com/debian/ newrelic non-free >> /etc/apt/sources.list')
+        # sudo('apt-key adv --keyserver hkp://subkeys.pgp.net --recv-keys 548C16BF')
+        sudo('apt-get update')
+        sudo('apt-get -y --force-yes install newrelic-sysmond')
+        sudo('nrsysmond-config --set license_key=c88af00c813983f8ee12e9b455aa13fde1cddaa8')
+        sudo('/etc/init.d/newrelic-sysmond restart')
+
+
+###############################
+# Security / FirewallStuff next
+###############################
+
+@task
+def harden_sshd():
+    """Security harden sshd."""
+
+    # Disable password authentication
+    sed('/etc/ssh/sshd_config',
+        '#PasswordAuthentication yes',
+        'PasswordAuthentication no',
+        use_sudo=True)
+    # Deny root login
+    sed('/etc/ssh/sshd_config',
+        'PermitRootLogin yes',
+        'PermitRootLogin no',
+        use_sudo=True)
+
+
+@task
+def disable_root_login():
+    """Disable `root` login for even more security. Access to `root` account
+    is now possible by first connecting with your dedicated maintenance
+    account and then running ``sudo su -``."""
+    sudo('passwd --lock root')
+
+
+@task
+def set_fw():
+    # snmp
+    sudo('iptables -A INPUT -p tcp --dport 161 -j ACCEPT')
+    sudo('iptables -A INPUT -p udp --dport 161 -j ACCEPT')
+    # dns
+    sudo('iptables -A OUTPUT -p udp -o eth0 --dport 53 -j ACCEPT')
+    sudo('iptables -A INPUT -p udp -i eth0 --sport 53 -j ACCEPT')
+    # rethinkdb
+    sudo('iptables -A INPUT -p tcp --dport 28015 -j ACCEPT')
+    sudo('iptables -A INPUT -p udp --dport 28015 -j ACCEPT')
+    sudo('iptables -A INPUT -p tcp --dport 29015 -j ACCEPT')
+    sudo('iptables -A INPUT -p udp --dport 29015 -j ACCEPT')
+    sudo('iptables -A INPUT -p tcp --dport 8080 -j ACCEPT')
+    sudo('iptables -A INPUT -i eth0 -p tcp --dport 8080 -j DROP')
+    sudo('iptables -I INPUT -i eth0 -s 127.0.0.1 -p tcp --dport 8080 -j ACCEPT')
+    # save rules
+    sudo('iptables-save >  /etc/sysconfig/iptables')
+
+
+#########################################################
+# some helper-functions to handle bad behavior of cluster
+#########################################################
+
+# rebuild indexes
+@task
+@parallel
+def rebuild_indexes():
+    run('rethinkdb index-rebuild -n 2')
+
+
+@task
+def stopdb():
+    sudo('service rethinkdb stop')
+
+
+@task
+def startdb():
+    sudo('service rethinkdb start')
+
+
+@task
+def restartdb():
+    sudo('/etc/init.d/rethinkdb restart')
--- a/deploy-cluster-aws/launch_ec2_nodes.py
+++ b/deploy-cluster-aws/launch_ec2_nodes.py
@@ -0,0 +1,194 @@
+# -*- coding: utf-8 -*-
+"""This script:
+0. allocates more elastic IP addresses if necessary,
+1. launches the specified number of nodes (instances) on Amazon EC2,
+2. tags them with the specified tag,
+3. waits until those instances exist and are running,
+4. for each instance, it associates an elastic IP address
+   with that instance,
+5. writes the shellscript add2known_hosts.sh
+6. (over)writes a file named hostlist.py
+   containing a list of all public DNS names.
+"""
+
+from __future__ import unicode_literals
+import sys
+import time
+import argparse
+import botocore
+import boto3
+from awscommon import (
+    get_naeips,
+)
+
+# First, ensure they're using Python 2.5-2.7
+pyver = sys.version_info
+major = pyver[0]
+minor = pyver[1]
+print('You are in an environment where "python" is Python {}.{}'.
+      format(major, minor))
+if not ((major == 2) and (minor >= 5) and (minor <= 7)):
+    print('but Fabric only works with Python 2.5-2.7')
+    sys.exit(1)
+
+# Parse the command-line arguments
+parser = argparse.ArgumentParser()
+parser.add_argument("--tag",
+                    help="tag to add to all launched instances on AWS",
+                    required=True)
+parser.add_argument("--nodes",
+                    help="number of nodes in the cluster",
+                    required=True,
+                    type=int)
+args = parser.parse_args()
+
+tag = args.tag
+num_nodes = args.nodes  # argparse already converted this to int (type=int)
+
+# Get an AWS EC2 "resource"
+# See http://boto3.readthedocs.org/en/latest/guide/resources.html
+ec2 = boto3.resource(service_name='ec2')
+
+# Create a client from the EC2 resource
+# See http://boto3.readthedocs.org/en/latest/guide/clients.html
+client = ec2.meta.client
+
+# Ensure they don't already have some instances with the specified tag
+# Get a list of all instances with the specified tag.
+# (Technically, instances_with_tag is an ec2.instancesCollection.)
+filters = [{'Name': 'tag:Name', 'Values': [tag]}]
+instances_with_tag = ec2.instances.filter(Filters=filters)
+# len() doesn't work on instances_with_tag. This does:
+num_ins = 0
+for instance in instances_with_tag:
+    num_ins += 1
+if num_ins != 0:
+    print('You already have {} instances with the tag {} on EC2.'.
+          format(num_ins, tag))
+    print('You should either pick a different tag or '
+          'terminate all those instances and '
+          'wait until they vanish from your EC2 Console.')
+    sys.exit(1)
+
+# Before launching any instances, make sure they have sufficient
+# allocated-but-unassociated EC2 elastic IP addresses
+print('Checking if you have enough allocated-but-unassociated ' +
+      'EC2 elastic IP addresses...')
+
+non_associated_eips = get_naeips(client)
+
+print('You have {} allocated elastic IPs which are '
+      'not already associated with instances'.
+      format(len(non_associated_eips)))
+
+if num_nodes > len(non_associated_eips):
+    num_eips_to_allocate = num_nodes - len(non_associated_eips)
+    print('You want to launch {} instances'.
+          format(num_nodes))
+    print('so {} more elastic IPs must be allocated'.
+          format(num_eips_to_allocate))
+    for _ in range(num_eips_to_allocate):
+        try:
+            # Allocate an elastic IP address
+            # response is a dict. See http://tinyurl.com/z2n7u9k
+            response = client.allocate_address(DryRun=False, Domain='standard')
+        except botocore.exceptions.ClientError:
+            print('Something went wrong when allocating an '
+                  'EC2 elastic IP address on EC2. '
+                  'Maybe you are already at the maximum number allowed '
+                  'by your AWS account? More details:')
+            raise
+        except:
+            print('Unexpected error:')
+            raise
+
+print('Commencing launch of {} instances on Amazon EC2...'.
+      format(num_nodes))
+
+for _ in range(num_nodes):
+    # Request the launch of one instance at a time
+    # (so list_of_instances should contain only one item)
+    list_of_instances = ec2.create_instances(
+            ImageId='ami-accff2b1',          # ubuntu-image
+            # 'ami-596b7235',                 # ubuntu w/ iops storage
+            MinCount=1,
+            MaxCount=1,
+            KeyName='bigchaindb',
+            InstanceType='m3.2xlarge',
+            # 'c3.8xlarge',
+            # 'c4.8xlarge',
+            SecurityGroupIds=['bigchaindb']
+            )
+
+    # Tag the just-launched instances (should be just one)
+    for instance in list_of_instances:
+        time.sleep(5)
+        instance.create_tags(Tags=[{'Key': 'Name', 'Value': tag}])
+
+# Get a list of all instances with the specified tag.
+# (Technically, instances_with_tag is an ec2.instancesCollection.)
+filters = [{'Name': 'tag:Name', 'Values': [tag]}]
+instances_with_tag = ec2.instances.filter(Filters=filters)
+print('The launched instances will have these ids:')
+for instance in instances_with_tag:
+    print(instance.id)
+
+print('Waiting until all those instances exist...')
+for instance in instances_with_tag:
+    instance.wait_until_exists()
+
+print('Waiting until all those instances are running...')
+for instance in instances_with_tag:
+    instance.wait_until_running()
+
+print('Associating allocated-but-unassociated elastic IPs ' +
+      'with the instances...')
+
+# Get a list of elastic IPs which are allocated but
+# not associated with any instances.
+# There should be enough because we checked earlier and
+# allocated more if necessary.
+non_associated_eips_2 = get_naeips(client)
+
+for i, instance in enumerate(instances_with_tag):
+    print('Grabbing an allocated but non-associated elastic IP...')
+    eip = non_associated_eips_2[i]
+    public_ip = eip['PublicIp']
+    print('The public IP address {}'.format(public_ip))
+
+    # Associate that Elastic IP address with an instance
+    response2 = client.associate_address(
+        DryRun=False,
+        InstanceId=instance.instance_id,
+        PublicIp=public_ip
+        )
+    print('was associated with the instance with id {}'.
+          format(instance.instance_id))
+
+# Get a list of the public DNS names of the instances_with_tag
+hosts_dev = []
+for instance in instances_with_tag:
+    public_dns_name = getattr(instance, 'public_dns_name', None)
+    if public_dns_name is not None:
+        hosts_dev.append(public_dns_name)
+
+# Write a shellscript to add remote keys to ~/.ssh/known_hosts
+print('Preparing shellscript to add remote keys to known_hosts')
+with open('add2known_hosts.sh', 'w') as f:
+    f.write('#!/bin/bash\n')
+    for public_dns_name in hosts_dev:
+        f.write('ssh-keyscan ' + public_dns_name + ' >> ~/.ssh/known_hosts\n')
+
+# Create a file named hostlist.py containing hosts_dev.
+# If a hostlist.py already exists, it will be overwritten.
+print('Writing hostlist.py')
+with open('hostlist.py', 'w') as f:
+    f.write('# -*- coding: utf-8 -*-\n')
+    f.write('from __future__ import unicode_literals\n')
+    f.write('hosts_dev = {}\n'.format(hosts_dev))
+
+# Wait
+wait_time = 45
+print('Waiting {} seconds to make sure all instances are ready...'.
+      format(wait_time))
+time.sleep(wait_time)
--- a/deploy-cluster-aws/startup.sh
+++ b/deploy-cluster-aws/startup.sh
@@ -0,0 +1,81 @@
+#! /bin/bash
+
+# The set -e option instructs bash to immediately exit if any command has a non-zero exit status
+set -e
+
+function printErr()
+    {
+        echo "usage: ./startup.sh <tag> <number_of_nodes_in_cluster>"
+        echo "No argument $1 supplied"
+    }
+
+if [ -z "$1" ]
+  then
+    printErr "<tag>"
+    exit 1
+fi
+
+if [ -z "$2" ]
+  then
+    printErr "<number_of_nodes_in_cluster>"
+    exit 1
+fi
+
+TAG=$1
+NODES=$2
+
+# Check for AWS private key file (.pem file)
+if [ ! -f "pem/bigchaindb.pem" ]
+    then
+        echo "File pem/bigchaindb.pem (AWS private key) is missing"
+        exit 1
+fi
+
+# Change the file permissions on pem/bigchaindb.pem
+# so that the owner can read it, but that's all
+chmod 0400 pem/bigchaindb.pem
+
+# The following Python script does these things:
+# 0. allocates more elastic IP addresses if necessary,
+# 1. launches the specified number of nodes (instances) on Amazon EC2,
+# 2. tags them with the specified tag,
+# 3. waits until those instances exist and are running,
+# 4. for each instance, it associates an elastic IP address
+#    with that instance,
+# 5. writes the shellscript add2known_hosts.sh
+# 6. (over)writes a file named hostlist.py
+#    containing a list of all public DNS names.
+python launch_ec2_nodes.py --tag "$TAG" --nodes "$NODES"
+
+# Make add2known_hosts.sh executable then execute it.
+# This adds remote keys to ~/.ssh/known_hosts
+chmod +x add2known_hosts.sh
+./add2known_hosts.sh
+
+# (Re)create the RethinkDB configuration file conf/rethinkdb.conf
+python create_rethinkdb_conf.py
+
+# rollout base packages (dependencies) needed before
+# storage backend (rethinkdb) and bigchaindb can be rolled out
+fab install_base_software
+
+# rollout storage backend (rethinkdb)
+fab install_rethinkdb
+
+# rollout bigchaindb
+fab install_bigchaindb
+
+# generate genesis block
+# HORST is the last public_dns_name listed in conf/rethinkdb.conf
+# For example:
+# ec2-52-58-86-145.eu-central-1.compute.amazonaws.com
+HORST=`tail -1 conf/rethinkdb.conf|cut -d: -f1|cut -d= -f2`
+fab -H $HORST -f fab_prepare_chain.py init_bigchaindb
+
+# initiate sharding
+fab start_bigchaindb_nodes
+
+# cleanup
+rm add2known_hosts.sh
+
+# DONE
--- a/docs/source/deploy-on-aws.md
+++ b/docs/source/deploy-on-aws.md
@@ -0,0 +1,153 @@
+# Deploy a Cluster on AWS
+
+This section explains a way to deploy a cluster of BigchainDB nodes on Amazon Web Services (AWS). We use some Bash and Python scripts to launch several instances (virtual servers) on Amazon Elastic Compute Cloud (EC2). Then we use Fabric to install RethinkDB and BigchainDB on all those instances.
+
+**NOTE: At the time of writing, these scripts _do_ launch a bunch of EC2 instances, and they do install RethinkDB plus BigchainDB on each instance, but don't expect to be able to use the cluster for anything useful. There are several issues related to configuration, networking, and external clients that must be sorted out first. That said, you might find it useful to try out the AWS deployment scripts, because setting up to use them, and using them, will be very similar once those issues get sorted out.**
+
+## Why?
+
+You might ask why one would want to deploy a centrally-controlled BigchainDB cluster. Isn't BigchainDB supposed to be decentralized, where each node is controlled by a different person or organization?
+
+That's true, but there are some reasons why one might want a centrally-controlled cluster: 1) for testing, and 2) for initial deployment. Afterwards, the control of each node can be handed over to a different entity.
+
+## Python Setup
+
+The instructions that follow have been tested on Ubuntu 14.04, but may also work on similar distros or operating systems.
+
+**Note: Our Python scripts for deploying to AWS use Python 2 because Fabric doesn't work with Python 3.**
+
+Optionally, create a Python 2 virtual environment and activate it. Then install the following Python packages (in that virtual environment, if you created one):
+```text
+pip install fabric fabtools requests boto3 awscli
+```
+
+What did you just install?
+
+* "[Fabric](http://www.fabfile.org/) is a Python (2.5-2.7) library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks."
+* [fabtools](https://github.com/ronnix/fabtools) are "tools for writing awesome Fabric files"
+* [requests](http://docs.python-requests.org/en/master/) is a Python package/library for sending HTTP requests
+* "[Boto](https://boto3.readthedocs.org/en/latest/) is the Amazon Web Services (AWS) SDK for Python, which allows Python developers to write software that makes use of Amazon services like S3 and EC2." (`boto3` is the name of the latest Boto package.)
+* [The aws-cli package](https://pypi.python.org/pypi/awscli), which is an AWS Command Line Interface (CLI).
+
+## AWS Setup
+
+Before you can deploy a BigchainDB cluster on AWS, you must have an AWS account. If you don't already have one, you can [sign up for one for free](https://aws.amazon.com/).
+
+### Create an AWS Access Key
+
+The next thing you'll need is an AWS access key. If you don't have one, you can create one using the [instructions in the AWS documentation](http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSGettingStartedGuide/AWSCredentials.html). You should get an access key ID (e.g. AKIAIOSFODNN7EXAMPLE) and a secret access key (e.g. wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY).
+
+You should also pick a default AWS region name (e.g. `eu-central-1`). That's where your cluster will run. The AWS documentation has [a list of them](http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region).
+
+Once you've got your AWS access key, and you've picked a default AWS region name, go to a terminal session and enter:
+```text
+aws configure
+```
+
+and answer the four questions. For example:
+```text
+AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
+AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
+Default region name [None]: eu-central-1
+Default output format [None]: [Press Enter]
+```
+
+This writes two files: 
+
+* `~/.aws/credentials`
+* `~/.aws/config`
+
+AWS tools and packages look for those files.
+
+### Get Enough Amazon Elastic IP Addresses
+
+Our AWS deployment scripts use elastic IP addresses (although that may change in the future). By default, AWS accounts get five elastic IP addresses. If you want to deploy a cluster with more than five nodes, then you will need more than five elastic IP addresses; you may have to apply for those; see [the AWS documentation on elastic IP addresses](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html).
+
+### Create an Amazon EC2 Key Pair
+
+Go to the AWS EC2 Console and select "Key Pairs" in the left sidebar. Click the "Create Key Pair" button. Give it the name `bigchaindb`. You should be prompted to save a file named `bigchaindb.pem`. That file contains the RSA private key. (Amazon keeps the corresponding public key.) Save the file as `bigchaindb/deploy-cluster-aws/pem/bigchaindb.pem`.
+
+You should not share your private key. 
+
+### Create an Amazon EC2 Security Group
+
+Go to the AWS EC2 Console and select "Security Groups" in the left sidebar. Click the "Create Security Group" button. Give it the name `bigchaindb`. The description probably doesn't matter but we also put `bigchaindb` for that.
+
+Add some rules for Inbound traffic:
+
+* Type = All TCP, Protocol = TCP, Port Range = 0-65535, Source = 0.0.0.0/0
+* Type = SSH, Protocol = TCP, Port Range = 22, Source = 0.0.0.0/0
+* Type = All UDP, Protocol = UDP, Port Range = 0-65535, Source = 0.0.0.0/0
+* Type = All ICMP, Protocol = ICMP, Port Range = 0-65535, Source = 0.0.0.0/0
+
+**Note: These rules are extremely lax! They're meant to make testing easy.** You'll want to tighten them up if you intend to have a secure cluster. For example, Source = 0.0.0.0/0 is [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing) for "allow this traffic to come from _any_ IP address."
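+
+To see what a Source value admits, one can test addresses against it with Python 3's standard-library `ipaddress` module. This is only an illustration of CIDR matching, not how AWS evaluates rules internally; the function name `is_allowed` is made up, and the addresses come from the ranges reserved for documentation.
+
+```python
+import ipaddress
+
+
+def is_allowed(source_cidr, client_ip):
+    """Return True if client_ip falls inside the CIDR block source_cidr."""
+    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(source_cidr)
+
+
+# 0.0.0.0/0 matches every IPv4 address.
+print(is_allowed("0.0.0.0/0", "203.0.113.7"))            # True
+# A /24 block matches only the 256 addresses that share its first three octets.
+print(is_allowed("198.51.100.0/24", "203.0.113.7"))      # False
+print(is_allowed("198.51.100.0/24", "198.51.100.42"))    # True
+```
+
+Tightening a rule amounts to replacing `0.0.0.0/0` with a block that covers only the addresses you expect traffic from.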
+
+
+## Deployment
+
+Here's an example of how one could launch a BigchainDB cluster of 4 nodes tagged `wrigley` on AWS:
+```text
+cd bigchaindb
+cd deploy-cluster-aws
+./startup.sh wrigley 4
+```
+
+`startup.sh` is a Bash script which calls some Python 2 and Fabric scripts. Here's what it does:
+
+0. allocates more elastic IP addresses if necessary,
+1. launches the specified number of nodes (instances) on Amazon EC2,
+2. tags them with the specified tag,
+3. waits until those instances exist and are running,
+4. associates an elastic IP address with each instance,
+5. adds remote keys to `~/.ssh/known_hosts`,
+6. (re)creates the RethinkDB configuration file `conf/rethinkdb.conf`,
+7. installs base (prerequisite) software on all instances,
+8. installs RethinkDB on all instances,
+9. installs BigchainDB on all instances,
+10. generates the genesis block,
+11. starts BigchainDB on all instances.
+
+It should take a few minutes for the deployment to finish. If you run into problems, see the section on Known Deployment Issues below.
+
+The EC2 Console has a section where you can see all the instances you have running on EC2. You can `ssh` into a running instance using a command like:
+```text
+ssh -i pem/bigchaindb.pem ubuntu@ec2-52-29-197-211.eu-central-1.compute.amazonaws.com
+```
+
+except you'd replace the `ec2-52-29-197-211.eu-central-1.compute.amazonaws.com` with the public DNS name of the instance you want to `ssh` into. You can get that from the EC2 Console: just click on an instance and look in its details pane at the bottom of the screen. Some commands you might try:
+```text
+ip addr show
+sudo service rethinkdb status
+bigchaindb --help
+bigchaindb show-config
+```
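+
+If you log in to several instances, the `ssh` invocation shown earlier can be assembled from the public DNS name. A small Python 3 sketch, assuming the key path and the `ubuntu` user from the example above (the function name `ssh_command` is made up for this example):
+
+```python
+import shlex
+
+
+def ssh_command(public_dns_name, key_path="pem/bigchaindb.pem", user="ubuntu"):
+    """Build the ssh argument list for one instance (hypothetical helper)."""
+    return ["ssh", "-i", key_path, "{}@{}".format(user, public_dns_name)]
+
+
+cmd = ssh_command("ec2-52-29-197-211.eu-central-1.compute.amazonaws.com")
+# Print the command in a form that can be pasted into a shell.
+print(" ".join(shlex.quote(part) for part in cmd))
+```
+
+The list form can be passed directly to `subprocess.call`, which avoids shell quoting problems.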
+
+There are fees associated with running instances on EC2, so if you're not using them, you should terminate them. You can do that from the AWS EC2 Console.
+
+The same is true of your allocated elastic IP addresses. There's a small fee to keep them allocated if they're not associated with a running instance. You can release them from the AWS EC2 Console.
+
+## Known Deployment Issues
+
+### NetworkError
+
+If you have deployed many times in quick succession, you might run into an error message like this:
+```text
+NetworkError: Host key for ec2-xx-xx-xx-xx.eu-central-1.compute.amazonaws.com 
+did not match pre-existing key! Server's key was changed recently, or possible 
+man-in-the-middle attack.
+```
+
+If so, just clean up your `known_hosts` file and start again. For example, you might move (rename) your current `known_hosts` file to `old_known_hosts` like so:
+```text
+mv ~/.ssh/known_hosts ~/.ssh/old_known_hosts
+```
+
+Then terminate your instances and try deploying again with a different tag.
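+
+Setting the whole file aside also discards the keys of unrelated hosts. A narrower alternative is to drop only the stale entry. The standard tool for that is `ssh-keygen -R <hostname>`, which also handles hashed entries. The Python 3 sketch below shows the same idea for plain (unhashed) entries only; the function name `drop_host` and the sample key text are made up for this example.
+
+```python
+def drop_host(known_hosts_text, hostname):
+    """Return known_hosts_text without the entries that list hostname.
+
+    Only handles unhashed entries, where the first field is a
+    comma-separated list of host names and addresses.
+    """
+    kept = []
+    for line in known_hosts_text.splitlines():
+        fields = line.split()
+        # Keep blank lines and comments untouched.
+        if not fields or fields[0].startswith("#"):
+            kept.append(line)
+            continue
+        if hostname in fields[0].split(","):
+            continue  # stale entry: skip it
+        kept.append(line)
+    return "\n".join(kept) + "\n"
+
+
+sample = (
+    "ec2-52-29-197-211.eu-central-1.compute.amazonaws.com,52.29.197.211 "
+    "ssh-rsa AAAA-PLACEHOLDER\n"
+    "github.com ssh-rsa AAAA-PLACEHOLDER\n"
+)
+cleaned = drop_host(sample, "ec2-52-29-197-211.eu-central-1.compute.amazonaws.com")
+print(cleaned)
+```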
+
+### Failure of sudo apt-get update
+
+The first thing that's done on all the instances, once they're running, is basically [`sudo apt-get update`](http://askubuntu.com/questions/222348/what-does-sudo-apt-get-update-do). Sometimes that fails. If so, just terminate your instances and try deploying again with a different tag. (These problems seem to be time-bounded, so maybe wait a couple of hours before retrying.)
+
+### Failure when Installing Base Software
+
+If you get an error while installing the base software on the instances, then just terminate your instances and try deploying again with a different tag.
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@ -20,6 +20,7 @@ Table of Contents
   http-client-server-api
   python-driver-api-examples
   local-rethinkdb-cluster
+   deploy-on-aws
   cryptography
   models
   json-serialization
--- a/docs/source/installing-server.md
+++ b/docs/source/installing-server.md
@ -44,7 +44,7 @@ $ sudo dnf install libffi-devel gcc-c++ redhat-rpm-config python3-devel openssl-

 With OS-level dependencies installed, you can install BigchainDB Server with `pip` or from source.

-### How to Install BigchainDB with `pip`
+### How to Install BigchainDB with pip

 BigchainDB (i.e. both the Server and the officially-supported drivers) is distributed as a Python package on PyPI so you can install it using `pip`. First, make sure you have a version of `pip` installed for Python 3.4+:
 ```text
--- a/docs/source/python-server-api-examples.md
+++ b/docs/source/python-server-api-examples.md
@ -40,8 +40,10 @@ At a high level, a "digital asset" is something which can be represented digital
 In BigchainDB, only the federation nodes are allowed to create digital assets, by doing a special kind of transaction: a `CREATE` transaction.

 ```python
+from bigchaindb import crypto
+
 # create a test user
-testuser1_priv, testuser1_pub = b.generate_keys()
+testuser1_priv, testuser1_pub = crypto.generate_key_pair()

 # define a digital asset data payload
 digital_asset_payload = {'msg': 'Hello BigchainDB!'}
--- a/docs/source/running-unit-tests.md
+++ b/docs/source/running-unit-tests.md
@ -26,7 +26,7 @@ You can also run all unit tests via `setup.py`, using:
 $  python setup.py test
 ```

-### Using `docker-compose` to Run the Tests
+### Using docker-compose to Run the Tests

 You can also use `docker-compose` to run the unit tests. (You don't have to start RethinkDB first: `docker-compose` does that on its own, when it reads the `docker-compose.yml` file.)

--- a/setup.py
+++ b/setup.py
@ -71,7 +71,7 @@ setup(
        'rethinkdb==2.2.0.post4',
        'pysha3==0.3',
        'pytz==2015.7',
-        'cryptography==1.2.1',
+        'cryptography==1.2.3',
        'statsd==3.2.1',
        'python-rapidjson==0.0.6',
        'logstats==0.2.1',