Merge pull request #572 from bigchaindb/feat/540/provision-prod-node-aws

Using Terraform to provision a one-machine BigchainDB node on AWS
This commit is contained in:
Troy McConaghy 2016-08-18 14:24:05 +02:00 committed by GitHub
commit 9045126c57
13 changed files with 272 additions and 4 deletions

View File

@ -31,6 +31,7 @@ coverage:
- "bigchaindb/version.py"
- "benchmarking-tests/*"
- "speed-tests/*"
- "ntools/*"
comment:
# @stevepeak (from codecov.io) suggested we change 'suggestions' to 'uncovered'

View File

@ -14,4 +14,3 @@ In a production environment, a BigchainDB node can have several other components
* A RethinkDB proxy server
* Scalable storage for RethinkDB (e.g. using RAID)
* Monitoring software, to monitor all the machines in the node
* Maybe a configuration management (CM) server and CM agents on all machines

View File

@ -5,4 +5,6 @@ Production Node Setup & Management
:maxdepth: 1
overview
install-terraform
prov-one-m-aws

View File

@ -0,0 +1,27 @@
# Install Terraform
The [Terraform documentation has installation instructions](https://www.terraform.io/intro/getting-started/install.html) for all common operating systems.
Note: HashiCorp (the company behind Terraform) will try to convince you that running Terraform on their servers (inside Atlas) would be great. **While that might be true for many, it is not true for BigchainDB.** BigchainDB federations are supposed to be decentralized, and if everyone used Atlas, that would be a point of centralization. If you don't want to run Terraform on your local machine, you could install it on a cloud machine under your control (e.g. on AWS).
## Ubuntu Installation Tips
If you want to install Terraform on Ubuntu, first [download the .zip file](https://www.terraform.io/downloads.html). Then install it in `/opt`:
```text
sudo mkdir -p /opt/terraform
sudo unzip path/to/zip-file.zip -d /opt/terraform
```
Why install it in `/opt`? See [the answers at Ask Ubuntu](https://askubuntu.com/questions/1148/what-is-the-best-place-to-install-user-apps).
Next, add `/opt/terraform` to your PATH. If you use bash as your shell, then you could add this line to `~/.bashrc`:
```text
export PATH="/opt/terraform:$PATH"
```
After doing that, relaunch your shell or force it to read `~/.bashrc` again, e.g. by doing `source ~/.bashrc`. You can verify that Terraform is installed and on your PATH by running:
```text
terraform --version
```
It should say the current version of Terraform.
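For example, it might print something like the following (the exact version number will depend on which release you downloaded):
```text
Terraform v0.7.0
```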

View File

@ -7,6 +7,7 @@ Deploying and managing a production BigchainDB node is much more involved than w
* Production nodes need monitoring
* Production nodes need maintenance, e.g. software upgrades, scaling
Thankfully, there are tools to help! We use:
* [Terraform](https://www.terraform.io/) to provision infrastructure such as AWS instances, storage and security groups
* [Ansible](https://www.ansible.com/) to manage the software installed on that infrastructure (configuration management)
This section explains how to use those tools to deploy and manage a production node.

View File

@ -0,0 +1,50 @@
# Provision a One-Machine Node on AWS
This page describes how to provision the resources needed for a one-machine BigchainDB node on AWS using Terraform.
## Get Set
First, do the [basic AWS setup steps outlined in the Appendices](../appendices/aws-setup.html).
Then go to the `.../bigchaindb/ntools/one-m/aws/` directory and open the file `variables.tf`. Most of the variables have sensible default values, but you can change them if you like. In particular, you may want to change `aws_region`. (Terraform looks in `~/.aws/credentials` to get your AWS credentials, so you don't have to enter those anywhere.)
The `ssh_key_name` has no default value, so Terraform will prompt you every time it needs it.
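If you'd rather not edit `variables.tf`, Terraform also accepts variable values on the command line via the `-var` option when you run the commands below (the region shown here is only an illustration):
```text
terraform plan -var 'aws_region=us-east-1'
```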
To see what Terraform will do, run:
```text
terraform plan
```
It should ask you for the value of `ssh_key_name`.
Terraform figures out the plan by reading all the `.tf` files in the directory.
If you don't want to be asked for the `ssh_key_name`, you can change its default value in `variables.tf` or [you can set an environment variable](https://www.terraform.io/docs/configuration/variables.html) named `TF_VAR_ssh_key_name`.
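For example, if you use bash, you could set that environment variable before running Terraform (the key pair name shown is just a placeholder):
```text
export TF_VAR_ssh_key_name="my-aws-keypair"
```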
## Provision
To provision all the resources specified in the plan, do the following. **Note: This will provision actual resources on AWS, and those cost money. Be sure to shut down any resources you don't want to keep running; otherwise the cost will keep growing.**
```text
terraform apply
```
Terraform will report its progress as it provisions all the resources. Once it's done, you can go to the Amazon EC2 web console and see the instance, its security group, its elastic IP, and its attached storage volumes (one for the root directory and one for RethinkDB storage).
At this point, there is no software installed on the instance except for Ubuntu 14.04 and whatever else came with the Amazon Machine Image (AMI) specified in the configuration. The next step is to use Ansible to install and configure all the necessary software.
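One piece of information you will likely need for that step is the node's public (elastic) IP address. The `outputs.tf` file added in this pull request defines an `ip_address` output, so after `terraform apply` finishes you can retrieve it with:
```text
terraform output ip_address
```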
## (Optional) "Destroy"
If you want to shut down all the resources just provisioned, you must first disable termination protection on the instance:
1. Go to the EC2 console and select the instance you just launched. It should be named `BigchainDB_node`.
2. Click **Actions** > **Instance Settings** > **Change Termination Protection** > **Yes, Disable**
3. Back in your terminal, do `terraform destroy`
Terraform should "destroy" (i.e. terminate or delete) all the AWS resources you provisioned above.
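If you prefer the command line to the EC2 console, you could disable termination protection with the AWS CLI instead (a sketch, assuming the AWS CLI is installed and configured; the instance ID is a placeholder):
```text
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --no-disable-api-termination
terraform destroy
```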
## See Also
* The [Terraform Documentation](https://www.terraform.io/docs/)
* The [Terraform Documentation for the AWS "Provider"](https://www.terraform.io/docs/providers/aws/index.html)

ntools/README.md Normal file
View File

@ -0,0 +1 @@
This directory contains tools for provisioning, deploying and managing a BigchainDB node (on AWS, Azure or wherever).

ntools/one-m/aws/amis.tf Normal file
View File

@ -0,0 +1,20 @@
# Each AWS region has a different AMI name
# even though the contents are the same.
# This file has the mapping from region --> AMI name.
#
# These are all Ubuntu 14.04 LTS AMIs
# with Arch = amd64, Instance Type = hvm:ebs-ssd
# from https://cloud-images.ubuntu.com/locator/ec2/
variable "amis" {
type = "map"
default = {
eu-west-1 = "ami-55452e26"
eu-central-1 = "ami-b1cf39de"
us-east-1 = "ami-8e0b9499"
us-west-2 = "ami-547b3834"
ap-northeast-1 = "ami-49d31328"
ap-southeast-1 = "ami-5e429c3d"
ap-southeast-2 = "ami-25f3c746"
sa-east-1 = "ami-97980efb"
}
}

View File

@ -0,0 +1,6 @@
# You can get the value of "ip_address" after running terraform apply using:
# $ terraform output ip_address
# You could use that in a script, for example
output "ip_address" {
value = "${aws_eip.ip.public_ip}"
}

View File

@ -0,0 +1,6 @@
provider "aws" {
# An AWS access_key and secret_key are needed; Terraform looks
# for an AWS credentials file in the default location.
# See https://tinyurl.com/pu8gd9h
region = "${var.aws_region}"
}

View File

@ -0,0 +1,47 @@
# One instance (virtual machine) on AWS:
# https://www.terraform.io/docs/providers/aws/r/instance.html
resource "aws_instance" "instance" {
ami = "${lookup(var.amis, var.aws_region)}"
instance_type = "${var.aws_instance_type}"
tags {
Name = "BigchainDB_node"
}
ebs_optimized = true
key_name = "${var.ssh_key_name}"
vpc_security_group_ids = ["${aws_security_group.node_sg1.id}"]
root_block_device = {
volume_type = "gp2"
volume_size = "${var.root_storage_in_GiB}"
delete_on_termination = true
}
# Enable EC2 Instance Termination Protection
disable_api_termination = true
}
# This EBS volume will be used for database storage (not for root).
# https://www.terraform.io/docs/providers/aws/r/ebs_volume.html
resource "aws_ebs_volume" "db_storage" {
type = "gp2"
availability_zone = "${aws_instance.instance.availability_zone}"
# Size in GiB (not GB!)
size = "${var.DB_storage_in_GiB}"
tags {
Name = "BigchainDB_db_storage"
}
}
# This allocates a new elastic IP address, if necessary
# and then associates it with the above aws_instance
resource "aws_eip" "ip" {
instance = "${aws_instance.instance.id}"
vpc = true
}
# This attaches the EBS volume (for RethinkDB storage) to the instance
# https://www.terraform.io/docs/providers/aws/r/volume_attachment.html
resource "aws_volume_attachment" "ebs_att" {
# Why /dev/sdp? See https://tinyurl.com/z2zqm6n
device_name = "/dev/sdp"
volume_id = "${aws_ebs_volume.db_storage.id}"
instance_id = "${aws_instance.instance.id}"
}

View File

@ -0,0 +1,89 @@
resource "aws_security_group" "node_sg1" {
name_prefix = "BigchainDB_"
description = "Single-machine BigchainDB node security group"
tags = {
Name = "BigchainDB_one-m"
}
# Allow *all* outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
# SSH
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# DNS
ingress {
from_port = 53
to_port = 53
protocol = "udp"
cidr_blocks = ["0.0.0.0/0"]
}
# HTTP used by some package managers
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# No ingress rule is needed for NTP (port 123): NTP requests originate
# from inside the instance, so the responses are expected and let back in
# SNMP (e.g. for server monitoring)
ingress {
from_port = 161
to_port = 161
protocol = "udp"
cidr_blocks = ["0.0.0.0/0"]
}
# HTTPS used when installing RethinkDB
# and by some package managers
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# StatsD
ingress {
from_port = 8125
to_port = 8125
protocol = "udp"
cidr_blocks = ["0.0.0.0/0"]
}
# Don't allow port 8080 for the RethinkDB web interface.
# Use a SOCKS proxy or reverse proxy instead.
# BigchainDB Client-Server REST API
ingress {
from_port = 9984
to_port = 9984
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# Port 28015 doesn't have to be open to the outside
# since the RethinkDB client and server are on localhost
# RethinkDB intracluster communications use port 29015
ingress {
from_port = 29015
to_port = 29015
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}

View File

@ -0,0 +1,19 @@
variable "aws_region" {
default = "eu-central-1"
}
variable "aws_instance_type" {
default = "m4.xlarge"
}
variable "root_storage_in_GiB" {
default = 10
}
variable "DB_storage_in_GiB" {
default = 30
}
variable "ssh_key_name" {
# No default. Ask as needed.
}