
We use Ansible to build and configure our servers.

Our Ansible configuration is stored on Bitbucket (the repository URL appears below).

Once you check out the repository, you'll find a more detailed description of how our Ansible code is organized in its README:

https://bitbucket.org/ChemicalHeritageFoundation/ansible-inventory/src/master/README.md

Overview of the codebase

The information in the Bitbucket repository is organized into four main file types: the hosts file, a set of playbooks, roles called by each playbook, and encrypted variables files.

Hosts file (hosts)

This is a catalog of our Ansible-managed servers. Each server is associated with information about its purpose (tags) and instructions for how to contact it (IP addresses and hostnames). The file is organized by category; each category contains zero or more server IPs. The categories are:

  • "Tier" categories:
    • prod: production servers.
    • stage: staging servers.
    • dev: development servers. (Currently empty; our dev servers are not currently managed by Ansible.)
  • "Role" categories:
    • fedora: a server on which Ansible knows to install Fedora (the Fedora Commons repository software).
    • solr: a server on which Ansible knows to install Solr.
    • app: a server on which Ansible knows to install the digital collections application (chf-sufia).
    • jobs: a server whose purpose is to run background jobs.
    • monitor: a server on which Ansible knows to install Netdata, our monitoring software (see https://github.com/firehol/netdata).
    • aspace: an ArchivesSpace server (see http://archivesspace.org/).
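
Put together, the hosts file follows Ansible's INI inventory format. A hypothetical sketch of the layout (every address below is invented; the real file lives in the repository):

```ini
# Tier groups
[prod]
10.0.1.10
10.0.1.11

[stage]
10.0.2.10

[dev]
# currently empty; dev servers are not managed by Ansible

# Role groups: a server appears under each role it fills
[solr]
10.0.1.10

[app]
10.0.1.11
10.0.2.10
```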

Playbooks

These are the main tasks Ansible knows how to perform:

  • build_ami.yml: Create an Amazon Machine Image (AMI), which serves as a template for the actual EC2 cloud servers.
  • create_ec2.yml: Create a new server based on an AMI.
  • resize.yml: Stop a server, resize its disks, and then restart it.
  • sync.yml: Copy data over from production to staging.
  • update-box.yml: Run updates on a server.
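
For orientation, a playbook is a YAML file that names a target group from the hosts file and lists the roles to apply to it. A minimal, hypothetical sketch (not the contents of any of the playbooks above):

```yaml
---
# Hypothetical playbook: apply roles to the "app" group
- hosts: app
  become: true       # run tasks with sudo
  roles:
    - apache
    - sufia
```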

Roles


These are the subtasks of the playbooks (for instance, the recipe for installing Apache on a given server). They are listed in roles/*/tasks/main.yml files. The roles can also refer to:

  • Subsidiary tasks, known as handlers: roles/*/handlers/main.yml
  • Templates, typically for configuration files. These live in roles/*/templates/* and are often Jinja templates (see http://jinja.pocoo.org/). For instance, the roles/apache/tasks/main.yml role has a choice of three Apache configuration files in roles/apache/templates/, which it uses depending on which server it's configuring.
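
To show how a role's tasks, handlers, and templates fit together, here is an invented sketch in the spirit of the apache role (the task names and file names are illustrative, not our actual code):

```yaml
# roles/apache/tasks/main.yml (illustrative)
- name: Install Apache
  yum:
    name: httpd
    state: present

- name: Render the Apache config from a Jinja template
  template:
    src: httpd.conf.j2               # one of the files in roles/apache/templates/
    dest: /etc/httpd/conf/httpd.conf
  notify: restart apache             # calls the handler below

# roles/apache/handlers/main.yml (illustrative)
- name: restart apache
  service:
    name: httpd
    state: restarted
```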

Variables files

These files, in group_vars/*, hold the variable values the roles use when building and modifying the servers; there is one for each category listed in the hosts file. In certain cases, when a server belongs to two categories (e.g. it's both prod and solr), a special override file is used: e.g. group_vars/solr is overridden by group_vars/solr_prod_override.

The variables in these files describe specific features of each server, such as:

  • "Switches" that describe in more detail what software needs to be installed on the server. For instance, ansible-inventory/group_vars/jobs sets switch "fedora" to false but "sufia" to true; these switches are used by the roles to run or skip various parts of the installation.
  • A listing of each volume that needs to be created and mounted on the server, along with details such as size, mount point, and filesystem type.
  • Details associated with specific software: versions, usernames, passwords, port numbers, and so on.
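
As a sketch of what such a file might contain (the "fedora" and "sufia" switches are the ones mentioned above; every other name and value here is invented):

```yaml
# Illustrative group_vars file for the jobs group
# Switches: roles consult these to run or skip parts of the installation
fedora: false
sufia: true

# Volumes to create and mount on the server
volumes:
  - device: /dev/xvdf
    mount_point: /opt/data
    size_gb: 100
    fstype: ext4

# Software-specific details
solr_port: 8983
db_user: sufia
```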

The variables files are the only encrypted files in the repository.

The password for ansible-vault is in

Shared/P/Othmer Library/Digital Collections - Internal Access/ .

To view a file:

ansible-vault view --vault-password-file FILE_CONTAINING_PASSWORD ENCRYPTED_FILE

To search the files for a particular string:

ls -1 ansible-inventory/group_vars | while read N ; do echo "$N:" ; \
ansible-vault view --vault-password-file FILE_CONTAINING_PASSWORD \
ansible-inventory/group_vars/"$N" | grep -i STRING_TO_LOOK_FOR ; done

See Ansible-Hydra Submodule for details of the submodule we use.

See Editing Ansible for notes on current practices for editing.

Building a new machine on AWS with Ansible

Note: the ansible-vault password and all current AWS keys are in the shared network drive under Othmer Library\Digital Collections - Internal Access\Authentication - Confidential.
If you do not have access, speak with Michelle about getting added to the allowed group.

  1. Check ansible variables in the encrypted file
    1. $ ansible-vault edit group_vars/all (will need password)
    2. Look for # Use these temporarily for new instances
      1. Right now certain values such as fedora_ip, solr_ip, and the rest will need to be determined once the box has been built and a valid IP exists.
      2. Generally speaking, the best way to build boxes so as to minimize going back and editing IPs is in this order:
        1. Fedora
        2. Solr
        3. Sufia machines (riiif, app)
    3. Ensure your SSH key is listed under keys_to_add; this is needed for Capistrano deploys and SSH access with your personal account.
  2. Run the Ansible playbook
    1. $ ansible-playbook create_ec2.yml --ask-vault-pass --private-key=/PATH/TO/KEY --extra-vars "role=ROLE tier=SERVICE_LEVEL" --extra-vars "@group_vars/ROLE_SERVICE_LEVEL_override"
      1. Use chf_prod.pem for all production level machines
      2. Use test.pem for all other machines
      3. Select the role and service level of the machine you want to build.
    2. OR, if you're re-running scripts on an existing machine: 
      1. $ ansible-playbook -i hosts my_playbook.yml --ask-vault-pass [-e hosts=target]
        1. target can be one of the groups in the hosts file: staging, production, dev, ec2hosts
  3. Assign an Elastic IP to the new box if it needs one
  4. Consider naming the aws volumes for 'root' and 'data' – this isn't done in the scripts (but probably could be!)
  5. Set up to use Capistrano (below) or just deploy with Capistrano (above)
  6. Run configure_prod.yml if on production to set up e-mail password resets, ssl, and backup procedures.

Updating boxes with Ansible

New AWS Key

  1. Generate a new SSH key on AWS (EC2 > Key Pairs)
    1. Place it in ~/.ssh
    2. chmod it to 0600.
    3. A useful command if you're having problems with the key: $ openssl rsa -in chf_prod.pem -check

Git repositories for ansible - structure and use (this section is out of date)

The code we use to administer Sufia via Ansible lives at https://github.com/curationexperts/ansible-hydra

A wrapper with local configuration lives at https://bitbucket.org/ChemicalHeritageFoundation/ansible-inventory. The wrapper contains:

  • our hosts file
  • our group_vars files
  • ansible-hydra as a git submodule
  • an ansible.cfg which points to ansible-hydra for roles_path.
  • A number of roles and plays for CHF specific customization
  • Aside: pull requests can be submitted via branches; there's really no need to fork this repo since we'll all be owners.

To use

  • $ git clone git@bitbucket.org:ChemicalHeritageFoundation/ansible-inventory.git
  • $ cd ansible-inventory
  • $ git submodule update --init

Subsequently, when you pull ansible-inventory and the submodule has been updated, just run

  • $ git submodule update


Playbook Notes

configure_prod: Sets up backups (via the s3 and postgres roles) for production servers. If the SSL certs are installed (and they should be), it will also set the machine to send secured password-reset e-mails. It also adds the secrets data for Capistrano. All of this is handled by roles which, as of 2/26/16, are fairly atomic but could be trimmed down further.

