Deploying a multi-node application to AWS using chef-provisioning

This post is for people who are getting started with chef-provisioning and want to use it to deploy to AWS. It will take you through creating a couple of machines and deploying a simple application to them. In a future post, I’ll extend this to cover setting up some networking and infrastructure (VPC, subnets, security groups), but this post will assume you are using the default VPC created by AWS.

If you’re just looking to try chef-provisioning and you use Vagrant, you may want to start with my other post: Deploying a multi-node application to Vagrant using chef-provisioning.

For an overview of chef-provisioning (formerly known as chef-metal), take a look at this Chef-Provisioning: Infrastructure as Code blog post. Also, see the Chef provisioning docs for more details.

Getting set up with chef-provisioning

Chef-provisioning is included in the latest ChefDK (0.3.6 at time of writing). Make sure you have this version or later installed by typing:

chef --version

If not, you can download or upgrade it here.

Create a new Chef repository to explore chef-provisioning:

cd ~
chef generate repo chefprov

We are going to use chef-client in local mode to run our provisioning recipes, so we want to set up a .chef directory that will be used specifically for this repo.

cd ~/chefprov
mkdir .chef
cd .chef

In the .chef directory, create a knife.rb file containing the following:

log_level                :info
current_dir = File.dirname(__FILE__)
node_name                "provisioner"
client_key               "#{current_dir}/dummy.pem"
validation_client_name   "validator"

Our workstation is going to behave like a chef-client talking to the local-mode server, so it needs a node name and a key. The key can be any well-formed key, as the local-mode server will not validate it. For example:

ssh-keygen -f dummy.pem
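
If ssh-keygen is not to hand, the Ruby standard library can produce a throwaway key as well. The following is only a sketch: the helper name write_dummy_key is made up for this illustration, and it assumes an RSA key in PEM format is acceptable as the "well-formed key".

```ruby
require 'openssl'

# Hypothetical helper (not part of Chef): writes a throwaway RSA private key
# in PEM format. The local-mode server does not validate the key, so it only
# needs to be well formed.
def write_dummy_key(path, bits = 2048)
  key = OpenSSL::PKey::RSA.new(bits)
  File.write(path, key.to_pem)
  File.chmod(0o600, path) # keep the key private to the current user
  path
end
```

For example, write_dummy_key(File.expand_path('~/chefprov/.chef/dummy.pem')) would create the file that knife.rb points at.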

Check the setup is working by performing an empty chef-client run:

chef-client -z

This will perform a local mode chef-client run with no recipes, using the built-in chef-zero server running on port 8889. You should see output similar to:

Starting Chef Client, version 11.18.0
[2015-01-31T16:16:43-06:00] INFO: *** Chef 11.18.0 ***
[2015-01-31T16:16:43-06:00] INFO: Chef-client pid: 14113
[2015-01-31T16:16:44-06:00] INFO: Run List is []
[2015-01-31T16:16:44-06:00] INFO: Run List expands to []
[2015-01-31T16:16:44-06:00] INFO: Starting Chef Run for provisioner
[2015-01-31T16:16:44-06:00] INFO: Running start handlers
[2015-01-31T16:16:44-06:00] INFO: Start handlers complete.
[2015-01-31T16:16:44-06:00] INFO: HTTP Request Returned 404 Not Found : Object not found: /reports/nodes/provisioner/runs
[2015-01-31T16:16:44-06:00] WARN: Node provisioner has an empty run list.
Converging 0 resources
[2015-01-31T16:16:44-06:00] INFO: Chef Run complete in 0.032696323 seconds
Running handlers:
[2015-01-31T16:16:44-06:00] INFO: Running report handlers
Running handlers complete
[2015-01-31T16:16:44-06:00] INFO: Report handlers complete
Chef Client finished, 0/0 resources updated in 1.117898047 seconds

If you’re curious, take a look at the ‘nodes/provisioner.json’ file. This is where the local-mode server stores its node data. You can also run commands like:

knife node show provisioner -z

This command will query the local-mode server and show summary details that it has about your provisioner node (i.e. your workstation).
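
Because the local-mode server keeps its node data as plain JSON files, you can also inspect them with a few lines of Ruby. This is a rough sketch under an assumption about the file layout: a top-level "name" and "run_list", with Ohai data such as "ipaddress" under "automatic". The helper name summarize_node is made up for this example.

```ruby
require 'json'

# Hypothetical helper: pull a few fields out of a node file written by the
# local-mode server. Assumes "name" and "run_list" at the top level and
# Ohai data (such as "ipaddress") under "automatic".
def summarize_node(json_text)
  node = JSON.parse(json_text)
  {
    name: node['name'],
    run_list: node.fetch('run_list', []),
    ipaddress: node.dig('automatic', 'ipaddress')
  }
end
```

From the chefprov directory, something like puts summarize_node(File.read('nodes/provisioner.json')) would print the summary. Fields that are missing from the file simply come back as nil.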

Preparing the AWS client

To use chef-provisioning, you need to have the AWS CLI client installed. Follow the AWS CLI setup instructions to download and install the client, and to obtain your access keys.

If you are using an existing AWS account, please take precautions to make sure that you are working in a ‘sandbox’ that limits the damage when you get a script wrong. For example, you might configure your AWS client to default to a region where you do not have existing resources. To do this, edit the ~/.aws/config file and make sure your selected region is in the default stanza. The following example sets us-west-2 (Oregon) as the default region:

[default]
region = us-west-2

Also check that the right access keys are configured as the default. I prefer to separate these out into the ~/.aws/credentials file, rather than put them in the config file:

[default]
aws_access_key_id = AMADEUPACCESSKEY
aws_secret_access_key = AMadeUPSecreTACcesSKEYXXYyyyZzzZ1234
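
If you want to double-check which region a script will pick up before running anything, the config file is simple enough to read directly. The sketch below is a minimal parser for the stanza-and-key layout used by these two files; it is not the AWS client's own parser, and the function names are invented for this example.

```ruby
# Parse the simple INI layout used by ~/.aws/config and ~/.aws/credentials
# into a hash of { stanza => { key => value } }.
def parse_aws_ini(text)
  sections = {}
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#', ';')

    if (m = line.match(/\A\[(.+)\]\z/))
      current = m[1].strip
      sections[current] ||= {}
    elsif current && (m = line.match(/\A([^=]+)=(.*)\z/))
      sections[current][m[1].strip] = m[2].strip
    end
  end
  sections
end

# Region configured in the default stanza, or nil if none is set.
def default_region(config_text)
  parse_aws_ini(config_text).dig('default', 'region')
end
```

Calling default_region(File.read(File.expand_path('~/.aws/config'))) should return the sandbox region you chose; a nil result means no default region is set in that file.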

To make sure the AWS client is working, run the following command:

aws ec2 describe-availability-zones

It should give you a list of availability zones in the region you are using.

If, like me, you are a little more paranoid, you may want to create an IAM user with limited access to resources. I won’t cover this in detail, but below is an example policy that may be useful as a basis to restrict access. Feel free to skip over this to the next section!

  "Version": "2012-10-17",
  "Statement": [
            "Sid": "AllowDescribeAndBasicSetup",
            "Effect": "Allow",
            "Action": ["ec2:Describe*", 
                "ec2:ModifyInstanceAttribute" ],
            "Resource": "*"
            "Sid": "AllowInstanceResourceActions",
            "Effect": "Allow",
            "Action": ["ec2:RunInstances"],
            "Resource": [
            "Sid": "AllowOtherInstanceActions",
            "Effect": "Allow",
            "Action": [
            "Resource": "arn:aws:ec2:us-west-2:632055226646:instance/*"
            "Sid": "AllowToSeeWhatCantDo",
            "Effect": "Allow",
            "Action": [
            "Resource": "*"

The ‘AllowDescribeAndBasicSetup’ statement allows the user to perform most query operations on any region, import key pairs (see later section on SSH access), and create tags (which is something the chef-provisioning resources like ‘machine’ do by default).

The ‘AllowInstanceResourceActions’ statement only allows the user to create instances and associate them with resources in the us-west-2 region. The ‘AllowOtherInstanceActions’ statement allows the user to manage the instances after creation.

The ‘AllowToSeeWhatCantDo’ statement is optional but can be useful. If the access policy is too restrictive, you will get a ‘You are not authorized to perform this operation’ message. Sometimes this will include an encoded message which gives you information about what you were not authorized to do. With the above authorization, you can run:

aws sts decode-authorization-message --encoded-message xxxxxxxxxxxxxxxx

Where “xxxxxxxxxxxxxxxx” is the encoded message.

Preparing SSH access into AWS

In order to run chef-client on the instances that you are going to create in AWS, you need to enable SSH access to those instances. There are two main things you need to do:

  • Set up a key-pair
  • Enable SSH access from your IP address

Set up a key-pair

Use the EC2 console to create a keypair in the region you are using. Download the private key (‘test2_aws.pem’) and save it in ~/.ssh. Make sure its permissions are read-only:

chmod 400 ~/.ssh/test2_aws.pem
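
If you script any of this setup, a quick check of the key file's permissions can catch a forgotten chmod. A minimal sketch, with a made-up helper name, assuming a Unix-like file system:

```ruby
# True when the file's permission bits are exactly 0400 (read-only for the
# owner, no access for anyone else), which is what `chmod 400` sets.
def owner_read_only?(path)
  (File.stat(path).mode & 0o777) == 0o400
end
```

For example, owner_read_only?(File.expand_path('~/.ssh/test2_aws.pem')) should return true after the chmod above.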

You will also need the public key. You can retrieve this from the private key by running:

ssh-keygen -y >

and giving it the name of the file.
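
To show what that command derives, here is a rough Ruby equivalent. It is only a sketch, assuming an RSA private key in PEM form; the helper names are invented for this example, and ssh-keygen remains the simpler route. The public key line is the text "ssh-rsa" followed by a base64 blob containing the key type, the public exponent and the modulus in SSH wire format.

```ruby
require 'openssl'

# SSH wire format for a byte string: 4-byte big-endian length, then the bytes.
def ssh_string(bytes)
  [bytes.bytesize].pack('N') + bytes
end

# SSH "mpint" for a positive OpenSSL::BN: big-endian bytes, with a leading
# zero byte when the high bit is set so the number is not read as negative.
def ssh_mpint(bn)
  bytes = bn.to_s(2).b
  bytes = "\x00".b + bytes if (bytes.getbyte(0) & 0x80) != 0
  ssh_string(bytes)
end

# Hypothetical helper: build the "ssh-rsa AAAA..." public key line from an
# RSA private key given as PEM text.
def openssh_public_key(private_pem)
  key = OpenSSL::PKey::RSA.new(private_pem)
  blob = ssh_string('ssh-rsa'.b) + ssh_mpint(key.e) + ssh_mpint(key.n)
  "ssh-rsa #{[blob].pack('m0')}"
end
```

For example, puts openssh_public_key(File.read(File.expand_path('~/.ssh/test2_aws.pem'))) prints a single line that can be saved as the public key file.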

If you are using an IAM user without a console logon, generate a keypair using ssh-keygen then import it using the AWS CLI:

aws ec2 import-key-pair --key-name test2_aws --public-key-material file://

The ‘file://’ method of loading the file ensures that the key is base64 encoded, which is required to upload a key via the CLI.

Enable SSH access from your IP address

By default, AWS does not enable SSH from external sources into its VPCs. You need to use the EC2 console to allow inbound SSH access from your IP address, by adding a rule to a security group.

This post assumes you can add this rule to the default security group for the default VPC in the region you are using. This will allow immediate access to the machines we will create with chef-provisioning.  If you can’t do this, the examples won’t work without some manual intervention – i.e. you will need to add the security group to the created instances before you can run recipes on them.

We also need to let chef-provisioning know about the keys. Add the following to your .chef/knife.rb file:

knife[:ssh_user] = "ubuntu"
knife[:aws_ssh_key_id] = 'test2_aws'
private_keys     'test2_aws' => '/home/christine/.ssh/test2_aws.pem'
public_keys      'test2_aws' => '/home/christine/.ssh/'

Line 1 is the user name to use when SSH’ing to the instance. For the standard Ubuntu image, this should be ‘ubuntu’. Line 2 specifies which key name to use for AWS, and Lines 3 & 4 setup the locations of the private and public keys.

Enable external access to the application

Our test application requires TCP access on port 3001. Open this port by adding a Custom TCP rule to the security group for the default VPC, allowing access from any IP address (CIDR block ‘’).

The inbound rules should now include an SSH rule (TCP port 22) for your IP address and a Custom TCP rule for port 3001.

Creating the AWS instances

Create basic machine provisioning recipe

Our first pass at the chef-provisioning recipes will just create the instances, with nothing on them.

We will create two recipes in the chefprov directory. The first (aws_setup.rb) will set up the AWS-specific details. The second (topo.rb) will create the machines.


require 'chef/provisioning/aws_driver'
with_driver 'aws'

with_machine_options :bootstrap_options => {
  :key_name => 'test2_aws',
  :instance_type => 't1.micro',
  :associate_public_ip_address => true
}

The ‘with_machine_options’ block specifies what sort of instances we want to create.

The ‘with_driver’ line tells Chef to use the ‘chef-provisioning-aws’ driver. This driver is one of two AWS drivers distributed with ChefDK, and is an alternative to the more established chef-provisioning-fog driver. I am using it because of its support for a growing range of other AWS resources (VPCs, security groups, S3, and others). To use the fog driver, replace ‘aws’ with ‘fog:aws’. You may also need to make other changes; for example, ‘:instance_type’ is ‘flavor_id’ in the fog driver.

With ‘:instance_type’, we choose the smallest and cheapest type of instance to experiment with.

The ‘:associate_public_ip_address’ option associates a public IP address with the instance, so that Chef can SSH to it.

We are using the default AMI, which is currently Ubuntu 14.04.

The full set of ‘:bootstrap_options’ corresponds to the options listed for the AWS create-instance method.

The second recipe specifies a simple topology with two machines in it:


require 'chef/provisioning'
machine 'db'
machine 'appserver'

This recipe will create and start the machines, and bootstrap the chef-client onto them.

UPDATE: chef-provisioning-aws 1.2.1 introduces new default AMIs. If the command above fails with:

AWS::EC2::Errors::InvalidParameterCombination: Non-Windows instances with a 
virtualization type of 'hvm' are currently not supported for this instance type.

then replace t1.micro with t2.micro in the above:

with_machine_options :bootstrap_options => {
  :key_name => 'test2_aws',
  :instance_type => 't2.micro',
  :associate_public_ip_address => true
}

UPDATE: If the above command fails with:

         Unexpected Error:
         ChefZero::ServerNotFound: No socketless chef-zero server on given port 8889

then add the following to each machine resource:

machine 'db' do
  chef_server( :chef_server_url => 'http://localhost:8889')
end

machine 'appserver' do
  chef_server( :chef_server_url => 'http://localhost:8889')
end

or add the following in the setup recipe:

with_chef_server "http://localhost:8889"

This problem exists in ChefDK 0.6.0 to 0.6.2.

Run the recipe

Before proceeding, be aware that you will be charged for the resources that these recipes create. Make sure you delete any instances after you are done. I will tell you how to do that using chef-provisioning, but I advise you to log on to the EC2 console and make sure you have no instances left running when you are done.

To run the recipes, enter:
chef-client -z aws_setup.rb topo.rb

For each of the two machines, you should see the chef-client run create a node, wait for the machine to become connectable (this may take a while), bootstrap the chef-client and perform an empty run.
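
The "wait for the machine to become connectable" step is essentially a polling loop against the SSH port. The sketch below illustrates the idea in plain Ruby; it is not chef-provisioning's actual implementation, and the function names, attempt count and delay are all invented for this example.

```ruby
require 'socket'

# True if a TCP connection to host:port succeeds within `timeout` seconds.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue StandardError
  false
end

# Poll until the port accepts connections, giving up after `attempts` tries.
# Returns true as soon as a connection succeeds, false if none does.
def wait_for_port(host, port, attempts: 30, delay: 5)
  attempts.times do
    return true if port_open?(host, port)

    sleep delay
  end
  false
end
```

A new instance refuses or drops connections until its SSH daemon is up, which is why this phase of the run can take a few minutes. If it never completes, the usual suspect is the security group rule for port 22.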

If you go to the EC2 console, you should see both machines (named ‘db’ and ‘appserver’) are up and running.

Working around SSH issue

If you are trying this with ChefDK 0.3.6 on Ubuntu, you may encounter the following error:

         Chef encountered an error attempting to load the node data for "db"

         Unexpected Error:
         NoMethodError: undefined method `gsub' for nil:NilClass

This is a known issue with chef-provisioning providing a bad URL for the local-mode server. If you can upgrade to ChefDK 0.4.0, this problem has been fixed (but be aware that ChefDK 0.4 embeds Chef 12 and not Chef 11).

A workaround for ChefDK 0.3.6 is to create the following Gemfile in your chefprov directory:

source ''

gem 'chef-dk'
gem 'chef-provisioning'
gem 'chef-provisioning-aws'
gem 'net-ssh', '=2.9.1'

and then run chef-client using:

bundle exec chef-client -z aws_setup.rb topo.rb

This will run the chef-client using a previous version of ‘net-ssh’, which avoids the problem.

You will likely need to use ‘bundle exec’ in front of all of the chef-client runs described in this post.

Setup and deploy the Application

Get the application cookbooks

The basic application we will install can be found in the ‘test-repo’ for the ‘knife-topo’ plugin on Github.

First, download the latest release of the knife-topo repository and unzip it.

Then we will use ‘berks vendor’ to assemble the cookbooks we need to deploy this application:

cd knife-topo-0.0.11/test-repo
berks vendor
cp -R berks-cookbooks/* ~/chefprov/cookbooks

Line 2 uses the Berksfile to assemble all of the necessary cookbooks into the ‘berks-cookbooks’ directory.

Line 3 copies them into our ‘chefprov’ repo, where the local-mode server will look for them when it runs the chef-provisioning recipes.

Extend machine provisioning to include runlists

Now change the topo.rb provisioning recipe as follows:

require 'chef/provisioning'

machine 'db' do
  run_list ['apt','testapp::db']
end

machine 'appserver' do
  run_list ['apt','testapp::appserver']
end

and rerun the chef-client:
chef-client -z aws_setup.rb topo.rb

This time, the chef-client running on the two instances will execute the specified recipes, installing nodejs on ‘appserver’ and mongodb on ‘db’.

Deploy the application

We will now create a third recipe (deploy.rb) to deploy the application. We could have included this as part of the ‘topo.rb’ recipe, but I chose to make it a separate recipe, so it can be run independently.

Here’s what the recipe looks like:


require 'chef/provisioning'

machine 'appserver' do
  run_list ['testapp::deploy']
  attribute ['testapp', 'user'], 'ubuntu'
  attribute ['testapp', 'path'], '/var/opt'
  attribute ['testapp', 'db_location'], lazy { search(:node, "name:db").first['ipaddress'] }
end

ruby_block "print out public IP" do
  block do
    appservernode = search(:node, "name:appserver").first
    Chef::Log.info("Application can be accessed at http://#{appservernode['ec2']['public_ipv4']}:3001")
  end
end

Line 4 runs the recipe to deploy the application.

Lines 5 to 7 set attributes on the node that customize the test application. For example, Line 7 sets the attribute node[‘testapp’][‘db_location’] to the IP address of the database server, which it looks up using a search for node information stored in the local-mode Chef server (i.e. in the ‘chefprov/nodes’ directory).

In Line 7, ‘lazy’ is used so that the search occurs during the converge phase of the chef-run, not during the compile phase. This is important if the ‘topo.rb’ and ‘deploy.rb’ recipes are run in a single runlist, because the IP address of the database server will only be known after the db machine resource has actually been executed in the converge phase.
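
The difference between an eager and a deferred lookup can be shown with a toy model in plain Ruby. This is not how Chef implements ‘lazy’; the Deferred struct, the helper names, the hash standing in for node data and the IP address are all invented for this illustration.

```ruby
# Made-up stand-in for the object that `lazy { ... }` produces: it stores a
# block and only runs it when asked.
Deferred = Struct.new(:block) do
  def resolve
    block.call
  end
end

def lazy_value(&block)
  Deferred.new(block)
end

# Simulate one run and return [eager_result, deferred_result].
def simulate_run
  node_data = {} # stand-in for node data that does not exist yet

  # "Compile phase": attributes are declared. The eager lookup runs now.
  eager = node_data['db_ip']
  deferred = lazy_value { node_data['db_ip'] }

  # "Converge phase": the db machine is created and its address is recorded.
  node_data['db_ip'] = '10.0.0.5'

  [eager, deferred.resolve]
end

eager, deferred = simulate_run
puts "eager:    #{eager.inspect}"
puts "deferred: #{deferred.inspect}"
```

The eager lookup captures nothing because it runs before the address exists, while the deferred block sees the value recorded later. That is the behaviour the recipe relies on when both recipes are in one runlist.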

The ‘ruby_block’ resource at the end of the recipe prints out the URL for the application, which uses the public IP address of the application server. It is a ‘ruby_block’ resource so that the lookup occurs in the converge phase, once the application server has been created and configured.

Run the chef-client:
chef-client -z aws_setup.rb deploy.rb

At the end of the run, you should see something like:

  * ruby_block[print out public IP] action run[2015-01-31T21:28:38-06:00] INFO: Processing ruby_block[print out public IP] action run (@recipe_files::/home/christine/chefprov/deploy.rb line 9)
[2015-01-31T21:28:38-06:00] INFO: Application can be accessed at
[2015-01-31T21:28:38-06:00] INFO: ruby_block[print out public IP] called

    - execute the ruby block print out public IP
[2015-01-31T21:28:38-06:00] INFO: Chef Run complete in 21.74813493 seconds

Running handlers:
[2015-01-31T21:28:38-06:00] INFO: Running report handlers
Running handlers complete
[2015-01-31T21:28:38-06:00] INFO: Report handlers complete
Chef Client finished, 2/2 resources updated in 23.594399737 seconds

Browse to the application URL, and you should see something like:

 Congratulations! You have installed a test application using the knife topo plugin.

 Here are some commands you can run to look at what the plugin did:

    knife node list
    knife node show dbserver01
    knife node show appserver01
    knife node show appserver01 -a normal
    knife data bag show topologies test1
    cat cookbooks/testsys_test1/attributes/softwareversion.rb

Go to the knife-topo plugin on Github

Ignore the example commands as we did not use the knife-topo plugin.

Destroy the machines

To destroy the machines, create a recipe (destroy.rb):


require 'chef/provisioning'
machine 'db' do
  action :destroy
end

machine 'appserver' do
  action :destroy
end
And run it:
chef-client -z destroy.rb

You should see messages like:

  * machine[appserver] action destroy[2015-02-01T09:20:43-06:00] INFO: Processing machine[appserver] action destroy (@recipe_files::/home/christine/chefprov/destroy.rb line 7)

    - Terminate appserver (i-93a8db50) in us-west-1 ...[2015-02-01T09:20:46-06:00] INFO: Processing chef_node[appserver] action delete (basic_chef_client::block line 26)

    - delete node appserver at http://localhost:8889[2015-02-01T09:20:46-06:00] INFO: Processing chef_client[appserver] action delete (basic_chef_client::block line 30)
[2015-02-01T09:20:46-06:00] INFO: chef_client[appserver] deleted client appserver at http://localhost:8889

    - delete client appserver at clients

You should see these messages for both ‘db’ and ‘appserver’. If the run succeeds but you do not see them, you may have specified the wrong machine name.

Until you are confident in your scripts, you may want to use the EC2 console to make sure you have terminated the instances (don’t forget to navigate to the right region). You may also want to remove the added rules from the VPC default security group.