Chef: Bootstrapping an autoscaled Node using Cloud-Init, Cloud-Config and UserData

01 / Oct / 2015 by Aakash Garg 1 comment

This blog covers bootstrapping an autoscaled node using Cloud-Init, Cloud-Config and UserData so that it becomes available for serving requests. In this blog I am bootstrapping a web server. We will be using a YAML file and a recipe to implement this.

What is Cloud-Init?

Cloud-Init is a package that comes pre-installed on most cloud images; it handles the early initialization of cloud servers. It locates the user-data and executes the instructions written in it. Among the user-data formats it understands, the two most common are shell scripts and cloud-config directives.

What is Cloud-Config?

It is a Cloud-Init format that defines the parameters required when the server boots up.
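
As a rough illustration, a minimal cloud-config file might look like the sketch below. The package name is a placeholder chosen for this example and is not part of the setup described in this blog. The first line, #cloud-config, is what tells Cloud-Init to treat the user-data as cloud-config directives rather than as a shell script.

```yaml
#cloud-config
# Illustrative sketch only: the package below is an assumption
# made for this example.

# Refresh the package index on first boot
package_update: true

# Install packages from the distribution's repositories
packages:
  - curl
```

Because the file is plain YAML, indentation is significant, so it is worth validating the file before launching instances with it.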

What is UserData?

UserData is a type of metadata that can be passed to an instance to perform different actions, including executing scripts and Chef recipes on boot. On AWS it is limited to 16 KB.


What is Chef?

Chef is a configuration management tool written in Ruby and Erlang. It is used to manage and configure an organisation’s servers.


While bootstrapping a node with Chef, the user-data on the node is read by the Cloud-Init program and chef-client is installed. The chef-client then registers itself with the Chef server and pulls all the recipes configured for that ‘type’ of node.

Use Case:

We had a requirement where we wanted an AWS autoscaling node to automatically bootstrap and get itself ready for serving web requests via Nginx. The autoscaling node is supposed to be a raw machine and gets its packages and services from the Chef server via UserData and Cloud-Init.

User Data Dissection



# This example assumes the instance is 12.04 (precise)
# The default is to install from packages.

# Key from
apt_upgrade: true
apt_sources:
  - source: "deb $RELEASE-0.10 main"
    key: |
      Version: GnuPG v1.4.9 (GNU/Linux)


# The key above verifies that the package we are downloading comes from a trusted source



The above will upgrade the installed packages and register the package source so that chef-client can be installed on the machine.
apt_upgrade :- It upgrades the packages already installed on the instance at first boot.
apt_sources :- Here we specify the package repository from which the required package will be installed.
key :- The public key given here is used to verify that the package being installed comes from a trusted source.


chef:
  run_list:
    - "recipe[nginx]"

  # Valid values are 'gems', 'packages' and 'omnibus'
  install_type: "packages"

  # Boolean: run 'install_type' code even if chef-client
  # appears already installed.
  force_install: false

  # Chef settings
  server_url: "https://ip-X-X-X-X.ec2.internal/organizations/organisation-name"

  # Node Name
  # Defaults to the instance-id if not present
  node_name: "#{Socket.gethostname}-test"
  # Default validation name is organisation-validator
  validation_name: "organisation-validator"
  validation_key: |

# The above key is organisation-validator.pem, which is present on the Chef server



The above settings are applied once chef-client has been installed.
run_list :- Here we specify the recipes which need to be executed once chef-client is installed.
install_type :- It specifies the type of installation you want, which can be ‘gems’, ‘packages’ or ‘omnibus’.
server_url :- Here we specify the Chef server URL.
node_name :- Here we specify the name under which the node will be registered.
validation_name :- Here we give the name of the organisation validator.
validation_key :- Here we specify the organisation validator key used to authenticate the node with the Chef server.


  # Still inside the chef: section
  # If install_type is 'omnibus', change the URL to download
  omnibus_url: ""

# Capture all subprocess output into a logfile
# Useful for troubleshooting cloud-init issues
output: {all: '| tee -a /var/log/cloud-init-output.log'}



omnibus_url :- Here we specify the URL from which the omnibus installer is downloaded if the install type ‘omnibus’ is selected.
output :- It specifies where the logs of the above events are stored.
runcmd :- Here we can write shell commands or scripts which will be executed on the machine.
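
Since no runcmd section appears in the fragments above, here is a small sketch of what one could look like. The marker file path and the commands themselves are assumptions made for illustration, not the commands used in the original script. It shows the two item forms that runcmd accepts.

```yaml
runcmd:
  # A plain string is handed to the shell as one command line
  - echo "user-data commands ran at $(date)" >> /var/log/bootstrap-marker.log
  # A list is executed directly, each element being one argument
  - [ service, nginx, status ]
```

The second command only succeeds if Nginx is already installed by the time it runs, so treat it as a placeholder for whatever check suits your node.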

The above yaml script can be downloaded from here


Logs can be checked at cloud-init-output.log and client.log.


cloud-init-output.log contains the logs of all the events that occur when userdata.yaml is executed. Its location is specified in userdata.yaml itself:
output: {all: '| tee -a /var/log/cloud-init-output.log'}
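
If a single combined log becomes hard to read, the output directive can also be split per boot stage. The following is only a sketch: the log file names are placeholders invented for this example, while the stage names (init, config, final) and the three value forms (mapping, string, list) are the ones Cloud-Init documents for this directive.

```yaml
output:
  # init stage: separate files for stdout and stderr (mapping form)
  init:
    output: "> /var/log/cloud-init.out"
    error: "> /var/log/cloud-init.err"
  # config stage: a single string applies to both stdout and stderr
  config: "| tee -a /var/log/cloud-config.log"
  # final stage: list form, stdout first and stderr second
  final:
    - ">> /var/log/cloud-final.out"
    - ">> /var/log/cloud-final.err"
```

For a small bootstrap like this one, the single all entry used in this blog is usually enough.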


client.log contains the logs written after Chef has been installed and tries to connect to the Chef server. Here we can check whether our node has connected to our Chef server.

What will this UserData do?

  • It will first update the package sources.
  • Then it will download and install chef-client on that node from the defined repository.
  • For the initial handshake and registration, it will use the organization validator key and will generate an RSA private key (client.pem) for further communication.
  • After connecting to the Chef server, it will get its packages as configured on the Chef server and execute the recipe passed in run_list.
  • Once the chef-client run is completed, the server is ready!


We can bootstrap our servers automatically with predefined cookbooks, without logging into the server instance or into the Chef workstation to bootstrap it manually.


comments (1 “Chef: Bootstrapping an autoscaled Node using Cloud-Init, Cloud-Config and UserData”)

  1. Narendra

    Hi Garg,

    It looks very helpful for me to get started bootstrapping the ec2 instances automatically. I’ve come across a problem, where I’ve to pass unique name to the name= “” parameter to identify my components in chef-server. am using terraform currently to provision the infrastructure. I am not able to achieve this scenario, it would be great if you can help me on this. Thanks very much in advance

