Avoiding the possible pitfalls of derived attributes

What this post is about

This post is intended for folks who are comfortable with the basics of attributes in Chef, but want to understand some of the subtleties better. It focusses on one specific aspect – derived or computed attributes – and how to make sure they end up with the value you intend. I’m going to cover four topics:

  • Attribute precedence in practice
  • The possible pitfalls of derived attributes
  • How attribute values are determined
  • Some ways to solve the problem

This post is with huge thanks to the many people at the 2014 Chef Summit and online who helped me with this topic, including but not limited to Noah Kantrowitz, Julian Dunn and Lamont Granquist. The good ideas are theirs, any mistakes are mine.

Attribute precedence in practice

First: it’s not all bad news. Much of the time the attribute precedence scheme in Chef 11, although complex, will do what you want it to. The complexity is there because Chef supports multiple different approaches to customizing attribute values, particularly (1) using wrapper cookbooks versus (2) using roles and environments. You can see some of these tradeoffs in this description of Chef 11 attribute changes.

Here’s a reminder of the attribute precedence scheme. The highest numbers indicate the highest precedence:


Image linked from Chef documentation.

One benefit of the above scheme is that you can override default attributes with new values in a wrapper cookbook at default level. You do not need to use a higher priority level. This is important because you can wrap the wrapper if you have to, without suffering “attribute priority inflation”. Why wrap a wrapper? It can be very useful when you need multiple levels of specialization, e.g. to set defaults for an organization, override some of those defaults for a business system, and then do further customizations for a specific deployment of that business system.

A benefit of the precedence scheme when working with roles and environments is that you can set default attributes in a role or environment, and they will override the default attributes in cookbooks. The mental model is that your cookbooks are designed to be generally reusable, and have the least context-awareness. An environment provides additional context such as implementing the policies specific to your organization. Similarly, roles can configure the recipes to meet a specific functional purpose.

The possible pitfalls of derived attributes

So where can it go wrong? Let’s use a simple example, consisting of a base cookbook called “app”, and a wrapper cookbook called “my_app” which has a dependency on “app” in its metadata.rb. The contents of those cookbooks are:

cookbooks/app/attributes/default.rb:

  default["app"]["name"] = "app"
  default["app"]["install_dir"] = "/var/#{node["app"]["name"]}"

-------------------------------------------------------------------------------
cookbooks/app/recipes/default.rb:

ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, app name is: #{node['app']['name']};" +
      " install_dir is #{node['app']['install_dir']}"
  end
end

------------------------------------------------------------------------------
cookbooks/my_app/attributes/default.rb:

  default["app"]["name"] = "my_app"

------------------------------------------------------------------------------
cookbooks/my_app/recipes/default.rb:

  include_recipe "app::default"

And they are uploaded to the server using:

knife cookbook upload app my_app

The base “app” cookbook has an application “name” attribute which defaults to “app”, and an “install_dir” attribute which is calculated from the application name. For simplicity, the recipe which would actually deploy “app” just prints out the value of the attributes using a ruby block so that we see the values that would be used when the resources are run. The wrapper “my_app” cookbook changes the application name attribute from “app” to “my_app”.

What happens if we run the wrapper cookbook?

sudo chef-client -r 'recipe[my_app]' -linfo

[Image: run1 (output of the chef-client run above)]

The “name” attribute is set to “my_app”, however the derived “install_dir” attribute still has its old value of “/var/app”, which is probably not what was intended. This is not a question of priority: if the wrapper contained override["app"]["name"] = "my_app", we would get the same result. To understand why this happens, we need to look at the order of evaluation of the attributes.

What happens during this chef-client run in this example is as follows:

  1. As there are no roles or environments, the “compile” phase starts by evaluating attribute files based on the cookbooks in the runlist and their dependencies.
  2. The first cookbook in the list is “my_app”, which has a dependency on “app”. Dependencies are loaded first, so the default “name” attribute is set to “app” and the default “install_dir” attribute is set to “/var/app”.
  3. The “my_app” wrapper attribute file is loaded second and updates the default “name” attribute to “my_app”. The “install_dir” attribute is not updated and therefore keeps its value of “/var/app”.
  4. After that, the recipe files are loaded and the ruby_block resource is added to the resource collection. The block itself is not run yet, but the “name” and “install_dir” attributes already hold the values they will have when it is.
  5. The “converge” phase executes the resources in the resource collection. The block reads the attributes and prints out the values “my_app” and “/var/app”.
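
The same sequence can be reproduced outside Chef. This is only a minimal sketch in plain Ruby: the hash named attrs is my own stand-in for the node's default attributes, not the real node object, and the step numbers in the comments refer to the list above.

```ruby
# Plain-Ruby stand-in for the node's default attributes (an assumption for
# illustration; this is not Chef's node object).
attrs = { "app" => {} }

# Step 2: the dependency's attribute file (app) is evaluated first.
attrs["app"]["name"] = "app"
# The string is interpolated immediately, using the value of "name" right now.
attrs["app"]["install_dir"] = "/var/#{attrs["app"]["name"]}"

# Step 3: the wrapper's attribute file (my_app) is evaluated second.
# Only "name" is replaced; nothing recomputes "install_dir".
attrs["app"]["name"] = "my_app"

# Steps 4 and 5: the recipe reads whatever is stored at this point.
puts "app name is: #{attrs["app"]["name"]}; install_dir is #{attrs["app"]["install_dir"]}"
```

The derived value is an ordinary string by the time the wrapper runs, so it keeps no link back to the attribute it was built from.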

How attribute values are determined

Basic model

The following diagrams may help explain how attribute values are evaluated.

First, let’s work with a runlist like the following, consisting of three recipes in three cookbooks (cb1, wcb, cb3). The second recipe (wcb::rc2) is a wrapper of a recipe in a fourth cookbook (cb2::r2). Each cookbook has a single attribute file (a_cb1, etc).

[Diagram: runlist]

The diagram below illustrates how the values of the attributes in this example change through the run. The attribute files are evaluated in runlist order but with dependencies (from metadata.rb) evaluated first. In this case, ‘a_cb2’ is evaluated after ‘a_cb1’ but before ‘a_wcb’. As the attribute files are evaluated, attribute values are put into “buckets” based on the attribute name and priority, e.g. node.default['x'] updates the value of ‘x’ in the default bucket for ‘x’. Each subsequent update to the same attribute and priority replaces the value in that bucket.

[Diagram: compile-converge]

When the recipes are run and they access an attribute e.g. node['x'], the value that is passed back is that of the highest priority bucket that has a value in it.
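
To make the bucket idea concrete, here is a rough sketch in plain Ruby. It is a simplified model of my own, not Chef's implementation: it only knows three levels, and the names levels, buckets and read are invented for the example.

```ruby
# Hypothetical, simplified model of attribute "buckets" (not Chef's real code).
levels  = [:default, :normal, :override]            # lowest priority first
buckets = levels.map { |level| [level, {}] }.to_h   # one set of buckets per level

# Reading returns the value from the highest-priority level that has one.
read = lambda do |key|
  level = levels.reverse.find { |l| buckets[l].key?(key) }
  level && buckets[level][key]
end

buckets[:default]["x"] = "app"      # base cookbook attribute file
buckets[:default]["x"] = "my_app"   # same attribute and level: the value is replaced

buckets[:override]["z"] = "high"
buckets[:default]["z"]  = "low"     # written later, but at a lower priority

puts read.call("x")
puts read.call("z")
```

Writing order matters within one bucket, while priority decides between buckets. That is why “z” still reads as “high” even though the default was written afterwards.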

Here’s an example showing the problem with a derived attribute. In the first step when “y” is calculated, the value of “x” is “app” and so “y” is set to “/var/app”. The value of “x” is set to “my_app” in the second step. When recipe r2 retrieves the attribute values in the third step, it therefore gets “my_app” and “/var/app”:

[Diagram: eval1]

This diagram shows why using a higher priority does not solve the problem. Again, “y” is calculated in the first step and “x” is not set to “my_app” until the second step:

[Diagram: eval2]

There is a wrinkle that you should be aware of. If you choose to set “normal” precedence rather than “override” in the above, the first run would still give the same result as above, but subsequent runs would “work”. Normal attributes are special because they persist across chef-client runs. If the wrapper cookbook contained normal['x'] = "my_app", “y” would still be computed as “/var/app” on the first run. On the second run, however, it would change to “/var/my_app”, because “my_app” would be in the “normal” bucket at the start of evaluation and would be used in the first step to calculate “y”.
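
The two-run behaviour can be imitated with a small plain-Ruby sketch. Everything here is an assumption made for illustration: saved_normal stands in for the normal attributes stored with the node between runs, and run models one evaluation of the two attribute files.

```ruby
# Hypothetical stand-in for the normal attributes saved with the node between runs.
saved_normal = {}

# One simulated chef-client run; returns the value of "y" seen by the recipe.
run = lambda do
  default = {}              # default attributes start empty on every run
  normal  = saved_normal    # normal attributes from the previous run are already present
  read = ->(key) { normal.key?(key) ? normal[key] : default[key] }

  # Base cookbook attribute file: "y" is derived from the current value of "x".
  default["x"] = "app"
  default["y"] = "/var/#{read.call("x")}"

  # Wrapper cookbook attribute file: sets "x" at normal precedence.
  normal["x"] = "my_app"

  read.call("y")
end

first_run  = run.call
second_run = run.call
puts "first run:  #{first_run}"
puts "second run: #{second_run}"
```

A result that differs between the first and second run is hard to debug, which is a good reason not to lean on this behaviour as a fix.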

Model including roles

Roles introduce two changes to our model:

  1. Role attributes have a higher precedence than those in cookbooks, effectively creating two new rows of buckets labelled as “role_default” and “role_override” in the diagram below
  2. Role attributes are always evaluated before cookbook attributes, regardless of their runlist position

[Diagram: eval3 (attribute evaluation with the “role_default” and “role_override” buckets)]

These precedence rules mean that you can use roles to avoid the derived attribute problem, as shown below:

[Diagram: eval3 (setting “x” in a role before “y” is evaluated)]

Setting the default value of “x” to “my_app” in the role guarantees that the value of “my_app” will be present when “y” is evaluated in cookbook cb2. “my_app” will be used rather than “app” because a role default value takes precedence over the cookbook default (it is in a higher priority bucket).
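
Here is the same idea as a rough plain-Ruby sketch. It is a toy model rather than Chef itself: only the default and role_default buckets are modelled, and levels, buckets and read are names I made up for the example.

```ruby
# Hypothetical two-level model: cookbook default and role default (lowest first).
levels  = [:default, :role_default]
buckets = { default: {}, role_default: {} }

# Reading returns the value from the highest-priority level that has one.
read = lambda do |key|
  level = levels.reverse.find { |l| buckets[l].key?(key) }
  level && buckets[level][key]
end

# Role attributes are loaded before any cookbook attribute file.
buckets[:role_default]["x"] = "my_app"

# Cookbook cb2's attribute file is evaluated afterwards.
buckets[:default]["x"] = "app"
buckets[:default]["y"] = "/var/#{read.call("x")}"   # reads "my_app" from the role bucket

puts read.call("x")
puts read.call("y")
```

Both conditions are needed: the role value has to be present before “y” is computed, and it has to win over the cookbook default when “x” is read.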

Model including environments

Environments add two new precedence levels, one between default and role_default; one after role_override and before “automatic”. Like roles, they are always evaluated before cookbook attributes.
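
Putting the levels mentioned so far in order gives something like the sketch below. This is a simplified model in plain Ruby that follows the description in this post and leaves out any levels it does not discuss. The env_default and env_override labels, like levels, buckets and read, are names invented for the example.

```ruby
# Lowest priority first. The env_* labels are made up for this sketch.
levels = [:default, :env_default, :role_default, :normal,
          :override, :role_override, :env_override, :automatic]
buckets = levels.map { |level| [level, {}] }.to_h

# Reading returns the value from the highest-priority level that has one.
read = lambda do |key|
  level = levels.reverse.find { |l| buckets[l].key?(key) }
  level && buckets[level][key]
end

buckets[:env_default]["x"] = "from_environment"   # loaded before cookbook attributes
buckets[:default]["x"]     = "app"                # cookbook default, evaluated later
with_environment = read.call("x")

buckets[:role_default]["x"] = "from_role"         # role default outranks environment default
with_role = read.call("x")

puts with_environment
puts with_role
```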

Some ways to solve the problem

As a user of a cookbook with a derived attribute

If you are a user of a cookbook with a derived attribute and do not have the option of modifying the base cookbook, you have two basic choices:

  • Always set any computed attributes if you change the attributes that they are derived from
  • Use a role or environment

Set computed attributes

The simplest approach is to make sure you set all of the attributes that are derived from attributes that you want to change. In our original example, we would specify both “name” and “install_dir”, e.g.:

my_app/attributes/default.rb:

  default["app"]["name"] = "my_app"
  default["app"]["install_dir"] = "/var/my_app"

This is probably the approach you will want to take if you use the wrapper cookbook approach.

Use a role or environment

As explained in the Model including roles section, attributes in roles have priority over attributes in cookbooks, and are also always evaluated before them. If you use roles, then setting an attribute in a role will also change any computed attributes. In our original example, we could define a “myapp” role as:

roles/myapp.json:
{
  "name": "myapp",
  "default_attributes": {
    "app": {
      "name": "my_app"
    }
  }
}

knife role from file roles/myapp.json

Then run with a modified runlist:

sudo chef-client -r 'recipe[app]','role[myapp]' -linfo

The result would be to set both the “name” attribute to “my_app”, and the “install_dir” attribute to “/var/my_app”.

As an author of a cookbook with a derived attribute

As an author of a cookbook, you may prefer not to rely on users noticing derived attributes and handling them appropriately. Here are some possibilities to make life easier for your users:

  • Use a variable and not an attribute
  • Use delayed evaluation in the recipe
  • Use conditional assignment in the recipe

A gist for this example.

Use a variable and not an attribute

If the derived value should always be calculated, then don’t use an attribute; use a Ruby variable in the recipe. In our original example, if “install_dir” should always be “/var” followed by the application name, remove the derived attribute and instead do the following in the recipe:

app/attributes/default.rb:

  default["app"]["name"] = "app"

-------------------------------------------------------------------------------
app/recipes/default.rb:

install_dir = "/var/#{node["app"]["name"]}"
ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, application name is: #{node['app']['name']};" +
      " install_dir is #{install_dir}"
  end
end

Similarly, if the user needs to be able to change the root path for the install directory but the application should always be installed in a directory with the application name, create two attributes for “root_path” and “name”, and combine them using a variable:

app/attributes/default.rb:

  default["app"]["name"] = "app"
  default["app"]["root_path"] = "/var"

-------------------------------------------------------------------------------
app/recipes/default.rb:

install_dir = "#{node["app"]["root_path"]}/#{node["app"]["name"]}"
ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, application name is: #{node['app']['name']};" +
      " install_dir is #{install_dir}"
  end
end

Use delayed evaluation in the recipe

Noah Kantrowitz proposed an approach for delaying evaluation of the derived attribute into the recipe, whilst still allowing it to be defined and overridden in the attribute file.

This approach sets up a template for the derived attribute in the attribute file, using Ruby’s %{name} format-string syntax to define a named placeholder. It then uses Ruby’s String#% method in the recipe file to fill in the template, i.e. to substitute the actual value of the placeholder. In our original example, this would look like:

app/attributes/default.rb:

  default["app"]["name"] = "app"
  default["app"]["install_dir"] = "/var/%{name}"

-------------------------------------------------------------------------------
app/recipes/default.rb:

install_dir = node["app"]["install_dir"] % { name: node["app"]["name"]}
ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, application name is: #{node['app']['name']};" +
      " install_dir is #{install_dir}"
  end
end

node["app"]["install_dir"] % { name: node["app"]["name"]} causes Ruby to substitute the value of the “name” attribute wherever the placeholder “%{name}” appears in the “install_dir” attribute. Because this substitution is delayed until the recipe is evaluated, the “name” attribute has already been set by the wrapper cookbook, and “install_dir” will be set to “/var/my_app”.
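
The substitution itself is plain Ruby and can be tried without Chef. This small sketch uses ordinary strings in place of node attributes; the variable names are mine.

```ruby
# The template is an ordinary string; %{name} is a named placeholder.
template = "/var/%{name}"

# String#% fills in the placeholder from a hash with symbol keys.
default_dir = template % { name: "app" }
wrapped_dir = template % { name: "my_app" }
puts default_dir
puts wrapped_dir

# If the hash has no entry for the placeholder, Ruby raises KeyError.
missing_key_error =
  begin
    template % { other: "value" }
    nil
  rescue KeyError => e
    e
  end
puts "KeyError raised: #{!missing_key_error.nil?}"
```

The placeholder is looked up by symbol key, which is why the recipe builds a small hash such as { name: ... } rather than passing a node attribute hash directly.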

One consequence of this approach is that the “install_dir” attribute will have a value of “/var/%{name}” in the node object at the end of the run. This may not be desirable if “install_dir” was something you used in node searches. It also means that any cookbooks that reference the “install_dir” attribute need to perform the placeholder substitution before using it.

Use conditional assignment in the recipe

This approach is based on something suggested by Lamont Granquist. It uses conditional logic in the recipe that will only set the default value if no other value has been provided in a wrapper cookbook. Our example would look like this:

app/attributes/default.rb:

  default["app"]["name"] = "app"
  # default["app"]["install_dir"] = "/var/#{node['app']['name']}"

-------------------------------------------------------------------------------
app/recipes/default.rb:

install_dir = node["app"]["install_dir"] || "/var/#{node['app']['name']}"
ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, application name is: #{node['app']['name']};" +
      " install_dir is #{install_dir}"
  end
end

The line for “install_dir” in the attribute file is commented out, so that it does not take effect but a user can see that the attribute exists and can be overridden. The line install_dir = node["app"]["install_dir"] || "/var/#{node['app']['name']}" will take any overridden value of the node attribute, but otherwise will set it based on the “name” attribute. The conditional assignment is important because otherwise it would overwrite an assignment done in the wrapper cookbook.

With this code, the “install_dir” attribute saved in the node object will be nil unless it has been overridden. If you want the actual value used to be saved, you may want to conditionally set the node attribute rather than a variable, e.g. node.default["app"]["install_dir"] = "/var/#{node['app']['name']}" unless node["app"]["install_dir"].
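
The fallback logic can be checked in isolation. In this plain-Ruby sketch a hash stands in for node["app"], and resolve_install_dir is a hypothetical helper name used only for the example.

```ruby
# Hypothetical helper: use an explicitly set install_dir if there is one,
# otherwise derive it from the (final) application name.
resolve_install_dir = lambda do |app|
  app["install_dir"] || "/var/#{app["name"]}"
end

# Wrapper changed only the name: the directory is derived from it.
derived = resolve_install_dir.call({ "name" => "my_app" })

# Wrapper set install_dir explicitly: that value is kept.
explicit = resolve_install_dir.call({ "name" => "my_app", "install_dir" => "/opt/custom" })

puts derived
puts explicit
```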

11 thoughts on “Avoiding the possible pitfalls of derived attributes”

  1. Thanks for the post. A solution we hit upon was to have our recipes call another recipe, which would set the derived attribute.

    It’s not ideal but it worked well for our specific use case: we have a cookbook that sets a bunch of attributes, and then calls some third-party cookbooks with those attributes. We want the callers of our cookbook to only set one attribute, from which we will then derive several others.

    So we had each of our recipes that use these attributes call a recipe that simply sets those derived attributes, before proceeding on to calling the cookbooks that use them.
