Advanced Chef: Writing Heavy Weight Resource Providers (HWRP)

The last in a short series of Chef related blog posts

Heavy Weight Resource Providers

This is a bit of a backwards term. When Chef first came out, there was no Light Weight Resource Provider (LWRP) syntax and any hardcore extension to Chef had to be written in Ruby. However, Opscode saw a need to be filled and created LWRP, making it easier to create your own Resources. The problem comes when LWRP cannot fulfill all of your needs. This means you need to fall back to writing pure ruby code. For lack of a better term, I’ll call this method a HWRP, or Heavy Weight Resource Provider.

While writing a LWRP is meant to be simple and elegant, writing a HWRP is meant to be flexible. It gives you the full power of ruby in exchange for elegance.

Let’s go over some interactions between LWRP and HWRP.

A few things you need to know

HWRPs and LWRPS are interchangeable

With LWRP you are taught to create a Resource and a Provider together. This is the simplest way. However, just because you need to convert a resource definition or a provider into a HWRP you do not need to convert both.

The LWRP syntax ’compiles’ into real ruby code, so Chef will not know the difference in how they were defined. A valid cookbook directory structure:

1
2
3
4
5
6
7
8
libraries/
    provider_default.rb
providers/
resources/
    default.rb
recipes/
    default.rb
metadata.rb

HWRPs live in library files

Anything you put in resources/ or providers/ Chef will attempt to parse at runtime. We don’t want Chef trying to read our HWRP as the Chef DSL, we want it to interpret it as code. Luckily, anything stored in the libraries/ folder Chef will try to import at runtime. A good example of this can be seen in the runit cookbook.

How to write a HWRP:

Let’s go through an example. We are going to create a HWRP that is very simple, it could easily be written as a LWRP. In fact, it will be. While we write the HWRP I will post examples of the analogous LWRP code when applicable.

  • Cookbook Name: cloud
  • Resource Name: magic

An example of calling this in a recipe:

1
2
3
4
5
cloud_magic "My Cloud Magic" do
  action :create
  cloud "My Cloud Magic"
  magic true
end

Resources

Class Structure

We need to inherit from the appropriate Chef classes in our HWRP. Note the class hierarchy as well as the inheritance:

HWRP

1
2
3
4
5
6
7
8
9
10
11
require 'chef/resource'

class Chef
  class Resource
    class CloudMagic < Chef::Resource

    # Some Magic Happens

    end
  end
end

LWRP

This has no counterpart in a LWRP as this part is done automagically by Chef when it reads your files.

Initialization Method

We need to override the initialize method to make sure we have some defaults. We aren’t defining all of our resource attributes here, just the ones that need defaults.

HWRP

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
require 'chef/resource'

class Chef
  class Resource
    class  CloudMagic < Chef::Resource

      def initialize(name, run_context=nil)
        super
        @resource_name = :cloud_magic          # Bind ourselves to the name with an underscore
        @provider = Chef::Provider::CloudMagic # We need to tie to our provider
        @action = :enable                     # Default Action Goes here
        @allowed_actions = [:create, :remove]

        # Now we need to set up any resource defaults
        @magic = true
        @cloud = name  # This is equivalent to setting :name_attribute => true
      end

    # Some Magic Happens

    end
  end
end

Here is a similar LWRP although it actually defines a bit more than the HWRP due to the terseness of the syntax.

LWRP

1
2
3
4
5
action :enable, :remove
default_action :enable

attribute :magic, kind_of => [TrueClass, FalseClass], :default => true
attribute :cloud, kind_of => String, :name_attribute => true

Attribute Methods

Now lets set up some attribute methods in our HWRP. Make sure to read the code comments for an explanation of what is going on.

HWRP

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
require 'chef/resource'

class Chef
  class Resource
    class  CloudMagic < Chef::Resource

      def initialize(name, run_context=nil)
        super
        @resource_name = :cloud_magic          # Bind ourselves to the name with an underscore
        @provider = Chef::Provider::CloudMagic # We need to tie to our provider
        @action = :enable                      # Default Action Goes here
        @allowed_actions = [:create, :remove]

        # Now we need to set up any resource defaults
        @magic = true
        @cloud = name  # This is equivalent to setting :name_attribute => true
      end

      # Define the attributes we set defaults for
      def magic(arg=nil)
        set_or_return(:magic, arg, :kind_of => [TrueClass, FalseClass])
      end

      def cloud(arg=nil)
        # set_or_return is a magic function from Chef that does most of the heavy
        # lifting for attribute access.
        set_or_return(:cloud, arg, :kind_of => String)
        # For now all you need to know is that the method name should be given as
        # a symbol (like :cloud) and that you declare the type with :kind_of
      end

    end
  end
end

There are no more changes to our LWRP everything we just added with these two methods is already included in the LWRP syntax.

LWRP

1
2
3
4
5
action :enable, :remove
default_action :enable

attribute :magic, kind_of => [TrueClass, FalseClass], :default => true
attribute :cloud, kind_of => String, :name_attribute => true

Awesome, now we have defined our resource. Let’s move on to a basic provider.

Providers

Class Structure

Very similar to resources, here is the basic class structure for a provider.

HWRP

1
2
3
4
5
6
7
8
9
class Chef
  class Provider
    class CloudMagic < Chef::Provider

    # Magic Happens

    end
  end
end

LWRP

Once again, this is pure boilerplate with no analogy in a LWRP.

Initialization Method

While we don’t need to write an initialize method (we can), we do need to override load_current_resource.

HWRP

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class Chef
  class Provider
    class CloudMagic < Chef::Provider

      # We MUST override this method in our custom provider
      def load_current_resource
        # Here we keep the existing version of the resource
        # if none exists we create a new one from the resource we defined earlier
        @current_resource ||= Chef::Resource::CloudMagic.new(new_resource.name)

        # New resource represents the chef DSL block that is being run (from a recipe for example)
        @current_resource.magic(new_resource.cloud)
        # Although you can reference @new_resource throughout the provider it is best to
        # only make modifications to the current version
        @current_resource.cloud(new_resource.magic)
        @current_resource
      end

      # Magic Happens

    end
  end
end

LWRP

No LWRP changes yet, all we have done so far is set up some boilerplate for the HWRP.

Action Methods

Now it is time to define what we do in our actions, with our HWRP we need to define methods like action_create to define a :create action. Chef will do some introspection to find these methods and hook them up.

HWRP

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
class Chef
  class Provider
    class CloudMagic < Chef::Provider

      # We MUST override this method in our custom provider
      def load_current_resource
        # Here we keep the existing version of the resource
        # if none exists we create a new one from the resource we defined earlier
        @current_resource ||= Chef::Resource::CloudMagic.new(new_resource.name)

        # New resource represents the chef DSL block that is being run (from a recipe for example)
        @current_resource.magic(new_resource.cloud)
        # Although you can reference @new_resource throughout the provider it is best to
        # only make modifications to the current version
        @current_resource.cloud(new_resource.magic)
        @current_resource
      end

      def action_create
        # Some ruby code
      end

      def action_remove
        # More ruby code
      end

    end
  end
end

For our LWRP, it is pretty simple

LWRP

1
2
3
4
5
6
7
action :create do
  # Some Chef Code
end

action :remove do
  # Some more Chef code
end

Off into the wild blue yonder

At this point you should have a functioning HWRP. Sure, it doesn’t do anything, but it is best to start small.

Also take a look at the swap cookbook, a hybrid cookbook with a provider written as a HWRP.

Now that you can read these, you should be able to start picking apart definitions inside Chef core as they are very similar.

Andrew Gross is a Developer at Yipit, you can find him as @awgross on Twitter

Semantic Chef: Writing Readable Recipes

The second in a short series of Chef related blog posts

Why Chef is awesome

  • Programmable in Ruby
  • Repeatable recipes define your infrastructure

Why Chef is painful

  • Programmable in Ruby
  • Slow ‘REPL’ to test changes

Ruby lets you do just about anything. It’s a programming language, that’s what it is supposed to do. Chef’s DSL is concise, readable and extensible. The power of Chef comes by allowing you to program in Ruby. When mixing Chef’s DSL with the power of ruby, things can get out of hand if you are not careful.

Cookbook Examples

Simple

The Opscode examples show nice things. As a systems person it is very easy to understand what is going on.

1
2
3
4
5
6
7
8
9
10
11
12
13
package "ntp" do
    action [:install]
end

template "/etc/ntp.conf" do
    source "ntp.conf.erb"
    variables( :ntp_server => "time.nist.gov" )
    notifies :restart, "service[ntpd]"
end

service "ntpd" do
    action [:enable,:start]
end

Messy

Add some more complex logic in and we get a bit messier.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
if platform_family?("rhel") and node['python']['install_method'] == 'package'
  pip_binary = "/usr/bin/pip"
elsif platform_family?("smartos")
  pip_binary = "/opt/local/bin/pip"
else
  pip_binary = "/usr/local/bin/pip"
end

# Ubuntu's python-setuptools, python-pip and python-virtualenv packages
# are broken...this feels like Rubygems!
# http://stackoverflow.com/questions/4324558/whats-the-proper-way-to-install-pip-virtualenv-and-distribute-for-python
# https://bitbucket.org/ianb/pip/issue/104/pip-uninstall-on-ubuntu-linux
remote_file "#{Chef::Config[:file_cache_path]}/distribute_setup.py" do
  source node['python']['distribute_script_url']
  mode "0644"
  not_if { ::File.exists?(pip_binary) }
end

execute "install-pip" do
  cwd Chef::Config[:file_cache_path]
  command <<-EOF
  #{node['python']['binary']} distribute_setup.py --download-base=#{node['python']['distribute_option']['download_base']}
  #{::File.dirname(pip_binary)}/easy_install pip
  EOF
  not_if { ::File.exists?(pip_binary) }
end

Inscrutable

Sure you can figure out what it does… eventually.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
if node["platform"] == "windows"
    existence_check = :exists?
# Where will also return files that have extensions matching PATHEXT (e.g.
# *.bat). We don't want the batch file wrapper, but the actual script.
    which = 'set PATHEXT=.exe & where'
    Chef::Log.debug "Using exists? and 'where', since we're on Windows"
else
    existence_check = :executable?
    which = 'which'
    Chef::Log.debug "Using executable? and 'which' since we're on Linux"
end

# COOK-635 account for alternate gem paths
# try to use the bin provided by the node attribute
if ::File.send(existence_check, node["chef_client"]["bin"])
  client_bin = node["chef_client"]["bin"]
  Chef::Log.debug "Using chef-client bin from node attributes: #{client_bin}"
# search for the bin in some sane paths
elsif Chef::Client.const_defined?('SANE_PATHS') && (chef_in_sane_path=Chef::Client::SANE_PATHS.map{|p| p="#{p}/chef-client";p if ::File.send(existence_check, p)}.compact.first) && chef_in_sane_path
  client_bin = chef_in_sane_path
  Chef::Log.debug "Using chef-client bin from sane path: #{client_bin}"
# last ditch search for a bin in PATH
elsif (chef_in_path=%x{#{which} chef-client}.chomp) && ::File.send(existence_check, chef_in_path)
  client_bin = chef_in_path
  Chef::Log.debug "Using chef-client bin from system path: #{client_bin}"
else
  raise "Could not locate the chef-client bin in any known path. Please set the proper path by overriding node['chef_client']['bin'] in a role."
end

node.set["chef_client"]["bin"] = client_bin

# libraries/helpers.rb method to DRY directory creation resources
create_directories

case node["chef_client"]["init_style"]
when "init"

  #argh?
  dist_dir, conf_dir = value_for_platform_family(
    ["debian"] => ["debian", "default"],
    ["fedora"] => ["redhat", "sysconfig"],
    ["rhel"] => ["redhat", "sysconfig"],
    ["suse"] => ["suse", "sysconfig"]
  )
# .. snip 200 lines of case statement

What’s Wrong

I dont like this. When I look at a recipe I want to be able to easily understand what it is doing. It is a recipe, not an annotated thesis. It should be a simple browsable description of your intentions. Real cookbooks don’t describe the minutia of each step, they tell you the general steps and leave you to figure out the details.

How?

  • Get your logic out of your recipe, put it in a library.
  • Abstract connected DSL blocks to LWRPs, control will be simpler
  • Use name attributes carefully.
  • Give friendly names to referenced node attributes.

Why?

  • Readability
  • Testability
  • Development Speed
  • Readability

Developers who aren’t that familiar with Ruby or Chef can still easily read recipe code. It is designed to be readable and declarative. The recipes should define the pieces of your infrastructure, not the edge cases.

Testability

It is hard to test Chef directly. Usually it just results in you running all of the code and testing the system’s final state. When you separate your logic from your Chef code, you separate your concerns. You now have logic to which you can easily apply standard Ruby testing methods. The rest of the recipe is now just Chef code that only needs to be checked for regressions.

Development Speed

Now that you have your logic separate, you can write it just like regular Ruby. You don’t need to waste all of your time waiting for VMs to spin up and compile packages. The only time you need to run Chef is to verify your DSL blocks are set up correctly.

Downsides

  • You have to learn about library files
  • You have to write tests
  • You have to name more things.
  • You’ll have to learn about when Chef loads and runs libraries
  • You’ll have to learn more Ruby
  • Less elegant code to force calling from Library files.

Next Step

Chef manipulates Infrastructure as Code, its about time you started treating your infrastructure code like your application code.

Note : Discussions of this could easily devolve into the MVC “where do I put my logic” debate.

Andrew Gross is a Developer at Yipit, you can find him as @awgross on Twitter

Pragmatic Chef: Verifying Remote Files

The first in a short series of Chef related blog posts

Imagine this everyday occurrence:

  • You are writing a Chef cookbook
  • You need to download the source code of a program as a tarball.
  • Being the good programmer you are, you use for Chef’s remote_file resource.
  • You grab the checksum of the file from the website and add it to the resource.

Everything is good right? Unfortunately not.

The Chef remote_file resource does have a checksum property, but it is not meant as a security measure. Instead it is implemented as a bandwidth saving property. If the checksum does not match, it will download the new version but no further validation will be done to match it with the checksum.

Here is an example of using a standard remote_file callback pattern to download and install from source.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
remote_file "Download Redis Source" do
  path "#{download_dir}/#{tarball}"
  source download_url
  backup false
  checksum sha256_checksum
  notifies :run, "execute[Unpack Redis Tarball]", :immediately
end

execute "Unpack Redis Tarball" do
  command "tar -xvzf #{tarball}"
  action :nothing
  cwd download_dir
  notifies :run, "execute[Make Redis]", :immediately
end

We are using the checksum field, but it is only checking for already downloaded versions. Here is a better pattern that forces a correct checksum on the downloaded file before you unpack it any further.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
remote_file "Download Redis Source" do
  path "#{download_dir}/#{tarball}"
  source download_url
  backup false
  checksum sha256_checksum
  notifies :create, "ruby_block[Validate Tarball Checksum]", :immediately
  not_if { redis_exists? && node['redisio']['safe_install'] }
end

ruby_block "Validate Tarball Checksum" do
  action :nothing
  block do
    require 'digest'
    checksum = Digest::SHA1.file("#{download_dir}/#{tarball}").hexdigest
    if checksum != node['redisio']['sha1_checksum']
      raise "Downloaded Tarball Checksum #{checksum} does not match known checksum #{node['redisio']['sha1_checksum']}"
    end
  end
  notifies :run, "execute[Unpack Redis Tarball]", :immediately
end

execute "Unpack Redis Tarball" do
  command "tar -xvzf #{tarball}"
  action :nothing
  cwd download_dir
  notifies :run, "execute[Make Redis]", :immediately
end

In this case Redis provides SHA1 checksums while remote_file uses SHA256 so we must add an additional node attribute that stores the SHA1 checksum. These examples were taken from my fork of the Redisio cookbook.

Andrew Gross is a Developer at Yipit, you can find him as @awgross on Twitter

Making Django ORM Queries 30% Faster by Changing 8 Characters

There are dozens of ways to optimize your database queries with the Django ORM. As usual, the Django documentation is great and has a good list of them.

In the middle of that page, it states Understand select_related() and prefetch_related() thoroughly, and use them, with links to their respective method documentation. The prefetch_related() documentation goes a bit more in depth:

select_related works by creating a SQL join and including the fields of the
related object in the SELECT statement. For this reason, select_related gets
the related objects in the same database query. However, to avoid the much
larger result set that would result from joining across a 'many'
relationship, select_related is limited to single-valued relationships -
foreign key and one-to-one.

prefetch_related, on the other hand, does a separate lookup for each
relationship, and does the 'joining' in Python. This allows it to prefetch
many-to-many and many-to-one objects, which cannot be done using
select_related, in addition to the foreign key and one-to-one relationships
that are supported by select_related.

tl; dr: select_related does a join, prefetch_related does two separate queries.

We’ve been using prefetch_related extensively for m2m fields ever since we upgraded to Django 1.4, but hadn’t spent much time looking into using prefetch_related for foreign key fields since we were already using select_related for those. We spent some time benchmarking prefetch_related for foreign keys in different scenarios and were surprised by the size of the improvement in some of the scenarios.

Necessary Warning: There are a ton of different variables at play here including network speed, data size, database load, etc. You absolutely need to be profiling your queries to determine whether or not this (or any) optimization will make sense in a given scenario.

Hypothetical Scenario

Let’s imagine you have some models like the following:

models.py

1
2
3
4
5
6
class Publisher(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    name = models.CharField(max_length=100)
    publisher = models.ForeighKey(Publisher)

and in our views.py:

1
2
3
4
5
def books(request, template="books.html"):
    context = {
        'books': Book.objects.all().select_related('publisher'),
    }
    return render(request, template, context)

and books.html:

1
2
3
4
5
6
7
8
9
10
11
12
13
<h1>Awesome Books</h1>
{% for book in books %}
    {% if 'Awesome' in book.name %}
        <h2>{{ book.name }}</h2>
        <h3>{{ book.publisher.name }}</h3>
    {% endif %}
{% endfor %}

<h1>All Books</h1>
{% for book in books %}
    <h2>{{ book.name }}</h2>
    <h3>{{ book.publisher.name }}</h3>
{% endfor %}

The select_related does it’s job. A join happens between the book and publisher tables and all the data is pulled in one query. If we didn’t have the select_related, a query of the publisher table would be done for every book iteration in the template.

Now let’s imagine that we live in a world with lots of books, but there are only a few publishers. Say we have 500 books in our system, but only four publishers. The above code will join the book and publisher tables and create 500 book objects and 500 associated publisher objects.

What happens if we used prefetch_related in our view instead?

1
Book.objects.all().prefetch_related('publisher')

Now two database queries will be done. The first will grab all the books and the second will grab all the publishers. The ORM will create 500 book objects and 4 publisher objects. Then it will associate them. There are two potential reasons why this way might be better.

First, the db may perform better with the two simple queries than the one complicated query. In the above example, it may not seem like that big of a deal, but imagine if you had multiple foreign keys.

Second, we only have to instantiate 4 publisher objects in the latter, whereas we have 500 in the former. This takes less time and uses less memory.

As I warned before, whether or not either of those two will be true in your particular circumstance depends on a large number of factors and you really need to do proper profiling before making a decision. In some scenarios, we have found up to a 30% speed improvement.

Note that in this scenario, we are accessing the queryset multiple times. If you are only accessing the queryset once, there is a good chance that using iterator() and select_related is the better option.

Steve Pulec is a developer at Yipit. You can follow him on twitter @spulec and follow @YipitDjango for more django tips from all the Yipit engineers.

Oh, by the way, we’re hiring.

orm

How Yipit Scales Thumbnailing With Thumbor and Cloudfront

Yipit, like many other sites, displays collected images, and like many other sites, Yipit needs to display these images in different sizes in different places.

Up until recently, we were using django-imagekit, which works pretty well but has presented some issues as we’ve grown.

Dynamic Generation Issues

Imagekit supports dynamic thumbnail generation, but it checks for and creates the image while rendering the final url for where the image will be accessed. This means, in order to take advanatge of this feature, rendering a page with 10 thumbnails needs to, best case, make 10 network calls to check for the images existence, and worst case scenario, retrieve, process, and upload 10 images.

Pre-Generation Issues

The other option, which is how Yipit used Imagekit, is to pre-generate all thumbnails and assume they’ve been created properly when rendering a pages. Of course, they haven’t always been generated properly, so having a system to find and re-generate those images is a pain to maintain. Also, adding a new image size for a new design requires going back and creating new thumbnails for all of the old images.

Solution Requirements

We wanted a thumbnailing solution that would:

  • Dynamically create images when they’re needed
  • Serve those images quickly
  • Not slow down server response times

We were able to achieve this goals using Thumbor behind AWS Cloudfront.

Thumbor is a service written in python that allows you to pass the url of an image as well as thumbnailing options in a URI and then dynamically creates the images. There are libraries for URI generation in Python, Node.js, Ruby, and Java.

Installing Thumbor

Thumbor is installable via pip. We run it in a virtualenv at /var/www/thumbor-env so our entire installation is essentially this:

1
2
3
4
$ cd /var/www
$ virtualenv thumbor-env
$ source thumbor-env/bin/activate
$ pip install thumbor

Configuring the thumbor server and the available options are very well documented with a sample configuration file.

We then run it with supervisor behind nginx with these configurations:

Supervisor:

1
2
3
4
5
6
7
8
9
10
11
[program:thumbor]
command=/var/www/thumbor-env/bin/python /var/www/thumbor-env/bin/thumbor --port=900%(process_num)s --conf=/var/www/thumbor-env/thumbor/thumbor.conf
process_name=thumbor900%(process_num)s
numprocs=4
user=ubuntu
autostart=true
autorestart=true
stdout_logfile=/var/log/supervisor/thumbor900%(process_num)s.stdout.log
stdout_logfile_backups=3
stderr_logfile=/var/log/supervisor/thumbor900%(process_num)s.stderr.log
stderr_logfile_backups=3

nginx:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
upstream thumbor {
    server 127.0.0.1:9000;
    server 127.0.0.1:9001;
    server 127.0.0.1:9002;
    server 127.0.0.1:9003;
}

server {
    listen 8000;
    server_name thumbor.yipit.com;
    # merge_slashes needs to be off if the image src comes in with a protocol
    merge_slashes off;
    location ^~ /thumbor/ {
        rewrite /thumbor(/.*) $1 break;
        proxy_pass http://thumbor;
    }

    location / {
        proxy_pass http://thumbor;
    }


}

This setup runs 4 tornado processes load balanced behind nginx.

Setting Up Cloudfront

Setting up Cloudfront is also very easy. If you want to setup a dedicated cloudfront distribution for Thumbor just create a new distribution through the AWS Web Console set your thumbor url (In this case thumbor.yipit.com:8000) as your origin domain.

If you want to set up a namespace for thumbor on an existing distribution, first you’ll need to create a new origin pointing to your thumbor server and then a new behavior with the pattern thumbor/* that points at that origin.

Using it in your application

Now your application server doesn’t need to worry about image generation or exstience. All you need to do is render thumbor URIs. Here’s the function we use at Yipit:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
from django.conf import settings
from libthumbor import CryptoURL

def thumb(url, **kwargs):
    '''
        returns a thumbor url for 'url' with **kwargs as thumbor options.
 
        Positional arguments:
        url -- the location of the original image
 
        Keyword arguments:
        For the complete list of thumbor options
        https://github.com/globocom/thumbor/wiki/Usage
        and the actual implementation for the url generation
        https://github.com/heynemann/libthumbor/blob/master/libthumbor/url.py
    '''
    if settings.THUMBOR_BASE_URL:
        # If THUMBOR_BASE_URL is explicity set, use that
        base = settings.THUMBOR_BASE_URL
    else:
        # otherwise assume that thumbor is setup behind the same
        # CDN behind the `thumbor` namespace.
        scheme, netloc = urlparse.urlsplit(url)[:2]
        base = '{}://{}/thumbor'.format(scheme, netloc)
    crypto = CryptoURL(key=settings.THUMBOR_KEY)

    # just for code clarity
    thumbor_kwargs = kwargs
    if not 'fit_in' in thumbor_kwargs:
        thumbor_kwargs['fit_in'] = True

    thumbor_kwargs['image_url'] = url
    path = crypto.generate(**thumbor_kwargs)
    return u'{}{}'.format(base, path)

This can also be easily wrapped from a template tag:

1
2
3
@register.simple_tag
def thumbor_url(image_url, **kwargs):
    return thumb(image_url, **kwargs)

making adding thumbnails to your pages as as easy as:

1
<img height="192" width="192" src="{% thumbor_url img_url width=192 height=192 %}" />

Zach Smith is the VP of Engineering at Yipit. You can follow him on twitter @zmsmith and follow @YipitDjango for more django tips from all the yipit engineers.

Oh, by the way, we’re hiring.