How Yipit Scales Thumbnailing With Thumbor and Cloudfront

Yipit, like many other sites, displays collected images, and like many other sites, Yipit needs to display these images in different sizes in different places.

Up until recently, we were using django-imagekit, which works pretty well but has presented some issues as we’ve grown.

Dynamic Generation Issues

Imagekit supports dynamic thumbnail generation, but it checks for and creates the image while rendering the final URL where the image will be accessed. This means that, to take advantage of this feature, rendering a page with 10 thumbnails requires, in the best case, 10 network calls to check for each image’s existence and, in the worst case, retrieving, processing, and uploading 10 images.

Pre-Generation Issues

The other option, which is how Yipit used Imagekit, is to pre-generate all thumbnails and assume they’ve been created properly when rendering a page. Of course, they haven’t always been generated properly, and a system to find and re-generate those images is a pain to maintain. Also, adding a new image size for a new design requires going back and creating new thumbnails for all of the old images.

Solution Requirements

We wanted a thumbnailing solution that would:

  • Dynamically create images when they’re needed
  • Serve those images quickly
  • Not slow down server response times

We were able to achieve these goals using Thumbor behind AWS Cloudfront.

Thumbor is a service written in Python that accepts the URL of an image, along with thumbnailing options, in a URI and dynamically creates the resulting image. There are libraries for URI generation in Python, Node.js, Ruby, and Java.
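
To make "options in a URI" a bit more concrete, here is a minimal sketch of how a signed thumbnail path could be assembled with only the standard library. It assumes the HMAC-SHA1 signing scheme and the `/signature/fit-in/WIDTHxHEIGHT/image` layout used by Thumbor's signed URLs; the function name `signed_thumbor_path`, the key, and the image URL are made up for illustration. In a real application you would let a library such as libthumbor build the URI rather than doing it by hand.

```python
import base64
import hashlib
import hmac


def signed_thumbor_path(key, image_url, width, height, fit_in=True):
    """Build a signed Thumbor-style path (illustrative sketch only)."""
    parts = []
    if fit_in:
        parts.append("fit-in")
    parts.append("{}x{}".format(width, height))
    parts.append(image_url)
    unsigned = "/".join(parts)

    # Assumption: the signature is the URL-safe base64 of an HMAC-SHA1
    # digest of the unsigned path, keyed with the server's security key.
    digest = hmac.new(
        key.encode("utf-8"), unsigned.encode("utf-8"), hashlib.sha1
    ).digest()
    signature = base64.urlsafe_b64encode(digest).decode("ascii")
    return "/{}/{}".format(signature, unsigned)


# Hypothetical key and image location, for illustration only.
path = signed_thumbor_path("MY_SECRET_KEY", "example.com/photo.jpg", 192, 192)
print(path)
```

Because the signature depends on the secret key, clients can't request arbitrary sizes by editing the URI, which matters once the service sits behind a public CDN.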

Installing Thumbor

Thumbor is installable via pip. We run it in a virtualenv at /var/www/thumbor-env so our entire installation is essentially this:

$ cd /var/www
$ virtualenv thumbor-env
$ source thumbor-env/bin/activate
$ pip install thumbor

The Thumbor server’s configuration and available options are very well documented, and a sample configuration file is available.

We then run it with supervisor behind nginx with these configurations:

Supervisor:

[program:thumbor]
command=/var/www/thumbor-env/bin/python /var/www/thumbor-env/bin/thumbor --port=900%(process_num)s --conf=/var/www/thumbor-env/thumbor/thumbor.conf
process_name=thumbor900%(process_num)s
numprocs=4
user=ubuntu
autostart=true
autorestart=true
stdout_logfile=/var/log/supervisor/thumbor900%(process_num)s.stdout.log
stdout_logfile_backups=3
stderr_logfile=/var/log/supervisor/thumbor900%(process_num)s.stderr.log
stderr_logfile_backups=3

nginx:

upstream thumbor {
    server 127.0.0.1:9000;
    server 127.0.0.1:9001;
    server 127.0.0.1:9002;
    server 127.0.0.1:9003;
}

server {
    listen 8000;
    server_name thumbor.yipit.com;
    # merge_slashes needs to be off if the image src comes in with a protocol
    merge_slashes off;
    location ^~ /thumbor/ {
        rewrite /thumbor(/.*) $1 break;
        proxy_pass http://thumbor;
    }

    location / {
        proxy_pass http://thumbor;
    }


}

This setup runs four Tornado processes, load balanced behind nginx.
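
The link between the two configurations is the port numbering: Supervisor fills in `%(process_num)s` for each process, and the resulting ports must match the `upstream` block in nginx. The sketch below reproduces that expansion in Python, assuming process numbering starts at 0 (Supervisor's default); the variable names are just for illustration.

```python
# Supervisor expands %(process_num)s using Python-style string interpolation.
NUMPROCS = 4  # matches numprocs=4 in the Supervisor config above
PORT_TEMPLATE = "900%(process_num)s"  # same template as --port=900%(process_num)s

# Assumption: process numbers start at 0, so we get ports 9000 through 9003.
ports = [PORT_TEMPLATE % {"process_num": n} for n in range(NUMPROCS)]

# These are the addresses nginx needs to list in its upstream block.
upstream_servers = ["127.0.0.1:{}".format(port) for port in ports]
print(upstream_servers)
```

If you change `numprocs`, the list of `server` lines in the nginx `upstream` block has to be updated to match. Note also that this template only works for up to 10 processes, since it simply appends the process number to `900`.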

Setting Up Cloudfront

Setting up Cloudfront is also very easy. If you want to set up a dedicated Cloudfront distribution for Thumbor, just create a new distribution through the AWS Web Console and set your Thumbor URL (in this case thumbor.yipit.com:8000) as your origin domain.

If you want to set up a namespace for thumbor on an existing distribution, first you’ll need to create a new origin pointing to your thumbor server and then a new behavior with the pattern thumbor/* that points at that origin.
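
As a rough model of how a namespaced request flows through this setup, the sketch below mimics two steps in Python: the CDN matching the `thumbor/*` behavior pattern, and the nginx `rewrite /thumbor(/.*) $1 break;` rule stripping the namespace before the request reaches Thumbor. This is a simplification for illustration, not how Cloudfront or nginx are implemented; the helper names and example paths are hypothetical.

```python
import fnmatch
import re

THUMBOR_PATTERN = "thumbor/*"  # the behavior pattern described above


def routes_to_thumbor(request_path):
    """Rough model of the CDN behavior match (leading slash ignored)."""
    return fnmatch.fnmatchcase(request_path.lstrip("/"), THUMBOR_PATTERN)


def strip_namespace(request_path):
    """Mimic the nginx rule `rewrite /thumbor(/.*) $1 break;`."""
    match = re.search(r"/thumbor(/.*)", request_path)
    return match.group(1) if match else request_path


# SIGNATURE stands in for a real signed hash.
path = "/thumbor/SIGNATURE/fit-in/192x192/example.com/photo.jpg"
print(routes_to_thumbor(path))                 # sent to the Thumbor origin
print(strip_namespace(path))                   # the path Thumbor itself sees
print(routes_to_thumbor("/static/site.css"))   # stays on the default origin
```

The point of the rewrite is that Thumbor never sees the `/thumbor` prefix, so the same server works whether it is reached through a dedicated distribution or through a namespace on a shared one.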

Using it in your application

Now your application server doesn’t need to worry about image generation or existence. All you need to do is render Thumbor URIs. Here’s the function we use at Yipit:

try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2

from django.conf import settings
from libthumbor import CryptoURL


def thumb(url, **kwargs):
    '''
        Returns a thumbor url for 'url' with **kwargs as thumbor options.

        Positional arguments:
        url -- the location of the original image

        Keyword arguments:
        For the complete list of thumbor options
        https://github.com/globocom/thumbor/wiki/Usage
        and the actual implementation for the url generation
        https://github.com/heynemann/libthumbor/blob/master/libthumbor/url.py
    '''
    if settings.THUMBOR_BASE_URL:
        # If THUMBOR_BASE_URL is explicitly set, use that
        base = settings.THUMBOR_BASE_URL
    else:
        # otherwise assume that thumbor is set up behind the same
        # CDN, under the `thumbor` namespace.
        scheme, netloc = urlsplit(url)[:2]
        base = '{}://{}/thumbor'.format(scheme, netloc)
    crypto = CryptoURL(key=settings.THUMBOR_KEY)

    # just for code clarity
    thumbor_kwargs = kwargs
    # default to fit-in resizing unless the caller says otherwise
    if 'fit_in' not in thumbor_kwargs:
        thumbor_kwargs['fit_in'] = True

    thumbor_kwargs['image_url'] = url
    path = crypto.generate(**thumbor_kwargs)
    return u'{}{}'.format(base, path)
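
The fallback branch is the least obvious part of the function, so here is a small standalone sketch of just that step: deriving the Thumbor base URL from the original image's own URL when no explicit base is configured. The helper name `fallback_base` and the CDN hostname are hypothetical.

```python
from urllib.parse import urlsplit


def fallback_base(image_url, namespace="thumbor"):
    """Derive the Thumbor base URL from the image's own scheme and host."""
    scheme, netloc = urlsplit(image_url)[:2]
    return "{}://{}/{}".format(scheme, netloc, namespace)


# Hypothetical image served from a CDN that also hosts the thumbor namespace.
base = fallback_base("https://cdn.example.com/deals/123.jpg")
print(base)
```

This only works when the original images are served from the same distribution that carries the `thumbor/*` behavior; otherwise the base URL should be set explicitly.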

This can also be easily wrapped in a template tag:

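from django import template
register = template.Library()  # thumb() must also be imported in this module

@register.simple_tag
def thumbor_url(image_url, **kwargs):
    return thumb(image_url, **kwargs)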

making adding thumbnails to your pages as easy as:

<img height="192" width="192" src="{% thumbor_url img_url width=192 height=192 %}" />

Zach Smith is the VP of Engineering at Yipit. You can follow him on Twitter @zmsmith and follow @YipitDjango for more Django tips from all the Yipit engineers.

Oh, by the way, we’re hiring.
