Setting up Discourse on AWS

alexkuang - 30 Jun 2014

The Bizo dev team recently decided to experiment with using Discourse as a discussion forum. Discourse has published an installation guide for a single-machine setup using DigitalOcean, but we decided to deploy on AWS instead.

Why AWS?

While the default installation is fast and easy to implement, it does leave the web server running on the same box as all the important data: Postgres, Redis, and file uploads. This means that if the server location were, hypothetically, invaded by a troop of angry hammer-wielding monkeys, there would be much downtime and data loss involved.

As proper monkey-fearing developers, we prefer to use immutable servers whenever possible. By keeping data in dedicated AWS services, we can treat the servers themselves as disposable and expect them to be torn down and rebuilt on a regular basis. This provides an incentive to keep configuration simple, which makes automation easier and in turn means less time fussing over things like server state and exact run-time configuration.

We prefer sticking to this approach even for experimental apps like Discourse. In this case, we used EC2 for the web server, RDS for the Postgres database, ElastiCache for the Redis store, and S3 for file uploads.

Setting up AWS Services

Setting up AWS services is generally pretty straightforward, but there are a few things to keep in mind.

Security Groups

Security groups are always a good idea, but here they are required for RDS and ElastiCache to communicate with EC2 instances. Make sure the group allows SSH (port 22) and HTTP/HTTPS (ports 80 and 443), and give it a memorable name like discourse-prod.
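For reference, the same group can also be created from the command line. The following is a minimal sketch rather than part of our actual setup: it assumes the AWS CLI is installed and configured, that the group lives in EC2-Classic or the default VPC (so it can be addressed by name), and that SSH_CIDR is a placeholder for whatever address range should be allowed to SSH in. By default it only prints the commands; setting RUN=1 executes them.

```shell
#!/usr/bin/env bash
# Sketch: print (or run) the AWS CLI calls that create the security group.
# GROUP_NAME and SSH_CIDR are illustrative placeholders, not real values.
GROUP_NAME="${GROUP_NAME:-discourse-prod}"
SSH_CIDR="${SSH_CIDR:-203.0.113.0/24}"  # documentation-only range; replace it

# Print the command by default; execute it only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

create_discourse_group() {
  local port
  run aws ec2 create-security-group \
    --group-name "$GROUP_NAME" --description "discourse-web-servers"
  # SSH only from the admin range
  run aws ec2 authorize-security-group-ingress \
    --group-name "$GROUP_NAME" --protocol tcp --port 22 --cidr "$SSH_CIDR"
  # HTTP and HTTPS from anywhere
  for port in 80 443; do
    run aws ec2 authorize-security-group-ingress \
      --group-name "$GROUP_NAME" --protocol tcp --port "$port" --cidr 0.0.0.0/0
  done
}
```

Calling create_discourse_group with no environment overrides prints the four commands so they can be reviewed before anything touches the account.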

RDS, ElastiCache

Again, the only catch here is to make sure that the security group attached to these instances is the same one that the EC2 instance is attached to. Otherwise, the web server won't be able to talk to either of them. Make sure to note the hostnames, auth credentials, etc. here for use in configuring Discourse.
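A quick way to confirm the security group is doing its job is to test the ports from the EC2 instance itself. The helper below is a sketch that relies on bash's built-in /dev/tcp redirection and the coreutils timeout command; the function name is made up for this example, and the hostnames in the usage comments are placeholders, not real endpoints.

```shell
#!/usr/bin/env bash
# Sketch: succeed if a TCP connection to host:port opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so it needs bash rather than dash.
can_connect() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage, run on the web server (placeholder hostnames):
#   can_connect DB_INSTANCE_ID.REGION.rds.amazonaws.com 5432 || echo "RDS unreachable"
#   can_connect REDIS_INSTANCE.cache.amazonaws.com 6379 || echo "ElastiCache unreachable"
```

If either check fails, the security group attachment is the first thing to look at.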

EC2

The only officially supported install method for Discourse is via Docker, which requires choosing specific versions of Ubuntu for the EC2 instance. See the Docker documentation for more details.

Discourse Config

Since we're not doing a standalone deploy, we based our Discourse Docker config on the web-only example that Discourse provides. The final config looked something like this:

templates:
  - "templates/sshd.template.yml"
  - "templates/web.template.yml"

expose:
  - "80:80"
  - "2222:22"

params:
  version: HEAD

env:
  # Creating an account with a developer email will automatically give it 
  # admin access to the site for setup
  DISCOURSE_DEVELOPER_EMAILS: 'email_address@company.com'
  DISCOURSE_HOSTNAME: 'DOMAIN_FOR_DISCOURSE_SITE.com'
  # Enter info from RDS here
  DISCOURSE_DB_SOCKET: ''
  DISCOURSE_DB_HOST: 'DB_INSTANCE_ID.REGION.rds.amazonaws.com'
  DISCOURSE_DB_PORT: '5432'
  DISCOURSE_DB_USERNAME: 'DB_USER'
  DISCOURSE_DB_PASSWORD: 'DB_PASSWORD'
  DISCOURSE_DB_NAME: 'DB_NAME'
  # Enter info from elasticache here
  DISCOURSE_REDIS_HOST: 'REDIS_INSTANCE.cache.amazonaws.com'
  DISCOURSE_REDIS_PORT: '6379'
  # Amazon SES can be used for SMTP, or even gmail for lower volumes
  DISCOURSE_SMTP_ADDRESS: SMTP_SERVER
  DISCOURSE_SMTP_PORT: SMTP_PORT
  DISCOURSE_SMTP_USER_NAME: SMTP_USER
  DISCOURSE_SMTP_PASSWORD: SMTP_PASSWORD

volumes:
  - volume:
        host: /var/docker/shared
        guest: /shared

# you may use the docker manager to upgrade and monitor your docker image
# UI will be visible at http://yoursite.com/admin/docker
hooks:
# you may import your key using launchpad if needed
#after_sshd:
#    - exec: ssh-import-id some-user
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - mkdir -p plugins
          - git clone https://github.com/discourse/docker_manager.git
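Since the config above is full of placeholders, it is easy to ship one by accident. As a hedged sketch (this check is not part of our deploy, and the function name is invented here), a small grep can flag any env value that still starts with one of the placeholder tokens used in the sample:

```shell
#!/usr/bin/env bash
# Sketch: list config lines whose value still starts with a sample placeholder.
# The token list mirrors the placeholders in the example config above.
placeholders='DOMAIN_FOR_DISCOURSE_SITE|DB_INSTANCE_ID|DB_USER|DB_PASSWORD|DB_NAME'
placeholders="$placeholders|REDIS_INSTANCE|SMTP_SERVER|SMTP_PORT|SMTP_USER|SMTP_PASSWORD"

# Prints matching lines with line numbers; exit status 0 means some were found.
unfilled_placeholders() {
  grep -nE ":[[:space:]]*'?($placeholders)([^A-Za-z0-9_]|\$)" "$1"
}

# Usage:
#   if unfilled_placeholders config/bizo-discourse.yml; then
#     echo "config still contains placeholders" >&2
#     exit 1
#   fi
```

The pattern only looks at the text after the colon, so keys such as DISCOURSE_DB_NAME do not trigger a false positive.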

Bootstrapping EC2 instances

We have existing infrastructure that will spin up new instances with applicable settings such as security groups, auto scaling groups, and load balancers. A simple shell script sets up the internals on the instance itself. First, it installs Docker and the Docker container infrastructure for Discourse:

# This is taken almost verbatim from the discourse installation guide
(wget -qO- https://get.docker.io/ | bash) > docker_install.log 2>&1
install -g docker -m 2775 -d /var/docker
git clone https://github.com/discourse/discourse_docker.git /var/docker

At the time of this writing, some of the provided install scripts do not play very nicely with Ubuntu's default dash, so bash is used explicitly.

Next, we customize the deploy. In addition to dropping in our custom config, we also generate an ad-hoc ssh key to feed into the Docker container. This is because the Discourse app runs entirely inside the container, so any sort of direct interaction with it requires the user to interact with the container itself, rather than just the server running the container. Conveniently, Discourse sets up an sshd from templates/sshd.template.yml and will automatically load the current user's key from ~/.ssh/id_rsa into the container. This part of the script makes sure that for each deploy there is a fresh key that is completely separate from any other (potentially important/sensitive) keys that might be living on the server.

cp config/bizo-discourse.yml /var/docker/containers/app.yml

# Make sure to back up anything that might be in the existing id_rsa!  Ideally
# the install is running as its own special user.
(mkdir -p ssh-key && cd ssh-key && ssh-keygen -f id_rsa -t rsa -N '')
(mkdir -p ~/.ssh && cp ssh-key/* ~/.ssh)
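The backup mentioned in the comment above can be scripted as well. This is a minimal sketch under the assumption that moving the old key pair aside with a timestamp suffix is acceptable; the function name and the .bak naming are illustrative and not part of our script.

```shell
#!/usr/bin/env bash
# Sketch: move any existing id_rsa key pair aside before the deploy key is copied in.
# The directory defaults to ~/.ssh; files are renamed, never deleted.
backup_existing_key() {
  local ssh_dir="${1:-$HOME/.ssh}" stamp f
  stamp="$(date +%Y%m%d%H%M%S)"
  for f in id_rsa id_rsa.pub; do
    if [ -e "$ssh_dir/$f" ]; then
      mv "$ssh_dir/$f" "$ssh_dir/$f.bak.$stamp"
    fi
  done
}

# Usage: call it just before copying the freshly generated key into ~/.ssh
#   backup_existing_key
```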

Finally, we bootstrap the container and start the actual Discourse app inside it:

bash /var/docker/launcher bootstrap app > discourse_bootstrap.log 2>&1
bash /var/docker/launcher start app > discourse_run.log 2>&1 &
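Because the start command runs in the background, the script itself has no idea when the forum is actually answering requests. One way to find out is to poll, as in the sketch below. The helper name, the retry counts, and the use of curl against port 80 on localhost are all assumptions for illustration; our script does not do this.

```shell
#!/usr/bin/env bash
# Sketch: retry a command until it succeeds or the attempts run out.
#   wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Pause between attempts, but not after the final one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage (assumes curl is installed and the container publishes port 80):
#   wait_until 30 10 curl -fsS -o /dev/null http://localhost/ \
#     || echo "Discourse did not come up; see discourse_run.log" >&2
```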

S3 Uploads

The final step after getting the app set up is to move file uploads to S3, which is explained in this guide. However, this might break existing user avatars, which the server still expects to find in their old location. (Note that this also applies to Gravatars, since Discourse will download and cache them by default.) The fix is simple: ssh into the container and run bundle exec rake avatars:refresh. This will re-download avatars and update the users in the database accordingly.

Maintenance

With this setup, maintenance becomes extremely easy. Server crashes and software upgrades can be handled by spinning up new instances running the bootstrap shell script. Scaling is also as easy as spinning up another instance and putting everything behind a load balancer. Forum backups and data recovery can be handled easily with Amazon's native tools. At the end of the day, all this adds up to less time wasted on maintaining forum software and more time working on business-critical apps.