Developer Blog

Java/Scala and Highly Scalable Systems on AWS

Pruning EBS Snapshots

16 July 2009

We've been using Amazon's Elastic Block Store (EBS) for some time now. In a nutshell, EBS is like a "hard drive for the AWS cloud". You simply create an EBS volume and then mount it on your EC2 instance. You then read/write to it as if it were local storage. For a good intro to EBS, check out this RightScale blog post.
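As a rough illustration of what "pruning" can mean here, the sketch below picks which snapshots to delete under a simple age-based rule. The rule (drop anything older than `keep_days`, but always keep the newest snapshot), the function name, and the `(snapshot_id, start_time)` tuple shape are all assumptions made for this example, not the approach from the post. It only makes the decision; the actual listing and deleting would go through whatever EC2 client you use.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshots, now, keep_days=7):
    """Return the IDs of snapshots that are safe to prune.

    snapshots: list of (snapshot_id, start_time) tuples (hypothetical shape).
    Assumed policy: delete snapshots older than keep_days, but never the
    most recent snapshot, so at least one restore point always survives.
    """
    if not snapshots:
        return []
    cutoff = now - timedelta(days=keep_days)
    newest_id = max(snapshots, key=lambda s: s[1])[0]
    return sorted(
        snap_id
        for snap_id, started in snapshots
        if started < cutoff and snap_id != newest_id
    )
```

Keeping the decision separate from the API calls makes the policy easy to test before it is allowed to delete anything.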


custom map scripts and hive

14 July 2009

First, I have to say that after using Hive for the past couple of weeks and actually writing some real reporting tasks with it, it would be really hard to go back. If you are writing straight Hadoop jobs for any kind of report, please give Hive a shot. You'll thank me.


Load testing with Tsung

7 July 2009

One of the big issues with building scalable software is making tests scale along with the application. A high-performance web application should be tested under heavy loads, preferably to the breaking point. Of course, now you need a second application that can generate lots of traffic. You could use something simple like httperf; however, this doesn't work so well with complex systems, since you're only hitting one URL at a time.


custom UDFs and hive

23 June 2009

We just started playing around with Hive. Basically, it lets you write your Hadoop map/reduce jobs using a SQL-like language. This is pretty powerful. Hive also seems to be pretty extensible -- custom data/serialization formats, custom functions, etc.
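To give a feel for what the SQL-like layer saves you, here is a minimal sketch of the map and reduce steps that a one-line query such as `SELECT page, COUNT(1) FROM hits GROUP BY page` stands in for. The `hits` table and `page` column are made up for illustration, and the code is a plain Python simulation of the idea, not Hive or Hadoop API code.

```python
from collections import defaultdict


def map_phase(rows):
    """Map: emit a (key, 1) pair for every input row."""
    for row in rows:
        yield row["page"], 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_page(rows):
    """Equivalent of: SELECT page, COUNT(1) FROM hits GROUP BY page."""
    return reduce_phase(shuffle(map_phase(rows)))
```

Written as a raw job, each of those phases is code you maintain yourself; in the query it is a single GROUP BY.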


Google Visualizations Java Data Source Library

11 June 2009

As with any data-oriented company, most of our projects revolve around collecting data, processing data, and exposing data to users. In that third category, we've been moving towards Google Visualizations to draw our pretty graphs and charts. So, while the free Android phone and Google Wave were attracting a lot of attention at Google I/O, from a practical standpoint, I was actually most excited about Google's new Data Source Java Library. We had previously written something similar to this in-house, but we were still working on some of the optional parts of the specification when this library was released.

Salesforce's SOAP/REST library for Google App Engine/Java

11 June 2009

As long as I'm reflecting on our Google I/O experiences, I also want to point out what looks like a very useful library from Salesforce. The Web Services Connector is a toolkit designed to simplify calling WSDL-defined SOAP and REST services. The best part is that they have a version that works on Google App Engine for Java! (Make sure that you use wsc-gae-16_0.jar, not the regular version.)


new version of s3-simple

20 May 2009

Just committed some small changes to the s3-simple library for specifying ACLs and/or arbitrary request headers and metadata when storing keys.


Work @ Bizo

8 May 2009

We’re looking for an out-of-the-box thinker with a good sense of humor and a great attitude to join our product development team. As one of five software engineers for Bizo, you will take responsibility for developing key components of the Bizographic Targeting Platform, a revolutionary new way to target business advertising online. You will be a key player on an incredible team as we build our world-beating, game-changing, and massively scalable bizographic advertising and targeting platform.


Spring MVC on Google App Engine

5 May 2009

I've been developing an application on Google App Engine and reached a point where I really wanted to be able to use Spring and Spring MVC on the server side.


google app engine (java) and s3

1 May 2009

After struggling for way too long, I finally (sort of) got App Engine talking to S3.