Java/Scala and Highly Scalable Systems on AWS

s3fsr 1.4 released

12 October 2009

s3fsr is a tool we built at Bizo to help quickly get files into and out of S3. It's had a few 1.x releases, and with 1.4 we figured it was time to post about it.


Efficiently selecting random sub-collections.

8 October 2009

Here's a handy algorithm for randomly choosing k elements from a collection of n elements (assume k < n).
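
The algorithm from the full post is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of one well-known technique for the same problem, reservoir sampling (Algorithm R), which makes a single pass and does not need to know n in advance. The class name `RandomSubset` and the method `sample` are hypothetical names chosen for illustration, not something taken from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not from the original post: reservoir sampling (Algorithm R).
final class RandomSubset {
    private RandomSubset() {}

    /**
     * Returns k elements chosen uniformly at random from items.
     * If items has fewer than k elements, all of them are returned.
     */
    static <T> List<T> sample(Iterable<T> items, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(k);
        int seen = 0;
        for (T item : items) {
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k elements.
                reservoir.add(item);
            } else {
                // Keep the new element with probability k / seen by
                // overwriting a uniformly chosen slot.
                int j = rng.nextInt(seen); // uniform in [0, seen)
                if (j < k) {
                    reservoir.set(j, item);
                }
            }
        }
        return reservoir;
    }
}
```

Calling `RandomSubset.sample(list, 10, new Random())` would return 10 distinct positions' worth of elements from `list`. The trade-off is one random number per element; when n is known up front and random access is cheap, other approaches may do less work.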


hive map reduce in java

7 October 2009

In my last post, I went through an example of writing custom reduce scripts in Hive.


Simple DB Firefox Plugin -- New Release

6 October 2009

I finally got around to updating our open-sourced Simple DB Firefox plugin, creatively named SDB Tool.


reduce scripts in hive

6 October 2009

In a previous post, I discussed writing custom map scripts in Hive. Now, let's talk about reduce tasks.


Developing on the Scala console with JavaRebel

5 October 2009

If you're the type of developer who likes to mess around interactively with your code, you should definitely be using the Scala console. Even if you're not actually using any Scala in your code, you can still instantiate your Java classes, call their methods, and play around with the results. Here's a handy script that I stick in the top level of my Eclipse projects that will start an interactive console with my compiled code on the classpath:


Running ScalaTest BDD Tests from Eclipse

10 September 2009

At Bizo, we're using Scala for a few things here and there. While investigating testing approaches for Scala, I came across ScalaTest and its Behavior Driven Development (BDD) spec approach.


GWT hosted mode on snow leopard

9 September 2009

One of the first things I noticed after installing Snow Leopard was that GWT hosted mode no longer worked. You'll see the message "You must use a Java 1.5 runtime to use GWT Hosted Mode on Mac OS X." After spending about 10 minutes convincing myself that I was in fact using JDK 1.5 for Eclipse, Ant, etc. (and wasn't this working last week?), I finally looked at the JDK symlinks in JavaVM.framework and figured out that 1.5 was just pointing to 1.6... interesting.


Setting up AWS keys for Eclipse

11 August 2009

One somewhat annoying thing about running JUnit tests in Eclipse is that they do not inherit your system's environment variables. There are good reasons for this, but we pass our AWS credentials to all of our applications via system variables, and it's a pain to add these to every single run configuration that needs them. This gets especially tedious when a significant number of your JUnit tests require AWS access.
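
The post's actual fix is not shown in this excerpt. To illustrate the underlying problem, here is a minimal sketch of a credential lookup that checks the process environment first and falls back to a JVM system property (which can be supplied as a `-D` VM argument). The class `AwsKeys` and the variable names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are assumptions for illustration; the post does not say which names Bizo uses.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not from the original post.
final class AwsKeys {
    // Assumed variable names; substitute whatever your applications expect.
    static final String ACCESS_KEY = "AWS_ACCESS_KEY_ID";
    static final String SECRET_KEY = "AWS_SECRET_ACCESS_KEY";

    private AwsKeys() {}

    /** Looks up name in env first, then in props; fails loudly if neither has it. */
    static String lookup(String name, Map<String, String> env, Properties props) {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            // Fall back to a JVM system property, e.g. -DAWS_ACCESS_KEY_ID=...
            value = props.getProperty(name);
        }
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException(
                name + " is not set as an environment variable or system property");
        }
        return value;
    }

    /** Convenience overload that reads the real environment and system properties. */
    static String lookup(String name) {
        return lookup(name, System.getenv(), System.getProperties());
    }
}
```

A test could then call `AwsKeys.lookup(AwsKeys.ACCESS_KEY)` and get a clear error message, rather than a null credential, when a run configuration is missing the value.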


Dependency management for Scala scripts using Ivy

22 July 2009

I'm quickly becoming a huge fan of Scala scripting. Because Scala is Java-compatible, we can easily use our existing Java code base in scripts. This is especially convenient as we're moving our reporting to Hive, which supports script-based Hadoop streaming for custom Mappers and Reducers.