Developer Blog

Java/Scala and Highly Scalable Systems on AWS

Command Query Responsibility Segregation with S3 and JSON

18 April 2011

We recently tackled a problem at Bizo where we wanted to decouple our high-volume servers from our MySQL database.

Read

Crowdflower Trickery: dynamic tasks

15 April 2011

Here at Bizo, we often use CrowdFlower to improve the quality of our data. In doing so, we’ve come across some cool, but under-documented, tricks. One trick we’ve found particularly useful is using Liquid for Designers to dynamically generate CrowdFlower tasks. Let us take a look at how to do this with a toy example.

Read

Hive Unit Testing

14 April 2011

Introduction

Read

Hive 0.7 no longer auto-downloads transform scripts

6 April 2011

I ran into a bit of a surprise moving a Hive 0.5 script to Hive 0.7 the other day.

Read

On Building a Kick Ass Engineering Team -- Part 1

11 March 2011

We started Bizo just about three years ago with the goal of building a great business and a world-class engineering team. One of the first things I did was write down what I thought it would take to create a kick-ass engineering team.

Read

"dynamic" columns in Hive

24 February 2011

One of the presentations at the HBase meetup the other night was on building a query language on top of HBase. No fewer than three people asked "Why not use Hive?" The main reason given was that Hive is too slow for simple selects. The other thing they really liked about HBase was that its columns are dynamic -- it's easy to add new fields to your data.

Read

Adventures In GWT-land Part #1: Awkward Baby Steps

27 January 2011

Over the past several months we’ve been working on a (super) secret shiny new GWT application (it’s basically going to rock your socks off). This was my first GWT application, and coming from a non-Java, non-GWT background where I was used to writing raw JavaScript pretty often, it’s been interesting to say the least. What follows is the first in a multi-part series where I’d like to reflect on life in GWT-land and hopefully provide a few cool tips and code samples along the way.

Read

EMR/Hive: recovering a large number of partitions

26 January 2011

If you try to run "alter table ... recover partitions" on a table with a large number of partitions, you may run into this error:

Read

Spring NamespaceHandler debugging

17 November 2010

Yesterday, while updating a library that uses Spring, I ran into the dreaded "Unable to locate Spring NamespaceHandler for XML schema namespace" exception.

Read

CSV and Hive

12 November 2010

CSV

Read