Developer Blog

Java/Scala and Highly Scalable Systems on AWS

Asanban: Lean Development with Asana and Kanban

23 January 2013

On Bizo's External Apps team (a.k.a. 'xapps'), we've been using a Kanban system to manage our work. All of Bizo Engineering uses Asana to track tasks, but Asana isn't specifically designed for Kanban. We've settled on a set of conventions that we use in Asana to enable our Kanban system. These conventions also help us track metrics like the average lead time from month to month.


What Makes Spark Exciting

21 January 2013

At Bizo, we're currently evaluating/prototyping Spark as a replacement for Hive for our batch reports.


Grouping pageviews into visits: a Scala code kata

26 September 2012

The basic units of any website traffic analysis are pageviews, visits, and unique visitors. Tracking pageviews is simply a matter of counting requests to the server. Calculating unique visitors usually relies on cookies and unique identifiers. Visits, however, require a bit more work. For our purposes, a single visit is defined as a sequence of pageviews where the interval between pageviews is less than a fixed length like 15 minutes.
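As a rough illustration of that definition, here is a minimal sketch that splits one visitor's pageviews into visits. The names `Pageview`, `Visits.groupIntoVisits`, and `maxGapMillis` are made up for this example and are not taken from the original post, and the sketch assumes the input has already been narrowed down to a single visitor.

```scala
// Hypothetical types and names, for illustration only.
final case class Pageview(url: String, timestampMillis: Long)

object Visits {
  /** Splits one visitor's pageviews into visits. A new visit starts whenever
    * the gap since the previous pageview is not less than maxGapMillis. */
  def groupIntoVisits(pageviews: Seq[Pageview], maxGapMillis: Long): List[List[Pageview]] = {
    val sorted = pageviews.sortBy(_.timestampMillis)

    // Build the visits in reverse: the most recent visit is at the head,
    // and within each visit the most recent pageview is at the head.
    val reversed = sorted.foldLeft(List.empty[List[Pageview]]) {
      case ((last :: rest) :: visits, pv)
          if pv.timestampMillis - last.timestampMillis < maxGapMillis =>
        // Close enough to the previous pageview: extend the current visit.
        (pv :: last :: rest) :: visits
      case (visits, pv) =>
        // First pageview overall, or the gap was too long: start a new visit.
        List(pv) :: visits
    }

    // Restore chronological order of visits and of pageviews within each visit.
    reversed.map(_.reverse).reverse
  }
}
```

With a 15 minute threshold, pageviews at minutes 0, 5, 30, and 40 would fall into two visits of two pageviews each, since only the 25 minute gap between the second and third pageview reaches the threshold.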


Using GROUP BYs or multiple INSERTs with complex data types in Hive.

19 September 2012

In any sort of ad hoc data analysis, the first step is often to extract a specific subset of log lines from our files. For example, when looking at a single partner’s web traffic, I often use an initial query to copy that partner’s data into a new table. In addition to segregating out only the data relevant to my analysis, I use this to copy the data from S3 into HDFS, which will make later queries more efficient. (Using maps as our log lines is how we support dynamic columns.)


mdadm: device or resource busy

7 July 2012

I just spent a few hours tracking down an issue with mdadm (the Linux utility used to manage software RAID devices) and figured I'd write a quick blog post to share the solution so others don't have to waste time on the same problem.


Amazon Web Services Outages: 4 Steps for Survival

3 July 2012

(Cross-post from the Bizo Blog)


AWS Billing Info in Hive

14 June 2012

Amazon recently (finally!) launched programmatic access to your AWS billing data.


The golden rule of programming style

13 June 2012

There's an interesting page on the subject of compilation units per file over at the Scala style guide.


Scala Test Plug-in for Sublime Text 2

20 April 2012

I have documented and put some polish on the Sublime Text 2 plug-in I blogged about previously. It lets you run a single Scala Test, or all tests in your project. It also lets you quickly navigate to any Scala file in your project folder, and switch back and forth between a class and its test. Check it out here:


Dev Days: Hacking, Open Source and Docs

20 April 2012

Dev Days

Every month we have a "Dev Day" where engineers take a break from their projects and work on "other stuff". Most start-up engineering teams have a "Hack Day" where everyone gets to hack on anything they want as long as they ship and share it with the rest of the team. Of course we have Hack Days, but we have other types of Dev Days too. In fact, we have three types of Dev Days: