Developer Blog
Java/Scala and Highly Scalable Systems on AWS
Asanban: Lean Development with Asana and Kanban
23 January 2013
On Bizo's External Apps team (a.k.a. 'xapps'), we've been using a Kanban system to manage our work. All of Bizo Engineering uses Asana to track tasks, but Asana isn't specifically designed for Kanban. We've settled on a set of conventions in Asana that enable our Kanban system. These conventions also help us track metrics like the average lead time from month to month.
What Makes Spark Exciting
21 January 2013
At Bizo, we're currently evaluating/prototyping Spark as a replacement for Hive for our batch reports.
Grouping pageviews into visits: a Scala code kata
26 September 2012
The basic units of any website traffic analysis are pageviews, visits, and unique visitors. Tracking pageviews is simply a matter of counting requests to the server. Calculating unique visitors usually relies on cookies and unique identifiers. Visits, however, require a bit more work. For our purposes, a single visit is defined as a sequence of pageviews where the interval between pageviews is less than a fixed length like 15 minutes.
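The visit definition above can be sketched in a few lines of Scala. This is only an illustration, not the kata's actual solution: the `Visits` object, the `groupIntoVisits` function, and the choice of epoch-millisecond timestamps for a single visitor are all assumptions made for the example.

```scala
object Visits {
  // Assumed cutoff: 15 minutes, the example interval from the text.
  val MaxGapMillis: Long = 15 * 60 * 1000L

  // Groups one visitor's pageview timestamps (epoch millis, hypothetical
  // input format) into visits. A pageview joins the current visit only if
  // it comes less than maxGap after the previous pageview; otherwise it
  // starts a new visit.
  def groupIntoVisits(timestamps: Seq[Long],
                      maxGap: Long = MaxGapMillis): List[List[Long]] =
    timestamps.sorted
      .foldLeft(List.empty[List[Long]]) {
        // The head of `current` is the most recent pageview seen so far.
        case (current :: done, ts) if ts - current.head < maxGap =>
          (ts :: current) :: done
        // First pageview overall, or the gap was too long: open a new visit.
        case (visits, ts) =>
          List(ts) :: visits
      }
      // Visits and their pageviews were accumulated newest-first.
      .map(_.reverse)
      .reverse
}
```

Calling `Visits.groupIntoVisits` on pageviews at minutes 0, 5, 14, 30, and 40 yields two visits, because the 16-minute gap between minutes 14 and 30 meets the cutoff. The fold keeps the newest visit at the head of the accumulator so each pageview is handled in constant time after the initial sort.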
Using GROUP BYs or multiple INSERTs with complex data types in Hive.
19 September 2012
In any sort of ad hoc data analysis, the first step is often to extract a specific subset of log lines from our files. For example, when looking at a single partner’s web traffic, I often use an initial query to copy that partner’s data into a new table. In addition to segregating out only the data relevant to my analysis, I use this to copy the data from S3 into HDFS, which will make later queries more efficient. (Using maps as our log lines is how we support dynamic columns.)
mdadm: device or resource busy
7 July 2012
I just spent a few hours tracking down an issue with mdadm (a Linux utility used to manage software RAID devices) and figured I'd write a quick blog post to share the solution so others don't have to waste time on the same problem.
AWS Billing Info in Hive
14 June 2012
Amazon recently (finally!) launched programmatic access to your AWS billing data.
The golden rule of programming style
13 June 2012
There's an interesting page on the subject of compilation units per file over at the Scala style guide.
Scala Test Plug-in for Sublime Text 2
20 April 2012
I have documented and put some polish on the Sublime Text 2 plug-in I blogged about previously. It lets you run a single Scala Test, or all tests in your project. It also lets you quickly navigate to any Scala file in your project folder, and switch back and forth between a class and its test. Check it out here: https://github.com/patgannon/sublimetext-scalatest
Dev Days: Hacking, Open Source and Docs
20 April 2012
Dev Days
Every month we have a "Dev Day" where engineers take a break from their projects and work on "other stuff". Most start-up engineering teams have a "Hack Day" where everyone gets to hack on anything they want, as long as they ship it and share it with the rest of the team. Of course we have Hack Days, but we have other types of Dev Days too. In fact, we have three types of Dev Days: