Today marks the first in our series of interviews with the Hadoop Summit track chairs. We sat down with Evert Lammerts, chair of the Operating Hadoop track. For the past two years, Evert has been responsible for defining and building the Hadoop & Big Data services for the academic community in the Netherlands at SURFsara.
Tell us about the Operating Hadoop track.
In the Operating Hadoop track we wanted to touch on what we regard as the most important aspects of large-scale infrastructure operations, namely architecture, administration, monitoring, optimisation, and scalability, and to illustrate these with examples and best practices. We had lots of submissions, most of exceptional quality, and it’s a shame we had to keep the acceptance rate at around 20%. The final cut includes talks about infrastructure, virtualisation, Apache Ambari, scaling of HBase, and the intimate secrets of the operations team at LinkedIn – I’m looking forward to it!
Who else was involved in the selection process for your track?
The committee included two others in addition to myself. Sebastien Goasguen is a technology evangelist for Citrix and works on the integration of Apache Hadoop and Apache CloudStack. Lars Francke is a freelance software developer who currently works on Hadoop for the Global Biodiversity Information Facility (GBIF).
What was the hardest part about the selection process?
Speaking for myself, the most difficult aspect was finding the right balance of subjects. We made a list of all abstracts ordered by quality. It might seem counterintuitive, but it’s not simply the top seven of that list that made the cut. If the subjects of two talks are too similar, you have to drop one of them. So that’s what we did, in favour of entries further down the list. Luckily we had many good entries to choose from.
What takeaways do you expect for folks attending sessions in your track?
In my experience many organisations are not used to the commodity, do-it-yourself nature of Hadoop-style large-scale infrastructure. Many think it is out of their scope, or that their ops team cannot handle it. I hope this track can lift the veil on Hadoop operations. Sure, you need some skills, but it’s really not the black magic that it seems to be.
What are you personally most looking forward to at Hadoop Summit Europe?
The people! I’m really looking forward to having this many Hadoop-minded people in a single place.
If you could attend one session from another track, which session would it be?
If I had to choose just one, I would definitely attend “Scaling Big Data Mining Infrastructure: The Twitter Experience” by Jimmy Lin and Dmitriy Ryaboy.