Thank you to the Apache Hadoop community for casting 8,480 total votes during the Hadoop Summit Community Choice voting process. This is a new record for the Community Choice program, exceeding the number of votes received for the 2012 Hadoop Summit Conference hosted in San Jose.
As we announced earlier, the sessions that received the most votes in each track are automatically accepted into the Hadoop Summit agenda. As such, we are pleased to announce the winners of the Hadoop Summit Community Choice vote and the first confirmed sessions in the Hadoop Summit Europe agenda:
Applied Hadoop track
Analyzing 1.4 Trillion Events with Hadoop (Michael Brown, CTO, comScore)
Abstract: This session provides details on how comScore uses Hadoop to process over 1.4 trillion internet and mobile events per day to understand, analyze and produce information on what is happening on the Web worldwide. The talk will highlight the use of Hadoop to determine how activities at web sites translate into real user behaviors. Attendees will gain insight into how comScore has used Hadoop to handle the scalability needs of its Validated Campaign Essentials product. The talk will also detail how algorithms running on top of Hadoop combine information to develop broader insights into Internet usage.
Introduction to Microsoft HDInsights and BI Tools (Abhijit Lele, Solutions Engineer and Rohit Bakshi, Product Manager, Hortonworks)
Abstract: Hortonworks Data Platform powers Microsoft HDInsight and Windows Azure HDInsight, Microsoft's Hadoop-based solutions for Windows Server and Windows Azure. With HDInsight, Microsoft eases the adoption of Hadoop with the simplicity and manageability of Windows and enables customers to easily derive insights from structured and unstructured data through familiar tools like Excel. This presentation looks at all core components of HDInsight and the Microsoft ecosystem.
Splout SQL: When Big Data Output is Also Big Data – A Richer, Open-Source Database “Spout” for Hadoop (Iván Prado Alonso, CEO, Datasalt)
Abstract: There are many Big Data problems whose output is also Big Data. In this presentation we will show Splout SQL, which allows serving an arbitrarily big dataset by partitioning it. Splout is to Hadoop + SQL what Voldemort or ElephantDB are to Hadoop + Key/Value. When the output of a Hadoop process is big, there isn’t a satisfying solution for serving it. Splout decouples database creation from database serving and makes it efficient and safe to deploy Hadoop-generated datasets. Splout is not a “fast analytics” engine. Splout is made for demanding web or mobile applications where query performance is critical. On top of that, Splout is scalable, flexible, RESTful and open-source.
Managing your Hadoop Clusters with Apache Ambari (Pramod Thangali, Director of Engineering and Mahadev Konar, co-founder and architect, Hortonworks)
Abstract: Apache Ambari provides a 100% open source and intuitive set of tools to monitor, manage and efficiently provision your Apache Hadoop cluster. Ambari simplifies the operation and hides the complexity of Hadoop, making Hadoop appear like a single, cohesive data platform. Hadoop cluster provisioning and ongoing management can be a complicated task, especially when there are hundreds or thousands of nodes involved. Ambari allows you to control Hadoop cluster services from a single point. In this session, we will provide an overview of Apache Ambari's key features, architecture and web service-based APIs.
Thanks again to everyone who participated in the Community Choice voting process. The winning sessions above, plus additional sessions being selected by the Hadoop Summit content selection committee, will be posted on the website in the very near future.
Please register for the conference now if you haven’t already done so. Passes are selling quickly and you don’t want to miss the first Hadoop Summit conference in Europe.