Speakers

Session Speakers

Aaron Davidson

Software engineer — Databricks

Speaker Bio:

Aaron Davidson is an Apache Spark committer and software engineer at Databricks. He has implemented Spark standalone cluster fault tolerance and shuffle file consolidation, and has helped in the design, implementation, and testing of Spark's external sorting and driver fault tolerance.

Aaron Myers

Software Engineer — Cloudera, Inc.

Speaker Bio:

Aaron T. Myers is a Software Engineer at Cloudera and an Apache Hadoop Committer. Aaron’s work is primarily focused on HDFS. Prior to joining Cloudera, Aaron was a Software Engineer and VP of Engineering at Amie Street, where he worked on all components of the software stack, including operations, infrastructure, and customer-facing feature development. Aaron holds both an Sc.B. and Sc.M. in Computer Science from Brown University. http://blog.cloudera.com/blog/2012/08/meet-the-engineer-aaron-t-myers/

Abhijit Pol

Chief Architect — Rocket Fuel

Speaker Bio:

Abhijit is the Chief Architect of Big Data Systems at Rocket Fuel, systems that enable learning and gaining insight over several petabytes of data every day. Prior to Rocket Fuel, he was an architect working on Yahoo's Behavioral Targeting Platform, which builds machine-learning models over big data. He holds a Ph.D. in Computer Science, specializing in databases and approximate query processing. Abhijit is a co-author of “Decision Support Systems”, which is used as a textbook in many universities. He is the author of 10+ research papers published in prestigious international database conferences. Abhijit also co-authored a paper that won the SIGMOD 2005 best paper award.

Adam Gibson

Owner — Blix.io

Speaker Bio:

Adam is a Deep Learning specialist based in San Francisco assisting Fortune 500 companies, hedge funds, PR firms and startup accelerators with their machine learning projects. Adam has a strong track record helping companies handle and interpret big real-time data. Adam has been a computer nerd since he was 13 and actively contributes to the open source community.

Adam Kawa

Data Engineer — Spotify

Speaker Bio:

Adam Kawa works as a Data Engineer at Spotify, where his main responsibility is to maintain one of the largest Hadoop-YARN clusters in Europe. Every so often, he implements and troubleshoots Python MapReduce, Hive and Pig jobs. Adam is a frequent speaker at Hadoop conferences and Hadoop User Group meetups. He co-organizes the Stockholm and Warsaw Hadoop User Groups. He regularly blogs about the Hadoop ecosystem at HakunaMapData.com.

Alan Gardner

Solutions Architect — Pythian

Speaker Bio:

Alan works in Pythian’s CTO office, helping clients architect their data infrastructure. His focus is on assessing, selecting and integrating cutting-edge technologies including Hadoop and ecosystem projects. When he isn’t working on client systems, Alan develops web applications on a variety of platforms to solve Pythian’s and his own problems.

Alan Gates

Co-founder — Hortonworks

Speaker Bio:

Alan is a co-founder at Hortonworks and an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan also designed HCatalog and guided its adoption as an Apache Incubator project. Alan has a BS in Mathematics from Oregon State University and an MA in Theology from Fuller Theological Seminary. He is also the author of Programming Pig, a book from O’Reilly Press.

Alex Moundalexis

Solutions Architect — Cloudera Government Solutions

Speaker Bio:

Alex Moundalexis is a Solutions Architect for Cloudera Government Solutions and has spent the last year installing and configuring Hadoop clusters across the country for a variety of commercial and government customers. Before entering the land of Big Data, Alex spent the better part of ten years wrangling Linux server farms and writing Perl as a contractor to the Department of Defense and Department of Justice. He likes shiny objects.

Alexis Roos

Senior Solutions Architect — Concurrent Inc.

Speaker Bio:

Alexis Roos is a Senior Solutions Architect focusing on Big Data solutions at Concurrent, Inc. He has over 18 years of experience in software and sales engineering, helping both Fortune 500 firms and start-ups build new products that leverage Big Data, application infrastructure, security, databases and mobile technologies. Previously, Alexis worked for Sun Microsystems and Oracle for over 13 years, and has also spent time at Couchbase and several large systems integrators in Europe. Alexis has spoken at dozens of conferences as well as university courses and holds a Master’s Degree in Computer Science with a Cognitive Science emphasis.

Allen Day

Data Scientist — MapR Technologies

Speaker Bio:

Allen has been working on scientific computing and Big Data problems for 15 years, first with Beowulf clusters and, since 2008, with Hadoop. Allen created one of the first video discovery platforms for mobile phones, and also built the largest public database of human gene expression data. As an open source developer, Allen has authored R and Perl libraries and most recently contributed to Storm. Allen has founded and co-founded multiple machine learning and biotech companies, and he holds a Ph.D. in Human Genetics from the School of Medicine at UCLA.

Allen Wittenauer


Speaker Bio:

Allen Wittenauer has been involved with Apache Hadoop since May 2007, when he was hired by Yahoo! to bring large-scale operational experience to the fledgling project. His work there helped create the basic blueprints that almost all Hadoop deployments follow today. At LinkedIn, his experience provided key insight and a foundation to its award-winning data science team.

Amareshwari S

Architect — InMobi

Speaker Bio:

Amareshwari is currently working as an Architect on the platform team at InMobi, where she works on Hadoop and related projects for data collection and analytics. She is a member of the Apache Hadoop PMC and an Apache Hive committer. She has been working on Hadoop and its ecosystem since 2007. Prior to InMobi, she worked at Yahoo! on the core Hadoop team. She holds a bachelor's degree in computer science and engineering from the National Institute of Technology, Warangal, India, and a master's degree in Internet science and engineering from the Indian Institute of Science (IISc), Bangalore, India.

Amrit Lal

Product Manager, Hadoop and Big Data — Yahoo

Speaker Bio:

Amritashwar Lal is a Product Manager at Yahoo, where he is engaged in building high-class, robust Hadoop infrastructure services. He has eight years of experience across HSBC, Oracle and Google in developing products and platforms for high-growth enterprises. He earned his MBA in Marketing and Entrepreneurship from the Carnegie Mellon University Tepper School of Business.

Andy Feng

Distinguished Architect — Yahoo

Speaker Bio:

Andy Feng is a Distinguished Architect at Yahoo! and a committer on the Apache Storm project. He leads the architecture design of Yahoo's big-data platform.

Ankur Gupta

GM — MetaScale

Speaker Bio:

Ankur Gupta is an IT Director at Sears and Global Head of Sales, Marketing and Operations at MetaScale, a Big Data technology subsidiary of Sears Holdings. Ankur is leading efforts to accelerate Big Data adoption at other enterprises, leveraging lessons learned from implementing Hadoop and other Big Data technologies at Sears. Before moving into this role, Ankur led several other major monetization initiatives at Sears in various businesses. Prior to Sears, Ankur worked with IBM Global Services in India and the US. Ankur received an MBA from Duke University and a degree in Mechanical Engineering from the Indian Institute of Technology, Roorkee.

Armand Prieditis

Principal Scientist — Neustar

Speaker Bio:

Dr. Prieditis received a PhD in machine learning from Rutgers University. He developed a program that automatically discovered the first known effective heuristic for Rubik's Cube. Afterwards, he was a professor at the University of California, Davis. He started his own company with a National Science Foundation Small Business Innovation Research award to develop a system that learned to find web pages for which a Google user was searching but lacked the vocabulary to find. He also started another company with funding from Siemens Corporation and a $2 million grant from NIST's Advanced Technology Program. Currently, he is Principal Scientist at Neustar, working on machine learning and search methods for big data.

Avery Ching

Software Engineer — Facebook

Speaker Bio:

Avery has a PhD from Northwestern University in the area of parallel computing. He worked at Yahoo! Search for four years on the web map analytics platform, large-scale ad hoc serving infrastructure, and cluster management. During the past two and a half years, he has been working at Facebook in the general area of big data computational frameworks (Corona – MapReduce, Giraph).

Bernardo de Seabra

Staff Software Engineer — BrightRoll

Speaker Bio:

Staff Software Engineer, Data Team at BrightRoll. Previously at Playdom.

Bikas Saha

Software Engineer — Hortonworks

Speaker Bio:

Bikas has been working in the Apache Hadoop ecosystem since 2011 and is a committer/PMC member of the Apache Hadoop and Tez projects. He has been a key contributor in making Hadoop run natively on Windows and has focused on YARN and the Hadoop compute stack, with special interest in Tez. Prior to Hadoop, he worked extensively on the Dryad distributed data processing framework that runs on some of the world’s largest clusters as part of the Microsoft Bing infrastructure. @bikassaha

Bill Yetman

VP of Engineering, Commerce, Data and Analytics — Ancestry.com

Speaker Bio:

Bill Yetman has served as VP of Engineering at Ancestry.com since January 2014. Bill has held multiple positions with Ancestry.com since August 2002, including Senior Director of Engineering, Director of Sites, Mobile and APIs, Director of Ad Operations and Ad Sales, Senior Software Manager of eCommerce and Senior Software Developer. Prior to joining Ancestry.com, he held several developer and programmer roles with Coresoft Technologies, Inc., Novell/WordPerfect, Fujitsu Systems of America and NCR. Bill holds a B.S. in Computer Science and a B.A. in Psychology from San Diego State University.

Bobby Evans

Principal Engineer — Yahoo

Speaker Bio:

Bobby Evans is a Principal Engineer at Yahoo! and a Hadoop PMC member at the Apache Software Foundation.

Casey Stella

Principal Architect — Hortonworks

Speaker Bio:

I specialize in writing software and solving problems where there are scalability concerns due to either large amounts of traffic or large amounts of data. I have a particular passion for data science problems or anything vaguely mathematical. As a Principal Architect focused on data science, I spend time with a variety of clients, large and small, mentoring and helping them use Hadoop to solve their problems. I have specialized in the past in Oil & Gas and Healthcare.

Chintan Bhatt

Technical Specialist — Syntel

Speaker Bio:

Chintan holds a Master of Technology from IIIT. A passionate big data enthusiast with 8 years of experience, he works at Syntel on research and development. He is also involved in building industry solutions.

Chris Nauroth

Member of Technical Staff — Hortonworks Inc

Speaker Bio:

Chris is a software engineer at Hortonworks and an Apache Hadoop committer. He is an active contributor to HDFS, YARN, and MapReduce. Prior to Hortonworks, Chris worked for Disney, where he deployed Hadoop, developed data management solutions on top of it, and was responsible for operational support.

Costin Leau

Engineer — Elasticsearch

Speaker Bio:

Costin Leau is an engineer at Elasticsearch, currently working with NoSQL and Big Data technologies. An open-source veteran, Costin led various Spring projects (Spring OSGi, GemFire, Redis, Hadoop) and authored an OSGi spec. He has spoken at various editions of EclipseCon/OSGi DevCon, JavaOne, Devoxx/Javapolis, JavaZone, SpringOne and TSSJS on Java, Hadoop and Spring related topics.

Dan Houser

Security Architect — Cardinal Health

Speaker Bio:

Dan Houser is Security Architect for a Fortune 50 global healthcare firm, where he establishes security standards and provides innovative solutions for meeting global imperatives for secure healthcare operations. Houser is a published author, a frequent international speaker on topics of governance, cryptography, metrics and identity, and serves on the (ISC)² Board of Directors.

Daniel Dai

Member of Technical Staff — Hortonworks

Speaker Bio:

Daniel is an Apache Pig PMC member and committer who has been involved with Pig for 5 years, first at Yahoo and now at Hortonworks. He has a PhD in Computer Science with a specialization in computer security, data mining and distributed computing from the University of Central Florida. He is interested in data science, large-scale processing, Hadoop, Pig, Hive, and more.

Daniel Templeton

Certification Developer — Cloudera, Inc.

Speaker Bio:

Daniel works on the Cloudera training team, building Cloudera’s developer and data science Cloudera Certified Professional certifications. Daniel also has a long history as a software engineer in the high performance computing space and has been kicking around big data since about 2009. Prior to Cloudera, Daniel spent more than a decade at Sun in various engineering and product management roles and speaking at conferences. Daniel has a BE in EE/CS from Vanderbilt and is just finishing an MSCS from Stanford.

Danil Zburivsky

Big Data Consultant/Solutions Architect — Pythian

Speaker Bio:

Danil has been working with databases and information systems since his early years in university, where he received a Master's degree in Applied Math. Danil has 7 years of experience architecting, building and supporting large mission-critical data platforms using various flavours of MySQL, Hadoop and MongoDB. He is also the author of the book “Hadoop Cluster Deployment”. Besides databases, Danil is interested in functional programming, machine learning and rock climbing.

David Chaiken

CTO — Altiscale, Inc

Speaker Bio:

David Chaiken comes to Altiscale from Yahoo!, where he served as Chief Architect. At Yahoo, he led teams building consumer advertising and media systems with Apache Hadoop at their core. Over his career, David has also built voice-search products for consumers, mobile applications for enterprises, network management systems, project management software, large-scale multiprocessor architectures, a tablet computer, and several other information appliances. David earned a BS in Mathematics from Brown University and a PhD in Electrical Engineering and Computer Science from MIT.

David Chen

Software Engineer — LinkedIn Corporation

Speaker Bio:

David is a software engineer on the Hadoop Development team in LinkedIn's Data Analytics Infrastructure group. He has previously worked at Microsoft in the Windows Kernel Platform Group.

David Mariani

CEO — AtScale, Inc.

Speaker Bio:

David P. Mariani is CEO of AtScale, Inc., an incubating software startup focused on bringing business intelligence into the age of Hadoop. Prior to AtScale, David was VP Engineering of Klout, a social analytics data service that scores over 450 million user profiles daily and collects over 12 billion events across the social web. Previously, David ran the analytics and data pipelines for Yahoo!'s consumer sites and advertising services, where he built the world's largest cube and drove early Hadoop development and adoption.

Enis Soztutar

Member of Technical Staff — Hortonworks, Inc.

Speaker Bio:

Enis Soztutar is an Apache HBase, Hadoop, and Gora PMC member and member of the Apache Software Foundation. He has been using and developing Hadoop ecosystem projects since 2007. He is currently working at Hortonworks as a part of the HBase engineering team.

Gang Chen

Data Scientist — Neustar

Speaker Bio:

Dr. Chen received his PhD in Statistics from the University of Illinois at Urbana-Champaign. Before that, he earned a Master's degree in Computer Science from Shanghai Jiaotong University. Currently, he is a Data Scientist at Neustar, focusing on finding business intelligence in big data and developing statistical algorithms for big data.

Garrett Wu

CTO — WibiData

Speaker Bio:

Garrett joined WibiData in 2010 and focuses on distributed infrastructure and algorithms. Previously he worked at Google in Mountain View, CA and New York City where he was the tech lead of the personalized recommendations team. Garrett’s areas of interest include natural language processing, machine learning and data mining.

Gera Shegalov

Software Engineer — Twitter

Speaker Bio:

Gera graduated from Universität des Saarlandes with a PhD in Computer Science. He has published various papers related to database and transactional systems. He has worked at Oracle and MapR, and is now at Twitter.

Gopal Vijayaraghavan

Performance lead — Hortonworks

Speaker Bio:

Gopal Vijayaraghavan is a late entry into the Hadoop game, having started working on it in 2012. He works on Hive and Tez as part of the Stinger initiative, fixing query performance at scale.

Govind Kamat

Software engineer — Cloudera

Speaker Bio:

Govind Kamat is a member of the Performance Engineering Team at Cloudera, focusing on Hadoop and HBase performance and scalability. His experience includes the development of large-scale software systems, microprocessor architecture, compilers and electronic design. Before Cloudera, he was a member of the Performance Engineering Group at Yahoo!

Gunther Hagleitner

Sr. Manager Software Development — Hortonworks

Speaker Bio:

Gunther Hagleitner has been contributing to various Hadoop projects for over four years, at both Yahoo! and Hortonworks. He is an active committer in the Apache Hive project as well as a PMC member of the Apache Tez project. Before Hadoop, Gunther worked on database technology for more than a decade. At Hortonworks he is leading Hive efforts in the Stinger project, delivering performance and SQL capabilities in the ecosystem. Gunther holds an MS in Mathematics from the University of Konstanz.

Hitesh Shah

Software Engineer — Hortonworks

Speaker Bio:

Hitesh Shah currently works on various things related to Apache Hadoop at Hortonworks, with his primary focus on Apache Tez and Apache Hadoop YARN. He is a PMC member and committer for the Apache Hadoop, Tez and Ambari projects. Prior to that, he spent close to a decade at Yahoo! building various frameworks, ranging from data storage platforms for social content to a multi-threaded event-driven framework for building high-throughput advertising serving platforms.

Ingo Mierswa

CEO — RapidMiner

Speaker Bio:

Ingo Mierswa is a veteran data scientist who has been in the industry since he began developing RapidMiner predictive analytics software at the Artificial Intelligence Division of the University of Dortmund, Germany. Mierswa, the scientist, has authored numerous award-winning publications about predictive analytics and Big Data. Mierswa, the entrepreneur, is the founder of RapidMiner. Under his leadership, the company has grown by up to 300 percent per year over the past five years. In 2012, he spearheaded the go-international strategy with the opening of offices in the U.S.

James Sirota

Big Data Solutions Architect — Cisco

Speaker Bio:

James Sirota loves to code, tinker, and experiment. In his spare time he works as a Big Data Solutions Architect and a Data Scientist at Cisco. He is a lead engineer on the big data platform for OpenSOC, a new security analytics platform from Cisco. James has over 10 years of experience as a software developer and a security engineer. He has an M.S. in Systems Engineering from the University of Southern California and a B.S. in Computer Science from Arizona State University.

James Taylor

Engineer — Salesforce.com

Speaker Bio:

James Taylor is an engineer at Salesforce.com in the Big Data Group. He founded the Apache Phoenix project and has led its development effort for the past several years. Prior to working at Salesforce.com, James worked at BEA Systems on federated query processing systems and event-driven programming models. He lives in San Francisco with his wife and two daughters.

Jason Lowe

Senior Principal Engineer — Yahoo

Speaker Bio:

Jason is an Apache Hadoop PMC member and has been contributing to Hadoop for the last 2 years. He is one of the lead Hadoop developers at Yahoo, with a primary focus on running YARN and MapReduce on large-scale clusters.

Jay Tang

Director of Big Data Platform — PayPal

Speaker Bio:

Jay Tang is currently Director of Big Data Engineering at PayPal, leading its big data effort. He is passionate about data. Jay was a member of the original Hadoop team at Yahoo!, building and managing the world's largest Hadoop cluster at that time. He has built large parallel databases at IBM/Informix/Yahoo! and BI products at Oracle/Hyperion.

Jeff Graham

Sr Advisor, Data Analytics — Cardinal Health

Speaker Bio:

Jeff Graham is a Senior Advisor of Data Analytics who is responsible for Big Data architecture at Cardinal Health. He has over 20 years of experience in BI and database performance tuning, as well as systems and application development.

Jennifer Lim

Technology Architect — Sprint

Speaker Bio:

Jennifer Lim has over 14 years of experience in large scale enterprise data warehousing and analytics. Most recently, she was a Research Scientist for the Sprint Advanced Analytics Lab and is now acting as a Lead Technology Architect, focusing on upgrading the analytics infrastructure in support of all those great use cases being discovered in the research lab.

Jian He

Member of Technical Staff — Hortonworks

Speaker Bio:

Jian He is a committer to the Apache Hadoop project. He is a Software Engineer at Hortonworks on the MapReduce team and mostly focuses on YARN development. Prior to joining Hortonworks, he received a Master's degree in Computer Science from Brown University.

Jitendra Pandey

Software Engineer — Hortonworks

Speaker Bio:

Jitendra Pandey works at Hortonworks Inc and has been contributing to Hadoop for around 5 years. His current area of focus is Hive performance improvements. Jitendra is a committer and PMC member for Apache Hadoop. He is also a committer and PMC member for Apache Ambari. Jitendra's contributions include vectorized query processing in Hive, Ambari development, Hadoop security infrastructure, federated HDFS, wire compatibility and high availability of HDFS. Prior to Hortonworks, Jitendra worked at Yahoo on Big Data infrastructure and applications.

Jiwon Seo

PhD Candidate — Stanford University

Speaker Bio:

Jiwon Seo is a CS PhD student at Stanford University. He is interested in distributed systems, large-scale data mining, and data visualization. With his advisor, Professor Monica Lam, he designed and implemented a distributed query language called SociaLite. With its high-level semantics, SociaLite makes it easy to implement efficient code for large-scale data processing. He is interested in applying SociaLite to large-scale graph analysis and data mining. Jiwon has a Bachelor's degree in Electrical Engineering from Seoul National University and a Master's degree in Electrical Engineering from Stanford University.

Joey Echeverria

Chief Architect of Public Sector — Cloudera

Speaker Bio:

Joey Echeverria is Cloudera's Chief Architect for Public Sector, where he coordinates with Cloudera's Customers and Partners as well as Cloudera's Product, Engineering, and Field teams to speed up the time it takes to move Hadoop applications to production. Previously Joey was a Principal Solutions Architect, where he worked directly with customers to deploy production Hadoop clusters and solve a diverse range of business and technical problems. Joey joined Cloudera from the NSA, where he worked on data mining, network security, and clustered data processing using Hadoop.

John Akred

CTO — Silicon Valley Data Science

Speaker Bio:

With over 15 years in advanced analytical applications and architecture, John is dedicated to helping organizations become more data-driven. He combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership.

John Speidel

Software Engineer — Hortonworks

Speaker Bio:

John is a senior engineer at Hortonworks and a member of the Savanna development team. John is also a committer on the Apache Ambari project where he designed the current Ambari REST API. He has 15 years of experience developing commercial middleware systems with a focus on distributed transaction processing.

John Williams

VP, Platform Operations — TrueCar, Inc.

Speaker Bio:

John Williams leads the Platform Operations team for TrueCar, where he is responsible for the company's overall technology infrastructure and operations strategy. John is a serial entrepreneur and start-up executive with an extensive background building and operating secure and highly scalable technology infrastructure. He was the co-founder and CTO at Preventsys (acquired by McAfee) and led the network penetration testing team for Internet security pioneer Trusted Information Systems. John has been retained as a consultant by numerous world-class technology, financial services and government organizations. Prior to that, John founded one of the first Internet service providers in New York.

Josh Patterson

Principal — Patterson Consulting

Speaker Bio:

Josh Patterson currently runs a consultancy in the big data machine learning space. Previously Josh worked as a Principal Solutions Architect at Cloudera and as a machine learning and distributed systems engineer at the Tennessee Valley Authority, where he brought Hadoop into the smart grid with the openPDC project. Josh has a Master's in Computer Science from the University of Tennessee at Chattanooga, where he published research on mesh networks (tinyOS) and social insect optimization algorithms. Josh has over 15 years in software development and is very active in the open source space with projects such as Apache Mahout, Metronome, IterativeReduce, openPDC, and JMotif.

Juergen Urbanski

CTO at T-Systems — Deutsche Telekom

Speaker Bio:

As the Chief Technologist of T-Systems (Deutsche Telekom), Juergen Urbanski is responsible for the development and application of innovative technologies with the goal of growing revenue and profitability. Juergen has a distinctive track record of leading and supporting technology-enabled business transformation at McKinsey, NetApp and Deutsche Telekom. He has also spent 10 years in Silicon Valley serving Microsoft, SAP, Oracle and Symantec on product development issues. He presents this talk in his capacity as a Board Member Big Data & Analytics of the German IT Industry Association BITKOM.

Julian Hyde

Developer — Hortonworks

Speaker Bio:

Julian Hyde is an expert in query optimization and in-memory analytics. He is the lead developer of Optiq, the new cost-based optimizer for Apache Hive, an Apache Drill committer, and lead developer of the Mondrian OLAP engine. He is an architect at Hortonworks.

Julien Le Dem

Staff software engineer — Twitter

Speaker Bio:

Julien is the lead for Parquet's Java implementation. He also leads Data Processing tool development at Twitter and is on the Apache Pig PMC. His French accent makes his talks attractive.

Karthik Kannan

Founder, Chief Product Officer — Caspida Inc.

Speaker Bio:

Karthik Kannan is a successful entrepreneur specializing in product management, marketing and sales. He is currently co-founder and Chief Product Officer of Caspida Inc., an enterprise security startup in the Bay Area. Prior to that, he was co-founder and VP of Products at Cetas, a Big Data analytics company, acquired by VMware in April 2012. The Cetas product line is now a part of Pivotal, a spinoff from EMC and VMware. Prior to Cetas, he was VP of Marketing at Kazeon, a DLP and eDiscovery company, which was acquired by EMC in 2009. He has spoken at various conferences including GigaOm Structure, Data Week 2.0, TiECon etc. More at http://karthikkannan001.blogspot.com/ and @KarthikBigData.

Ken Krugler

President — Scale Unlimited

Speaker Bio:

Ken Krugler started using Hadoop back in the dark ages (2006), when it was still part of Nutch. He is the president of Scale Unlimited, a provider of consulting and training services for big data analytics, search, and machine learning using Hadoop, Cascading, Mahout, Cassandra and Solr. Ken is an Apache Tika committer and a member of the Apache Software Foundation, and in his spare time he tries to make Python programming interesting for high school students.

Kiru Pakkirisamy

CTO — Serendio Inc.

Speaker Bio:

Kiru Pakkirisamy is an accomplished technology leader with a proven ability to conceptualize and deliver products and solutions targeted at data-driven enterprises. As CTO at Serendio, he helps enterprises harness their everyday data for business advantage through innovative Big Data Science solutions. Most recently, he served as a Director of Engineering for Splunk, spearheading the development of Splunk-Hadoop integration technologies. Prior to this, Kiru held technical and management roles at SuccessFactors and Sybase. Kiru's areas of interest include Distributed Computing, Big Data frameworks, and Predictive Science techniques as applied to Insurance, Healthcare, and Retail.

Kishor Angani

Principal Engineer — Yahoo!

Speaker Bio:

I currently work on the Video Platforms team at Yahoo!, where we take care of video transcoding and enrichment. Prior to this, I worked on abuse analysis on the grid for Yahoo! login systems.

Kishore Gopalakrishna

Staff Engineer — LinkedIn

Speaker Bio:

Kishore Gopalakrishna is a software developer with a great passion for using and building large-scale distributed systems. As part of the Data Infrastructure team at LinkedIn, Kishore has built Espresso, a distributed data store, and Helix, a generic cluster management system. Prior to LinkedIn, Kishore spent a large part of his time at Yahoo working on ad systems, which mostly involved data analysis using Hadoop and building systems like Apache S4 for near-real-time stream processing.

Kishore Yellamraju

Hadoop Operations Engineer — Rocket Fuel

Speaker Bio:

KishoreKumar Yellamraju is a Hadoop Operations Engineer at Rocket Fuel, where he builds and manages large-scale Hadoop-YARN clusters. Kishore works mostly on maintaining and optimizing the big data infrastructure components at Rocket Fuel, such as Hadoop, HBase, Kafka, Storm, Hive, Hue, Oozie, Ganglia, OpenTSDB and more. Prior to working at Rocket Fuel, he was a senior systems administrator at Fiserv. He is a certified Apache Hadoop Administrator and holds a Master's degree in Computer Science and a Bachelor's degree in Electronics and Communications.

Koji Noguchi

Principal Engineer — Yahoo

Speaker Bio:

Koji is an Apache Hadoop and Pig committer who has been supporting Hadoop users at Yahoo for over 7 years. He has a Ph.D. in Information and Computer Science from the University of California, Irvine.

Leonid Fedotov

Systems Architect — Hortonworks

Speaker Bio:

Leonid holds a Master of Science in Electrical Engineering from Moscow State Technical University and has over 25 years of IT experience, including over 15 years with Oracle and RDBMS technologies and 5 years with Hadoop. He focuses on system administration, architecture, monitoring tools and customer support.

Lohit Vijayarenu

Software Engineer — Twitter

Speaker Bio:

Lohit graduated from Stony Brook University with a Master's in Computer Science. He has worked on Hadoop and related technologies at Yahoo!, Quantcast and MapR Technologies, and is now at Twitter.

Makoto Yui

Researcher — National Institute of Advanced Industrial Science and Technology, Japan

Speaker Bio:

Makoto Yui is a researcher at the Information Technology Research Institute of the National Institute of Advanced Industrial Science and Technology (AIST), Japan. He works on large-scale machine learning as a research project and has released Hivemall as open source software. His profile is available at https://staff.aist.go.jp/m.yui/

Manohar Prabhu

Product Manager, Data Processing and Analysis — Google

Speaker Bio:

Manohar is the Product Manager for several products in large-scale data processing and analysis at Google, including Pregel and MapReduce. These products provide infrastructure support for applications ranging up to multi-PB data crunching for Web search. Manohar has an avid, long-running interest in parallel processing. His prior training and experience include a BS from Harvard, two MS degrees and a PhD in Computer Architectures from Stanford. He holds five patents in computer engineering, and his experience spans over two decades of work for Google and HP Labs and as a consultant for start-ups.

Marie-Luce Picard

Project Manager — EDF

Speaker Bio:

Marie-Luce Picard is a project manager and BI expert at EDF Lab. She has managed different R&D projects dealing with business intelligence and information systems (advanced documentation systems, data mining for customer insight teams, etc.). She has also managed the EDF Lab team working on BI and data analytics. She is currently in charge of the EDF Lab Big Data project, which prepares EDF's information systems for the data deluge expected within a few years across all businesses of the Company. She is also a member of the Big Data Coordination Committee and of the BI Coordination Committee led by the EDF IT Division.

Matthew Farrellee

Software Engineer — Red Hat

Speaker Bio:

Matthew Farrellee is a Software Engineer in the CTO office at Red Hat with over a decade of experience in distributed and computational system development and management. Matt has been involved with numerous open source projects over the years. His current focus is on big data technologies, including combining OpenStack and Hadoop through the Savanna project. He is also active in the Fedora Big Data SIG and the Fedora community in general.

Mayank Bansal

Principal Engineer in Hadoop Platform Team — Ebay, Inc.

Speaker Bio:

Mayank Bansal is an Apache Hadoop Committer and an Apache Oozie PMC member and committer. He has worked on Hadoop and Oozie for more than 4 years, previously at Yahoo! and now at Ebay, Inc.

Michal Wegiel

Technical Lead, Pregel — Google

Speaker Bio:

Michal is the Tech Lead for the Pregel project, on which he’s worked since joining Google in 2011. Michal received his PhD from the University of California, Santa Barbara, where he conducted research on programming language design and implementation, improving memory management performance, interactions between virtual machines for different languages, and type-safe object sharing for co-located VMs for statically- and dynamically-typed languages. He’s been on the external review committee for the ASPLOS, PLDI, PPoPP and TPDS conferences. He’s also previously worked for Sun Microsystems and Motorola.

Naresh Agarwal

Director of Engineering — InMobi

Speaker Bio:

Naresh has extensive experience and a deep interest in large-scale data management and infrastructure. Currently, Naresh manages the data platform engineering group at InMobi.

Nick Dimiduk

Member of Technical Staff, HBase — Hortonworks

Speaker Bio:

Nick found Hadoop when his nightly ETL jobs started taking 20+ hours to complete. Since then, he has applied Hadoop and HBase to projects over social media, social gaming, click-stream analysis, climatology, and geographic data. Nick also helped establish Seattle’s Scalability Meetup and tried his hand at entrepreneurship. He is an HBase committer and coauthored “HBase in Action,” the unofficial user’s guide for HBase. His passion is scalable, online access to scientific data.

Nicolas Liochon

CTO — Scaled Risk

Speaker Bio:

After earning a master's degree in distributed real-time systems, Nicolas has stayed focused on software architecture in various positions, including Head of Architecture at Thomson Reuters for the Risk Management product line. He has been deeply involved in the Big Data arena for more than 2 years, working especially with Hortonworks on HBase MTTR. He combines traditional software and enterprise architecture skills with a deep knowledge of Big Data architecture. Nicolas is a PMC member for the Apache HBase project. He is also a cofounder of Scaled Risk, a company that provides a Big Data solution on top of Hadoop and HBase.

Nong Li

Software Engineer — Cloudera

Speaker Bio:

Nong Li is a software engineer working on the open-source Cloudera Impala. He spends most of his time focusing on improving the performance of the query execution engine, working on the IO subsystem, JIT-compiling portions of the query execution, and working on expression evaluation and other performance-centric components.

Ofer Mendelevitch

Director, Data Science — Hortonworks

Speaker Bio:

Ofer Mendelevitch is Director of data sciences at Hortonworks, where he is responsible for professional services involving data science with Hadoop. Prior to joining Hortonworks, Ofer served as Entrepreneur in Residence at XSeed Capital where he developed an investment strategy around big data. Before XSeed, Ofer served as VP of Engineering at Nor1, and before that he was Director of engineering at Yahoo! where he led multiple engineering and data science teams responsible for R&D of large scale computational advertising projects including CTR prediction (with Hadoop), a new front-end ad-serving system and sales tools.

Oleg Checherin

Founder — DataWarehousing Consulting

Speaker Bio:

Oleg Checherin is an independent consultant in the area of Hadoop implementation and integration with corporate systems. He has an Sc.M. in Electrical Engineering from Moscow Power Engineering Institute and over 20 years of IT experience, with a primary focus on data support at enterprise scale.

Oscar Boykin

Staff Data Scientist — Twitter

Speaker Bio:

Oscar Boykin (@posco) is a member of the analytics infrastructure team at Twitter and a committer on scalding, algebird, summingbird and several other Twitter open source libraries.

Owen O’Malley

Co-founder & Architect — Hortonworks

Speaker Bio:

Owen O’Malley is a cofounder and architect at Hortonworks, a rapidly growing company that supports customers using Hadoop. Owen has been working on Hadoop since the beginning of 2006 and was the first committer added to the project. In the last 8 years, he has at various times been the architect of MapReduce, Security, and now Hive. Before working on Hadoop, he worked on Yahoo Search’s WebMap project, which was the original motivation for Yahoo to work on Hadoop. Prior to Yahoo, he wandered between testing (UCI), static analysis (Reasoning), configuration management (Sun), and software model checking (NASA). He received his PhD in Software Engineering from University of California, Irvine.

Peter Guerra

Principal — Booz Allen Hamilton

Speaker Bio:

Peter Guerra is a Principal in Booz Allen Hamilton’s Strategic Innovation Group leading a large team of Data Scientists. He has 15 years of professional experience applying computer science in service of National Intelligence, Military, Commercial Health, and Financial Services clients. His specialty is in highly available, large-scale distributed systems and advanced analytics, and he is responsible for leading several large-scale Hadoop computing projects. He has been a software and security consultant to government and commercial organizations throughout his diverse IT career, focusing on software development, security engineering, and highly available system design.

Prafulla Wani

Technical Architect — Syntel Ltd

Speaker Bio:

Prafulla Wani is a big data engineer with experience in Hadoop/NoSQL-based solutions as well as traditional data-warehouse implementations using relational databases and ETL tools. He has more than 10 years of IT experience, the majority of it spent working at client locations in the US, and currently serves as Technical Architect – Big Data in the Strategic Offering Group (SOG) at Syntel.

Rachel Owsley

Sr. Manager, Market Data Analysis — Safeway, Inc.

Speaker Bio:

Rachel Owsley has worked in end-to-end software development and data science for over 12 years. Last year, she led the Data Science effort at Edo Interactive, a card-affiliated offers startup in Nashville, TN. She now heads the Data Science team for Just for U, Safeway’s personalized deals program.

Rahul Ravindran

Senior Software Engineer — BrightRoll

Speaker Bio:

Senior Software Engineer, Data Team at BrightRoll. Previously at Zynga and Microsoft.

Raj Nair

Director, Data Engineering — Penton Media Inc.

Speaker Bio:

Raj Nair is currently the Director of Data Management and Engineering at Penton Media. At Penton, Raj is focused on building a scalable data management platform combining both SQL and NoSQL technologies, a platform that would ultimately help create new content-centric products. Prior to Penton, Raj was at EMC innovating in the areas of Risk, Compliance, and Governance products. Earlier, at IBM/Informix, Raj worked on database drivers and performance accelerators. With over 15 years of technical and management expertise in high-performance databases, NoSQL, and Hadoop technologies, Raj has the unique ability to combine customer needs and emerging technologies into innovative products.

Remy Saissy

Software Architect and IT Consultant — Octo Technology

Speaker Bio:

Remy is a Software Architect and IT Consultant at Octo Technology. After his studies at Epitech Paris, where he first worked on distributed systems for raytracing rendering and designed and implemented an exokernel for the School’s System Lab (LSE), he has been involved in several IT projects, working with various technologies such as open source ECM and ERP, electronic Strongbox, iPad-based video recommendation, distributed systems for simulation or computing, and Hadoop for ETL, BI and Machine Learning workloads. He also taught software development patterns at ETNA, Paris. He is currently in charge of Octo’s Hadoop R&D and is a co-organizer of the Paris Hadoop User Group.

Reuben Shaffer

Chief Information Officer — QuestPoint

Speaker Bio:

Chief Information Officer Reuben Shaffer oversees QuestPoint’s architecture, infrastructure and data – from data collection to aggregation and manipulation. In his role, he ensures the 11 billion impressions and 5 billion requests per day that QuestPoint receives on average are seamlessly executed through the company’s data intelligence platform. Through Shaffer’s leadership, companies are able to mine consumer behavior data, affording them powerful competitive advantages for their business’ bottom lines.

Rishit Shroff

Software Engineer — Facebook

Speaker Bio:

I am a Software Engineer at Facebook. I work on building data storage systems on HBase for various applications at Facebook. Before joining Facebook, I worked at NetApp for almost 2 years in the data mobility team. I earned my bachelor's in Information Technology from VJTI, India, and my master's in Computer Networking from Carnegie Mellon University, USA.

Rohini Palaniswamy

Principal Engineer — Yahoo! Inc

Speaker Bio:

Rohini currently leads Pig and Oozie development at Yahoo!, and has been working on Hadoop and related projects like Pig, Oozie, HCatalog, Hive, Grid Data Lifecycle Management for the past 5 years at Yahoo! scale. Rohini is a PMC member/committer on the Apache Pig project, and a committer on the Apache Oozie project. She is interested in large-scale data processing and is currently working on Pig-on-Tez which targets low latency ETL on Hadoop.

Roopesh Varier

Director of Development, Big Data — Symantec Corporation

Speaker Bio:

Roopesh Varier leads the Big Data Platform team in Symantec Cloud Platform Engineering Group. Previously, he led the development of a big data platform in Symantec’s Threat analysis organization – building it from scratch to become the largest known security metadata store in the world. Prior to that he has also held multiple leadership roles in other companies.

Russell Foltz-Smith

VP, Data Platform — TrueCar, Inc.

Speaker Bio:

Russ is the VP of Data Platform at TrueCar.com, where he creates the intelligence systems driving TrueCar’s innovative interactive product set. Prior to TrueCar, he held executive, product and technical leadership positions at category leaders like IAC, Grind Networks, and Wolfram|Alpha. Russ holds a degree in mathematics from the University of Chicago and currently lives in Marina Del Rey, CA with his wife and two daughters.

Sameer Agarwal

PhD Candidate — UC Berkeley

Speaker Bio:

Sameer Agarwal is a Ph.D. candidate in the AMPLab at Berkeley working on large-scale approximate query processing frameworks. His research interests are at the intersection of distributed systems, databases and machine learning. He received his B.Tech in Computer Science and Engineering from the Indian Institute of Technology, Guwahati and was awarded the President of India Gold Medal in 2009. He was supported by the Qualcomm Innovation Fellowship during 2012-13 and is supported by the Facebook Graduate Fellowship during 2013-14.

Sanjay Radia

Architect and Founder — Hortonworks Inc

Speaker Bio:

Sanjay is founder and architect at Hortonworks, and an Apache Hadoop committer and member of the Apache Hadoop PMC. Prior to co-founding Hortonworks, Sanjay was the chief architect of core-Hadoop at Yahoo and part of the team that created Hadoop. In Hadoop he has focused mostly on HDFS, MapReduce schedulers, high availability, compatibility, etc. He has also held senior engineering positions at Sun Microsystems and INRIA, where he developed software for distributed systems and grid/utility computing infrastructures. Sanjay has a PhD in Computer Science from the University of Waterloo in Canada.

Santosh Jha

CEO/Founder — Aziksa

Speaker Bio:

Santosh is a software executive with 25-plus years of software development experience. Prior to founding Aziksa, Santosh was CTO for Kovim, a Bay Area learning solution company serving global enterprises. Before Kovim, Santosh was VP of Product Development at GlobalEnglish, where he successfully led the team to build multiple learning products on a SaaS model with cloud hosting. He is an expert in the strategy, planning, design, and delivery of cost-effective, high-performance technology solutions in support of company growth.

Saran Subramanian

Head of Analytics — Thomas Cook

Speaker Bio:

Saran has over 10 years of experience in digital analytics and eCommerce strategy, implementation and management. He specialises in extracting various forms of online and social media data, integrating it with offline data, and utilising it for single customer view reporting and predictive modelling.

Saravanan Prabhagaran

Software Engineer — Syntel

Speaker Bio:

Saravanan holds a Master of Computer Applications from Anna University and a PG Diploma in Cyber Crime Investigation and Forensics from Asian School of Cyber Laws. He is a hard-core developer with 5 years of experience and versatile programming knowledge, involved in research and development of the big data stack.

Savin Goyal

Rocket Scientist — Rocket Fuel

Speaker Bio:

Savin is a Rocket Scientist at Rocket Fuel in the modeling infrastructure team working on large scale data pipelines for data mining and machine learning. He graduated from Indian Institute of Technology, Delhi with a BS in Computer Science.

Sheetal Dolas

Principal Architect — Hortonworks

Speaker Bio:

Sheetal is a Principal Architect working with Hortonworks. He has strong expertise in the Hadoop ecosystem, with rich and diverse field experience across various verticals including Telco, Hi Tech, Retail, and Internet companies. He has served in key positions such as Lead Big Data Architect and SOA Architect in a variety of extremely large and complex enterprise programs. He has extensive knowledge of big data and NoSQL technologies including Hadoop, YARN, Hive, Pig, HBase, Storm, Kafka, and ElasticSearch. He has defined and established data architectures for multi-petabyte warehouses on Hadoop and has extensive hands-on experience in deploying and tuning very large Hadoop clusters and building scalable applications on them.

Shital Mehta

Architect — Yahoo!

Speaker Bio:

I am currently an Architect in the Video team, where I work on the video transcoding and enrichments platform. I have also worked in the advertising domain, where my main focus was on abuse detection. Before Yahoo!, I spent considerable time in the VoIP domain building voice mail, video mail, and audio/video/web conferencing solutions for Telecom and Cable service providers.

Shivaram Venkataraman

Graduate Student — UC Berkeley

Speaker Bio:

Shivaram Venkataraman is a third year PhD student at the University of California, Berkeley and works with Mike Franklin and Ion Stoica at the AMP Lab. He is a committer on the Apache Spark project and his research interests are in designing frameworks for large scale machine-learning algorithms. Before coming to Berkeley, he completed his M.S at the University of Illinois, Urbana-Champaign and worked as a Software Engineer at Google.

Siddharth Wagle

Member of Technical Staff — Hortonworks

Speaker Bio:

Siddharth Wagle is a Member of Technical Staff at Hortonworks and a committer and PMC member for the Apache Ambari project. His primary focus is developing the Ambari backend for provisioning, managing, and monitoring Apache Hadoop clusters. His previous background is in building high-performance, scalable systems and APIs at Telenav Inc and Intelliun Corp.

Sivasankaran Chandrasekar

Senior Rocket Scientist — Rocket Fuel

Speaker Bio:

Chandra leads the infrastructure areas used for machine learning systems at Rocket Fuel. Prior to Rocket Fuel, he worked on petabyte-scale Stats systems in the Ads Infrastructure team at Google and on querying/indexing of semi-structured data in the database kernel at Oracle. Chandra has an M.S. in Computer Science from the University of Wisconsin, Madison and holds 100+ patents in database systems.

Snehalata Deorukhkar

Software Engineer — Syntel Ltd

Speaker Bio:

Hadoop developer at Syntel with over 3 years of experience in the IT industry and extensive experience in Hadoop and NoSQL-based solutions.

Srikanth Sundarrajan

Principal Architect — InMobi

Speaker Bio:

Srikanth Sundarrajan works at Inmobi Technology Services, helping architect and build their next generation data management solution. He has been actively involved in various projects under the Apache Hadoop umbrella including HDFS, MR, Oozie. He has been working with distributed processing systems for over a decade and Hadoop in particular over the last four years. He was with the Hadoop team earlier at Yahoo!.

Srimanth Gunturi

Member of Technical Staff — Hortonworks

Speaker Bio:

Srimanth Gunturi is an Apache Ambari committer and PMC member working at Hortonworks.

Srinivas Nimmagadda

Technical Director, Big Data/Cloud Platform Engineering — Symantec Corporation

Speaker Bio:

Srinivas Nimmagadda is responsible for the architecture and technology of the petabyte-scale big data analytics cloud platform at Symantec’s Cloud Platform Engineering team. Prior to Symantec, Srinivas led the efforts in building a large-scale private cloud platform at Intuit while pioneering the use of Software Defined Data Center (SDDC) concepts. Earlier in his career, Srinivas was instrumental in developing one of the world’s largest grid computing platforms for HPC workloads running on both Unix and Windows environments.

Stefan Groschupf

CEO — Datameer

Speaker Bio:

Stefan Groschupf is a big data veteran and serial entrepreneur with strong roots in the open source community. He was one of the very few early contributors to Nutch, the open source project that spun off Hadoop, which 10 years later, is considered a 20 billion dollar business. Stefan is currently the CEO of Datameer, the first big data analytics tool built natively on Hadoop.

Stephanie Caprini

Chief Operating Officer — Rante

Speaker Bio:

As Rante’s chief operating officer, Stephanie Caprini leads the company’s global sales, marketing and services organization. She oversees worldwide sales, field marketing, services, support and partner channels, and corporate support functions including Information Technology, Worldwide Licensing & Pricing and Operations. The sales and marketing organization is focused on delivering Rante’s software and services to customers and partners all over the world. At Rante, Stephanie has driven a strong track record of results, execution excellence and improved efficiency while also driving the customer satisfaction scores to the highest in company history.

Steve Ackley

VP of Sales — Altiscale

Speaker Bio:

Steve has led sales organizations at numerous technology companies, most recently at Packet Design, where he was EVP, WW Field Ops. Steve led Packet Design’s field for over 6 years until acquisition. Prior to that, he was VP WW Sales at MonoSphere (Dell), Satmetrix, and Vicinity (Microsoft). Steve was an early member of Marimba (BMC) where he built their reseller and direct channels as regional VP Sales. Before that, he held sales management positions at Tivoli (IBM) and Sun (Oracle). Steve received his J.D from the Santa Clara University School of Law and his B.S. from the San Diego State University School of Business.

Steve Loughran

Member of Technical Staff — Hortonworks

Speaker Bio:

Steve Loughran (@steveloughran) works at Hortonworks on leading-edge developments within the Hadoop ecosystem. Projects he’s worked on recently include OpenStack integration, the YARN service model, and specifying and evolving the Hadoop Filesystem APIs; he is currently working on long-lived YARN services. He is the author of Ant in Action, a member of the Apache Software Foundation, and an active committer and PMC member in the Hadoop project.

Suma Shivaprasad

Staff Engineer — InMobi

Speaker Bio:

Staff Engineer at InMobi working on an in-house data warehouse platform that facilitates querying and managing large datasets residing in Hadoop. Previously worked at Yahoo on distributed data processing and ingestion platforms that leveraged Hadoop.

Sumeet Singh

Director, Product Management — Yahoo

Speaker Bio:

Sumeet Singh is Director of Product Management for Hadoop at Yahoo responsible for Product Management, Customer Engagements, Evangelism and Community Development, and Program Management. In this role, he also leads the Hadoop products team responsible for both Apache open source contributions and Yahoo projects. Sumeet has 15 years of Product Management, Product Development, and Strategy Consulting experience in the technology industry. Sumeet earned his MBA from UCLA Anderson School of Management and MS from Rensselaer Polytechnic Institute, NY.

Sumit Mohanty

Member of Technical Staff — Hortonworks

Speaker Bio:

Sumit Mohanty works at Hortonworks on the Apache Ambari project. He is an Apache Ambari Committer and PPMC member. He is currently working on managing long-lived YARN applications. Prior to joining Hortonworks, he worked for several years at Microsoft on various aspects of system management and monitoring. He holds a PhD from the University of Southern California.

Sunil Gupta

Technical Yahoo! — Yahoo!

Speaker Bio:

Sunil works in the ads and data team at Yahoo!. His current focus is building large-scale systems for distributed analytics.

Supreeth Rao

Technical Yahoo! — Yahoo!

Speaker Bio:

Supreeth works in the ads and data team at Yahoo!. His current focus is building large-scale systems for distributed analytics.

Suresh Srinivas

Co-founder and Architect — Hortonworks

Speaker Bio:

Suresh is an Apache Hadoop committer and member of Apache Hadoop Project Management Committee (PMC). He is a long term active contributor to the Apache Hadoop project and has designed and developed many significant features for Hadoop. Prior to co-founding Hortonworks, he served as a software architect at Yahoo! working on Apache Hadoop, where he developed features and supported some of the largest installations of Hadoop clusters. Follow Suresh on Twitter: @suresh_m_s

Tanya Shastri

Founder — Natero

Speaker Bio:

Tanya is co-founder of Natero, a big data cloud analytics company. Prior to Natero, as the Big Data lead at Google she led big data partnerships and ecosystem strategy for the Google Cloud Platform. In the Office of the CTO at NetApp she initiated and led the development of the company’s big data product and partner strategy. She started her career at Cisco where she held various roles in product management and software development. Tanya has an MBA from Haas, UC Berkeley and an MS in Electrical Engineering from SUNY Stony Brook, NY.

Tatsiana Maskalevich

Data Scientist — Silicon Valley Data Science

Speaker Bio:

Blending both industrial and academic research, Tatsiana is expert at solving hard business problems. She brings a background in both mathematics and statistics, and has deep experience researching and implementing models for predicting user behavior.

Ted Dunning

Chief Application Architect — MapR

Speaker Bio:

Ted is Chief Application Architect for MapR Technologies and contributes to several Apache open source projects including Mahout, Hadoop, ZooKeeper and HBase. He is also a mentor for Apache Drill and Apache Storm. Ted has a Ph.D. in computing science from the University of Sheffield and is named as inventor on 24 issued patents with a dozen more pending. He also bought the beer at the first Hadoop User Group meeting.

Thiruvel Thirumoolan

Principal Engineer-1 — Yahoo!

Speaker Bio:

Thiruvel Thirumoolan is a developer in the Hive and HCatalog team at Yahoo!. In this role he is responsible for the deployment of Hive, HiveServer2 and HCatalog across all the Hadoop clusters at Yahoo! and for ensuring they work at scale for the usage patterns of Yahoos. He also contributes features and fixes to the Apache Hive community. He has a bachelor's degree from Anna University and has been working on the Hadoop team at Yahoo! for more than 4 years. His favorite theme at Yahoo! internal Hack Days is Hadoop, and he also mines the trove of Hadoop logs for usage patterns and insights.

Thomas Graves

Principal Engineer — Yahoo

Speaker Bio:

Thomas Graves is a Principal Engineer at Yahoo. He is a Hadoop committer and PMC member, a Spark committer and Incubator PMC member at the Apache Software Foundation.

Timothy Los

Chairman & CEO — Rante Corporation

Speaker Bio:

Timothy Los is the Chairman and CEO of Rante. He is also the Chairman of Interfint. Rante is engaged in cloud procurement and database management and synchronization services. Interfint is involved in FATCA-related solutions for governments and financial institutions. Mr. Los attended St. John’s School of Law. Mr. Los is currently engaged in NYLS Taxation LLM with a focus on international taxation, transfer pricing, and cross-border legal and taxation issues with the exchange and use of intellectual property. At Rante, Mr. Los oversees the daily business operations and is in charge of the growth of Envizn in current and new markets.

Tristan Reid

Senior Software Development Lead — Hulu

Speaker Bio:

Before Hulu, Tristan was VP of Solution Design at Ares Mgmt, leading a team building research tools for investment professionals. He has taught software development courses for IBM, BEA Systems and others in the US, Europe and Asia. He received his CS degree from Duke University, and completed coursework in Applied Mathematics and various data analysis at Columbia U., Suffolk U. and U. of Illinois at Urbana-Champaign. He earned the CFA designation in 2006. Previously, Tristan built risk management and data analysis tools at Capital Research as a Quant Research Associate. Before his career in finance, he participated in a number of start-ups, both as a resource and as principal.

Varun Vasudev

Member of Technical Staff — Hortonworks, Inc.

Speaker Bio:

Mr. Varun Vasudev was a Senior Software Engineer at Yahoo! and is experienced in web search. He is now a Member of Technical Staff at Hortonworks, Inc., where he actively contributes to the Hadoop YARN project.

Venkatesh Seetharam

Architect — Hortonworks

Speaker Bio:

Seetharam Venkatesh works at Hortonworks Inc. leading the data integration development efforts. He is an active contributor to Apache Oozie, Apache Sqoop and Apache Flume. He was part of the Hadoop team at Yahoo where he built data management solutions. He has been involved in Hadoop and contributed to many open source projects in the ecosystem over the last 6 years.

Vikram Dixit

Member of Technical Staff — Hortonworks

Speaker Bio:

Vikram Dixit is an active committer on the Apache Hive project and has been contributing to other Apache projects over the past 2 years at Hortonworks. He is also a committer on the Apache Tez and Ambari projects. At Hortonworks, he is involved in the design and development of Hive as part of the Stinger initiative delivering performance and SQL capabilities in the ecosystem. Vikram holds a Masters degree in Computer Science from the University of Southern California.

Vinod Kumar Vavilapalli

Member of Technical Staff — Hortonworks

Speaker Bio:

Vinod Kumar Vavilapalli is the Hadoop YARN and MapReduce guy at Hortonworks. He is a long-term Hadoop contributor at Apache, a Hadoop committer, and a member of the Apache Hadoop PMC. He has a bachelor's degree in Computer Science and Engineering from the Indian Institute of Technology Roorkee. He has been working on Hadoop for more than 6 years and he still has fun doing it. Straight out of college, he joined the Hadoop team at Yahoo! Bangalore, where he worked on HadoopOnDemand, Hadoop-0.20, CapacityScheduler, and Hadoop security, before Hortonworks happened. He is passionate about using computers to change the world for the better, bit by bit. He is reachable at the Twitter handle @tshooter.

Xiangrui Meng

Software Engineer — Databricks, Inc.

Speaker Bio:

Xiangrui Meng is a software engineer at Databricks, the company founded by the creators of Spark. His main interests center around developing and implementing scalable algorithms for scientific computing. He has been actively involved in the developments of Spark MLlib since he joined, contributing new features and helping review pull requests. Before Databricks, he worked as an applied research engineer at LinkedIn, where he was the main developer of an offline machine learning framework and provided user support to multiple offline recommendation pipelines.

Yanpei Chen

Software engineer — Cloudera

Speaker Bio:

Yanpei is a member of the Performance Engineering Team at Cloudera, where he works on internal and competitive performance measurement and optimization. His work touches upon multiple interconnected computation frameworks, including Hadoop, Impala, HBase, Search, and Hive. He holds a Ph.D. from UC Berkeley.

Zhijie Shen

Member of Technical Staff — Hortonworks, Inc.

Speaker Bio:

Dr. Zhijie Shen was awarded a Ph.D. in Computer Science by the National University of Singapore. He is now a Member of Technical Staff at Hortonworks, Inc. He is an Apache Hadoop Committer and a member of the core Apache Hadoop YARN team. He has been actively contributing to the Hadoop ecosystem since 2011.