SPEAKERS

Arpit Agarwal

Hortonworks
Arpit Agarwal is a Member of Technical Staff at Hortonworks and an active Apache Hadoop committer and PMC member. His interests include distributed computing and software performance.

Session(s):
HDFS: Optimization, Stabilization and Supportability
Tuesday, June 28, 2016, 3:00PM - 3:40PM
BALLROOM B
Rahul Agarwal

Flipkart Internet Pvt. Ltd.
Member of the Data Platform team at Flipkart. I am responsible for managing and operating Hadoop infrastructure that scales to tens of petabytes. Prior to Hadoop administration stints at Flipkart/Expedia/VISA, I worked for Yahoo as a Systems Engineer for various Listings Properties.
Nitin Aggarwal

Rocket Fuel Inc.
Nitin is a Software Engineer at Rocket Fuel where he builds data applications using HBase, MapReduce, YARN and Storm to enable faster access and easier analysis of the petabytes of data Rocket Fuel generates and stores. He has also contributed to developing scalable monitoring and alerting infrastructure for the company using HBase and OpenTSDB. Nitin holds a master's degree in Computer Science with a specialization in Algorithms and Distributed Systems.

Session(s):
Fighting Fraud in Real Time by Processing 1M+ TPS Using Storm on Slider (YARN)
Thursday, June 30, 2016, 4:10PM - 4:50PM
212
Andrew Ahn

Hortonworks
Andrew Ahn is an Apache Atlas committer and currently works for Hortonworks as the Governance product manager supporting Apache Atlas. Prior work includes Product and Governance duties for NYSE Euronext, spanning 12 countries and 23 market centers.

Session(s):
Top Three - Big Data Governance Issues and How Apache ATLAS resolves it for the Enterprise
Tuesday, June 28, 2016, 4:10PM - 4:50PM
212

What the #$* is a Business Catalog and Why You Need It!
Tuesday, June 28, 2016, 11:30AM - 12:10PM
BALLROOM B

Extend Governance in Hadoop with Atlas Ecosystem
Thursday, June 30, 2016, 4:10PM - 4:50PM
210A
Ajay Anand

Kyvos Insights, Inc.
Ajay Anand is VP Products at Kyvos Insights, delivering multi-dimensional OLAP solutions that run natively on Hadoop. Ajay was the first product manager for Hadoop at Yahoo, after which he founded Datameer, delivering the first commercial analytics product on Hadoop. Previously Ajay has held director of product management and market development roles at SGI and Sun.

Session(s):
Accelerating Data Warehouse Migration to Hadoop
Wednesday, June 29, 2016, 11:30AM - 12:10PM
230C
Brad Anderson

Liaison Technologies
Brad Anderson oversees Liaison's big data strategy, an enterprise effort to leverage the company's $250M investment in its world-class cloud infrastructure using big data technologies. Before joining Liaison in 2014, Anderson founded or co-founded four technology companies across a variety of industries, all employing big data technologies. He has also worked with MapR Technologies, Ericsson and Cloudant. While at Cloudant, Brad helped design and construct BigCouch, Cloudant's horizontally scalable version of CouchDB.

Session(s):
The Stream is the Database - Revolutionizing Healthcare Data Architecture
Wednesday, June 29, 2016, 5:00PM - 5:40PM
212
Kris Applegate

Dell Inc.
Kris has been with Dell since 2003. Based out of Dell’s Round Rock, Texas Customer Solution Centers, he regularly delivers briefings, workshops, and POCs in all areas of Private Cloud and Big Data. His principal solution expertise is the Dell Cloudera Hadoop Solution(s) and Dell Openstack Solutions. He has authored or co-authored several works including: Dell Hortonworks Reference Configuration, Dell Cassandra Reference Configuration, Dell MongoDB Reference Configuration, Virtualized Hadoop, Hadoop on Dell PowerEdge FX, and How to Execute a Successful Proof of Concept.

Session(s):
Big Data in the Cloud, the Time has Come
Tuesday, June 28, 2016, 2:10PM - 2:50PM
230C
Masato Asahara

NEC
Masato Asahara (Ph.D.) received his M.E. degree in computer science and Ph.D. from Keio University in 2007 and 2011, respectively. He currently works at NEC Knowledge Discovery Research Laboratories. His research mission and interests include distributed computing platforms for advanced predictive analysis using novel machine learning algorithms. He moved to the Cupertino office in 2015 and works on R&D and business development in the field of advanced Big Data analytics solutions using NEC's Heterogeneous Mixture Learning technology. http://www.nec.com/en/global/rd/crl/datamining/members/profile_asahara.html

Session(s):
Big Data Heterogeneous Mixture Learning on Spark
Thursday, June 30, 2016, 12:20PM - 1:00PM
230A
Shivnath Babu

Duke University and Unravel Data Systems
Shivnath Babu is an Associate Professor of Computer Science at Duke University and the CTO at Unravel Data Systems. He got his Ph.D. from Stanford University in 2005 and BTech from IIT Madras in 1999. He has received a U.S. National Science Foundation CAREER Award, three IBM Faculty Awards, and an HP Labs Innovation Research Award. His research interests are in ease-of-use and manageability of data-intensive computing systems, automated problem diagnosis and cluster sizing for systems running on cloud platforms, and automated detection and recovery from corruption of data caused by hardware faults, software bugs, or human mistakes.

Session(s):
Meeting Performance Goals in Multi-tenant Hadoop Clusters
Thursday, June 30, 2016, 11:30AM - 12:10PM
230C
Vladimir Bacvanski

SciSpike
Dr. Vladimir Bacvanski's interest is in better and more productive ways of developing Big Data systems. He is a founder of SciSpike, a company doing custom development, consulting and training, and engages clients on Big Data and Software Architecture topics. His recent projects include Big Data and Internet of Things applications in healthcare, reactive Big Data and Web Scale systems and introducing Spark in a large financial organization. Vladimir is an enthusiastic speaker and has given keynote talks at data conferences. He is the author of the O'Reilly course "Introduction to Big Data".

Session(s):
Big Data for Managers: From Hadoop to Streaming and Beyond
Tuesday, June 28, 2016, 11:30AM - 12:10PM
230A
Kamil Bajda-Pawlikowski

Teradata
Kamil is a Chief Architect at Teradata Center for Hadoop, Boston. Previously, he was a co-founder and chief software architect at Hadapt, a SQL-on-Hadoop company. Before that, Kamil performed graduate research at Yale University in the area of large scale data processing where he developed the HadoopDB project.

Session(s):
Presto, What's New in SQL-on-Hadoop and Beyond
Wednesday, June 29, 2016, 5:50PM - 6:30PM
210A
Rajesh Balamohan

Hortonworks
Rajesh Balamohan is a "Member of Technical Staff" at Hortonworks. He has been working on Hadoop for the last couple of years. Recently he has been concentrating on Tez performance at scale. Rajesh is a committer and PMC member of the Apache Tez project.

Session(s):
Hadoop & Cloud Storage: Object Store Integration in Production
Thursday, June 30, 2016, 11:30AM - 12:10PM
212
Slim Baltagi

Capital One Financial Corp.
Slim Baltagi (@SlimBaltagi) is currently director of Big Data engineering at Capital One. He has more than 18 years of IT and business experience and has spent the last five years of his life hadooping and more recently sparking and flinking! He has worked on more than a dozen Big Data projects as a solution architect. He enjoys evangelizing about Big Data technologies and maintaining a Big Data Knowledge Base: Hadoop, Spark, Flink, ... He runs the New York City, Chicago, Washington DC, Dallas/Fort Worth Apache Flink meetups and co-organizes the Boston, Paris, Sao Paulo Flink meetups.

Session(s):
Analysis of Major Trends in Big Data Analytics
Tuesday, June 28, 2016, 3:00PM - 3:40PM
230A
Arijit Banerjee

Wipro Technologies
Arijit Banerjee is Lead Architect at Wipro Technologies delivering several engagements around consulting, architecture and implementation for large scale big data projects. He is the architect of and a contributor to the Big Data Ready Enterprise framework.

Session(s):
Big Data Ready Enterprise Framework
Thursday, June 30, 2016, 3:00PM - 3:40PM
211
Nishant Bangarwa

Hortonworks
Nishant is a Druid PMC member and Sr. Software Engineer at Hortonworks. He is part of the Business Intelligence team at Hortonworks. Prior to that he was part of the Metamarkets backend team and was responsible for analytics infrastructure, including real-time analytics in Druid. He holds a B.Tech in Computer Science from National Institute of Technology, Kurukshetra, India.

Session(s):
Scalable Realtime Analytics using Druid
Wednesday, June 29, 2016, 3:00PM - 3:40PM
BALLROOM B
Vijay Bhat

Capital One

Session(s):
Enterprise-Grade Streaming Under 2ms on Hadoop
Wednesday, June 29, 2016, 2:10PM - 2:50PM
BALLROOM B
Mike Bishop

Prescient
Mike Bishop oversees all technical and operational aspects of the Prescient Traveler platform. Mike served as a paratrooper in the US Army prior to earning degrees in computer science. He has developed and managed a variety of complex risk management solutions for government customers that relate to international travel, counter-intelligence, counter-terrorism, intelligence collection, geospatial analysis, communications and crisis response. Over the past two decades, Mike has supported operations for the US Army, Defense Intelligence Agency, National Ground Intelligence Center, Missile and Space Intelligence Center, Space and Missile Defense Command, and National Security Agency.

Session(s):
Prescient Keeps Travelers Safe with Natural Language Processing and Geospatial Analytics
Wednesday, June 29, 2016, 3:00PM - 3:40PM
212
Ron Bodkin

Think Big, a Teradata Company

Session(s):
Integrating Apache Spark and NiFi for Data Lakes
Thursday, June 30, 2016, 2:10PM - 2:50PM
230A
Davor Bonaci

Google Inc.
Davor Bonaci is a member of the Apache Beam Project Management Committee and a regular committer to the project since its inception. He is working as a Senior Software Engineer at Google. Before Beam, he worked on its predecessor, Google Cloud Dataflow, from its beginnings, most recently leading the development of the Dataflow SDK for Java.

Session(s):
Apache Beam: A Unified Model for Batch and Streaming Data Processing
Thursday, June 30, 2016, 4:10PM - 4:50PM
BALLROOM A
Slim Bouguerra

Hortonworks
Slim is a Sr. Software Engineer and Druid committer. He recently joined the Hortonworks Druid team after spending a couple of years working at Yahoo Inc. as part of the open source Druid team. He holds a PhD in computer science from Grenoble University in France.

Session(s):
Scalable Realtime Analytics using Druid
Wednesday, June 29, 2016, 3:00PM - 3:40PM
BALLROOM B
Parth Brahmbhatt

Netflix
Parth contributed the authorization layer of Kafka security and is an active Kafka contributor. He is an Apache Storm committer and has worked on the AWS Kinesis team in the past.

Session(s):
Apache Kafka Security
Wednesday, June 29, 2016, 2:10PM - 2:50PM
210A
Tilmann Bruckhaus

Intuit
Tilmann is a hands-on big data engineering leader and entrepreneur at Intuit who is passionate about using his technical skills to build great products and agile teams. As leader of data engineering organizations in disruptive environments, Tilmann led the recruiting of new engineering teams and the scaling of existing ones, and he created data platforms for machine learning, advanced analytics, and reporting. Tilmann minimized fraud losses on billion dollar payment volumes and created reliable and efficient systems for ingesting billions of financial transactions. Tilmann also created data marts for tens of millions of customers and brought new data product lines to market.

Session(s):
The Intuit Analytics Cloud 101
Wednesday, June 29, 2016, 4:10PM - 4:50PM
BALLROOM C
Boni Bruno

EMC
Boni Bruno is a Principal Solutions Architect for EMC’s Emerging Technologies Division. He has extensive experience in deploying Hadoop/Big Data Technologies and conducting extensive Hadoop validations and performance evaluations. He is also the resident security expert on the solution architecture team and has years of experience in Cyber Security Architecture and Analysis. Boni Bruno holds degrees in Electrical Engineering and Information Technology.

Session(s):
Increasing Hadoop Resiliency & Performance with EMC Isilon
Tuesday, June 28, 2016, 2:10PM - 2:50PM
210A
Andrew Brust

Datameer
Andrew is the Senior Director of Market Strategy & Intelligence at Datameer and writes a blog for ZDNet called "Big on Data." Andrew is co-author of "Programming Microsoft SQL Server 2012" (Microsoft Press) and an advisor to NYTECH, the New York Technology Council, and he writes the Redmond Review column for VisualStudioMagazine.com.

Session(s):
The Ecosystem is Too Damn Big
Wednesday, June 29, 2016, 4:10PM - 4:50PM
BALLROOM B
Tom Bryans

Ford Motor Company
Tom Bryans has been at Ford Motor Company for over 20 years, starting as a C programmer and performing many Application Development roles along the way. For the past 10 years, Tom has been working in Connected Vehicle, delivering systems that support Sync Gen 1, MyFordTouch and SyncGen3 as well as Cloud based delivery of software binaries to consumers and dealers. Tom is currently the Delivery Manager for the Ford Connected Vehicle Data Platform, which collects vehicle data from Embedded Modems, Plug-In devices, Autonomous Vehicles, Smart Mobility Experiments and Connected Consumer mobile applications and utilizes Ford’s Enterprise Hadoop Data Platform.

Session(s):
The "Connected Vehicle" (IoT and Streaming) – Supporting the Mission to Become Both an Automotive and Mobility Company
Tuesday, June 28, 2016, 12:20PM - 1:00PM
210C

The Architectural Journey to our Modern Data Applications – DSC (Data Supply Chain)
Wednesday, June 29, 2016, 4:10PM - 4:50PM
BALLROOM A
Mark Burnette

Pentaho
Mark Burnette is an enterprise sales engineer at Pentaho, where he partners with large organizations to develop successful pilots and proofs of concept for big data solutions and embedded analytics using Pentaho, Hadoop, and other technologies. Prior to Pentaho, Mark ran an IT consulting practice for 15 years, providing custom enterprise application and data solutions.

Session(s):
Filling the Data Lake
Wednesday, June 29, 2016, 5:00PM - 5:40PM
211
Abhiraj Butala

BlueData
Abhiraj is a Software Engineer in the Hadoop Platform team at BlueData and primarily focuses on the HDFS caching infrastructure on the BlueData EPIC platform. This infrastructure allows single copy data transfer between containerized Hadoop & Spark applications and remote HDFS backends via an HDFS abstraction layer. He also works on Platform HDFS, Mesos integration and Hadoop Security components for EPIC. Previous stints include roles in the infrastructure teams at Riverbed and Azingo. He has contributed patches to Hadoop (HDFS), Zookeeper & Alluxio (Tachyon) Projects. Abhiraj holds a Master's in Computer Science from Stony Brook University, New York. Reachable @abhirajbutala.

Session(s):
There is a New Ranger in Town! End-to-End Security and Auditing in a Big-Data-as-a-Service Deployment
Tuesday, June 28, 2016, 12:20PM - 1:00PM
212
Alan Byers

Motorists Insurance Group
Alan Byers is AVP of Data Analytics at Motorists Insurance Group. Alan and his team are responsible for Data Platform development, Data Management, Analytics Solution development, and Data Integration support for the Motorists home office and its affiliate companies. He has experience in hardware configuration, systems administration, project management, and Web application development, but has spent most of his 21-year career building and maintaining data solutions in the P&C Insurance and Health & Wellness industries.

Session(s):
Disrupting Insurance with Advanced Analytics – The Next Generation Carrier How Motorist Leapfrogged into the Future of Analytics and Data
Wednesday, June 29, 2016, 5:50PM - 6:30PM
212
Mary Caire MD

MARYCAIREMD
Mary Caire MD is a ground-breaking integrative medical doctor and leader in DNA-informed precision healthcare who empowers all to achieve optimal health and live their best life. Dr. Caire is board-certified in the specialty of Physical Medicine and Rehabilitation and fellowship-trained in anti-aging, regenerative and metabolic medicine. She is founder and medical director of the Caire Institute, located in Allen, TX, where she educates and shares her expertise with the medical community in order to increase the number of practitioners specializing in integrative, precision medicine.

Session(s):
"The Path to Wellness Through Big Data"
Wednesday, June 29, 2016, 5:50PM - 6:30PM
211
Arno Candel

h2o.ai
Arno is the Chief Architect of H2O, a distributed and scalable open-source machine learning platform. He is also the main author of H2O’s Deep Learning. Before joining H2O.ai, Arno was a founding Senior MTS at Skytree where he designed and implemented high-performance machine learning algorithms. He has over a decade of experience in HPC with C++/MPI and had access to the world’s largest supercomputers as a Staff Scientist at SLAC National Accelerator Laboratory where he participated in US DOE scientific computing initiatives and collaborated with CERN on next-generation particle accelerators. Arno holds a PhD and Masters summa cum laude in Physics from ETH Zurich, Switzerland. He has authored dozens of scientific papers and is a sought-after conference speaker. Arno was named "2014 Big Data All-Star" by Fortune Magazine. Follow him on Twitter: @ArnoCandel.

Session(s):
H2O: A Platform for Big Math
Tuesday, June 28, 2016, 4:10PM - 4:50PM
211
Seetha Chakrapany

Macy's
Seetha Chakrapany heads the Marketing Analytic Systems Environments - Business Intelligence teams at Macy's. This department provides Business Intelligence platforms including Tableau, SAS, and Excel, along with access to the Hortonworks Hadoop cluster, to analysts and internal decision makers across the organization, providing insight that is the backbone for online, paid search and advertising analysis and decisions that drive a competitive advantage for Macy's nationwide.

Session(s):
How Macy's Creates Operational Insight on Hadoop.
Tuesday, June 28, 2016, 11:30AM - 12:10PM
BALLROOM A
Kishore Chaliparambil

Microsoft
Kishore is working as a software engineer at Microsoft in the Big Data platform group, currently focusing on Azure Data Lake Analytics, Cosmos and YARN. He has been working on the Hadoop ecosystem since 2012. He has also worked in the Exchange Server cloud infrastructure and Microsoft Business Solutions groups. Before Microsoft, he worked at SAP Labs in the Netweaver platform and business intelligence group.

Session(s):
To Infinity and Beyond – Datacenter Scale YARN Clusters through Federation
Wednesday, June 29, 2016, 11:30AM - 12:10PM
BALLROOM B
Chester Chen

GoPro
Chester Chen is the Director of Engineering and hands-on architect at Alpine Data. He manages the analytics platform development and contributes to some of the major developments, including Analytic Workflow Engine, Spark Infrastructure, Kerberos Integration, and Rest API infrastructure. He is the founder and organizer of the SF Big Analytics Meetup and one of the organizers of the 2014 Scala By the Bay Conference. Before joining Alpine Data, he played many roles, including Technical Director, Architect, Director of Development and individual developer, at many big and small companies (Symantec, AltaVista, Clearstory system, Accent Media, etc.).

Session(s):
Real Time Visualizing Machine Learning with Spark
Wednesday, June 29, 2016, 5:00PM - 5:40PM
230C
Hao Chen

eBay Inc.
Co-creator, Committer and PMC of Apache Eagle (http://eagle.incubator.apache.org/; http://people.apache.org/~hao)

Session(s):
Apache Eagle - Secure Hadoop in Real Time
Wednesday, June 29, 2016, 5:00PM - 5:40PM
230A
Suma Cherukuri

Symantec Corporation
Software engineer with experience in building highly available and scalable web services and currently focused on contributing to the big data analytics platform.

Session(s):
In-Flux Limiting for a Multi-Tenant Logging Service
Tuesday, June 28, 2016, 2:10PM - 2:50PM
BALLROOM B
Ishan Chhabra

Rocketfuel Inc.
Ishan is a Technical Lead at Rocket Fuel, with a focus on building the next generation of real time storage and processing systems to enable key business use cases. Hadoop, HBase, Storm and Clojure are his tools of choice for tackling the complexity and scalability challenges of storing and analyzing petabytes of data generated and stored at Rocket Fuel. Prior to Rocket Fuel, he worked at Bell Labs to enable privacy in large scale recommendation systems using a truly distributed middleware, acquiring a patent in the process. Ishan holds a Bachelor's in Computer Science and Engineering.

Session(s):
Fighting Fraud in Real Time by Processing 1M+ TPS Using Storm on Slider (YARN)
Thursday, June 30, 2016, 4:10PM - 4:50PM
212
Darren Chinen

Malwarebytes
Darren is the Sr. Director of Data Science and Engineering at Malwarebytes. Prior to Malwarebytes he implemented Big Data solutions on Hadoop at GoPro and at Apple. He has spent nearly two decades in data, mostly in IT departments where workload automation was an obvious necessity. In Big Data, however, the environment has shifted from traditional IT data professionals to Java and Scala software engineers. Workload automation has been relegated to an afterthought for most Java engineers, for whom options like Jenkins, Cron, or Oozie seem to offer a quick fix.

Session(s):
Workload Automation + Hadoop? Oh Yeah! …a Match Made in Heaven
Tuesday, June 28, 2016, 5:50PM - 6:30PM
210A
RajeshBabu Chintaguntla

Hortonworks
He is a committer on HBase and a committer and PMC member of Phoenix. He has been working on HBase for the last 4-5 years.

Session(s):
Phoenix + HBase: An Enterprise Grade Data-Warehouse Appliance for Interactive Analytics?
Thursday, June 30, 2016, 3:00PM - 3:40PM
210C
Sriharsha Chintalapani

Hortonworks
Apache Kafka and Storm Committer.

Session(s):
Apache Kafka Security
Wednesday, June 29, 2016, 2:10PM - 2:50PM
210A
Kelvin Chu

Uber
Kelvin is a founding member of the Hadoop team at Uber. He is creating tools and services on top of Spark to support multi-tenancy and large scale computation-intensive applications. He is the creator and lead engineer of the Spark Uber Development Kit, Paricon and SparkPlug services, which are the main initiatives of Spark Compute at Uber. At Ooyala, he was co-creator of Spark Job Server, an open source RESTful server for submitting, running, and managing Spark jobs, jars and contexts. He implemented real-time video analytics engines on top of it using datacube materializations via RDD.

Session(s):
Spark Uber Development Kit
Wednesday, June 29, 2016, 11:30AM - 12:10PM
BALLROOM A
Kelly Cook

ConocoPhillips
Kelly Cook is Director of Analytic Platforms at ConocoPhillips with a team of engineers that focus on supporting the company’s exploration, drilling, and production operations across 25 countries. Previously, Mr. Cook held a number of roles with Halliburton and was Principal Technologist at Plan Three Solutions. He firmly believes that information technology should be an ROI-building asset, not a cost center to be minimized. Mr. Cook holds a BBA in Accounting from the University of Houston.

Session(s):
It’s Time: Launching Your Advanced Analytics Program for Success in a Mature Industry Like Oil and Gas
Thursday, June 30, 2016, 12:20PM - 1:00PM
211
Fredrick Crable

Capital One
Mr. Crable has 20+ years of experience in software engineering for telecom, cloud computing, and financial institutions. He has led teams both in building systems for maintenance of Big Data infrastructure and in implementing services and tools that allow companies to achieve their data analysis and operational goals. He has been instrumental in the development of systems using Big Data techniques to optimize the loan origination and funding operations for Capital One bank. Currently he manages systems at Capital One which automate the evaluation of credit and automobile inventories for loan originations.

Session(s):
Automated Systems for Loan Decisions Using AKKA and Spark
Wednesday, June 29, 2016, 3:00PM - 3:40PM
230A
Christopher Crosbie

AWS
Christopher Crosbie has over a decade of experience developing and deploying healthcare and life science information technology solutions in regulated environments. He is currently the solutions architect for the Amazon Web Services healthcare and life sciences partner team, where he is a trusted advisor to software vendors that build upon the AWS platform in the health tech space. Prior to joining AWS, Chris headed up the data science team at Memorial Sloan Kettering Cancer Center, where he collaborated with physicians and scientists to develop software solutions that enhance cancer research using the organization's vast clinical and genomics databases.

Session(s):
HIPAA Compliance in the Public Cloud
Thursday, June 30, 2016, 12:20PM - 1:00PM
210C
Peter Crossley

Webtrends
As the Chief Technology Officer at Webtrends, Pete drives and oversees the Advanced Technology Group to innovate and adopt new technologies. Pete is involved in all aspects of the business from customer meetings to writing code. Previously, Pete was the architect for Webtrends Optimize, the company's testing and targeting solution, and has been focused on developing SaaS platforms for the past 15 years. Pete has worked in technical roles at companies such as Philips Medical Systems and Netegrity.

Session(s):
The Life of an Internet of Things (IoT) Electron; its Journey to Become a Positive Influence for Something Greater
Tuesday, June 28, 2016, 4:10PM - 4:50PM
BALLROOM C
Nick Curcuru

MasterCard
Nick is responsible for leading the global enterprise information management practice at MasterCard. Using his 20 years of experience in operations and consulting, he and his team deliver big data analytic solutions to his clients. He is a frequent speaker on big data, security strategy and how to enable data driven strategies utilizing both traditional information systems and Hadoop. Nick joined MasterCard Advisors from the SAS Institute, bringing extensive experience in the use of advanced analytics and solution design. Prior to SAS, Nick worked for Andersen Consulting and the Walt Disney Company.

Session(s):
Instilling Confidence and Trust - Big Data Security & Governance
Thursday, June 30, 2016, 3:00PM - 3:40PM
BALLROOM A
Daniel Dai

Hortonworks
Daniel is an Apache Pig PMC member/committer involved with Pig for 6 years at Yahoo and now at Hortonworks. He has a PhD in Computer Science with specialization in computer security, data mining and distributed computing from the University of Central Florida. He is interested in data science, large scale processing, Hadoop, Pig, Hive, and more.

Session(s):
Hive Hbase Metastore - Improving Hive with a Big Data Metadata Storage
Wednesday, June 29, 2016, 4:10PM - 4:50PM
210C
Dr. Pedro Desouza

IBM
Dr. Desouza started his career teaching in the Department of Computer Science at Unicamp, Brazil, in 1993. A few years later, he joined IBM in Atlanta, GA, as a Senior Consultant, where he was able to apply innovative mathematical optimization techniques to planning and scheduling problems found in the Manufacturing industry. He worked as a Product Manager for i2 Technologies, applying statistics, mathematical programming, and analytical techniques to supply chain problems. Then, as a Director of Product Management for Vitria Technologies, he designed applications for real-time visibility of Supply Chain Management. He joined Business Objects as the Supply Chain Practice Lead, designing analytics on very large datasets for the Distribution, Defense, and Telecommunication industries. After that, he spent five years as a Business Intelligence Manager at Qualcomm, designing data mining and probabilistic algorithms for Augmented Reality and creating predictive analytics for multiple semiconductor chip design problems. More recently, he was Sr. Manager of Big Data & Analytics at EMC Global Professional Services, where he built a team of Data Scientists and Architects and helped customers in Finance, Media & Entertainment, Telecommunication, Retail, Manufacturing, Utilities, and Life Science. Currently, he is an Associate Partner in the Big Data & Analytics Center of Competency at IBM. Dr. Desouza is a Certified APICS Supply Chain Professional, has six patents, a B.S. and M.S. in Electrical Engineering from the Technological Institute of Aeronautics in São José dos Campos, São Paulo, Brazil, and a Ph.D. in Applied Mathematics from the Electrical and Computer Engineering Department at Carnegie Mellon University in Pittsburgh, PA.

Session(s):
Using Hadoop for Cognitive Analytics
Wednesday, June 29, 2016, 12:20PM - 1:00PM
210A
Pawan Divarkarla

Progressive
Sheetal Dolas

Hortonworks
Sheetal is a Principal Architect working with Hortonworks. He has strong expertise in the Hadoop ecosystem with very rich & diverse field experience across various verticals including Telco, Hi Tech, Retail, Internet Companies, etc. He has served in key positions as Lead Big Data Architect and SOA Architect in a variety of extremely large & complex enterprise programs. He has extensive knowledge of BigData/NoSql technologies including Hadoop/Yarn/Hive/Pig/HBase/Storm/Kafka/ElasticSearch, etc. He has defined & established data architectures for multi-petabyte warehouses on Hadoop, and has extensive hands-on experience in deploying and tuning very large Hadoop clusters & building scalable applications on them.

Session(s):
Keep your Hadoop Cluster at its Best!
Tuesday, June 28, 2016, 5:50PM - 6:30PM
BALLROOM B
Chris Douglas

Microsoft
Chris Douglas is a research engineer in the Microsoft Cloud and Information Services Lab (CISL). He has contributed to Apache Hadoop since 2007 and is a member of its PMC.

Session(s):
HDFS Tiered Storage
Tuesday, June 28, 2016, 4:10PM - 4:50PM
BALLROOM B
Kamal Duggireddy

Salesforce.com
Kamal Duggireddy is a hands-on technology leader with over 15 years of experience in successfully developing products and applications using transformative technologies while working in engineering and leadership roles at Salesforce.com, Amex and Axway. Kamal currently leads Data Engineering, Product Data Science Team at Salesforce.com. Prior to this, he served as Director - Big Data Architecture at American Express. Combining deep technical skills along with business knowledge and strong execution experience, Kamal developed reference architectures and new enterprise-level capabilities with the Hadoop stack.

Session(s):
LEGO: Data Driven Growth Hacking Powered by Big Data
Tuesday, June 28, 2016, 5:00PM - 5:40PM
210C
Ted Dunning

MapR Technologies
Ted Dunning has been contributing to open source for decades. He likes cool algorithms and plays mandolin poorly, but enthusiastically.

Session(s):
Spark SQL versus Apache Drill: Different Tools with Different Rules
Tuesday, June 28, 2016, 5:50PM - 6:30PM
BALLROOM C

Real-Time Hadoop: Keys for Success from Streams to Queries
Wednesday, June 29, 2016, 12:20PM - 1:00PM
BALLROOM B
Don Bosco Durai

Hortonworks
Don Bosco Durai is an Apache committer and currently working as Security Architect at Hortonworks, focused on enabling enterprise grade security within the Hadoop platform. Bosco brings years of experience building and managing enterprise data security products. Before Hortonworks, Bosco was the co-founder and Chief Security Architect of the big data security startup XA Secure. XA Secure was built from the ground up to address the unique security challenges that big data environments bring. XA Secure was subsequently acquired by Hortonworks.

Session(s):
Fine-Grained Security for Spark and Hive
Wednesday, June 29, 2016, 4:10PM - 4:50PM
210A
Hemananthan Duraiswamy

Hortonworks
Heman Duraiswamy is currently working as a Solutions Engineer for Hortonworks, helping customers at various levels of expertise to kickstart their Hadoop journey. Before Hortonworks, Heman spent significant time at Orbitz contributing to building their Hadoop ecosystem from the ground up, specifically championing the data activation part of the orbitz.com website. His specific area of expertise is in providing real-time actionable insights by combining streaming data with data at rest.

Session(s):
Zero Downtime App Deployment Using Hadoop
Thursday, June 30, 2016, 3:00PM - 3:40PM
BALLROOM C
Brian Durkin

Progressive
Brian Durkin is an innovation strategist in Progressive's Enterprise Architecture Organization. Throughout his eleven years at Progressive he has played many roles, ranging from application developer to enterprise architecture consultant; the common thread being a passion for making data more useful. He is currently part of the product research and development team focusing on geospatial analytics for usage based insurance where he uses technology to power data exploration, ideation, and rapid hypothesis testing on big datasets.

Session(s):
Knowledge from Noise: Geospatial Analytics at Progressive Insurance
Tuesday, June 28, 2016, 12:20PM - 1:00PM
211
Jon Eagles

Yahoo! Inc.
Jon Eagles is a Principal Software Engineer at Yahoo and an Apache PMC member for Hadoop and Tez at The Apache Software Foundation. He has been working in the open source community on Hadoop since 2011 and working with Hadoop since 2009. He is currently on the development team rolling Apache Tez out at Yahoo.

Session(s):
Yahoo’s Experience Running Pig on Tez at Scale
Thursday, June 30, 2016, 11:30AM - 12:10PM
230A
Joey Echeverria

Rocana
Joey Echeverria is the platform technical lead at Rocana, where he builds applications for scaling IT operations built on the Apache Hadoop platform. Joey is a committer on the Kite SDK, an Apache-licensed data API for the Hadoop ecosystem. Joey was previously a software engineer at Cloudera, where he contributed to several ASF projects including Apache Flume, Apache Sqoop, Apache Hadoop, and Apache HBase. Joey is also a coauthor of Hadoop Security, published by O'Reilly Media.

Session(s):
Embeddable Data Transformation for Real-Time Streams
Thursday, June 30, 2016, 12:20PM - 1:00PM
230C
Chris Eidler

HPE
Jay Etchings

Arizona State University
Jay Etchings is a well-known industry professional with 20 years of progressively versatile, cross-platform experience in management of open systems architecture. As both a health care professional and life sciences consultant, his focus is targeted toward emerging data management strategies, performance computing enhancements and massively scalable, parallel processing of large data sets. Currently, Mr. Etchings' group is tightly coupled with ASU teams providing the foundation for new frontiers in personalized medicine through genomic analysis. Arizona State University is ranked among the Top 25 research institutes in the U.S. in terms of research output, innovation, development, research expenditures, and patents.

Session(s):
Statistical Analysis of Genomic Data with Hadoop
Wednesday, June 29, 2016, 4:10PM - 4:50PM
230A
Stephan Ewen

data Artisans
Stephan Ewen is PMC member of Apache Flink and co-founder and CTO of data Artisans. Before founding data Artisans, Stephan was leading the development of Flink since the early days of the project. Stephan has a PhD in Computer Science from TU Berlin, and has been with IBM Research and Microsoft Research in the course of several internships.

Session(s):
Turning the Stream Processor into a Database: Building Online Applications on Streams
Thursday, June 30, 2016, 12:20PM - 1:00PM
212
Michael Fagan

Comcast
Michael Fagan is the Big Data Architect at Comcast. He has over 20 years experience delivering large distributed systems for government and commercial entities.

Session(s):
Managing a Large Multi-tenant Data Lake
Wednesday, June 29, 2016, 5:50PM - 6:30PM
BALLROOM A
Reza Farivar

Capital One
Reza Farivar is a Data Engineering Manager at Capital One, where he works on Big Data / Fast Data Cloud Computing platforms. Before joining Capital One, he was a senior software engineer at Yahoo working on Big/Fast Data platforms such as Apache Storm and Spark. He completed both his PhD and postdoctoral work at the University of Illinois at Urbana-Champaign, with his research focusing on Big Data and Cloud platforms, programming models and the application of these technologies in diverse domains including finance, machine learning and bioinformatics. He holds a special interest in the application of specialized hardware accelerators such as GPUs in big data computing platforms. He is also an adjunct research assistant professor at the Computer Science Department of the University of Illinois, where he has been involved in developing and teaching full-semester courses, short classes and online courses (including on the Coursera.org website) on Cloud computing, big data and operating systems since 2011.

Session(s):
Performance Comparison of Streaming Big Data Platforms
Wednesday, June 29, 2016, 12:20PM - 1:00PM
210C
Andy Feng

Yahoo!
Andy Feng is a VP Architecture at Yahoo leading the architecture and design of big data and machine learning initiatives. He architected major platforms for personalization, ads serving, NoSQL, and cloud infrastructure. Andy is a PPMC member and committer of the Apache Storm project and a contributor to the Apache Spark project. He served as a track chair and program committee member at Hadoop Summit and Spark Summit. Prior to Yahoo, he was a Chief Architect at AOL/Netscape, and Principal Scientist at Xerox.

Session(s):
Distributed Deep Learning on Hadoop Clusters
Thursday, June 30, 2016, 12:20PM - 1:00PM
BALLROOM C
Alejandro Fernandez

Hortonworks
Alejandro Fernandez has been a PMC for the Apache Ambari project since 2014 and is a Hortonworks developer. He has participated in several Ambari events such as hackathons, meetups, and presented at the Global Big Data Conference. He's a primary contributor to Ambari features like Rolling and Express Upgrade and graduated from Carnegie Mellon University, where he got his Bachelor of Science in Computer Science and additional major in Mathematics.

Session(s):
Apache Ambari – Simplified Cluster Operation and Troubleshooting (including demo)
Tuesday, June 28, 2016, 12:20PM - 1:00PM
230A
Russell Foltz-Smith

RFS Productions
Russell Foltz-Smith is an emerging person whose last 17 years involved making hundreds of millions of dollars for various enterprises while building up a deep reservoir of unanswered fundamental questions. Last year he took leave of corporate life to begin answering, or asking anew, these questions as an artist and philosopher of data. He operates out of a constrained studio without an Internet connection in Santa Monica, CA.

Session(s):
What is Data? And What Are You Doing?
Tuesday, June 28, 2016, 2:10PM - 2:50PM
BALLROOM C
Christopher Fregly

PipelineIO
Chris Fregly is a Principal Data Solutions Engineer for the newly-formed IBM Spark Technology Center, an Apache Spark Contributor, and a Netflix Open Source Committer. Chris is also the founder of the global Advanced Apache Spark Meetup and author of the upcoming book, Advanced Spark @ advancedspark.com. Previously, Chris was a Data Solutions Engineer at Databricks and a Streaming Data Engineer at Netflix. When Chris isn’t contributing to Spark and other open source projects, he’s creating book chapters, slides, and demos to share knowledge with his peers at meetups and conferences throughout the world.

Session(s):
Real-time, Streaming Advanced Analytics, Approximations, and Recommendations using Apache Spark ML/GraphX, Kafka, Stanford CoreNLP, and Twitter Algebird
Thursday, June 30, 2016, 3:00PM - 3:40PM
210A
Jonathan Fritz

Amazon Web Services
Jonathan Fritz is the Senior Product Manager for Amazon EMR, a managed service for distributed big data projects on the Amazon Web Services cloud. Prior to joining Amazon Web Services, he was the Founder and CEO of Eleven Media Group and performed research in organic chemistry and nanotechnology at Washington University in St. Louis. Jonathan has an MBA from the Stanford Graduate School of Business and a Bachelor's degree from Washington University in St. Louis in chemistry with a minor in biology.

Session(s):
HIPAA Compliance in the Public Cloud
Thursday, June 30, 2016, 12:20PM - 1:00PM
210C
Adam Fuchs

Sqrrl Data, Inc.
As the Chief Technology Officer and co-founder of Sqrrl, Adam Fuchs is responsible for ensuring that Sqrrl is leading the world in Big Data Infrastructure technology. Previously at the National Security Agency, Adam was an innovator and technical director for several database projects, handling some of the world’s largest and most diverse data sets. He is a co-founder of the Apache Accumulo project. Adam has a BS in Computer Science from the University of Washington and has completed extensive graduate-level course work at the University of Maryland.

Session(s):
Near Real-time Outlier Detection and Interpretation
Tuesday, June 28, 2016, 12:20PM - 1:00PM
210A
Ryohei Fujimaki

NEC
Ryohei Fujimaki (Ph.D.) received an MS degree in aerospace engineering from the University of Tokyo in 2006 and a Ph.D. in 2010. He became the youngest research fellow in the history of NEC Labs due to his business and R&D contributions, and is leading advanced analytics teams in the US, Japan and China to develop global leading-edge technologies and business solutions. He has published papers in top conferences such as KDD, ICML, and NIPS, and has developed many predictive analysis solutions with clients. He is a recipient of the Advanced Technology award 2015 in Japan. http://www.nec.com/en/global/rd/crl/datamining/members/profile_fujimaki.html

Session(s):
Big Data Heterogeneous Mixture Learning on Spark
Thursday, June 30, 2016, 12:20PM - 1:00PM
230A
Christopher Gambino

Hortonworks
Chris Gambino comes from a manufacturing history. Working at several small companies around the Boston area he specialized in customized IoT applications for medical devices. The culmination of these log analysis efforts was a 10% yield improvement for cancer detecting assays. He made the jump to Hortonworks in late 2015 and immediately started applying the analytics from his past to big data architectures.

Session(s):
Building a Smarter Home with Nifi and Spark
Tuesday, June 28, 2016, 2:10PM - 2:50PM
210C
Uma Maheswara Rao Gangumalla

Intel
Uma Maheswara Rao G is an Apache Hadoop committer and member of the Apache Hadoop Project Management Committee (PMC). He is a long-term active contributor to the Apache Hadoop project. He is also a PMC member for the Apache BookKeeper project. He is a Senior Software Engineer at Intel and is primarily responsible for Apache HDFS open source development from Intel.

Session(s):
Debunking the Myths of HDFS Erasure Coding Performance
Wednesday, June 29, 2016, 12:20PM - 1:00PM
BALLROOM C
Alan Gates

Hortonworks
Alan is a founder of Hortonworks and an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan has done extensive work in Hive, including adding ACID transactions. Alan has a BS in Mathematics from Oregon State University and a MA in Theology from Fuller Theological Seminary. He is also the author of Programming Pig, a book from O’Reilly Press.

Session(s):
Apache Hive 2.0 SQL Speed Scale
Wednesday, June 29, 2016, 3:00PM - 3:40PM
BALLROOM A
Adam Gibson

Skymind
Adam Gibson is the cofounder of Skymind and creator of several open-source libraries including Deeplearning4j (distributed deep learning on the JVM) and ND4J (scientific computing on the JVM), which has a DSL in Scala for different matrix libs (GPUs, native). He is also the author of "Deep Learning: A Practitioner's Guide" by O'Reilly (forthcoming summer 2015). Adam has a strong track record of doing big data ranging from teaching at the (now acquired) Zipfian Academy to working on terabyte-or-more search indexes. Adam studied CS at Michigan Tech and is an advisor to the data science Masters program at GalvanizeU.

Session(s):
Deep Learning Using DL4J and Spark on HDP for Fun and Profit
Tuesday, June 28, 2016, 3:00PM - 3:40PM
212
Scott Gnau

Hortonworks
Scott has spent his entire career in the data industry, most recently as president of Teradata Labs where he provided visionary direction for research, development and sales support activities related to Teradata integrated data warehousing, big data analytics, and associated solutions. He also drove the investments and acquisitions in Teradata’s technology related to the solutions from Teradata Labs. Scott holds a BSEE from Drexel University.

Session(s):
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: Today's ETL Does it All!
Tuesday, June 28, 2016, 4:10PM - 4:50PM
230C
P. Taylor Goetz

Hortonworks
P. Taylor Goetz is the Apache Storm PMC Chair and Streaming Data Tech Lead at Hortonworks. Taylor has over 20 years of software development expertise in projects including financial transaction management, transportation logistics, DoD Command and Control systems, and Master Data Management (MDM). Taylor is an ASF Member, and as a member of the Apache Incubator PMC mentors a number of incubating projects.

Session(s):
From Device to Data Center to Insights: Architectural Considerations for the Internet of Anything
Wednesday, June 29, 2016, 5:50PM - 6:30PM
230A

The Future of Apache Storm
Thursday, June 30, 2016, 11:30AM - 12:10PM
210C
Prashant Gokhale

Salesforce.com
Prashant is currently working on solving big data problems at Salesforce.com using Hadoop and its ecosystem components. Prior to this he held several critical engineering positions at Yahoo, Cloudera & Lookout.

Session(s):
LEGO: Data Driven Growth Hacking Powered by Big Data
Tuesday, June 28, 2016, 5:00PM - 5:40PM
210C
Alex Gorelik

Waterline Data
Alex has spent his career inventing cutting edge data-oriented technology and bringing it to market. Prior to Waterline Data, Alex served as GM of Informatica's Data Quality Business Unit and SVP of R&D for Core Technology. Previously, Alex was an IBM Distinguished Engineer for their Information Integration team. IBM acquired Alex's second startup, Exeros, where he was founder, CTO and VP of Engineering. Alex also co-founded Acta Technology (acquired by Business Objects). Before that, Alex managed development of Replication Server at Sybase and worked on Sybase's strategy for enterprise application integration (EAI).

Session(s):
How to Build a Successful Data Lake
Tuesday, June 28, 2016, 5:00PM - 5:40PM
230C
Raman Goyal

Expedia
I have been working on big data for a couple of years and am currently shipping code for the Expedia Data Platform team in India. I work here on various tools for cluster capacity forecasting and cluster performance enhancements.

Session(s):
Hdfs Analysis for Small Files
Tuesday, June 28, 2016, 2:10PM - 2:50PM
211
Scott Gray

IBM
Scott Gray is the lead architect for IBM Open Platform with Apache Hadoop, the heart of IBM's Hadoop distribution, BigInsights, and was previously a senior architect for IBM Big SQL for Hadoop. Scott has an extensive career in the software industry focusing heavily on relational database architecture, design, optimization and internals. Prior to working with IBM, Scott was the chief architect for ANTs Software’s SQL Skin for Sybase, a real time Sybase T-SQL to IBM SQL PL translation engine.
Jonathan Gray

Cask
Jonathan Gray is an entrepreneur and software engineer with a background in startups, open source, and all things data. As an early member of the Hadoop community and long-time HBase committer, Jonathan has been building real-time applications on Hadoop for 8 years. Prior to founding Cask, Jonathan was a software engineer at Facebook where he helped drive real-time Hadoop and HBase engineering efforts, including Facebook Messages, Puma and several other large-scale projects, from inception to production. Jonathan previously founded Streamy.com where in 2008 he ported the entire backend from PostgreSQL to HBase and lived to tell about it.

Session(s):
The DAP: Where Yarn, HBase, Kafka and Spark go to Production
Thursday, June 30, 2016, 4:10PM - 4:50PM
211
Vaibhav Gumashta

Hortonworks
Vaibhav works on the Hive team at Hortonworks. In the past he has contributed to the HiveServer2 and Metastore components, and is currently working on implementing the ACID functionality of Hive using HBase Metastore.

Session(s):
Hive Hbase Metastore - Improving Hive with a Big Data Metadata Storage
Wednesday, June 29, 2016, 4:10PM - 4:50PM
210C
Amit Hadke

Dremio
Amit Hadke is a principal software engineer at Dremio and a contributor to the open source Apache Drill project. He previously worked on the data infrastructure teams at Nextdoor and Twitter. Prior to Twitter, he was a founding engineer at MapR, where he was responsible for building the company's distributed file system and NoSQL database. He holds an MS in Computer Science from the University of California, Davis and a BS in Computer Science from University of Pune, India.

Session(s):
The Columnar Era: Leveraging Parquet and Kudu for High-Performance Analytics
Thursday, June 30, 2016, 2:10PM - 2:50PM
210A
Oliver Halter

PwC
Oliver Halter is a principal at PwC, focusing on information strategy and creation of value from information. He has more than 23 years of experience in information technology, systems development, and business consulting. Oliver is part of PwC’s Analytics practice and is responsible for PwC’s offerings in Information Strategy and Big Data Architectures and Analytics.

Session(s):
Future of Apache Hadoop – An Enterprise Architecture View
Wednesday, June 29, 2016, 3:00PM - 3:40PM
230C
Ben Hammersley

Ray Harrison

Comcast
Ray is a core team member of Comcast's MELD (Massive Event Level Data) Hadoop platform delivering a full stack multi-tenant ecosystem for real-time, batch and analytics use cases across a broad spectrum of event and other customer and machine data. Ray brings 22 years of data-centric telecommunications and startup experience and enjoys spending time outdoors and with his family in Denver, Colorado.

Session(s):
Managing a Large Multi-tenant Data Lake
Wednesday, June 29, 2016, 5:50PM - 6:30PM
BALLROOM A
Jian He

Hortonworks
Jian He is an Apache Hadoop PMC member and committer. He is currently a member of technical staff at Hortonworks, where he works on Apache Hadoop YARN and MapReduce projects. Prior to joining Hortonworks, he received a Masters degree in Computer Science from Brown University.

Session(s):
Debugging YARN Cluster in Production
Thursday, June 30, 2016, 2:10PM - 2:50PM
211
Chris Herrera

Schlumberger
Chris Herrera is the Program Manager for Real Time Well Construction Software in Schlumberger with a team of engineers and developers that focus on supporting drilling and reservoir characterization technologies across every phase of well construction execution. Mr. Herrera has held a number of roles with Schlumberger including SIS Services Lead Architect focusing on Corporate Data Management for E&P, lead developer of a real time E&P Aggregation system, Real Time Operations Manager for Drilling and Measurements Alaska, and spent time as a Field Engineer in Alaska. Chris holds an Electrical Engineering degree from Auburn University.

Session(s):
From Zero to Data Flow in Hours with Apache NiFi
Tuesday, June 28, 2016, 11:30AM - 12:10PM
BALLROOM C
Mark Holderbaugh

Yahoo!
Krisztian Horvath

Hortonworks
Krisztian Horvath is a Sr. Member of Technical staff, committer on Cloudbreak, contributor on various projects (Ambari, Flume, Yarn, Hadoop) and lead developer/architect of the autoscaling feature. He was part of the core founder and technical team at SequenceIQ as well.

Session(s):
Cloudbreak - Internals Deep Dive
Tuesday, June 28, 2016, 11:30AM - 12:10PM
210C
Steve Howard

EXPRESS

Session(s):
Customer Journey - Sentiment Analysis for Fashion Retail
Thursday, June 30, 2016, 3:00PM - 3:40PM
230A
Robert Hryniewicz

Hortonworks
Robert Hryniewicz is known for quickly prototyping products spanning both Big Data & Data Science projects. Most recently, he developed a Graph Analytics platform for TiVo. Before Robert got involved with Big Data, he was a CTO at Authentise where he developed and launched a secure cloud for streaming 3D print designs.

Session(s):
Building a Graph Database in Neo4j with Spark & Spark SQL to Gain New Insights from Log Data
Tuesday, June 28, 2016, 12:20PM - 1:00PM
230C
Timothy Hunter

Databricks
Timothy Hunter is a software engineer working on machine learning at Databricks. He received his Ph.D. in Machine Learning from UC Berkeley in 2014. His research focused on large-scale state estimation of cyberphysical systems, and sparse covariance inference.

Session(s):
Combining Machine Learning Frameworks with Apache Spark
Thursday, June 30, 2016, 2:10PM - 2:50PM
BALLROOM A
Julian Hyde

Hortonworks
Julian Hyde is an expert in query optimization, in-memory analytics, and streaming. He is PMC chair of Apache Calcite, the query planning framework behind Hive, Drill, Kylin and Phoenix. He was the original developer of the Mondrian OLAP engine. He is an architect at Hortonworks.

Session(s):
Querying the Internet of Things: Streaming SQL on Kafka/Samza and Storm/Trident
Wednesday, June 29, 2016, 11:30AM - 12:10PM
230A

How We Re-Engineered Phoenix with a Cost-Based Optimizer Based on Calcite
Thursday, June 30, 2016, 12:20PM - 1:00PM
BALLROOM B
Pramod Immaneni

DataTorrent
Pramod is an Apache Apex PMC member and senior architect at DataTorrent, where he works on Apex and specializes in big data applications. Prior to DataTorrent he was a co-founder and CTO of Leaf Networks LLC, eventually acquired by Netgear Inc, where he built products in the core networking space and was granted patents in peer-to-peer VPNs.

Session(s):
Next Gen Big Data Analytics with Apache Apex
Thursday, June 30, 2016, 3:00PM - 3:40PM
212
Mario Inchiosa

Microsoft
Mario’s passion for data science and scalable computing drives his work at Microsoft, where he focuses on machine learning and advanced analytics integrated with the R language. Mario has served in Chief Scientist and Analytics Architect roles at Revolution Analytics, IBM, and Netezza, advancing Hadoop and SQL-based big data analytics. Previously, he was US Chief Science Officer at NuTech Solutions, a computer science consultancy. Dr. Inchiosa holds Bachelors, Masters, and PhD degrees from Harvard University. He has been awarded four patents and has published over 30 research papers, earning Publication of the Year and Open Literature Publication Excellence awards.

Session(s):
Building A Scalable Data Science Platform with R
Thursday, June 30, 2016, 4:10PM - 4:50PM
BALLROOM B
Anand Iyer

Cloudera
Anand Iyer is a senior product manager at Cloudera. His primary areas of focus are platforms for real-time streaming, Apache Spark, and tools for data ingestion into Hadoop. Before joining Cloudera, he worked as an engineer at LinkedIn, where he applied machine learning techniques to improve the relevance and personalization of LinkedIn's Feed. Anand has extensive experience in leveraging big data platforms to deliver products that delight customers. He has a master's in computer science from Stanford and a bachelor's from the University of Arizona.

Session(s):
Ingest and Stream Processing - What Will You Choose?
Thursday, June 30, 2016, 3:00PM - 3:40PM
230C
Virajith Jalaparti

Microsoft
Virajith Jalaparti is a Scientist in the Cloud Information Services Lab at Microsoft. He obtained his Ph.D. in Computer Science from the University of Illinois, Urbana-Champaign (2015). His dissertation was on using cross-layer coordination to improve the end-to-end performance of big data systems.

Session(s):
HDFS Tiered Storage
Tuesday, June 28, 2016, 4:10PM - 4:50PM
BALLROOM B
Carey James

EMC
Gopi Janakiraman

Merck
Director IT/Architecture.

Session(s):
Large Scale Health Telemetry and Analytics with MQTT, Hadoop and Machine Learning DSLs
Tuesday, June 28, 2016, 5:00PM - 5:40PM
212
Rohit Jangid

Expedia
I have been working on big data for a couple of years and am currently shipping code for the Expedia Data Platform team in India. I work here on various tools for cluster capacity forecasting and cluster performance enhancements.

Session(s):
Hdfs Analysis for Small Files
Tuesday, June 28, 2016, 2:10PM - 2:50PM
211
Xiaowei Jiang

Alibaba Inc
Xiaowei Jiang is a Director in Alibaba's Search division. Previously, he was a tech lead in Messenger, Timeline Infra and Core Infra at Facebook. Before that, he was a Principal Software Engineer for SQL Server Engine at Microsoft.

Session(s):
Blink−Improved Runtime for Flink and its Application in Alibaba Search
Wednesday, June 29, 2016, 2:10PM - 2:50PM
210C
Rekha Joshi

Intuit
Rekha Joshi is a Principal Software Engineer at Intuit, and is getting amazing work done on the big data ecosystem. Cloud, security, and performance keep her happily engaged. She has worked in diverse domains of finance, advertising, supply chain and research. She is an open source contributor, an author and a speaker at a few conferences. Her refueling stops include reading Isaac Asimov, Richard Feynman, PG Wodehouse and stalking Elon Musk.

Session(s):
The Evolution of Big Data Pipelines at Intuit
Thursday, June 30, 2016, 2:10PM - 2:50PM
BALLROOM C
Srikanth Kandula

Microsoft
Srikanth Kandula is a Researcher at Microsoft Research. His research interests span some aspects of networked systems including datacenter networks and parallel processing clusters. He has published at top-tier venues including SIGCOMM, NSDI and OSDI. He is a past recipient of the NSDI best student paper award (2005) and a Microsoft Goldstar Award. He obtained his Ph. D. from the Massachusetts Institute of Technology.

Session(s):
GoodFit - An Efficient MRP (Multi-Resource Packing) Allocator for YARN
Thursday, June 30, 2016, 4:10PM - 4:50PM
230A
Kristopher Kane

Hortonworks
Systems architect delivering Hortonworks Data Platform solutions to Hortonworks customers. Previously worked in big data operations and software engineering for the DoD.

Session(s):
Processing and Retrieval of Geotagged Unmanned Aerial System Telemetry
Tuesday, June 28, 2016, 5:50PM - 6:30PM
211
Chris Kang

Accenture
Chris is an R&D Innovation Senior Analyst at Accenture Technology Labs as part of the Systems and Platforms research group. His current work is focused on model management for analytical models in big data streaming architectures. Chris' roles at Accenture have included software developer, data engineer, and, prior to joining Technology Labs, cloud architect with an emphasis on private cloud technologies.

Session(s):
Model Management Demo in the Lambda Architecture
Tuesday, June 28, 2016, 11:30AM - 12:10PM
211
Dheeraj Kapur

Yahoo!
Dheeraj is an engineering lead at Yahoo with 10 years of IT experience, including experience in the areas of cloud computing, advanced system automation and tools design, and system administration. Dheeraj is skilled in managing infrastructure, implementing technology to support large user groups, and effectively managing high-end Hadoop clusters.

Session(s):
Managing Hadoop, HBase, and Storm Clusters at Yahoo Scale
Thursday, June 30, 2016, 2:10PM - 2:50PM
210C
Karthik Karuppaiya

Symantec Corporation
Karthik Karuppaiya is a big data enthusiast who has worked on Hadoop and big data related technologies since 2010. As part of the Cloud Platform Engineering group at Symantec, he helps product teams re-architect and re-engineer legacy systems to move to the cloud. Karthik is responsible for managing a large analytics platform that is used by hundreds of engineers and developers within Symantec and holds more than 50 PB of data on HDFS and Hive. The current platform ingests 70 billion events a day through Kafka and Storm and eventually writes them to Hive.

Session(s):
On-Demand HDP Clusters Using Cloudbreak and Ambari
Tuesday, June 28, 2016, 3:00PM - 3:40PM
210A
Murali Kaundinya

Merck
Murali is a Director and research engineer with leadership experience conceiving and delivering disruptive and innovative solutions that have been successfully deployed at a large scale in healthcare, financial services, manufacturing and consumer product enterprises. He leads a group focused on innovation engineering with Merck & Co., Inc. in Branchburg, New Jersey, USA.

Session(s):
Large Scale Health Telemetry and Analytics with MQTT, Hadoop and Machine Learning DSLs
Tuesday, June 28, 2016, 5:00PM - 5:40PM
212
Paul Kent

SAS Institute Inc.
Paul Kent is Vice President of Big Data Initiatives at SAS. He spends his time between customers, partners and the R&D teams discussing, promoting and developing software at the confluence of big data and high-performance computing. He was previously Vice President of Platform R&D at SAS and led groups responsible for the SAS® Foundation and midtier technologies – teams that develop, maintain and test Base SAS and its related products. Kent joined SAS in 1984 and has helped develop many SAS software components. A strong customer advocate, he is widely recognized within the SAS community for his active participation.

Session(s):
IoT, Streaming Analytics and Machine Learning: Delivering Real-Time Intelligence With Apache NiFi
Tuesday, June 28, 2016, 5:00PM - 5:40PM
BALLROOM B
Asad Khan

Microsoft
Tanvir Kherada

Hortonworks
Tanvir Kherada works at Hortonworks as a Team Lead on the Technical Support team. He is a subject matter expert for HBase and Phoenix. He has also been working with the core Hadoop framework.

Session(s):
Operating and Supporting Apache HBase - Best Practices and Improvements
Wednesday, June 29, 2016, 5:00PM - 5:40PM
210C
Eric Kienle

Marketo
Eric is Chief Architect at Marketo, a leading Marketing automation SAAS platform. He is focused on creating reliable and scalable architectures for SAAS that run with maximum performance. Prior to Marketo, Eric was Chief Architect and a co-founder of Crowd Factory. Crowd Factory was a social API and marketing company that was acquired by Marketo in 2012.

Session(s):
Successes, Challenges and Pitfalls Migrating a SAAS Business to Hadoop
Tuesday, June 28, 2016, 3:00PM - 3:40PM
BALLROOM C
Shaun Klopfenstein

Marketo
Shaun is CTO of B2C at Marketo, a leading Marketing automation SAAS platform, and Lab Director of Marketo’s office in Portland, Oregon, where he is based. At Marketo, he is focused on scale and performance of their marketing platform. Prior to Marketo, Shaun was CTO and a founder of Crowd Factory. Crowd Factory was a social marketing company, which was acquired by Marketo in 2012.

Session(s):
Successes, Challenges and Pitfalls Migrating a SAAS Business to Hadoop
Tuesday, June 28, 2016, 3:00PM - 3:40PM
BALLROOM C
Kelly Kohlleffel

Hortonworks
Kelly Kohlleffel is responsible for driving the adoption of Hadoop and Advanced Analytics across all industry segments in Oil and Gas and provides leadership on solution strategies and go-to-market initiatives for Hortonworks. Previously, Mr. Kohlleffel was Key Account Director in the Oil and Gas Industry at Oracle Corporation with responsibility for all business lines and geographies leading a multi-product line team. With 26 years experience in technology and 13 in the O&G industry, he has been integral in the establishment of key technology solutions across the industry. Mr. Kohlleffel holds a Bachelor of Business Administration degree from Texas A&M University.

Session(s):
It’s Time: Launching Your Advanced Analytics Program for Success in a Mature Industry Like Oil and Gas
Thursday, June 30, 2016, 12:20PM - 1:00PM
211
Eugene Koifman

Hortonworks
I have been an engineer on the Hive team at Hortonworks for the past 3 years. I’m a committer on the Apache Hive project. Prior to Hortonworks I was a lead Engineer at Composite Software where I was responsible for a federated SQL query engine. Prior to that I’ve held engineering positions at BEA, Oracle and others.

Session(s):
ACID Transactions in Hive
Wednesday, June 29, 2016, 12:20PM - 1:00PM
230A
Kenneth Kranz

Cognizant
Mr. Kranz is a senior Big Data Architect involved in business development, technology evangelism, and solution architecture in the UAS, Banking, and Healthcare sectors. Mr. Kranz is spearheading the emerging UAS market vertical at Cognizant with respect to IoT. Additionally, he is responsible for delivering solutions, best practices, design patterns, and various white papers. Mr. Kranz is an established author, public speaker, patent developer, and a private pilot. He has over 20 years of experience with fixed wing remote controlled aircraft and, more recently, quadcopter operations, and is applying for an FAA Section 333 exemption.

Session(s):
"I'm Being Followed by Drones!" The Impact of IoT on the Future of Unmanned Aerial Systems
Wednesday, June 29, 2016, 3:00PM - 3:40PM
211
Subru Krishnan

Microsoft
Subru is working as a research engineer at Microsoft in the Cloud and Information Services Lab (CISL) currently focusing on YARN, specifically scaling it to 100K+ nodes and providing SLA guarantees. He has been working on the Hadoop ecosystem since 2007. Prior to Microsoft, he worked at Yahoo! where he contributed to Oozie's precursor, near real-time stream processing on Hadoop and HBase replication.

Session(s):
To Infinity and Beyond – Datacenter Scale YARN Clusters through Federation
Wednesday, June 29, 2016, 11:30AM - 12:10PM
BALLROOM B
Sanjeev Kumar

Saama Technologies
As Saama's Practice Area Leader, Insurance, Sanjeev Kumar is responsible for delivering innovative data analytics solutions for the insurance industry at Saama. Sanjeev has been active in identifying and addressing pressing insurance industry trends such as the smart phone replacing agents, the connected car and usage-based-insurance, millennials and instant expectation customers, and implications of the Internet of Things and cybersecurity concerns, which are all transforming the insurance business and the analytics landscape. He strongly believes that today's next generation business intelligence in the form of advanced analytics will revolutionize the insurance industry.

Session(s):
Disrupting Insurance with Advanced Analytics – The Next Generation Carrier How Motorist Leapfrogged into the Future of Analytics and Data
Wednesday, June 29, 2016, 5:50PM - 6:30PM
212
Amit Kumar

Cisco
Amit has over eighteen years of rigorous experience in Distributed Systems and Object Technology, including 8+ years in the role of a Team Lead/Manager/Architect. He started out his career as a C/C++ developer, moving on to Java. He has worked as a Big Data architect at various companies for the last few years, and currently works as a Big Data Architect in the Cisco Intercloud Services (CIS) group.

Session(s):
Preventative Maintenance of Robots in Automotive Industry
Tuesday, June 28, 2016, 2:10PM - 2:50PM
212
Pardeep Kumar

Hortonworks
Pardeep helps enterprises solve their business problems strategically, functionally as well as at scale by using BigData technologies with his strong expertise in the Hadoop ecosystem and very rich field experience. He has extensive knowledge of BigData/NoSql technologies and has been working in this space for the last 4+ years. He has architected solutions for some of the most complex BigData problems and has helped set up some of the largest Hadoop clusters across multiple Fortune500 companies spanning 10s of PBs of data and 1000+ nodes.

Session(s):
Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)
Wednesday, June 29, 2016, 2:10PM - 2:50PM
211
Dhiraj Kumar

Flipkart Internet Pvt. Ltd.
Dhiraj is a member of the Data Platform team at Flipkart. He's instrumental in developing the management tools around infrastructure. He played a key role in developing the migration utilities that helped Flipkart migrate 100s of terabytes of data with zero failures. He has been in software development since 2004; his prior experience includes start-ups like ApnaPaisa.com.
Dhruv Kumar

Hortonworks
Dhruv Kumar is a Solutions Engineer at Hortonworks in the Alliances team and the Spark Subject Matter Expert. Before joining Hortonworks he held software engineering and solutions architect positions at Terracotta, Concurrent and Google. Dhruv holds an M.S. degree in Computer Engineering from University of Massachusetts Amherst and has been programming computers since age 13. His favorite project so far has been to implement a Java Map Reduce version of the Baum Welch HMM training algorithm for Apache Mahout. Dhruv lives in San Francisco and enjoys hiking in his free time.

Session(s):
Deep Learning Using DL4J and Spark on HDP for Fun and Profit
Tuesday, June 28, 2016, 3:00PM - 3:40PM
212
Bryan Lari

MD Anderson Cancer Center
Bryan Lari is currently the Director of the Institutional Analytics and Informatics department at MD Anderson, where he has worked for the last 8 years. At MD Anderson he leads analytics, business intelligence, and data warehousing efforts, as well as the data governance program and efforts around natural language processing and cognitive computing. Prior to MD Anderson, Bryan worked in the energy industry for over 18 years, leading data management, integration, analytics, and data warehousing efforts. Bryan has a bachelor's in Computer Science and an MBA in International Business, and is a Certified Project Management Professional (PMP) and Certified Data Management Professional (CDMP).

Session(s):
HDP @ MD Anderson - Starting the Hadoop Journey at a Global Leader in Cancer Research
Tuesday, June 28, 2016, 5:50PM - 6:30PM
230A
Julien Le Dem

Dremio
Julien is the co-author of Apache Parquet and the PMC Chair of the project. He is also a committer and PMC Member on Apache Pig. Julien is an architect at Dremio, and was previously the Tech Lead for Twitter’s Data Processing tools, where he also obtained a two-character Twitter handle (@J_). Prior to Twitter, Julien was a Principal engineer and tech lead working on Content Platforms at Yahoo! where he received his Hadoop initiation. His French accent makes his talks particularly attractive.

Session(s):
The Columnar Era: Leveraging Parquet and Kudu for High-Performance Analytics
Thursday, June 30, 2016, 2:10PM - 2:50PM
210A
Drew Leamon

Comcast
At Comcast, Drew leads part of the Engineering Analysis organization. His team is developing advanced data visualizations for network data. They are building elastically scaling Big Data infrastructure to support Analytic workloads. Simulations of Comcast’s CDNs and platforms, developed by Drew’s team, are leveraging this platform and guiding the business and engineering teams. His team is identifying high ROI opportunities. They apply machine learning to datasets and are currently operationalizing the resulting predictive models to help improve customer experience.

Session(s):
Self-Service Analytics on Hadoop: Lessons Learned
Wednesday, June 29, 2016, 5:00PM - 5:40PM
BALLROOM C
Sangjin Lee

Twitter Inc.
Sangjin "joined the flock" at Twitter in February 2012, and works on the Core Hadoop team. As a Hadoop committer he has been actively contributing to the Hadoop project. Sangjin jumped ship from physics to the software industry in 1998, and has been working in Silicon Valley ever since. He worked for companies such as Netscape, Siebel Systems, and eBay. Sangjin has a passion for various topics such as distributed systems, concurrent programming, asynchrony, modularity, scalability, and performance.

Session(s):
(Big data) Squared: How YARN Timeline Service v.2 Unlocks 360-Degree Platform Insights at Scale
Thursday, June 30, 2016, 12:20PM - 1:00PM
BALLROOM A
Justin Leet

Hortonworks
Justin Leet is an architect and developer at Hortonworks, helping clients build their big data solutions. He's worked within a variety of teams throughout different fields, including medical informatics and retail. He's experienced with a variety of Hadoop related technologies, including core Hadoop, Hive, Pig, Storm, Kafka, and others.

Session(s):
Tracing Your Security Telemetry With Apache Metron
Wednesday, June 29, 2016, 12:20PM - 1:00PM
211
Reid Levesque

RBC
Reid leads the Solution Engineering team across Risk, Finance, AML, Fraud and Compliance at RBC. He leverages cloud and big data technologies to deliver next-generation analytics solutions across multiple business lines. He has previously worked on large grid-based Monte Carlo simulations in Capital Markets. Reid has a strong development background and received his Bachelor of Mathematics in Computer Science from the University of Waterloo. He is based in Toronto, Ontario.

Session(s):
Beyond TCO: Architecting Hadoop for Adoption and Data Applications
Wednesday, June 29, 2016, 12:20PM - 1:00PM
212
Evan Levy

SAS
Evan Levy is an acknowledged speaker, writer, and consultant in the areas of Enterprise Data Strategy and Data Management. In his current role, Evan advises clients on strategies to address business challenges using data, technology, and creative approaches that align IT with business stakeholders. Business is experiencing exponential growth in data volumes, sources, and systems – Evan offers practical real world experience on addressing these challenges in a manner that utilizes the company’s existing skills, coupled with new methods to ensure IT and business success.

Session(s):
Modernizing Your Company’s Data Ecosystem
Thursday, June 30, 2016, 4:10PM - 4:50PM
BALLROOM C
Hu Liang

eBay
Alex is a skilled and dynamic leader who drives the evolution of eBay's analytical capabilities for its rapidly growing Marketplaces business, where over 100 million active worldwide users transact at a rate of more than $3,500 of goods every second. For the past 11 years at eBay, Alex has been helping the company leverage its unique end-to-end data set and drive continuous innovation to accelerate top-line growth. Alex directs the utilization of a significant investment in large-scale data management and processing resources and technologies while helping to evolve eBay's world-class analytics capabilities, and enables the business goals through partnering with internal customers.

Session(s):
Governed Self Service Analytics at eBay
Tuesday, June 28, 2016, 2:10PM - 2:50PM
230A
Kai Liu

Yahoo! Inc.
A principal engineer and senior engineering manager on the Yahoo Ads and Data team, with expertise in distributed data system design and rich domain knowledge of advertising technology. Currently leading a team building Yahoo's next-generation user profile system, a horizontal platform supporting user modeling, profile serving and audience insights across all Yahoo ad products.

Session(s):
Yahoo!'s Next-Generation User Profile Platform
Wednesday, June 29, 2016, 2:10PM - 2:50PM
230A
Jason Lowe

Yahoo! Inc.
Jason is a distinguished engineer at Yahoo, and works on Hadoop Core (HDFS, MapReduce, YARN, and Tez) in conjunction with the open-source Apache Hadoop project. He is a PMC member and Committer for Apache Hadoop and Apache Tez.

Session(s):
Investigating the Effects of Over Committing YARN Resources
Tuesday, June 28, 2016, 3:00PM - 3:40PM
210C
Li Lu

Hortonworks
Li is working on YARN at Hortonworks. Prior to joining Hortonworks, he earned his PhD and master's degree in computer science from the University of Rochester and his undergraduate degree from Tsinghua University. He is mainly interested in the scalability, programmability, and synchronization problems of distributed systems. He has experience working on the cluster management systems at Google and datacenter storage systems at Microsoft Research. He is the lead author of several top-tier academic conference papers. He is also a core contributor to two open source projects: the popular BBS terminal software Welly and the Deterministic Parallel Ruby language.

Session(s):
(Big data) Squared: How YARN Timeline Service v.2 Unlocks 360-Degree Platform Insights at Scale
Thursday, June 30, 2016, 12:20PM - 1:00PM
BALLROOM A
Maxim Lukiyanov

Microsoft
Maxim Lukiyanov is a Program Manager on the Big Data team at Microsoft. He is responsible for Spark in the Azure HDInsight service. Maxim joined the HDInsight team in 2013 and has been with Microsoft for 9 years.

Session(s):
Open Source Ingredients for Interactive Data Analysis in Spark
Tuesday, June 28, 2016, 11:30AM - 12:10PM
210A
Jayush Luniya

Hortonworks
Jayush Luniya is a Software developer on the Ambari team at Hortonworks. He is an active Apache Ambari committer and Apache Ambari PMC member. He has made significant contributions to Ambari stack orchestration, Ambari upgrade framework, Ambari management packs and enabling Ambari support on Windows. At present, he is actively involved in execution of long-term architectural advancements and enabling new capabilities in Apache Ambari. Prior to joining Hortonworks, he has worked for several years at Microsoft on Windows, Bing and Azure products.

Session(s):
Apache Ambari – Simplified Cluster Operation and Troubleshooting (including demo)
Tuesday, June 28, 2016, 12:20PM - 1:00PM
230A
Ming Ma

Twitter
Ming Ma is a Hadoop committer and software engineer at Twitter, where he works on improving Hadoop reliability, operability and scalability. Prior to Twitter, he spent 3 years at eBay and 11 years at Microsoft in developer and manager roles working on Hadoop, Bing and Windows OS.

Session(s):
Cross-DC Fault-Tolerant ViewFileSystem at Twitter
Thursday, June 30, 2016, 11:30AM - 12:10PM
BALLROOM B
Vivek Madani

Symantec Corporation
Vivek is a Sr. Principal Software Engineer working for the Cloud Platform Engineering group at Symantec. He is a big data enthusiast, working on Hadoop and big data related technologies. As part of the Cloud Platform Engineering group at Symantec, he is helping product teams architect big-data applications using Hadoop and family, Storm, and Kafka. He currently focuses on the real time streaming platform that processes 10s of billions of events a day.

Session(s):
On-Demand HDP Clusters Using Cloudbreak and Ambari
Tuesday, June 28, 2016, 3:00PM - 3:40PM
210A
Kanishk Mahajan

Hortonworks
Kanishk has Product Management responsibility for Big Data Stream Processing technologies as well as IOT Security and Analytics at Hortonworks.

Session(s):
Make Streaming Analytics Work For You: The Devil is in the Details
Wednesday, June 29, 2016, 5:00PM - 5:40PM
210A
Manasi Maheshwari

Hortonworks
She is presenting at the Women in Big Data lunch.
Brian Majeska

YP
Brian manages YP's Platform Data Services Operations team. His team is responsible for YP's multi-petabyte Hadoop, HBase, and Vertica clusters.

Session(s):
Meeting Performance Goals in Multi-tenant Hadoop Clusters
Thursday, June 30, 2016, 11:30AM - 12:10PM
230C
James Malone

Google
I love data because it surrounds us - everything is data. I also love open source software, because it shows what is possible when people come together to solve common problems with technology. While they are awesome on their own, I am passionate about combining the power of open source software with the potential unlimited uses of data. That's why I joined Google. I am a product manager for Google Cloud Platform and manage Cloud Dataproc and Apache Beam (incubating). I've previously spent time hanging out at Disney and Amazon. Beyond loving data I like photography, running and Legos.

Session(s):
The Next Generation of Data Processing & OSS
Wednesday, June 29, 2016, 2:10PM - 2:50PM
BALLROOM A
Keith Manthey

EMC
Keith is the CTO of Analytics for EMC's Emerging Technology Division. He brings more than 24 years of Identity Fraud Analytics experience, alternative and traditional data architectures experience, and Financial Systems and Analytics experience. Keith is an advisory board member of the University of Georgia's Management of Information Systems School.

Session(s):
Building a Data Analytics PaaS for Smart Cities
Wednesday, June 29, 2016, 4:10PM - 4:50PM
212
Janos Matyas

Hortonworks
Janos Matyas is a Sr. Dir. of Engineering at Hortonworks and an Apache Ambari and Cloudbreak committer. He was the founder and CTO of SequenceIQ, a startup acquired by Hortonworks. Before Hortonworks he was acting as a Solution Architect on various large scale distributed projects. His main passion is surfing/windsurfing big waves.

Session(s):
Cloudbreak - Internals Deep Dive
Tuesday, June 28, 2016, 11:30AM - 12:10PM
210C
Sotos Matzanas

Trulia
Sotos Matzanas is a Senior Big Data Engineer at Trulia. He works on Trulia’s Personalization team, developing technology for delivering relevant and quality content for users in near real time. He holds a Master's degree in Computer Science from the University of California Santa Barbara.

Session(s):
Lambda Architecture: How we Merged Batch and Real-Time
Wednesday, June 29, 2016, 2:10PM - 2:50PM
212
Ancil McBarnett

Hortonworks
Ancil McBarnett currently works as a Solutions Engineer for Hortonworks, helping different customers kickstart their Hadoop journey. He has worked with data in Justice and Public Safety. Prior to Hortonworks he was the Architect Manager for a state agency responsible for the sharing of secure and sensitive data among first responder and justice systems. Since joining Hortonworks he has worked mainly with Health providers who are looking to utilize Hadoop as the ideal platform to store and analyze secure data and to create modern data applications.

Session(s):
Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)
Wednesday, June 29, 2016, 2:10PM - 2:50PM
211
Ryan Medlin

Neustar

Session(s):
Make Streaming Analytics Work For You: The Devil is in the Details
Wednesday, June 29, 2016, 5:00PM - 5:40PM
210A
Michael Miklavcic

Hortonworks
Michael is a software engineer with over ten years of industry experience and has been a Systems Architect with Hortonworks for the past two years. He has given numerous talks both on the domestic and international stage, most recently at ApacheCon Big Data Europe. He is a code contributor to Apache open source projects and works directly with clients to implement solutions using Hadoop. For over 2 years he has guided many Big Data and Hadoop projects at large enterprises to success. Michael has degrees in computer science and computer information systems from Baldwin Wallace in Cleveland, OH.

Session(s):
Scalable Optical Character Recognition with Apache NiFi and Tesseract
Wednesday, June 29, 2016, 5:50PM - 6:30PM
210C
Paul Miller

Oracle
Paul manages Oracle's Big Data and Advanced Analytics North American Tiger Team, which leads Big Data and Advanced Analytics transformative prototypes and provides deep expertise. In this role, he is responsible for ensuring Oracle continually delivers on its customers' strategic vision and business goals using Big Data and Advanced Analytic capabilities. His team works with every part of a customer's solution at a business, application, information and technology architecture level. In short, the team owns Big Data thought leadership and focuses on staying on the leading edge.

Session(s):
Extending Hortonworks with Oracle’s Big Data Platform
Wednesday, June 29, 2016, 11:30AM - 12:10PM
BALLROOM C
Saurabh Mishra

Hortonworks
Saurabh is a Solution Architect with strong expertise in the Hadoop ecosystem and rich field experience. He helps large to small enterprises solve their business problems strategically, functionally and at scale by leveraging BigData technologies. Saurabh has over 10 years of strong IT experience and has served in key positions as Lead Big Data Architect, Performance Architect, and Technology Architect in multiple large and complex enterprise programs. He has extensive knowledge of BigData/NoSql technologies including Hadoop, Yarn, Spark, Hbase, Hive, Pig etc. and has been working in this space for the last 5+ years.

Session(s):
End-to-End Processing of 3.7 Million Telemetry Events per Second Using Lambda Architecture
Tuesday, June 28, 2016, 3:00PM - 3:40PM
211
Tyler Mitchell

Actian Corporation
Tyler Mitchell, Senior Software Development Engineer, joined Actian in 2011. Tyler is an accomplished data and Internet technologist with 20 years of experience. He is a prolific writer in the data space, in both traditional media and online. He shares his love of R&D, open source, Big Data management, cloud architectures, and geospatial knowledge via his blogs at spatialguru.com and makedatauseful.com.

Session(s):
Solving Performance Problems on Hadoop to Move Data Analytics Workloads into Production
Tuesday, June 28, 2016, 3:00PM - 3:40PM
230C
Abhishek Modi

Qubole Inc.
Abhishek Modi works on the Hadoop and YARN stack at Qubole. He has worked on key features in YARN like the auto scaling framework and balancing of spot nodes in a cluster. He graduated from IIT-Varanasi and has previously worked with Adobe Systems. Abhishek filed multiple patents during his tenure at Adobe.

Session(s):
Operationalizing YARN Based Hadoop Clusters in the Cloud - Lessons and Opportunities
Tuesday, June 28, 2016, 11:30AM - 12:10PM
230C
Peter Monaco

Yahoo!
Hima Mukkamala

General Electric
Himagiri (Hima) Mukkamala is Executive Director of Predix Engineering at GE Digital. He is responsible for leading the Predix engineering organization to deliver high quality software iteratively using Agile & Scrum principles. Predix is GE’s cloud-based platform for Industrial Internet and includes horizontal components and application frameworks for building customer facing solutions. In his previous role at Sybase, Hima led various groups in the Mobile Platform organization and played a significant part in the successful acquisition by SAP.
Raghavendra Nandagopal

Symantec Corporation
Raghavendra Nandagopal is a Cloud Data Service Architect working at Symantec. He has extensive experience working on Big Data technologies and providing Analytics as a Service. He has contributed to some of the security features of Apache Storm.

Session(s):
End-to-End Processing of 3.7 Million Telemetry Events per Second Using Lambda Architecture
Tuesday, June 28, 2016, 3:00PM - 3:40PM
211
Ranga Nathan

HPE
Steve Tramack is a senior engineering manager in the Alliances, Performance and Solutions Engineering organization within HPE's Converged Data Infrastructure BU. He is responsible for Big Data Reference Architecture development and certification. Tramack's team also focuses on developing performance and best practices collateral for Hadoop solutions. With more than 26 years of industry experience, Tramack is a frequent featured speaker at technology conferences, and writes frequently about trends for Big Data.

Session(s):
A New "Sparkitecture" for Modernizing Your Data Warehouse
Wednesday, June 29, 2016, 11:30AM - 12:10PM
210C
Chris Nauroth

Hortonworks
Chris Nauroth is a software engineer on the HDFS team at Hortonworks. He is an Apache Hadoop committer and PMC member, an Apache ZooKeeper committer, an Apache Yetus committer and PMC member, an Apache Incubator PMC member and an Apache Software Foundation member. His most significant contributions include HDFS operational improvements, Hadoop Windows compatibility and HDFS ACLs. He also helped shepherd the Apache contribution of WASB, a Hadoop-compatible file system backed by Azure Storage. Prior to joining Hortonworks, Chris deployed and maintained Disney's Hadoop infrastructure and developed web services and MapReduce jobs.

Session(s):
Keep your Hadoop Cluster at its Best!
Tuesday, June 28, 2016, 5:50PM - 6:30PM
BALLROOM B

HDFS: Optimization, Stabilization and Supportability
Tuesday, June 28, 2016, 3:00PM - 3:40PM
BALLROOM B

Hadoop & Cloud Storage: Object Store Integration in Production
Thursday, June 30, 2016, 11:30AM - 12:10PM
212
Joseph Niemiec

Hortonworks
Co-Author of the YARN Book, Joseph is currently working with the Big 3 automotive companies to implement their Connected Vehicle Platform on Hadoop/Streaming Techs. He has been personally working with Hadoop, from R&D to production implementations, for the last 4+ years with major Fortune 100 companies.

Session(s):
Building a Smarter Home with Nifi and Spark
Tuesday, June 28, 2016, 2:10PM - 2:50PM
210C
Kopal Niranjan

Expedia
Kopal Niranjan is a Software Engineer at Expedia. She recently joined the EDW big data team but quickly added some features in Sqoop which helped teams in Expedia. She is also working on some new products where she will provide solutions for collecting logs with minimum data loss.

Session(s):
Bridging the Gap of Relational to Hadoop Using Sqoop @ Expedia
Tuesday, June 28, 2016, 5:00PM - 5:40PM
211
Lu Niu

Yahoo! Inc.
Software engineer on the Yahoo Ads and Data team, with expertise in real-time data processing and performance tuning.

Session(s):
Yahoo!'s Next-Generation User Profile Platform
Wednesday, June 29, 2016, 2:10PM - 2:50PM
230A
Kyle Nusbaum

Yahoo!
Kyle Nusbaum has been working at Yahoo full time on the low-latency team for over a year, and worked part time as an intern for two years prior to that. He is an Apache Storm committer and PMC member and enjoys working with distributed systems of all types.

Session(s):
Performance Comparison of Streaming Big Data Platforms
Wednesday, June 29, 2016, 12:20PM - 1:00PM
210C
Will Ochandarena

MapR Technologies
Will Ochandarena is Director of Product Management at MapR, responsible for user experience and cloud. Prior to MapR, Will spent some time in the SeaMicro group at AMD, responsible for networking and cloud strategy, and before that was a product manager for the Nexus family of data center switches at Cisco. Will has an engineering degree from Rensselaer Polytechnic Institute, and an MBA from Santa Clara University.

Session(s):
The Stream is the Database - Revolutionizing Healthcare Data Architecture
Wednesday, June 29, 2016, 5:00PM - 5:40PM
212
Owen O'Malley

Hortonworks
Owen O'Malley is a co-founder and technical fellow at Hortonworks. Owen has been working on Hadoop since the beginning of 2006 at Yahoo, and was the first committer added to the project. He used Hadoop to set the Gray sort benchmark in 2008 and 2009. In the last 10 years, he has been the architect of MapReduce, Security, and now Hive. Recently he has been driving the development of the ORC file format and adding ACID transactions to Hive.

Session(s):
File Format Benchmark - Avro, JSON, ORC, and Parquet
Tuesday, June 28, 2016, 5:00PM - 5:40PM
BALLROOM A
Rohini Palaniswamy

Yahoo! Inc.
Rohini is a Principal Engineer at Yahoo working on the Hadoop Platform Team and leading the development of Apache Pig on Apache Tez. Rohini is the V.P. of Apache Pig and is a PMC member/committer on Apache Tez and Apache Oozie. She is interested in query planners, large-scale data processing and analytics platforms and loves to work on performance and scaling problems.

Session(s):
Yahoo’s Experience Running Pig on Tez at Scale
Thursday, June 30, 2016, 11:30AM - 12:10PM
230A
Mark Palmer

TIBCO Software Inc.
Mark Palmer holds nearly three decades of experience working in the financial technology industry, creating innovative technology, taking products to market, and building companies. He is actively using his skills to help TIBCO create a next-generation "Digital Business" software infrastructure and analytics platform. Mark has spoken at numerous industry events and is frequently quoted in leading publications, including the Wall Street Journal, Time Magazine, The Financial Times and CNBC. From 2010 through 2013, Institutional Investor consecutively named Mark as one of the "Top Executives and Innovators in Financial Technology."

Session(s):
7 Enterprise Case Studies of IoT, Streaming Analytics, and Real Business Value
Tuesday, June 28, 2016, 11:30AM - 12:10PM
212
Yi Pan

LinkedIn
Yi Pan has worked on distributed platforms for Internet applications for 8 years. He started at Yahoo! on Yahoo!'s NoSQL database project, leading the development of multiple features, such as real-time notification of database updates, secondary index, and live-migration from legacy systems to NoSQL database. He later joined and led the distributed Cloud Messaging System project, which is used heavily as a pub-sub system and transaction log for distributed databases at Yahoo!. In 2014, he joined LinkedIn and quickly became the lead of the Samza team at LinkedIn and a Committer and PMC member in Apache Samza.

Session(s):
Lambda-less Stream Processing @ Scale in LinkedIn
Wednesday, June 29, 2016, 5:00PM - 5:40PM
BALLROOM B
Jitendra Pandey

Hortonworks
Jitendra has been an active HDFS PMC member and Committer for several years. He has also contributed to the Hive project. He was part of the original Hadoop team at Yahoo and is now an active member and manager of the HDFS team at Hortonworks.

Session(s):
Evolving HDFS to a Generalized Distributed Storage Subsystem
Tuesday, June 28, 2016, 12:20PM - 1:00PM
BALLROOM B
Kartik Paramasivam

LinkedIn
Kartik is responsible for the Streams Infrastructure group working on the messaging and event processing infrastructure that powers LinkedIn. As part of this mission, he and his team are focused on designing, developing and running LinkedIn's PubSub technology (Apache Kafka), Change propagation pipeline from Databases like Oracle/Espresso (Databus), and Stream Processing Infrastructure (Apache Samza).

Session(s):
Lambda-less Stream Processing @ Scale in LinkedIn
Wednesday, June 29, 2016, 5:00PM - 5:40PM
BALLROOM B
Sunil Patil

MapR
I have profound technical work experience as a leader and individual contributor in a wide variety of internal and customer-facing environments worldwide including startups, academia, government, and large corporations. In addition to being a software developer, I can manage the primary components of a data center, including server hardware, operating systems, application stacks, networks, and storage in clustered and standalone deployments on premise and in the cloud. I have a unique skill set that spans completely from solution requirements through product evangelism and can speak to both technical and business decision makers.

Session(s):
Designing and Implementing Your IoT Solution with Open Source
Wednesday, June 29, 2016, 4:10PM - 4:50PM
230C
Pat Patterson

StreamSets
Pat Patterson has been working with Internet technologies since 1997, building software and working with communities at Sun Microsystems, Huawei, Salesforce and StreamSets. At Sun, Pat was the community lead for the OpenSSO open source project, while at Huawei he developed cloud storage infrastructure software. Part of the developer evangelism team at Salesforce, Pat focused on identity, integration and the Internet of Things. Now community champion at StreamSets, Pat is responsible for the care and feeding of the StreamSets open source community.

Session(s):
Ingest and Stream Processing - What Will You Choose?
Thursday, June 30, 2016, 3:00PM - 3:40PM
230C
Boyang (Jerry) Peng

Yahoo! Inc.
Boyang Jerry Peng is a Software Engineer on the Low Latency Team at Yahoo! Inc., where he works on Big Data platforms, primarily Storm. He is a PMC member and committer of the Apache Storm project. He has authored several academic papers on the subject of distributed data stream processing systems and Storm. Prior to Yahoo, he was a graduate student at University of Illinois, Urbana-Champaign with a research emphasis on distributed systems, specifically, data stream processing platforms.

Session(s):
Resource Aware Scheduling in Storm
Wednesday, June 29, 2016, 5:50PM - 6:30PM
230C
Francisco Perez-Sorrosal

Yahoo! Inc.
Research engineer at Yahoo in Sunnyvale working on Omid, a high-performance transactional framework with Snapshot Isolation semantics on top of HBase.

Session(s):
Omid: A Transactional Framework for HBase
Wednesday, June 29, 2016, 3:00PM - 3:40PM
210A
Thomas Phelan

BlueData Inc.
Tom has 25 years of experience with virtualization and storage sub-systems, making him uniquely suited to develop technologies pushing the limits of Hadoop performance in virtualized environments. Tom spent 10 years as the ESX storage architect at VMware. Earlier, he was a member of the original team that designed and implemented XFS, the first commercially available 64-bit file system. Currently, Tom is Chief Architect and co-founder of BlueData Software. He is also a champion and evangelist for open source projects such as Hadoop, GlusterFS, Spark, and Tachyon.

Session(s):
Big Data in the Cloud, the Time has Come
Tuesday, June 28, 2016, 2:10PM - 2:50PM
230C
Marie-Luce Picard

EDF
Marie-Luce Picard is a project manager and BI expert at EDF-R&D. She has managed different R&D projects dealing with business intelligence, and big data. She has also managed the EDF R&D team working on BI and analytics. She is currently in charge of managing the EDF-R&D project dealing with Big Data and Data Science to help EDF to become a data-driven company.

Session(s):
A Data Lake and a Data Lab to Optimize Operations and Safety Within a Nuclear Fleet
Thursday, June 30, 2016, 11:30AM - 12:10PM
211
Rachel Poulsen

TiVo
Rachel Poulsen is the Director of Data Science at TiVo. She has a Master’s degree in Statistics from Brigham Young University and Bachelor’s degrees in Mathematics and Communications. She is responsible for the strategy and execution of data science at TiVo. Her team currently consults a variety of teams within TiVo, including research and development, marketing and communications, and engineering operations. The team is responsible for making sense of the large amounts of user data TiVo has including integrating cross-platform viewership, classifying users by common viewership behaviors, and predicting behaviors based on historical patterns.

Session(s):
Building a Graph Database in Neo4j with Spark & Spark SQL to Gain New Insights from Log Data
Tuesday, June 28, 2016, 12:20PM - 1:00PM
230C
Beata Puncevic

Blue Cross Blue Shield of Michigan
Beata is a technology leader with a passion for transforming the healthcare industry and improving the quality and cost of patient care through data and innovation. She has over 12 years of experience in data engineering, data management, data architecture and integration architecture across the healthcare and retail industries. Currently Beata is leading an effort to build a modern, agile, real-time data architecture designed to drastically improve the organization's ability to leverage data insights to improve patient quality and access to care.

Session(s):
How Hadoop and a Modern Data Platform Can Enable Transformation in Healthcare
Tuesday, June 28, 2016, 4:10PM - 4:50PM
230A
Vamshi Punugoti

MD Anderson Cancer Center
Vamshi is an IT leader focused on the optimal use of technology to provide an outstanding experience for the patients, clinicians and researchers at MD Anderson. He has over 18 years of experience being a strategic partner, understanding key business drivers and building relationships with stakeholders, peers and their teams to deliver integrative user-centric solutions in enterprise environments. Currently he leads programs in high performance distributed computing, data storage platforms, and data lifecycle management while contributing to imaging, precision-oncology and big data initiatives. Before joining MD Anderson, Vamshi worked at Microsoft developing scalability and automation solutions to enhance operational and service excellence.

Session(s):
HDP @ MD Anderson - Starting the Hadoop Journey at a Global Leader in Cancer Research
Tuesday, June 28, 2016, 5:50PM - 6:30PM
230A
Robin Purohit

BMC
Krishna Puttaswamy

Airbnb
Krishna Puttaswamy is a software engineer on the Data Infrastructure team at Airbnb. He works on making data systems more reliable and building systems for data analysis. Before Airbnb, he was at LinkedIn working on a computation engine, called Cubert, for complex analysis and reporting on Hadoop datasets. Prior to that he was conducting research on distributed systems in research labs and academia.

Session(s):
Reliable and Scalable Data Ingestion at Airbnb
Wednesday, June 29, 2016, 5:00PM - 5:40PM
BALLROOM A
Shankar Radhakrishnan

Impetus
Data specialist with in-depth experience in Data Architecture, Big Data Analytics, Data Warehousing and Business Intelligence. Engages directly with CXOs, Business and IT leaders to scope and build Big Data and Analytics, Social, Cloud and Digital programs. Trusted partner to build Big Data Analytics Strategy and Roadmap. A big believer that "Data is your most important digital asset".

Session(s):
Hybrid Data Platform – Cloud Environment Connected with On-premise Data Environment
Wednesday, June 29, 2016, 5:50PM - 6:30PM
BALLROOM B
Mithun Radhakrishnan

Yahoo!
Mithun Radhakrishnan is an Apache Hive committer, and a Hadoop Engineer at Yahoo. He has worked on Hadoop-ish projects in Yahoo since 2009. He is the author of DistCp on Hadoop 2.0. He is an erstwhile firmware developer and is prone to flare-ups from C++ withdrawal.

Session(s):
Faster, Faster, Faster!: The True Story of a Mobile Analytics Data Mart on Hive
Tuesday, June 28, 2016, 12:20PM - 1:00PM
BALLROOM A
Sanjay Radia

Hortonworks
Sanjay is founder and chief architect at Hortonworks, and an Apache Hadoop committer and member of the Apache Hadoop PMC. Prior to co-founding Hortonworks, Sanjay was the chief architect of core-Hadoop at Yahoo and part of the team that created Hadoop. In Hadoop he has contributed to several areas including HDFS, MapReduce schedulers, YARN's design, high availability, compatibility, etc. He has also held senior engineering positions at Sun Microsystems and INRIA, where he developed software for distributed systems and grid/utility computing infrastructures. Sanjay has a PhD in Computer Science from the University of Waterloo in Canada.

Session(s):
Evolving HDFS to a Generalized Distributed Storage Subsystem
Tuesday, June 28, 2016, 12:20PM - 1:00PM
BALLROOM B
Lokesh Rajaram

Intuit
Lokesh Rajaram is a Senior Software Engineer at Intuit, currently operating a clickstream batch processing system and actively involved in re-engineering the batch system into a stream processing system. This role empowers me to build a solution that uses a relevant tech stack (Kafka, Spark, AWS), deal with high volumes of data, and develop generic solutions that can be used across Intuit.

Session(s):
The Evolution of Big Data Pipelines at Intuit
Thursday, June 30, 2016, 2:10PM - 2:50PM
BALLROOM C
Venkatesh Ramanathan

PayPal Inc
Venkatesh is a senior data scientist at PayPal where he is working on building state-of-the-art tools for payment fraud detection. He has over 20 years of experience in designing, developing and leading teams to build scalable server-side software. In addition to being an expert in big-data technologies, Venkatesh holds a Ph.D. degree in Computer Science with specialization in Machine Learning and Natural Language Processing (NLP) and has worked on various problems in the areas of Anti-Spam, Phishing Detection, and Face Recognition.

Session(s):
Application of Active Learning for Fraud Labeling @PayPal
Tuesday, June 28, 2016, 4:10PM - 4:50PM
210A
Ritesh Ramesh

PwC
Ritesh Ramesh is the Chief Technologist for the Global Data and Analytics (D&A) group at PwC where he drives the standardization and simplification of the firm’s D&A infrastructure, architecture assets and offerings across our lines of service across US and Global regions. He has 15 years of consulting experience and has led large scale global transformation programs on Data and Analytics. Ritesh is currently pursuing his Executive MBA at MIT Sloan. He graduated with a Master’s Degree in Computer Science from the Illinois Institute of Technology, Chicago and holds an Enterprise Architecture certification from Penn State University, PA.

Session(s):
Future of Apache Hadoop – An Enterprise Architecture View
Wednesday, June 29, 2016, 3:00PM - 3:40PM
230C
Jun Rao

Confluent
Jun Rao is the co-founder of Confluent, a company that provides a stream data platform on top of Apache Kafka. Before Confluent, Jun Rao was a senior staff engineer at LinkedIn where he led the development of Kafka. Before LinkedIn, Jun Rao was a researcher at IBM's Almaden research data center, where he conducted research on database and distributed systems. Jun Rao is the PMC chair of Apache Kafka and a committer of Apache Cassandra.

Session(s):
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with Apache Kafka
Wednesday, June 29, 2016, 11:30AM - 12:10PM
210A
Alex Rasmussen

Trifacta
Savitha Ravikrishnan

Yahoo!
Savitha is a production engineer at Yahoo Grid Operations with 5 years of experience in big data operations and large scale data pipelines.

Session(s):
Managing Hadoop, HBase, and Storm Clusters at Yahoo Scale
Thursday, June 30, 2016, 2:10PM - 2:50PM
210C
Sridhar Reddy

MapR Technologies
Sridhar is Director of Professional Services at MapR. He leads the Application Development practice at MapR and helps customers build Hadoop & HBase solutions and migrate data from RDBMS databases. Sridhar worked as a Technology Evangelist at Sun Microsystems for over 10 years, where he presented at many technical conferences worldwide and helped increase awareness and adoption of Java technology in the worldwide developer community. Sridhar holds a BS in Mechanical Engineering, and MS in Computer Science from Florida. He has extensive experience working with Java and Java EE in many roles of the software development life cycle, including design, development, management, training and technology evangelism.

Session(s):
Designing and Implementing Your IoT Solution with Open Source
Wednesday, June 29, 2016, 4:10PM - 4:50PM
230C
Scott Reisdorf

Think Big a Teradata Company
Scott Reisdorf joined Think Big one year ago as a member of the Research and Development team. Scott has over 15 years of experience in software development. He is excited to apply his expertise to Big Data with Think Big. Since joining Think Big, Scott has helped successfully implement Big Data solutions in many Fortune 500 companies. Scott is the lead contributor to the data lake accelerator framework that provides a beautiful web user interface integrating Apache Spark and NiFi.

Session(s):
Integrating Apache Spark and NiFi for Data Lakes
Thursday, June 30, 2016, 2:10PM - 2:50PM
230A
Romain Rigaux

Cloudera
Romain is an engineer at Cloudera and the Lead of Hue. Before that, he worked on distributed systems at Yahoo! and Google, and he has been building Web apps since the early days.

Session(s):
SQL and Solr Search with Spark for Big Data Analytics in Your Browser
Tuesday, June 28, 2016, 5:00PM - 5:40PM
BALLROOM C
Vladimir Rodionov

Hortonworks
Hadoop/HBase Architect and Apache HBase contributor.

Session(s):
Internet Of Things: What about Data Storage?
Thursday, June 30, 2016, 2:10PM - 2:50PM
212
Mohan Sadashiva

Waterline Data

Session(s):
Extend Governance in Hadoop with Atlas Ecosystem
Thursday, June 30, 2016, 4:10PM - 4:50PM
210A
Kostas Sakellis

Cloudera
Kostas is the Tech Lead and Manager of the Apache Spark Team at Cloudera. Prior to that, he contributed to the extensibility effort on Cloudera Manager. Before Cloudera, Kostas did a 6-year stint at Amazon working across various teams including platform infrastructure. Kostas has a Bachelor of Mathematics in Computer Science from the University of Waterloo.

Session(s):
Effective Spark on Multi-Tenant Clusters
Thursday, June 30, 2016, 4:10PM - 4:50PM
210C
Rahul Sarda

Wipro Technologies
Rahul Sarda is a Distinguished Member of Technical Staff (DMTS) at Wipro Technologies and currently manages the Big Data Practice at Wipro. He has led several engagements as a hands-on Big Data Architect, delivering significant value to customers, and has delivered several programs around building Enterprise Data Provisioning Platform, Data Discovery and Analytics, Iterative Analysis, Low Latency Platforms, Enterprise Security, and Real Time Streaming using Big Data Analytics platforms.

Session(s):
Big Data Ready Enterprise Framework
Thursday, June 30, 2016, 3:00PM - 3:40PM
211
Kaz Sato

Google Inc.
Kaz Sato is a Developer Advocate on the Cloud Platform team at Google Inc. His focus areas are Machine Learning, Big Data and IoT products and solutions. He also works as an advocate for Google Cloud Platform developer communities.

Session(s):
Google Cloud Platform Empowers TensorFlow and Machine Learning
Tuesday, June 28, 2016, 4:10PM - 4:50PM
BALLROOM A
Apoorv Saxena

Google
Apoorv Saxena is a product manager in Google Cloud machine intelligence, leading natural language understanding products. Previously, he was co-founder of mobile payment company Okati and worked at McKinsey as a management consultant. He holds an MBA from Wharton and a BS from IIT, Bombay.

Session(s):
Machine Learning for Any Size of Data, Any Type of Data
Wednesday, June 29, 2016, 12:20PM - 1:00PM
BALLROOM A
Joanna Schloss

Dell Software
Joanna Schloss is a subject matter expert in the Dell Center of Excellence specializing in data and information management. Her areas of expertise include big data analytics, business intelligence, business analytics, and data warehousing. With a blend of experience in both startup and G500 environments, Joanna has successfully launched a myriad of products, from business-focused analytic applications to data warehousing tools such as Business Objects Data Services. Within the Dell Center of Excellence, she helps clients deal with the challenges of managing multiple data platforms, application systems, and analytic environments.

Session(s):
IoT, Big Data, Cloud – the Convergence of Marketing Terms?
Thursday, June 30, 2016, 11:30AM - 12:10PM
BALLROOM C
Eric Schmidt

Google
Thomas Schweiger

eBay, Inc
Tom Schweiger is an architect in eBay's Identity Services and Shared Data team, working from eBay's Seattle office. He has 15 years of experience building entity resolution systems.

Session(s):
High-Scale Entity Resolution in Hadoop
Wednesday, June 29, 2016, 5:50PM - 6:30PM
BALLROOM C
Ohad Shacham

Yahoo Research
Ohad Shacham is a Research Scientist at Yahoo Research. He works on scalable big data and search platforms. Most recently, he focused on extending the Omid transaction processing system with high availability and scalability features. Dr. Shacham received his PhD in concurrent software verification from Tel-Aviv University CS in 2012. Prior to Yahoo, he held technical positions at IBM and Intel.

Session(s):
Omid: A Transactional Framework for HBase
Wednesday, June 29, 2016, 3:00PM - 3:40PM
210A
Premal Shah

6sense
At 6sense Premal geeks out over tons of data, dreams about how to process data faster and extract the best signals from it, and obsesses over a blazing fast UI. Premal was one of the co-founders of GrepData, the Y Combinator backed startup that simplified Hadoop to make Big Data more accessible for everyone. Prior to his tenure with GrepData, Premal designed databases and optimized data access for Livingly Media.

Session(s):
Instrument Your Instruments: Data-Driven Ops
Tuesday, June 28, 2016, 5:50PM - 6:30PM
230C
Punit Shah

Impetus Technologies
Punit has been a key member in building enterprise-grade applications and platforms running hundreds of millions of dollars in revenue for Impetus customers. He is part of the original technology team that created StreamAnalytix, an open-source powered enterprise-grade streaming analytics platform. He is working actively with many large enterprises to help them build a variety of streaming and complex-event processing use cases using Apache Kafka, Storm, Spark Streaming and the Hadoop stack in general.

Session(s):
Lego-Like Building Blocks of Storm and Spark-Streaming Pipelines for Rapid IOT and Streaming Analytics App Development
Thursday, June 30, 2016, 2:10PM - 2:50PM
230C
Purshotam Shah

Yahoo!
Purshotam is a senior software engineer with the Hadoop team at Yahoo, and an Apache Oozie PMC member and committer.

Session(s):
Building and Managing Large Scale Data Pipelines with Complex Dependencies Using Apache Oozie
Wednesday, June 29, 2016, 3:00PM - 3:40PM
210C
Ashwin Shankar

Netflix
Ashwin Shankar is an Apache Hadoop and Spark contributor. He is a Senior Software Engineer at Netflix and is passionate about developing features and debugging problems in large scale distributed systems. Ashwin holds a Master's degree in Computer Science from University of Illinois at Urbana Champaign.

Session(s):
Netflix - Productionizing Spark on YARN for ETL at Petabyte Scale
Tuesday, June 28, 2016, 5:50PM - 6:30PM
BALLROOM A
Carter Shanklin

Hortonworks
Carter is the Director of Product Management at Hortonworks.

Session(s):
Fine-Grained Security for Spark and Hive
Wednesday, June 29, 2016, 4:10PM - 4:50PM
210A
Ambud Sharma

Symantec Corporation
Software and security professional with a focus in the area of Security Incident and Event Management (SIEM) and Big Data Analytics. Over the past few years I have built a secure, scalable cloud analytics platform with graph storage and an open source, scalable SIEM platform.

Session(s):
In-Flux Limiting for a Multi-Tenant Logging Service
Tuesday, June 28, 2016, 2:10PM - 2:50PM
BALLROOM B
Smiti Sharma

EMC
Smiti Sharma is a Principal Engineer for Analytics, Cloud and Emerging Technologies at EMC. Her focus is distributed computing and analytics, high transaction processing and cloud native apps. At EMC she also works with customers and the Office of the CTO to integrate and define direction for emerging products in this space. Prior to joining EMC, Smiti spent 3 years as Field CTO, Greenplum/Pivotal. Over the past 12 years, she has managed and built complex critical enterprise solutions. A regular speaker at technical conferences and user groups, she is passionate about solving customers’ business problems creatively with technology.

Session(s):
Navigating the World of User Data Management and Data Discovery
Wednesday, June 29, 2016, 11:30AM - 12:10PM
211

Building a Data Analytics PaaS for Smart Cities
Wednesday, June 29, 2016, 4:10PM - 4:50PM
212
Gera Shegalov

Twitter
Gera has a PhD in Computer Science from Saarland University in Saarbruecken, Germany. He has published various papers related to transactional systems and exactly-once recovery guarantees. He has worked at Oracle in the Java Platform Group on a benchmark-winning JMS provider, at MapR on Shuffle and Sort optimizations, and at Twitter, first on the Hadoop Team and now on the Product Instrumentation and Experimentation (PIE) team. Gera is an Apache Hadoop committer.

Session(s):
Cross-DC Fault-Tolerant ViewFileSystem at Twitter
Thursday, June 30, 2016, 11:30AM - 12:10PM
BALLROOM B
Jun Shi

Yahoo!
Jun Shi is a software engineer at Yahoo with a focus on machine learning platforms. His research interest is in large-scale machine learning algorithms. Prior to Yahoo, he was designing wireless communication chips at Broadcom, Qualcomm and Intel.

Session(s):
Distributed Deep Learning on Hadoop Clusters
Thursday, June 30, 2016, 12:20PM - 1:00PM
BALLROOM C
Vinay Shukla

Hortonworks
Vinay is the Director of Product Management for Spark & Data science at Hortonworks. Vinay loves using data to solve problems and thinks Spark will lead to more data driven decisions. Vinay has worked as Product Manager, Developer, and Security Architect. When not in front of a computer, Vinay enjoys being on a Yoga mat or on a hiking trail.

Session(s):
State of Security in Spark
Thursday, June 30, 2016, 11:30AM - 12:10PM
210A
Gurpreet Singh

eBay, Inc
Gurpreet Singh is an integration architect in eBay's Identity Services and Shared Data team. He works out of eBay's Seattle office.

Session(s):
High-Scale Entity Resolution in Hadoop
Wednesday, June 29, 2016, 5:50PM - 6:30PM
BALLROOM C
Ankit Singhal

Hortonworks
I have been working on analytics for the last 4-5 years and have built various analytics products for pharmacy and adTech companies.

Session(s):
Phoenix + HBase: An Enterprise Grade Data-Warehouse Appliance for Interactive Analytics?
Thursday, June 30, 2016, 3:00PM - 3:40PM
210C
Joseph Sirosh

Microsoft
James Sirota

Hortonworks
James Sirota is an Apache community member, a PPMC member, and a release manager for Apache Metron (Incubating). James currently serves as a Director of Security Solutions for a major big data platform vendor and held a previous position as a Chief Data Scientist at a Fortune 500 security vendor. James has a CISSP-ISSAP and over 15 years of experience as a security practitioner. He holds a B.S. in Computer Science and an M.S. in Computer Engineering from Arizona State University and University of Southern California respectively.

Session(s):
War on Stealth Cyberattacks that Target Unknown Vulnerabilities
Thursday, June 30, 2016, 11:30AM - 12:10PM
BALLROOM A
Mike Sontag

E & Y
Enis Soztutar

Hortonworks
Enis Soztutar is an Apache HBase, Hadoop, and Phoenix committer and member of the Apache Software Foundation. He has been using and developing Hadoop ecosystem projects since 2007. He is currently working at Hortonworks as a part of the HBase engineering team.

Session(s):
Apache HBase - State of the Union
Wednesday, June 29, 2016, 2:10PM - 2:50PM
BALLROOM C

Operating and Supporting Apache HBase - Best Practices and Improvements
Wednesday, June 29, 2016, 5:00PM - 5:40PM
210C
Casey Stella

Hortonworks
I am a principal architect focusing on Data Science on the Apache Metron project at Hortonworks. In the past, I've worked as an architect and senior engineer at a healthcare informatics startup spun out of the Cleveland Clinic, as a developer at Oracle and as a Research Geophysicist in the Oil & Gas industry. Before that, I was a poor graduate student in Math at Texas A&M. I specialize in writing software and solving problems where there are either data science challenges or scalability concerns due to large amounts of traffic or large amounts of data.

Session(s):
Data Preparation and Munging for Data Science: A Field Guide
Tuesday, June 28, 2016, 5:00PM - 5:40PM
230A

Scalable Optical Character Recognition with Apache NiFi and Tesseract
Wednesday, June 29, 2016, 5:50PM - 6:30PM
210C
Ralph Su

eBay Inc.
Ralph works as an MTS for eBay Analytic Data Infrastructure, focusing on Hadoop monitoring. He is a contributor to the Apache Eagle project (http://eagle.incubator.apache.org).

Session(s):
Apache Eagle - Secure Hadoop in Real Time
Wednesday, June 29, 2016, 5:00PM - 5:40PM
230A
Arun Suresh

Microsoft
Arun has been a longtime comrade of the yellow elephant. He has worked on various large scale distributed systems at Cloudera, Pivotal and Yahoo!, most of which involved pushing the limits of Hadoop. Currently a Research Engineer at Microsoft's Cloud and Information systems Labs (CISL), his general interests are in the field of systems used for large scale data analytics and graph processing. He is also a bit of a functional programming junkie interested in languages like Haskell and Erlang.

Session(s):
GoodFit - An Efficient MRP (Multi-Resource Packing) Allocator for YARN
Thursday, June 30, 2016, 4:10PM - 4:50PM
230A
Wangda Tan

Hortonworks
Wangda Tan is a senior member of technical staff on the Hadoop YARN team at Hortonworks. He is also a committer and PMC member on the Apache Hadoop project. His main area of work is the Hadoop YARN resource scheduler, where he has participated in features like node labeling, resource preemption, and container resizing. Before joining Hortonworks, he worked at Pivotal on a project to integrate OpenMPI/GraphLab with Hadoop YARN. Before that, he worked at Alibaba, where he participated in the ODPS project (Open Data Processing Service), a large scale machine learning, matrix and statistics computation platform using Map-Reduce and MPI.

Session(s):
Scheduling Policies and Resource Types in YARN
Tuesday, June 28, 2016, 5:50PM - 6:30PM
210C
Shashank Tandon

Expedia
Shashank Tandon is a Software Development Engineer II at Expedia. As a part of the EDW Platform team, he provides big data solutions to various teams at Expedia. He recently helped teams with a solution for ingesting bulk data into Hadoop and is currently working on building a realtime log collector to assess the stability of Hadoop.

Session(s):
Bridging the Gap of Relational to Hadoop Using Sqoop @ Expedia
Tuesday, June 28, 2016, 5:00PM - 5:40PM
211
Uday Tennety

GE Digital
Uday Tennety manages the strategic client engagements at GE Digital, and also leads a customer innovation and co-creation center called the Design Center. Prior to joining GE, Uday worked as the Director of Analytic Services at Revolution Analytics, a Microsoft company. Uday has also led diverse teams with roles in Strategy, Business Development, Marketing, Product Management and Product Development at companies such as Fujitsu, Tata Consultancy Services, Ecologic Analytics and others. Uday holds an MS degree in Computer Science from the University of North Carolina at Charlotte, and an MBA from the Haas School of Business at UC Berkeley.

Session(s):
The Industrial Internet: Big Data, Intelligent Machines, and Smarter Workforce
Tuesday, June 28, 2016, 3:00PM - 3:40PM
BALLROOM A
Nishant Thacker

Microsoft
Nishant Thacker is a Technical Product Manager for the Big Data Analytics services from Microsoft. With more than 8 years of experience working on analytics platforms, he has been a speaker at many leading conferences. His work involves working with Engineering and the Field to enable a smooth launch pad for new services and offerings in the Big Data space, and also making sure there is ample technical acumen for partners to have a seamless implementation cycle. He is also a true evangelist of the power of analytics.

Session(s):
Hadoop Application Architectures - Fraud Detection
Wednesday, June 29, 2016, 11:30AM - 12:10PM
212

Hadoop in the Cloud – The What, Why and How from the Experts
Thursday, June 30, 2016, 2:10PM - 2:50PM
BALLROOM B
Robert Thorman

AT&T
Technical Solution Architect and Principal Software Engineer in the area of Security Technology for the Chief Security Officer at AT&T. Uses the Hortonworks Data Platform to solve the data organization and integration needs of the enterprise, from custom YARN applications, to random access solutions, to complete threat analytics systems that protect the AT&T enterprise.

Session(s):
Near Real-time Outlier Detection and Interpretation
Tuesday, June 28, 2016, 12:20PM - 1:00PM
210A
Eric Thorsen

Hortonworks
20 years in tech and industry, focused on retail and CPG. Three years managing Walmart tech needs. Currently manages the industry team at Hortonworks, including Retail, Healthcare, Insurance, Financial Services, Manufacturing, and Telecom.

Session(s):
Customer Journey - Sentiment Analysis for Fashion Retail
Thursday, June 30, 2016, 3:00PM - 3:40PM
230A
Kendall Thrapp

Yahoo!
Kendall Thrapp is a Principal Software Engineer on the Grid UI Team at Yahoo, where he has been working on web applications for managing Yahoo's cloud infrastructure and collecting and visualizing various Hadoop and Storm metrics since 2009.

Session(s):
Cost and Resource Tracking for Hadoop
Thursday, June 30, 2016, 3:00PM - 3:40PM
BALLROOM B
Daniel Totten

Ford Motor Company
Consulting Architect for analytics at Ford Motor Company.

Session(s):
The "Connected Vehicle" (IoT and Streaming) – Supporting the Mission to Become Both an Automotive and Mobility Company
Tuesday, June 28, 2016, 12:20PM - 1:00PM
210C

The Architectural Journey to our Modern Data Applications – DSC (Data Supply Chain)
Wednesday, June 29, 2016, 4:10PM - 4:50PM
BALLROOM A
Martin Traverso

Facebook
Martin is a co-founder of Presto and a Software Engineer at Facebook where he leads the Presto development team. Previously, he was an architect at Proofpoint and Ning.

Session(s):
Presto, What's New in SQL-on-Hadoop and Beyond
Wednesday, June 29, 2016, 5:50PM - 6:30PM
210A
Vineet Tyagi

Impetus
Vineet is a strategic leader with expertise in technology forecasting, solutions, and developing strategic technology plans that lead to enhanced profitability. He has a proven track record of translating business vision into products and businesses to generate maximum ROI. At Impetus, Vineet has been instrumental in leading a high performance team of 100+ engineers and architects. His responsibilities include managing the products and R&D portfolio and driving thought leadership for conversion of sales leads to customers and lead generation. He also contributes to the strategic direction and growth of the company as a part of the executive management.

Session(s):
Accelerating Data Warehouse Migration to Hadoop
Wednesday, June 29, 2016, 11:30AM - 12:10PM
230C

Swimming Across the Data Lake - Lessons Learned and Keys to Success!
Thursday, June 30, 2016, 4:10PM - 4:50PM
230C
Kostas Tzoumas

data Artisans
Kostas Tzoumas is a PMC member of Apache Flink and co-founder and CEO of data Artisans. Before founding data Artisans, Kostas did his postdoc at TU Berlin leading the Stratosphere project, which served as inspiration for Apache Flink, received his PhD at Aalborg University, and was with Microsoft Research and University of Maryland, College Park.

Session(s):
Streaming in the Wild with Apache Flink
Tuesday, June 28, 2016, 12:20PM - 1:00PM
BALLROOM C
Varun Vasudev

Hortonworks
I am an Apache Hadoop committer who works primarily on YARN. Before working on Hadoop, I used to work on search-related technologies for Yahoo!

Session(s):
Scheduling Policies and Resource Types in YARN
Tuesday, June 28, 2016, 5:50PM - 6:30PM
210C
Vinod Kumar Vavilapalli

Hortonworks
Vinod Kumar Vavilapalli (reachable at @tshooter) is the Hadoop YARN and MapReduce guy at Hortonworks. He is a long-term Hadoop contributor at Apache, Hadoop committer and a member of the Apache Hadoop PMC. He has a Bachelor's degree in Computer Science and Engineering from the Indian Institute of Technology Roorkee. He has been working on Hadoop for nearly 9 years. Straight out of college, he joined the Hadoop team at Yahoo! Bangalore before Hortonworks happened. He is passionate about using computers to change the world for the better, bit by bit.

Session(s):
A Multi-Colored YARN: Apps and First-Class Support for Services
Tuesday, June 28, 2016, 2:10PM - 2:50PM
BALLROOM A
Rajat Venkatesh

Qubole
Rajat Venkatesh is an Engineering Lead at Qubole. His focus is on improving data analytics on the cloud, and he primarily works on projects spanning Hive, Presto and Quark. Previously, Rajat worked on the HP Vertica optimizer and execution engine. Rajat has an MS from Carnegie Mellon University. He holds several patents in distributed data management.

Session(s):
Quark: Simplify and Optimize SQL Queries Across Hadoop and RDBMS
Wednesday, June 29, 2016, 12:20PM - 1:00PM
230C
Ram Venkatesh

Hortonworks
Ram Venkatesh is a Senior Director of Engineering at Hortonworks, leading the development of the Hadoop platform to solve real-world problems in data processing, machine learning, and resource management. He is a systems guy, deeply interested in all aspects of creating mission critical enterprise software.

Session(s):
Debugging YARN Cluster in Production
Thursday, June 30, 2016, 2:10PM - 2:50PM
211
Anand Venugopal

Impetus Technologies
Anand Venugopal heads Product Strategy for StreamAnalytix - an open source powered Enterprise grade Streaming Analytics platform developed by Impetus Technologies. Prior to this he built the Big Data Solutions practice at Impetus delivering high impact business solutions based on Big Data and Analytics to large enterprises in many Industry verticals. He has 20 years of software innovation and business development experience in the Enterprise and Telco space and is specifically passionate about Customer experience and Operational intelligence solutions using a combination of real-time and batch predictive analytics - leveraging open source technology stacks.

Session(s):
Lego-Like Building Blocks of Storm and Spark-Streaming Pipelines for Rapid IOT and Streaming Analytics App Development
Thursday, June 30, 2016, 2:10PM - 2:50PM
230C
George Vetticaden

Hortonworks
George Vetticaden is a Principal Architect at Hortonworks, Senior Product Owner/Manager for Metron/CyberSecurity, and committer on the Apache Metron project. Over the last 4 years at Hortonworks, George has spent time in the field with enterprise customers helping them build big data solutions on top of Hadoop. In his previous role at Hortonworks, George was the Director of Solutions Engineering where he led a team of 15 Big Data Senior Solution Architects helping large enterprise customers with use case inception, design, architecture, to implementation of use cases monetizing data with Hadoop. George graduated from Trinity University with a BA in Computer Science.

Session(s):
War on Stealth Cyberattacks that Target Unknown Vulnerabilities
Thursday, June 30, 2016, 11:30AM - 12:10PM
BALLROOM A
Gopal Vijayaraghavan

Hortonworks
Gopal Vijayaraghavan is a performance specialist who is an active contributor and PMC member to the Apache Tez and Apache Hive projects. He is currently working on the Stinger.next initiative to improve SQL performance for popular BI tools.

Session(s):
LLAP: Sub-Second Analytical Queries in Hive
Thursday, June 30, 2016, 12:20PM - 1:00PM
210A
Nanda Vijaydev

BlueData
Nanda has more than 10 years of experience in Data Management and Data Science. At BlueData, Nanda works with Spark, Hadoop, and related technologies to build software solutions for Big Data analytics use cases. She has worked on multiple Data Science and Big Data projects for large enterprises in the healthcare, media, telecommunications, and other industries. Prior to BlueData, she was a principal architect at Silicon Valley Data Science and director of solutions engineering at Karmasphere. She has an in-depth understanding of Data Management tools including data integration, ETL, data warehousing, reporting, Hadoop, and Spark.

Session(s):
There is a New Ranger in Town! End-to-End Security and Auditing in a Big-Data-as-a-Service Deployment
Tuesday, June 28, 2016, 12:20PM - 1:00PM
212
Sanjay Vyas

Diyotta
Sanjay is the co-founder and CEO of Diyotta. With more than 17 years of experience in data management and architecture, he is responsible for business leadership and delivering on Diyotta's vision towards modern data integration for big data. Prior to founding Diyotta, Sanjay was a senior architect at Bank of America, where he drove the data strategy for the entire enterprise risk platform. Previously, he held various information management leadership positions and led projects at companies like Time Warner Cable, TIAA-CREF, Pitney-Bowes, Microsoft, and Hewlett Packard.

Session(s):
Analyzing Telecom Fraud at Hadoop Scale
Wednesday, June 29, 2016, 4:10PM - 4:50PM
211
Josh Walters

Yahoo!
Josh Walters is an engineer at Yahoo where he builds data pipelines and analytics systems.

Session(s):
Faster, Faster, Faster!: The True Story of a Mobile Analytics Data Mart on Hive
Tuesday, June 28, 2016, 12:20PM - 1:00PM
BALLROOM A
Guozhang Wang

Confluent
Guozhang is an engineer at Confluent, building a stream data platform on top of Apache Kafka. He received his PhD from the Cornell University database group, where he worked on scaling iterative data-driven applications. Prior to Confluent, Guozhang was a senior software engineer at LinkedIn, developing and maintaining its backbone streaming infrastructure on Apache Kafka and Apache Samza.

Session(s):
Introducing Kafka Streams, the New Stream Processing Library of Apache Kafka
Tuesday, June 28, 2016, 5:00PM - 5:40PM
210A
Tanping Wang

IBM
Tanping Wang is the senior architect of the IBM BigInsights Hadoop product, focusing mainly on security. Before joining IBM, she worked as a Hadoop committer on various projects that impact Yahoo's revenue pipeline.

Session(s):
Hive Metastore Security of Apache Ranger
Tuesday, June 28, 2016, 5:50PM - 6:30PM
212
Wei Wang

Hortonworks
Wei Wang is a Data Scientist at Hortonworks. He has expertise in healthcare insurance, manufacturing, and hardware industry processes, where he has built data science pipelines and analytical tools at scale. Wei graduated from Columbia University with an M.A. in medical informatics and also has an M.S. in Intelligence Systems from the University of Pittsburgh.

Session(s):
Zero Downtime App Deployment Using Hadoop
Thursday, June 30, 2016, 3:00PM - 3:40PM
BALLROOM C
Feng Wang

Alibaba Inc
Feng Wang is a Senior Manager in Alibaba's Search division. He has over 5 years of experience in Hadoop.

Session(s):
Blink - Improved Runtime for Flink and its Application in Alibaba Search
Wednesday, June 29, 2016, 2:10PM - 2:50PM
210C
Sewook Wee

Trulia
Sewook Wee leads the Personalization team in Data Engineering at Trulia. Recently, working with a team of talented engineers, he built a big data platform from scratch to provide personalized content and experience for Trulia consumers. The platform implements Lambda architecture with Hadoop MR and HBase, Scalding, Storm, Kafka, Flume, and RESTful services. Before joining Trulia, he worked for Accenture Technology Labs as an R&D manager focused on Big Data and Cloud technology with emphasis on Hadoop, Cassandra, and AWS. He received Master's and PhD degrees from Stanford University. His alma mater is Seoul National University.

Session(s):
Lambda Architecture: How we Merged Batch and Real-Time
Wednesday, June 29, 2016, 2:10PM - 2:50PM
212
Thomas Weise

DataTorrent
Thomas is an Apache Apex PMC member and architect/co-founder at DataTorrent. He has developed distributed systems, middleware, and web applications since 1997. Prior to DataTorrent he was on the Hadoop team at Yahoo!, where he contributed to projects like Pig and Hive and to the migration to the next-generation Hadoop 2.x.

Session(s):
Next Gen Big Data Analytics with Apache Apex
Thursday, June 30, 2016, 3:00PM - 3:40PM
212
Adam Wenchel

Capital One
Jay White Bear

IBM
Part of the Spark Enablement Team at IBM working to explore and define new ways to utilize cloud services and open source technologies to tackle the challenges of analytics in modern data.

Session(s):
Simultaneous Localization and Mapping (SLAM) with Kafka and Spark Streaming
Wednesday, June 29, 2016, 2:10PM - 2:50PM
230C
Roy Wilds

PHEMI Systems
Roy Wilds is the Director, Product Management and leader for the data science practice at PHEMI Systems, a big data solution provider that takes advantage of the power, scalability, and flexibility of Hadoop while providing fully integrated privacy, governance, and data management—all built right in. Roy has been using big data technologies since 2007 and has advanced knowledge in machine learning theory, a strong background in Python, R, and SQL, and substantial expertise using Hadoop’s distributed technologies. He holds a BSc from SFU in Physics, as well as MSc and PhD degrees in Mathematics and Statistics from McGill University.

Session(s):
"The Path to Wellness Through Big Data"
Wednesday, June 29, 2016, 5:50PM - 6:30PM
211
Gang Wu

Uber
Gang is currently a software engineer on the Hadoop Compute team at Uber. He builds services and tools to support large-scale data applications. He is a co-creator of the Spark Uber Development Kit (UDK), which provides APIs and tools for engineers to develop and run Spark jobs easily and efficiently. Before Uber, Gang obtained a Master's degree in Computer Science from Carnegie Mellon University.

Session(s):
Spark Uber Development Kit
Wednesday, June 29, 2016, 11:30AM - 12:10PM
BALLROOM A
Maryann Xue

Intel
Maryann is a software engineer on the Big Data Technologies team at Intel. She is a PMC member of the Apache Phoenix project and has also been actively contributing to the Apache Calcite project. Before shifting her focus to open source projects, she worked on Intel's Distribution of Hadoop as a technical leader of the HBase team.

Session(s):
How We Re-Engineered Phoenix with a Cost-Based Optimizer Based on Calcite
Thursday, June 30, 2016, 12:20PM - 1:00PM
BALLROOM B
Xiaoyu Yao

Hortonworks
Hadoop committer who works mainly on HDFS.

Session(s):
Toward Better Multi-Tenancy Support from HDFS
Wednesday, June 29, 2016, 3:00PM - 3:40PM
BALLROOM C
Chuck Yarbrough

Pentaho
Chuck Yarbrough is the director of big data product marketing at Pentaho, a leading big data analytics company that helps organizations engineer big data connections, blend data, and report on and visualize all of their data. Much of Chuck’s focus at Pentaho is on educating organizations about how the proper use of big data can help win, serve, and retain customers, lower costs, and grow revenue. A lifelong participant in the data game, Chuck has held leadership roles at Deloitte Consulting, SAP Business Objects, Hyperion, and National Semiconductor.

Session(s):
Filling the Data Lake
Wednesday, June 29, 2016, 5:00PM - 5:40PM
211
Nezih Yigitbasi

Netflix
Nezih Yigitbasi is a senior software engineer on Netflix's Big Data Platform team, working on and contributing to various open source projects such as Presto, Apache Parquet, and Apache Hive. Previously he contributed to the design and implementation of various distributed systems in both academia and industry, including at Intel Labs, Carnegie Mellon University, Delft University of Technology (TUDelft), Argela, and Telenity. He holds a PhD in computer science from TUDelft, and an MSc and a BSc in computer engineering from Istanbul Technical University.

Session(s):
Netflix - Productionizing Spark on YARN for ETL at Petabyte Scale
Tuesday, June 28, 2016, 5:50PM - 6:30PM
BALLROOM A
Tendu Yogurtcu

Syncsort
Tendü Yoğurtçu is Syncsort’s General Manager for the Big Data business. She has 20+ years of computer software industry experience, including extensive Big Data and Hadoop industry knowledge and has worked closely with key ecosystem partners like Hortonworks, Cloudera and MapR. As General Manager, she is responsible for building on the success of Syncsort’s global data integration, Hadoop and Cloud solutions.

Session(s):
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: Today's ETL Does it All!
Tuesday, June 28, 2016, 4:10PM - 4:50PM
230C
Dan Zaratsian

SAS
Dan Zaratsian is a Sr. Solutions Architect with SAS' Global Analytics Practice, specializing in real-time event stream processing, text analytics, and machine learning. He supports a variety of technologies, both SAS and open-source, in order to design, program, and implement analytical solutions for clients. Dan received an M.S. in Analytics from NC State and has a B.S. in Electrical Engineering from the University of Akron.

Session(s):
IoT, Streaming Analytics and Machine Learning: Delivering Real-Time Intelligence With Apache NiFi
Tuesday, June 28, 2016, 5:00PM - 5:40PM
BALLROOM B
Zhe Zhang

LinkedIn
Zhe Zhang is a software engineer on LinkedIn's Hadoop team. He's an Apache Hadoop Committer and author of HDFS Erasure Coding, a major feature for Hadoop 3.0. Before LinkedIn, Zhe worked at Cloudera and IBM T. J. Watson Research Center. Zhe has over 20 research publications and 5 US patents. While at IBM he received the Research Accomplishment Award and Outstanding Technology Achievement Award.

Session(s):
Debunking the Myths of HDFS Erasure Coding Performance
Wednesday, June 29, 2016, 12:20PM - 1:00PM
BALLROOM C
Jason Zhang

Airbnb
Jason Zhang is a software engineer on the Data Infrastructure team at Airbnb. Prior to Airbnb, he was a software engineer at LinkedIn, where he focused on Distributed Data Systems. Jason is also one of the initial committers for the Apache Helix project.

Session(s):
Reliable and Scalable Data Ingestion at Airbnb
Wednesday, June 29, 2016, 5:00PM - 5:40PM
BALLROOM A
Jianfeng Zhang

Hortonworks
Jeff has 7 years of experience in the big data industry. He has used Hadoop since 2009 and is a Pig committer. His experience covers not only big data infrastructure but also the application level, in particular how to leverage big data tools to get insight from data. He now works at Hortonworks as a member of technical staff, focusing mostly on Tez and Spark.

Session(s):
Zeppelin + Livy: Bringing Multi Tenancy to Interactive Data Analysis
Tuesday, June 28, 2016, 4:10PM - 4:50PM
210C
Yan Zhou

IBM
Lead Architect of Astro (Spark-SQL-on-HBase) Project; Hadoop Pig committer; member of former Yahoo Hadoop Team; 15+ years of analytical database experience.

Session(s):
Hive Metastore Security of Apache Ranger
Tuesday, June 28, 2016, 5:50PM - 6:30PM
212
Adriana Zubiri

IBM
Adriana Zubiri is a Program Director in the Big Data organization at the IBM Toronto Lab. Adriana is responsible for leading the worldwide engineering team for Big SQL, IBM's exciting technology that extends the power of SQL to the world of Apache Hadoop as part of the IBM BigInsights offering. Adriana is recognized within the industry as an expert in the area of big data and data warehouses, based on her extensive work with clients and her numerous papers and conference presentations.