Posted January 28, 2016 There are a lot of great tools that can help us work with Big Data, but they all demand substantial resources. How can we ease this CPU/RAM burden? One way is to share the data we are working on, and the results of our computations, with others.
Posted January 20, 2016 If you are interested in the workings of the open source community, its economic incentives, how a rag-tag band of developers can produce so much quality output, and how you can contribute to it, then this is a light-hearted, step-by-step walkthrough from someone who jumped in after looking on from the outside for too long.
Posted January 20, 2016 Understanding the physical plan of a big data application is often crucial for tracking down bottlenecks and faulty behavior. Although Apache Spark offers a useful Web UI component for monitoring and understanding the logical plan of jobs, it lacks a tool for understanding the physical plan of the task scheduler and for monitoring execution at a very low level, along with the communication triggered by RDDs and remote block requests...
Posted January 13, 2016 Bill Porto, Senior Engineering Analyst at RedPoint Global Inc., reveals in his blog what he'll be presenting in his community choice session at Hadoop Summit in Dublin. His session is part of the "Data Science Applications for Hadoop" track.
Posted January 13, 2016 'Overview of Apache Flink: the 4G of Big Data Analytics Frameworks' is a community choice winner in 'The Future of Hadoop' track of the 2016 Hadoop Summit, which takes place in Dublin, Ireland on April 13-14, 2016. Click here to learn more about the winning session, delivered by Slim Baltagi, Director of Big Data Engineering, Capital One.