ApacheCon EU 2014 has ended
Register Now for ApacheCon Europe 2014 - November 17-21 in Budapest, Hungary. 

Big Data
Monday, November 17
 

11:30am

ETL Made Simple Using Spark - Mayur Rustagi, Sigmoid Analytics
Apache Spark is growing to be the most active project in the Apache Big Data ecosystem. It unlocks the ability to perform analytics in memory and iteratively. In this talk I will highlight several customer case studies in which we used different aspects of Apache Spark, from Streaming to Warehousing and ML. I will also show how the seamless integration of Streaming, ML and warehousing yields new opportunities for businesses to reach their data faster.

Speakers
Mayur Rustagi

CTO & Co-Founder, Sigmoid Analytics
Mayur Rustagi is the CTO & Co-founder of Sigmoid Analytics. His areas of expertise include real-time Big Data analytics using open source technologies like Apache Spark, Shark and Apache Hadoop. Sigmoid Analytics has worked with over 25 customers in the Big Data space including several...


Monday November 17, 2014 11:30am - 12:20pm
Arany

1:40pm

Accelerating Big Data Application Development With Cascading - Supreet Oberoi, Concurrent, Inc.
Cascading is a Java-based application development framework for building Big Data applications on Apache Hadoop. This open source framework lets developers use their existing skills, such as Java and SQL, to create enterprise-grade applications without having to think in MapReduce. It separates business logic from integration logic so that developers can quickly build and test data applications locally on their laptops and then deploy them on Hadoop. While typical enterprise data applications must cross multiple departments and frameworks, Cascading allows multiple departments to integrate their application components into a single data processing application. In this presentation, developers will get an introduction to Cascading and how it works, and then dive into how to build applications with it.

Speakers
Supreet Oberoi

VP of Field Engineering, Concurrent, Inc.
Supreet Oberoi is a hands-on, entrepreneurial technology leader with over two decades of experience in successfully developing transformative information technologies and working in leadership roles at Concurrent Inc., American Express, Oracle, Microsoft and many privately-held...


Monday November 17, 2014 1:40pm - 2:30pm
Arany
 
Tuesday, November 18
 

11:20am

DataType API by Example - Nick Dimiduk, HBase in Action
HBase has traditionally been a simple "byte-bucket", in strict homage to the BigTable paper. HBase 0.96 introduced a new API for making HBase "data type aware". This API provides the necessary encodings that preserve serialized order, and it has first-class support for complex rowkeys. It is also user-extensible. This session will introduce the API to developers with examples, including how to implement your own data types for HBase.

Speakers
Nick Dimiduk

Hortonworks, Inc
Nick Dimiduk is an HBase committer and an author of HBase in Action. He works on the HBase team at Hortonworks where his focus is on usability and performance. His involvement in Hadoop and HBase communities started in 2008 when his nightly ETL jobs were taking 20+ hours. Since...


Tuesday November 18, 2014 11:20am - 12:10pm
Tas

4:50pm

Apache Giraph: Start Analyzing Graph Relationships In Your Big Data In 45 Minutes (Or Your Money Back)! - Roman Shaposhnik, Pivotal
The genesis of Hadoop was in analyzing massive amounts of data with a MapReduce framework. SQL-on-Hadoop followed shortly after, paving the way for the whole schema-on-read notion. Discovering graph relationships in your data is the next logical step. Apache Giraph (modeled on Google’s Pregel) lets you apply the power of the BSP approach to unstructured data. In this talk we will focus on practical advice on how to get up and running with Apache Giraph, how to start analyzing simple data sets with built-in algorithms, and finally how to implement your own graph processing applications using the APIs provided by the project. We will then dive into how Giraph integrates with the Hadoop ecosystem (Hive, HBase, Accumulo, etc.) and provide a whirlwind tour of Giraph architecture.

Speakers
Roman Shaposhnik

Director of Open Source, Linux Foundation
Apache Software Foundation and Data, oh but also unikernels


Tuesday November 18, 2014 4:50pm - 5:40pm
Arany
 
Wednesday, November 19
 

3:00pm

The Other Apache Technologies Your Big Data Solution Needs - Nick Burch, Quanticate
In this talk, we'll take a look at a range of projects from the Apache Software Foundation, focusing on those which complement the "headline projects" to build out your big data solution. While we can't cover every project at Apache (there are just too many these days!), we'll take a tour through some of the upcoming and lesser-known established projects out there that should prove very helpful to you in building your big data solution. We'll see that Apache is more than just the webserver, Hadoop and Lucene, and with any luck point you at projects that'll save you time and effort!

Speakers
Nick Burch

CTO, Quanticate
Nick began contributing to Apache projects in 2003, and hasn't looked back since! Most of the projects Nick has worked on belong in the "Content" space, such as Apache POI (ex-PMC Chair), Apache Tika and Apache Chemistry. As well as coding projects, Nick is also involved in a number...


Wednesday November 19, 2014 3:00pm - 3:50pm
Arany