Apache Hadoop (http://hadoop.apache.org) has rapidly become the platform of choice for data-intensive supercomputing around the world. Yahoo! is one of the main contributors and the largest user of Apache Hadoop in the world.
In this tutorial, we will learn how Yahoo! uses Hadoop to solve real-world big data problems. We will see how easy it is to develop MapReduce applications in Hadoop using a high-level dataflow language such as Pig Latin. We will get an in-depth look at Hadoop's core components, such as the Hadoop Distributed File System (HDFS) and the MapReduce programming framework.

This talk will be given by Dr. Milind Bhandarkar.

The evening will consist of two 90-minute talks, with a break in between. Food, drinks, and some networking will round off the evening.
The venue has not yet been confirmed, but it will be in central London. I will announce the venue by Friday 10th September, once it has been confirmed.

Schedule: Wednesday 22nd September

4:00pm: Registration and refreshments
4:30pm: Talk 1 - Overview of Apache Hadoop & MapReduce Programming

Apache Hadoop has rapidly become the platform of choice for data-intensive supercomputing. At Yahoo!, Hadoop runs on more than 38,000 servers, stores more than 170 petabytes of data, and performs millions of big data analytics computations every month. In this talk, we describe the two core components of Hadoop: the Distributed File System and the MapReduce programming framework. We will learn how to program using the Hadoop MapReduce framework, with numerous examples from real-world usage of Hadoop at Yahoo!.
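
For readers new to the programming model, here is a minimal sketch of the map, shuffle, and reduce steps using word count as the example. It is plain Python that runs on one machine and does not use the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, not material from the talk.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: collapse all values for one key into a single result."""
    return word, sum(counts)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data flows"]))
```

In a real cluster the map and reduce functions would run in parallel on many machines, with the framework handling the grouping step and the movement of data between them.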

6:00pm: Break with refreshments
6:30pm: Talk 2 - Introduction to Pig Programming

Apache Pig is a parallel dataflow system that uses Hadoop as its back-end distributed computation platform. More than 75% of Hadoop jobs at Yahoo! are invoked with Pig. In this talk, we introduce the dataflow language, Pig Latin. We will learn about the simplicity, flexibility, and configurability of Pig. We will describe the Pig architecture and how Pig dataflow programs are executed on the Hadoop MapReduce platform, with real examples.
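
To give a flavour of what a dataflow program looks like, the sketch below writes a small filter, group, and count pipeline as a sequence of named steps in plain Python. The comments name the Pig Latin operators (FILTER, GROUP, FOREACH ... GENERATE) that play a roughly similar role; the record layout, the run_dataflow function, and the min_seconds parameter are hypothetical and chosen only for illustration. This is not Pig itself and not an example from the talk.

```python
from itertools import groupby
from operator import itemgetter


def run_dataflow(records, min_seconds=10):
    """Count visits per URL, ignoring visits shorter than min_seconds.

    Each record is assumed to be a (user, url, seconds_on_page) tuple.
    """
    # Roughly like FILTER: keep only the sufficiently long visits.
    filtered = [r for r in records if r[2] >= min_seconds]

    # Roughly like GROUP ... BY url (groupby needs input sorted by the key).
    by_url = sorted(filtered, key=itemgetter(1))
    grouped = groupby(by_url, key=itemgetter(1))

    # Roughly like FOREACH ... GENERATE group, COUNT(...).
    return {url: sum(1 for _ in visits) for url, visits in grouped}


if __name__ == "__main__":
    visits = [
        ("ann", "/home", 20),
        ("bob", "/home", 5),
        ("cat", "/docs", 30),
        ("dan", "/home", 15),
    ]
    print(run_dataflow(visits))
```

Each step names an intermediate result and feeds the next one, which is the style of program that a dataflow system can translate into one or more MapReduce jobs.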

8:00pm: Food and Drinks
10:00pm: End

The event is free. Please register only if you are coming, as we have limited space.

Thank you
The Yahoo Developer Network

About Milind Bhandarkar

Dr. Milind Bhandarkar has been working with Hadoop and Pig since version 0.1.0 of both. He started the Yahoo! Grid Solutions team, which focuses on training, consulting for, and supporting thousands of new migrants to Hadoop and Pig. He has been focused on parallel programming languages and paradigms for over 20 years, and has a PhD in that field from the University of Illinois at Urbana-Champaign, USA. He worked at the Center for Development of Advanced Computing (C-DAC), the Center for Simulation of Advanced Rockets, Siebel Systems, and Pathscale before settling at Yahoo! in 2005. As Hadoop Solutions Architect at Yahoo!, Milind has enabled several mission-critical projects at Yahoo! to adopt (and adapt to) Apache Hadoop.

Added by anilkp on September 9, 2010