Apache Hadoop Get Together @ Berlin

Tucholskystr. 48
Berlin, Bundesland Berlin

I would like to announce the September 2009 Hadoop Get Together at the newthinking store in Berlin.

When: 29 September 2009 at 5:00 pm
Where: newthinking store, Tucholskystr. 48, Berlin, Germany

As always, there will be slots of 20 minutes each for talks on your Hadoop topic. After each talk there will be plenty of time for discussion. You can order drinks directly at the bar in the newthinking store, and pizza if you like. There are quite a few good restaurants nearby, so we can go there after the official part.

Talks scheduled so far:
Thorsten Schuett, Solving Puzzles with MapReduce: MapReduce is most often used for data mining and filtering large datasets. In this talk we will show that it is also useful for a completely different problem domain: solving puzzles. Based on MapReduce, we can implement massively parallel breadth-first and heuristic search. MapReduce will take care of the hard problems, like parallelization, disk and error handling, while we can concentrate on the puzzle. Throughout the talk we will use the sliding puzzle (http://en.wikipedia.org/wiki/Sliding_puzzle) as our example.
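
To give a feel for the idea before the talk, here is a minimal in-memory sketch of breadth-first search on the 3x3 sliding puzzle, organised as repeated map and reduce rounds. It is not taken from the talk and does not use Hadoop; the function names (neighbors, map_phase, reduce_phase, bfs_depth) and the tuple encoding of the board are assumptions made for illustration only.

```python
from itertools import chain

SIZE = 3  # 3x3 sliding puzzle; 0 marks the blank (illustrative encoding)
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def neighbors(state):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, SIZE)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < SIZE and 0 <= c < SIZE:
            target = r * SIZE + c
            tiles = list(state)
            tiles[blank], tiles[target] = tiles[target], tiles[blank]
            yield tuple(tiles)


def map_phase(state):
    """Map step: emit one (successor, 1) pair per legal move."""
    return [(nxt, 1) for nxt in neighbors(state)]


def reduce_phase(pairs, visited):
    """Reduce step: collapse duplicate keys and drop states already seen."""
    return {state for state, _ in pairs} - visited


def bfs_depth(start, goal=GOAL, max_rounds=40):
    """Return the minimum number of moves from start to goal, or None."""
    frontier, visited = {start}, {start}
    for depth in range(max_rounds + 1):
        if goal in frontier:
            return depth
        # One "job" per BFS level: map over the frontier, then reduce.
        emitted = chain.from_iterable(map_phase(s) for s in frontier)
        frontier = reduce_phase(emitted, visited)
        if not frontier:
            return None
        visited |= frontier
    return None
```

Each pass through the loop stands in for one MapReduce job: the map step expands every state of the current level independently, which is the part a cluster could run in parallel, and the reduce step removes duplicates before the next level starts. In a real job the visited set would not fit in memory on one machine, which is presumably where the disk handling mentioned in the abstract comes in.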

Thilo Götz, Text analytics on jaql: Jaql (JSON query language) is a query language for JavaScript Object Notation that runs on top of Apache Hadoop. It was primarily designed for large-scale analysis of semi-structured data. I will give an introduction to jaql and describe our experiences using it for text analytics tasks. Jaql is open source and available from http://code.google.com/p/jaql.

Uwe Schindler, Lucene 2.9 Developments: Numeric Search, Per-Segment- and Near-Real-Time Search, new TokenStream API: Uwe Schindler presents some new additions to Lucene 2.9. In the first half he will talk about fast numerical and date range queries (NumericRangeQuery, formerly TrieRangeQuery) and their usage in geospatial search applications like the Publishing Network for Geoscientific & Environmental Data (PANGAEA). In the second half of his talk, Uwe will highlight various improvements to the internal search implementation for near-real-time search. Finally, he will present the new TokenStream API, based on AttributeSource/Attributes, which makes indexing more pluggable. Future developments in the flexible indexing area will make use of it. As a possible use case, Uwe will show a Tokenizer that uses custom attributes to index XML files into various document fields based on XML element names.

We would also like to invite you, the visitor, to tell your Hadoop story. If you like, you can bring slides; there will be a projector.

A big thanks goes to the newthinking store for providing a room in the center of Berlin for us. Another big thanks goes to Cloudera for sponsoring videos of the talks. Links to the videos will be posted here as well as on the Cloudera blog. Yet another big thanks goes to O'Reilly for providing three copies of "Hadoop: The Definitive Guide" that will be raffled at the event.

Official Website: http://www.isabel-drost.de/hadoop

