This is going to be the fifth German Hadoop Get Together in Berlin. As always, there will be slots of 20 minutes each for talks on your Hadoop topic. After each talk there will be plenty of time for discussion.
You can order drinks directly at the bar in the newthinking store. If you like, you can order pizza. After the official part we will go to one of the restaurants close by - exactly which one will be announced at the beginning of the event.
Talks scheduled so far:
Torsten Curdt: Data Legacy - the challenges of an evolving data warehouse
Abstract: "MapReduce is great for processing great data sets. A distributed file system can be used to store huge amounts of data. But what if your data format needs to adapt to new requirements? This talk will cover a simple introduction to Thrift and Protocol Buffers and sprinkle in some rants and approaches to manage your big data sets."
Christoph M. Friedrich, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI)
Title: "SCAIView - Lucene for Life Science Knowledge Discovery".
Abstract: "In the Life Sciences, the amount of freely available information is growing immensely. In Medline, a medical information system, more than 3000 citations are newly indexed every day. Today Medline contains approximately 19 million references and abstracts. Using machine learning and dictionary-based Named Entity Recognition, we extracted information on genes, drugs, SNPs and other Life Science entities from Medline. SCAIView, a Life Science Knowledge Discovery system that uses a multi-threaded Lucene to allow semantic and ontological search on this data, will be presented. Questions that can now be answered quickly include: What drugs are mentioned in the context of Alzheimer's disease? What genes are co-mentioned with Diabetes and are on the insulin signalling pathway?"
Uri Boness from JTeam in Amsterdam
Title: Solr - From Theory to Practice.
Abstract: "This session will introduce the attendees to Solr through a real-world example. We will show how Solr enabled us to replace an existing commercial search engine in one of the most popular online company directories in The Netherlands. We'll briefly discuss the decision-making process that led the company to explore open source alternatives to their search back end in general and why Solr was chosen in particular. We will then show how Solr's extensible infrastructure enabled us to implement non-trivial search functionality such as geo-location search and complex ranking rule schemes."
We would like to invite you, the visitor, to also tell your Hadoop story. If you like, you can bring slides; there will be a projector. Talks on related projects (HBase, CouchDB, Cassandra, Hive, Pig, Lucene, Solr, Nutch, Katta, UIMA, Mahout, ...) are of course welcome as well.
A big thank-you goes to the newthinking store for providing us with a room in the center of Berlin.
Official Website: http://www.isabel-drost.de/hadoop
Added by MaineC. on April 23, 2009