Data Mining has evolved as a new discipline at the intersection of several existing areas, including Database Systems, Machine Learning, Optimization, and Statistics. An important question is whether the field has matured to the point where it has originated substantial new problems and techniques that distinguish it from its parent disciplines. In this talk, we will discuss a class of new problems and techniques that show great promise for exploratory mining, while synthesizing and generalizing ideas from the parent disciplines. While the class of problems we discuss is broad, there is a common underlying objective: to look beyond a single data mining step (e.g., data summarization or model construction) and address the combined process of data selection and transformation, parameter and algorithm selection, and model construction. The fundamental difficulty lies in the large space of alternative choices at each step, and good solutions must provide a natural framework for managing this complexity. We regard this as a grand challenge for Data Mining, and see the ideas in this talk as promising initial steps towards a rigorous exploratory framework that supports the entire process.
This is joint work with several people, in particular Bee-Chung Chen.
About the Speaker
Raghu Ramakrishnan is VP and Research Fellow at Yahoo! Research, where he heads the Community Systems group. He is on leave from the University of Wisconsin-Madison, where he is Professor of Computer Sciences, and was founder and CTO of QUIQ, a company that pioneered question-answering communities such as Yahoo! Answers, and provided collaborative customer support for several companies, including Compaq and Sun. His research is in the area of database systems, with a focus on data retrieval, analysis, and mining. He has developed scalable algorithms for clustering, decision-tree construction, and itemset counting, and was among the first to investigate mining of continuously evolving, stream data. His work on query optimization and deductive databases has found its way into several commercial database systems, and his work on extending SQL to deal with queries over sequences has influenced the design of window functions in SQL:1999. His paper on the Birch clustering algorithm received the SIGMOD 10-Year Test-of-Time award, and he has written the widely-used text "Database Management Systems" (WCB/McGraw-Hill, with J. Gehrke), now in its third edition.
He is Chair of ACM SIGMOD, on the Board of Directors of ACM SIGKDD and the Board of Trustees of the VLDB Endowment, and has served as editor-in-chief of the Journal of Data Mining and Knowledge Discovery, associate editor of ACM Transactions on Database Systems, and the Database area editor of the Journal of Logic Programming. Dr. Ramakrishnan is a Fellow of the Association for Computing Machinery (ACM), and has received several awards, including a Packard Foundation Fellowship, an NSF Presidential Young Investigator Award, and an ACM SIGMOD Contributions Award.
Official Website: http://sfbayacm.org/events/2007-05-09.php
Added by marstein on April 9, 2007