Spatial Ranking Methods for Geographic Information Retrieval
Ray Larson (UCB)

In contrast to the large body of work in GIS and spatial analysis, there has been little study of the use and effectiveness of different spatial approximations for Geographic Information Retrieval (GIR). Clearly, no one approach will apply to all types of geographic information resources and collections. The need to retrieve and evaluate information objects based on their geospatial characteristics increases as the geographical "aboutness" of the objects increases, e.g., a guidebook vs. a digital geographic data set of invasive plant species. Moreover, geospatial ranking methods are becoming increasingly important as the supply of and demand for geographic information grow. The quality of geospatial approximations in GIR, i.e., how closely they represent the original objects, constrains how accurately and effectively these objects can be retrieved and ranked. We have been exploring these issues and have developed some new algorithms for ranked retrieval of georeferenced objects. We have also been examining the indexing methods that can be employed for materials with geographic content or associations, including text materials. We have carried out a comparative analysis of several GIR algorithms and evaluated their relative performance using a test collection of geospatial metadata derived from the California Environmental Information Catalog (CEIC; http://ceres.ca.gov/catalog). This discussion is based on that work and on the results of two years of the GeoCLEF evaluation forum, part of the Cross-Language Evaluation Forum, which is organized as part of EU-funded efforts to develop digital libraries. This talk will examine this research and the results obtained from evaluating ranking methods and different spatial approximations of objects, in the context of both purely geographic retrieval rankings and combinations of text and location information in the search process.

Prof. Larson specializes in the design and performance evaluation of information systems, and the evaluation of user interaction with those systems. His background includes work as a programmer/analyst with the University of California Division of Library Automation (DLA), where he was involved in the design, development, and performance evaluation of the UC public access online union catalog (MELVYL). His research has concentrated on the design and evaluation of information retrieval systems, with an emphasis on digital libraries. Prof. Larson was the principal investigator for the "CHESHIRE Demonstration and Evaluation Project" sponsored by the US Dept. of Education, which developed a next-generation online catalog and full-text retrieval system. He was a co-principal investigator for the "Searching Unfamiliar Metadata Vocabularies" project sponsored by DARPA. Prof. Larson was also the principal investigator of the "Cross-Domain Resource Discovery: Integrated Discovery and Use of Textual, Numeric and Spatial Data" project sponsored by NSF as part of the International Digital Libraries program.

