English

Information Retrieval Modeling (IRM)

No single model or theory dominates information retrieval, unlike in the area of databases, where the relational model is the dominant database model. In information retrieval, some models work for some applications, whereas others work for other applications. For instance, vector space models are well suited to similarity search and relevance feedback in many situations (including non-textual ones) if a good weighting function is available; the probabilistic retrieval model or the naive Bayes model might be a good choice if examples of relevant and nonrelevant documents are available; Google's PageRank model is often used in situations that need modelling of more or less static relations between documents; region models have been designed to search in structured text; and language models are helpful in situations that require models of language similarity or document priors. In this tutorial, I carefully describe all these models by explaining the consequences of modelling assumptions. I address approaches based on statistical language models in great depth. After the course, students are able to choose a model of information retrieval that is adequate in new situations, and to apply the model in practical situations.
Attribution: The Open Education Consortium
http://www.ocwconsortium.org/courses/view/f8cf8649cd197160f4da8b5e4896120f/
Course Home http://videolectures.net/russir09_hiemstra_irm/