
CSS3107: INFORMATION RETRIEVAL AND WEB SEARCH

Core/Elective: Elective | Semester: 1 | Credits: 3

Course Description

A coherent treatment of classical and web-based information retrieval, covering web search, text classification, text clustering, the gathering, indexing and searching of documents, and methods for evaluating retrieval systems.

Course Objectives

Learn basic and advanced techniques for text-based information systems: efficient text indexing, Boolean and vector space retrieval models, and evaluation and interface issues
Study Web search, including crawling, link-based algorithms, and Web metadata
Understand the dynamics of the Web by building appropriate mathematical models
Build working systems that assist users in finding useful information on the Web

Course Content

1. Taxonomy of IR models - Classic models - Set theoretic model - Algebraic models - Probabilistic model - Structured text retrieval models - Models for browsing - Retrieval evaluation - Reference collections

2. Query languages - Query operations - Text and multimedia languages - Text operations - Document preprocessing - Matrix decompositions and latent semantic indexing - Text compression - Indexing and searching - Inverted files - Suffix trees - Boolean queries - Sequential searching - Pattern matching

3. Text classification and Naïve Bayes - Vector space classification - Support vector machines and machine learning on documents - Flat clustering - Hierarchical clustering

4. Web search basics - Web characteristics - Index size and estimation - Near duplicates and shingling - Web crawling - Distributing indexes - Connectivity servers - Link analysis - Web as a graph - PageRank - Hubs and authorities - Question answering

5. Online IR systems - Online public access catalogs - Digital libraries - Architectural issues - Document models - Representations and access - Protocols

REFERENCES

1. Modern Information Retrieval: The Concepts and Technology behind Search (2nd Ed): Ricardo Baeza-Yates, Berthier Ribeiro-Neto, Addison-Wesley (2010)
2. Introduction to Information Retrieval (1st Ed): Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press (2008)
3. Search Engines: Information Retrieval in Practice (1st Ed): Bruce Croft, Donald Metzler and Trevor Strohman, Addison-Wesley (2009)


Copyright © 2009-20 Department of Computer Science, CUSAT
Cochin University of Science & Technology
Cochin-682022, Kerala, India