CS410 Text Information Systems (Spring 2014)

Instructor: ChengXiang Zhai

Readings


Because the course covers a wide range of topics, it is very difficult to find an appropriate textbook, and some additional readings have to be used. However, in many cases, you are not required to read or to fully understand the whole content of a paper or book chapter. This page is intended to provide some information about where the key contents are and which part(s) to focus on when reading a paper/book.

Core contents

In general, the lecture slides are the best "definition" of the core contents -- the contents to be tested. That is, you are expected to understand all the major points and algorithms that we have discussed in class; anything beyond the slides can be regarded as optional. The last slide of each lecture usually summarizes what you should know for that lecture. You should check the last slide to make sure that you indeed understand all the major points and any necessary technical details. Since some material we cover in the lectures cannot be readily found in any of the reading materials, you should make every effort to come to each class or watch every lecture video in a timely manner. Come to the office hours
  1. V. Bush, As we may think, 1945.

    This is truly a classic paper. Read it to appreciate Bush's great vision, which has NOT yet been completely realized. If possible, read everything starting from section 6.

  2. M. Sanderson, W. B. Croft, The History of IR Research

    This is an excellent review of the history of IR research. Read the entire paper.

  3. A. Singhal, Modern Information Retrieval: A Brief Overview, In IEEE Data Engineering Bulletin 24(4), pages 35-43, 2001.

    This is a very good overview paper of IR, though it's a bit out of date and slightly biased toward empirically effective techniques. Your goal in reading it is to learn the general history of IR and get a summary of IR techniques from an empirical perspective. Read the whole paper.

  4. IR Book Chapter 6

    Optional. Read whatever you feel is useful to you.

  5. IR Book Chapters 4, 5, and 7

    Optional. Read whatever you feel is useful to you.

  6. IR Book Chapter 8

    Read 8.3 and 8.4. Other sections should also be very interesting to read, though not required.

  7. SLM for IR, Ch1 & Ch3

    Read all of Chapter 1 and Sections 3.1 to 3.4 of Chapter 3.

  8. Note on KL-div Retrieval Model

    Read the entire note.

  9. Chapters 19-21, Information Retrieval book

    Read sections 19.1-19.5; Chapter 20 is optional, but would be interesting to read if you want to know more about crawling; read the entire Chapter 21.

  10. C. Zhai and others, Threshold Calibration in CLARIT Adaptive Filtering, Proceedings of TREC 1998.

    The main goal is to understand Section 3. You may want to read some other parts, especially Sections 1 and 2, to get some background.

  11. John S. Breese, David Heckerman, Carl Kadie, Empirical Analysis of Predictive Algorithms for Collaborative Filtering (1998)

    Read Section 1 and Sections 2.1-2.2. The goal is to know how memory-based algorithms work.

  12. Chapter 14, Information Retrieval book

    Read sections 14.1-14.5.

  13. Chapters 16-17, Information Retrieval book

    Read sections 16.1-16.4; read the entire Chapter 17.