| Date | Topic | Readings | Notes |
|---|---|---|---|
| 8/29 Fri | Course policies (ppt, ps); Overview of Text Management (ppt, ps) | FOA Ch1, Bush 45 | |
| Understanding Text: Natural Language Processing | | | |
| 9/3 Wed | Introduction to NLP (ppt, ps) | FOA Ch2, Abney 96, Charniak 97 | |
| 9/5 Fri | Basic probability & statistics (ppt, ps) | Rosenfeld's notes (estimation, information theory) | Assign #1 out |
| 9/10 Wed | Statistical language models (ppt, ps) | FOA Ch3.1-3.3.3, Ch5.1, Rosenfeld 00 | |
| 9/12 Fri | Language Model Smoothing (ppt, ps) | [Chen & Goodman 98] | |
| 9/17 Wed | Mixture language models and EM algorithm (ppt, ps) | Rosenfeld's note, Bilmes 98 [Minka 98] | |
| Accessing Textual Information: Text Retrieval and Filtering | | | |
| 9/19 Fri | Overview of Text Retrieval (ppt, ps) | Singhal's review (Error), TREC measures | Assign #1 due |
| 9/24 Wed | Retrieval Models: vector space (ppt, ps) | FOA Ch3.3-3.8, *Salton & Buckley 88, Singhal et al. 96 | Assign #2 out |
| 9/26 Fri | Retrieval Models: probabilistic (ppt, ps) | FOA Ch5.5, Lafferty & Zhai 03 [Sparck Jones & Robertson 98, Cooper et al. 92, Fuhr 92] | |
| 10/1 Wed | Implementation Issues and TR systems (ppt, ps) | Excerpts from MG and MIR | |
| 10/3 Fri | Language Models for Text Retrieval (ppt) | Zhai & Lafferty 01 (SIGIR) [Ponte & Croft 98] | |
| 10/8 Wed | Model-based feedback (ppt) | FOA Ch7.1, Zhai & Lafferty 01 (CIKM) and the note about the paper | |
| 10/10 Fri | EM algorithm revisited (ppt); Applications of basic TR techniques (ppt) | Rosenfeld's note, Bilmes 98 [Minka 98] | Assign #2 due, Assign #3 out |
| 10/15 Wed | Risk Minimization Retrieval Framework (ppt) | Risk Min. Draft paper | |
| 10/17 Fri | Information Filtering (ppt) | FOA Ch7.6, Zhang & Callan 01, Breese et al. 98 | |
| 10/22 Wed | Midterm Examination | Review List | Assign #3 due |
| Organizing Text: Text Categorization and Clustering | | | |
| 10/24 Fri | Text Categorization (ppt) | FOA Ch7.2-7.5, Sebastiani | |
| 10/29 Wed | Clustering (ppt) | FOA Ch5.2-5.4, Steinbach 2000 | Assign #4 out |
| 10/31 Fri | Clustering (cont.) | ||
| Extracting Knowledge from Text: Information Extraction and Text Mining | | | |
| 11/5 Wed | Hidden Markov Models (HMMs) (guest lecture by Tao Tao) (ppt) | Bilmes 98, Rabiner 89, A brief note | |
| 11/7 Fri | Hidden Markov Models (cont.) (ppt) | Bilmes 98, Rabiner 89, A brief note | |
| 11/12 Wed | Hidden Markov Models (cont.) (ppt) | Bilmes 98, Rabiner 89, A brief note | Assign #4 due, Assign #5 out |
| 11/14 Fri | Generation of Hypertext from Text (ppt) | FOA Ch 6.1-6.5, [Zhai 97] | |
| 11/19 Wed | Information Extraction (guest lecture by Scott Wen-tau Yih) (ppt) | [Freitag & McCallum 2000, Chieu & Ng 2002, Roth & Yih 2001] | |
| 11/21 Fri | User Interface and Visualization (ppt) | Hearst 99 | |
| Applications: Web Information Management | | | |
| 11/26 Wed | No class (Thanksgiving) | ||
| 11/28 Fri | No class (Thanksgiving) | ||
| 12/3 Wed | Overview of Web Information Management (ppt) | Kosala & Blockeel 2000, Berkeley Teaching Library Search engine survey | Assign #5 due |
| 12/5 Fri | Structured text retrieval models (ppt) | Kleinberg 98, Page et al. 98 | |
| 12/10 Wed | Review for Final (ppt) | ||
| 12/12 Fri (last class) | Course Summary (ppt) | IR challenges workshop and the draft report | |
| 12/13 Sat | Reading day | ||
| 12/16 Tuesday, 1:30-4:30pm | Final Examination, 169 Everitt (location) | ||