CS397-CXZ Introduction to Text Information Systems (Fall 2003)





Schedule (anticipated)


Date Topic Readings Notes
8/29 Fri Course policies (ppt, ps); Overview of Text Management (ppt, ps) FOA Ch1, Bush 45

Understanding Text: Natural Language Processing
9/3 Wed Introduction to NLP (ppt, ps) FOA Ch2, Abney 96, Charniak 97
9/5 Fri Basic probability & statistics (ppt, ps) Rosenfeld's notes (estimation, information theory) Assign #1 out
9/10 Wed Statistical language models (ppt, ps) FOA Ch3.1-3.3.3, Ch5.1, Rosenfeld 00
9/12 Fri Language Model Smoothing (ppt, ps) [Chen & Goodman 98]
9/17 Wed Mixture language models and EM algorithm (ppt, ps) Rosenfeld's note, Bilmes 98 [Minka 98]

Accessing Textual Information: Text Retrieval and Filtering
9/19 Fri Overview of Text Retrieval (ppt, ps) Singhal's review, TREC measures Assign #1 due
9/24 Wed Retrieval Models: vector space (ppt, ps) FOA Ch3.3-3.8, *Salton & Buckley 88, Singhal et al. 96 Assign #2 out
9/26 Fri Retrieval Models: probabilistic (ppt, ps) FOA Ch5.5, Lafferty & Zhai 03 [Sparck Jones & Robertson 98, Cooper et al. 92, Fuhr 92]
10/1 Wed Implementation Issues and TR systems (ppt, ps) Excerpts from MG and MIR
10/3 Fri Language Models for Text Retrieval (ppt) Zhai & Lafferty 01 (SIGIR) [Ponte & Croft 98]
10/8 Wed Model-based feedback (ppt) FOA Ch7.1, Zhai & Lafferty 01 (CIKM) and the note about the paper
10/10 Fri EM algorithm revisited (ppt); Applications of basic TR techniques (ppt) Rosenfeld's note, Bilmes 98 [Minka 98] Assign #2 due, Assign #3 out
10/15 Wed Risk Minimization Retrieval Framework (ppt) Risk Min. Draft paper
10/17 Fri Information Filtering (ppt) FOA Ch7.6, Zhang & Callan 01, Breese et al. 98
10/22 Wed Midterm Examination Review List Assign #3 due

Organizing Text: Text Categorization and Clustering
10/24 Fri Text Categorization (ppt) FOA Ch7.2-7.5, Sebastiani
10/29 Wed Clustering (ppt) FOA Ch5.2-5.4, Steinbach 2000 Assign #4 out
10/31 Fri Clustering (cont.)

Extracting Knowledge from Text: Information Extraction and Text Mining
11/5 Wed Hidden Markov Models (HMMs) (guest lecture by Tao Tao) (ppt) Bilmes 98, Rabiner 89, A brief note
11/7 Fri Hidden Markov Models (cont.) (ppt) Bilmes 98, Rabiner 89, A brief note
11/12 Wed Hidden Markov Models (cont.) (ppt) Bilmes 98, Rabiner 89, A brief note Assign #4 due, Assign #5 out
11/14 Fri Generation of Hypertext from Text (ppt) FOA Ch6.1-6.5, [Zhai 97]
11/19 Wed Information Extraction (guest lecture by Scott Wen-tau Yih) (ppt) [Freitag & McCallum 2000, Chieu & Ng 2002, Roth & Yih 2001]
11/21 Fri User Interface and Visualization (ppt) Hearst 99

Applications: Web Information Management
11/26 Wed No class (Thanksgiving)
11/28 Fri No class (Thanksgiving)
12/3 Wed Overview of Web Information Management (ppt) Kosala & Blockeel 2000, Berkeley Teaching Library search engine survey Assign #5 due
12/5 Fri Structured text retrieval models (ppt) Kleinberg 98, Page et al. 98
12/10 Wed Review for Final (ppt)
12/12 Fri (last class) Course Summary (ppt) IR challenges workshop and the draft report
12/13 Sat Reading day
12/16 Tuesday, 1:30-4:30pm Final Examination, 169 Everitt (location)