CS397-CXZ Introduction to Text Information Systems (Fall 2003)

Assignments


General instructions

Assignments are generally due at the beginning of class (i.e., 2pm on a Wednesday or Friday). Turn in a hardcopy of your answers in class. If, for any reason, you cannot make it to class on the day an assignment is due, or if you have obtained an extension of the deadline, please turn it in to Heather Hall. Computer printouts are highly appreciated, as they eliminate many problems such as ambiguous symbols and unclear handwriting. However, handwritten reports are also acceptable; in that case, please write CLEARLY to avoid misgrading.

When an assignment involves coding or exploration of data, you may also be required to copy your code or other files to your handin directory on the CSIL machines. Your handin directory is /home/class/cs397cxz/handin/assignX/YOUR_ID, where "YOUR_ID" is your CSIL account user name. Note that each assignment has a separate handin directory. If you log in to any of the machines in the Linux lab, e.g., csil-linux4, you should be able to see this directory. Copy your files before the deadline, as write permission on your handin directory may be removed after the deadline.
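As a rough illustration of the handin layout described above, the following Python sketch builds the directory path and copies files into it. It assumes that the "X" in "assignX" is the assignment number; the function names `handin_dir` and `copy_to_handin` and the user name `jdoe` are hypothetical and not part of any course tooling. Copying the files by hand works just as well.

```python
import shutil
from pathlib import Path

# Root of the handin area, as given in the instructions above.
HANDIN_ROOT = Path("/home/class/cs397cxz/handin")


def handin_dir(assignment, user_id, root=HANDIN_ROOT):
    """Return the handin directory for one assignment and one user.

    Assumption: the "X" in "assignX" is the assignment number (1, 2, ...).
    """
    return Path(root) / f"assign{assignment}" / user_id


def copy_to_handin(files, assignment, user_id, root=HANDIN_ROOT):
    """Copy the given files into the handin directory and return the new paths.

    Raises FileNotFoundError if the handin directory does not exist.
    shutil.copy2 raises PermissionError if the directory is not writable,
    for example after the deadline has passed.
    """
    dest = handin_dir(assignment, user_id, root)
    if not dest.is_dir():
        raise FileNotFoundError(f"handin directory not found: {dest}")
    copied = []
    for f in files:
        # copy2 keeps the file name and preserves timestamps.
        copied.append(Path(shutil.copy2(f, dest)))
    return copied


if __name__ == "__main__":
    # Hypothetical user name; replace with your own CSIL account name.
    print(handin_dir(1, "jdoe"))
```

Running the copy well before the deadline leaves time to notice a permission error and report it.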

Currently your handin directories are not writable. The problem is expected to be solved by Sept. 15.

1. Assignment 1: Probability & Statistics, Statistical NLP

2. Assignment 2: Pivoted normalization vs. BM25 (Okapi)

3. Assignment 3: Dirichlet Prior and Model-based Feedback

4. Assignment 4: Naive Bayes for Spam Filtering

5. Assignment 5: Hidden Markov Models for Information Extraction