The course is lecture-based with a midterm examination. There are individual and group assignments, which often involve using a retrieval toolkit to implement an algorithm and/or experiment with real text data.
Attendance is mandatory, but use common sense if you are sick or run into any emergency situation.
If you cannot attend a class, you must send (or ask someone to send) an explanation message to the instructor
no later than 24 hours after the class. For example, if you cannot go to a class on Wednesday, you need to send a
message before 2:00pm the next day (i.e., Thursday). Note that attending the lectures is often the only chance for you to learn
certain material, as you may not find it in any textbook or other readings.
The assignments are designed to ensure that every student has a deep and precise understanding of the major algorithms and gains hands-on experience with using a retrieval toolkit. Students are therefore generally required to complete them independently unless it is a group assignment. Assignments may be of two flavors or a mixture of them: 1) short written problem sets to test your understanding of the material; 2) experimentation and machine problems to provide an opportunity to work on a toolkit or experiment with algorithms.
Discussion with others is allowed to the extent of helping understand the material. The course newsgroup may be a good place for discussions. The purpose of student collaboration is to facilitate learning, not to circumvent it. The actual solution must be done by each student alone, and the student should be ready to reproduce their solution upon request. If any substantial discussion happens, everyone involved must write down the names of the people they discussed it with and the nature or topic of the discussion. In any case, you must exercise academic integrity. See the University Policy on Academic Integrity, especially the section on plagiarism.
A late submission results in a reduced grade for the assignment unless an extension has been granted by the instructor. An assignment is worth full credit at the beginning of class on the due date (later if an extension has been granted). It is worth at most 90% credit for the next 24 hours. It is worth at most 75% credit for the following 24 hours. It is worth 50% credit after that. Except in exceptional cases, assignments will not be accepted once they are two weeks past the due date: if your assignment is turned in 14 days or more after the due date, it will not be graded and you will receive zero credit for it. If you need an extension, please ask for it by emailing the instructor as soon as the need is known. Extensions that are requested promptly will be granted more liberally.
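The late-credit schedule above can be read as a simple step function. The following is only an illustrative sketch, not an official grading tool: the function name `max_credit_fraction`, its parameters, and the handling of exact boundaries (for example, treating a submission exactly 24 hours late as still in the 90% window) are assumptions, and exceptional cases remain at the instructor's discretion.

```python
def max_credit_fraction(hours_late: float, extension_hours: float = 0.0) -> float:
    """Return the maximum fraction of credit for a submission.

    hours_late:      hours after the original deadline (beginning of class
                     on the due date); zero or negative means on time.
    extension_hours: length of any extension granted by the instructor
                     (hypothetical parameter used for illustration).
    """
    # An extension simply moves the deadline later.
    effective = hours_late - extension_hours

    if effective <= 0:
        return 1.0    # on time: full credit
    if effective <= 24:
        return 0.90   # first 24 hours: at most 90%
    if effective <= 48:
        return 0.75   # following 24 hours: at most 75%
    if effective < 14 * 24:
        return 0.50   # after that, up to two weeks: 50%
    return 0.0        # 14 days or more late: not graded


if __name__ == "__main__":
    for hours in (0, 10, 30, 72, 14 * 24):
        print(f"{hours:>4} hours late -> {max_credit_fraction(hours):.0%}")
```

For example, under these assumptions a submission 30 hours late with a 24-hour extension is treated as 6 hours late and can earn at most 90%.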
There is a difference between asking questions about the course content and asking questions about homework:
Thus, if you don't understand a concept or method taught in class, you should ask the instructor and/or the TAs questions or try to discuss the problem with others, unless you would like to see if you can figure it out by yourself, which, of course, is also beneficial to you.
However, you must complete your homework independently in any case; this is so that we can be sure that you really benefit from doing the homework and that homework grading is fair.
Thus, in any case, you should not ask the instructor, the TA, or any other students to explain how to solve a problem in an assignment for the purpose of skipping the process of problem solving.
If you are truly stuck with some homework problem and don't really know how to proceed, you should come to the office hours of the instructor or TA, who will help you proceed. You may also discuss with your classmates to get some help in such a case, but if you have received substantial help from any other student(s) for finishing your homework, you must briefly describe what kind of help you have received from others. Note that both the person who provided help and the person who received help should report this collaboration. It could be as simple as:
* XXX explained to me how to do YYY step in problem 2.
* I explained how to do XXX for problem 2 to XXX.
The purpose of the course project is twofold: (1) to give students opportunities to
apply what has been learned in the course to solve real-world text information management and analysis problems;
(2) to allow students to explore ideas and techniques for text information management and analysis by working
on a real problem. Teamwork is allowed and encouraged. There will be a number of "instructor-designed" project topics
available for you to choose from, but you are also very welcome, indeed encouraged, to come up with an
interesting topic of your own. You will be asked to give a short presentation of your course project and submit
a 4-6 page written project report at the end of the semester. See Project Page for details.
Every student who takes the course for 4 credit hours is required to finish a literature review on a topic within the scope of the course. The topic will be selected by the student with the approval of the instructor. Often the selected topic will be related to the course project that the student is involved in, but it does not have to be. It can also be a topic covered in the lectures. You must decide on a topic for the literature review by March 19, 2015 at the latest, and finish the literature review by April 14, 2015.
When multiple students work on the same project as a team, they can each choose to focus on a different sub-topic, or on a topic not related to the course project, and finish a separate review. They can also work together as a group to finish a much more comprehensive review. However, the literature review does not have to be tied to the course project. That is, you can work with one group of people to complete a literature review on topic X, while working with another group to complete a course project on topic Y.
A group literature review must include a clear statement about the division of work so as to show that everyone has indeed contributed significantly to the combined review. Such a group review is also expected to be much longer than an individual review, though there is no strict length requirement. In particular, if k people work together, you are not required to write a report k times as long as an individual review would be. Indeed, your report is expected to be shorter than that because redundancy is removed. However, your report should show "sufficient" work by each of you, where "sufficient" is defined as "reading at least 6 papers" and "writing at least 3~4 pages" by each person. Please list specific names when you post the topic of your group survey.
The goal of your literature review is to synthesize a set of papers about a topic. Note that this is different from a simple list of paragraphs covering each paper; instead, you should try to organize the papers you've read into a structure and connect them so that you can discuss their similarities and differences. Your review should help a reader see a clear overall picture of the papers that you reviewed. It is unnecessary to cover many details of any paper, as a reader can and will read the original paper if he or she is interested. That is, your review mostly serves as an entry point to the relevant literature. Picture it as the first reading that someone would look at in order to learn about research work on the topic, and try to write your review for such a reader.
Decide whether you would like to do a broad shallow survey or a narrow deep survey. A broad shallow survey can cover many papers (e.g., more than 10 or even 20 papers), but only briefly mention what is in each paper. A narrow one can cover just 6~10 papers, but with more detailed description of each paper. Either strategy is acceptable. In the first case, you can read broadly, but you don't need to read each paper in detail; in the second, you will read fewer papers, but you will also need to understand each paper in more detail. You may make this decision based on which strategy would help you most in finishing your project or based on your own preferences.
Select a set of "seed" papers first. You can find them by doing a general literature search on the Web, focusing on papers that are highly cited. Then you can check which papers have cited them. Focus on newer papers on your topic. It's better to read the newest papers first and then work backward to relatively older papers. This way your survey will reflect the most recent progress, making it more useful. If you find any existing survey or review of the topic or a related topic, read it and try to build on top of it, rather than repeat what has already been surveyed.
Start reading papers as soon as possible. It takes some time to read a paper, especially if the paper is complicated. So you should act early to ensure you will be able to read at least 6 papers in detail or 10 papers in a shallow way by the end of the semester.
If you aren't sure about which papers to focus on, please email the instructor for a discussion.
Check out this presentation for the typical structure of a literature survey paper.
As another option, your literature review may also be on a topic covered in our lectures. In such a case,
your literature review will be more like a short tutorial introduction to a topic, or a relatively self-contained
review of a topic that we covered. For example,
you may write a short introduction to the Vector Space Retrieval Model. You may use the lecture slides as the basis, but you should find and add some additional readings to enrich the content. For example, you may find a few resources where a reader can find more detailed explanations of some concepts or methods. The length is flexible as long as you have at least covered thoroughly the slides of the lecture(s) on that topic. Such a literature review may be very useful
to you (as well as your peers) for preparing for the midterm exam.
Your literature review is due on Tuesday, April 14, 2015, 11:59pm. Submit your literature review by posting it on
this literature review wiki page (there will be instructions there on how to post it).
Grading will be based on the following weighting scheme:
For students taking the course for 4 credit hours, this weighting scheme is only applied to 75 points out of the 100 points. The remaining 25 points are based on the literature review, which will be graded as "pass or fail", contributing either 25 or 0 points to your final grade.
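The 75/25 split for 4-credit-hour students can be sketched as below. This is a hedged illustration rather than the official grade calculation: the function `final_grade` and its parameters are hypothetical, `base_score` stands for whatever score (on a 0-100 scale) the weighting scheme produces, and the sketch assumes that for all other students the weighting scheme applies to the full 100 points.

```python
def final_grade(base_score: float, credit_hours: int, review_passed: bool = False) -> float:
    """Combine the weighted course score with the literature review.

    base_score:    score from the weighting scheme on a 0-100 scale
                   (the scheme itself is not modeled here).
    credit_hours:  4 for students who must write the literature review.
    review_passed: whether the pass/fail literature review was passed.
    """
    if not 0 <= base_score <= 100:
        raise ValueError("base_score must be between 0 and 100")

    if credit_hours == 4:
        # The weighting scheme counts for 75 of the 100 points;
        # the literature review contributes either 25 or 0 points.
        review_points = 25.0 if review_passed else 0.0
        return 0.75 * base_score + review_points

    # Assumption: otherwise the weighting scheme covers all 100 points.
    return base_score


if __name__ == "__main__":
    print(final_grade(80, credit_hours=4, review_passed=True))
    print(final_grade(80, credit_hours=4, review_passed=False))
    print(final_grade(80, credit_hours=3))
```

Under these assumptions, a 4-credit-hour student with a weighted score of 80 ends up with 85 points if the review passes and 60 points if it fails.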