Information Retrieval

CS 6501 Fall 2014

Course Project

The course project gives students hands-on experience in solving novel information retrieval and/or text mining problems. The project thus emphasizes either research-oriented problems or "deliverables." Ideally, the outcome of your project will be publishable, e.g., your (unique) solution to some (interesting/important/new) problem, or tangible, e.g., a prototype system that can be demonstrated. Bonus points will be given to groups that meet either of the above criteria. Group work is strongly encouraged, but not required.

General steps

Your project will be graded based on the following required components:

An official rubric for the final report and a rubric for the project presentation will be provided after the project proposal is due.

Note that you are required to use the provided templates for your project proposal and final report. See the Resources page for the template and example file.

Pick a topic

You can either pick from a list of sample topics provided by the instructor or choose your own topic. Your starting point could be the essays that you wrote for Homework 1. You can browse all student essays to see if you can get any insights. This is also a good way to identify opportunities for collaborations.

Leveraging existing resources is especially encouraged, as it minimizes the amount of work that you have to do and lets you focus on developing your own ideas.

When picking a topic, try to ask yourself the following questions:

Form a team

You are encouraged to work with other students as a team. Teams may consist of up to three students in total. Teamwork not only gives you experience in working with others, but also allows you to work on a larger (presumably more important) topic.

Note that it is your responsibility to figure out how to contribute to your group project, so you will need to act proactively and in a timely manner if your group leader has not assigned a task to you. The instructor will assume that all team members contribute actively to the project, and the same grade will be given to every group member (unless special treatment is required).

Survey related work

While choosing a topic, it is very important to be aware of whether the problem you would like to tackle has already been solved. If so, figure out exactly where your novelty lies and whether that novelty benefits others. Your goal is to go beyond, rather than simply duplicate, the existing work. To minimize your effort, you are encouraged to leverage existing algorithms, toolkits, and other useful resources as much as possible. The instructor can also help you check related work. Please feel free to discuss your plan with the instructor before finalizing your proposal.

Write a project proposal

You are required to write a two-page proposal before you actually go in depth on a topic. In the proposal, you should address the following questions and include the names of all the team members as authors. The order of the authors' names does not matter.

Intuitively, the proposal should read like the introduction of a regular research paper. Briefly state the background/motivation, what has been done, what is missing, how you plan to solve the problem, and how you plan to prove the usefulness of your method, and summarize your contribution(s).

Work on the project

You should leverage existing tools and methods as much as possible. For example, consider using the Lucene toolkit for indexing and searching a large text corpus, the Stanford NLP parser or the OpenNLP toolkit for text analysis, and MALLET or WEKA for classification or clustering. There are also many tools available on the Internet. See the Resources page for some useful pointers. Discuss any problems or issues with your teammates or classmates. If you need special support, please let the instructor know.
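To make "indexing and searching" concrete, here is a minimal Python sketch of the inverted-index idea that toolkits such as Lucene build on. It is an illustration only and does not use Lucene's API: the function names (tokenize, build_index, search) and the three-document corpus are hypothetical, and a real toolkit adds ranking, compression, and much more.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical three-document corpus, for illustration only.
docs = {
    1: "Information retrieval with an inverted index",
    2: "Text mining and information extraction",
    3: "Sentiment analysis of social media text",
}
index = build_index(docs)
print(search(index, "information retrieval"))
print(search(index, "text"))
```

The sketch answers Boolean AND queries only, which is why leveraging an existing toolkit is usually the better choice for a real project.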

Consider documenting your work regularly. This way, you will already have a lot of things written down by the end of the semester. In addition, we strongly suggest using version control for your project! Nothing is more frustrating than losing a lot of your hard work, especially if it's close to a deadline.

Present the course project

At the end of the semester, each project team is expected to present their project in class. The purpose of this presentation is

In general, your presentation should be structured like a conference presentation, so it should touch on all of the following aspects (text in parentheses states the instructor's expectation):

Think about how you can best present your work so as to make it as easy as possible for your audience to understand your main messages. Be concise and to the point. Pictures, illustrations, and examples are generally more effective than text for explaining your project. Try to show screenshots and/or plots of your experimental results. Watching some top conference presentations (e.g., KDD, SIGIR, ICML) on VideoLectures will be beneficial.

In order to be fair to all members in the same group, the instructor will randomly pick team members for question answering during the presentation.

Write a project report

You should write your report as if you were writing a regular conference paper. You should address the same questions as those you have addressed in the proposal and presentation, only with more details. Pay special attention to the challenges that you have solved and your detailed solutions. Basic sections to be included in the report should be the same as those in a conference paper, e.g., abstract, introduction, related work, method, experiment and conclusion. If you are developing a demo system or toolkit, your report should follow the format of a demo paper.

You are required to use LaTeX for your project report. See the Resources page for the template and example file. The project report must be at most ten pages in that format (there is no minimum length, as long as you feel the report is sufficient to prove the merit of your work).

The instructor will provide feedback about your course project during the final presentation if we see any way to further improve your work, and bonus points will be given immediately.

Sample topics for course project

The instructor has selected a set of representative course projects from former UIUC students. From this selection, you can get a broad spectrum of topics and a basic sense of what would be regarded as feasible topics for our course project. Note that neither their ideas nor their solutions are necessarily perfect. You are encouraged to come up with better ideas and solutions.

Warning: please do not disclose or distribute the following project reports outside this class; their original authors hold all copyrights!

Topics proposed by the instructor

Rubric for project report

Aspects | Score
Strictly followed the provided template | [0-10]
Background and research question were clearly stated in the introduction, and the logic and argument were reasonable | [0-10]
Contribution of the work was properly articulated in the introduction | [0-10]
Sufficient discussion of the state of the art in the related work section | [0-10]
Description of the proposed method was clear, comprehensive, coherent, and consistent with the claim in the introduction | [0-20]
Precise description of the experiment design and experimental data set | [0-10]
Thorough experimentation that validated all necessary components of the proposed method, and detailed analysis of the experimental results | [0-20]
Summary of the work, and reasonable discussion of the limitations of the proposed solution and of future work | [0-10]

Rubric for project presentation

Aspects | Score
Slide content was clearly visible and self-explanatory | [0-10]
Presenters were confident about their work and clearly explained it to me | [0-10]
Background and research question were clearly highlighted, and the logic and argument were reasonable | [0-10]
There was sufficient discussion of the state of the art and of why this new method is needed | [0-10]
Description of the proposed method was clear, comprehensive, coherent, and consistent with the claim in the introduction | [0-15]
Thorough experimentation that validated all necessary components of the proposed method, and detailed analysis of the experimental results | [0-15]
The presenters managed their time well during the presentation | [0-10]
The presenters did a good job of answering the questions | [0-10]
I like this work! | [0-10]

Schedule of Project Presentation

Title | Team | Date
Analysis of Social Media for Language Acquisition | Adam Pearce | Dec 4th
Personalized Job Matching | Md Mustafizur Rahman, John Clougherty, Sam Hewitt, Elise Clougherty | Dec 4th
A Natural Language Code Search Tool | Will Hawkins | Dec 4th
Modeling the treatment for patients | Jinghe Zhang | Dec 4th
Stock Prediction Using Twitter Sentiment Analysis | Prasad Seemakurthi, Krishna Aswani | Dec 6th
Improving Hashtag Comprehension with Search and Text Summarization | John Lanchantin, Nicholas Janus, Weilin Xu | Dec 6th
A Personalized Smart Query Recommendation Assistant | Shize Su, Qingyun Wu, Haoran Hou | Dec 6th
Learning personalized ranking of search engines by high-dimensional learning algorithm | Christian Kümmerle | Dec 6th
Mapping User Comments back to Newspaper Articles | Muhammad Nur Yanhaona, Asif Salekin, Md Anindya Prodhan | Dec 6th
Exploring the Relationship between User Reviews and Prices | Lingjie Zhang, Lin Gong, Bo Man | Dec 6th
UniHealth: A Data Visualization Tool for American College Student Health | Hao Wu | Dec 6th