CS410 Text Information Systems (Spring 2013)

Instructor: ChengXiang Zhai


Course Project


The course project is intended to give students hands-on experience in developing novel information retrieval and/or text mining tools. The project thus emphasizes applied research and "deliverables", meaning that the outcome of your project should be something tangible, typically some kind of prototype system that can be demonstrated, though theoretical research projects are also fine. Group work is strongly encouraged, but not required.

General steps

  1. Pick a topic
  2. Form a team
  3. Read related work
  4. Write a project proposal
  5. Work on the project
  6. Present the project
  7. Write a report

Grading criteria

Your project will be graded primarily based on the following weighting scheme: The factors to be considered in grading include (1) the utility of the tool you develop (this is the most important factor); (2) the relevance to the course; (3) the challenges you have to solve (i.e., technical contributions); and (4) the quality of presentation/writing. Refer to the guidelines on the proposal and report for what should be included in a report.

1. Pick a topic

You can either pick from a list of sample topics provided by the instructor (available here) or choose your own topic. Your starting point should be the essays that you wrote for assignment #1. They are available here. You can browse them to see if you can get any insights from some of the essays there. This is also a good way to identify opportunities for collaborations.

You may find it useful to take a look at some general advice on choosing a project topic here. Please keep in mind that the general goal is to develop some useful tool to help people manage text information. Leveraging existing resources is especially encouraged, as it allows you to minimize the amount of work that you have to do and focus on developing truly novel functions.

When picking a topic, try to ask yourself the following questions:

2. Form a team

You are encouraged to work with other students as a team. Teamwork not only gives you some experience in working with others, but also allows you to work on a larger (presumably more interesting) topic. Generally speaking, all the members of a team will get the same grade provided all have contributed substantially to the project. If there is evidence that a team member has made only a superficial contribution to a project (I really hope this won't happen!), that team member's grade may be discounted. The project report must state clearly who did what. It is also fine if you choose not to work with others.

3. Check related work

While choosing a topic, you should also check whether the tool/function you would like to develop already exists. If so, you should figure out where exactly your novelty lies and whether that novelty leads to any benefit for a user. Your goal is to extend, rather than duplicate, the existing work. To minimize your effort, you should leverage existing systems, toolkits, and other useful resources as much as possible. The instructor can also help you check related work. Please feel free to discuss your plan with the instructor at any time.

4. Write a proposal (due Tuesday, March 26, 2013)

You are required to write a one-page proposal before you go into depth on a topic, and to post it on the class wiki by the due date, Tuesday, March 26, 2013.

To post your proposal, go to the Project Proposals page, and add your project to the page. Follow the same format as used by some existing project proposals there. Specifically, you need to create a new page for your project, where you would put your proposal. It is up to you how to design such a page; it could directly contain your proposal or have your proposal as an attachment to this new page. To do that, you simply need to edit the proposal wiki page and add a line like "# [Your project title]" to this page. After saving the page, you can click on the new entry you created, which would lead you to the new page that you created, and you can then edit that page.

In the proposal, you should address the following questions and include the names and email addresses of all the team members. (As long as these questions are addressed, the proposal does not have to be very long. A couple of sentences for each question would be sufficient.)

5. Work on the project

You should reuse any existing tools as much as possible. For example, consider using the Lucene or Lemur toolkit if possible. There are also many tools available on the Internet. See the resources page for some useful pointers.

Discuss any problems or issues with your teammates or classmates, or with the TA and the instructor. If you need special support (e.g., more disk space on your account), please let the instructor know.

Consider documenting your work regularly. This way, you will already have a lot of things written down by the end of the semester.

6. Present the course project (7-10pm, Tuesday, May 7, 2013)

At the end of the semester, each project team is expected to make a poster presentation of the project. The purpose of this presentation is to (1) let you know about others' projects; (2) give you some opportunity to practice presentation skills, which are very important for a successful career; and (3) obtain some feedback from others about your project. Every on-campus student is expected to attend this presentation unless you have obtained permission from the instructor in advance for not attending. The course project presentation will be given from 7pm to 10pm on Tuesday, May 7, 2013, in room 3403 Siebel Center (note the room change here). There will be two sessions with a short break in between. In each session, one half of the project groups will present their projects and the other half will be the audience. Everyone should show up for both sessions.

To prepare for your poster presentation, you can make about 12 PowerPoint slides and print a hard copy of them to post on the wall. Tape will be provided for attaching the slides to the wall, so you don't need to bring your own. Alternatively, you may use PowerPoint to prepare a single-sheet poster and print it on our department's poster printer, if you have access to it and know how to use it. (We'd suggest that you simply print out letter-size hard copies of regular PowerPoint slides, which would be much easier.)

You must also post your PowerPoint slides on the wiki before 7pm, May 7, 2013. To post your slides, follow these steps: (1) upload your slides (e.g., "myproject-team-slides.ppt") as an attachment to the project proposal list page: go to the proposal list wiki page, click on the "Add" button at the top right corner of the page, and choose "Attachment". You will be prompted to upload a file. (2) edit the proposal list page to add a link to your slides: click on the "Edit" button on the proposal list wiki page, choose the "Wiki Markup" editing mode, and then add the following at the end of the line of your proposal: ", [slides|^myproject-team-slides.ppt]". You can also check how the sample slides are linked to the sample proposal line. Note that it's important to use a reasonably unique name for your presentation file to avoid overwriting another team's attachment.
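
As a small illustration of step (2), the sketch below builds the edited proposal line by appending the attachment link in the markup quoted above. The helper name `append_attachment_link` and the example entry line are hypothetical; only the `[label|^filename]` pattern is taken from these instructions, and the sketch merely formats text rather than talking to the wiki.

```python
def append_attachment_link(proposal_line: str, label: str, filename: str) -> str:
    """Return the proposal line with ', [label|^filename]' appended.

    Hypothetical helper: it only formats text and does not contact the wiki.
    """
    return f"{proposal_line.rstrip()}, [{label}|^{filename}]"


# Assumed example entry, following the "# [Your project title]" pattern.
line = "# [Your project title]"
updated = append_attachment_link(line, "slides", "myproject-team-slides.ppt")
print(updated)  # # [Your project title], [slides|^myproject-team-slides.ppt]
```

The attachment name in the link has to match the uploaded file name exactly, which is one more reason to pick a distinctive file name.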

If a team consists solely of online students, the team can pre-record a short narrated PowerPoint presentation (no more than 4 minutes), which will be played at the presentation meeting.

In general, the structure of your presentation should roughly follow your project proposal. So it should touch all the following aspects:

Your presentation will be graded mainly based on (1) the clarity of your slides and presentation, and (2) whether your presentation has covered all the questions listed above. Think about how you can best present your work so as to make it as easy as possible for your audience to understand your main messages. Try to be concise and to the point. Pictures, illustrations, and examples are generally more effective than text for explaining your project. Try to show screenshots and/or plots of your experimental results. If you are not familiar with PowerPoint, you can adapt this sample presentation.

7. Write a project report (due May 8, 2013, Wednesday, 11:59pm)

You should write your report as if you were writing a short conference paper. You should address the same questions as those you have addressed in the proposal, only with more details, especially regarding some of the challenges that you need to solve in developing the tool. You should also include some screenshots if applicable and any other evaluation results. Furthermore, it would be good to include a brief discussion of how your system/research work can be further extended.

There is no strict length requirement, but you may aim for 2000 to 4000 words, not counting any necessary appendices (this is about 4 to 6 pages with 10-point font or 6 to 8 pages with 11-point font). Feel free to use any format for your report.

If you have not written such a report before, you may want to take a look at the following sample research papers published in the World Wide Web conferences:

Of course, I would expect your reports to be much shorter and more concise, but you should try to write your reports in a similar way. You may also refer to this sample project report for an example of how to organize your project report. If you want to reuse this template for your report, you can download the Microsoft Word file here.

The project report should be posted on the proposal page in the same way as your presentation slides. That is, upload your report as an attachment to the proposal list wiki page, and add the following to the end of the line of your proposal: ",[report|^your-name-report.pdf]" or ",[report|^your-name-report.doc]". The deadline for posting your project report is 11:59pm, Wednesday, May 8, 2013.

Each project team only needs to submit one report. However, if there are multiple members in the team, you must include, for each member, at least one sentence describing exactly what he or she did for the project.

Grading of a project report will be based on three factors: (1) [25%] clarity and completeness of the report itself (i.e., whether you have clearly described what you have done and addressed all the questions that you are supposed to address); (2) [50%] the amount of work that you have done; and (3) [25%] the quality of your project as reflected in the importance of the problem being addressed, the quality of the solution, and the impact of your project. Since the report accounts for 40% of your overall grade for the project, if you have devoted enough effort (i.e., earned the 50% of the report grade given for the amount of work) and have also done well on the project proposal and project presentation, you should have at least 40*50%+30+30=80 points for your project grade, out of a total of 100 points. On the other hand, if there is no evidence that you have done substantial work for a group project and you have made only a superficial contribution, then not only will you lose much of the 50% of the points for the amount of work you've done, but your grades in the other parts of the project may be discounted as well.

Thus it is very important that you put enough effort into contributing to your group project. Note that it is your responsibility to figure out how to contribute to your group project, so you will need to act proactively and in a timely manner if your group leader has not assigned a task to you. There will be no opportunity to make up for any task that you failed to accomplish. Everyone is expected to spend at least 20 hours of serious work on the course project, not including the time spent preparing the presentation and writing the report. In general, all the members of a team will get the same grade for the project unless the report indicates that some member(s) only superficially participated in the project without doing much actual work; in that case, I will discount the grade.
The instructor and TAs will provide feedback about your report later by email if we see any way to further improve your work.
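
The grading arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 40-point report share and the 25/50/25 split are quoted from the text, while the 30 points each for the proposal and the presentation are inferred from the "+30+30" in the worked example. The function names and the 0-to-1 score scale are hypothetical and not an official grading formula.

```python
REPORT_POINTS = 40        # stated: the report is 40% of the project grade
PROPOSAL_POINTS = 30      # inferred from the "+30+30" in the worked example
PRESENTATION_POINTS = 30  # inferred from the same example


def report_score(clarity: float, work: float, quality: float) -> float:
    """Fraction of report credit earned; each argument is a score in [0, 1]."""
    return 0.25 * clarity + 0.50 * work + 0.25 * quality


def project_points(proposal: float, presentation: float,
                   clarity: float, work: float, quality: float) -> float:
    """Project grade out of 100, with every input a score in [0, 1]."""
    return (PROPOSAL_POINTS * proposal
            + PRESENTATION_POINTS * presentation
            + REPORT_POINTS * report_score(clarity, work, quality))


# Full marks on proposal and presentation, but only the "amount of work"
# part of the report earned: 30 + 30 + 40 * 0.50.
floor = project_points(1.0, 1.0, clarity=0.0, work=1.0, quality=0.0)
print(floor)  # 80.0
```

Any credit for clarity or project quality then adds to that 80-point figure, up to the full 100.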