Jing Jiang
-------------------------------------------------------------
Web: Google dictionary:
As a non-native English speaker, I often want to find out the correct usage of a word and, more often, the correct usage of a phrase. Online dictionaries usually show only a few examples, and for certain phrases they may have no entry at all. If I search for the word or phrase on Google, it usually finds web pages where the word or phrase appears only in the title, which is not very helpful. It would be very useful to have a Google dictionary: type in a word or a phrase, and it shows how the word or phrase is popularly used, with a summary and examples. The users of such a tool would be people for whom English is a second language, or kids who are still learning new words and phrases. It would also be useful for finding out the meanings of buzzwords.
Since it is supposed to be an English dictionary, the system should filter out web pages that may contain improper English usage. The data involved would most likely be news articles, online books, essays, and other well-written English articles. The dictionary should ideally summarize the usage of the word or phrase into several categories, give examples for each, and perhaps differentiate between formal and informal usage. And just like other online dictionaries, this dictionary should be able to correct the user's spelling, or find the best match if the user enters a phrase that does not exist.

Email: A better email routing method in customer support systems:
I thought of this problem because once, after I sent an email to userhelp, someone emailed me back asking me to re-send the email to userhelp because the problem was not supposed to be assigned to him. I then realized that there must be an automatic router in any customer support system that assigns emails to experts who can solve the corresponding problems.
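To make the idea of an automatic router concrete, here is a minimal sketch in Python of the simplest form such a router might take: counting category keywords in the email text and picking the category with the most hits. The category names, the keyword lists, and the function name `route_email` are all assumptions invented for illustration; nothing here describes how any real support system works.

```python
import re
from collections import Counter

# Hypothetical categories and keywords, invented for illustration only.
CATEGORY_KEYWORDS = {
    "account": {"password", "login", "account", "username"},
    "network": {"network", "connection", "wireless"},
    "printing": {"printer", "print", "toner"},
}


def route_email(text, default="general"):
    """Return the category whose keywords occur most often in the text.

    Falls back to `default` when no keyword from any category appears.
    """
    # Lowercase the text and split it into alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for category, keywords in CATEGORY_KEYWORDS.items():
        counts[category] = sum(1 for token in tokens if token in keywords)
    best, score = max(counts.items(), key=lambda item: item[1])
    return best if score > 0 else default


if __name__ == "__main__":
    print(route_email("I forgot my password and cannot login"))
    print(route_email("The printer is out of toner"))
```

A keyword count like this is only a baseline. It would misroute any email that describes a problem in words the table does not contain, which is part of why the routing problem is worth studying.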
This is essentially a text categorization problem in the email domain. Both the customers and the companies or organizations that provide the service benefit from solving it. The data involved are emails from customers who have different kinds of questions. A good email router should provide several functions:
(1) Identify from the email text the major problem the customer encountered, and assign the problem to one or more categories.
(2) Extract information the customer provided that would help the human expert fix the problem, e.g., an error message the customer saw on a computer screen, an IP address, etc. The information can also include the customer's name, phone number, customer ID, etc.
(3) If the problem is common and a solution is already available, automatically send the solution back to the customer.
I believe such routing systems have already been developed in many places, but how satisfactory they are is still questionable.

Literature: Automatic survey generation:
A useful tool for researchers would be a program that can automatically generate surveys on topics given by the user, and/or recommend the state-of-the-art technique that works best for the user on the topic or problem. The users of such a program are either researchers who want to find out the current state of the art in research areas that are new to them (or simply new research areas), or researchers and engineers who want to use existing methods and tools as building blocks for their own higher-level research or products. The problem is not trivial because (1) for new research topics, there is probably no survey paper published yet, and (2) even for old research topics, the existing survey papers may already be outdated. Therefore, the data involved in this challenge include not only existing survey papers on the given topic but also the most recent research papers that address the problem.
The problem is challenging in several respects.
(1) Some topics (especially new topics) may not be well defined. When searching for relevant papers, the system needs to consider different ways of describing the problem in addition to what the user provides (similar to query expansion).
(2) Summarizing the methods proposed in different papers and comparing their pros and cons may be difficult. This may involve text summarization and information extraction. For example, can the system identify a benchmark for the given problem and compare the performance of different methods on that benchmark?
(3) If the user provides constraints or requirements, can the system recommend a method that best fits the user's needs? This may involve more sophisticated techniques.
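As a rough illustration of point (1), the query-expansion step might look something like the Python sketch below. The table `RELATED_TERMS` and the functions `expand_query` and `matches` are hypothetical names made up for this example, and the hard-coded phrasings stand in for whatever a real system would have to learn or look up.

```python
# Hypothetical table of alternative phrasings for a topic. A real system
# would need to learn or look these up rather than hard-code them.
RELATED_TERMS = {
    "text categorization": ["text classification", "document classification"],
    "query expansion": ["query reformulation"],
}


def expand_query(query):
    """Return the normalized query followed by known alternative phrasings."""
    key = query.strip().lower()
    expanded = [key]
    for alternative in RELATED_TERMS.get(key, []):
        if alternative not in expanded:
            expanded.append(alternative)
    return expanded


def matches(paper_title, query):
    """Return True if the title mentions the query or any of its expansions."""
    title = paper_title.lower()
    return any(term in title for term in expand_query(query))


if __name__ == "__main__":
    print(expand_query("Text Categorization"))
    print(matches("A Survey of Document Classification Methods",
                  "text categorization"))
```

With the expansion in place, a paper titled with "document classification" is found by a user who typed "text categorization", which plain substring matching on the original query would miss.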