Text Information Management Group
Text information plays an important role in our lives. Web pages, email messages, scientific literature, and office documents are just a few examples of the text information we encounter all the time. With the dramatic increase in online information in recent years, management of text information is becoming increasingly important; for example, all of us now routinely use Web search engines to find information on the Web. The huge amount of information presents both challenges and opportunities. The challenge is how to manage large amounts of information effectively and efficiently so that we can easily find useful information. The opportunity is the possibility of exploiting statistical inference to discover knowledge ("hidden patterns"). Correspondingly, we are working in two broad but related directions -- intelligent information access (to address the challenge) and text data mining (to exploit the opportunity).
There are two modes of information access -- "pull" and "push" -- depending on whether the user initiates the process. In the pull mode, a user searches for information by using a search engine (e.g., Google) or browses information items through structures available on the information space (e.g., Yahoo directory). In the push mode, an information management system keeps track of a user's interests and recommends any relevant incoming information items to that user. Our interests in intelligent information access are centered on information retrieval (leading to better search engine technologies such as personalized search), information organization (creating structures to assist a user in browsing), and information filtering (i.e., information recommendation).
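The difference between the two modes can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the names `InterestProfile`, `pull`, and `push`, the word-overlap relevance test, and the 0.2 threshold are all hypothetical choices made for illustration, not a description of any system built by the group.

```python
def tokenize(text):
    """Lowercase a string and return its set of whitespace-separated words."""
    return set(text.lower().split())


class InterestProfile:
    """Hypothetical long-term record of a user's interests (push mode)."""

    def __init__(self, threshold=0.2):
        self.terms = set()
        self.threshold = threshold  # assumed cutoff, chosen for illustration

    def add_interest(self, text):
        self.terms |= tokenize(text)

    def is_relevant(self, item):
        words = tokenize(item)
        if not words or not self.terms:
            return False
        # Fraction of the item's words that match the stored interests.
        overlap = len(words & self.terms) / len(words)
        return overlap >= self.threshold


def pull(query, collection):
    """Pull mode: the user initiates access with an explicit query."""
    query_terms = tokenize(query)
    return [item for item in collection if query_terms & tokenize(item)]


def push(profile, incoming):
    """Push mode: the system recommends incoming items that match the profile."""
    return [item for item in incoming if profile.is_relevant(item)]


items = ["new text mining workshop", "football scores tonight"]
profile = InterestProfile()
profile.add_interest("text mining retrieval")

pulled = pull("football", items)
pushed = push(profile, items)
```

In `pull`, nothing happens until the user supplies a query; in `push`, the same collection is filtered against a stored profile with no query at all.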
In text data mining, we are interested in contextual text mining, which is concerned with extracting topical themes from text collections and discovering patterns in how those themes vary across contexts. Here, context can broadly mean any metadata, such as time, location, authors, and sources. Depending on the context to model, contextual text mining potentially covers spatiotemporal text mining, cross-language text mining, novelty detection, sentiment analysis, and many other interesting text mining problems as special cases. It has many applications, such as opinion summarization, business intelligence, text federation, and customer relationship management.
Natural language processing is crucial to all kinds of text management tasks. We are especially interested in the development of algorithms that exploit language technologies, such as statistical language models (i.e., probabilistic models of text).
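To make the idea of a statistical language model concrete, here is a minimal Python sketch of a unigram model, which assigns a probability to each word of a text, used to rank documents by the likelihood of a query. The example documents, the function names, and the add-alpha smoothing are assumptions made for illustration; they are one simple choice among many and not necessarily the models used in our work.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into whitespace-separated words."""
    return text.lower().split()


def unigram_model(tokens, vocab, alpha=1.0):
    """Estimate P(word | text) with add-alpha smoothing over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {word: (counts[word] + alpha) / total for word in vocab}


def query_log_likelihood(query, model):
    """Log-probability of the query under a unigram model.

    Query words outside the vocabulary are skipped (a simplifying assumption).
    """
    return sum(math.log(model[w]) for w in tokenize(query) if w in model)


# Toy collection, invented for this example.
docs = {
    "d1": "web search engines help users find information on the web",
    "d2": "email management tools organize incoming email messages",
}
vocab = sorted({w for text in docs.values() for w in tokenize(text)})
models = {name: unigram_model(tokenize(text), vocab) for name, text in docs.items()}


def rank(query):
    """Return document names ordered from most to least likely to generate the query."""
    scores = {name: query_log_likelihood(query, m) for name, m in models.items()}
    return sorted(scores, key=scores.get, reverse=True)


best_for_web = rank("web search")[0]
best_for_email = rank("email messages")[0]
```

Smoothing gives unseen words a small nonzero probability, so a document is not ruled out just because it lacks one query word.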
We are interested in all kinds of applications of text information management, such as Web search, digital libraries, email management, and bioinformatics.