Selected Recent Publications
2008
- Qiaozhu Mei, Duo Zhang, ChengXiang Zhai.
Smoothing Language Models with Document and Word Graphs,
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval (SIGIR'08), to appear.
(17% acceptance)
- Xuanhui Wang, Hui Fang, ChengXiang Zhai.
A study of methods for negative relevance feedback,
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval (SIGIR'08), to appear.
(17% acceptance)
- Qiaozhu Mei, ChengXiang Zhai.
Generating Impact-Based Summaries for Scientific Literature,
Proceedings of the 46th Annual Meeting of the Association for
Computational Linguistics: Human Language Technologies (ACL-08:HLT), to appear. (25% acceptance)
- Yue Lu, ChengXiang Zhai.
Opinion Integration Through Semi-supervised Topic Modeling,
Proceedings of the World Wide Web Conference 2008 (WWW'08), pdf.
- Qiaozhu Mei, Deng Cai, Duo Zhang, ChengXiang Zhai.
Topic Modeling with Network Regularization,
Proceedings of the World Wide Web Conference 2008 (WWW'08), pdf.
- Azadeh Shakery, ChengXiang Zhai.
Smoothing Document Language Models with Probabilistic
Term Count Propagation, Information Retrieval Journal, to appear.
- Qiaozhu Mei, Dong Xin, Hong Cheng, Jiawei Han, and ChengXiang Zhai, Semantic Annotation of
Frequent Patterns, ACM Transactions on Knowledge Discovery from Data, to appear.
- Xuanhui Wang, Tao Tao, Jian-Tao Sun, Azadeh Shakery, and ChengXiang Zhai, DirichletRank:
Solving the Zero-One Gap Problem of PageRank, ACM Transactions on Information Systems, to appear.
2007
- Jing Jiang, ChengXiang Zhai, A Two-Stage Approach to Domain Adaptation for Statistical Classifiers,
Proceedings of the 16th ACM International Conference on Information and Knowledge Management (CIKM'07), pages 401-410.
(full paper, 17% acceptance)
- Xuanhui Wang, Hui Fang, ChengXiang Zhai, Improve Retrieval Accuracy for Difficult Queries using Negative Feedback,
Proceedings of the 16th ACM International Conference on Information and Knowledge Management (CIKM'07), pages 991-994.
(short paper, 26% acceptance)
- Shui-Lung Chuang, Kevin Chen-Chuan Chang, and ChengXiang Zhai,
Context-Aware Wrapping: Synchronized Data Extraction,
Proceedings of the 33rd Very Large Data Bases Conference (VLDB'07), pages 699-710. (17.5% acceptance)
- Xuehua Shen, Bin Tan, and ChengXiang Zhai, Privacy Protection in Personalized Search,
ACM SIGIR Forum, 41(1), pages 4-17. pdf
- Qiaozhu Mei, Xuehua Shen, and ChengXiang Zhai, Automatic Labeling of Multinomial Topic Models,
Proceedings of the 2007 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'07), pages 490-499. (19% acceptance) pdf
- Xuanhui Wang, ChengXiang Zhai, Xiao Hu, and Richard Sproat, Mining Correlated Bursty Topic Patterns from Coordinated Text Streams,
Proceedings of the 2007 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'07), pages 784-793. (19% acceptance) pdf
- Xuanhui Wang, ChengXiang Zhai, Learn from Web Search Logs to Organize Search Results,
Proceedings of the 30th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval (SIGIR'07), pages 87-94. (18% acceptance) pdf
- Bin Tan, Atulya Velivelli, Hui Fang, ChengXiang Zhai,
Term Feedback for Information Retrieval with Language Models,
Proceedings of the 30th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval (SIGIR'07), pages 263-270. (18% acceptance) pdf
- Qiaozhu Mei, Hui Fang, ChengXiang Zhai,
A Study of Poisson Query Generation Model for Information Retrieval,
Proceedings of the 30th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval (SIGIR'07), pages 319-326. (18% acceptance) pdf
- Tao Tao, ChengXiang Zhai, An Exploration of Proximity Measures in Information Retrieval,
Proceedings of the 30th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval (SIGIR'07), pages 295-302. (18% acceptance) pdf
- Jing Jiang and ChengXiang Zhai, An Empirical Study of
Tokenization Strategies for Biomedical Information Retrieval, Information Retrieval. To appear. pdf
- Jing Jiang and ChengXiang Zhai, Instance Weighting for Domain Adaptation in NLP,
Proceedings of ACL 2007, pages 264-271. pdf
- Qiaozhu Mei, Xu Ling, Matthew Wondra, Hang Su, ChengXiang Zhai, Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs, Proceedings of the World Wide Web Conference 2007 (WWW'07), pages 171-180. pdf
- Hui Fang, ChengXiang Zhai, Probabilistic Models for Expert Finding, Proceedings of
the 29th European Conference on Information Retrieval (ECIR'07), pages 418-430. (19% acceptance) pdf
- Jing Jiang, ChengXiang Zhai,
A Systematic Exploration of The Feature Space for Relation Extraction,
Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007), pages 113-120. (24% acceptance) pdf
- Xu Ling, Jing Jiang, Xin He, Qiaozhu Mei, ChengXiang Zhai, Bruce Schatz,
Generating Semi-Structured Gene Summaries from Biomedical Literature,
Information Processing and Management, 43(6), Nov. 2007, pp. 1777-1791.
pdf
2006
- Saurabh Sinha, Xu Ling, Charles W. Whitfield, ChengXiang Zhai, and Gene E. Robinson,
Genome scan for cis-regulatory DNA motifs associated with social behavior in honey bees,
Proceedings of the National Academy of Sciences of the United States of America (PNAS),
103(44), Oct. 2006, pages 16352-16357. URL
- Jing Jiang and ChengXiang Zhai,
Extraction of coherent relevant passages
using hidden Markov models, ACM Transactions on Information
Systems, 24(3), July 2006, pages 295-319. URL
- Azadeh Shakery and ChengXiang Zhai,
A probabilistic relevance propagation model for hypertext retrieval,
In Proceedings of the 15th ACM International Conference on Information and Knowledge Management (CIKM'06), pages 550-558. (15% acceptance) pdf
- Rong Jin, Luo Si, and ChengXiang Zhai,
A study of mixture models for collaborative filtering, Information Retrieval,
9(3), Jun. 2006, pages 357-382. URL
- Bin Tan, Xuehua Shen, ChengXiang Zhai,
Mining long-term search history to improve search accuracy,
Proceedings of the 2006 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06), pages 718-723. (poster paper, 23% acceptance) pdf
- Qiaozhu Mei, ChengXiang Zhai,
A Mixture Model for Contextual Text Mining,
Proceedings of the 2006 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06), pages 649-655. (poster paper, 23% acceptance) pdf
- Qiaozhu Mei, Dong Xin, Hong Cheng, Jiawei Han, ChengXiang Zhai,
Generating Semantic Annotations for Frequent Patterns with Context Analysis,
Proceedings of the 2006 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06), pages 337-346. Best Student Paper Award Runner-Up.
(full paper, 11% acceptance) pdf
- Tao Tao, Su-Youn Yoon, Andrew Fister, Richard Sproat and ChengXiang Zhai,
Unsupervised Named Entity Transliteration Using Temporal and Phonetic Correlation,
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), pages 250-257. (31% acceptance) pdf
- Richard Sproat, Tao Tao and ChengXiang Zhai,
Named Entity Transliteration with Comparable Corpora,
Proceedings of COLING-ACL 2006, pages 73-80. (23% acceptance) pdf
- Xuanhui Wang, Jian-Tao Sun, Zheng Chen, ChengXiang Zhai,
Latent Semantic Analysis for Multiple-Type Interrelated Data Objects,
Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'06), pages 236-243. (19% acceptance) pdf
- Hui Fang, ChengXiang Zhai,
Semantic Term Matching in Axiomatic Approaches to Information Retrieval,
Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'06), pages 115-122. (19% acceptance) pdf
- Tao Tao, ChengXiang Zhai,
Regularized Estimation of Mixture Models for Robust Pseudo-Relevance Feedback,
Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'06), pages 162-169. (19% acceptance) pdf
- Jing Jiang, ChengXiang Zhai,
Exploiting Domain Structure for Named Entity Recognition,
Proceedings of HLT/NAACL 2006, pages 74-81. (25% acceptance) pdf, ppt
- Tao Tao, Xuanhui Wang, Qiaozhu Mei, ChengXiang Zhai,
Language Model Information Retrieval with Document Expansion,
Proceedings of HLT/NAACL 2006, pages 407-414. (25% acceptance) pdf
- Qiaozhu Mei, Chao Liu, Hang Su, and ChengXiang Zhai,
A Probabilistic Approach to Spatiotemporal Theme Pattern Mining on Weblogs,
Proceedings of the World Wide Web Conference 2006 (WWW'06), pages 533-542. (11% acceptance) pdf
- Xu Ling, Jing Jiang, Xin He, Qiaozhu Mei, ChengXiang Zhai, and Bruce Schatz,
Automatically Generating Gene Summaries from Biomedical Literature. In Proceedings of
Pacific Symposium on Biocomputing 2006 (PSB'06), pages 40-51.
pdf
- ChengXiang Zhai and John Lafferty,
A risk minimization framework for information retrieval,
Information Processing and Management (IP&M), 42(1), Jan. 2006, pages 31-55.
URL
2005
- Xuehua Shen, Bin Tan, and ChengXiang Zhai, Implicit User Modeling for Personalized Search,
In Proceedings of the 14th ACM International Conference on Information and Knowledge Management (CIKM'05), pages 824-831.
pdf (18% acceptance)
- Qiaozhu Mei, ChengXiang Zhai, Discovering Evolutionary Theme Patterns from Text -- An Exploration of Temporal Text Mining,
Proceedings of the 2005 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'05), pages 198-207, 2005. pdf
(full paper, 12% acceptance)
- Tao Tao, ChengXiang Zhai, Mining Comparable Bilingual Text Corpora for Cross-Language Information Integration,
Proceedings of the 2005 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'05), pages 691-696, 2005. pdf (poster paper, 22% acceptance)
- Hui Fang, ChengXiang Zhai, An Exploration of Axiomatic Approach to Information Retrieval,
Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'05), pages 480-487, 2005.
pdf (19% acceptance)
- Xuehua Shen, ChengXiang Zhai, Active Feedback in Ad Hoc Information Retrieval,
Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'05), pages 59-66, 2005.
pdf (19% acceptance)
- Xuehua Shen, Bin Tan, ChengXiang Zhai, Context-Sensitive Information Retrieval with Implicit Feedback,
Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'05), pages 43-50, 2005.
pdf (19% acceptance)
2004
- Tao Tao, ChengXiang Zhai, Xinghua Lu, and Hui Fang, A study of statistical methods for function prediction of protein motifs, Applied Bioinformatics, Volume 3, No. 2-3, pages 115-124. (BLM 03 paper: ps, pdf)
- Xinghua Lu, ChengXiang Zhai, Vanathi Gopalakrishnan, and Bruce G Buchanan,
Automatic annotation of protein motif function with Gene Ontology terms, BMC Bioinformatics 2004, 5:122. (url) (Impact factor=5.42, as of 2006)
- Hui Fang, Tao Tao, ChengXiang Zhai, A formal study of information retrieval heuristics,
Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'04), pages 49-56, 2004. Best Paper Award. pdf (22% acceptance)
- ChengXiang Zhai, Atulya Velivelli, Bei Yu, A cross-collection mixture model for comparative text mining, Proceedings of ACM KDD 2004 (KDD'04), pages 743-748, 2004. pdf, ppt (poster paper, 25% acceptance)
- Tao Tao, ChengXiang Zhai, A Mixture Clustering Model for Pseudo Feedback in Information Retrieval,
Proceedings of the 2004 Meeting of the International Federation of Classification Societies (IFCS'04), pages 541-552. Invited Paper. pdf
- ChengXiang Zhai, John Lafferty, A study of smoothing methods for language models applied to information retrieval, ACM Transactions on Information Systems (ACM TOIS), Vol. 22, No. 2, April 2004, pages 179-214. (ps)
2003
- Hwanjo Yu, ChengXiang Zhai, and Jiawei Han,
Text Classification from Positive and Unlabeled Documents, Proceedings of ACM CIKM 2003 (CIKM'03), pages 232-239, 2003. pdf (15% acceptance)
- Rong Jin, Luo Si, ChengXiang Zhai, and Jamie Callan,
Collaborative Filtering with Decoupled Models for Preferences and Ratings,
Proceedings of ACM CIKM 2003 (CIKM'03), pages 301-316, 2003. ps, pdf (15% acceptance)
- ChengXiang Zhai, William W. Cohen, and John Lafferty, Beyond Independent Relevance: Methods and Evaluation Metrics for Subtopic Retrieval,
Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'03), pages 10-17, 2003.
ps, pdf (17% acceptance)
- Rong Jin, Luo Si, and ChengXiang Zhai, Preference-based Graphic Models for Collaborative Filtering, In Proceedings of UAI 2003 (UAI'03), pages 329-336, 2003. ps, pdf (25% acceptance)
- John Lafferty and ChengXiang Zhai, Probabilistic relevance models based on document and query generation, In Language Modeling and Information Retrieval, Kluwer International Series on Information Retrieval, Vol. 13, 2003. ps, pdf
2002
- ChengXiang Zhai, Risk Minimization and Language Modeling in Information Retrieval, Ph.D. thesis, Carnegie Mellon University, 2002. (summary).
- ChengXiang Zhai and John Lafferty, Two-Stage Language Models for Information Retrieval,
Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'02), pages 49-56, 2002.
ps, pdf (20% acceptance)
- Rong Jin, Alex G. Hauptmann, and ChengXiang Zhai, Title Language Model for Information Retrieval,
Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'02), pages 42-48, 2002.
ps, pdf (20% acceptance)
2001
- ChengXiang Zhai and John Lafferty, Model-based feedback in the language modeling approach to information retrieval, Proceedings of the Tenth ACM International Conference on Information and Knowledge Management (CIKM'01), pages 403-410, 2001. ps, pdf (25% acceptance)
- ChengXiang Zhai and John Lafferty, A study of smoothing methods for
language models applied to ad hoc information retrieval,
Proceedings of the 24th Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval (SIGIR'01), pages 334-342, 2001. ps, pdf
(23% acceptance)
- John Lafferty and ChengXiang Zhai, Document language models, query models, and risk minimization for information retrieval,
Proceedings of the 24th Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval (SIGIR'01), pages 111-119, 2001. ps, pdf
(23% acceptance)