Cheng Xiang Zhai

ChengXiang Zhai
Alma mater: Nanjing University; Carnegie Mellon University
Scientific career
Fields: Information Retrieval; Text Mining; Natural Language Processing; Machine Learning; Bioinformatics
Institutions: University of Illinois at Urbana-Champaign
Thesis: Risk Minimization and Language Modeling in Text Retrieval
Doctoral advisor: John D. Lafferty
Website: czhai.cs.illinois.edu

ChengXiang Zhai is a computer scientist. He is a Donald Biggar Willett Professor in Engineering in the Department of Computer Science at the University of Illinois at Urbana-Champaign. [1]

Biography

Zhai received his BS (1984), MS (1987, under Guoliang Zheng), and PhD (1990, under Jiafu Xu) in computer science from Nanjing University. From 1990 to 1993 he worked at Nanjing University's State Key Laboratory for Novel Software Technology. In 1993, he moved to the United States to pursue a second PhD at Carnegie Mellon University (CMU) under David A. Evans. [2] Evans then left to devote more time to the company ClariTech. Zhai received an MS in computational linguistics from CMU in 1997 and then began working with John Lafferty. He received a PhD in Language and Information Technologies from CMU in 2002. [3]

Since then, he has been an Assistant Professor (2002–2008), [4] Associate Professor (2008–2013), [5] Professor (2013–2018), and Donald Biggar Willett Professor (2018–) at the UIUC Department of Computer Science. [6] He also holds joint appointments with the Carl R. Woese Institute for Genomic Biology, Department of Statistics, and School of Information Sciences at UIUC. [3]

Awards

Personal

Zhai's son Alex has earned three medals at the International Mathematical Olympiad. [22] [23]

References

  1. Zhai, ChengXiang. "Home Page". Computer Science. Retrieved 19 November 2022.
  2. Zhai, ChengXiang (2002). "Risk minimization and language modeling in text retrieval" (PDF). Language Technology Institute — Faculty. Retrieved 24 November 2022.
  3. Zhai, ChengXiang. "Bio". ChengXiang Zhai. Retrieved 23 November 2022.
  4. "Promotions recommended to be effective at the beginning of the 2008–09 academic year" (PDF). Board of Trustees. The Board of Trustees. Retrieved 24 November 2022.
  5. "Promotions recommended to be Effective at the beginning of the 2013–2014 academic year: Urbana-Champaign" (PDF). Board of Trustees. The Board of Trustees Office. Retrieved 24 November 2022.
  6. Zhai, ChengXiang (2003). "CV". ChengXiang Zhai — Home Page. Retrieved 24 November 2022.
  7. "Gerard Salton Awards". Awards. ACM SIGIR. Retrieved 19 November 2022.
  8. "SIGIR Academy". Awards. Retrieved 19 November 2022.
  9. Schmitt, Laura. "Zhai Inducted into Inaugural Class of New ACM Information Retrieval Academy". News. University of Illinois Department of Computer Science. Retrieved 19 November 2022.
  10. "ACM Recognizes 2017 Fellows for Making Transformative Contributions and Advancing Technology in the Digital Age". Association for Computing Machinery. 11 December 2017. Retrieved 17 October 2018.
  11. "SIGIR Test of Time Awards Pre-2002 Recipients". Awards. ACM SIGIR. Retrieved 23 November 2022.
  12. "ChengXiang Zhai". All Faculty. University of Illinois. Retrieved 23 November 2022.
  13. Zhai, ChengXiang; Lafferty, John (2001). "A study of smoothing methods for language models applied to Ad Hoc information retrieval". Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 334–342. doi:10.1145/383952.384019. ISBN 1581133316. S2CID 5793143. Retrieved 23 November 2022.
  14. Zhai, ChengXiang; Lafferty, John (2004). "A study of smoothing methods for language models applied to information retrieval". ACM Transactions on Information Systems. 22 (2): 179–214. doi:10.1145/984321.984322. S2CID 207670589. Retrieved 23 November 2022.
  15. Lafferty, John; Zhai, Chengxiang (2001). "Document language models, query models, and risk minimization for information retrieval". Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 111–119. doi:10.1145/383952.383970. ISBN 1581133316. S2CID 53234319. Retrieved 23 November 2022.
  16. "SIGIR Test of Time Awards". Awards. ACM SIGIR. Retrieved 23 November 2022.
  17. Zhai, ChengXiang; Cohen, William W.; Lafferty, John (2003). "Beyond independent relevance". Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval. pp. 10–17. doi:10.1145/860435.860440. ISBN 9781581136463. S2CID 52855966. Retrieved 23 November 2022.
  18. "Recipients". Distinguished Members. ACM. Retrieved 23 November 2022.
  19. "The Presidential Early Career Award for Scientists and Engineers: Recipient Details: ChengXIang Zhai". National Science Foundation. NSF. Retrieved 19 November 2022.
  20. "Best Paper Awards". Awards. ACM SIGIR. Retrieved 23 November 2022.
  21. Fang, Hui; Tao, Tao; Zhai, ChengXiang (2004). "A formal study of information retrieval heuristics". Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 49–56. doi:10.1145/1008992.1009004. ISBN 1581138814. S2CID 15200693. Retrieved 23 November 2022.
  22. Kline, Greg (21 August 2006). "Uni student makes it count in Mathematical Olympiad". Illini football news. The News-Gazette. Retrieved 24 November 2022.
  23. "Alex (Lin) Zhai". International Mathematical Olympiad. Retrieved 24 November 2022.