Clyde Lee Giles | |
---|---|
Born | |
Nationality | American |
Alma mater | Rhodes College, University of Tennessee, University of Michigan, University of Arizona |
Known for | CiteSeer, neural networks, information retrieval, digital libraries, web search |
Awards | |
Scientific career | |
Fields | Computer science, search engines, artificial intelligence, deep learning |
Institutions | Ford Motor Company, Naval Research Laboratory, Air Force Office of Scientific Research, NEC Research Institute, Pennsylvania State University |
Doctoral advisor | Harrison Hooker Barrett [1] |
Clyde Lee Giles is an American computer scientist and the Emeritus David Reese Professor at the College of Information Sciences and Technology (IST) at the Pennsylvania State University. He was also Graduate Faculty Professor of Computer Science and Engineering, Courtesy Professor of Supply Chain and Information Systems, Director of the Intelligent Systems Research Laboratory, and Interim Associate Dean of Research in the College of IST. He graduated from Oakhaven High School in Memphis, Tennessee. His undergraduate degrees are from Rhodes College and the University of Tennessee, and his graduate degrees are from the University of Michigan and the University of Arizona. His PhD is in optical sciences, with Harrison H. Barrett as his advisor. His academic genealogy includes two Nobel laureates (Felix Bloch and Werner Heisenberg), Arnold Sommerfeld, and prominent mathematicians. [2]
Giles has been associated with the computer science or electrical engineering departments at Princeton University, the University of Pennsylvania, Columbia University, the University of Pisa, the University of Trento and the University of Maryland, College Park. He previously held positions at NEC Research Institute (now NEC Labs) in Princeton, NJ; the Air Force Research Laboratory; and the United States Naval Research Laboratory. He is best known for creating novel scientific and academic search engines and digital libraries, and is considered by some to be one of the founders of academic document search. His earlier research concerned recurrent neural networks and optical computing.
His research interests are in intelligent web and cyberinfrastructure tools, search engines and information retrieval, digital libraries, web services, knowledge and information management and extraction, machine learning, and information and data mining. He has created several vertical search engines in these areas. He has over 500 publications, some of them in Nature, Science and the Proceedings of the National Academy of Sciences. According to Google Scholar, his work has over 60,000 total citations and an h-index of 118. He has one of the top 200 h-indexes in computer science and one of the top 10 in information retrieval. At Penn State he has graduated 40 PhD students. Most of his papers list his name as C. Lee Giles or C.L. Giles.
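The h-index mentioned above is the largest number h such that h of an author's papers each have at least h citations. A minimal sketch of that calculation, assuming a plain list of per-paper citation counts; the function name and the sample counts are illustrative and are not Giles's actual data:

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for six papers (not real data).
example_counts = [25, 8, 5, 3, 3, 1]
print(h_index(example_counts))
```

With these sample counts, three papers have at least three citations each but there are not four papers with at least four, so the sketch reports an h-index of 3.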
Most of his classes were graduate courses on deep learning and graduate/upperclassman courses on search engines and information retrieval.
He is a Fellow of the Association for Computing Machinery (ACM), [3] the Institute of Electrical and Electronics Engineers (IEEE), [4] and the International Neural Network Society (INNS). He also received the Gabor Award [5] from the INNS, recognizing achievements in engineering and applications of neural networks. Most recently he received the 2018 IEEE Computational Intelligence Society (CIS) Neural Networks Pioneer Award [6] and the 2018 National Federation of Advanced Information Services (NFAIS) Miles Conrad Award. [7]
He has twice received the IBM Distinguished Faculty Award. [citation needed]
Before his work on neural networks, Giles published papers on the reflection and scattering of electromagnetic waves from magnetic materials for the particular case of equal refractive indexes. This work is mentioned in the articles on Fresnel equations, Mie scattering, and Brewster's angle. For Mie scattering, he is a coauthor of the work on the Kerker effect, which grew out of his idea and extended his earlier work on a planar boundary effect.
Giles' work on neural networks showed that fundamental computational structures such as regular grammars and finite state machines could be theoretically represented in recurrent neural networks. Another contribution was the Neural Network Pushdown Automata and the first analog differentiable stack. Some of these publications are cited as early work in "deep" learning.
While at AFOSR in Washington DC, Giles in 1986 established the first neural network funding in 20 years and helped DARPA establish theirs.
In 1998 and 1999, his work with Steve Lawrence, published in Science and Nature, estimated the size of the web and showed that search engines indexed only a limited portion of it. This work also showed that the web had matured significantly and contained a diversity of material and resources.
With Steve Lawrence and Kurt Bollacker, Giles was responsible for the creation in 1997 of automatic citation indexing [8] and CiteSeer, a public academic search engine and digital library for Computer and Information Science. Under his direction CiteSeer was moved to and is being maintained at the Pennsylvania State University. CiteSeer has been replaced by the Next Generation CiteSeer, CiteSeerX.
He is the director of the Next Generation CiteSeer project, CiteSeerX, also at the Pennsylvania State University. In addition, he was responsible for the creation of an academic business search engine and digital library, BizSeer (previously known as SmealSearch). With Isaac Councill, he created automatic acknowledgement indexing, which for the first time permitted the automatic search and indexing of acknowledged entities in scholarly and research documents. The search engine for this was AckSeer. He was also the co-creator of BotSeer, the first search engine for robots.txt files.
Research in collaboration with Professors Prasenjit Mitra, Karl Mueller, Barbara Garrison and James Kubicki resulted in the development of ChemXSeer, a search engine and data portal for chemistry. With Yang Sun, a novel search engine, BotSeer, was designed to search and index robots.txt files on web sites. The Next Generation CiteSeer, CiteSeerX, came online in February 2008 with over one million articles indexed; with active crawling, it now exceeds 10 million articles. RefSeerX was a context-aware citation recommendation service that recommended the papers from CiteSeerX most relevant to a given text. These services were based on SeerSuite, a package of open source tools for searching and indexing academic documents and data.
Giles has been an expert witness for Quinn Emanuel Urquhart & Sullivan, LLP and Davis Polk & Wardwell, LLP representing Google and Yahoo.
Information retrieval (IR) in computing and information science is the task of identifying and retrieving information system resources that are relevant to an information need. The information need can be specified in the form of a search query. In the case of document retrieval, queries can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds.
CiteSeerX is a public search engine and digital library for scientific and academic papers, primarily in the fields of computer and information science.
DBLP is a computer science bibliography website. Starting in 1993 at Universität Trier in Germany, it grew from a small collection of HTML files and became an organization hosting a database and logic programming bibliography site. Since November 2018, DBLP has been a branch of Schloss Dagstuhl – Leibniz-Zentrum für Informatik (LZI). DBLP listed more than 5.4 million journal articles, conference papers, and other publications on computer science in December 2020, up from about 14,000 in 1995 and 3.66 million in July 2016. All important journals on computer science are tracked. Proceedings papers of many conferences are also tracked. It is mirrored at three sites across the Internet.
A metasearch engine is an online information retrieval tool that uses the data of a web search engine to produce its own results. Metasearch engines take input from a user and immediately query search engines for results. Sufficient data is gathered, ranked, and presented to the users.
A recommender system (RecSys), or a recommendation system (sometimes replacing system with terms such as platform, engine, or algorithm), is a subclass of information filtering system that provides suggestions for items that are most pertinent to a particular user. Recommender systems are particularly useful when an individual needs to choose an item from a potentially overwhelming number of items that a service may offer.
SmealSearch was a web portal, search engine and digital library for academic business documents that was originally hosted at the defunct eBusiness Research Center at the Pennsylvania State University. It was based on the CiteSeer digital library and search engine technology. Due to lack of support, it moved to the College of Information Sciences and Technology and became BizSeer. It was enhanced and modified by many including Lee Giles, Yang Sun, Sandip Debnath, Isaac Councill, Arvind Rangaswamy, Nirmal Pal, Yves Petinot and Pradeep Teregowda.
Citation analysis is the examination of the frequency, patterns, and graphs of citations in documents. It uses the directed graph of citations — links from one document to another document — to reveal properties of the documents. A typical aim would be to identify the most important documents in a collection. A classic example is that of the citations between academic articles and books. For another example, judges of law support their judgements by referring back to judgements made in earlier cases. An additional example is provided by patents which contain prior art, citation of earlier patents relevant to the current claim. The digitization of patent data and increasing computing power have led to a community of practice that uses these citation data to measure innovation attributes, trace knowledge flows, and map innovation networks.
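One simple way to identify the most important documents in a collection, as described above, is to count incoming links in the directed citation graph. The sketch below assumes the graph is given as a list of hypothetical `(citing, cited)` pairs; the function names and document labels are illustrative, and in-degree counting is only one of many possible importance measures:

```python
from collections import Counter


def citation_counts(edges):
    """Count incoming citations per document from (citing, cited) pairs."""
    return Counter(cited for _citing, cited in edges)


def most_cited(edges, k=3):
    """Return the k most cited documents as (document, count) pairs."""
    counts = citation_counts(edges)
    # Sort by descending count, then by label so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical citation graph: each pair means "first document cites second".
edges = [
    ("A", "C"), ("B", "C"), ("D", "C"),
    ("A", "B"), ("D", "B"),
    ("C", "E"),
]
print(most_cited(edges, k=2))
```

In this toy graph, document C is cited three times and B twice, so they rank first and second.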
Thomas Shi-Tao Huang was a Chinese-born Taiwanese-American computer scientist and electrical engineer. He was a researcher and professor emeritus at the University of Illinois at Urbana-Champaign (UIUC). Huang was one of the leading figures in computer vision, pattern recognition and human computer interaction.
Michael S. Lew is a scientist in multimedia information search and retrieval at Leiden University, Netherlands. He has published over a dozen books and 150 scientific articles in the areas of content based image retrieval, computer vision, and deep learning. Notably, he had the most cited paper in the ACM Transactions on Multimedia, one of the top 10 most cited articles in the history of the ACM SIGMM, and the most cited article from the ACM International Conference on Multimedia Information Retrieval in 2008 and also in 2010. He was the opening keynote speaker for the 9th International Conference on Visual Information Systems, the Editor-in-Chief of the International Journal of Multimedia Information Retrieval (Springer), the co-founder of influential conferences such as the International Conference on Image and Video Retrieval, and the IEEE Workshop on Human Computer Interaction. He was also a founding member of the international advisory committee for the TRECVID video retrieval evaluation project, chair of the steering committee for the ACM International Conference on Multimedia Retrieval and a member of the ACM SIGMM Executive Committee. In addition, his work on convolutional fusion networks in deep learning won the best paper award at the 23rd International Conference on Multimedia Modeling. His work is frequently cited in both scientific and popular news sources.
Paul Alexander Desmond de Maine was a leading figure in the early development of computer-based automatic indexing and information retrieval and one of the founders of academic computer science in the 1960s.
ACM Multimedia (ACM-MM) is the Association for Computing Machinery (ACM)'s annual conference on multimedia, sponsored by the SIGMM special interest group on multimedia in the ACM. SIGMM specializes in the field of multimedia computing, from underlying technologies to applications, theory to practice, and servers to networks to devices.
James Ze Wang is a Chinese-American computer scientist. He is a distinguished professor of the College of Information Sciences and Technology at Pennsylvania State University. He is also an affiliated professor of the Molecular, Cellular, and Integrative Biosciences Program; the Computational Science Graduate Minor; and the Social Data Analytics Graduate Program. He is co-director of the Intelligent Information Systems Laboratory. He was a visiting professor of the Robotics Institute at Carnegie Mellon University from 2007 to 2008. In 2011 and 2012, he served as a program manager in the Office of International Science and Engineering at the National Science Foundation. He is the second son of Chinese mathematician Wang Yuan.
BotSeer was a Web-based information system and search tool used for research on Web robots and trends in Robot Exclusion Protocol deployment and adherence. It was created and designed by Yang Sun, Isaac G. Councill, Ziming Zhuang and C. Lee Giles. BotSeer was in operation from 2007 to 2010, approximately.
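To illustrate the kind of data such a tool works with, the sketch below parses a robots.txt file and checks which user agents may fetch which paths, using `urllib.robotparser` from the Python standard library. This is not BotSeer's implementation; the sample file, the bot names and the `build_parser` helper are assumptions made for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (not taken from any real site).
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A robot with no dedicated rule falls back to the "*" entry.
print(parser.can_fetch("OtherBot", "https://example.com/index.html"))
print(parser.can_fetch("OtherBot", "https://example.com/private/data.html"))

# ExampleBot is excluded from the whole site by its own entry.
print(parser.can_fetch("ExampleBot", "https://example.com/index.html"))
```

Collecting such rules across many sites is what makes it possible to study how the Robot Exclusion Protocol is deployed in practice.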
Amit Sheth is a computer scientist at University of South Carolina in Columbia, South Carolina. He is the founding Director of the Artificial Intelligence Institute, and a professor of Computer Science and Engineering. From 2007 to June 2019, he was the Lexis Nexis Ohio Eminent Scholar, director of the Ohio Center of Excellence in Knowledge-enabled Computing, and a professor of Computer Science at Wright State University. Sheth's work has been cited by over 48,800 publications. He has an h-index of 117, which puts him among the top 100 computer scientists with the highest h-index. Prior to founding the Kno.e.sis Center, he served as the director of the Large Scale Distributed Information Systems Lab at the University of Georgia in Athens, Georgia.
Knowledge retrieval seeks to return information in a structured form, consistent with human cognitive processes as opposed to simple lists of data items. It draws on a range of fields including epistemology, cognitive psychology, cognitive neuroscience, logic and inference, machine learning and knowledge discovery, linguistics, and information technology.
W. Bruce Croft is a distinguished professor of computer science at the University of Massachusetts Amherst whose work focuses on information retrieval. He is the founder of the Center for Intelligent Information Retrieval and served as the editor-in-chief of ACM Transactions on Information Systems from 1995 to 2002. He was also a member of the National Research Council Computer Science and Telecommunications Board from 2000 to 2003. Since 2015, he has been the Dean of the College of Information and Computer Sciences at the University of Massachusetts Amherst. He was Chair of the UMass Amherst Computer Science Department from 2001 to 2007.
Learning to rank or machine-learned ranking (MLR) is the application of machine learning, typically supervised, semi-supervised or reinforcement learning, in the construction of ranking models for information retrieval systems. Training data may, for example, consist of lists of items with some partial order specified between items in each list. This order is typically induced by giving a numerical or ordinal score or a binary judgment for each item. The goal of constructing the ranking model is to rank new, unseen lists in a similar way to rankings in the training data.
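As a rough illustration of the idea, the sketch below trains a linear scoring function from one list of items with ordinal relevance labels, then ranks a new, unseen list. It uses a simple pairwise update with a margin, which is only one of several learning-to-rank approaches; the feature vectors, labels and function names are hypothetical:

```python
def score(weights, features):
    """Linear score: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(lists, n_features, epochs=20, lr=0.1):
    """Learn weights so that higher-labelled items outscore lower-labelled ones."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for items in lists:
            for xi, yi in items:
                for xj, yj in items:
                    # Only pairs where item i should rank above item j.
                    if yi > yj and score(weights, xi) - score(weights, xj) < 1.0:
                        # Margin violated: move weights toward the difference.
                        weights = [w + lr * (a - b)
                                   for w, a, b in zip(weights, xi, xj)]
    return weights


def rank(weights, candidates):
    """Return candidate indices ordered from highest to lowest score."""
    order = sorted(range(len(candidates)),
                   key=lambda i: score(weights, candidates[i]),
                   reverse=True)
    return order


# Hypothetical training list: (feature vector, relevance label) pairs.
training = [[([1.0, 0.2], 2), ([0.5, 0.9], 1), ([0.1, 0.4], 0)]]
weights = train_pairwise(training, n_features=2)

# New, unseen list ranked with the learned weights.
unseen = [[0.2, 0.8], [0.9, 0.1], [0.6, 0.5]]
print(rank(weights, unseen))
```

In the toy training data, relevance follows the first feature, so the learned weights place the unseen item with the largest first feature at the top.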
Prabhakar Raghavan is a business executive and former researcher of web information retrieval. He currently holds the role of Chief Technologist at Google. His research spans algorithms, web search and databases. He is the co-author of the textbooks Randomized Algorithms with Rajeev Motwani and Introduction to Information Retrieval.
Venkata Narayana Padmanabhan is a computer scientist and principal researcher at Microsoft Research India. He is known for his research in networked and mobile systems. He is an elected fellow of the Indian National Academy of Engineering, Institute of Electrical and Electronics Engineers and the Association for Computing Machinery. The Council of Scientific and Industrial Research, the apex agency of the Government of India for scientific research, awarded him the Shanti Swarup Bhatnagar Prize for Science and Technology, one of the highest Indian science awards for his contributions to Engineering Sciences in 2016.
Jianchang (JC) Mao is a Chinese-American computer scientist and Vice President, Google Assistant Engineering at Google. His research spans artificial intelligence, machine learning, computational advertising, data mining, and information retrieval. He was named a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) in 2012 for his contributions to pattern recognition, search, content analysis, and computational advertising.
For contributions to information processing and web analysis.
C. Lee Giles 1997: for contributions to the theory and practice of neural networks