Nassib Nassar

Nassib Nassar is an American computer scientist and classical pianist.

As a computer scientist, Nassar was among the architects of information retrieval software for the World Wide Web and in 1994 created Isearch, one of the earliest open-source search engines. [1] He was president of Etymon Systems, an open-source software company founded in 1998 and best known for producing Etymon PJ, which became the standard library for generating Portable Document Format (PDF) documents in Java, [2] [3] and Amberfish, a large-scale information retrieval system for semi-structured text and XML. [4] [5] [6] [7]

As a pianist, Nassar won The American Prize in 2014 for his performance of the Brahms F minor piano sonata. His other performances have included the 32 piano sonatas of Beethoven, played in a series of seven recitals. [8] [9]

Related Research Articles

Transclusion

In computer science, transclusion is the inclusion of part or all of an electronic document into one or more other documents by hypertext reference. Transclusion is usually performed when the referencing document is displayed, and is normally automatic and transparent to the end user. The result of transclusion is a single integrated document made of parts assembled dynamically from separate sources, possibly stored on different computers in disparate places.

Information retrieval (IR) is the process of obtaining information system resources that are relevant to an information need from a collection of those resources. Searches can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds.

The Semantic Web is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable.

A translation memory (TM) is a database that stores "segments", which can be sentences, paragraphs or sentence-like units that have previously been translated, in order to aid human translators. The translation memory stores the source text and its corresponding translation in language pairs called "translation units". Individual words are handled by terminology bases and are not within the domain of TM.

A document management system (DMS) is a system used to receive, track, manage and store documents and reduce paper. Most are capable of keeping a record of the various versions created and modified by different users. In the case of the management of digital documents such systems are based on computer programs. The term has some overlap with the concepts of content management systems. It is often viewed as a component of enterprise content management (ECM) systems and related to digital asset management, document imaging, workflow systems and records management systems.

Information science is an academic field which is primarily concerned with analysis, collection, classification, manipulation, storage, retrieval, movement, dissemination, and protection of information. Practitioners within and outside the field study the application and the usage of knowledge in organizations in addition to the interaction between people, organizations, and any existing information systems with the aim of creating, replacing, improving, or understanding information systems. Historically, information science is associated with computer science, data science, psychology, technology, and intelligence agencies. However, information science also incorporates aspects of diverse fields such as archival science, cognitive science, commerce, law, linguistics, museology, management, mathematics, philosophy, public policy, and social sciences.

Text Retrieval Conference

The Text REtrieval Conference (TREC) is an ongoing series of workshops focusing on a list of different information retrieval (IR) research areas, or tracks. It is co-sponsored by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity, and began in 1992 as part of the TIPSTER Text program. Its purpose is to support and encourage research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies and to increase the speed of lab-to-product transfer of technology.

News aggregator

In computing, a news aggregator, also termed a feed aggregator, feed reader, news reader, RSS reader or simply an aggregator, is client software or a web application that aggregates syndicated web content such as online newspapers, blogs, podcasts, and video blogs (vlogs) in one location for easy viewing. The updates distributed may include journal tables of contents, podcasts, videos, and news items.

The following outline is provided as an overview of and topical guide to library science:

Isearch is open-source text retrieval software first developed in 1994 by Nassib Nassar as part of the Isite Z39.50 information framework. The project started at the Clearinghouse for Networked Information Discovery and Retrieval (CNIDR) of the North Carolina supercomputing center MCNC and was funded by the National Science Foundation to follow in the track of WAIS and develop prototype systems for distributed information networks encompassing Internet applications, library catalogs and other information resources.

The Lemur Project is a collaboration between the Center for Intelligent Information Retrieval at the University of Massachusetts Amherst and the Language Technologies Institute at Carnegie Mellon University. The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software. The project is best known for its Indri and Galago search engines, the ClueWeb09 and ClueWeb12 datasets, and the RankLib learning-to-rank library. The software and datasets are used widely in scientific and research applications, as well as in some commercial applications.

Search engine indexing is the collecting, parsing, and storing of data to facilitate fast and accurate information retrieval. Index design incorporates interdisciplinary concepts from linguistics, cognitive psychology, mathematics, informatics, and computer science. An alternate name for the process in the context of search engines designed to find web pages on the Internet is web indexing.

Enterprise search is the practice of making content from multiple enterprise-type sources, such as databases and intranets, searchable to a defined audience.

dtSearch Corp. is a software company which specializes in text retrieval software. It was founded in 1991, and is headquartered in Bethesda, Maryland. Its current range of software includes products for enterprise desktop search, Intranet/Internet spidering and search, and search engines for developers (SDK) to integrate into other software applications.

Information Retrieval Facility

The Information Retrieval Facility (IRF), founded in 2006 and located in Vienna, Austria, was a research platform for networking and collaboration for professionals in the field of information retrieval. It ceased operations in 2012.

A concept search is an automated information retrieval method that is used to search electronically stored unstructured text for information that is conceptually similar to the information provided in a search query. In other words, the ideas expressed in the information retrieved in response to a concept search query are relevant to the ideas contained in the text of the query.

XML retrieval, or XML information retrieval, is the content-based retrieval of documents structured with XML. As such it is used for computing relevance of XML documents.

Regional Information Center for Science and Technology

Regional Information Center for Science and Technology (RICeST) is an Iranian governmental organisation established to promote the production and distribution of scientific information in Iran and Islamic countries, providing reference, study and bibliographical information and related services. It also undertakes scientometrics based on its databases of scientific products of Iran and Islamic countries.

The following is provided as an overview of and topical guide to databases:

The Clearinghouse for Networked Information Discovery and Retrieval or CNIDR was an organization funded by the U.S. National Science Foundation from 1993 to 1997 and based at the Microelectronics Center of North Carolina (MCNC) in Research Triangle Park. CNIDR was active in the research and development of open source software and open standards, centered on information discovery and retrieval, in the emerging Internet.

References

  1. Menconi, David. "Nassib Nassar plays (and works) the keyboards." The News & Observer, January 10, 2015, Raleigh, NC.
  2. Zipper, Bernd. "PDF on the Fly: Tools and Strategies for Automatic Generation of PDF Files." The Seybold Report: Analyzing Publishing Technologies, August 19, 2002.
  3. Nassar, Nassib. "Automating PDF Objects for Interactive Publishing." Web Techniques, October, 1998.
  4. Fallen, Christopher T. and Newby, Gregory B. 2005. "Logistic Regression Merging of Amberfish and Lucene Multisearch Results." Text Retrieval Conference (TREC) 2005 Proceedings, Gaithersburg, MD: National Institute of Standards and Technology.
  5. Sullivan, Dan. 2001. Document Warehousing and Text Mining: Techniques for Improving Business Operations, Marketing, and Sales. Wiley.
  6. Hu, Gongzhu and Tang, Chunxia. 2004. "Indexing XML Data for Path Expression Queries." In Software Engineering Research and Applications, Lecture Notes in Computer Science. Springer.
  7. Arnold, Stephen. "Etymon: Maybe Another Lost Search Vendor," January 19, 2009.
  8. Barrios, Karla. "Meet a Tar Heel: Piano man/computer scientist Nassib Nassar." University of North Carolina at Chapel Hill, November 6, 2014.
  9. The American Prize