Robert L. Grossman

Born: Robert Lee Grossman
Fields: Data science, computer science
Institutions: University of Chicago

Robert Lee Grossman is an American computer scientist and bioinformatician at the University of Chicago. His primary research interests are data science and data-intensive computing.

Research

Grossman has worked in several fields. His early work (1984–1990) was in mathematics, where he developed algorithms for symbolic and numeric computing. In 1989, working with Richard Larson, he showed that trees have a natural multiplicative structure and form a Hopf algebra. [1] This algebra, sometimes called the Grossman–Larson algebra, is dual to the Connes–Kreimer algebra, [2] which is one way of organizing the computations required when renormalizing Feynman diagrams. Working with Peter Crouch, he showed that there are Runge–Kutta methods that evolve naturally on Lie groups. [3]

From 1990 to 2010, he worked primarily in computer science, specifically data mining and data-intensive computing. With Stuart Bailey and Yunhong Gu, he developed open source software for moving large datasets over high-performance wide area networks (PTool and the UDP-based Data Transfer Protocol, or UDT). [4] With Yunhong Gu, he also developed Sector/Sphere, a distributed platform for data-intensive computing. [5] During this period, he also founded the Data Mining Group, which develops data mining standards, and led the technical working group that developed the Predictive Model Markup Language (PMML), a widely adopted XML-based standard for exchanging predictive models.

Since 2010, he has focused primarily on data science and its applications to biology, medicine, health care, and the environment. He developed the first biomedical cloud to be designated an NIH Trusted Partner, allowing it to interoperate with NIH's controlled-access genomic data. [6] He is currently leading the effort to build the NCI Genomic Data Commons, which will host all the genomic and associated clinical data from NIH/NCI-funded research projects and clinical trials. [7]

He is a faculty member at the University of Chicago, [8] where he directs the Center for Data Intensive Science. He is also the founder and director of the Open Commons Consortium.

Entrepreneurial activity

Grossman founded Magnify, Inc. in 1996, was its CEO from 1996 to 2000, and its Chairman until 2005, when it was sold to ChoicePoint. [9] Magnify is now part of LexisNexis. Magnify provided data mining solutions to the financial services sector. Grossman founded Open Data Group Inc. in 2001, which provides data science services so that clients can build predictive models over big data. Open Data's main product is a high performance scoring engine for statistics and analytic models that is compliant with the Portable Format for Analytics standard. [10] Grossman is the Chief Data Scientist at Open Data Group.

Education

Grossman was born in Shaker Heights, Ohio. He received an A.B. from Harvard University in 1980 and a Ph.D. from the Program in Applied and Computational Mathematics at Princeton University in 1985. He was an NSF Postdoctoral Research Fellow in the Mathematics Department at the University of California, Berkeley, from 1984 to 1988.

Awards and honors

Grossman is a Fellow of the American Association for the Advancement of Science. [11] In 2017 he became a Fellow of the Association for Computing Machinery. [12]

References

  1. Robert L. Grossman and Richard G. Larson, "Hopf-algebraic structure of families of trees", Journal of Algebra, Volume 126, 1989, pages 184–210.
  2. Michael E. Hoffman, "Combinatorics of Rooted Trees and Hopf Algebras", Transactions of the American Mathematical Society, 2003
  3. Crouch, Peter; Grossman, Robert L. (1993). "Numerical integration of ordinary differential equations on manifolds". Journal of Nonlinear Science. 3 (1): 1–33. Bibcode:1993JNS.....3....1C. doi:10.1007/bf02429858. S2CID 13991373.
  4. Gu, Yunhong; Grossman, Robert L. (2007). "UDT: UDP-based data transfer for high-speed wide area networks". Computer Networks. 51 (7): 1777–1799. doi:10.1016/j.comnet.2006.11.009.
  5. Gu, Yunhong; Grossman, Robert L. (2009). "Sector and Sphere: The design and implementation of a high-performance data cloud". Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. 367 (1897): 2429–2445. doi:10.1098/rsta.2009.0053. PMC 3391065. PMID 19451100.
  6. Heath, Allison P; Greenway, Matthew; Powell, Raymond; Spring, Jonathan; Suarez, Rafael; Hanley, David; Bandlamudi, Chai; McNerney, Megan E; White, Kevin P; Grossman, Robert L (2014). "Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets". Journal of the American Medical Informatics Association. 21 (6): 969–975. doi:10.1136/amiajnl-2013-002155. PMC 4215034. PMID 24464852.
  7. NCI Genomic Data Commons
  8. Computation Institute Faculty - Robert L. Grossman
  9. ChoicePoint adds Magnify Inc., Atlanta Business Chronicle, May 2, 2005
  10. Portable Format for Analytics
  11. AAAS Council Elects 388 New AAAS Fellows
  12. CACM Staff (March 2017), "ACM Recognizes New Fellows", Communications of the ACM, 60 (3): 23, doi:10.1145/3039921, S2CID 31701275.