Martin Ester | |
---|---|
Born | November 5, 1958 |
Children | 2 |
Academic background | |
Education | M.Sc., Technical University of Dortmund; PhD, ETH Zurich |
Thesis | Konsistenzwerkzeuge für PROLOG-Wissensbasen (1989) |
Academic work | |
Institutions | University of Munich; Simon Fraser University |
Notable ideas | DBSCAN |
Martin Ester FRSC (born November 5, 1958) is a Canadian-German Full Professor of Computing Science at Simon Fraser University. His research focuses on data mining and machine learning.
After earning his M.Sc., Ester worked for Swissair before earning a position at the University of Munich as an Assistant Professor in 1993. [1] Three years later, in 1996, Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu proposed a data clustering algorithm called "Density-based spatial clustering of applications with noise" (DBSCAN). [2] Their paper won the 2014 KDD Test of Time Award for "outstanding papers from past KDD Conferences beyond the last decade that have had an important impact on the data mining research community." [3]
A few years later, Ester moved to Vancouver and accepted a position at Simon Fraser University. [4] In 2009, Ester was selected to become an Associate Editor of the IEEE Transactions on Knowledge and Data Engineering. [5]
Between 2010 and 2015, Ester served as director of the SFU School of Computing Science before being succeeded by Greg Mori. [6] In 2016, Arnetminer listed Ester as the world's most influential scholar in data mining. At the time, Arnetminer recorded that Ester had authored 169 papers, which had gained more than 21,000 citations, and that he had reached an h-index of 50. [7] Besides working as a Full Professor at SFU, Ester also heads research at British Columbia Children's Hospital on the genetic influence on drug reception and reactions in patients. [8] His research team received a $9.9 million grant through Genome Canada's 2017 Large-Scale Applied Research Project Competition: Genomics and Precision Health. [9]
As a result of his research, Ester was elected a Fellow of the Royal Society of Canada in 2019. [4]
Data mining is the process of extracting and discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal of extracting information from a data set and transforming the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (a cluster) are more similar to each other than to those in other groups. It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, image analysis, information retrieval, bioinformatics, data compression, computer graphics and machine learning.
R-trees are tree data structures used for spatial access methods, i.e., for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons. The R-tree was proposed by Antonin Guttman in 1984 and has found significant use in both theoretical and applied contexts. A common real-world use for an R-tree is to store spatial objects such as restaurant locations or the polygons that typical maps are made of (streets, buildings, outlines of lakes, coastlines, etc.) and then quickly answer queries such as "find all museums within 2 km of my current location", "retrieve all road segments within 2 km of my location" or "find the nearest gas station". The R-tree can also accelerate nearest neighbor search for various distance metrics, including great-circle distance.
SIGKDD, representing the Association for Computing Machinery's (ACM) Special Interest Group (SIG) on Knowledge Discovery and Data Mining, hosts an influential annual conference.
Jiawei Han is a Chinese-American computer scientist and researcher. He currently holds the position of Michael Aiken Chair Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His research focuses on data mining, text mining, database systems, information networks, data mining from spatiotemporal data, Web data, and social/information network data.
In data analysis, anomaly detection is generally understood to be the identification of rare items, events or observations which deviate significantly from the majority of the data and do not conform to a well-defined notion of normal behaviour. Such examples may arouse suspicions of being generated by a different mechanism, or appear inconsistent with the remainder of that set of data.
Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. It is a non-parametric, density-based clustering algorithm: given a set of points in some space, it groups together points that are closely packed together, marking as outliers points that lie alone in low-density regions. DBSCAN is one of the most common clustering algorithms and also one of the most cited in the scientific literature.
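The grouping behaviour described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `dbscan` and `region_query`, the parameter names `eps` (neighbourhood radius) and `min_pts` (minimum neighbourhood size for a dense point), and the sample points are assumptions chosen for illustration, and the linear-scan neighbour search stands in for the spatial index a practical implementation would normally use.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1        # label for points in low-density regions
UNVISITED = None  # label for points not yet examined


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Illustrative DBSCAN: returns one label per point (cluster id or NOISE)."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not dense enough to start a cluster; may later become a border point.
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # previously noise, now a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # j is itself dense, so the cluster grows through its neighbours.
                seeds.extend(j_neighbors)
    return labels


# Hypothetical sample data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

In this sketch the two tight groups receive different cluster ids, while the isolated point keeps the `NOISE` label because its neighbourhood never reaches `min_pts`.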
Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters in spatial data. It was presented by Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel and Jörg Sander. Its basic idea is similar to DBSCAN, but it addresses one of DBSCAN's major weaknesses: the problem of detecting meaningful clusters in data of varying density. To do so, the points of the database are (linearly) ordered such that spatially closest points become neighbors in the ordering. Additionally, a special distance is stored for each point that represents the density that must be accepted for a cluster so that both points belong to the same cluster. This is represented as a dendrogram.
ELKI is a data mining software framework developed for use in research and teaching. It was originally developed at the database systems research unit of Professor Hans-Peter Kriegel at the Ludwig Maximilian University of Munich, Germany, and is now continued at the Technical University of Dortmund, Germany. It aims at allowing the development and evaluation of advanced data mining algorithms and their interaction with database index structures.
Hans-Peter Kriegel is a German computer scientist and professor at the Ludwig Maximilian University of Munich, where he leads the Database Systems Group in the Department of Computer Science. He was previously a professor at the University of Würzburg and the University of Bremen, after his habilitation at the Technical University of Dortmund and his doctorate from the Karlsruhe Institute of Technology.
AMiner is a free online service used to index, search, and mine big scientific data.
Usama M. Fayyad is an American-Jordanian data scientist and co-founder of the KDD conferences and the ACM SIGKDD association for Knowledge Discovery and Data Mining. He is a speaker on Business Analytics, Data Mining, Data Science, and Big Data. He recently left his role as the Chief Data Officer at Barclays Bank.
Bing Liu is a Chinese-American professor of computer science who specializes in data mining, machine learning, and natural language processing. In 2002, he became a scholar at the University of Illinois at Chicago. He holds a PhD from the University of Edinburgh (1988). His PhD advisors were Austin Tate and Kenneth Williamson Currie, and his PhD thesis was titled Reinforcement Planning for Resource Allocation and Constraint Satisfaction.
Gregory I. Piatetsky-Shapiro is a data scientist and the co-founder of the KDD conferences, and co-founder and past chair of the Association for Computing Machinery SIGKDD group for Knowledge Discovery, Data Mining and Data Science. He is the founder and president of KDnuggets, a discussion and learning website for Business Analytics, Data Mining and Data Science.
Huan Liu is a Chinese-born computer scientist.
Jianchang (JC) Mao is a Chinese-American computer scientist and Vice President, Google Assistant Engineering at Google. His research spans artificial intelligence, machine learning, computational advertising, data mining, and information retrieval. He was named a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) in 2012 for his contributions to pattern recognition, search, content analysis, and computational advertising.
Arthur Zimek is a professor in data mining, data science and machine learning at the University of Southern Denmark in Odense, Denmark.
Hui Xiong is a data scientist. He is a Distinguished Professor at Rutgers University and a Distinguished Guest Professor at the University of Science and Technology of China (USTC).
Spatial embedding is one of the feature learning techniques used in spatial analysis, in which points, lines, polygons or other spatial data types representing geographic locations are mapped to vectors of real numbers. Conceptually, it involves a mathematical embedding from a space with many dimensions per geographic object to a continuous vector space with a much lower dimension.
Lesley Shannon is a Canadian professor who is Chair for the Computer Engineering Option in the School of Engineering Science at Simon Fraser University. She is also the current NSERC Chair for Women in Science and Engineering for BC and Yukon. Shannon’s chair operates the Westcoast Women in Engineering, Science and Technology (WWEST) program to promote equity, diversity and inclusion in STEM.