National Center for Data Mining

The National Center for Data Mining (NCDM) is a center of the University of Illinois at Chicago (UIC), established in 1998 to serve as a resource for research, standards development, and outreach for high performance and distributed data mining and predictive modeling.

University of Illinois at Chicago Public University

The University of Illinois at Chicago (UIC) is a public research university in Chicago, Illinois. Its campus is in the Near West Side community area, adjacent to the Chicago Loop. The second campus established under the University of Illinois system, UIC is also the largest university in the Chicago area, having approximately 30,000 students enrolled in 15 colleges.

Data mining computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems; interdisciplinary subfield of computer science

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal to extract information from a data set and transform the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data; in contrast, data mining uses machine-learning and statistical models to uncover clandestine or hidden patterns in a large volume of data.

NCDM won the High Performance Bandwidth Challenge at SuperComputing '06 in Tampa, FL, and recently demonstrated the use of UDP Data Transport.


Related Research Articles

National Center for Supercomputing Applications

The National Center for Supercomputing Applications (NCSA) is a state-federal partnership to develop and deploy national-scale cyberinfrastructure that advances research, science and engineering based in the United States of America. NCSA operates as a unit of the University of Illinois at Urbana–Champaign, and provides high-performance computing resources to researchers across the country. Support for NCSA comes from the National Science Foundation, the state of Illinois, the University of Illinois, business and industry partners, and other federal agencies.

In telecommunications, broadband is wide bandwidth data transmission which transports multiple signals and traffic types. The medium can be coaxial cable, optical fiber, radio or twisted pair.

San Diego Supercomputer Center

The San Diego Supercomputer Center (SDSC) is an organized research unit of the University of California, San Diego (UCSD). SDSC is located at the east end of Eleanor Roosevelt College on the UCSD campus, immediately north of the Hopkins Parking Structure.

Copper Harbor, Michigan Census-designated place & unincorporated community in Michigan, United States

Copper Harbor is an unincorporated community and census-designated place in northeastern Keweenaw County in the U.S. state of Michigan. It is within Grant Township on the Keweenaw Peninsula which juts out from the Upper Peninsula of Michigan into Lake Superior. Its population was 108 as of the 2010 census.

Geography of Chicago

The city of Chicago is located in northern Illinois, United States, at the southwestern tip of Lake Michigan. It sits on the Saint Lawrence Seaway continental divide at the site of the Chicago Portage, an ancient trade route connecting the Mississippi River and the Great Lakes watersheds.

Google data centers

Google data centers are the large data center facilities Google uses to provide their services, which combine large amounts of digital storage, compute nodes organized in aisles of racks, internal and external networking, environmental controls, and operations software.

UDP-based Data Transfer Protocol (UDT) is a high-performance data transfer protocol designed for transferring large volumetric datasets over high-speed wide area networks. Such settings are typically disadvantageous for the more common TCP.

National Energy Research Scientific Computing Center supercomputer facility operated by the US Department of Energy in Berkeley, California

The National Energy Research Scientific Computing Center, or NERSC, is a high performance computing (supercomputer) user facility operated by Lawrence Berkeley National Laboratory for the United States Department of Energy Office of Science. As the mission computing center for the Office of Science, NERSC houses high performance computing and data systems used by 7,000 scientists at national laboratories and universities around the country. NERSC's newest and largest supercomputer is Cori, which was ranked 5th on the TOP500 list of world's fastest supercomputers in November 2016. NERSC is located on the main Berkeley Lab campus in Berkeley, California.

In computing, computer performance is the amount of useful work accomplished by a computer system. Outside of specific contexts, computer performance is estimated in terms of accuracy, efficiency and speed of executing computer program instructions. When it comes to high computer performance, one or more of the following factors might be involved:

Ganeer Township, Kankakee County, Illinois Township in Illinois, United States

Ganeer Township is one of seventeen townships in Kankakee County, Illinois, USA. As of the 2010 census, its population was 3,215 and it contained 1,411 housing units.

Pembroke Township, Kankakee County, Illinois Township in Illinois, United States

Pembroke Township is one of seventeen townships in Kankakee County, Illinois, USA. As of the 2010 census, its population was 2,140 and it contained 1,062 housing units. Pembroke Township was formed from parts of Momence Township on February 17, 1877.

Downers Grove Township, DuPage County, Illinois Township in Illinois, United States

Downers Grove Township is one of nine townships in DuPage County, Illinois, USA. As of the 2010 census, its population was 146,795 and it contained 60,438 housing units. It is the largest township in the county in terms of both area and population.

York Township, DuPage County, Illinois Township in Illinois, United States

York Township is one of nine townships in DuPage County, Illinois, USA. As of the 2010 census, its population was 123,449 and it contained 51,557 housing units.

Boyers, Pennsylvania human settlement in Pennsylvania, United States of America

Boyers is an unincorporated village in Marion Township, Butler County, Pennsylvania, United States. It has a small population with a few businesses located in the center of the town. Slippery Rock Creek flows through the community. The creek's source is a few miles to the east, in the small village of Hilliards. PA 308 is one of the main roads in the area. The secondary state highway runs through the center of Boyers.

Intel Tera-Scale is a research program by Intel that focuses on developing Intel processors and platforms that exploit the inherent parallelism of emerging visual-computing applications. Such applications require teraFLOPS of parallel computing performance to process terabytes of data quickly. Parallelism is the concept of performing multiple tasks simultaneously. Exploiting parallelism increases not only the efficiency of central processing units (CPUs) but also the number of bytes of data analyzed each second. To apply parallelism effectively, the CPU must be able to handle multiple threads, which requires multiple cores. Consumer-grade computers conventionally have 2–8 cores, while workstation-grade computers can have more. However, even these core counts are not enough to reach teraFLOPS performance, so many more cores must be added. As a result of the program, two prototypes were manufactured to test the feasibility of having many more cores than the conventional number, and both proved successful.

Mashpee Middle-High School is a public high school located in Mashpee, Massachusetts. It is located at the intersection of Old Barnstable Road and Route 151, has an approximate enrollment of 425 students in grades 9–12 and is the home of the Technology "Center of Excellence". The school's mascot is the Falcons, and the school colors are Royal Blue and White.

The College of Engineering is an academic department at the University of Illinois at Chicago offering both undergraduate and graduate programs of study.

Fabric computing consolidated high-performance computing platform

Fabric computing or unified computing involves constructing a computing fabric consisting of interconnected nodes that look like a "weave" or a "fabric" when viewed/envisaged collectively from a distance.

The HPC Challenge Benchmark combines several benchmarks to test a number of independent attributes of the performance of high-performance computer (HPC) systems. The project has been co-sponsored by the DARPA High Productivity Computing Systems program, the United States Department of Energy and the National Science Foundation.

A data center is a pool of resources interconnected using a communication network. The data center network (DCN) plays a pivotal role in a data center, as it interconnects all of the data center resources. DCNs need to be scalable and efficient to connect tens or even hundreds of thousands of servers to handle the growing demands of cloud computing. Today’s data centers are constrained by the interconnection network.