Thomsen Diagrams

Last updated

Thomsen Diagrams are the diagrammatic methodology developed by Erik Thomsen in 1997. [1] They are essentially a metaphor for describing multidimensional data spaces in the OLAP system. [2] It may be thought of as a multi-dimensional domain structure. In the structure, each dimension is represented by a vertical line, and hence each dimension is described independently.

Every member of a dimension is represented by a unit interval on the line. A multi-dimensional model is built by combining the resultant lines for the particular dimensions.

The Thomsen diagrammatical technique is not based on angular defined dimensions, and is thus able to represent any number of dimensions. It may be referred to as a multi-dimensional type structure [3] (MTS). The MTS permits the viewing of information about hierarchies and data flows, both within and between structures, hence enhancing the capabilities of the OLAP system.

Related Research Articles

Data warehouse system used for reporting and data analysis

In computing, a data warehouse, also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis, and is considered a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data in one single place that are used for creating analytical reports for workers throughout the enterprise.

Online analytical processing, or OLAP, is an approach to answer multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, report writing and data mining. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas, with new applications emerging, such as agriculture.

Data mart access pattern in data warehouse environments, used to retrieve client-facing data pertaining to a single department, so that each department isolates the use of its own data

A data mart is a structure / access pattern specific to data warehouse environments, used to retrieve client-facing data. The data mart is a subset of the data warehouse and is usually oriented to a specific business line or team. Whereas data warehouses have an enterprise-wide depth, the information in data marts pertains to a single department. In some deployments, each department or business unit is considered the owner of its data mart including all the hardware, software and data. This enables each department to isolate the use, manipulation and development of their data. In other deployments where conformed dimensions are used, this business unit ownership will not hold true for shared dimensions like customer, product, etc.

Table (information) arrangement of data in rows and columns

A table is an arrangement of data in rows and columns, or possibly in a more complex structure. Tables are widely used in communication, research, and data analysis. Tables appear in print media, handwritten notes, computer software, architectural ornamentation, traffic signs, and many other places. The precise conventions and terminology for describing tables vary depending on the context. Further, tables differ significantly in variety, structure, flexibility, notation, representation and use. In books and technical articles, tables are typically presented apart from the main text in numbered and captioned floating blocks.

A diagram is a symbolic representation of information using visualization techniques. Diagrams have been used since ancient times, but became more prevalent during the Enlightenment. Sometimes, the technique uses a three-dimensional visualization which is then projected onto a two-dimensional surface. The word graph is sometimes used as a synonym for diagram.

OLAP cube

An OLAP cube is a multi-dimensional array of data. Online analytical processing (OLAP) is a computer-based technique of analyzing data to look for insights. The term cube here refers to a multi-dimensional dataset, which is also sometimes called a hypercube if the number of dimensions is greater than 3.

Entity–relationship model Model or diagram describing interrelated things

An entity–relationship model describes interrelated things of interest in a specific domain of knowledge. A basic ER model is composed of entity types and specifies relationships that can exist between entities.

In computing, the star schema is the simplest style of data mart schema and is the approach most widely used to develop data warehouses and dimensional data marts. The star schema consists of one or more fact tables referencing any number of dimension tables. The star schema is an important special case of the snowflake schema, and is more effective for handling simpler queries.

In statistics, econometrics, and related fields, multidimensional analysis (MDA) is a data analysis process that groups data into two categories: data dimensions and measurements. For example, a data set consisting of the number of wins for a single football team at each of several years is a single-dimensional data set. A data set consisting of the number of wins for several football teams in a single year is also a single-dimensional data set. A data set consisting of the number of wins for several football teams over several years is a two-dimensional data set.

Snowflake schema

In computing, a snowflake schema is a logical arrangement of tables in a multidimensional database such that the entity relationship diagram resembles a snowflake shape. The snowflake schema is represented by centralized fact tables which are connected to multiple dimensions.. "Snowflaking" is a method of normalizing the dimension tables in a star schema. When it is completely normalized along all the dimension tables, the resultant structure resembles a snowflake with the fact table in the middle. The principle behind snowflaking is normalization of the dimension tables by removing low cardinality attributes and forming separate tables.

Essbase is a multidimensional database management system (MDBMS) that provides a multidimensional database platform upon which to build analytic applications. Essbase, whose name derives from "extended spreadsheet database", began as a product of Arbor Software, which merged with Hyperion Software in 1998. Oracle Corporation acquired Hyperion Solutions Corporation in 2007, as of 2009 Oracle marketed Essbase as "Oracle Essbase" and more recently, Essbase is offered as part of the Oracle Analytics Cloud. Until late 2005 IBM also marketed an OEM version of Essbase as DB2 OLAP Server.

In computer programming, an Iliffe vector, also known as a display, is a data structure used to implement multi-dimensional arrays. An Iliffe vector for an n-dimensional array consists of a vector of pointers to an (n − 1)-dimensional array. They are often used to avoid the need for expensive multiplication operations when performing address calculation on an array element. They can also be used to implement jagged arrays, such as triangular arrays, triangular matrices and other kinds of irregularly shaped arrays. The data structure is named after John K. Iliffe.

Multidimensional Expressions (MDX) is a query language for online analytical processing (OLAP) using a database management system. Much like SQL, it is a query language for OLAP cubes. It is also a calculation language, with syntax similar to spreadsheet formulas.

Microsoft SQL Server Analysis Services, SSAS, is an online analytical processing (OLAP) and data mining tool in Microsoft SQL Server. SSAS is used as a tool by organizations to analyze and make sense of information possibly spread out across multiple databases, or in disparate tables or files. Microsoft has included a number of services in SQL Server related to business intelligence and data warehousing. These services include Integration Services, Reporting Services and Analysis Services. Analysis Services includes a group of OLAP and data mining capabilities and comes in two flavors - Multidimensional and Tabular.

Dimensional modeling (DM) is part of the Business Dimensional Lifecycle methodology developed by Ralph Kimball which includes a set of methods, techniques and concepts for use in data warehouse design. The approach focuses on identifying the key business processes within a business and modelling and implementing these first before adding additional business processes, a bottom-up approach. An alternative approach from Inmon advocates a top down design of the model of all the enterprise data using tools such as entity-relationship modeling (ER).

Diagrammatic reasoning reasoning by the mean of visual representations

Diagrammatic reasoning is reasoning by means of visual representations. The study of diagrammatic reasoning is about the understanding of concepts and ideas, visualized with the use of diagrams and imagery instead of by linguistic or algebraic means.

Database model generic structure of a database type, for instance relational

A database model is a type of data model that determines the logical structure of a database and fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

The dimensional fact model (DFM) is an ad hoc and graphical formalism specifically devised to support the conceptual modeling phase in a DW project. DFM is extremely intuitive and can be used by analysts and non-technical users as well. A short-term working is sufficient to realize a clear and exhaustive representation of multidimensional concepts. It can be used from the initial DW life-cycle steps, to rapidly devise a conceptual model to share with customers.

Array DBMS System that provides database services specifically for arrays

Array database management systems provide database services specifically for arrays, that is: homogeneous collections of data items, sitting on a regular grid of one, two, or more dimensions. Often arrays are used to represent sensor, simulation, image, or statistics data. Such arrays tend to be Big Data, with single objects frequently ranging into Terabyte and soon Petabyte sizes; for example, today’s earth and space observation archives typically grow by Terabytes a day. Array databases aim at offering flexible, scalable storage and retrieval on this information category.

The functional database model is used to support analytics applications such as financial planning and performance management. The functional database model, or the functional model for short, is different from but complementary to the relational model. The functional model is also distinct from other similarly named concepts, including the DAPLEX functional database model and functional language databases.

References

  1. Erik Thomsen, 1997, OLAP Solutions: Building Multidimensional Information Systems (1st ed.), John Wiley
  2. "Thomsen's diagram | nan's BI trip". Nanzheng.wordpress.com. 2011-08-29. Retrieved 2014-04-05.
  3. Erik Thomsen, 1997, OLAP Solutions: Building Multidimensional Information Systems (2nd ed.), John Wiley