Lattice Miner

Last updated

Lattice Miner [1] is a formal concept analysis software tool for the construction, visualization and manipulation of concept lattices. It allows the generation of formal concepts and association rules as well as the transformation of formal contexts via apposition, subposition, reduction and object/attribute generalization, and the manipulation of concept lattices via approximation, projection and selection. Lattice Miner allows also the drawing of nested line diagrams.

Contents

Introduction

Formal concept analysis (FCA) is a branch of applied mathematics based on the formalization of concept and concept hierarchy and mainly used as a framework for conceptual clustering and rule mining. [2] Over the last two decades, a collection of tools have emerged to help FCA users visualize and analyze concept lattices. [3] [4] They range from the earliest DOS-based implementations (e.g., ConImp and GLAD) to more recent implementations in Java like ToscanaJ, [5] Galicia, [6] ConExp [7] and Coron. [8] A main issue in the development of FCA tools is to visualize large concept lattices and provide efficient mechanisms to highlight patterns (e.g., concepts, associations) that could be relevant to the user. The initial objective of the FCA tool called Lattice Miner [9] was to focus on visualization mechanisms for the representation of concept lattices, including nested line diagrams. Later on, many other interesting features were integrated into the tool.

Functional architecture of Lattice Miner

Lattice Miner Architecture LatticeMiner Architecture.JPG
Lattice Miner Architecture

Lattice Miner is a Java-based platform whose functions are articulated around a core. The Lattice Miner core provides all low-level operations and structures for the representation and manipulation of contexts, lattices and association rules. Mainly, the core of Lattice Miner consists of three modules: context, concept and association rule modules. The user interface offers a context editor and concept lattice manipulator to assist the user in a set of tasks. The architecture of Lattice Miner is open and modular enough to allow the integration of new features and facilities in each one of its components.

Context module

Figure 1 LatticeMiner Figure-1.JPG
Figure 1

The context module offers all the basic operations and structures to manipulate binary and valued contexts as well as context decomposition to produce nested line diagrams. Basic context operations include apposition, subposition, generalization, clarification, reduction as well as the complementary context computation. The module provides also the arrow relations (for context reduction and decomposition) [2]. The tool has an input LMB format and recognizes the binary format SLF found in Galicia and the format CEX produced by ConExp.

Concept module

Figure 2 LatticeMiner Figure-2.JPG
Figure 2

The main function of the concept module is to generate the concepts of the current binary context and construct the corresponding lattice and nested structure (see Figures 2 and 3). It provides the user with basic operators such as projection, selection, and exact search as well as advanced features like pair approximation. Some known algorithms are included in this module such as Bordat’s procedure, Godin’s algorithm and NextClosure algorithm. [10] The approximation feature implemented in Lattice Miner is based on the following idea: given a pair (X,Y) where X ⊆ G, and Y ⊆ M, is there a set of formal concepts (Ai,Bi) which are “close to” (X,Y)? To answer this question, The tool starts to identify the type of couple that the pair (X,Y) represents. [11] It can be a formal concept, a protoconcept, a semiconcept or a preconcept. In the last case, the approximation is given by the interval [(X",X′),(Y′,Y")] and highlighted in the line diagram.

Association rule module

This module includes procedures for computing the (stem) GuiguesDuquenne base using NextClosure algorithm [3], as well as the generic and informative bases. Implications with negation can be obtained using the apposition of a context and its complementary. This module embeds also procedures for the computation of a non-redundant family C of implications and the closure of a set Y of attributes for the given implication set C.

User interface

The initial objective of Lattice Miner was to focus on lattice drawing and visualization either as a flat or nested structure by taking into account the cognitive process of human beings and known principles for lattice drawing (e.g., reducing the number of edge intersections, ensuring diagram symmetry). Some well-known visualization techniques were implemented such as focus & context and fisheye view. The basic idea behind focus & context visualization paradigm is to allow a viewer to see key (important) objects in full detail in the foreground (focus) while at the same time an overview of all the surrounding information (context) remains available in the background. Lattice Miner translates the focus & context paradigm into clear and blurred elements while the size of nodes and the intensity of their color were used to indicate their importance. Various forms of highlighting, labelling and animation are also provided.

Figure 3 LatticeMiner Figure-3.JPG
Figure 3

In order to better handle the display of large lattices, nested line diagrams are offered in the tool. Figure 3 shows the third level of the nested line diagram corresponding to the binary context of Figure 1 where three levels of nesting are defined. Each one of the inner nodes of this diagram represents a combination of attributes from the previous two (outer) levels. Real inner concepts (see the node on the left hand-side of the diagram) are identified by colored nodes while void elements are in grey color. Each node of levels 1 and 2 can be expanded to exhibit its internal line diagram. Both flat and nested diagrams can be saved as an image. Simple (flat) lattices can also be saved as an XML format file.

Related Research Articles

Tree structure a way of representing the hierarchical nature of a structure in a graphical form

A tree structure or tree diagram is a way of representing the hierarchical nature of a structure in a graphical form. It is named a "tree structure" because the classic representation resembles a tree, even though the chart is generally upside down compared to an actual tree, with the "root" at the top and the "leaves" at the bottom.

Data model abstract model for organizing data; abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world entities

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.

Software design is the process by which an agent creates a specification of a software artifact, intended to accomplish goals, using a set of primitive components and subject to constraints. Software design may refer to either "all the activity involved in conceptualizing, framing, implementing, commissioning, and ultimately modifying complex systems" or "the activity following requirements specification and before programming, as ... [in] a stylized software engineering process."

Formal concept analysis (FCA) is a principled way of deriving a concept hierarchy or formal ontology from a collection of objects and their properties. Each concept in the hierarchy represents the objects sharing some set of properties; and each sub-concept in the hierarchy represents a subset of the objects in the concepts above it. The term was introduced by Rudolf Wille in 1981, and builds on the mathematical theory of lattices and ordered sets that was developed by Garrett Birkhoff and others in the 1930s.

In computer science, model checking or property checking is a method for checking whether a finite-state model of a system meets a given specification. This is typically associated with hardware or software systems, where the specification contains liveness requirements as well as safety requirements.

Graph drawing visualization of node-link graphs

Graph drawing is an area of mathematics and computer science combining methods from geometric graph theory and information visualization to derive two-dimensional depictions of graphs arising from applications such as social network analysis, cartography, linguistics, and bioinformatics.

Hasse diagram visual depiction of a partially ordered set

In order theory, a Hasse diagram is a type of mathematical diagram used to represent a finite partially ordered set, in the form of a drawing of its transitive reduction. Concretely, for a partially ordered set (S, ≤) one represents each element of S as a vertex in the plane and draws a line segment or curve that goes upward from x to y whenever y covers x . These curves may cross each other but must not touch any vertices other than their endpoints. Such a diagram, with labeled vertices, uniquely determines its partial order.

Conceptual graph

A conceptual graph (CG) is a formalism for knowledge representation. In the first published paper on CGs, John F. Sowa used them to represent the conceptual schemas used in database systems. The first book on CGs applied them to a wide range of topics in artificial intelligence, computer science, and cognitive science.

A modeling language is any artificial language that can be used to express information or knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the structure.

Ben Shneiderman American computer scientist

Ben Shneiderman is an American computer scientist, a Distinguished University Professor in the University of Maryland Department of Computer Science, which is part of the University of Maryland College of Computer, Mathematical, and Natural Sciences at the University of Maryland, College Park, and the founding director (1983-2000) of the University of Maryland Human-Computer Interaction Lab. He conducted fundamental research in the field of human–computer interaction, developing new ideas, methods, and tools such as the direct manipulation interface, and his eight rules of design.

Weka (machine learning) suite of machine learning software written in Java

Waikato Environment for Knowledge Analysis (Weka), developed at the University of Waikato, New Zealand. It is free software licensed under the GNU General Public License, and the companion software to the book "Data Mining: Practical Machine Learning Tools and Techniques".

Recursion (computer science) Use of functions that call themselves (on smaller inputs)

Recursion in computer science is a method of solving a problem where the solution depends on solutions to smaller instances of the same problem. Such problems can generally be solved by iteration, but this needs to identify and index the smaller instances at programming time. At the opposite, recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

Object-oriented design is the process of planning a system of interacting objects for the purpose of solving a software problem. It is one approach to software design.

Structured analysis method for analyzing and converting business requirements into specifications

In software engineering, structured analysis (SA) and structured design (SD) are methods for analyzing business requirements and developing specifications for converting practices into computer programs, hardware configurations, and related manual procedures.

Diagrammatic reasoning reasoning by the mean of visual representations

Diagrammatic reasoning is reasoning by means of visual representations. The study of diagrammatic reasoning is about the understanding of concepts and ideas, visualized with the use of diagrams and imagery instead of by linguistic or algebraic means.

Rudolf Wille German Mathematician

Rudolf Wille was a German mathematician and was professor of General Algebra from 1970 to 2003 at Technische Universität Darmstadt. His most celebrated work is the invention of formal concept analysis, an unsupervised machine learning technique that applies mathematical lattice theory to organize data based on objects and their shared attributes.

Core architecture data model

Core architecture data model (CADM) in enterprise architecture is a logical data model of information used to describe and build architectures.

KNIME, the Konstanz Information Miner, is a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface and use of JDBC allows assembly of nodes blending different data sources, including preprocessing, for modeling, data analysis and visualization without, or with only minimal, programming.

NetMiner

NetMiner is an application software for exploratory analysis and visualization of large network data based on SNA(Social Network Analysis). It can be used for general research and teaching in social networks. This tool allows researchers to explore their network data visually and interactively, helps them to detect underlying patterns and structures of the network. It features data transformation, network analysis, statistics, visualization of network data, chart, and a programming language based on the Python script language. Also, it enables users to import unstructured text data(e.g. news, articles, tweets, etc.) and extract words and network from text data. In addition, NetMiner SNS Data Collector, an extension of NetMiner, can collect some social networking service(SNS) data with a few clicks.

In formal concept analysis (FCA) implications relate sets of properties. An implication AB holds in a given domain when every object having all attributes in A also has all attributes in B. Such implications characterize the concept hierarchy in an intuitive manner. Moreover, they are "well-behaved" with respect to algorithms. The knowledge acquisition method called attribute exploration uses implications.

References

  1. Boumedjout Lahcen and Leonard Kwuida. Lattice Miner: A Tool for Concept Lattice Construction and Exploration. In Supplementary Proceeding of International Conference on Formal concept analysis (ICFCA'10), 2010
  2. Bernhard Ganter and Rudolf Wille. Formal Concept Analysis: Mathematical Foundations. Springer-Verlag New York, Inc., 1999.
  3. Thomas Tilley. Tool support for fca. In ICFCA, pages 104–111, 2004.
  4. Pascal Hitzler and Henrik Schärfe. Conceptual Structures in Practice. studies in informatics series. CRC Press, 2009.
  5. Peter Becker and Joachim Hereth Correia. The toscanaj suite for implementing conceptual information systems. In Bernhard Ganter and Gerd Stumme, editors, Formal Concept Analysis, volume 3626 of Lecture Notes in Computer Science, pages 324–348. Springer Berlin / Heidelberg, July 2005.
  6. Petko Valtchev, David Grosser, Cyril Roume, and Mohamed Rouane Hacene. Galicia: An open platform for lattices. In In Using Conceptual Structures: Contributions to the 11th Intl. Conference on Conceptual Structures (ICCS03, pages 241–254. Shaker Verlag, Herzogenrath 2003.
  7. Concept explorer. http://conexp.sourceforge.net/license.html.
  8. Laszlo Szathmary and Amedeo Napoli. Coron: A framework for levelwise itemset mining algorithms. In Supplementary Proceedings of the Third International Conf. on Formal Concept Analysis (ICFCA’05), Lens, pages 110–113, 2005.
  9. Geneviève Roberge. Visualisation des résultats de la fouille des données dans les treillis des concepts. Master’s thesis, Université du Québec en Outaouais, 2007.
  10. Bernhard Ganter. Two basic algorithms in concept analysis. Preprint 831, Technische Hochschule Darmstadt, June 1984.
  11. Rokia Missaoui, L´eonard Kwuida, Mohamed Quafafou, and Jean Vaillancourt. Algebraic operators for querying pattern bases. CoRR, abs/0902.4042, 2009. Also published in the Supplementary proceedings of ICFCA’2009, pp. 117, Darmstadt, Germany, May 2009.