Application discovery and understanding

Application discovery and understanding (ADU) is the process of automatically analyzing the artifacts of a software application and determining the metadata structures associated with it, in the form of lists of data elements and business rules. The relationships discovered between the application and a central metadata registry are then stored in the metadata registry itself.
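
As a rough illustration of what "analyzing artifacts" can mean, the following Python sketch scans one kind of artifact, a fragment of SQL DDL, and lists the data elements it declares. The DDL text, the regular expression, and the function name `discover_data_elements` are all assumptions made for this example; real ADU tools typically use full parsers for each artifact type, and this sketch does not handle column types that contain commas.

```python
import re

# Hypothetical artifact: a fragment of SQL DDL from the application under analysis.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER,
    birth_date DATE
);
CREATE TABLE invoice (
    invoice_id INTEGER,
    customer_id INTEGER
);
"""

# Matches "CREATE TABLE name ( ... );" and captures the name and the column list.
TABLE_RE = re.compile(
    r"CREATE\s+TABLE\s+(\w+)\s*\((.*?)\)\s*;",
    re.IGNORECASE | re.DOTALL,
)


def discover_data_elements(ddl: str) -> list[tuple[str, str, str]]:
    """Return (table, column, type) triples found in the DDL text."""
    elements = []
    for table, body in TABLE_RE.findall(ddl):
        # Simplification: columns are assumed to be separated by plain commas.
        for column_def in body.split(","):
            parts = column_def.split()
            if len(parts) >= 2:
                elements.append((table, parts[0], parts[1]))
    return elements


for table, column, sql_type in discover_data_elements(DDL):
    print(f"{table}.{column}: {sql_type}")
```

The resulting list of triples is one possible shape for the "list of data elements" mentioned above; business rules would need a different kind of analysis and are not covered by this sketch.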


Business benefits of ADU

On average, developers spend only 5% of their time writing new code, 20% modifying legacy code, and up to 60% understanding existing code. [1] ADU can therefore save a great deal of time and expense for organizations involved in the change control and impact analysis of complex computer systems. Impact analysis tells managers what effect changing or removing specific structures might have on enterprise-wide systems. The process was widely used in preparing and validating Y2K changes in software. [2]
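
Impact analysis of this kind can be pictured as a walk over a dependency graph. The sketch below is a minimal Python example under stated assumptions: the `USES` mapping, the artifact names, and the function `impacted_by` are hypothetical, and the relationships are taken as already extracted by some earlier scanning step.

```python
from collections import deque

# Hypothetical "uses" relationships: each artifact maps to the structures it relies on.
USES = {
    "billing_report": {"invoice_view"},
    "invoice_view": {"invoice.customer_id", "customer.customer_id"},
    "age_check_job": {"customer.birth_date"},
    "mailing_export": {"billing_report"},
}


def impacted_by(changed: str, uses: dict[str, set[str]]) -> set[str]:
    """Return every artifact that directly or indirectly uses `changed`."""
    # Invert the mapping so we can look up "who uses this structure?".
    used_by: dict[str, set[str]] = {}
    for artifact, targets in uses.items():
        for target in targets:
            used_by.setdefault(target, set()).add(artifact)

    # Breadth-first traversal over the inverted graph.
    impacted: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


print(sorted(impacted_by("customer.customer_id", USES)))
```

In this toy graph, changing `customer.customer_id` reaches `invoice_view`, then `billing_report`, then `mailing_export`, while `age_check_job` is unaffected. Tracking visited artifacts in `impacted` keeps the traversal from looping if the dependencies contain a cycle.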

ADU is also part of the process that enables development teams to learn and improve, because it provides information on the context and current state of the application. [3]

The process of gaining application understanding is greatly accelerated when the extracted metadata is displayed using interactive diagrams. [4]

When developers can browse the metadata and drill down into relevant details on demand, they come to understand the application in a way that is natural to them. [5] Significant reductions in the effort and time required to perform full impact analysis have been reported when ADU tools are implemented. [6] ADU tools are especially beneficial to newly hired developers, who become productive much sooner and require less assistance from existing staff when such tools are in place. [4]

ADU process

ADU software is usually written to scan the following application structures:

The output of the ADU process frequently includes:

Note that a registered data element is any data element that already exists within a metadata registry.
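
The distinction between registered and unregistered data elements can be sketched as a lookup against the registry. In the Python example below, `REGISTRY`, its entries, and the function `classify` are hypothetical, and matching on lower-cased names is an assumption made to keep the example short; an actual registry may match on definitions, synonyms, or identifiers instead.

```python
# Hypothetical metadata registry: element name -> registered definition.
REGISTRY = {
    "customer_id": "Customer Identifier",
    "birth_date": "Person Birth Date",
}


def classify(discovered: list[str], registry: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split discovered element names into (registered, unregistered) lists."""
    # Assumption: names are compared case-insensitively and duplicates are ignored.
    names = {name.lower() for name in discovered}
    registered = sorted(name for name in names if name in registry)
    unregistered = sorted(name for name in names if name not in registry)
    return registered, unregistered


registered, unregistered = classify(["customer_id", "invoice_id", "BIRTH_DATE"], REGISTRY)
print("registered:", registered)
print("unregistered:", unregistered)
```

Elements in the second list are the ones that would need to be reviewed before they could be added to the registry.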


References

  1. Xia, Xin; Bao, Lingfeng; Lo, David; Xing, Zhengchang; Hassan, Ahmed E. "Measuring program comprehension: A large-scale field study with professionals".
  2. Bohner (1996). "Impact analysis in the software change process: A year 2000 perspective". Proceedings of International Conference on Software Maintenance ICSM-96. pp. 42–51. doi:10.1109/ICSM.1996.564987. ISBN 0-8186-7677-9. S2CID 41115735.
  3. van Solingen; Berghout; Kusters; Trienekens (2000). "From process improvement to people improvement: enabling learning in software development". Information and Software Technology. 42 (14): 965–971. doi:10.1016/S0950-5849(00)00148-8.
  4. Lanza, Michele; Ducasse, Stéphane (2002). "Understanding Software Evolution using a Combination of Software Visualization and Software Metrics". In Proceedings of LMO 2002 (Langages et Modèles à Objets): 135–149.
  5. Storey, M.-A.D.; Wong, K.; Fracchia, F.D.; Muller, H.A. (1997). "On integrating visualization techniques for effective software exploration". Proceedings of VIZ '97: Visualization Conference, Information Visualization Symposium and Parallel Rendering Symposium. pp. 38–45. doi:10.1109/INFVIS.1997.636784. ISBN 0-8186-8189-6. S2CID 3091024.
  6. Canfora, G.; Cerulo, L. (2005). "Impact Analysis by Mining Software and Change Request Repositories". 11th IEEE International Software Metrics Symposium (METRICS'05). p. 29. doi:10.1109/METRICS.2005.28. ISBN 0-7695-2371-4. S2CID 16199730.