This article needs attention from an expert in software. The specific problem is: it appears to misrepresent the history of software analytics.(December 2014) |
Software analytics is the analytics specific to the domain of software systems taking into account source code, static and dynamic characteristics (e.g., software metrics) as well as related processes of their development and evolution. It aims at describing, monitoring, predicting, and improving the efficiency and effectiveness of software engineering throughout the software lifecycle, in particular during software development and software maintenance. The data collection is typically done by mining software repositories, but can also be achieved by collecting user actions or production data.
Software analytics aims at supporting decisions and generating insights, i.e., findings, conclusions, and evaluations about software systems and their implementation, composition, behavior, quality, evolution as well as about the activities of various stakeholders of these processes.
Methods, techniques, and tools of software analytics typically rely on gathering, measuring, analyzing, and visualizing information found in the manifold data sources stored in software development environments and ecosystems. Software systems are well suited for applying analytics because, on the one hand, mostly formalized and precise data is available and, on the other hand, software systems are extremely difficult to manage ---in a nutshell: "software projects are highly measurable, but often unpredictable." [2]
Core data sources include source code, "check-ins, work items, bug reports and test executions [...] recorded in software repositories such as CVS, Subversion, GIT, and Bugzilla." [4] Telemetry data as well as execution traces or logs can also be taken into account.
Automated analysis, massive data, and systematic reasoning support decision-making at almost all levels. In general, key technologies employed by software analytics include analytical technologies such as machine learning, data mining, statistics, pattern recognition, information visualization as well as large-scale data computing & processing. For example, software analytics tools allow users to map derived analysis results by means of software maps, which support interactively exploring system artifacts and correlated software metrics. There are also software analytics tools using analytical technologies on top of software quality models in agile software development companies, which support assessing software qualities (e.g., reliability), and deriving actions for their improvement. [5]
This article needs attention from an expert in Software. The specific problem is: it misrepresents the history of software analytics, strengthening a single researcher group that claims to have coined the expression software analytics.(August 2017) |
In May 2009, software analytics was first coined and proposed when Dongmei Zhang founded the Software Analytics Group (SA) at Microsoft Research Asia (MSRA). The term has become well known in the software engineering research community after a series of tutorials and talks on software analytics were given by Zhang and her colleagues, in collaboration with Tao Xie from North Carolina State University, at software engineering conferences including a tutorial at the IEEE/ACM International Conference on Automated Software Engineering (ASE 2011), [6] a talk at the International Workshop on Machine Learning Technologies in Software Engineering (MALETS 2011), [7] a tutorial and a keynote talk given by Zhang at the IEEE-CS Conference on Software Engineering Education and Training, [8] [9] a tutorial at the International Conference on Software Engineering - Software Engineering in Practice Track, [10] and a keynote talk given by Zhang at the Working Conference on Mining Software Repositories. [11]
In November 2010, Software Development Analytics (Software Analytics with a focus on Software Development) was proposed by Thomas Zimmermann and his colleagues at the Empirical Software Engineering Group (ESE) at Microsoft Research Redmond in their FoSER 2010 paper. [12] A goldfish bowl panel on software development analytics was organized by Zimmermann and Tim Menzies from West Virginia University at the International Conference on Software Engineering, Software Engineering in Practice Track. [13]
Code review is a software quality assurance activity in which one or more people examine the source code of a computer program, either after implementation or during the development process. The persons performing the checking, excluding the author, are called "reviewers". At least one reviewer must not be the code's author.
Requirements engineering (RE) is the process of defining, documenting, and maintaining requirements in the engineering design process. It is a common role in systems engineering and software engineering.
Software visualization or software visualisation refers to the visualization of information of and related to software systems—either the architecture of its source code or metrics of their runtime behavior—and their development process by means of static, interactive or animated 2-D or 3-D visual representations of their structure, execution, behavior, and evolution.
A software regression is a type of software bug where a feature that has worked before stops working. This may happen after changes are applied to the software's source code, including the addition of new features and bug fixes. They may also be introduced by changes to the environment in which the software is running, such as system upgrades, system patching or a change to daylight saving time. A software performance regression is a situation where the software still functions correctly, but performs more slowly or uses more memory or resources than before. Various types of software regressions have been identified in practice, including the following:
Requirements traceability is a sub-discipline of requirements management within software development and systems engineering. Traceability as a general term is defined by the IEEE Systems and Software Engineering Vocabulary as (1) the degree to which a relationship can be established between two or more products of the development process, especially products having a predecessor-successor or primary-subordinate relationship to one another; (2) the identification and documentation of derivation paths (upward) and allocation or flowdown paths (downward) of work products in the work product hierarchy; (3) the degree to which each element in a software development product establishes its reason for existing; and (4) discernible association among two or more logical entities, such as requirements, system elements, verifications, or tasks.
Search-based software engineering (SBSE) applies metaheuristic search techniques such as genetic algorithms, simulated annealing and tabu search to software engineering problems. Many activities in software engineering can be stated as optimization problems. Optimization techniques of operations research such as linear programming or dynamic programming are often impractical for large scale software engineering problems because of their computational complexity or their assumptions on the problem structure. Researchers and practitioners use metaheuristic search techniques, which impose little assumptions on the problem structure, to find near-optimal or "good-enough" solutions.
Zhang Hongjiang is a Chinese computer scientist and executive. He was CEO of Kingsoft, managing director of Microsoft Advanced Technology Center (ATC) and chief technology officer (CTO) of Microsoft China Research and Development Group (CRD). Hongjiang is currently Chairman of BAAI. In 2022, he was elected to the National Academy of Engineering for his technical contributions and leadership in the area of multimedia computing.
Metamorphic testing (MT) is a property-based software testing technique, which can be an effective approach for addressing the test oracle problem and test case generation problem. The test oracle problem is the difficulty of determining the expected outcomes of selected test cases or to determine whether the actual outputs agree with the expected outcomes.
Within software engineering, the mining software repositories (MSR) field analyzes the rich data available in software repositories, such as version control repositories, mailing list archives, bug tracking systems, issue tracking systems, etc. to uncover interesting and actionable information about software systems, projects and software engineering.
Simcenter Amesim is a commercial simulation software for the modeling and analysis of multi-domain systems. It is part of systems engineering domain and falls into the mechatronic engineering field.
A software map represents static, dynamic, and evolutionary information of software systems and their software development processes by means of 2D or 3D map-oriented information visualization. It constitutes a fundamental concept and tool in software visualization, software analytics, and software diagnosis. Its primary applications include risk analysis for and monitoring of code quality, team activity, or software development progress and, generally, improving effectiveness of software engineering with respect to all related artifacts, processes, and stakeholders throughout the software engineering process and software maintenance.
Software diagnosis refers to concepts, techniques, and tools that allow for obtaining findings, conclusions, and evaluations about software systems and their implementation, composition, behaviour, and evolution. It serves as means to monitor, steer, observe and optimize software development, software maintenance, and software re-engineering in the sense of a business intelligence approach specific to software systems. It is generally based on the automatic extraction, analysis, and visualization of corresponding information sources of the software system. It can also be manually done and not automatic.
Software intelligence is insight into the inner workings and structural condition of software assets produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in information technology environments. Similarly to business intelligence (BI), software intelligence is produced by a set of software tools and techniques for the mining of data and the software's inner-structure. Results are automatically produced and feed a knowledge base containing technical documentation and blueprints of the innerworking of applications, and make it available to all to be used by business and software stakeholders to make informed decisions, measure the efficiency of software development organizations, communicate about the software health, prevent software catastrophes.
Automatic bug-fixing is the automatic repair of software bugs without the intervention of a human programmer. It is also commonly referred to as automatic patch generation, automatic bug repair, or automatic program repair. The typical goal of such techniques is to automatically generate correct patches to eliminate bugs in software programs without causing software regression.
Apache SINGA is an Apache top-level project for developing an open source machine learning library. It provides a flexible architecture for scalable distributed training, is extensible to run over a wide range of hardware, and has a focus on health-care applications.
Agile architecture means how enterprise architects, system architects and software architects apply architectural practice in agile software development. A number of commentators have identified a tension between traditional software architecture and agile methods along the axis of adaptation versus anticipation.
Wei Wang is a Chinese-born American computer scientist. She is the Leonard Kleinrock Chair Professor in Computer Science and Computational Medicine at University of California, Los Angeles and the director of the Scalable Analytics Institute (ScAi). Her research specializes in big data analytics and modeling, database systems, natural language processing, bioinformatics and computational biology, and computational medicine.
Ahmed E. Hassan is a professor at Queen's University in the Queen's School of Computing, where he leads the Software Analysis and Intelligence Lab (SAIL). He is a fellow of the ACM and IEEE. In 2023, he received the Mustafa Prize for his contributions to software engineering.
Xing Xie is a partner research manager at Microsoft Research Asia. As a computer scientist, his research has focused on data mining, social computing, and responsible AI. He has published more than 400 papers which have been cited more than 60,000 times. He has been on organizing committees or helped with the programs of over 70 conferences and workshops.
Kelly Blincoe is an American academic based in New Zealand. She is an associate professor at the University of Auckland and a Rutherford Discovery Fellow for Royal Society Te Apārangi. Her research primarily focuses on software engineering, with a particular interest in software project management, developer productivity, and collaboration in software development teams. She is an senior editor for Journal of Systems and Software and an editor for IEEE Transactions on Software Engineering, and Empirical Software Engineering.