DAP (software)

Last updated
DAP
Developer(s) GNU Project
Initial release1998;24 years ago (1998) [1]
Stable release
3.10 / April 16, 2014;8 years ago (2014-04-16)
Repository
Written in C
Operating system Cross-platform
Type Statistical Analysis
License GNU General Public License
Website www.gnu.org/software/dap

Dap is a statistics and graphics program based on the C programming language that performs data management, analysis, and C-style graphical visualization tasks without requiring complex syntax. Its name is an acronym for Data Analysis and Presentation.

Contents

Dap was written to be a free replacement for SAS, but users are assumed to have a basic familiarity with the C programming language in order to permit greater flexibility.

It has been designed to be used on large data sets and is primarily used in statistical consulting practices.

However, even with its clear benefits, Dap hasn't been updated since 2014 and hasn't seen widespread use when compared to other statistical analysis programs.

Features

Dap is a command line driven program. Below are various features that DAP can perform.

DAP can compute means and percentiles, correlation, & ANOVA from data sets. This includes Unbalanced as well as Crossed, Nested ANOVA. It can also be used to create scatterplots, line graphs and histograms of data. This can include split plots, treatment combinations, as well as latin squares.

DAP can perform linear regression and can utilize regressions to build linear models. In addition to linear regression, DAP can also perform logistic regression analysis as well. There's a variety of other analysis that DAP can do as well including building loglinear models as well as Logit models for linear-by-linear association.

In terms of models, DAP can create mixed balanced and unbalanced models as well as random unbalanced models.

It has been designed so as to cope with very large data sets; even when the size of the data exceeds the size of the computer's memory due to the fact that the program processes files one line at a time rather than reading entire files into memory.

Applications

Industry Uses

Related Research Articles

Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures used to analyze the differences among means. ANOVA was developed by the statistician Ronald Fisher. ANOVA is based on the law of total variance, where the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, and therefore generalizes the t-test beyond two means. In other words, the ANOVA is used to test the difference between two or more means.

An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor, in honour of Ronald Fisher. Fisher initially developed the statistic as the variance ratio in the 1920s.

Analysis of covariance (ANCOVA) is a general linear model which blends ANOVA and regression. ANCOVA evaluates whether the means of a dependent variable (DV) are equal across levels of a categorical independent variable (IV) often called a treatment, while statistically controlling for the effects of other continuous variables that are not of primary interest, known as covariates (CV) or nuisance variables. Mathematically, ANCOVA decomposes the variance in the DV into variance explained by the CV(s), variance explained by the categorical IV, and residual variance. Intuitively, ANCOVA can be thought of as 'adjusting' the DV by the group means of the CV(s).

Linear trend estimation is a statistical technique to aid interpretation of data. When a series of measurements of a process are treated as, for example, a sequences or time series, trend estimation can be used to make and justify statements about tendencies in the data, by relating the measurements to the times at which they occurred. This model can then be used to describe the behaviour of the observed data, without explaining it.

Georg C. F. Greve

Georg C. F. Greve is a software developer, physicist, author and currently co-founder and president at Vereign. He has been working on technology politics since he founded the Free Software Foundation Europe (FSFE) in 2001.

In robust statistics, robust regression is a form of regression analysis designed to overcome some limitations of traditional parametric and non-parametric methods. Regression analysis seeks to find the relationship between one or more independent variables and a dependent variable. Certain widely used methods of regression, such as ordinary least squares, have favourable properties if their underlying assumptions are true, but can give misleading results if those assumptions are not true; thus ordinary least squares is said to be not robust to violations of its assumptions. Robust regression methods are designed to be not overly affected by violations of assumptions by the underlying data-generating process.

EViews

EViews is a statistical package for Windows, used mainly for time-series oriented econometric analysis. It is developed by Quantitative Micro Software (QMS), now a part of IHS. Version 1.0 was released in March 1994, and replaced MicroTSP. The TSP software and programming language had been originally developed by Robert Hall in 1965. The current version of EViews is 12, released in November 2020.

MedCalc

MedCalc is a statistical software package designed for the biomedical sciences. It has an integrated spreadsheet for data input and can import files in several formats.

RATS, an abbreviation of Regression Analysis of Time Series, is a statistical package for time series analysis and econometrics. RATS is developed and sold by Estima, Inc., located in Evanston, IL.

The following tables compare general and technical information for a number of statistical analysis packages.

Segmented regression, also known as piecewise regression or broken-stick regression, is a method in regression analysis in which the independent variable is partitioned into intervals and a separate line segment is fit to each interval. Segmented regression analysis can also be performed on multivariate data by partitioning the various independent variables. Segmented regression is useful when the independent variables, clustered into different groups, exhibit different relationships between the variables in these regions. The boundaries between the segments are breakpoints.

PSPP is a free software application for analysis of sampled data, intended as a free alternative for IBM SPSS Statistics. It has a graphical user interface and conventional command-line interface. It is written in C and uses GNU Scientific Library for its mathematical routines. The name has "no official acronymic expansion".

GNU General Public License Series of free software licenses

The GNU General Public License is a series of widely used free software licenses that guarantee end users the four freedoms to run, study, share, and modify the software. The license was the first copyleft for general use and was originally written by the founder of the Free Software Foundation (FSF), Richard Stallman, for the GNU Project. The license grants the recipients of a computer program the rights of the Free Software Definition. These GPL series are all copyleft licenses, which means that any derivative work must be distributed under the same or equivalent license terms. It is more restrictive than the Lesser General Public License and even further distinct from the more widely used permissive software licenses BSD, MIT, and Apache.

XLfit is a Microsoft Excel-based plug-in that performs regression analysis, curve fitting, and statistical analysis. XLfit generates 2D and 3D graphs and analyses data sets produced by any type of research. XLfit can make linear and non-linear curve fits, smoothing, statistics, weighting, and error bars.

Free statistical software is a practical alternative to commercial packages. Many of the free to use programs aim to be similar in function to commercial packages, in that they are general statistical packages that perform a variety of statistical analyses. Many other free to use programs were designed specifically for particular functions, like factor analysis, power analysis in sample size calculations, classification and regression trees, or analysis of missing data.

One application of multilevel modeling (MLM) is the analysis of repeated measures data. Multilevel modeling for repeated measures data is most often discussed in the context of modeling change over time ; however, it may also be used for repeated measures data in which time is not a factor.

SegReg

In statistics and data analysis, the application software SegReg is a free and user-friendly tool for linear segmented regression analysis to determine the breakpoint where the relation between the dependent variable and the independent variable changes abruptly.

JASP

JASP is a free and open-source program for statistical analysis supported by the University of Amsterdam. It is designed to be easy to use, and familiar to users of SPSS. It offers standard analysis procedures in both their classical and Bayesian form. JASP generally produces APA style results tables and plots to ease publication. It promotes open science by integration with the Open Science Framework and reproducibility by integrating the analysis settings into the results. The development of JASP is financially supported by several universities and research funds.

References

Sources

See also