Data-flow diagram

Last updated
Data flow diagram with data storage, data flows, function and interface Data-flow-diagram-example.svg
Data flow diagram with data storage, data flows, function and interface

A data-flow diagram is a way of representing a flow of data through a process or a system (usually an information system). The DFD also provides information about the outputs and inputs of each entity and the process itself. A data-flow diagram has no control flowthere are no decision rules and no loops. Specific operations based on the data can be represented by a flowchart. [1]


There are several notations for displaying data-flow diagrams. The notation presented above was described in 1979 by Tom DeMarco as part of structured analysis.

For each data flow, at least one of the endpoints (source and / or destination) must exist in a process. The refined representation of a process can be done in another data-flow diagram, which subdivides this process into sub-processes.

The data-flow diagram is a tool that is part of structured analysis and data modeling. When using UML, the activity diagram typically takes over the role of the data-flow diagram. A special form of data-flow plan is a site-oriented data-flow plan.

Data-flow diagrams can be regarded as inverted Petri nets, because places in such networks correspond to the semantics of data memories. Analogously, the semantics of transitions from Petri nets and data flows and functions from data-flow diagrams should be considered equivalent.


The DFD notation draws on graph theory, originally used in operational research to model workflow in organizations. DFD originated from the activity diagram used in the structured analysis and design technique methodology at the end of the 1970s. DFD popularizers include Edward Yourdon, Larry Constantine, Tom DeMarco, Chris Gane and Trish Sarson. [2]

Data-flow diagrams (DFD) quickly became a popular way to visualize the major steps and data involved in software-system processes. DFDs were usually used to show data flow in a computer system, although they could in theory be applied to business process modeling. DFDs were useful to document the major data flows or to explore a new high-level design in terms of data flow. [3]

DFD components

Data flow diagram - Yourdon/DeMarco notation Data-flow-diagram-notation.svg
Data flow diagram - Yourdon/DeMarco notation

DFD consists of processes, flows, warehouses, and terminators. There are several ways to view these DFD components. [4]


The process (function, transformation) is part of a system that transforms inputs to outputs. The symbol of a process is a circle, an oval, a rectangle or a rectangle with rounded corners (according to the type of notation). The process is named in one word, a short sentence, or a phrase that is clearly to express its essence. [2]

Data flow

Data flow (flow, dataflow) shows the transfer of information (sometimes also material) from one part of the system to another. The symbol of the flow is the arrow. The flow should have a name that determines what information (or what material) is being moved. Exceptions are flows where it is clear what information is transferred through the entities that are linked to these flows. Material shifts are modeled in systems that are not merely informative. Flow should only transmit one type of information (material). The arrow shows the flow direction (it can also be bi-directional if the information to/from the entity is logically dependent - e.g. question and answer). Flows link processes, warehouses and terminators. [2]


The warehouse (datastore, data store, file, database) is used to store data for later use. The symbol of the store is two horizontal lines, the other way of view is shown in the DFD Notation. The name of the warehouse is a plural noun (e.g. orders) - it derives from the input and output streams of the warehouse. The warehouse does not have to be just a data file but can also be, for example, a folder with documents, a filing cabinet, or a set of optical discs. Therefore, viewing the warehouse in a DFD is independent of implementation. The flow from the warehouse usually represents reading of the data stored in the warehouse, and the flow to the warehouse usually expresses data entry or updating (sometimes also deleting data). The warehouse is represented by two parallel lines between which the memory name is located (it can be modeled as a UML buffer node). [2]


The Terminator is an external entity that communicates with the system and stands outside of the system. It can be, for example, various organizations (eg a bank), groups of people (e.g. customers), authorities (e.g. a tax office) or a department (e.g. a human-resources department) of the same organization, which does not belong to the model system. The terminator may be another system with which the modeled system communicates. [2]

Rules for creating DFD

Entity names should be comprehensible without further comments. DFD is a system created by analysts based on interviews with system users. It is determined for system developers, on one hand, project contractor on the other, so the entity names should be adapted for model domain or amateur users or professionals. Entity names should be general (independent, e.g. specific individuals carrying out the activity), but should clearly specify the entity. Processes should be numbered for easier mapping and referral to specific processes. The numbering is random, however, it is necessary to maintain consistency across all DFD levels (see DFD Hierarchy). DFD should be clear, as the maximum number of processes in one DFD is recommended to be from 6 to 9, minimum is 3 processes in one DFD. [1] [2] The exception is the so-called contextual diagram where the only process symbolizes the model system and all terminators with which the system communicates.

DFD consistency

DFD must be consistent with other models of the system - entity relationship diagram, state-transition diagram, data dictionary, and process specification models. Each process must have its name, inputs and outputs. Each flow should have its name (exception see Flow). Each Data store must have input and output flow. Input and output flows do not have to be displayed in one DFD - but they must exist in another DFD describing the same system. An exception is warehouse standing outside the system (external storage) with which the system communicates. [2]

DFD hierarchy

To make the DFD more transparent (i.e. not too many processes), multi-level DFDs can be created. DFDs that are at a higher level are less detailed (aggregate more detailed DFD at lower levels). The contextual DFD is the highest in the hierarchy (see DFD Creation Rules). The so-called zero level is followed by DFD 0, starting with process numbering (e.g., process 1, process 2). In the next, the so-called first level - DFD 1 - the numbering continues. E.g. process 1 is divided into the first three levels of the DFD, which are numbered 1.1, 1.2 and 1.3. Similarly, processes in the second level (DFD 2) are numbered eg 2.1.1, 2.1.2, 2.1.3 and 2.1.4. The number of levels depends on the size of the model system. DFD 0 processes may not have the same number of decomposition levels. DFD 0 contains the most important (aggregated) system functions. The lowest level should include processes that make it possible to create a process specification for roughly one A4 page. If the mini-specification should be longer, it is appropriate to create an additional level for the process where it will be decomposed into multiple processes. For a clear overview of the entire DFD hierarchy, a vertical (cross-sectional) diagram can be created. The warehouse is displayed at the highest level where it is first used and at every lower level as well. [2]

See also

Related Research Articles

<span class="mw-page-title-main">Data model</span> Model that organizes elements of data and how they relate to one another and to real-world entities.

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.

Software design is the process by which an agent creates a specification of a software artifact intended to accomplish goals, using a set of primitive components and subject to constraints. Software design may refer to either "all the activity involved in conceptualizing, framing, implementing, commissioning, and ultimately modifying complex systems" or "the activity following requirements specification and before programming, as ... [in] a stylized software engineering process."

A modeling language is any artificial language that can be used to express information or knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the structure Programing language.

Structured Systems Analysis and Design Method (SSADM) is a systems approach to the analysis and design of information systems. SSADM was produced for the Central Computer and Telecommunications Agency, a UK government office concerned with the use of technology in government, from 1980 onwards.

<span class="mw-page-title-main">Entity–relationship model</span> Model or diagram describing interrelated things

An entity–relationship model describes interrelated things of interest in a specific domain of knowledge. A basic ER model is composed of entity types and specifies relationships that can exist between entities.

The Shlaer–Mellor method, also known as Object-Oriented Systems Analysis (OOSA) or Object-Oriented Analysis (OOA) is an object-oriented software development methodology introduced by Sally Shlaer and Stephen Mellor in 1988. The method makes the documented analysis so precise that it is possible to implement the analysis model directly by translation to the target architecture, rather than by elaborating model changes through a series of more platform-specific models. In the new millennium the Shlaer–Mellor method has migrated to the UML notation, becoming Executable UML.

<span class="mw-page-title-main">Structured analysis and design technique</span>

Structured analysis and design technique (SADT) is a systems engineering and software engineering methodology for describing systems as a hierarchy of functions. SADT is a structured analysis modelling language, which uses two types of diagrams: activity models and data models. It was developed in the late 1960s by Douglas T. Ross, and was formalized and published as IDEF0 in 1981.

<span class="mw-page-title-main">Business Process Model and Notation</span> Graphical representation for specifying business processes

Business Process Model and Notation (BPMN) is a graphical representation for specifying business processes in a business process model.

Glossary of Unified Modeling Language (UML) terms provides a compilation of terminology used in all versions of UML, along with their definitions. Any notable distinctions that may exist between versions are noted with the individual entry it applies to.

Object-oriented design (OOD) is the process of planning a system of interacting objects for the purpose of solving a software problem. It is one approach to software design.

The Toolkit for Conceptual Modeling (TCM) is a collection of software tools to present specifications of software systems in the form of diagrams, tables, trees, and the like. TCM offers editors for techniques used in Structured Analysis as well as editors for object-oriented (UML) techniques. For some of the behavior specification techniques, an interface to model checkers is offered. More in particular, TCM contains the following editors.

Process Specification is a generic term for the specification of a process. It is not unique to business activity, but can be applied to any organizational activity.

<span class="mw-page-title-main">Structured analysis</span>

In software engineering, structured analysis (SA) and structured design (SD) are methods for analyzing business requirements and developing specifications for converting practices into computer programs, hardware configurations, and related manual procedures.

<span class="mw-page-title-main">Information flow diagram</span> Business process model

An information flow diagram (IFD) is a diagram that shows how information is communicated from a source to a receiver or target, through some medium. The medium acts as a bridge, a means of transmitting the information. Examples of media include word of mouth, radio, email, etc. The concept of IFD was initially used in radio transmission. The diagrammed system may also include feedback, a reply or response to the signal that was given out. The return paths can be two-way or bi-directional: information can flow back and forth.

<span class="mw-page-title-main">Event partitioning</span>

Event partitioning is an easy-to-apply systems analysis technique that helps the analyst organize requirements for large systems into a collection of smaller, simpler, minimally-connected, easier-to-understand "mini systems" / use cases.

<span class="mw-page-title-main">Function model</span>

In systems engineering, software engineering, and computer science, a function model or functional model is a structured representation of the functions within the modeled system or subject area.

<span class="mw-page-title-main">Functional flow block diagram</span>

A functional flow block diagram (FFBD) is a multi-tier, time-sequenced, step-by-step flow diagram of a system’s functional flow. The term "functional" in this context is different from its use in functional programming or in mathematics, where pairing "functional" with "flow" would be ambiguous. Here, "functional flow" pertains to the sequencing of operations, with "flow" arrows expressing dependence on the success of prior operations. FFBDs may also express input and output data dependencies between functional blocks, as shown in figures below, but FFBDs primarily focus on sequencing.

Enterprise engineering is the body of knowledge, principles, and practices used to design all or part of an enterprise. An enterprise is a complex socio-technical system that comprises people, information, and technology that interact with each other and their environment in support of a common mission. One definition is: "an enterprise life-cycle oriented discipline for the identification, design, and implementation of enterprises and their continuous evolution", supported by enterprise modelling. The discipline examines each aspect of the enterprise, including business processes, information flows, material flows, and organizational structure. Enterprise engineering may focus on the design of the enterprise as a whole, or on the design and integration of certain business components.

Process map is a global-system process model that is used to outline the processes that make up the business system and how they interact with each other. Process map shows the processes as objects, which means it is a static and non-algorithmic view of the processes. It should be differentiated from a detailed process model, which shows a dynamic and algorithmic view of the processes, usually known as a process flow diagram. There are different notation standards that can be used for modelling process maps, but the most notable ones are TOGAF Event Diagram, Eriksson-Penker notation, and ARIS Value Added Chain.

<span class="mw-page-title-main">Interaction Flow Modeling Language</span>

The Interaction Flow Modeling Language (IFML) is a standardized modeling language in the field of software engineering. IFML includes a set of graphic notations to create visual models of user interactions and front-end behavior in software systems.


  1. 1 2 Bruza, P. D.; van der Weide, Th. P. (1990-11-01). "Assessing the quality of hypertext views". ACM SIGIR Forum. 24 (3): 6–25. doi:10.1145/101306.101307. ISSN   0163-5840. S2CID   8507530.
  2. 1 2 3 4 5 6 7 8 Yourdon, Edward (1975). "Structured programming and structured design as art forms". Proceedings of the May 19–22, 1975, National Computer Conference and Exposition on - AFIPS '75: 277. doi: 10.1145/1499949.1499997 . S2CID   36802486.
  3. Larman, Craig (2012). Applying UML and patterns : an introduction to object-oriented analysis and design and iterative development (3rd ed.). New Delhi: Pearson. ISBN   978-8177589795. OCLC   816555477.
  4. Řepa, Václav (1999). Analýza a návrh informačních systémů (Vyd. 1 ed.). Praha: Ekopress. ISBN   978-8086119137. OCLC   43612982.