Data Transformation Services

Last updated

Data Transformation Services, or DTS, is a set of objects and utilities to allow the automation of extract, transform and load operations to or from a database. The objects are DTS packages and their components, and the utilities are called DTS tools. DTS was included with earlier versions of Microsoft SQL Server, and was almost always used with SQL Server databases, although it could be used independently with other databases.

Contents

DTS allows data to be transformed and loaded from heterogeneous sources using OLE DB, ODBC, or text-only files, into any supported database. DTS can also allow automation of data import or transformation on a scheduled basis, and can perform additional functions such as FTPing files and executing external programs. In addition, DTS provides an alternative method of version control and backup for packages when used in conjunction with a version control system, such as Microsoft Visual SourceSafe.

Here a DTS package is edited with DTS Designer in Windows XP. DTS Designer screenshot.PNG
Here a DTS package is edited with DTS Designer in Windows XP.

DTS has been superseded by SQL Server Integration Services in later releases of Microsoft SQL Server though there was some backwards compatibility and ability to run DTS packages in the new SSIS for a time.

History

In SQL Server versions 6.5 and earlier, database administrators (DBAs) used SQL Server Transfer Manager and Bulk Copy Program, included with SQL Server, to transfer data. These tools had significant shortcomings, and many[ quantify ] DBAs used third-party tools such as Pervasive Data Integrator to transfer data more flexibly and easily. With the release of SQL Server 7 in 1998, "Data Transformation Services" was packaged with it to replace all these tools. The concept, design, and implementation of the Data Transformation Services was led by Stewart P. MacLeod (SQL Server Development Group Program Manager), Vij Rajarajan (SQL Server Lead Developer), and Ted Hart (SQL Server Lead Developer). The goal was to make it easier to import, export, and transform heterogeneous data and simplify the creation of data warehouses from operational data sources.

SQL Server 2000 expanded DTS functionality in several ways. It introduced new types of tasks, including the ability to FTP files, move databases or database components, and add messages into Microsoft Message Queue. DTS packages can be saved as a Visual Basic file in SQL Server 2000, and this can be expanded to save into any COM-compliant language. Microsoft also integrated packages into Windows 2000 security and made DTS tools more user-friendly; tasks can accept input and output parameters.

DTS comes with all editions of SQL Server 7 and 2000, but was superseded by SQL Server Integration Services in the Microsoft SQL Server 2005 release in 2005.

DTS packages

The DTS package is the fundamental logical component of DTS; every DTS object is a child component of the package. Packages are used whenever one modifies data using DTS. All the metadata about the data transformation is contained within the package. Packages can be saved directly in a SQL Server, or can be saved in the Microsoft Repository or in COM files. SQL Server 2000 also allows a programmer to save packages in a Visual Basic or other language file (when stored to a VB file, the package is actually scripted—that is, a VB script is executed to dynamically create the package objects and its component objects).

A package can contain any number of connection objects, but does not have to contain any. These allow the package to read data from any OLE DB-compliant data source, and can be expanded to handle other sorts of data. The functionality of a package is organized into tasks and steps.

A DTS Task is a discrete set of functionalities executed as a single step in a DTS package. Each task defines a work item to be performed as part of the data movement and data transformation process or as a job to be executed.

Data Transformation Services supplies a number of tasks that are part of the DTS object model and that can be accessed graphically through the DTS Designer or accessed programmatically. These tasks, which can be configured individually, cover a wide variety of data copying, data transformation and notification situations. For example, the following types of tasks represent some actions that you can perform by using DTS: executing a single SQL statement, sending an email, and transferring a file with FTP.

A step within a DTS package describes the order in which tasks are run and the precedence constraints that describe what to do in the case damage or of failure. These steps can be executed sequentially or in parallel.

Packages can also contain global variables which can be used throughout the package. SQL Server 2000 allows input and output parameters for tasks, greatly expanding the usefulness of global variables. DTS packages can be edited, password protected, scheduled for execution, and retrieved by version.

DTS tools

DTS tools packaged with SQL Server include the DTS wizards, DTS Designer, and DTS Programming Interfaces.

DTS wizards

The DTS wizards can be used to perform simple or common DTS tasks. These include the Import/Export Wizard and the Copy of Database Wizard. They provide the simplest method of copying data between OLE DB data sources. There is a great deal of functionality that is not available by merely using a wizard. However, a package created with a wizard can be saved and later altered with one of the other DTS tools.

A Create Publishing Wizard is also available to schedule packages to run at certain times. This only works if SQL Server Agent is running; otherwise the package will be scheduled, but will not be executed.

DTS Designer

The DTS Designer is a graphical tool used to build complex DTS Packages with workflows and event-driven logic. DTS Designer can also be used to edit and customize DTS Packages created with the DTS wizard.

Each connection and task in DTS Designer is shown with a specific icon. These icons are joined with precedence constraints, which specify the order and requirements for tasks to be run. One task may run, for instance, only if another task succeeds (or fails). Other tasks may run concurrently.

The DTS Designer has been criticized for having unusual quirks and limitations, such as the inability to visually copy and paste multiple tasks at one time. Many of these shortcomings have been overcome in SQL Server Integration Services, DTS's successor.

DTS Query Designer

A graphical tool used to build queries in DTS.

DTS Run Utility

DTS Packages can be run from the command line using the DTSRUN Utility.
The utility is invoked using the following syntax:

dtsrun /S server_name[\instance_name]         { {/[~]U user_name [/[~]P password]} | /E }     ]     {             {/[~]N package_name }         | {/[~]G package_guid_string}         | {/[~]V package_version_guid_string}     }     [/[~]M package_password]     [/[~]F filename]     [/[~]R repository_database_name]     [/A global_variable_name:typeid=value]      [/L log_file_name]     [/W NT_event_log_completion_status]     [/Z] [/!X] [/!D] [/!Y] [/!C] ] 

When passing in parameters which are mapped to Global Variables, you are required to include the typeid. This is rather difficult to find on the Microsoft site. Below are the TypeIds used in passing in these values.

Typetypeid
Boolean11
Currency6
Date7
Decimal14
HRESULT25
Int22
Integer (1-byte)16
Integer (8-byte)20
Integer (small)2
LPWSTR31
Pointer26
Real (4-byte)4
Real (8-byte)5
String8
Unsigned int (1-byte)17
Unsigned int (2-byte)18
Unsigned int (4-byte)19
Unsigned int (1-byte)21
Unsigned int23

See also

Related Research Articles

Active Directory (AD) is a directory service developed by Microsoft for Windows domain networks. It is included in most Windows Server operating systems as a set of processes and services. Initially, Active Directory was used only for centralized domain management. However, Active Directory eventually became an umbrella title for a broad range of directory-based identity-related services.

Microsoft Access Database manager that is part of the Microsoft Office package

Microsoft Access is a database management system (DBMS) from Microsoft that combines the relational Access Database Engine (ACE) with a graphical user interface and software-development tools. It is a member of the Microsoft 365 suite of applications, included in the Professional and higher editions or sold separately.

Oracle Forms is a software product for creating screens that interact with an Oracle database. It has an IDE including an object navigator, property sheet and code editor that uses PL/SQL. It was originally developed to run server-side in character mode terminal sessions. It was ported to other platforms, including Windows, to function in a client–server environment. Later versions were ported to Java where it runs in a Java EE container and can integrate with Java and web services that can be launched from a URL. Recent versions provide a means to run the forms from a desktop, without requiring the a browser.

Microsoft Project Project management software

Microsoft Project is a project management software product, developed and sold by Microsoft. It is designed to assist a project manager in developing a schedule, assigning resources to tasks, tracking progress, managing the budget, and analyzing workloads.

The Access Database Engine is a database engine on which several Microsoft products have been built. The first version of Jet was developed in 1992, consisting of three modules which could be used to manipulate a database.

Microsoft Data Access Components Framework

Microsoft Data Access Components is a framework of interrelated Microsoft technologies that allows programmers a uniform and comprehensive way of developing applications that can access almost any data store. Its components include: ActiveX Data Objects (ADO), OLE DB, and Open Database Connectivity (ODBC). There have been several deprecated components as well, such as the Jet Database Engine, MSDASQL, and Remote Data Services (RDS). Some components have also become obsolete, such as the former Data Access Objects API and Remote Data Objects.

Uniface (programming language) Low-code development platform

Uniface is a low-code development and deployment platform for enterprise applications that can run in a large range of runtime environments, including mobile, mainframe, web, Service-oriented architecture (SOA), Windows, Java EE and .NET. Uniface is used to create mission-critical applications.

SQL Server Integration Services (SSIS) is a component of the Microsoft SQL Server database software that can be used to perform a broad range of data migration tasks.

SQL Server Reporting Services (SSRS) is a server-based report generating software system from Microsoft. It is part of a suite of Microsoft SQL Server services, including SSAS and SSIS.

COM Structured Storage is a technology developed by Microsoft as part of its Windows operating system for storing hierarchical data within a single file. Strictly speaking, the term structured storage refers to a set of COM interfaces that a conforming implementation must provide, and not to a specific implementation, nor to a specific file format. In addition to providing a hierarchical structure for data, structured storage may also provide a limited form of transactional support for data access. Microsoft provides an implementation that supports transactions, as well as one that does not.

Windows Vista contains a range of new technologies and features that are intended to help network administrators and power users better manage their systems. Notable changes include a complete replacement of both the Windows Setup and the Windows startup processes, completely rewritten deployment mechanisms, new diagnostic and health monitoring tools such as random access memory diagnostic program, support for per-application Remote Desktop sessions, a completely new Task Scheduler, and a range of new Group Policy settings covering many of the features new to Windows Vista. Subsystem for UNIX Applications, which provides a POSIX-compatible environment is also introduced.

In the field of software, SQL programming tools provide platforms for database administrators (DBAs) and application developers to perform daily tasks efficiently and accurately.

Microsoft Visual Studio Code editor and IDE

Microsoft Visual Studio is an integrated development environment (IDE) from Microsoft. It is used to develop computer programs, as well as websites, web apps, web services and mobile apps. Visual Studio uses Microsoft software development platforms such as Windows API, Windows Forms, Windows Presentation Foundation, Windows Store and Microsoft Silverlight. It can produce both native code and managed code.

Microsoft SQL Server is a relational database management system developed by Microsoft. As a database server, it is a software product with the primary function of storing and retrieving data as requested by other software applications—which may run either on the same computer or on another computer across a network. Microsoft markets at least a dozen different editions of Microsoft SQL Server, aimed at different audiences and for workloads ranging from small single-machine applications to large Internet-facing applications with many concurrent users.

Oracle SQL Developer

Oracle SQL Developer is an Integrated development environment (IDE) for working with SQL in Oracle databases. Oracle Corporation provides this product free; it uses the Java Development Kit.

The Softwell Maker is an environment development. Designer uses visual forms and reports (WYSIWYG), business rules, and visually representing other actions using flowcharts.

Apache Hive Database engine

Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data query and analysis. Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop. Traditional SQL queries must be implemented in the MapReduce Java API to execute SQL applications and queries over distributed data. Hive provides the necessary SQL abstraction to integrate SQL-like queries (HiveQL) into the underlying Java without the need to implement queries in the low-level Java API. Since most data warehousing applications work with SQL-based querying languages, Hive aids portability of SQL-based applications to Hadoop. While initially developed by Facebook, Apache Hive is used and developed by other companies such as Netflix and the Financial Industry Regulatory Authority (FINRA). Amazon maintains a software fork of Apache Hive included in Amazon Elastic MapReduce on Amazon Web Services.

BIDS Helper

BIDS Helper is a Visual Studio open source extension with multiple features that extend and enhance business intelligence development functionality in all editions of Microsoft's SQL Server 2005, 2008, 2008 R2 and 2012. BIDS Helper improves the development environment for integration, analysis and reporting services. BIDS Helper is hosted on GitHub.

Hierarchical Cluster Engine Project

Hierarchical Cluster Engine (HCE) is a FOSS complex solution for: construct custom network mesh or distributed network cluster structure with several relations types between nodes, formalize the data flow processing goes from upper node level central source point to down nodes and backward, formalize the management requests handling from multiple source points, support native reducing of multiple nodes results, internally support powerful full-text search engine and data storage, provide transactions-less and transactional requests processing, support flexible run-time changes of cluster infrastructure, have many languages bindings for client-side integration APIs in one product build on C++ language.

The history of Microsoft SQL Server begins with the first Microsoft SQL Server database product - SQL Server v1.0, a 16-bit Relational Database for the OS/2 operating system, released in 1989.

References