IBM InfoSphere DataStage

Original author(s): Lee Scheffler
Stable release: 11.x
Platform: ETL Tool
Type: Data integration
Website: http://www.ibm.com

IBM InfoSphere DataStage is an ETL tool and part of the IBM Information Platforms Solutions suite and IBM InfoSphere. It uses a graphical notation to construct data integration solutions and is available in several editions, including the Server Edition, the Enterprise Edition, and the MVS Edition. It uses a client-server architecture, and its servers can be deployed on both Unix and Windows.

DataStage is frequently used in data warehousing projects to prepare data for report generation.

History

DataStage originated at VMark Software Inc., [1] a company that developed two notable products: the UniVerse database and the DataStage ETL tool. The first VMark ETL prototype was built by Lee Scheffler in the first half of 1996. [2] Peter Weyman, VMark's VP of Strategy, identified the ETL market as an opportunity. He appointed Lee Scheffler as the architect and conceived the product brand name "Stage" to signify modularity and component orientation. [3] This tag was used to name DataStage and, subsequently, the related products QualityStage, ProfileStage, MetaStage and AuditStage. Lee Scheffler presented the DataStage product overview to the VMark board in June 1996, and it was approved for development. The product entered alpha testing in October 1996 and beta testing in November 1996, and became generally available in January 1997.

VMark and Unidata merged in October 1997 and renamed the combined company Ardent Software. [4] In 1999 Ardent Software was acquired by Informix, the database software vendor. In April 2001 IBM acquired Informix but took only the database business, leaving the data integration tools to be spun off as an independent software company called Ascential Software. [5] In November 2001, Ascential Software Corp. of Westboro, Massachusetts acquired privately held Torrent Systems Inc. of Cambridge, Massachusetts for $46 million in cash. Ascential announced a commitment to integrate the parallel processing capabilities of Torrent's Orchestrate product directly into the DataStageXE platform. [6] In March 2005 IBM acquired Ascential Software [7] and made DataStage part of the WebSphere family as WebSphere DataStage. In 2006 the product was released as part of the IBM Information Server under the Information Management family but was still known as WebSphere DataStage. In 2008 the suite was renamed InfoSphere Information Server and the product was renamed InfoSphere DataStage. [8]

Releases

IBM Acquisition

InfoSphere DataStage is a data integration tool. It was acquired by IBM in 2005 and has become part of the IBM Information Server platform. It uses a client/server design in which jobs are created and administered via a Windows client against a central repository on a server. InfoSphere DataStage can integrate data on demand across multiple high-volume data sources and target applications using a high-performance parallel framework. It also facilitates extended metadata management and enterprise connectivity.

Major DataStage Versions and Life Cycle

References

  1. "VMark Software Inc - Companies on the Move - Brief Article"
  2. McBurney, Vincent (2006), "Lee Scheffler interview - the ghost of DataStage past", Tooling Around in the IBM InfoSphere
  3. McBurney, Vincent (2006), "Lee Scheffler Interview - the Ghost of DataStage present", Tooling Around in the IBM InfoSphere
  4. Spotts, Jeff (1997), "VMARK and Unidata Announce Merger Agreement", Business Wire
  5. "IBM and Informix Corp. Sign Agreement for Sale of Informix Database Business to IBM", Press Release, 2001
  6. Russom, Philip (2002), "Orchestrating a Torrent", Intelligent Enterprise Magazine, archived from the original on 2008-12-03
  7. "IBM to Acquire Ascential Software", Press Release, IBM, 2005
  8. IBM Corporation (2008), IBM InfoSphere Information Server Version 8.1 and product name changes, IBM