Power Query

Last updated
The Power Query user interface, as shown in the Power Query Online experience. Power Query User Interface.png
The Power Query user interface, as shown in the Power Query Online experience.

Power Query is an ETL tool created by Microsoft for data extraction, loading and transformation, and is used to retrieve data from sources, process it, and load them into one or more target systems. Power Query is available in several variations within the Microsoft Power Platform, and is used for business intelligence on fully or partially self-service platforms. It is found in software such as Excel, Power BI, Analysis Services, Dataverse, [1] Power Apps, Azure Data Factory, SSIS, Dynamics 365, and in cloud services such as Microsoft Dataflows, [2] including Power BI Dataflow used with the online Power BI Service or the somewhat more generic version of Microsoft Dataflow used with Power Automate.

Contents

ETL is closely related to data modeling, [3] and for transformation, Power Query can be used to develop a logical data model in those cases where the data does not already have one, or where there is a need to further develop the data model.

History

Power Query was first announced in 2011 under the codename "Data Explorer" as part of Azure SQL Labs. In 2013, in order to expand on the self-service business intelligence capabilities of Microsoft Excel, the project was redesigned to be packaged as an add-in Excel and was renamed "Data Explorer Preview for Excel" [4] , and was made available for Excel 2010 and Excel 2013. [5] In July 2013 the add-in was removed from preview and renamed to "Power Query" [6] [7] . Monthly updates for the add-in were released until 2016, when Power Query was included in Excel natively. In Excel 2016, the function was renamed "Get & Transform" [8] for a short time, but has since been changed back to Power Query.

In April 2017, Power Query was made available in Microsoft Analysis Services. With the launch of the Common Data Service in March 2018, Power Query was included as its main data import tool.

M Formula language

Power Query is built on what was then[ when? ] a new query language called M. It is a mashup language (hence the letter M) designed to create queries that mix together data. It is similar to the F Sharp programming language, and according to Microsoft it is a "mostly pure, higher-order, dynamically typed, partially lazy, functional language." The M language is case-sensitive.

Much of the user interaction with Power Query can be done via graphical user interfaces with wizards, and this can be used for many common or basic tasks. It is also possible to use the advanced editing mode where the developer can write in the M formula language; this gives greater expressive power, more possibilities, and can also be used to change the code generated by the graphical wizards.

Let expression

User queries are typically written with a top level let expression. The let expression contains a list of comma-separated named reference (variables) bindings and an in expression which is what the let expression evaluates to. The in expression can reference the variables and the variables can reference each other. Backwards and forward referencing is allowed, and self-referencing is allowed by prefixing the @ on the variable. Variables are recursively evaluated as needed to evaluate the in expression. No variable is evaluated more than once.

Examples

leta="Hello",b="World",result=a&" "&binresult
letresult=Fib(5),Fib=(iteration)=>ifiteration=0oriteration=1then1elseleta=@Fib(iteration-1),b=@Fib(iteration-2)ina+binresult

Assertions and Datatypes

Variables are not typed in Power Query. Instead, an expression can have a type assertion which will evaluate to an error when the expression does not evaluate to a value compatible with the assertion. Assertions can be preceded by nullable to include null in the allowed values.

Datatypes/Assertions
NameDescriptionDatatypeAssertion
numberAssertion for integer and floating-point numbersDark Red x.svg NoGreen check.svg Yes
intSigned 32-bit integerGreen check.svg YesDark Red x.svg No
longSigned 64-bit integerGreen check.svg YesDark Red x.svg No
doubleIEEE 754 floatGreen check.svg YesDark Red x.svg No
decimal128-bit float. Same as C#'s decimalGreen check.svg YesDark Red x.svg No
timeTime of dayGreen check.svg YesGreen check.svg Yes
dateA calendar date ranging from 1 CE to 9999 CE in the Georgian CalendarGreen check.svg YesGreen check.svg Yes
datetimeA composite of the date and time datatypesGreen check.svg YesGreen check.svg Yes
durationA measurement of elapsed time (can be negative)Green check.svg YesGreen check.svg Yes
logicalRepresents a Boolean true or false valueGreen check.svg YesGreen check.svg Yes
textA Unicode stringGreen check.svg YesGreen check.svg Yes
guidA Globally Unique Identifier (Converts to a text as needed automatically)Green check.svg YesDark Red x.svg No
listAn ordered list of valuesGreen check.svg YesGreen check.svg Yes
recordAn ordered map from text to any valueGreen check.svg YesGreen check.svg Yes
tableA 2D matrix where each column has a unique name and type (type not checked on table contents)Green check.svg YesGreen check.svg Yes
functionA power query functionGreen check.svg YesGreen check.svg Yes
typeRepresents a datatype and may contain assertion informationGreen check.svg YesGreen check.svg Yes
actionAn internally used datatypeGreen check.svg YesGreen check.svg Yes
nullThe null singletonGreen check.svg YesGreen check.svg Yes
anyRepresents all valuesDark Red x.svg NoGreen check.svg Yes
anynonnullRepresents all values except nullDark Red x.svg NoGreen check.svg Yes
noneRepresents no values and always fails as an assertionDark Red x.svg NoGreen check.svg Yes
errorA pseudo value representing an errorDark Red x.svg NoDark Red x.svg No

Comments

Power Query supports the C block (/* ... */) and C line (// ...) comments.

DirectQuery

In Power BI, use of M-code is somewhat limited in DirectQuery, as opposed to Import which has all capabilities. This is due to the requirement that M-code in DirectQuery has to be translated from DAX into SQL and then into the data source's native query language (if it isn't SQL) upon user interaction with the data.

Query Folding

Query Folding is the ability for the Power Query steps to be transpiled into a single query at the data source (for example in Transact SQL). As such, Query Folding works like a traditional ETL process, and enables working on the data before loading. Query Folding is not always supported. Steps like filtering, selecting columns and simple SQL arithmetic are supported. Steps like creating index and appending or merging non foldable sources with foldable sources are not. Folding indicators (such as folding, not folding, might fold, opaque, unknown) might indicate up to which step a query might fold. Non-folding queries will have to be performed on the client-side. The order of queries can determine how many of the steps which get folded.

Connectors

A connector in power query is a library which enables queries to connect to a datasource. End users of Power BI can write custom connectors in the Power Query language. If a user is the owner/maintainer of the datasource technology they can request Microsoft certify their connector and include it in future releases of Power Bi after the review of the code. Connectors that are not certified can still be shared with other people who enable developer mode in Power Bi, or by Power Bi online domain administrators.

See also

Related Research Articles

In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed and which also avoids repeated evaluations.

<span class="mw-page-title-main">Microsoft Access</span> Database manager part of the Microsoft 365 package

Microsoft Access is a database management system (DBMS) from Microsoft that combines the relational Access Database Engine (ACE) with a graphical user interface and software-development tools. It is a member of the Microsoft 365 suite of applications, included in the Professional and higher editions or sold separately.

<span class="mw-page-title-main">F Sharp (programming language)</span> Microsoft programming language

F# is a general-purpose, high-level, strongly typed, multi-paradigm programming language that encompasses functional, imperative, and object-oriented programming methods. It is most often used as a cross-platform Common Language Infrastructure (CLI) language on .NET, but can also generate JavaScript and graphics processing unit (GPU) code.

<span class="mw-page-title-main">Visual programming language</span> Programming language written graphically by a user

In computing, a visual programming language, also known as diagrammatic programming, graphical programming or block coding, is a programming language that lets users create programs by manipulating program elements graphically rather than by specifying them textually. A VPL allows programming with visual expressions, spatial arrangements of text and graphic symbols, used either as elements of syntax or secondary notation. For example, many VPLs are based on the idea of "boxes and arrows", where boxes or other screen objects are treated as entities, connected by arrows, lines or arcs which represent relations. VPLs are generally the basis of low-code development platforms.

In computing, dataflow is a broad concept, which has various meanings depending on the application and context. In the context of software architecture, data flow relates to stream processing or reactive programming.

A user-defined function (UDF) is a function provided by the user of a program or environment, in a context where the usual assumption is that functions are built into the program or environment. UDFs are usually written for the requirement of its creator.

Microsoft SQL Server Integration Services (SSIS) is a component of the Microsoft SQL Server database software that can be used to perform a broad range of data migration tasks.

SQL Server Reporting Services (SSRS) is a server-based report generating software system from Microsoft. It is part of a suite of Microsoft SQL Server services, including SSAS and SSIS.

Scriptella is an open source ETL (Extract-Transform-Load) and script execution tool written in Java. It allows the use of SQL or another scripting language suitable for the data source to perform required transformations. Scriptella does not offer any graphical user interface.

The null coalescing operator is a binary operator that is part of the syntax for a basic conditional expression in several programming languages, such as : C# since version 2.0, Dart since version 1.12.0, PHP since version 7.0.0, Perl since version 5.10 as logical defined-or, PowerShell since 7.0.0, and Swift as nil-coalescing operator.

Language Integrated Query is a Microsoft .NET Framework component that adds native data querying capabilities to .NET languages, originally released as a major part of .NET Framework 3.5 in 2007.

Microsoft SQL Server is a proprietary relational database management system developed by Microsoft. As a database server, it is a software product with the primary function of storing and retrieving data as requested by other software applications—which may run either on the same computer or on another computer across a network. Microsoft markets at least a dozen different editions of Microsoft SQL Server, aimed at different audiences and for workloads ranging from small single-machine applications to large Internet-facing applications with many concurrent users.

Business Intelligence Development Studio (BIDS) is the former IDE from Microsoft, and was used to develop data analysis and business intelligence solutions utilizing Microsoft SQL Server Analysis Services, Reporting Services and Integration Services. It is based on the Microsoft Visual Studio development environment, but customized with the SQL Server services-specific extensions and project types, including tools, controls and projects for reports, ETL dataflows, OLAP cubes and data mining structure.

Power Pivot, formerly known as PowerPivot, is a self-service business intelligence feature of Microsoft Excel which facilitates the creation of a tabular model to import, relate, and analyze data from a variety of sources.

<span class="mw-page-title-main">Array DBMS</span> System that provides database services specifically for arrays

An array database management system or array DBMS provides database services specifically for arrays, that is: homogeneous collections of data items, sitting on a regular grid of one, two, or more dimensions. Often arrays are used to represent sensor, simulation, image, or statistics data. Such arrays tend to be Big Data, with single objects frequently ranging into Terabyte and soon Petabyte sizes; for example, today's earth and space observation archives typically grow by Terabytes a day. Array databases aim at offering flexible, scalable storage and retrieval on this information category.

<span class="mw-page-title-main">BIDS Helper</span>

BIDS Helper is a Visual Studio open source extension with multiple features that extend and enhance business intelligence development functionality in all editions of Microsoft's SQL Server 2005, 2008, 2008 R2 and 2012. BIDS Helper improves the development environment for integration, analysis and reporting services. BIDS Helper is hosted on GitHub.

Data Analysis Expressions (DAX) is the native formula and query language for Microsoft PowerPivot, Power BI Desktop and SQL Server Analysis Services (SSAS) Tabular models. DAX includes some of the functions that are used in Excel formulas with additional functions that are designed to work with relational data and perform dynamic aggregation. It is, in part, an evolution of the Multidimensional Expression (MDX) language developed by Microsoft for Analysis Services multidimensional models combined with Excel formula functions. It is designed to be simple and easy to learn, while exposing the power and flexibility of PowerPivot and SSAS tabular models.

<span class="mw-page-title-main">Cosmos DB</span> Cloud-based NoSQL database service

Azure Cosmos DB is a globally distributed, multi-model database service offered by Microsoft. It is designed to provide high availability, scalability, and low-latency access to data for modern applications. Unlike traditional relational databases, Cosmos DB is a NoSQL and vector database, which means it can handle unstructured, semi-structured, structured, and vector data types.

Microsoft Power BI is an interactive data visualization software product developed by Microsoft with a primary focus on business intelligence (BI). It is part of the Microsoft Power Platform. Power BI is a collection of software services, apps, and connectors that work together to turn various sources of data into static and interactive data visualizations. Data may be input by reading directly from a database, webpage, PDF, or structured files such as spreadsheets, CSV, XML, JSON, XLSX, and SharePoint.

Azure Data Explorer is a fully-managed big data analytics cloud platform and data-exploration service, developed by Microsoft, that ingests structured, semi-structured and unstructured data. The service then stores this data and answers analytic ad hoc queries on it with seconds of latency. It is a full-text indexing and retrieval database, including time series analysis capabilities and regular expression evaluation and text parsing.

References

  1. DougKlopfenstein. "Power Query documentation - Power Query" . Retrieved 2022-10-27.
  2. ptyx507x. "What is Power Query? - Power Query" . Retrieved 2022-10-27.{{cite web}}: CS1 maint: numeric names: authors list (link)
  3. Dearmer, Abe. "Why ETL Data Modeling is Critical in 2021" . Retrieved 2022-10-27.
  4. Raviv, Gil (2019). Collect, combine, and transform data using Power Query in Excel and Power BI. United States: published with the authorization of Microsoft Corporation by Pearson Education, Inc. ISBN   978-1-5093-0795-1. OCLC   1006311144.
  5. Webb, Chris (2013-02-27). "Public Preview of Data Explorer". Chris Webb's BI Blog. Retrieved 2025-01-27.
  6. kexugit (2013-07-06). "'Data Explorer' is now Microsoft Power Query for Excel". learn.microsoft.com. Retrieved 2024-05-15.
  7. Jackson, Joab (2013-07-08). "Microsoft adds business intelligence tools to Office 365". PCWorld. Retrieved 2025-01-27.
  8. "Integrating Power Query technology in Excel 2016". Microsoft 365 Blog. 2015-09-10. Retrieved 2024-05-17.