Microsoft UI Automation

Microsoft UI Automation (UIA) is an application programming interface (API) that allows one to access, identify, and manipulate the user interface (UI) elements of another application. [1] [2]

UIA is designed to provide UI accessibility and is the successor to Microsoft Active Accessibility (MSAA). It also facilitates GUI test automation and is the engine on which many test automation tools are built. Robotic process automation (RPA) tools also use it to automate applications in business processes.

UIA's property providers support both Win32 and .NET programs.

The latest specification of UIA is found as part of the Microsoft UI Automation Community Promise Specification. Microsoft claims that portability to platforms other than Microsoft Windows was one of its design goals. It has since been ported to Mono. [3]

History

In 2005, Microsoft released UIA as the successor to the MSAA framework.

The managed UI Automation API was released as part of .NET Framework 3.0. The native UI Automation API (provider) is included in the Windows Vista and Windows Server 2008 SDK and is also distributed with the .NET Framework.

UIA is available out of the box in Windows 7 as a part of Windows Automation API 3.0 and as a separate download for Windows XP, Windows Vista, and Windows Server 2003 and 2008. [4]

Motivation and goals

As a successor to MSAA, UIA aims to address the following goals:

Technical overview

On the client side, UIA provides a .NET interface in the UIAutomationClient.dll assembly and a COM interface implemented directly in UIAutomationCore.dll.

On the server side, UIAutomationCore.dll is injected into all or selected processes on the current desktop to retrieve data on behalf of a client. The DLL can also load UIA plugins (called providers) into its host process to extract data using different techniques.

UIA has four main provider and client components, as shown in the following table.

Component: UIAutomationCore (UIAutomationCore.dll and dependents)
Description: The underlying code (sometimes called the UIA core) that handles communication between providers and clients. UI Automation Core also offers the provider and client API interfaces for unmanaged applications and clients; unmanaged applications (either clients or providers) do not require the managed assemblies listed below.

Component: Managed Provider API (UIAutomationProvider.dll and dependents)
Description: A set of interface definitions and functions that are implemented by managed UIA provider applications. Providers are objects that provide information about UI elements and respond to programmatic input.

Component: Managed Client API (UIAutomationClient.dll and dependents)
Description: A set of interface definitions and functions for managed UIA client applications.

Component: UIAutomationClientsideProviders.dll
Description: A set of UIA provider implementations for legacy Win32 controls and MSAA applications. This client-side provider is available to managed client applications by default.

Elements

UIA exposes every piece of the UI to client applications as an Automation Element. Elements are contained in a tree structure, with the desktop as the root element.

Automation Element objects expose common properties of the UI elements they represent. One of these properties is the control type, which defines its basic appearance and functionality as a single recognizable entity (e.g., a button or check box).

In addition, elements expose control patterns that provide properties specific to their control types. Control patterns also expose methods that enable clients to get further information about the element and to provide input.

Clients can filter the raw view of the tree as a control view or a content view. Applications can also create custom views.
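The idea of filtered views can be illustrated with a small toy model. The sketch below is not the UIA API: the ToyElement class, its is_control and is_content flags, and the walk helper are hypothetical names chosen for illustration, and the result is flattened into a list of names rather than kept as a tree. It shows only how one underlying (raw) tree can yield narrower control and content views.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class ToyElement:
    """Hypothetical stand-in for an automation element (not the UIA API)."""
    name: str
    is_control: bool = True   # assumed flag: element belongs in the control view
    is_content: bool = True   # assumed flag: element belongs in the content view
    children: List["ToyElement"] = field(default_factory=list)


def walk(parent: ToyElement,
         keep: Callable[[ToyElement], bool]) -> Iterator[ToyElement]:
    """Yield descendants of `parent` depth-first, omitting those that fail `keep`.

    Descendants of an omitted element are still visited, so a filtered view
    skips over wrapper elements without losing what they contain.
    """
    for child in parent.children:
        if keep(child):
            yield child
        yield from walk(child, keep)


def view_names(root: ToyElement,
               keep: Callable[[ToyElement], bool]) -> List[str]:
    """Flatten a filtered view into a list of element names."""
    return [element.name for element in walk(root, keep)]


# An arbitrary example tree; the flag values are made up for illustration.
desktop = ToyElement("desktop", children=[
    ToyElement("Editor window", children=[
        ToyElement("layout pane", is_control=False, is_content=False, children=[
            ToyElement("Save button"),
            ToyElement("scroll bar", is_content=False),
        ]),
    ]),
])

raw_view = view_names(desktop, lambda e: True)
control_view = view_names(desktop, lambda e: e.is_control)
content_view = view_names(desktop, lambda e: e.is_content)
```

In this model the layout pane appears only in the raw view, while the elements it contains remain reachable in the filtered views.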

Tree

Within the UIA tree there is a root element that represents the current desktop and whose child elements represent application windows. Each of these child elements may contain elements representing pieces of UI such as menus, buttons, toolbars, and list boxes. These elements, in turn, can contain other elements, such as list items.

The UIA tree is not a fixed structure and is seldom seen in its totality because it might contain thousands of elements. Parts of the tree are built as they are needed, and the tree can undergo changes as elements are added, moved, or removed.
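Building parts of a tree on demand can be sketched as follows. This is a toy model under stated assumptions, not how UIA is implemented: LazyNode, the built log, and the ui dictionary (a stand-in for the live user interface) are all hypothetical. Children are created only when first requested, and refresh discards them so that later changes become visible.

```python
from typing import Dict, List, Optional

built: List[str] = []  # names of nodes materialized so far, in creation order


class LazyNode:
    """Hypothetical tree node whose children are built on first access."""

    def __init__(self, name: str, ui: Dict[str, List[str]]) -> None:
        self.name = name
        self._ui = ui  # stand-in for the live UI being queried
        self._children: Optional[List["LazyNode"]] = None
        built.append(name)

    @property
    def children(self) -> List["LazyNode"]:
        # Materialize child nodes only when a client actually asks for them.
        if self._children is None:
            self._children = [LazyNode(child, self._ui)
                              for child in self._ui.get(self.name, [])]
        return self._children

    def refresh(self) -> None:
        """Forget cached children so the next access reflects the current UI."""
        self._children = None


ui = {
    "desktop": ["Notepad", "Calculator"],
    "Notepad": ["menu bar", "edit box"],
    "Calculator": ["button 1", "button 2"],
}

root = LazyNode("desktop", ui)
notepad = root.children[0]
notepad_parts = [child.name for child in notepad.children]
built_before_change = list(built)  # Calculator's buttons were never built

# The UI changes: a new top-level window appears.
ui["desktop"].append("Paint")
root.refresh()
windows_after_change = [child.name for child in root.children]
```

Only the branches a client walks are ever materialized, which is the property the paragraph above describes.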

Control types

UIA control types are well-known identifiers that can be used to indicate what kind of control a particular element represents, such as a combo box or a button.

Having a well-known identifier makes it easier for assistive technology (AT) devices to determine what types of controls are available in the user interface (UI) and how to interact with them. A human-readable representation of the UIA control type is available as the LocalizedControlType property, which control or application developers can customize.

Control patterns

Control patterns provide a way to categorize and expose a control's functionality independent of the control type or the appearance of the control.

UIA uses control patterns to represent common control behaviors. For example, the Invoke control pattern is used for controls that can be invoked (such as buttons) and the Scroll control pattern is used for controls that are scrollable viewports (such as list boxes, list views, or combo boxes). Because each control pattern represents a separate piece of functionality, patterns can be combined to describe the full set of functionality supported by a particular control.
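A minimal sketch of this composition idea follows. It is a toy model, not the UIA API: the Invoke and Scroll classes here are simplified stand-ins named after the patterns mentioned above, and ToyControl, get_pattern, and invoke_if_supported are hypothetical. The point is that a client asks a control for a capability rather than inspecting its control type.

```python
from typing import Callable, Dict, List, Optional, Type, TypeVar

P = TypeVar("P")


class Invoke:
    """Simplified stand-in for an invoke-style pattern: one unambiguous action."""

    def __init__(self, action: Callable[[], None]) -> None:
        self._action = action

    def invoke(self) -> None:
        self._action()


class Scroll:
    """Simplified stand-in for a scroll-style pattern: a vertical position in percent."""

    def __init__(self) -> None:
        self.vertical_percent = 0.0

    def set_vertical_percent(self, percent: float) -> None:
        if not 0.0 <= percent <= 100.0:
            raise ValueError("percent must be between 0 and 100")
        self.vertical_percent = percent


class ToyControl:
    """Hypothetical control that supports any combination of patterns."""

    def __init__(self, control_type: str, *patterns: object) -> None:
        self.control_type = control_type
        self._patterns: Dict[type, object] = {type(p): p for p in patterns}

    def get_pattern(self, pattern_type: Type[P]) -> Optional[P]:
        """Return the requested pattern object, or None if it is unsupported."""
        return self._patterns.get(pattern_type)  # type: ignore[return-value]


def invoke_if_supported(control: ToyControl) -> bool:
    """Invoke the control if it offers Invoke, regardless of its control type."""
    pattern = control.get_pattern(Invoke)
    if pattern is None:
        return False
    pattern.invoke()
    return True


clicks: List[str] = []
button = ToyControl("button", Invoke(lambda: clicks.append("button")))
list_box = ToyControl("list", Scroll())
custom = ToyControl("custom", Invoke(lambda: clicks.append("custom")), Scroll())

results = [invoke_if_supported(c) for c in (button, list_box, custom)]
custom.get_pattern(Scroll).set_vertical_percent(50.0)
```

The custom control has no well-known type, yet a client can still drive it because it advertises both patterns.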

Properties

UIA providers expose properties on UIA elements and the control patterns. These properties enable UIA client applications to discover information about pieces of the user interface (UI), especially controls, including both static and dynamic data.

Events

UIA event notification is a key feature for assistive technologies (AT) such as screen readers and screen magnifiers. These UIA clients track events that UIA providers raise when something happens in the UI, and use the information to notify end users.

Efficiency is improved by allowing provider applications to raise events selectively, depending on whether any clients are subscribed to those events, or not at all, if no clients are listening for any events.
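Selective event raising can be sketched with a small publish/subscribe model. This is an illustration under stated assumptions rather than the UIA event API: ToyEventHub, ToyProvider, and the "focus_changed" event name are hypothetical. The provider checks for listeners first and skips building the event payload when nobody is subscribed.

```python
from typing import Callable, Dict, List


class ToyEventHub:
    """Hypothetical broker between providers (raisers) and clients (listeners)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], None]]] = {}

    def subscribe(self, event_id: str, handler: Callable[[str], None]) -> None:
        self._handlers.setdefault(event_id, []).append(handler)

    def clients_are_listening(self, event_id: str) -> bool:
        return bool(self._handlers.get(event_id))

    def raise_event(self, event_id: str, payload: str) -> None:
        for handler in self._handlers.get(event_id, []):
            handler(payload)


class ToyProvider:
    """Hypothetical provider that raises events only when someone is listening."""

    def __init__(self, hub: ToyEventHub) -> None:
        self._hub = hub
        self.payloads_built = 0  # counts how often the event payload was prepared

    def focus_changed(self, element_name: str) -> bool:
        if not self._hub.clients_are_listening("focus_changed"):
            return False  # no listeners: skip the work entirely
        self.payloads_built += 1
        self._hub.raise_event("focus_changed", f"focus moved to {element_name}")
        return True


hub = ToyEventHub()
provider = ToyProvider(hub)

raised_without_listener = provider.focus_changed("OK button")

spoken: List[str] = []  # stand-in for what a screen reader would announce
hub.subscribe("focus_changed", spoken.append)
raised_with_listener = provider.focus_changed("Cancel button")
```

Before any client subscribes, the provider does no event work at all; after subscription, the same call delivers the notification.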

TextPattern

UIA exposes the textual content, including format and style attributes, of text controls in UIA-supported platforms. These controls include, but are not limited to, the Microsoft .NET Framework TextBox and RichTextBox as well as their Win32 equivalents.

Exposing the textual content of a control is accomplished through the TextPattern control pattern, which represents the contents of a text container as a text stream. In turn, TextPattern requires the support of the TextPatternRange class to expose format and style attributes. TextPatternRange supports TextPattern by representing a contiguous text span in a text container with Start and End endpoints. Multiple or disjoint text spans can be represented by more than one TextPatternRange object. TextPatternRange supports functionality such as cloning, selection, comparison, retrieval, and traversal.
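The relationship between a text stream and its ranges can be sketched with a toy model. The ToyTextRange class below is hypothetical and far simpler than TextPatternRange: it carries no format or style attributes, and its method names (get_text, clone, same_span, find) are chosen for illustration. It shows a contiguous span defined by start and end endpoints, and disjoint spans represented by several range objects.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ToyTextRange:
    """Hypothetical contiguous span [start, end) over a container's text stream."""
    stream: str
    start: int
    end: int  # exclusive endpoint

    def __post_init__(self) -> None:
        if not 0 <= self.start <= self.end <= len(self.stream):
            raise ValueError("endpoints must satisfy 0 <= start <= end <= len(stream)")

    def get_text(self) -> str:
        """Retrieve the text covered by this range."""
        return self.stream[self.start:self.end]

    def clone(self) -> "ToyTextRange":
        """Return an independent range with the same endpoints."""
        return ToyTextRange(self.stream, self.start, self.end)

    def same_span(self, other: "ToyTextRange") -> bool:
        """Compare two ranges by their endpoints."""
        return (self.start, self.end) == (other.start, other.end)

    def find(self, needle: str) -> Optional["ToyTextRange"]:
        """Return a sub-range covering the first match inside this range, if any."""
        index = self.stream.find(needle, self.start, self.end)
        if index == -1:
            return None
        return ToyTextRange(self.stream, index, index + len(needle))


stream = "UIA exposes text as a stream"
whole = ToyTextRange(stream, 0, len(stream))

first = whole.find("UIA")
last = whole.find("stream")

# Two disjoint spans need two range objects.
disjoint: List[ToyTextRange] = [r for r in (first, last) if r is not None]
texts = [r.get_text() for r in disjoint]

# Searching inside the three-character range cannot reach the later word.
missing = first.find("stream")
```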

UI Automation for automated testing

UIA can also be useful as a framework for programmatic access in automated testing scenarios. In addition to providing more refined solutions for accessibility, it is also specifically designed to provide robust functionality for automated testing.

Programmatic access provides the ability to imitate, through code, any interaction and experience exposed by traditional user interactions. UIA enables programmatic access through five components:

Availability

UIA was initially available on Windows Vista and Windows Server 2008, and it was also made available for Windows XP and Windows Server 2003 as part of .NET Framework 3.0. It has been integrated into all subsequent Windows versions, beginning with Windows 7. [5]

Besides Windows platforms, the Olive project (a set of add-on libraries for the Mono core aimed at .NET Framework support) includes a subset of WPF (PresentationFramework and WindowsBase) and UI Automation. [6]

Novell's Mono Accessibility project is an implementation of the UIA Provider and Client specifications targeted for the Mono framework. Additionally, the project provides a bridge to the Accessibility Toolkit (ATK) for Linux assistive technologies (ATs). Novell is also working on a bridge for UIA-based ATs to interact with applications that implement ATK. [7]

References