Keyword Services Platform

The Keyword Services Platform (KSP) is a keyword research tool available through Microsoft adCenter, which contains a set of algorithms for providing information about keywords used in search engine queries.

The KSP was originally conceived by ZhaoHui Tang, Dylan Huang, Wayne Guan, Jiong Feng, Li Luo, Ken Kwok, and Fred Nie at Microsoft adCenter Labs in May 2006. It underwent a major overhaul in 2011, and the platform in its current form was developed by Nimeesh Patel, Shravana Aadith Ramia Bapulal, and Vivek Vinodchandra Pradhan. The platform aims to provide a core set of data and technology for search engine marketing and keyword research. The KSP delivers a standardized set of keyword technologies through a Web services model, accessible via an application programming interface (API) and a Microsoft Excel add-in.

KSP API beta access is available for researchers and developers upon request from the Keyword Services Platform feedback link.

Architecture

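The Keyword Services Platform architecture comprises the components described in the sections that follow: the Keyword API, providers, stored procedures, the server object model and shared services, the cloud server model, and the data mart.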

Developers may use .NET programming languages to create procedures that combine the use of different providers or implement additional business logic processing based on the output from a provider.

Keyword API

The Keyword Services Platform defines an API for each class of keyword service. These Web service interfaces include keyword extraction (ITermExtraction), keyword categorization (ITermCategorization), keyword suggestion (ITermSuggestion), keyword forecast (ITermForecast), keyword monetization (ITermMonetization), and several others. The APIs define the signature of each Web service.

Keyword suggestion

Keyword suggestions are handled via the ITermSuggestion interface. To find the five most closely related keywords to "BMW", the following method call may be used: GetTermSuggestion("BMW",5). The query result is shown in the following table, and by default, sorted by confidence:

OriginalTerm    Term
BMW             Auto
BMW             Car
BMW             Lexus
BMW             BMW cars
BMW             BMW Z4

To view the five suggested terms with their corresponding confidence scores, a third parameter can be used to indicate that statistics should be returned: GetTermSuggestion("BMW",5,true). The query result is shown in the following table, which adds columns for score and support. The results are similar to those available through the Data Mining Extensions (DMX) in SQL. Score represents the confidence or probability; support represents the number of cases supporting the rule in the training dataset.

OriginalTerm    Term        Score    Support
BMW             Auto        0.96     10000
BMW             Car         0.89     9000
BMW             Lexus       0.89     11000
BMW             BMW cars    0.83     12000
BMW             BMW Z4      0.78     12800

To return only those terms with a high confidence score, a filter can be used on the Score column with the following method call: GetTermSuggestion("BMW",5,true,"Score>0.8"). The query result is shown in the following table. In this case, only four rows are returned, as these are the only terms that meet the criterion of the filter.

OriginalTerm    Term        Score    Support
BMW             Auto        0.96     10000
BMW             Car         0.89     9000
BMW             Lexus       0.89     11000
BMW             BMW cars    0.83     12000
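
The filter string above resembles the expression syntax of .NET's DataTable.Select method, and the sample code later in this article shows KSP results being returned as a DataTable. The following is a minimal client-side sketch of the same filtering step, not the KSP implementation: it builds a local table from the rows shown above (a stand-in for a real service response) and assumes that the KSP filter behaves like a DataTable filter expression.

```csharp
using System;
using System.Data;

// Hypothetical stand-in for the table returned by GetTermSuggestion("BMW",5,true).
var suggestions = new DataTable("TermSuggestion");
suggestions.Columns.Add("OriginalTerm", typeof(string));
suggestions.Columns.Add("Term", typeof(string));
suggestions.Columns.Add("Score", typeof(double));
suggestions.Columns.Add("Support", typeof(int));

// Rows copied from the table in the text.
suggestions.Rows.Add("BMW", "Auto", 0.96, 10000);
suggestions.Rows.Add("BMW", "Car", 0.89, 9000);
suggestions.Rows.Add("BMW", "Lexus", 0.89, 11000);
suggestions.Rows.Add("BMW", "BMW cars", 0.83, 12000);
suggestions.Rows.Add("BMW", "BMW Z4", 0.78, 12800);

// DataTable.Select takes a filter expression and an optional sort expression.
DataRow[] confident = suggestions.Select("Score > 0.8", "Score DESC");

foreach (DataRow row in confident)
{
    Console.WriteLine($"{row["Term"]}: score {row["Score"]}, support {row["Support"]}");
}
```

With the data above, four rows pass the filter and "BMW Z4" (score 0.78) is excluded, matching the table in the text.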

When there are many input keywords, possibly thousands, batch query syntax can be used. For example, suppose that the keywords are stored in myInputTermTable, and only the two most relevant terms for each keyword should be returned: GetTermSuggestion(myInputTermTable,2). The query result is shown in the following table.

OriginalTerm    Term
BMW             Auto
BMW             Car
Honda           Lexus
Honda           Sedan
Ford            Pickup
Ford            Truck
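
The shape of a batch result, at most N suggestions per input keyword, can be illustrated with a short LINQ sketch. This is not how the KSP computes suggestions. It assumes a hypothetical list of candidate suggestions that is already ordered by confidence within each keyword, built only from terms that appear in the tables above, and then keeps the first two per keyword.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical candidates, already sorted by confidence within each original term.
var ranked = new (string OriginalTerm, string Term)[]
{
    ("BMW", "Auto"), ("BMW", "Car"), ("BMW", "Lexus"), ("BMW", "BMW cars"), ("BMW", "BMW Z4"),
    ("Honda", "Lexus"), ("Honda", "Sedan"),
    ("Ford", "Pickup"), ("Ford", "Truck"),
};

int maxPerTerm = 2;

// GroupBy on in-memory sequences keeps groups in first-seen order and
// preserves element order inside each group, so Take keeps the top entries.
List<(string OriginalTerm, string Term)> batch = ranked
    .GroupBy(r => r.OriginalTerm)
    .SelectMany(g => g.Take(maxPerTerm))
    .ToList();

foreach (var (originalTerm, term) in batch)
{
    Console.WriteLine($"{originalTerm}\t{term}");
}
```

The six rows produced correspond to the batch table above: the three lower-ranked BMW candidates are dropped, while Honda and Ford keep both of their entries.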

Keyword demographics

Keyword demographics are handled via the ITermDemographics interface. To obtain the demographic distribution for the keyword "Minivan", the following method call could be used: GetTermDemographics("minivan"). The query result is shown in the following table.

Term       Male    Female    0-13    13-18    18-25    25-35    35-50    50-65    65+
Minivan    0.40    0.60      0       0        0.1      0.2      0.4      0.2      0.1

Keyword monetization

Keyword monetization values specific to paid search are handled via the ITermMonetization interface. The following method call returns the KPIs for the keyword "Online bank" based on the previous week's paid search data, in the third position of sponsored listings: GetTermKPIs("online bank",TimeInterval.LastWeek,3). The result of the query is shown below, containing the input keyword, the number of clicks in the sponsored link for "Online bank", overall impressions for the keyword, position, average click-through rate (CTR), and average cost per click (CPC).

Term           Clicks    Impressions    Position    CTR      CPC
Online bank    42        2915           3           0.014    1.325
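
The figures in this row are consistent with the usual definition of click-through rate as clicks divided by impressions. The short sketch below reproduces that arithmetic and, as an added illustration that is not part of the KSP output, estimates total spend as clicks multiplied by average CPC. The variable names are hypothetical, and the currency unit is not stated in the source.

```csharp
using System;

// Values taken from the result row for "Online bank".
int clicks = 42;
int impressions = 2915;
decimal averageCpc = 1.325m;

// CTR = clicks / impressions, rounded to three decimal places as in the table.
decimal ctr = Math.Round((decimal)clicks / impressions, 3);

// Illustrative only: spend implied by the click count and the average CPC.
decimal estimatedSpend = clicks * averageCpc;

Console.WriteLine($"CTR: {ctr}");
Console.WriteLine($"Estimated spend: {estimatedSpend}");
```

Here 42 / 2915 is approximately 0.0144, which rounds to the 0.014 shown in the table, and 42 × 1.325 gives an estimated spend of 55.65.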

Keyword extraction

Keyword extraction is handled via the ITermExtraction interface. The following method call extracts the eight most relevant keywords from the webpage "autos.msn.com", and provides the corresponding statistics: GetTermExtraction("autos.msn.com",8,true). The result of the query is shown below, where the Score column represents the relevance of the extracted keyword to the page content, while the Support column represents the number of occurrences of a keyword on the page.

URL              Term           Score    Support
autos.msn.com    auto reviews   0.62     3
autos.msn.com    MSN autos      0.54     2
autos.msn.com    cars           0.48     5
autos.msn.com    sport cars     0.39     2
autos.msn.com    used cars      0.38     1
autos.msn.com    compare car    0.34     1
autos.msn.com    new cars       0.32     1
autos.msn.com    luxury cars    0.30     1

Sample code

The following code fragment connects to the Keyword Services Platform server and uses the keyword term forecast Web service.

    using (KeywordServer server = new KeywordServer("https://ksp.microsoft.com"))
    {
        server.UserName = "username";
        server.Password = "********";
        ITermForecast provider = null;
        try
        {
            server.Open();
            // Context can be set if needed. It will remain during the following calls.
            provider = server.GetProviderByImplementation<ITermForecast>(
                "Microsoft.adCenterLabs.Providers.KeywordForecastProvider");
            if (provider != null)
            {
                // Single mode API
                DataTable result = provider.GetTermForecast(term, -5, 3);
                DisplayResults(result);

                // Batch mode API
                result = provider.GetTermForecast(terms, -5, 3);
                DisplayResults(result);
            }
        }
        catch (FaultException)
        {
            // Handle fault returned from calling the proxy method
        }
        catch (CommunicationException)
        {
            // Handle lost network connection error
        }
        catch (TimeoutException)
        {
            // Handle time-out error
        }
        finally
        {
            if (provider != null)
                server.ReleaseService(provider);
        }
    }

Providers

Each Keyword Services Platform provider supplies a specific type of keyword technology by implementing one class of keyword interface (e.g., ITermSuggestion, ITermForecast, ITermExtraction). The API defines the signature of each Web service and the format of the returned data. A KSP provider is a server-side object encapsulating a particular implementation of a keyword technology, and it exposes its functionality through service contracts in the Windows Communication Foundation (WCF). WCF is Microsoft's unified programming model for building service-oriented applications; it enables developers to build secure, reliable, transacted solutions that integrate across platforms and interoperate with existing investments. To integrate seamlessly into the KSP, and in turn with third-party tools and applications, a provider must meet several conditions.

Stored procedures

Developers can write stored procedures (sprocs) using any .NET programming language. These procedures are executed on the Keyword Services Platform server, which hosts the Common Language Runtime (CLR). Like a database sproc, a KSP sproc is designed to let developers implement several types of business logic on the server side after retrieving result data from providers. KSP sprocs require no configuration management or setup.

Two types of stored procedures are supported: Managed Assembly Stored Procedure (MASP) and Common Language Runtime Stored Procedure (CLRSP). A MASP consists of a compiled .NET assembly containing a public interface exposed through the KSP as well as any dependent files. Once the MASP is uploaded to the KSP through its management interface, it becomes callable by KSP client programs. A CLRSP consists of a source file written in one of the supported CLR programming languages (C#, Visual Basic .NET, Managed Extensions for C++, and others). The functionalities of the CLRSPs are exposed through a public interface defined in the source file. Once the CLRSP is deployed to KSP through its management interface, it is compiled on-demand by KSP and becomes callable by KSP client programs. Compared to database sprocs, KSP sprocs are object-oriented. A sproc may contain a set of related functions, or even identically named functions with different signatures.

Server Object Model and Shared Services

Keyword Services Platform Server Object Models and Shared Services enable KSP Service Providers and stored procedure developers to access server-side objects and functionalities easily and consistently. The object model consists of the following three collections:

  1. Service providers: This collection enables callers to access server-side Service Provider objects by name, implementation interface, and/or class name. Once callers obtain the Service Provider object, all of the functionalities of the service provider are accessible through its public interface.
  2. Stored procedures: This collection enables callers to access server-side Stored Procedure objects by name, implementation interface, and/or class name. Once callers obtain the Stored Procedure object, all of the functionalities of the stored procedure are accessible through its public interface.
  3. Services: This collection enables callers to access server-side shared services by name, implementation interface, and/or class name. Once callers obtain the shared service object, all of the functionalities of the shared service are accessible through its public interface.

Cloud server model

The Microsoft adCenter Keyword Services Platform server farm provides a scalable platform for keyword technologies. Each server in the farm can have a different configuration to suit a variety of service providers and stored procedures. A dynamic service load-balancing server, the cloud server, is the hub of the KSP server farm. When a KSP server is added to the farm via the cloud server, all of its available keyword service providers and stored procedures are dynamically discovered and registered with the cloud server. Any later changes in the availability of the KSP server, or of its running service providers and stored procedures, are likewise discovered and registered automatically.

The cloud server distributes requests for services running on the KSP server farm through its load balancer provider. The default implementation of the load balancer provider uses round-robin scheduling. Over time, the cloud server accumulates usage patterns and statistics for the service providers and stored procedures running on each KSP server in the farm, and uses this information to decide how to deploy additional service providers and stored procedures automatically. For example, if the Keyword Forecast provider is heavily used across the farm while the providers running on machine "A" are lightly used, the cloud server will automatically deploy the Keyword Forecast provider to machine "A" and route requests to that machine to balance the load for the Keyword Forecast provider.

When a client application calls a service provider or stored procedure through the server, a KSP server with a matching service provider or stored procedure is selected by the load balancer provider, and the request is routed to the appropriate KSP server. If a server, service provider, or stored procedure in the KSP server farm is unavailable, it will be taken out of rotation by the load balancer automatically.
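
Round-robin selection with unavailable servers taken out of rotation can be sketched in a few lines. The following is an illustrative sketch only, not the KSP load balancer provider: the server names, the outOfRotation set, and the NextServer function are all hypothetical, and the real provider's interface is not described here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical KSP servers that all host the requested provider.
var servers = new List<string> { "ksp-a", "ksp-b", "ksp-c" };
var outOfRotation = new HashSet<string>();
int cursor = 0;

// Returns the next available server in round-robin order, skipping
// any server that has been taken out of rotation.
string NextServer()
{
    for (int attempts = 0; attempts < servers.Count; attempts++)
    {
        string candidate = servers[cursor];
        cursor = (cursor + 1) % servers.Count;
        if (!outOfRotation.Contains(candidate))
        {
            return candidate;
        }
    }
    throw new InvalidOperationException("No KSP server is available.");
}

// All servers healthy: requests cycle through the farm in order.
var firstPass = new List<string> { NextServer(), NextServer(), NextServer(), NextServer() };
Console.WriteLine(string.Join(", ", firstPass));   // ksp-a, ksp-b, ksp-c, ksp-a

// One server becomes unavailable and is skipped from then on.
outOfRotation.Add("ksp-b");
var secondPass = new List<string> { NextServer(), NextServer(), NextServer() };
Console.WriteLine(string.Join(", ", secondPass));  // ksp-c, ksp-a, ksp-c
```

A plain rotation like this ignores load. The usage statistics described above would be the input to any smarter placement or routing decision.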

Data mart

A data mart is a subset of an organizational data store, usually oriented to a specific purpose or major data subject, that may be distributed to support business needs. Many Keyword Services Platform providers require real-time database access. The database may contain a list of reference keywords, their corresponding traffic, most recent click-through data, and data mining model contents. This data is updated through ETL data pipelines on a regular basis based on the provider's requirements.

Technology transfer

Keyword Services Platform's architecture permits agile development and rapid technology transfer by giving researchers a platform for shipping their research results to a live system quickly. The API defines the standard contract between research models and developers. Researchers need only implement providers and deploy them to a selected set of KSP cloud server machines. Because the scope of such a deployment is limited, it is well suited to live testing. Once a provider has been live-tested and proven, KSP can make it the default provider without any changes on the application side. This infrastructure enables researchers at Microsoft and in academic settings to speed up innovation in keyword technology and deploy the latest research results to KSP consumers.

KSP data access with Microsoft Excel 2007

Microsoft adCenter released an add-in for Microsoft Excel 2007 that allows users to consume Keyword Services Platform data directly in Excel rather than through the API. The add-in makes much of the keyword technology available within Excel, including keyword extraction, suggestion, forecasting, and monetization. It is an example of the kind of mashup and creative use of data that the KSP makes possible.

Applications of the KSP

The Keyword Services Platform incorporates keyword technologies from Microsoft adCenter Labs and other Microsoft Research groups. Keyword APIs can be consumed by third-party business applications from paid search, content advertisements, behavioral targeting, presale business intelligence apps, and so on.

The KSP can be used in advertising campaign creation and management, as well as in behavioral targeting and display advertising.
