Developer(s) | Amazon, Amazon Web Services |
---|---
Initial release | 29 November 2017 |
Type | Software as a service |
Website | aws |
Amazon SageMaker is a cloud-based machine-learning platform that enables developers to create, train, and deploy machine-learning (ML) models in the cloud. [1] It can also be used to deploy ML models on embedded systems and edge devices. [2] [3] The platform was launched in November 2017. [4]
SageMaker lets developers work at several levels of abstraction when training and deploying machine-learning models. At the highest level of abstraction, SageMaker provides pre-trained ML models that can be deployed as-is. [5] In addition, it offers a number of built-in ML algorithms that developers can train on their own data. [6] [7]
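As an illustration of the built-in-algorithm workflow, the following is a minimal sketch using the SageMaker Python SDK. The IAM role ARN, S3 bucket, and hyperparameter values are hypothetical placeholders, not values from this article.

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # hypothetical IAM role

# Look up the container image for the built-in XGBoost algorithm in this region.
container = image_uris.retrieve(
    framework="xgboost", region=session.boto_region_name, version="1.5-1"
)

# Configure a training job that runs the built-in algorithm on data stored in S3.
estimator = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/output",  # hypothetical bucket
    sagemaker_session=session,
)
estimator.set_hyperparameters(objective="binary:logistic", num_round=100)

# Launch training against labelled data already uploaded to S3 (hypothetical path).
estimator.fit({"train": "s3://example-bucket/train.csv"})

# Deploy the trained model behind a managed HTTPS endpoint.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```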
The platform also features managed instances of TensorFlow and Apache MXNet, in which developers can create their own ML algorithms from scratch. [8] Regardless of the level of abstraction used, developers can connect their SageMaker models to other AWS services, such as the Amazon DynamoDB database for structured data storage, [9] AWS Batch for offline batch processing, [9] [10] or Amazon Kinesis for real-time processing. [11]
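A hedged sketch of this "bring your own script" path with a managed framework container follows; the train.py entry point, the role ARN, the framework version, and the S3 paths are illustrative assumptions.

```python
from sagemaker.tensorflow import TensorFlow

# Hypothetical IAM role used only for illustration.
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"

# The entry point is a user-written training script containing the custom
# TensorFlow model; SageMaker runs it inside a managed framework container.
estimator = TensorFlow(
    entry_point="train.py",
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.11",
    py_version="py39",
)

# Training data is read from S3 and model artifacts are written back to S3,
# from where other AWS services (e.g. batch or streaming jobs) can consume them.
estimator.fit({"training": "s3://example-bucket/dataset/"})  # hypothetical path
```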
Several interfaces are available for developers to interact with SageMaker. First, there is a web API that remotely controls a SageMaker server instance. [12] While the web API is agnostic to the programming language used by the developer, Amazon provides SageMaker API bindings for a number of languages, including Python, JavaScript, Ruby, Java, and Go. [13] [14] In addition, SageMaker provides managed Jupyter Notebook instances for interactively programming SageMaker and other applications. [15] [16]
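To show how the web API and its language bindings relate, below is a minimal Python sketch using boto3, the Python binding mentioned above. The operations shown are part of the public SageMaker control-plane API; the region and the idea of printing statuses are illustrative choices rather than details from the article.

```python
import boto3

# The control-plane operations exposed by the web API are also available
# through the language bindings; boto3 is the Python binding.
client = boto3.client("sagemaker", region_name="us-east-1")

# List recent training jobs and inspect each one's status.
jobs = client.list_training_jobs(MaxResults=10)
for summary in jobs["TrainingJobSummaries"]:
    detail = client.describe_training_job(TrainingJobName=summary["TrainingJobName"])
    print(summary["TrainingJobName"], detail["TrainingJobStatus"])

# Managed Jupyter Notebook instances can be enumerated the same way.
notebooks = client.list_notebook_instances()
for nb in notebooks["NotebookInstances"]:
    print(nb["NotebookInstanceName"], nb["NotebookInstanceStatus"])
```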
In 2019, CIOL named SageMaker one of the "5 Best Machine Learning Platforms For Developers," alongside IBM Watson, Microsoft Azure Machine Learning, Apache PredictionIO, and AiONE. [35]