| Company type | Private |
| --- | --- |
| Industry | Artificial intelligence |
| Founded | 2019 |
| Founders | Jonas Andrulis, Samuel Weinbach |
| Headquarters | Heidelberg, Germany |
| Products | Luminous LLM |
| Website | aleph-alpha |
Aleph Alpha is a German artificial intelligence (AI) startup based in Heidelberg, founded by professionals with prior experience at Apple, SAP and Deloitte. [1] Aleph Alpha aims to provide a sovereign full-stack technology platform for generative AI that is independent of US companies and complies with European data protection regulations and the AI Act. It built one of the most powerful AI clusters [2] inside its own data centre and develops large language models (LLMs) that aim to make the sources used for generated results transparent [3] and are intended exclusively for enterprises and governmental agencies. Its models have been trained in five European languages. [4]
Aleph Alpha was founded in 2019 by the serial entrepreneur Jonas Andrulis and Samuel Weinbach. Andrulis received his degree in economics engineering from the Karlsruhe Institute of Technology with a thesis in AI. He worked in consulting at Deloitte and started several AI software companies. Before founding Aleph Alpha, he held an AI R&D engineering manager position at Apple's Special Projects Group and worked on classified research with Siri AI R&D. [5] Weinbach received a degree in business administration and worked at Deloitte consulting from 2010, building the Deloitte Analytic Institute, which focused on spearheading corporate AI initiatives. [1]
Aleph Alpha developed its own AI language model, named Luminous, based on its own research and codebase, using the generative pre-trained transformer (GPT) architecture with self-supervised learning. Beyond the standard functionality that all GPT models share, Aleph Alpha contributed some proprietary innovations:
Aleph Alpha uses the HPE Machine Learning Development System as the tool to build and train its foundation models. [17] The GPT approach allows the foundation model to be adapted and fine-tuned for various applications. [18]
Luminous is already used for Lumi, the citizen information system of the city of Heidelberg. [19]
Anthropic PBC is a U.S.-based artificial intelligence (AI) startup public-benefit company, founded in 2021. It researches and develops AI to "study their safety properties at the technological frontier" and use this research to deploy safe, reliable models for the public. Anthropic has developed a family of large language models (LLMs) named Claude as a competitor to OpenAI's ChatGPT and Google's Gemini.
iFlytek, styled as iFLYTEK, is a partially state-owned Chinese information technology company established in 1999. It creates voice recognition software and more than ten voice-based internet and mobile products spanning the education, communication, music, and intelligent-toy industries. The state-owned enterprise China Mobile is the company's largest shareholder. The company is listed on the Shenzhen Stock Exchange and is backed by several state-owned investment funds.
The Machine is the name of an experimental computer made by Hewlett Packard Enterprise. It was created as part of a research project to develop a new type of computer architecture for servers. The design focused on a “memory centric computing” architecture, where NVRAM replaced traditional DRAM and disks in the memory hierarchy. The NVRAM was byte addressable and could be accessed from any CPU via a photonic interconnect. The aim of the project was to build and evaluate this new design.
Databricks, Inc. is a global data, analytics and artificial intelligence company founded by the original creators of Apache Spark.
The Hewlett Packard Enterprise Company (HPE) is an American multinational information technology company based in Spring, Texas.
Writer is a generative artificial intelligence company based in San Francisco, California that offers a full-stack generative AI platform for enterprises. In September 2023, Writer raised $100m in a Series B led by ICONIQ Growth with participation from Insight Partners, WndrCo, Balderton Capital, and Aspect Ventures. The co-founders of Writer also worked together on Qordoba, a previous startup.
VAST Data is a privately held technology company focused on artificial intelligence (AI) and deep learning computing infrastructure. Founded in 2016, the company offers a data computing platform that allows users to train AI models by storing and synthesizing large amounts of unstructured data.
Cerebras Systems Inc. is an American artificial intelligence company with offices in Sunnyvale, San Diego, Toronto, Tokyo, and Bangalore, India. Cerebras builds computer systems for complex artificial intelligence deep learning applications.
QANDA is an AI-based learning platform developed by Mathpresso Inc., a South Korea-based education technology company. Its best known feature is a solution search, which uses optical character recognition technology to scan problems and provide step-by-step solutions and learning content.
Wu Dao is a multimodal artificial intelligence developed by the Beijing Academy of Artificial Intelligence (BAAI). Wu Dao 1.0 was first announced on January 11, 2021; an improved version, Wu Dao 2.0, was announced on May 31. It has been compared to GPT-3 and is built on a similar architecture; in comparison, GPT-3 has 175 billion parameters (variables and inputs within the machine learning model) while Wu Dao has 1.75 trillion parameters. Wu Dao was trained on 4.9 terabytes of images and texts, while GPT-3 was trained on 45 terabytes of text data. A growing body of work nonetheless highlights the importance of increasing both data and parameters. The chairman of BAAI said that Wu Dao was an attempt to "create the biggest, most powerful AI model possible", although parameter count alone does not directly correlate with model quality. Wu Dao 2.0 was called "the biggest language A.I. system yet" and was interpreted by commenters as an attempt to "compete with the United States". Notably, Wu Dao 2.0 uses a mixture-of-experts (MoE) architecture, unlike GPT-3, which is a "dense" model: while MoE models require much less computational power to train than dense models with the same number of parameters, trillion-parameter MoE models have shown performance comparable to that of models hundreds of times smaller.
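The compute saving that makes trillion-parameter MoE models feasible comes from sparse routing: each input activates only a few experts, so most parameters do no work for a given token. A minimal sketch of this idea (a toy illustration, not BAAI's implementation; the gate, experts, and weights are all made up):

```python
# Toy mixture-of-experts (MoE) layer: a gate scores the experts and only
# the top-k are evaluated, so per-input compute stays small even as the
# total number of expert parameters grows.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

class MoELayer:
    def __init__(self, gate_weights, experts, top_k=1):
        self.gate_weights = gate_weights  # one gating weight per expert
        self.experts = experts            # list of callables
        self.top_k = top_k

    def __call__(self, x):
        scores = softmax([w * x for w in self.gate_weights])
        # Evaluate only the top-k experts; the rest are skipped entirely.
        top = sorted(range(len(scores)),
                     key=lambda i: scores[i], reverse=True)[:self.top_k]
        norm = sum(scores[i] for i in top)
        return sum(scores[i] / norm * self.experts[i](x) for i in top)

# Two "experts": one doubles, one negates. The gate routes positive
# inputs to the doubling expert and negative inputs to the negating one.
layer = MoELayer(gate_weights=[1.0, -1.0],
                 experts=[lambda x: 2 * x, lambda x: -x],
                 top_k=1)
print(layer(3.0))   # doubling expert selected -> 6.0
print(layer(-3.0))  # negating expert selected -> 3.0
```

A dense layer would run every expert on every input; here, with `top_k=1`, only one of the two ever executes per call, which is the proportional saving MoE training exploits at scale.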
Prompt engineering is the process of structuring an instruction that can be interpreted and understood by a generative AI model. A prompt is natural language text describing the task that an AI should perform.
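As a concrete, hypothetical illustration: the function below assembles a few-shot prompt, a common prompt-engineering pattern in which a task description is followed by worked examples before the actual query. The task text, examples, and layout are illustrative choices, not a prescribed standard, and the model call itself is out of scope:

```python
# Build a few-shot prompt: task instruction, worked input/output pairs,
# then the new query left open for the model to complete.
def few_shot_prompt(task, examples, query):
    lines = [task, ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")  # the model continues from here
    return "\n".join(lines)

prompt = few_shot_prompt(
    task="Translate English to German.",
    examples=[("cheese", "Käse"), ("house", "Haus")],
    query="tree",
)
print(prompt)
```

The resulting text ends with an open `Output:` line, so a generative model naturally continues it with the answer for the final query.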
ChatGPT is a chatbot and virtual assistant developed by OpenAI and launched on November 30, 2022. Based on large language models (LLMs), it enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. Successive user prompts and replies are considered at each conversation stage as context.
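The way successive prompts and replies form the context at each turn can be sketched as a message list that grows with every exchange. The placeholder "assistant" below just counts user turns and stands in for a real model call; this is an illustrative sketch, not OpenAI's API:

```python
# Conversation state as a growing message list: each user prompt and
# assistant reply is appended, and the full list is what the model
# would see at the next turn.
def echo_assistant(messages):
    # Placeholder "model": replies with how many user turns it has seen.
    return f"(reply #{sum(1 for m in messages if m['role'] == 'user')})"

class Conversation:
    def __init__(self):
        self.messages = []

    def send(self, user_text):
        self.messages.append({"role": "user", "content": user_text})
        reply = echo_assistant(self.messages)  # model sees full history
        self.messages.append({"role": "assistant", "content": reply})
        return reply

chat = Conversation()
print(chat.send("Hello"))            # (reply #1)
print(chat.send("Shorter, please"))  # (reply #2)
print(len(chat.messages))            # 4 messages retained as context
```

Because the whole list is resent each turn, a follow-up like "Shorter, please" can be interpreted relative to what was said before.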
In the field of artificial intelligence (AI), a hallucination or artificial hallucination is a response generated by AI which contains false or misleading information presented as fact. This term draws a loose analogy with human psychology, where hallucination typically involves false percepts. However, there is a key difference: AI hallucination is associated with unjustified responses or beliefs rather than perceptual experiences.
Generative pre-trained transformers (GPT) are a type of large language model (LLM) and a prominent framework for generative artificial intelligence. They are artificial neural networks that are used in natural language processing tasks. GPTs are based on the transformer architecture, pre-trained on large data sets of unlabelled text, and able to generate novel human-like content. As of 2023, most LLMs have these characteristics and are sometimes referred to broadly as GPTs.
A large language model (LLM) is a computational model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. Based on language models, LLMs acquire these abilities by learning statistical relationships from vast amounts of text during a computationally intensive self-supervised and semi-supervised training process. LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word.
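The repeated next-token prediction that drives text generation can be sketched with a toy predictor. The loop structure is what a real LLM shares; the bigram lookup below is an assumed stand-in for the neural network, used only for illustration:

```python
# Autoregressive generation loop: repeatedly predict the next token and
# append it to the context, exactly as described for LLM text generation.
def predict_next(context, bigrams):
    # A real LLM scores every vocabulary token given the whole context;
    # this toy stand-in just looks up a follower of the last token.
    return bigrams.get(context[-1])

def generate(prompt_tokens, bigrams, max_new_tokens=5):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        nxt = predict_next(tokens, bigrams)
        if nxt is None:  # no known continuation: stop generating
            break
        tokens.append(nxt)
    return tokens

bigrams = {"the": "cat", "cat": "sat", "sat": "down"}
print(generate(["the"], bigrams))  # ['the', 'cat', 'sat', 'down']
```

Swapping the table lookup for a trained network that outputs a probability distribution over tokens (and sampling from it) turns this loop into the generation procedure real LLMs use.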
In machine learning, the term stochastic parrot is a metaphor to describe the theory that large language models, though able to generate plausible language, do not understand the meaning of the language they process. The term was coined by Emily M. Bender in the 2021 artificial intelligence research paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜" by Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell.
Watsonx is IBM's cloud-based commercial platform for generative AI and scientific data. It offers a studio, a data store, and a governance toolkit. It supports multiple large language models (LLMs) along with IBM's own Granite family.
Mistral AI is a French company specializing in artificial intelligence (AI) products. Founded in April 2023 by former employees of Meta Platforms and Google DeepMind, the company has quickly risen to prominence in the AI sector.
Perplexity AI is an AI chatbot-powered research and conversational search engine that answers queries using natural language predictive text. Launched in 2022, Perplexity generates answers using sources from the web and cites links within the text response. Perplexity works on a freemium model; the free product uses Anthropic's Claude 3 Haiku model combined with the company's standalone large language model (LLM) that incorporates natural language processing (NLP) capabilities, while the paid version Perplexity Pro has access to GPT-4, Claude 3, Mistral Large, Llama 3 and an Experimental Perplexity Model. As of early 2024, it has about 10 million monthly users.
IBM Granite is a series of decoder-only foundation models created by IBM. It was announced on September 7, 2023, and an initial paper was published four days later. Initially intended for use alongside other models in IBM's cloud-based data and generative AI platform Watsonx, some of the Granite code models have since had their source code opened. Granite models are trained on datasets curated from the Internet, academic publications, code datasets, and legal and finance documents.