Developer(s) | Harrison Chase |
---|---|
Initial release | October 2022 |
Stable release | 0.1.16 [1] / 11 April 2024 |
Repository | github.com/langchain-ai/langchain |
Written in | Python and JavaScript |
Type | Software framework for large language model application development |
License | MIT License |
Website | LangChain.com |
LangChain is a software framework that facilitates the integration of large language models (LLMs) into applications. As a language model integration framework, LangChain's use cases largely overlap with those of language models in general, including document analysis and summarization, chatbots, and code analysis. [2]
LangChain was launched in October 2022 as an open source project by Harrison Chase, who was then working at the machine learning startup Robust Intelligence. The project quickly gained popularity, [3] with improvements from hundreds of contributors on GitHub, trending discussions on Twitter, lively activity on the project's Discord server, many YouTube tutorials, and meetups in San Francisco and London. In April 2023, LangChain incorporated, and the new startup raised over $20 million in funding at a valuation of at least $200 million from venture firm Sequoia Capital, a week after announcing a $10 million seed investment from Benchmark. [4] [5]
In the third quarter of 2023, the LangChain Expression Language (LCEL) was introduced, which provides a declarative way to define chains of actions. [6] [7]
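The idea of declaring a chain by composing steps can be illustrated with a small, library-free sketch. This is not LangChain's implementation or API: the `Step` class, the `fake_model` placeholder, and the sample steps are hypothetical names invented here, and the only assumption borrowed from LCEL is that steps are joined with the `|` operator and then invoked on an input.

```python
from __future__ import annotations

from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a chainable unit; not a LangChain class."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new Step that runs a, then feeds its output to b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        # Run the (possibly composed) step on a single input.
        return self.func(value)


# Three illustrative steps: build a prompt, call a fake model, parse the output.
prompt = Step(lambda topic: f"Summarize: {topic}")
fake_model = Step(lambda text: {"content": text.upper()})  # placeholder for an LLM call
parser = Step(lambda message: message["content"])

# The chain is declared as a pipeline; nothing runs until invoke() is called.
chain = prompt | fake_model | parser
result = chain.invoke("vector databases")
print(result)
```

Declaring the pipeline separately from running it is what makes the style declarative: the same `chain` object can be invoked repeatedly with different inputs, and a step can be swapped without touching the others.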
In October 2023, LangChain introduced LangServe, a deployment tool to host LCEL code as a production-ready API. [8]
LangChain's developers highlight the framework's applicability to use cases including chatbots, [9] retrieval-augmented generation, [10] document summarization, [11] and synthetic data generation. [12]
As of March 2023, LangChain included integrations with systems including Amazon, Google, and Microsoft Azure cloud storage; [13] API wrappers for news, movie information, and weather; Bash for summarization, syntax and semantics checking, and execution of shell scripts; multiple web scraping subsystems and templates; few-shot learning prompt generation support; finding and summarizing "todo" tasks in code; Google Drive documents, spreadsheets, and presentations summarization, extraction, and creation; Google Search and Microsoft Bing web search; [14] OpenAI, Anthropic, and Hugging Face language models; iFixit repair guides and wikis search and summarization; MapReduce for question answering, combining documents, and question generation; N-gram overlap scoring; PyPDF, pdfminer, fitz, and pymupdf for PDF file text extraction and manipulation; Python and JavaScript code generation, analysis, and debugging; Milvus vector database [15] to store and retrieve vector embeddings; Weaviate vector database [16] to cache embedding and data objects; Redis cache database storage; Python RequestsWrapper and other methods for API requests; SQL and NoSQL databases including JSON support; Streamlit, including for logging; text mapping for k-nearest neighbors search; time zone conversion and calendar operations; tracing and recording stack symbols in threaded and asynchronous subprocess runs; and the Wolfram Alpha website and SDK. [17] As of April 2023, it can read from more than 50 document types and data sources. [18]
Tool name | Account required? | API key required? | Licensing | Features | Documentation URL |
---|---|---|---|---|---|
Alpha Vantage | No | Yes | Proprietary | Financial data, analytics | https://python.langchain.com/docs/integrations/tools/alpha_vantage |
Apify | No | Yes | Commercial | Web scraping, automation | https://python.langchain.com/docs/integrations/providers/apify/ |
ArXiv | No | No | Open source | Scientific papers, research | https://python.langchain.com/docs/integrations/tools/arxiv |
AWS Lambda | Yes | Yes | Proprietary | Serverless computing | https://python.langchain.com/docs/integrations/tools/awslambda |
Bash | No | No | Open source | Shell environment access | https://python.langchain.com/docs/integrations/tools/bash |
Bearly Code Interpreter | No | Yes | Commercial | Remote Python code execution | https://python.langchain.com/docs/integrations/tools/bearly |
Bing Search | No | Yes | Proprietary | Search engine | https://python.langchain.com/docs/integrations/tools/bing_search |
Brave Search | No | No | Open source | Privacy-focused search | https://python.langchain.com/docs/integrations/tools/brave_search |
ChatGPT Plugins | No | Yes | Proprietary | ChatGPT | https://python.langchain.com/docs/integrations/tools/chatgpt_plugins |
Connery | No | Yes | Commercial | API actions | https://python.langchain.com/docs/integrations/tools/connery |
Dall-E Image Generator | No | Yes | Proprietary | Text-to-image generation | https://python.langchain.com/docs/integrations/tools/dalle_image_generator |
DataForSEO | No | Yes | Commercial | SEO data, analytics | https://python.langchain.com/docs/integrations/tools/dataforseo |
DuckDuckGo Search | No | No | Open source | Privacy-focused search | https://python.langchain.com/docs/integrations/tools/ddg |
E2B Data Analysis | No | No | Open source | Data analysis | https://python.langchain.com/docs/integrations/tools/e2b_data_analysis |
Eden AI | No | Yes | Commercial | AI tools, APIs | https://python.langchain.com/docs/integrations/tools/edenai_tools |
Eleven Labs Text2Speech | No | Yes | Commercial | Text-to-speech | https://python.langchain.com/docs/integrations/tools/eleven_labs_tts |
Exa Search | No | Yes | Commercial | Web search | https://python.langchain.com/docs/integrations/tools/exa_search |
File System | No | No | Open source | File system interaction | https://python.langchain.com/docs/integrations/tools/filesystem |
Golden Query | No | Yes | Commercial | Natural language queries | https://python.langchain.com/docs/integrations/tools/golden_query |
Google Cloud Text-to-Speech | Yes | Yes | Proprietary | Text-to-speech | https://python.langchain.com/docs/integrations/tools/google_cloud_texttospeech |
Google Drive | Yes | Yes | Proprietary | Google Drive access | https://python.langchain.com/docs/integrations/tools/google_drive |
Google Finance | Yes | Yes | Proprietary | Financial data | https://python.langchain.com/docs/integrations/tools/google_finance |
Google Jobs | Yes | Yes | Proprietary | Job search | https://python.langchain.com/docs/integrations/tools/google_jobs |
Google Lens | Yes | Yes | Proprietary | Visual search, recognition | https://python.langchain.com/docs/integrations/tools/google_lens |
Google Places | Yes | Yes | Proprietary | Location-based services | https://python.langchain.com/docs/integrations/tools/google_places |
Google Scholar | Yes | Yes | Proprietary | Scholarly article search | https://python.langchain.com/docs/integrations/tools/google_scholar |
Google Search | Yes | Yes | Proprietary | Search engine | https://python.langchain.com/docs/integrations/tools/google_search |
Google Serper | No | Yes | Commercial | SERP scraping | https://python.langchain.com/docs/integrations/tools/google_serper |
Google Trends | Yes | Yes | Proprietary | Trend data | https://python.langchain.com/docs/integrations/tools/google_trends |
Gradio | No | No | Open source | Machine learning UIs | https://python.langchain.com/docs/integrations/tools/gradio_tools |
GraphQL | No | No | Open source | API queries | https://python.langchain.com/docs/integrations/tools/graphql |
HuggingFace Hub | No | No | Open source | Hugging Face models, datasets | https://python.langchain.com/docs/integrations/tools/huggingface_tools |
Human as a tool | No | No | N/A | Human input | https://python.langchain.com/docs/integrations/tools/human_tools |
IFTTT WebHooks | No | Yes | Commercial | Web service automation | https://python.langchain.com/docs/integrations/tools/ifttt |
Ionic Shopping | No | Yes | Commercial | Shopping | https://python.langchain.com/docs/integrations/tools/ionic_shopping |
Lemon Agent | No | Yes | Commercial | Lemon AI interaction | https://python.langchain.com/docs/integrations/tools/lemonai |
Memorize | No | No | Open source | Fine-tune LLM to memorize information using unsupervised learning | https://python.langchain.com/docs/integrations/tools/memorize |
Nuclia | No | Yes | Commercial | Indexing of unstructured data | https://python.langchain.com/docs/integrations/tools/nuclia |
OpenWeatherMap | No | Yes | Commercial | Weather data | https://python.langchain.com/docs/integrations/tools/openweathermap |
Polygon Stock Market API | No | Yes | Commercial | Stock market data | https://python.langchain.com/docs/integrations/tools/polygon |
PubMed | No | No | Open source | Biomedical literature | https://python.langchain.com/docs/integrations/tools/pubmed |
Python REPL | No | No | Open source | Python shell | https://python.langchain.com/docs/integrations/tools/python |
Reddit Search | No | No | Open source | Reddit search | https://python.langchain.com/docs/integrations/tools/reddit_search |
Requests | No | No | Open source | HTTP requests | https://python.langchain.com/docs/integrations/tools/requests |
SceneXplain | No | No | Open source | Model explanations | https://python.langchain.com/docs/integrations/tools/sceneXplain |
Search | No | No | Open source | Query various search services | https://python.langchain.com/docs/integrations/tools/search_tools |
SearchApi | No | Yes | Commercial | Query various search services | https://python.langchain.com/docs/integrations/tools/searchapi |
SearxNG | No | No | Open source | Privacy-focused search | https://python.langchain.com/docs/integrations/tools/searx_search |
Semantic Scholar API | No | No | Open source | Academic paper search | https://python.langchain.com/docs/integrations/tools/semanticscholar |
SerpAPI | No | Yes | Commercial | Search engine results page scraping | https://python.langchain.com/docs/integrations/tools/serpapi |
StackExchange | No | No | Open source | Stack Exchange access | https://python.langchain.com/docs/integrations/tools/stackexchange |
Tavily Search | No | Yes | Commercial | Question answering | https://python.langchain.com/docs/integrations/tools/tavily_search |
Twilio | No | Yes | Commercial | Communication APIs | https://python.langchain.com/docs/integrations/tools/twilio |
Wikidata | No | No | Open source | Structured data access | https://python.langchain.com/docs/integrations/tools/wikidata |
Wikipedia | No | No | Open source | Wikipedia access | https://python.langchain.com/docs/integrations/tools/wikipedia |
Wolfram Alpha | No | Yes | Proprietary | Computational knowledge | https://python.langchain.com/docs/integrations/tools/wolfram_alpha |
Yahoo Finance News | No | Yes | Commercial | Financial news | https://python.langchain.com/docs/integrations/tools/yahoo_finance_news |
YouTube | No | Yes | Commercial | YouTube access | https://python.langchain.com/docs/integrations/tools/youtube |
Zapier Natural Language Actions | No | Yes | Commercial | Workflow automation | https://python.langchain.com/docs/integrations/tools/zapier |