LangChain

Developer(s)	Harrison Chase
Initial release	October 2022
Stable release	0.1.16 / 11 April 2024
Repository	github.com/langchain-ai/langchain
Written in	Python and JavaScript
Type	Software framework for large language model application development
License	MIT License
Website	LangChain.com

Last updated September 29, 2024

LangChain is a software framework that facilitates the integration of large language models (LLMs) into applications. As a language model integration framework, LangChain's use cases largely overlap with those of language models in general, including document analysis and summarization, chatbots, and code analysis.^[2]

History

LangChain was launched in October 2022 as an open source project by Harrison Chase, while he was working at the machine learning startup Robust Intelligence. The project quickly gained popularity,^[3] with improvements from hundreds of contributors on GitHub, trending discussions on Twitter, lively activity on the project's Discord server, many YouTube tutorials, and meetups in San Francisco and London. In April 2023, LangChain incorporated, and the new startup raised over $20 million in funding at a valuation of at least $200 million from venture firm Sequoia Capital, a week after announcing a $10 million seed investment from Benchmark.^[4]^[5]

In the third quarter of 2023, the LangChain Expression Language (LCEL) was introduced, which provides a declarative way to define chains of actions.^[6]^[7]
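To illustrate what a declarative chain of actions can look like, the following is a minimal plain-Python sketch of composing steps with the `|` operator. It does not use LangChain itself: the `Step` class and the three example steps are hypothetical names invented for this illustration, and the language model call is replaced by a stub function.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a composable unit of work (not a LangChain class)."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new Step that runs a, then feeds its output to b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        # Run the (possibly composed) step on a single input.
        return self.func(value)


# Three illustrative steps: format a prompt, call a stubbed "model", parse the output.
format_prompt = Step(lambda topic: f"Write one sentence about {topic}.")
fake_model = Step(lambda prompt: {"text": prompt.upper()})  # stub, no real LLM call
parse_output = Step(lambda response: response["text"])

# The chain is declared as a pipeline first and only executed when invoked.
chain = format_prompt | fake_model | parse_output
result = chain.invoke("llamas")
print(result)
```

The point of the sketch is the separation between describing the pipeline (`chain = ...`) and running it (`chain.invoke(...)`); the real LCEL interface may differ in its class names and capabilities.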

In October 2023, LangChain introduced LangServe, a deployment tool to host LCEL code as a production-ready API.^[8]

Capabilities

LangChain's developers highlight the framework's applicability to use cases including chatbots,^[9] retrieval-augmented generation,^[10] document summarization,^[11] and synthetic data generation.^[12]

As of March 2023, LangChain included integrations with Amazon, Google, and Microsoft Azure cloud storage;^[13] API wrappers for news, movie information, and weather; Bash for summarization, syntax and semantics checking, and execution of shell scripts; multiple web scraping subsystems and templates; few-shot learning prompt generation support; finding and summarizing "todo" tasks in code; summarization, extraction, and creation of Google Drive documents, spreadsheets, and presentations; Google Search and Microsoft Bing web search;^[14] OpenAI, Anthropic, and Hugging Face language models; search and summarization of iFixit repair guides and wikis; MapReduce for question answering, combining documents, and question generation; N-gram overlap scoring; PyPDF, pdfminer, fitz, and pymupdf for PDF file text extraction and manipulation; Python and JavaScript code generation, analysis, and debugging; the Milvus vector database^[15] to store and retrieve vector embeddings; the Weaviate vector database^[16] to cache embedding and data objects; Redis cache database storage; Python RequestsWrapper and other methods for API requests; SQL and NoSQL databases, including JSON support; Streamlit, including for logging; text mapping for k-nearest neighbors search; time zone conversion and calendar operations; tracing and recording stack symbols in threaded and asynchronous subprocess runs; and the Wolfram Alpha website and SDK.^[17] As of April 2023, it could read from more than 50 document types and data sources.^[18]
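Several of the integrations above revolve around storing vector embeddings and retrieving the closest ones with a k-nearest neighbors search. The following is a minimal sketch of that idea in plain Python, assuming tiny hand-written vectors in place of real embeddings. The function names and the in-memory `store` list are hypothetical and are not part of LangChain, Milvus, or Weaviate; the search is exhaustive rather than index-based.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def k_nearest(
    query: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[str]:
    """Return the texts of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vector), text) for text, vector in store]
    # Highest similarity first; every stored vector is compared (brute force).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "embeddings" invented for illustration; real ones come from a model.
store = [
    ("repair guide", [1.0, 0.0, 0.0]),
    ("weather report", [0.0, 1.0, 0.0]),
    ("movie review", [0.0, 0.0, 1.0]),
    ("phone repair tips", [0.9, 0.1, 0.0]),
]

nearest = k_nearest([1.0, 0.05, 0.0], store, k=2)
print(nearest)
```

In a retrieval-augmented generation setup, the texts returned by such a search would typically be placed into the prompt sent to the language model.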

LangChain tools


Tool name	Account required?	API key required?	Licensing	Features	Documentation URL
Alpha Vantage	No	Yes	Proprietary	Financial data, analytics	https://python.langchain.com/docs/integrations/tools/alpha_vantage
Apify	No	Yes	Commercial	Web scraping, automation	https://python.langchain.com/docs/integrations/providers/apify/
ArXiv	No	No	Open source	Scientific papers, research	https://python.langchain.com/docs/integrations/tools/arxiv
AWS Lambda	Yes	Yes	Proprietary	Serverless computing	https://python.langchain.com/docs/integrations/tools/awslambda
Bash	No	No	Open source	Shell environment access	https://python.langchain.com/docs/integrations/tools/bash
Bearly Code Interpreter	No	Yes	Commercial	Remote Python code execution	https://python.langchain.com/docs/integrations/tools/bearly
Bing Search	No	Yes	Proprietary	Search engine	https://python.langchain.com/docs/integrations/tools/bing_search
Brave Search	No	Yes	Proprietary	Privacy-focused search	https://python.langchain.com/docs/integrations/tools/brave_search
ChatGPT Plugins	No	Yes	Proprietary	ChatGPT	https://python.langchain.com/docs/integrations/tools/chatgpt_plugins
Connery	No	Yes	Commercial	API actions	https://python.langchain.com/docs/integrations/tools/connery
Dall-E Image Generator	No	Yes	Proprietary	Text-to-image generation	https://python.langchain.com/docs/integrations/tools/dalle_image_generator
DataForSEO	No	Yes	Commercial	SEO data, analytics	https://python.langchain.com/docs/integrations/tools/dataforseo
DuckDuckGo Search	No	No	Open source	Privacy-focused search	https://python.langchain.com/docs/integrations/tools/ddg
E2B Data Analysis	No	No	Open source	Data analysis	https://python.langchain.com/docs/integrations/tools/e2b_data_analysis
Eden AI	No	Yes	Commercial	AI tools, APIs	https://python.langchain.com/docs/integrations/tools/edenai_tools
Eleven Labs Text2Speech	No	Yes	Commercial	Text-to-speech	https://python.langchain.com/docs/integrations/tools/eleven_labs_tts
Exa Search	No	Yes	Commercial	Web search	https://python.langchain.com/docs/integrations/tools/exa_search
File System	No	No	Open source	File system interaction	https://python.langchain.com/docs/integrations/tools/filesystem
Golden Query	No	Yes	Commercial	Natural language queries	https://python.langchain.com/docs/integrations/tools/golden_query
Google Cloud Text-to-Speech	Yes	Yes	Proprietary	Text-to-speech	https://python.langchain.com/docs/integrations/tools/google_cloud_texttospeech
Google Drive	Yes	Yes	Proprietary	Google Drive access	https://python.langchain.com/docs/integrations/tools/google_drive
Google Finance	Yes	Yes	Proprietary	Financial data	https://python.langchain.com/docs/integrations/tools/google_finance
Google Jobs	Yes	Yes	Proprietary	Job search	https://python.langchain.com/docs/integrations/tools/google_jobs
Google Lens	Yes	Yes	Proprietary	Visual search, recognition	https://python.langchain.com/docs/integrations/tools/google_lens
Google Places	Yes	Yes	Proprietary	Location-based services	https://python.langchain.com/docs/integrations/tools/google_places
Google Scholar	Yes	Yes	Proprietary	Scholarly article search	https://python.langchain.com/docs/integrations/tools/google_scholar
Google Search	Yes	Yes	Proprietary	Search engine	https://python.langchain.com/docs/integrations/tools/google_search
Google Serper	No	Yes	Commercial	SERP scraping	https://python.langchain.com/docs/integrations/tools/google_serper
Google Trends	Yes	Yes	Proprietary	Trend data	https://python.langchain.com/docs/integrations/tools/google_trends
Gradio	No	No	Open source	Machine learning UIs	https://python.langchain.com/docs/integrations/tools/gradio_tools
GraphQL	No	No	Open source	API queries	https://python.langchain.com/docs/integrations/tools/graphql
HuggingFace Hub	No	No	Open source	Hugging Face models, datasets	https://python.langchain.com/docs/integrations/tools/huggingface_tools
Human as a tool	No	No	N/A	Human input	https://python.langchain.com/docs/integrations/tools/human_tools
IFTTT WebHooks	No	Yes	Commercial	Web service automation	https://python.langchain.com/docs/integrations/tools/ifttt
Ionic Shopping	No	Yes	Commercial	Shopping	https://python.langchain.com/docs/integrations/tools/ionic_shopping
Lemon Agent	No	Yes	Commercial	Lemon AI interaction	https://python.langchain.com/docs/integrations/tools/lemonai
Memorize	No	No	Open source	Fine-tune LLM to memorize information using unsupervised learning	https://python.langchain.com/docs/integrations/tools/memorize
Nuclia	No	Yes	Commercial	Indexing of unstructured data	https://python.langchain.com/docs/integrations/tools/nuclia
OpenWeatherMap	No	Yes	Commercial	Weather data	https://python.langchain.com/docs/integrations/tools/openweathermap
Polygon Stock Market API	No	Yes	Commercial	Stock market data	https://python.langchain.com/docs/integrations/tools/polygon
PubMed	No	No	Open source	Biomedical literature	https://python.langchain.com/docs/integrations/tools/pubmed
Python REPL	No	No	Open source	Python shell	https://python.langchain.com/docs/integrations/tools/python
Reddit Search	No	No	Open source	Reddit search	https://python.langchain.com/docs/integrations/tools/reddit_search
Requests	No	No	Open source	HTTP requests	https://python.langchain.com/docs/integrations/tools/requests
SceneXplain	No	Yes	Commercial	Image captioning	https://python.langchain.com/docs/integrations/tools/sceneXplain
Search	No	No	Open source	Query various search services	https://python.langchain.com/docs/integrations/tools/search_tools
SearchApi	No	Yes	Commercial	Query various search services	https://python.langchain.com/docs/integrations/tools/searchapi
SearxNG	No	No	Open source	Privacy-focused search	https://python.langchain.com/docs/integrations/tools/searx_search
Semantic Scholar API	No	No	Open source	Academic paper search	https://python.langchain.com/docs/integrations/tools/semanticscholar
SerpAPI	No	Yes	Commercial	Search engine results page scraping	https://python.langchain.com/docs/integrations/tools/serpapi
StackExchange	No	No	Open source	Stack Exchange access	https://python.langchain.com/docs/integrations/tools/stackexchange
Tavily Search	No	Yes	Commercial	Question answering	https://python.langchain.com/docs/integrations/tools/tavily_search
Twilio	No	Yes	Commercial	Communication APIs	https://python.langchain.com/docs/integrations/tools/twilio
Wikidata	No	No	Open source	Structured data access	https://python.langchain.com/docs/integrations/tools/wikidata
Wikipedia	No	No	Open source	Wikipedia access	https://python.langchain.com/docs/integrations/tools/wikipedia
Wolfram Alpha	No	Yes	Proprietary	Computational knowledge	https://python.langchain.com/docs/integrations/tools/wolfram_alpha
Yahoo Finance News	No	Yes	Commercial	Financial news	https://python.langchain.com/docs/integrations/tools/yahoo_finance_news
YouTube	No	Yes	Commercial	YouTube access	https://python.langchain.com/docs/integrations/tools/youtube
Zapier Natural Language Actions	No	Yes	Commercial	Workflow automation	https://python.langchain.com/docs/integrations/tools/zapier

References

[1] "Release 0.1.16". 11 April 2024. Retrieved 23 April 2024.
[2] Buniatyan, Davit (2023). "Code Understanding Using LangChain". Activeloop.
[3] Auffarth, Ben (2023). Generative AI with LangChain. Birmingham: Packt Publishing. p. 83. ISBN 9781835083468.
[4] Palazzolo, Stephanie (2023-04-13). "AI startup LangChain taps Sequoia to lead funding round at a valuation of at least $200 million". Business Insider. Archived from the original on 2023-04-18. Retrieved 2023-04-18.
[5] Griffith, Erin; Metz, Cade (2023-03-14). "'Let 1,000 Flowers Bloom': A.I. Funding Frenzy Escalates". The New York Times. ISSN 0362-4331. Archived from the original on 2023-04-18. Retrieved 2023-04-18.
[6] Mansurova, Mariya (2023-10-30). "Topic Modelling in production: Leveraging LangChain to move from ad-hoc Jupyter Notebooks to production modular service". towardsdatascience.com. Retrieved 2024-07-08.
[7] "LangChain Expression Language". langchain.dev. 2023-08-01. Retrieved 2024-07-08.
[8] "Introducing LangServe, the best way to deploy your LangChains". LangChain Blog. 2023-10-12. Retrieved 2023-10-17.
[9] "Chatbots | 🦜️🔗 Langchain". python.langchain.com. Retrieved 2023-11-26.
[10] "Retrieval-augmented generation (RAG) | 🦜️🔗 Langchain". python.langchain.com. Retrieved 2023-11-26.
[11] "Summarization | 🦜️🔗 Langchain". python.langchain.com. Retrieved 2023-11-26.
[12] "Synthetic data generation | 🦜️🔗 Langchain". python.langchain.com. Retrieved 2023-11-26.
[13] "Azure Cognitive Search and LangChain: A Seamless Integration for Enhanced Vector Search Capabilities". TECHCOMMUNITY.MICROSOFT.COM. Retrieved 2024-08-31.
[14] "Best Alternative AI Content Strategies and LLM Frameworks". Medium. 2024-08-31. Retrieved 2024-08-31.
[15] "Milvus — LangChain". python.langchain.com. Retrieved 2023-10-29.
[16] "Weaviate". python.langchain.com. Retrieved 2024-01-17.
[17] Hug, Daniel Patrick (2023-03-08). "Hierarchical topic tree of LangChain's integrations" (PDF). GitHub. Archived from the original on 2023-04-29. Retrieved 2023-04-18.
[18] "Document Loaders — LangChain 0.0.142". python.langchain.com. Archived from the original on 2023-04-18. Retrieved 2023-04-18.

External links

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
