Perplexity AI

Perplexity AI, Inc.
Company type Private
Industry Artificial intelligence
Genre Search engine
Founded August 2022
Founders
  • Aravind Srinivas
  • Andy Konwinski
  • Denis Yarats
  • Johnny Ho
Headquarters San Francisco, California, US
Key people Aravind Srinivas (CEO)
Number of employees 100 [1] [2] (2024)
Website perplexity.ai
Screenshot of Perplexity (2024)

Perplexity AI is a conversational search engine that uses large language models (LLMs) to answer queries using sources from the web and cites links within the text response. [3] [4] Its developer, Perplexity AI, Inc., is based in San Francisco, California. [5]

History

Aravind Srinivas at TechCrunch Disrupt 2024

Perplexity was founded in 2022 by Aravind Srinivas, Andy Konwinski, Denis Yarats and Johnny Ho, engineers with backgrounds in back-end systems, artificial intelligence (AI) and machine learning.

Services

Perplexity works on a freemium model. It also has an enterprise version of its product. [2]

Free plan

The free model uses the company's standalone LLM based on GPT-3.5 with browsing. [7] [8]

It uses the context of the user queries to provide a personalized search result. Perplexity summarizes the search results and produces a text with inline citations. [8]

Perplexity also offers Pages, a feature that generates customizable web pages and research presentations from user prompts. [9]

Perplexity Pro

Shopping hub

On 18 November 2024, Perplexity, whose backers include Jeff Bezos and the AI chipmaker Nvidia, launched a shopping hub to attract users. The hub shows product cards with relevant items in response to shopping-related questions. [12]

Internal Knowledge Search

Internal Knowledge Search enables Pro and Enterprise Pro users to search across web content and internal documents simultaneously. Users can upload and search through Excel, Word, PDF, and other common file formats. Enterprise Pro users have a limit of 500 files for upload and indexing. [13]

Finance

In October 2024, Perplexity introduced finance-related features, including lookups of stock prices and company earnings data. The tool provides real-time stock quotes and price tracking, industry peer comparisons and basic financial analysis tools. The platform sources its financial data from Financial Modeling Prep (FMP). [14] [15]

Spaces

Perplexity Spaces was released in October 2024 as an AI-powered collaboration hub. The platform allows users to create customized knowledge spaces that combine web searches with personal file integration. Users can upload up to 50 documents, with a size limit of 25 MB per file. [16]

As a business

As of 2024, Perplexity has raised $165 million in funding, valuing the company at over $1 billion. [2]

In December 2024, Perplexity closed a $500 million funding round that raised its valuation to $9 billion. [14] [17] [18]

In July 2024, Perplexity announced the launch of a new publishers' program to share ad revenue with partners. [19]

In August 2024, Perplexity announced plans to introduce ads on its search platform by the fourth quarter of 2024. [21] [22] Ads began appearing on the platform in November 2024. [20]

Notable investors [8] [23] [2]

  1. Jeff Bezos
  2. Nvidia
  3. Databricks
  4. Bessemer Venture Partners
  5. Susan Wojcicki
  6. Jeff Dean
  7. Yann LeCun
  8. Andrej Karpathy
  9. Nat Friedman
  10. Garry Tan

Controversies

Forbes

In June 2024, Forbes publicly criticized Perplexity for its use of Forbes content.

According to Forbes, Perplexity published a story which was largely copied from a proprietary Forbes article, without mentioning or prominently citing Forbes.

In response, Srinivas said that the feature had some "rough edges" and accepted feedback, but maintained that Perplexity only "aggregates" rather than plagiarizes information. [24] [25]

Wired

In June 2024, separate investigations by the magazine Wired and web developer Robb Knight found that Perplexity did not respect the robots.txt standard, which lets websites tell web crawlers which content they may not access, despite Perplexity reportedly claiming the opposite.

Perplexity publicly lists the IP address ranges and user agent strings of its web crawlers, but according to Wired and Knight, it used undisclosed IP addresses and spoofed user agent strings when ignoring robots.txt. [26] [27]
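Because robots.txt rules are matched against the user agent string that a crawler declares about itself, compliance is voluntary. The following is a minimal illustrative sketch using Python's standard `urllib.robotparser` module, not a description of how Perplexity or any publisher actually operates. The robots.txt contents and the example.com URL are hypothetical, and the crawler name `PerplexityBot` is used only as an assumed example of a declared user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve; not taken from any real site.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/news/story.html"  # hypothetical article URL
    # A crawler that declares the blocked name is refused.
    print(may_fetch(ROBOTS_TXT, "PerplexityBot", url))
    # The same request under a generic browser-style user agent
    # only has to satisfy the catch-all rules.
    print(may_fetch(ROBOTS_TXT, "Mozilla/5.0", url))
```

In this sketch the first check is refused and the second is allowed, which is why a crawler that changes its declared user agent can sidestep a rule aimed at it. The file expresses the site's wishes but cannot enforce them.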

Wired also stated that, in some cases, Perplexity may be summarizing:

"not actual news articles but reconstructions of what they say based on URLs and traces of them left in search engines like extracts and metadata, offering summaries purporting to be based on direct access to the relevant text." [26]

In response, Srinivas stated in a phone interview that:

"Perplexity is not ignoring the Robot Exclusions Protocol... We don't just rely on our own web crawlers, we rely on third-party web crawlers as well."

Srinivas explained that the web crawler identified by Wired was owned by a third-party provider. [28]

When asked whether Perplexity would cease scraping Wired content using third parties, Srinivas responded that "it's complicated." [28]

Amazon

Amazon Web Services, which hosts the Perplexity crawler, has a terms of service clause prohibiting its users from ignoring the robots.txt standard.

In June 2024, Amazon began a "routine" investigation into the company's use of Amazon Elastic Compute Cloud. [29]

Lawsuits

In October 2024, The New York Times (NYT) sent a cease-and-desist notice to Perplexity to stop accessing and using NYT content, claiming that Perplexity is violating its copyright by scraping data from its website. [30]

NYT is also suing OpenAI and Microsoft for copyright infringement for similarly using millions of its articles to train the large language models that power ChatGPT. [31]

The cease-and-desist notice sent by NYT lawyers read in part:

"Perplexity and its business partners have been unjustly enriched by using, without authorization, The Times's expressive, carefully written and researched, and edited journalism without a license." [32]

Perplexity said it planned to respond to the notice by October 30, 2024. [30]

The same month, Dow Jones and New York Post filed a lawsuit against Perplexity, alleging copyright infringement. The lawsuit also alleges that Perplexity attributed quotes to an article on F-16 jets for Ukraine that never appeared in the original article. [33]

References

  1. https://www.linkedin.com/pulse/linkedin-top-startups-2024-50-us-companies-rise-linkedin-news-hxote/
  2. Ghaffary, Shirin (April 23, 2024). "AI Search Startup Perplexity Valued at $1 Billion in Funding Round". Bloomberg News. Archived from the original on April 24, 2024.
  3. "What Is Perplexity AI? Understanding One Of Google's Biggest Search Engine Competitors".
  4. Singh, Shubham (January 6, 2024). "Perplexity AI raises $73.6M in funding round led by Nvidia, Bezos, now valued at $522M". Business Today. Retrieved January 12, 2024.
  5. "Google's latest rival: What is Perplexity AI and why is it causing so much controversy?".
  6. "AI-powered search engine Perplexity AI lands $26M, launches iOS app". TechCrunch. April 4, 2023. Archived from the original on March 5, 2024. Retrieved May 5, 2024.
  7. "Perplexity Free based on GPT-3.5". discord.com. Perplexity Community Moderator "IceLavaMan". Retrieved October 8, 2024.
  8. Wiggers, Kyle (January 4, 2024). "AI-powered search engine Perplexity AI, now valued at $520M, raises $73.6M". TechCrunch. Archived from the original on January 7, 2024. Retrieved January 7, 2024.
  9. David, Emilia (May 30, 2024). "Perplexity will research and write reports". The Verge. Archived from the original on June 20, 2024. Retrieved June 24, 2024.
  10. "Introducing Internal Knowledge Search and Spaces". Perplexity. October 17, 2024.
  11. "Startup Perplexity Challenges Google With AI Search". The Wall Street Journal. January 4, 2024. Archived from the original on January 10, 2024. Retrieved January 10, 2024.
  12. "AI startup Perplexity adds shopping features as search competition tightens". Reuters. November 18, 2024. Retrieved November 21, 2024.
  13. David, Emilia (October 17, 2024). "Perplexity lets you search your internal enterprise files and the web". VentureBeat. Retrieved December 18, 2024.
  14. "AI Startup Perplexity Closes Funding Round at $9 Billion Value". Bloomberg.com. December 18, 2024. Retrieved December 18, 2024.
  15. "Perplexity AI's new tool makes researching the stock market 'delightful'. Here's how". ZDNET. Retrieved December 18, 2024.
  16. "NotebookLM & Perplexity Spaces: All You Need to Know". Habr. October 30, 2024. Retrieved December 18, 2024.
  17. Singh, Jaspreet (November 6, 2024). "Perplexity raising new funds at $9 bln valuation, source says". Reuters.
  18. Field, Hayden (November 5, 2024). "Perplexity AI in final stages of raising $500 million round at $9 billion valuation". CNBC. Retrieved November 21, 2024.
  19. Robison, Kylie (July 30, 2024). "Perplexity is cutting checks to publishers following plagiarism accusations". The Verge. Retrieved August 4, 2024.
  20. Wiggers, Kyle (November 13, 2024). "Perplexity brings ads to its platform". TechCrunch. Retrieved November 25, 2024.
  21. Field, Hayden (August 22, 2024). "Perplexity AI plans to start running ads in fourth quarter as AI-assisted search gains popularity". CNBC. Retrieved November 25, 2024.
  22. "Perplexity AI to launch ads on search platform by fourth quarter". The Economic Times. August 23, 2024. ISSN 0013-0389. Retrieved November 25, 2024.
  23. "Announcing our series A funding round and mobile app launch". Perplexity.ai. April 28, 2023. Archived from the original on April 22, 2024. Retrieved April 24, 2024.
  24. O'Brien, Matt (June 15, 2024). "AI startup Perplexity wants to upend search business. News outlet Forbes says it's ripping them off". Associated Press. Archived from the original on June 20, 2024. Retrieved June 20, 2024.
  25. Lane, Randall (June 11, 2024). "Why Perplexity's Cynical Theft Represents Everything That Could Go Wrong With AI". Forbes. Retrieved June 20, 2024.
  26. Mehrotra, Dhruv; Marchman, Tim (June 19, 2024). "Perplexity Is a Bullshit Machine". Wired. Archived from the original on June 20, 2024. Retrieved June 20, 2024.
  27. "Perplexity AI Is Lying about Their User Agent". Robb Knight. June 15, 2024. Archived from the original on June 20, 2024. Retrieved June 20, 2024.
  28. Sullivan, Mark (June 21, 2024). "Perplexity CEO Aravind Srinivas responds to plagiarism and infringement accusations". Fast Company. Retrieved June 24, 2024.
  29. Mehrotra, Dhruv; Couts, Andrew (June 27, 2024). "Amazon Is Investigating Perplexity Over Claims of Scraping Abuse". Wired. Retrieved July 3, 2024.
  30. Davis, Wes (October 15, 2024). "The New York Times warns AI search engine Perplexity to stop using its content". The Verge. Retrieved October 17, 2024.
  31. Complaint, New York Times, Co. v. Microsoft Corp., No. 1:23-cv-11195 (S.D.N.Y. December 27, 2023).
  32. Bruell, Alexandra (October 15, 2024). "New York Times to Bezos-Backed AI Startup: Stop Using Our Stuff". The Wall Street Journal. Retrieved October 17, 2024.
  33. Bruell, Alexandra (October 21, 2024). "Wall Street Journal, New York Post Sue AI Startup Perplexity, Alleging 'Massive Freeriding'". The Wall Street Journal. Retrieved October 21, 2024.