Searx

Developer(s) Adam Tauber (alias asciimoo) [1]
Initial release January 22, 2014 [2]
Stable release 1.0.0 / 27 March 2021 [3]
Repository github.com/searx/searx
Written in Python
Type Metasearch engine
License Free software (AGPLv3)
Website searx.me

Searx ( /sɜːrks/ ) is a free metasearch engine, [4] available under the GNU Affero General Public License version 3, with the aim of protecting the privacy of its users. [5] [6] [7] To this end, searx does not share users' IP addresses or search history with the search engines from which it gathers results. Tracking cookies served by the search engines are blocked, preventing user-profiling-based results modification. [8] [9] By default, searx queries are submitted via HTTP POST, to prevent users' query keywords from appearing in webserver logs. [10] Searx was inspired by the Seeks project, [11] though it does not implement Seeks' peer-to-peer user-sourced results ranking.

Each search result is given as a direct link to the respective site, rather than a tracked redirect link as used by Google. In addition, when available, these direct links are accompanied by "cached" and/or "proxied" links that allow viewing results pages without actually visiting the sites in question. The "cached" links point to saved versions of a page on archive.org, while the "proxied" links allow viewing the current live page via a searx-based web proxy. In addition to the general search, the engine also features tabs to search within specific domains: files, images, IT, maps, music, news, science, social media, and videos. [12] [13]

There are many public user-run searx instances, [14] some of which are available as Tor hidden services. [14] "Meta-searx" sites query a different random instance with each search. [14] A public API is available for searx, [15] [16] as well as Firefox search provider plugins. [17]
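As a rough illustration of the API and of the POST convention mentioned earlier, the sketch below builds a search request without sending it. The instance URL `https://searx.example.org` is a placeholder, and the helper name `build_search_request` is hypothetical. The `/search` path and the `q` and `format` parameters follow the searx Search API documentation, but whether a particular public instance enables JSON output is decided by its administrator.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(instance_url, query, fmt="json"):
    """Build (but do not send) a POST request for an instance's /search endpoint.

    Putting the query in the request body rather than the URL keeps the
    keywords out of ordinary web server access logs.
    """
    body = urlencode({"q": query, "format": fmt}).encode("utf-8")
    return Request(instance_url.rstrip("/") + "/search", data=body, method="POST")


# Placeholder instance URL; replace with a real instance before sending.
req = build_search_request("https://searx.example.org", "japanese cow")
print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen(req)`) is left out on purpose, since it depends on the chosen instance being reachable and permitting API use.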

Search engines and other settings

Across all categories, searx can fetch search results from about 82 different engines. These include major search engines and site-specific searches such as Bing, Google, Reddit, Wikipedia, Yahoo, and Yandex. [18] The engines used for each search category can be set via a "preferences" interface. These settings are saved in a cookie in the user's browser rather than on the server because, for privacy reasons, searx does not implement a user login model. Other settings, such as the search interface language and the search results language (over 20 languages are available), can be set the same way. [10]

Independently of the preferences cookie, each query can override the engines used, the search categories selected, and/or the languages searched by placing one or more textual search operators before the search keywords. [19]

The ! and ? operators can be specified more than once to select multiple categories or engines, for example !google !deviantart ?images :japanese cow.
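The prefix syntax in the example above can be illustrated with a small parser. This is only a sketch of the idea, not searx's own query parser: the function name `split_operators` is hypothetical, and it simply groups leading tokens by their operator character without deciding whether a name refers to an engine or a category.

```python
def split_operators(raw_query):
    """Split leading !, ? and : operators from the search keywords."""
    parsed = {"!": [], "?": [], ":": []}
    tokens = raw_query.split()
    index = 0
    # Operators are only recognised before the keywords, so stop at the
    # first token that does not start with an operator character.
    while (
        index < len(tokens)
        and len(tokens[index]) > 1
        and tokens[index][0] in parsed
    ):
        token = tokens[index]
        parsed[token[0]].append(token[1:])
        index += 1
    return parsed, " ".join(tokens[index:])


operators, keywords = split_operators("!google !deviantart ?images :japanese cow")
print(operators, keywords)
```

For the example query, the `!` list holds `google` and `deviantart`, the `?` list holds `images`, the `:` list holds `japanese`, and the remaining keyword is `cow`.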

Instances

Any user may run their own instance of searx, [20] [21] [22] which can be done to maximize privacy, to avoid congestion on public instances, to preserve customized settings even if browser cookies are cleared, to allow auditing of the source code being run, etc. [23] [24] [25] Users may include their searx instances on the editable list of all public instances, or keep them private. [18] [23] It is also possible to add custom search engines to a self-hosted instance that are not available on the public instances. [26]
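To give a feel for what adding a custom engine to a self-hosted instance can involve, here is a minimal sketch of an engine module. It assumes the convention in which an engine is a Python module exposing `request` and `response` functions; that convention should be checked against the Engine overview documentation for the installed version. The site `https://example.org/api/search` and its JSON fields (`items`, `link`, `name`, `summary`) are entirely hypothetical.

```python
import json
from urllib.parse import urlencode

# Module-level settings (assumed convention for engine modules).
categories = ["general"]
paging = True

# Hypothetical JSON search endpoint, used for illustration only.
base_url = "https://example.org/api/search"


def request(query, params):
    """Fill in the URL that should be fetched for this query and page."""
    params["url"] = base_url + "?" + urlencode({"q": query, "page": params["pageno"]})
    return params


def response(resp):
    """Convert the hypothetical site's JSON reply into result dictionaries."""
    results = []
    for item in json.loads(resp.text).get("items", []):
        results.append(
            {
                "url": item["link"],
                "title": item["name"],
                "content": item.get("summary", ""),
            }
        )
    return results
```

The module would additionally need to be registered in the instance's settings before it is used; the exact settings entry is not shown here because it depends on the installed version.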

Another reason to use different searx instances, and/or to run one's own, is that as of 2019, Google has begun to block some instances, including some of the IP addresses used by searx.me (former instance run by the developer), from querying it, resulting in a "google (unexpected crash: CAPTCHA required)" error. [27] In response, some instances have been modified to silently skip trying to search with Google, even when it's the only engine specified. [28] [29]
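One way a client could cope with blocked instances is to try several instances in random order, much like the "meta-searx" sites described earlier. The sketch below shows that idea only; `EngineBlocked`, `search_with_fallback`, and the injected `search` callable are hypothetical names, and the instance URLs are placeholders.

```python
import random


class EngineBlocked(Exception):
    """Raised by the (hypothetical) search callable when an instance is blocked."""


def search_with_fallback(instances, query, search, rng=random):
    """Try instances in random order and return the first successful result."""
    candidates = list(instances)
    rng.shuffle(candidates)
    errors = {}
    for instance in candidates:
        try:
            return instance, search(instance, query)
        except EngineBlocked as exc:
            # Remember why this instance failed and move on to the next one.
            errors[instance] = str(exc)
    raise RuntimeError(f"all instances failed: {errors}")


def fake_search(instance, query):
    """Stand-in for a real network call, so the sketch runs offline."""
    if instance == "https://blocked.example.org":
        raise EngineBlocked("CAPTCHA required")
    return [f"result for {query} from {instance}"]


instance, results = search_with_fallback(
    ["https://blocked.example.org", "https://ok.example.org"], "cow", fake_search
)
print(instance, results)
```

Passing the search function in as a parameter keeps the fallback logic separate from any particular HTTP client and makes it easy to exercise without network access.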

References

  1. "asciimoo (Adam Tauber)". GitHub.
  2. Tauber, Adam. "searx: A privacy-respecting, hackable metasearch engine" via PyPI.
  3. "Release 1.0.0".
  4. Kühnast, Charly. "Peppered with Hits » Linux Magazine". Linux Magazine. Retrieved 2017-08-31.
  5. Bradbury, Danny (August 10, 2017). "Self-hosted search option is a new approach to bursting the filter bubble". Naked Security. Archived from the original on September 4, 2017. Retrieved August 30, 2017.
  6. Zak, Robert (April 3, 2017). "What Is the Best Search Engine for Privacy?". Make Tech Easier. Archived from the original on July 3, 2018. Retrieved August 30, 2017.
  7. Sonmez, John (December 22, 2014). "Searx: self-hosted web metasearch engine". Tuxdiary. Retrieved 2017-08-31.
  8. administrator, Acc. "Як захистити свої дані в інтернеті: 11 корисних додатків" [How to protect your data on the Internet: 11 useful apps]. Новини АСС (in Ukrainian). Retrieved 2017-08-31.
  9. "Searx: Die konfigurierbare Suchmaschine, die deine Privatsphäre respektiert" [Searx: The configurable search engine that respects your privacy]. t3n News (in German). Retrieved 2017-08-31.
  10. "preferences - searx.me". searx.me. Archived from the original on 2018-03-20. Retrieved 2017-09-25.
  11. "about.html". GitHub. Retrieved 2020-05-23.
  12. "A Primer on Staying Secure and Anonymous on the Dark Web". TechSpot. Retrieved 2017-08-30.
  13. Weisensee, Jan (2016-09-07). "Searx 0.10.0: Die eigene Suchmaschine auf einem Raspberry Pi" [Searx 0.10.0: Your own search engine on a Raspberry Pi]. golem.de (in German). Archived from the original on 2020-08-07. Retrieved 2017-08-31.
  14. "Public Searx instances". searx.space.
  15. "Search API — searx 0.12.0 documentation". searx.github.io. Retrieved 2017-08-31.
  16. Seitz, Justin (2017-04-18). "Building a Keyword Monitoring Pipeline with Python, Pastebin and Searx | Automating OSINT Blog". www.automatingosint.com. Retrieved 2017-08-31.
  17. "Search results for "searx" – Add-ons for Firefox (en-US)". addons.mozilla.org. Retrieved 2019-07-15.
  18. Tauber, Adam (2017-08-30). "searx: Privacy-respecting metasearch engine". Retrieved 2017-08-31.
  19. "Search syntax — searx 0.12.0 documentation". searx.github.io. Retrieved 2017-08-30.
  20. "My Searx instance - Logan Marchione". Logan Marchione. 2015-10-18. Retrieved 2017-08-31.
  21. "New fast and private searx instance in Europe for private websearches • r/privacy". Reddit. Retrieved 2017-08-31.
  22. "How to setup your own privacy respecting search engine in a couple of hours with a free ssl certificate • r/privacytoolsIO". Reddit. Retrieved 2017-08-31.
  23. "Why use a private instance? — searx 0.12.0 documentation". searx.github.io. Retrieved 2017-08-31.
  24. "Privacy advantages of running my own searx instance • r/privacytoolsIO". Reddit. Retrieved 2017-08-31.
  25. "Searx.me is overloaded. Privacytools.io should link to just the instances page or randomize. • r/privacytoolsIO". Reddit. Retrieved 2017-08-31.
  26. "Engine overview — searx 0.12.0 documentation". searx.github.io. Retrieved 2017-08-31.
  27. "Google Captcha". GitHub issues. 2016-10-12. Retrieved 2020-05-23.
  28. "!google cow - searx". searx.info. Retrieved 2019-07-15. Sorry! we didn't find any results. Please use another query or search in more categories.
  29. "!google cow - searx". search.disroot.org. Retrieved 2019-07-15. Sorry! we didn't find any results. Please use another query or search in more categories.
