Google Code Search

Type of site: Search engine
Available in: All languages
Owner: Google
URL: www.google.com/codesearch
Launched: October 5, 2006
Current status: Discontinued as of 15 January 2012

Google Code Search was a free beta product from Google that debuted in Google Labs on October 5, 2006. It allowed web users to search for open-source code on the Internet. Queries could be narrowed with the operators lang:, package:, license:, and file:.

The code available for searching came from archives in various formats, including .tar.gz, .tar.bz2, .tar, and .zip, as well as from CVS, Subversion, Git, and Mercurial repositories.

Google Code Search covered many open-source projects, and as such it differed from the "Code Search for Google Open source projects" that was released afterwards. [1] [2]

Regular expression engine

The site allowed the use of regular expressions in queries, which at that time was not offered by any other search engine for code.[citation needed] This made it resemble grep, but applied to the world's public code. The methodology employed, sometimes called trigram search, combined a trigram index with a custom-built, denial-of-service-resistant regular expression engine. The index narrowed the set of candidate files, and the regular expression engine then confirmed the actual matches. [3]
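The idea of pairing a trigram index with a regular expression matcher can be illustrated with a minimal sketch. It is not the Code Search implementation: the function names and sample documents are hypothetical, the required literal is supplied by hand rather than derived from the regular expression, and Python's backtracking `re` module stands in for a denial-of-service-resistant engine such as RE2.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(docs):
    """Map each trigram to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for gram in trigrams(text):
            index[gram].add(name)
    return index


def search(docs, index, pattern, required_literal):
    """Return the sorted names of documents matching the regex `pattern`.

    `required_literal` is a string that every match must contain. In this
    sketch the caller supplies it by hand (an assumption); a real engine
    would work out the required trigrams from the regex itself.
    """
    # Step 1: use the index to keep only documents with all required trigrams.
    candidates = set(docs)
    for gram in trigrams(required_literal):
        candidates &= index.get(gram, set())
    # Step 2: run the full regex only on the surviving candidates.
    regex = re.compile(pattern)
    return sorted(name for name in candidates if regex.search(docs[name]))


# Hypothetical sample corpus.
docs = {
    "a.c": "int main(void) { return 0; }",
    "b.py": "def main():\n    return 0",
    "c.go": "func helper() int { return 1 }",
}
index = build_index(docs)
hits = search(docs, index, r"main\((void)?\)", "main(")
print(hits)
```

In this sketch the file `c.go` is discarded by the index lookup alone, so the regular expression is never run against it. That pruning is what makes indexed search faster than scanning every file.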

In March 2010, the code of RE2, the regular expression engine used in Google Code Search, was made open source. [4]

Google Code Search supported POSIX extended regular expression syntax, excluding back-references, collating elements, and collation classes.

Languages not officially supported could be searched for using the file: operator to match the common file extensions for the language.
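A local analogue of this kind of extension filter can be sketched as follows. The sketch assumes the operator's argument is treated as a regular expression over the file path; the function name, paths, and pattern are hypothetical and only illustrate selecting a language by its common extensions.

```python
import re


def filter_by_file(paths, file_pattern):
    """Keep only the paths that match file_pattern (a regular expression)."""
    regex = re.compile(file_pattern)
    return [path for path in paths if regex.search(path)]


# Hypothetical file listing; Haskell sources commonly end in .hs or .lhs.
paths = ["src/Main.hs", "src/util.c", "lib/Parser.lhs", "README"]
haskell_files = filter_by_file(paths, r"\.l?hs$")
print(haskell_files)
```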

Discontinuation

In October 2011, Google announced that Code Search was to be shut down along with the Code Search API. [5] The service remained online until March 2013, [6] and it now returns a 404.

In January 2012, Russ Cox published an overview of the history and technical aspects of the tool, and open-sourced a basic implementation of similar functionality as a set of standalone programs that run fast indexed regular expression searches over local code. [7]

References

  1. "Code Search for Google open source projects". Google Open Source Blog. Retrieved 2020-04-01.
  2. "Google Open Source". cs.opensource.google. Retrieved 2020-04-01.
  3. Russ Cox (January 2012). "Regular Expression Matching with a Trigram Index (or: How Google Code Search Worked)". Archived from the original on 2012-01-28. Retrieved 2012-01-26.
  4. "RE2: a principled approach to regular expression matching". Archived from the original on 2016-09-27. Retrieved 2016-09-24.
  5. Horowitz, Bradley (2011-10-14). "Official Blog: A fall sweep". Googleblog.blogspot.com. Archived from the original on 2011-11-23. Retrieved 2013-07-09.
  6. "Replacement for Google Code Search?". Stack Overflow. Archived from the original on 2017-11-09. Retrieved 2016-07-25.
  7. codesearch on GitHub