JSFuck is an esoteric subset of JavaScript in which code is written using only six characters: `[`, `]`, `(`, `)`, `!`, and `+`. The name is derived from Brainfuck, an esoteric programming language that also uses a minimalistic alphabet of only punctuation. Unlike Brainfuck, which requires its own compiler or interpreter, JSFuck is valid JavaScript code, meaning that JSFuck programs can be run in any web browser or engine that interprets JavaScript. JSFuck is able to recreate all JavaScript functionality using such a limited set of characters because JavaScript is weakly typed and implicitly converts values between types, so that any expression can be evaluated as any type. [1]
In July 2009, Yosuke Hasegawa created a web application called jjencode, which could encode arbitrary JavaScript into an obfuscated form using only the 18 symbols `[]()!+,\"$.:;_{}~=`. [2] [3] In January 2010, an informal competition was held in the "Obfuscation" forum of the sla.ckers.org web application security site to bring the minimum number of required characters below the eight then in use, `[]()!+,/`. Contributors to the thread managed to eliminate the need for the `,` and `/` characters. [4] As of March 2010, an online encoder called JS-NoAlnum was available which used only the final set of six characters. [5] By the end of 2010, Hasegawa had made available a new encoder named JSF*ck, which also used only the minimum six characters. [6] [7] In 2012, Martin Kleppe created a "jsfuck" project on GitHub, [8] and a JSFuck.com website with a web app using that implementation of the encoder. [9]
JSFuck can be used to bypass detection of malicious code submitted on websites, e.g. in cross-site scripting (XSS) attacks. [10] Another potential use of JSFuck lies in code obfuscation. An optimized version of JSFuck has been used to encode jQuery, a JavaScript library, into a fully functional version written with just the six characters. [11]
JSFuck code is extremely "verbose": in JavaScript, the code `alert("Hello World!")`, which causes a pop-up window to open with the text "Hello World!", is 21 characters long. In JSFuck, the same code has a length of 4325 characters. [12] Certain single characters require far more than 1000 characters when expanded as JSFuck. This section offers an overview of how this expansion works.
The number 0 is created by `+[]`, where `[]` is the empty array and `+` is the unary plus, used to convert the right side to a numeric value (zero here). The number 1 is formed as `+!![]` or `+!+[]`, where the boolean value `true` (expressed as `!![]` or `!+[]` in JSFuck) is converted into the numeric value 1 by the prepended plus sign. The digits 2 to 9 are formed by summing `true` the appropriate number of times. For example, in JavaScript `true + true` = 2 and `true` = `!![]` = `!+[]`, hence 2 can be written as `!![]+!![]` or `!+[]+!+[]`. Other digits follow a similar pattern. Integers consisting of two or more digits are written, as a string, by concatenating 1-digit arrays with the plus operator. For example, the string `"10"` can be expressed in JavaScript as `[1] + [0]`. Replacing the digits with their respective JSFuck expansions yields `[+!+[]]+[+[]]`. To get a numeric value instead of a string, one would enclose the previous expression in parentheses or square brackets and prepend a plus, yielding 10 = `+([+!+[]]+[+[]])`.
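The number constructions described above can be checked in any JavaScript engine. The following is a minimal sketch; the variable names are illustrative only, and whitespace has been added for readability (strict JSFuck contains none).

```javascript
// Each right-hand side uses only the six JSFuck characters (plus spaces).
const zero = +[];                      // unary plus converts the empty array to 0
const one = +!![];                     // !![] is true, and +true is 1
const two = !![] + !![];               // true + true
const nine = !![] + !![] + !![] + !![] + !![] + !![] + !![] + !![] + !![];
const tenString = [+!+[]] + [+[]];     // [1] + [0] concatenates to the string "10"
const tenNumber = +([+!+[]] + [+[]]);  // +"10" converts the string back to a number

console.log(zero, one, two, nine, tenString, tenNumber);
```

Note that `tenString` is a string and `tenNumber` is a number, which is why the outer unary plus is needed when a numeric value is required.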
Some letters can be obtained in JSFuck by accessing single characters in the string representations of simple boolean or numeric values like `"false"`, `"true"`, `"NaN"`, or `"undefined"` with an indexer (a number in square brackets). Other tricks are needed to produce other letters, for example casting the string `1e1000` into a number, which gives `Infinity`, which in turn makes the letter `y` accessible. [13]
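The indexing trick can be sketched as follows: a primitive value is turned into a string by appending `+[]`, and a numeric index then selects one character. The variable names are illustrative assumptions, and spaces are added for readability.

```javascript
// Primitive values built from the six characters.
const f = ![];         // false: negation of a (truthy) empty array
const t = !![];        // true
const nan = +[![]];    // [false] stringifies to "false", and +"false" is NaN
const undef = [][[]];  // property "" of an empty array does not exist: undefined
// "1e1000" assembled from 1, "e" (taken from "true"), 1, 0, 0, 0, then cast to a number.
const inf = +(+!+[] + (!+[] + [])[!+[] + !+[] + !+[]] + [+!+[]] + [+[]] + [+[]] + [+[]]);

// Stringify with +[] and pick a character by index.
const letterN = (nan + [])[+[]];               // "NaN"[0]
const letterU = (undef + [])[+[]];             // "undefined"[0]
const letterY = (nan + [inf])[+!+[] + [+[]]];  // "NaNInfinity"["10"]

console.log(letterN, letterU, letterY);
```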
The following is a list of primitive values used as building blocks to produce the simplest letters.
Value | JSFuck |
---|---|
false | ![] |
true | !![] or !+[] |
NaN | +[![]] |
undefined | [][[]] |
Infinity | +(+!+[]+(!+[]+[])[!+[]+!+[]+!+[]]+[+!+[]]+[+[]]+[+[]]+[+[]]) |

As an example, the letter "a" is taken from the string "false", whose second character (index 1) is "a". The expansion proceeds step by step:
`"false"[1]`: `"false"` can be made from `false+[]`, i.e. the boolean constant false plus an empty array.
`(false+[])[1]`: false is written as `![]` (negation applied to an empty array).
`(![]+[])[1]`: 1 is a number and can be written as `+true`.
`(![]+[])[+true]`: since false is `![]`, true is `!![]`.
`(![]+[])[+!![]]`: this evaluates to "a".
Proof: In JavaScript, `alert((![]+[])[+!![]])` does the same as `alert("a")`. [14]
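Each intermediate form in this derivation is itself valid JavaScript, so the chain can be verified one step at a time. A minimal sketch (the `steps` array is purely illustrative):

```javascript
// Every entry is one stage of the derivation; all should yield the same letter.
const steps = [
  "false"[1],         // index into the literal string
  (false + [])[1],    // build "false" by adding an empty array to false
  (![] + [])[1],      // write false as ![]
  (![] + [])[+true],  // write 1 as +true
  (![] + [])[+!![]],  // write true as !![], leaving only JSFuck characters
];

console.log(steps.every((ch) => ch === "a")); // prints: true
```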
The `Function` constructor can be used to trigger execution of JavaScript code contained in a string as if it were native JavaScript. So, for example, the statement `alert(1)` is equivalent to `Function("alert(1)")()`. The `Function` constructor can be retrieved in JSFuck by accessing the constructor property of a well-known function, such as `[]["filter"]` (`Array.prototype.filter`) or `[]["flat"]` (`Array.prototype.flat`) in modern browsers. `alert(1)` then becomes `[]["flat"]["constructor"]("alert(1)")()`.
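The retrieval of the `Function` constructor can be sketched as below. Because `alert` exists only in browsers, the string `"return 6 * 7"` is used here as a stand-in payload; it and the variable names are assumptions made for illustration. In real JSFuck the property names `"flat"` and `"constructor"` would themselves be spelled out character by character.

```javascript
// [].flat is Array.prototype.flat; every function's constructor property is Function.
const FunctionCtor = []["flat"]["constructor"];
console.log(FunctionCtor === Function);

// Calling the constructor with a source string creates a function; the trailing () runs it.
const answer = FunctionCtor("return 6 * 7")();
console.log(answer);
```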
The characters with the shortest JSFuck expansions are listed below. Other Unicode characters can be expressed as well but will generate considerably longer code.
Character | JSFuck |
---|---|
+ | (+(+!+[]+(!+[]+[])[!+[]+!+[]+!+[]]+[+!+[]]+[+[]]+[+[]])+[])[!+[]+!+[]] |
. | (+(+!+[]+[+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+[!+[]+!+[]]+[+[]])+[])[+!+[]] |
0 | +[] |
1 | +!![] or +!+[] |
2 | !![]+!![] or !+[]+!+[] |
3 | !![]+!![]+!![] or !+[]+!+[]+!+[] |
4 | !![]+!![]+!![]+!![] or !+[]+!+[]+!+[]+!+[] |
5 | !![]+!![]+!![]+!![]+!![] or !+[]+!+[]+!+[]+!+[]+!+[] |
6 | !![]+!![]+!![]+!![]+!![]+!![] or !+[]+!+[]+!+[]+!+[]+!+[]+!+[] |
7 | !![]+!![]+!![]+!![]+!![]+!![]+!![] or !+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[] |
8 | !![]+!![]+!![]+!![]+!![]+!![]+!![]+!![] or !+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[] |
9 | !![]+!![]+!![]+!![]+!![]+!![]+!![]+!![]+!![] or !+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[] |
a | (![]+[])[+!+[]] |
c | ([]+[][(![]+[])[+!![]]+(!![]+[])[+[]]])[!![]+!![]+!![]] |
d | ([][[]]+[])[!+[]+!+[]] |
e | (!![]+[])[!+[]+!+[]+!+[]] |
f | (![]+[])[+[]] |
i | ([![]]+[][[]])[+!+[]+[+[]]] |
I | (+(+!+[]+(!+[]+[])[!+[]+!+[]+!+[]]+(+!+[])+(+[])+(+[])+(+[]))+[])[+[]] |
l | (![]+[])[!+[]+!+[]] |
N | (+[![]]+[])[+[]] |
n | ([][[]]+[])[+!+[]] |
o | (!![]+[][(![]+[])[+!![]]+(!![]+[])[+[]]])[+!![]+[+[]]] |
r | (!+[]+[])[+!+[]] |
s | (![]+[])[!+[]+!+[]+!+[]] |
t | (!+[]+[])[+[]] |
u | ([][[]]+[])[+[]] |
y | (+[![]]+[+(+!+[]+(!+[]+[])[!+[]+!+[]+!+[]]+(+!+[])+(+[])+(+[])+(+[]))])[+!+[]+[+[]]] |
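Expansions from the table can be joined with `+` to spell longer strings. The following is a minimal sketch of that idea, not the algorithm of any particular encoder; `TABLE` and `encodeString` are hypothetical names, and only five rows of the table are included.

```javascript
// Hypothetical lookup table holding a few rows copied from the table above.
const TABLE = {
  a: "(![]+[])[+!+[]]",
  e: "(!![]+[])[!+[]+!+[]+!+[]]",
  l: "(![]+[])[!+[]+!+[]]",
  r: "(!+[]+[])[+!+[]]",
  t: "(!+[]+[])[+[]]",
};

// Hypothetical helper: encode a string by joining per-character expansions with "+".
function encodeString(text) {
  return [...text]
    .map((ch) => {
      if (!Object.prototype.hasOwnProperty.call(TABLE, ch)) {
        throw new Error(`no expansion listed for "${ch}"`);
      }
      return TABLE[ch];
    })
    .join("+");
}

const encoded = encodeString("alert");
const onlySixChars = /^[\[\]()!+]+$/.test(encoded); // true if no other character occurs
const decoded = eval(encoded);                      // evaluates back to the original string

console.log(encoded.length, onlySixChars, decoded);
```

A full encoder would additionally need expansions for every remaining character and would wrap the resulting string in a call to the `Function` constructor to execute it.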
Lacking the distinct features of "usual" JavaScript, obfuscation techniques like JSFuck can assist malicious JavaScript code in bypassing intrusion prevention systems [15] or content filters. For instance, the lack of alphanumeric characters in JSFuck and a flawed content filter allowed sellers to embed arbitrary JSFuck scripts in their eBay auction pages. [10]