SILK

SILK
Developer(s): Skype Limited, Microsoft
Initial release: 2009
Stable release: 1.0.9 (2012)
Written in: C, C++
Operating system: Microsoft Windows, macOS, Linux, Android, iOS
Predecessor: SVOPC
Successor: Satin
Type: Audio codec
License: BSD 2-Clause License [1]
Filename extension: .sil, .SIL
Internet media type: audio/SILK
Magic number: #!SILK\n
Initial release: March 2009
Latest release: SDK 1.0.9 (2012)
Type of format: Audio
Extended to: Opus
Standard: Internet Draft

SILK is an audio compression format and audio codec developed by Skype Limited, now a Microsoft subsidiary. It was developed for use in Skype, as a replacement for the SVOPC codec. Since being licensed out, it has also been used by others. It has been extended into the Internet standard Opus codec.

Details

Figure: Block diagram of the SILK encoder

Skype Limited announced that SILK can use a sampling frequency of 8, 12, 16 or 24 kHz and a bit rate from 6 to 40 kbit/s. It has a low algorithmic delay of 25 ms (20 ms frame size + 5 ms look-ahead). [2] The reference implementation is written in the C programming language. The codec technology is based on linear predictive coding (LPC). [3] A binary SDK for SILK is available. [4]
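The figures above can be turned into per-frame quantities with simple arithmetic. The following is a minimal sketch, not part of any SILK SDK: the function and constant names are hypothetical, and the payload size assumes a constant bit rate with no packet or container overhead.

```python
# Figures quoted in the text above; all names here are illustrative only.
SAMPLE_RATES_KHZ = (8, 12, 16, 24)
FRAME_MS = 20       # frame size
LOOKAHEAD_MS = 5    # encoder look-ahead
BITRATE_RANGE_KBPS = (6, 40)


def algorithmic_delay_ms(frame_ms=FRAME_MS, lookahead_ms=LOOKAHEAD_MS):
    """Algorithmic delay is the frame size plus the look-ahead."""
    return frame_ms + lookahead_ms


def samples_per_frame(rate_khz, frame_ms=FRAME_MS):
    """kHz multiplied by ms gives the number of samples in one frame."""
    return rate_khz * frame_ms


def bytes_per_frame(bitrate_kbps, frame_ms=FRAME_MS):
    """Average payload per frame, assuming a constant bit rate (an assumption)."""
    return bitrate_kbps * frame_ms / 8


if __name__ == "__main__":
    print("delay (ms):", algorithmic_delay_ms())
    for rate in SAMPLE_RATES_KHZ:
        print(rate, "kHz ->", samples_per_frame(rate), "samples per frame")
    for kbps in BITRATE_RANGE_KBPS:
        print(kbps, "kbit/s ->", bytes_per_frame(kbps), "bytes per frame")
```

Under these assumptions a 20 ms frame holds between 160 samples (at 8 kHz) and 480 samples (at 24 kHz), and carries roughly 15 to 100 bytes of payload across the stated bit rate range.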

License

The SILK codec is patented and licensed separately from the SILK SDK. [5] The codec is open-source freeware, available royalty-free with restrictions on use and distribution. [4] [6] [7] The SDK was initially available only on application, which required giving a name, address, phone number, and a description of how SILK would be used. [4] As of 2012 (version 1.0.9) the SDK can be downloaded without application, but the license restricts its use to internal evaluation and testing purposes only, excluding software distribution or use in any commercial product or service. [4] [8]

History

SILK replaced the previously used SVOPC in Skype. SVOPC was an in-house solution to replace the iSAC and iLBC codecs, which in turn were licensed from Global IP Solutions. The SILK codec was a separate development branch from SVOPC and had been under development for over three years. [9] It was announced in January 2009 at the Consumer Electronics Show [9] and was integrated in Skype for the first time in version 4.0 beta 3 from January 7, 2009, [10] with the final version being released on February 3. [11] On March 3, 2009 Skype Limited announced that the SILK codec would soon be available under a royalty-free license to third-party software and hardware developers. [6] The first draft of the SILK Speech Codec description was submitted to the Internet Engineering Task Force (IETF) as a candidate for the standardisation of a new Internet wideband audio codec on July 6, 2009, thereby openly publishing the format along with the source code of the reference implementation. [12] There is also a first draft of the RTP Payload Format and File Storage Format for SILK Speech and Audio Codec. [13]
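As an illustration of the file storage side, the magic number listed in the infobox (`#!SILK\n`) can be used to recognise a stored SILK stream. The sketch below only checks those leading bytes; the helper name is hypothetical, and nothing beyond the magic number itself is assumed about the rest of the file layout.

```python
import io

# Magic number as listed in the infobox; everything after it is not modelled here.
SILK_MAGIC = b"#!SILK\n"


def has_silk_magic(stream):
    """Return True if a binary stream starts with the SILK magic number."""
    header = stream.read(len(SILK_MAGIC))
    return header == SILK_MAGIC


if __name__ == "__main__":
    # In-memory stand-ins for real files (hypothetical payload bytes).
    silk_like = io.BytesIO(SILK_MAGIC + b"\x00\x01")
    other = io.BytesIO(b"RIFF\x00\x00\x00\x00WAVE")
    print(has_silk_magic(silk_like))
    print(has_silk_magic(other))
```

The same function works on a file opened in binary mode, for example `open(path, "rb")`, since it only relies on `read`.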

Opus

SILK is a foundation (with CELT) of the hybrid codec Opus (at the time called "Harmony") that was submitted to the IETF in September 2010, [14] and was chosen as the final candidate for the new standard. Opus was published as an IETF proposed standard in September 2012 [15] and Skype announced that they would be using Opus going forward. [16]

Usage

See also

Related Research Articles

Speech coding is an application of data compression to digital audio signals containing speech. Speech coding uses speech-specific parameter estimation using audio signal processing techniques to model the speech signal, combined with generic data compression algorithms to represent the resulting modeled parameters in a compact bitstream.

Vorbis: Royalty-free lossy audio encoding format

Vorbis is a free and open-source software project headed by the Xiph.Org Foundation. The project produces an audio coding format and software reference encoder/decoder (codec) for lossy audio compression, libvorbis. Vorbis is most commonly used in conjunction with the Ogg container format and it is therefore often referred to as Ogg Vorbis.

Speex is an audio compression codec specifically tuned for the reproduction of human speech and also a free software speech codec that may be used on voice over IP applications and podcasts. It is based on the code excited linear prediction speech coding algorithm. Its creators claim Speex to be free of any patent restrictions and it is licensed under the revised (3-clause) BSD license. It may be used with the Ogg container format or directly transmitted over UDP/RTP. It may also be used with the FLV container format.

The Adaptive Multi-Rate (AMR) audio codec is an audio compression format optimized for speech coding. AMR is a multi-rate narrowband speech codec that encodes narrowband (200–3400 Hz) signals at variable bit rates ranging from 4.75 to 12.2 kbit/s with toll quality speech starting at 7.4 kbit/s.

Adaptive Multi-Rate Wideband (AMR-WB) is a patented wideband speech audio coding standard developed based on Adaptive Multi-Rate encoding, using a similar methodology to algebraic code-excited linear prediction (ACELP). AMR-WB provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz compared to narrowband speech coders which in general are optimized for POTS wireline quality of 300–3400 Hz. AMR-WB was developed by Nokia and VoiceAge and it was first specified by 3GPP.

G.722.1: ITU-T Recommendation

G.722.1 is a licensed royalty-free ITU-T standard audio codec providing high quality, moderate bit rate wideband (50 Hz – 7 kHz audio bandwidth, 16 ksps) audio coding. It is a partial implementation of the Siren 7 audio coding format developed by PictureTel Corp. Its official name is Low-complexity coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss. It uses a modified discrete cosine transform audio data compression algorithm.

Extended Adaptive Multi-Rate – Wideband (AMR-WB+) is an audio codec that extends AMR-WB. It adds support for stereo signals and higher sampling rates. Another main improvement is the use of transform coding in addition to ACELP, which greatly improves generic audio coding. Automatic switching between transform coding and ACELP provides both good speech and audio quality at moderate bit rates.

The following tables compare general and technical information for a variety of audio coding formats.

internet Speech Audio Codec (iSAC) is a wideband speech codec, developed by Global IP Solutions (GIPS). It is suitable for VoIP applications and streaming audio. The encoded blocks have to be encapsulated in a suitable protocol for transport, e.g. RTP.

Siren is a family of patented, transform-based, wideband audio coding formats and their audio codec implementations developed and licensed by PictureTel Corporation. There are three Siren codecs: Siren 7, Siren 14 and Siren 22.

SVOPC is a compression method for audio which is used by VoIP applications. It is a lossy speech compression codec designed specifically for communication channels suffering from packet loss. It uses more bandwidth than the best bandwidth-optimised codecs, but is resistant to packet loss in return.

Wideband audio, also known as wideband voice or HD voice, is high definition voice quality for telephony audio, contrasted with standard digital telephony "toll quality". It extends the frequency range of audio signals transmitted over telephone lines, resulting in higher quality speech. The range of the human voice extends from 100 Hz to 17 kHz but traditional, voiceband or narrowband telephone calls limit audio frequencies to the range of 300 Hz to 3.4 kHz. Wideband audio relaxes the bandwidth limitation and transmits in the audio frequency range of 50 Hz to 7 kHz. In addition, some wideband codecs may use a higher audio bit depth of 16 bits to encode samples, also resulting in much better voice quality.

Constrained Energy Lapped Transform (CELT) is an open, royalty-free lossy audio compression format and a free software codec with especially low algorithmic delay for use in low-latency audio communication. The algorithms are openly documented and may be used free of software patent restrictions. Development of the format was maintained by the Xiph.Org Foundation and later coordinated by the Opus working group of the Internet Engineering Task Force (IETF).

The HTML5 draft specification adds video and audio elements for embedding video and audio in HTML documents. The specification had formerly recommended support for playback of Theora video and Vorbis audio encapsulated in Ogg containers to provide for easier distribution of audio and video over the internet by using open standards, but the recommendation was soon after dropped.

WebRTC is a free and open-source project providing web browsers and mobile applications with real-time communication (RTC) via application programming interfaces (APIs). It allows audio and video communication and streaming to work inside web pages by allowing direct peer-to-peer communication, eliminating the need to install plugins or download native apps.

Opus (audio format): Lossy audio coding format

Opus is a lossy audio coding format developed by the Xiph.Org Foundation and standardized by the Internet Engineering Task Force, designed to efficiently code speech and general audio in a single format, while remaining low-latency enough for real-time interactive communication and low-complexity enough for low-end embedded processors. Opus replaces both Vorbis and Speex for new applications, and several blind listening tests have ranked it higher-quality than any other standard audio format at any given bitrate until transparency is reached, including MP3, AAC, and HE-AAC.

HTML audio is a subject of the HTML specification, incorporating audio input, playback, and synthesis, as well as speech to text, all in the browser.

Codec 2 is a low-bitrate speech audio codec that is patent free and open source. Codec 2 compresses speech using sinusoidal coding, a method specialized for human speech. Bit rates of 3200 to 450 bit/s have been successfully created. Codec 2 was designed to be used for amateur radio and other high compression voice applications.

Audio coding format: Digitally coded format for audio signals

An audio coding format is a content representation format for storage or transmission of digital audio. Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to/from a specific audio coding format is called an audio codec; an example of an audio codec is LAME, one of several codecs that implement encoding and decoding of audio in the MP3 audio coding format in software.

References

  1. "SILK codec". GitHub. 23 August 2021.
  2. Skype SILK Data Sheet, Retrieved 2009-09-01
  3. Audio recording of the IETF codec working group meeting at the IETF 79 conference in Beijing, China, with a presentation of the basic operating principles by Koen Vos (MP3, ~70 MiB). Archived 2013-02-10 at the Wayback Machine.
  4. Skype SILK – Super Wideband Audio Codec, Retrieved 2009-09-01
  5. "SILK Patent License". Skype. December 6, 2010. Retrieved October 27, 2023.
  6. Jonathan Christensen (2009-03-03) SILK, our super wideband audio codec, is now available for free. Archived 2009-12-23 at the Wayback Machine, Retrieved 2009-09-01
  7. Skype publishes SILK audio codec source code, Retrieved 2012-12-26
  8. Skype Developer Forum - SILK SDK license. Archived 2012-08-03 at the Wayback Machine, Retrieved 2012-12-26
  9. Michael Stanford (2009-01-13) Skype’s new super-wideband codec, Retrieved 2009-09-01
  10. Skype Journal (2009-01-07) Skype for Windows 4.0 Beta 3 Hotfix Introduces New Audio Codec, Retrieved 2009-09-01
  11. "All-New Skype Now Available - About Skype". Archived from the original on 2013-07-30. Retrieved 2012-07-14.
  12. IETF (2009-07-06) SILK Speech Codec - draft-vos-silk-00.txt, Retrieved 2009-09-01
  13. IETF (2009-07-06) RTP Payload Format and File Storage Format for SILK Speech and Audio Codec, Retrieved 2009-09-01
  14. Valin, Jean-Marc; Vos, Koen (24 September 2010). "Definition of the Harmony Audio Codec". IETF Datatracker.
  15. Jean-Marc Valin, Koen Vos & Timothy B. Terriberry (September 2012). "Definition of the Opus Audio Codec". RFC 6716. IETF. Retrieved 2013-08-19.
  16. "Skype and a New Audio Codec". Microsoft. September 12, 2012. Archived from the original on October 18, 2017. Retrieved October 25, 2023.
  17. PCWorld (2009-02-04) Skype Upgrade Simplifies VoIP Video Calls, Retrieved 2009-09-01
  18. (2009-02-04) Skype 4.0 audio: smooth as SILK. Archived 2012-07-22 at the Wayback Machine, Retrieved 2009-09-01
  19. "Team Fortress 2 - Hatless Update". www.teamfortress.com.
  20. Marczak, Bill; Scott-Railton, John (3 April 2020). "Move Fast and Roll Your Own Crypto: A Quick Look at the Confidentiality of Zoom Meetings".