Smart speaker

A smart speaker is a type of loudspeaker and voice command device with an integrated virtual assistant that offers interactive actions and hands-free activation with the help of one or more "hot words". Some smart speakers can also act as smart devices that use Wi-Fi and other protocol standards to extend usage beyond audio playback, such as controlling home automation devices. Such features can include, but are not limited to, compatibility across a number of services and platforms, peer-to-peer connection through mesh networking, and virtual assistants. Each device can have its own designated interface and in-house features, usually launched or controlled via an application or home automation software. [1] Some smart speakers also include a screen to show the user a visual response.

As of winter 2017, NPR and Edison Research estimated that 39 million Americans (16% of the population over 18) owned a smart speaker. [2]

A smart speaker with a touchscreen is known as a smart display. [3] [4] It is a smart device that integrates a conversational user interface with a display screen to augment voice interaction with images and video. Smart displays are powered by one of the common voice assistants; they offer controls for smart home devices and feature streaming apps and web browsers with touch controls for selecting content. The first smart displays were introduced in 2017 by Amazon (Amazon Echo Show).

Accuracy

According to a study published in the Proceedings of the National Academy of Sciences of the United States of America in March 2020, speech recognition systems from five of the biggest tech development companies, Amazon, Apple, Google, IBM and Microsoft, misidentified more words spoken by black people than by white people. The study measured both misidentified words and unreadable audio snippets: the systems misidentified about 19 percent of words from white speakers versus 35 percent from black speakers, and about 2 percent of audio snippets from white speakers were unreadable versus 20 percent from black speakers. [5]

Research published by the North American Chapter of the Association for Computational Linguistics (NAACL) also identified a discrepancy between male and female voices. According to that research, Google's speech recognition software is 13 percent more accurate for men than for women. Even so, it performs better than the systems used by Bing, AT&T, and IBM. [6]
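
Error figures like those above are commonly expressed as a word error rate: the number of word substitutions, insertions, and deletions needed to turn a system's transcript into the reference transcript, divided by the number of reference words. The following is a minimal Python sketch of that calculation, not the method used in either study cited above; the function name `word_error_rate` and the sample sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (assumed, not taken from any study).
rate = word_error_rate("turn on the kitchen lights", "turn on the chicken lights")
print(f"{rate:.0%}")
```

Here one of the five reference words is misrecognized, so the rate is 20 percent. Comparing the average rate across groups of speakers is one way a disparity like the one reported above could be quantified.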

Privacy concerns

The built-in microphone in smart speakers is continuously listening for "hot words" followed by a command. However, these continuously listening microphones also raise privacy concerns among users. [7] These include what is being recorded, how the data will be used, how it will be protected, and whether it will be used for invasive advertising. [8] [9] Furthermore, an analysis of Amazon Echo Dots showed that 30–38% of "spurious audio recordings were human conversations", suggesting that these devices capture audio beyond what is strictly needed to detect the hot word. [10]
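
The hot-word behaviour described above can be pictured as a simple gate: input is ignored until a hot word is heard, and only what follows is treated as a command. The Python sketch below is a simplified illustration that works on already-transcribed utterances; real devices detect the hot word in the audio signal itself. The hot word "computer" and the function name `commands_heard` are assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def commands_heard(utterances: Iterable[str],
                   hot_words: Tuple[str, ...] = ("computer",)) -> List[str]:
    """Return the commands from utterances that begin with a hot word."""
    commands = []
    for utterance in utterances:
        # Normalise case and strip simple punctuation from each word.
        words = [w.strip(".,!?") for w in utterance.lower().split()]
        if words and words[0] in hot_words:
            # Only the words after the hot word are kept as the command.
            commands.append(" ".join(words[1:]))
        # Anything else is discarded in this idealised model.
    return commands


overheard = [
    "what's for dinner",
    "Computer, play some jazz",
    "turn it up",
]
print(commands_heard(overheard))
```

In this idealised model only "play some jazz" is kept. The spurious recordings noted above correspond to cases where a real detector triggers on speech that did not contain the hot word.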

As a wiretap

There are strong concerns that the ever-listening microphone of smart speakers presents a perfect candidate for wiretapping. In 2017, British security researcher Mark Barnes showed that pre-2017 Echo devices have exposed pins that allow a compromised OS to be booted. [11]

Voice assistance vs privacy

While voice assistants provide a valuable service, there can be some hesitation towards using them in various social contexts, such as in public or around other users. [12] However, only recently have users begun interacting with voice assistants through smart speakers rather than through the phone. On the phone, most voice assistants can be engaged with a physical button (e.g., Siri with a long press of the home button) rather than solely by a hot word, as on a smart speaker. While this distinction increases privacy by limiting when the microphone is on, users felt that having to press a button first removed the convenience of voice interaction. [13] This trade-off is not unique to voice assistants; as more and more devices come online, there is an increasing trade-off between convenience and privacy. [14]

Factors influencing adoption

While there are many factors influencing smart speaker adoption, specifically with regard to privacy, Lau et al. define five distinct categories as pros and cons: convenience, identity as an early adopter, contributing factors, perceived lack of utility, and privacy and security concerns. [7] Tennant et al. explored adoption using the technology acceptance model and identified factors influencing expected usefulness, ease of use, and attitudes toward smart speaker devices in the context of home care support. [15] The authors describe some of the critical challenges in designing for this context, given the potential impact of the voice assistant's personality on someone's perspective of the care situation and a user's desire for intelligent support from the technology. [15]

Security concerns

When configured without authentication, smart speakers can be activated by people other than the intended user or owner. For example, visitors to a home or office, or people in a publicly accessible area outside an open window, partial wall, or security fence, may be heard by a speaker. One team demonstrated the ability to stimulate the microphones of smart speakers and smartphones through a closed window, from another building across the street, using a laser. [16]

Virtual assistant | Owned by | Devices | No. of users | Languages (dialects) | Notes
Alice | Yandex | Yandex Station; Yandex Station Mini; Irbis A; LG Xboom AI ThinQ WK7Y; ELARI SmartBeat; Prestigio Smartmate Маяк Edition | 30 million Yandex devices in CIS (January 2019) | Russian, Turkish | Yandex Station went on sale in July 2018
AliGenie | Alibaba Group | | | Chinese | Went on sale in August 2017
Amazon Alexa [17] | Amazon | | 31 million Echo devices in U.S. (January 2018) [18] | Summer 2019: English (US, UK, Ireland, Canada and Australia); French (France and Canada); German; Italian; Japanese; Portuguese (Brazilian) and Spanish (Spain and Mexico) [19] [20] [21] |
Siri | Apple, Inc. | | | Summer 2019: Arabic, Chinese (Cantonese and Mandarin), Danish, Dutch, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Malay, Norwegian, Portuguese, Russian, Spanish, Swedish, Thai, and Turkish [21] |
DuerOS Open Platform [22] | Baidu | Xiaoyu, RavenH, Aladdin ceiling-mounted smart speaker-lamp-projector [23] [24] | | Chinese | Xiaoyu went on sale in spring 2017. [22]
Clova | Naver Corporation, Line Corporation | | | Japanese and Korean | Introduced summer 2017 [25]
Google Assistant [17] | Google | Google Home series: Home, Home Max, Home Mini, Nest Hub, Nest Hub Max, [17] Nest Mini, Nest Audio, Nest Wi-Fi (point only) | 14 million Google Homes in U.S. (January 2018) [18] | Summer 2019: Danish, Dutch, English (U.S., U.K., Canada, Australia, India and Singapore), French (France and Canada), German (Austria and Germany), Hindi, Italian, Japanese, Korean, Norwegian, Portuguese (Brazilian), Spanish (Spain and Mexico) and Swedish [26] [21] |
 | Beijing LingLong, part of JD | DingDong | | Mandarin and Cantonese for Greater China | In cooperation with Chinese AI firm iFlytek. Went on sale November 2016. [27]
Microsoft Cortana | Microsoft | Harman Kardon INVOKE | | October 2019: English (US, UK, Canada, Australia and India); Chinese (Simplified); French; German; Italian; Japanese; Portuguese (Brazil); Spanish (Spain and Mexico) [28] | Support for Cortana on the Harman Kardon INVOKE was officially discontinued on March 9, 2021. [29] [30]
Safety Labs Sirona | Safety Labs Inc | Sirona.TV | | English (US, UK, Canada, Australia and India) |
Xiaowei [22] | Tencent | forthcoming [22] | | Chinese |
Bixby | Samsung Electronics | Galaxy Home [31] | | |
Hallo Magenta | Deutsche Telekom | Hallo Magenta | | German |

References

  1. "Smart speaker". techtarget.com. May 2017. Archived 2019-04-10 at the Wayback Machine.
  2. The Smart Audio Report from NPR and Edison Research, Fall-Winter 2017 (PDF), archived (PDF) from the original on 2019-01-01, retrieved 2018-01-13
  3. Brown, Rich. "Echo Show, Nest Hub, Facebook Portal and more: How to pick the best smart display in 2019". CNET. Archived from the original on 2019-07-08. Retrieved 2019-06-19.
  4. Faulkner, Cameron (9 October 2018). "How Google's new Home Hub compares to the Echo Show and Facebook Portal". The Verge. Archived from the original on 2019-12-06. Retrieved 2019-06-19.
  5. Metz, Cade (2020-03-23). "There Is a Racial Divide in Speech-Recognition Systems, Researchers Say". The New York Times. Archived from the original on 2022-10-13. Retrieved 2020-04-22.
  6. Bajorek, Joan Palmiter (2019-05-10). "Voice Recognition Still Has Significant Race and Gender Biases". Harvard Business Review. Archived from the original on 2020-04-25. Retrieved 2020-04-24.
  7. Lau, Josephine; Zimmerman, Benjamin; Schaub, Florian (1 November 2018). "Alexa, Are You Listening?: Privacy Perceptions, Concerns and Privacy-seeking Behaviors with Smart Speakers". Proceedings of the ACM on Human-Computer Interaction. 2 (CSCW): 102:1–102:31. doi:10.1145/3274371. S2CID 53223356.
  8. "Amazon hands over Echo 'murder' data". BBC News. 7 March 2017. Archived from the original on 6 January 2020. Retrieved 2 March 2019.
  9. "Amazon patents 'voice-sniffing' algorithms". BBC News. 11 April 2018. Archived from the original on 14 December 2019. Retrieved 2 March 2019.
  10. Ford, Marcia, and William Palmer. "Alexa, are you listening to me? An analysis of Alexa voice service network traffic." Personal and Ubiquitous Computing (2018): 1-13.
  11. Greenberg, Andy (1 August 2017). "A Hacker Turned an Amazon Echo Into a 'Wiretap'". Wired. Archived from the original on 3 June 2019. Retrieved 2 March 2019 via www.wired.com.
  12. Mennicken, Sarah; Huang, Elaine M. (2012). "Hacking the Natural Habitat: An In-the-Wild Study of Smart Homes, Their Development, and the People Who Live in Them". Pervasive Computing. Springer, Berlin, Heidelberg. pp. 143–160. doi:10.1007/978-3-642-31205-2_10. S2CID 3480089. Archived from the original on 2022-10-13. Retrieved 2019-02-26.
  13. Lambertsson, Christoffer (2017). Expectations of Privacy in Voice Interaction – A Look at Voice Controlled Bank Transactions (Ph.D. dissertation). KTH Royal Institute of Technology.
  14. Rao, Sonia (12 September 2018). "In today's homes, consumers are willing to sacrifice privacy for convenience". The Washington Post. Archived from the original on 2 March 2019. Retrieved 26 February 2019.
  15. Tennant, Ryan; Allana, Sana; Mercer, Kate; Burns, Catherine M (2022-06-30). "Caregiver Expectations of Interfacing With Voice Assistants to Support Complex Home Care: Mixed Methods Study". JMIR Human Factors. 9 (2): e37688. doi:10.2196/37688. ISSN 2292-9495. PMC 9284358. PMID 35771594.
  16. "Lasers can silently issue 'voice commands' to your smart speakers". Archived from the original on 2019-11-05. Retrieved 2019-11-06.
  17. Best, Smart Speaker (11 April 2021). "Best Smart Speaker". wired.com. Archived from the original on 13 January 2021. Retrieved 11 April 2021.
  18. Bishop, Todd (January 26, 2018). "New data: Google Home faring better against Amazon Echo, grabbing 40% of U.S. holiday sales". GeekWire. Archived from the original on December 6, 2019. Retrieved November 29, 2019.
  19. "AVS for International". developer.amazon.com. Amazon. Archived from the original on 13 June 2019. Retrieved 19 March 2018.
  20. Barrett, Brian. "THE YEAR ALEXA GREW UP". Wired. Archived from the original on 14 July 2019. Retrieved 23 December 2018.
  21. "Language Support in Voice Assistants Compared". Globalme. Archived from the original on 3 September 2019. Retrieved 28 January 2020.
  22. Horwitz, Josh (5 July 2017). "China's tech giants are racing to popularize their versions of the Amazon Echo". Archived from the original on 2018-03-19. Retrieved 2018-03-19.
  23. "Baidu launches three new smart speakers that don't need Alexa or Google Assistant". 8 January 2018. Archived from the original on 2018-03-21. Retrieved 2018-03-20.
  24. Bonnington, Christina (16 November 2017). "Baidu's New Smart Speaker Looks Like Nothing Else on the Market". Slate. Archived from the original on 19 March 2018. Retrieved 19 March 2018.
  25. "LINE to Introduce Clova Virtual Assistant for Korea and Japan - Voicebot". www.voicebot.ai. 3 March 2017. Archived from the original on 2018-03-19. Retrieved 2018-03-19.
  26. "Change your Google Assistant language". Google Home Help. Archived from the original on 22 February 2019. Retrieved 19 March 2018.
  27. Bateman, Joshua D. (22 November 2016). "Behold China's Answer to Amazon Echo: The LingLong DingDong". Wired. Condé Nast. Archived from the original on 8 November 2020. Retrieved 25 November 2017.
  28. "Cortana's regions and languages". support.microsoft.com. Archived from the original on 22 January 2020. Retrieved 28 January 2020.
  29. "Cortana service on the Harman Kardon Invoke". support.microsoft.com. Archived from the original on 2022-05-15. Retrieved 2022-05-15.
  30. "Harmon Kardon Invoke Statement". HARMAN Newsroom. Archived from the original on 2022-09-21. Retrieved 2022-05-15.
  31. Ingraham, Nathan (9 August 2018). "Does Samsung's Galaxy Home stand a chance?". Engadget. Oath Inc. Archived from the original on 17 September 2018. Retrieved 9 August 2018.