Enhancing Usability of Voice Interfaces for Socially Assistive Robots Through Deep Learning: A German Case Study

Article

Authors: Oliver Guhr, Claudia Loitsch, Gerhard Weber, and Hans-Joachim Böhme

Artificial Intelligence in HCI: 5th International Conference, AI-HCI 2024, Held as Part of the 26th HCI International Conference, HCII 2024, Washington, DC, USA, June 29–July 4, 2024, Proceedings, Part III

June 2024

Pages 231–249

Published: 29 June 2024

    Abstract

    Voice interfaces have become ubiquitous because they can make complex technology more usable and accessible. Current voice interfaces, however, often require users to learn specific speech commands or sentence patterns. This requirement does not satisfy usability heuristics and makes current voice interfaces less natural than ordinary spoken interaction. To address this issue, we developed a voice interface that understands natural everyday language. The overall objective is to build a German-language voice interface for socially assistive robots that operate in public spaces, where we cannot assume that users have prior knowledge or experience. Building on recent advances in deep natural language processing, the voice interface is not restricted to specific speech commands. To test it, we conducted a study with 47 participants. The results indicate that the target user group solved 93% of the given tasks successfully without prior training or experience with the voice interface.

    Published In

    Artificial Intelligence in HCI: 5th International Conference, AI-HCI 2024, Held as Part of the 26th HCI International Conference, HCII 2024, Washington, DC, USA, June 29–July 4, 2024, Proceedings, Part III

    Jun 2024

    497 pages

    ISBN: 978-3-031-60614-4

    DOI: 10.1007/978-3-031-60615-1

    Editors:
    • Helmut Degen, Siemens Corporation, Princeton, NJ, USA
    • Stavroula Ntoa, Foundation for Research and Technology - FORTH, Heraklion, Crete, Greece

    © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Author Tags

    1. Human-Robot Interaction
    2. Voice Interface
    3. User Study

    Qualifiers

    • Article
