Design of a Virtual Assistant to Improve Interaction Between the Audience and the Presenter

Authors

S. Cobos Guzman, S. Nuere, L. de Miguel, and C. König
DOI:

https://doi.org/10.9781/ijimai.2021.08.017

Keywords:

Artificial Intelligence, Human-Computer Interaction (HCI), e-learning, Virtual Assistant, Multimedia
Supporting Agencies

We would like to acknowledge Gema Fernández-Blanco Martín for her contribution to the psychological analysis of the survey results.

Abstract

This article presents a novel design of a Virtual Assistant as part of a human-machine interaction system intended to improve communication between the presenter and the audience, whether in education or in general presentations to large groups (e.g., auditoriums with 200 people). The main goal of the proposed model is to provide an interaction framework that raises the audience's attention during the key moments of a presentation. In this manner, the collaboration between the presenter and the Virtual Assistant can improve the level of learning among the public. The design of the Virtual Assistant relies on non-anthropomorphic forms with ‘live’ characteristics, producing an intuitive and self-explanatory interface. A set of intuitive and useful virtual interactions to support the presenter was designed. This design was validated with different types of audience through a psychological study based on a discrete emotions questionnaire, confirming the adequacy of the proposed solution. The human-machine interaction system supporting the Virtual Assistant should automatically recognize the attention level of the audience from audiovisual sources and synchronize the Virtual Assistant with the presentation. The system involves a complex artificial intelligence architecture that embraces perception of high-level features from audio and video, knowledge representation and reasoning for pervasive and affective computing, and reinforcement learning that teaches the intelligent agent to decide on the best strategy for increasing the attention level of the audience.
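
The abstract describes, at a high level, a reinforcement-learning agent that selects Virtual Assistant interventions to raise audience attention estimated from audio and video. The article page does not include an implementation; the Python toy below is only an illustrative sketch of that idea, not the authors' system. The attention levels, the intervention names, and the simulated audience response are all hypothetical assumptions introduced here for illustration.

import random

# Hypothetical discretization of the audience attention level that the
# perception layer (audio/video feature extraction) would report.
ATTENTION_LEVELS = ["low", "medium", "high"]

# Hypothetical interventions the Virtual Assistant could trigger.
ACTIONS = ["stay_idle", "highlight_slide", "play_animation", "ask_question"]


def simulated_attention_response(state, action):
    """Toy stand-in for a real audience: returns the next attention level
    and a reward that is positive when attention rises."""
    effect = {"stay_idle": 0.0, "highlight_slide": 0.3,
              "play_animation": 0.5, "ask_question": 0.6}[action]
    idx = ATTENTION_LEVELS.index(state)
    # Attention drifts down unless an intervention counteracts it.
    next_idx = max(0, min(2, idx + (1 if random.random() < effect else -1)))
    reward = next_idx - idx  # +1 if attention rose, -1 if it dropped, 0 otherwise
    return ATTENTION_LEVELS[next_idx], reward


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning over (attention level, intervention) pairs."""
    q = {(s, a): 0.0 for s in ATTENTION_LEVELS for a in ACTIONS}
    for _ in range(episodes):
        state = random.choice(ATTENTION_LEVELS)
        for _ in range(20):  # steps within one simulated presentation
            if random.random() < epsilon:
                action = random.choice(ACTIONS)          # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = simulated_attention_response(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    for s in ATTENTION_LEVELS:
        best = max(ACTIONS, key=lambda a: q_table[(s, a)])
        print(f"attention={s:6s} -> suggested intervention: {best}")

Running the sketch prints, for each simulated attention level, the intervention with the highest learned Q-value. In the system described by the authors, the state would come from the audio/video perception and knowledge-representation modules rather than from a simulated audience.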

Published

2021-12-01
Metrics

Abstract views: 168
PDF downloads: 121

How to Cite

Cobos Guzman, S., Nuere, S., de Miguel, L., and König, C. (2021). Design of a Virtual Assistant to Improve Interaction Between the Audience and the Presenter. International Journal of Interactive Multimedia and Artificial Intelligence, 7(2), 232–240. https://doi.org/10.9781/ijimai.2021.08.017