Publications | Quentin Guimard

2024

TOMM

Deep Variational Learning for 360° Adaptive Streaming

Quentin Guimard, Lucile Sassatelli, and 4 more authors

ACM Trans. Multimedia Comput. Commun. Appl., Aug 2024

Abs DOI

Prediction of head movements in immersive media is key to designing efficient streaming systems able to focus the bandwidth budget on visible areas of the content. However, most of the numerous proposals made to predict user head motion in 360° images and videos do not explicitly consider a prominent characteristic of the head motion data: its intrinsic uncertainty. In this article, we present an approach to generate multiple plausible futures of head motion in 360° videos, given a common past trajectory. To our knowledge, this is the first work that considers the problem of multiple head motion prediction for 360° video streaming. We introduce our discrete variational multiple sequence (DVMS) learning framework, which builds on deep latent variable models. We design a training procedure to obtain a flexible, lightweight stochastic prediction model compatible with sequence-to-sequence neural architectures. Experimental results on four different datasets show that DVMS outperforms competitors adapted from the self-driving domain by up to 41% on prediction horizons up to 5 s, at lower computational and memory costs. To understand how the learned features account for the motion uncertainty, we analyze the structure of the learned latent space and connect it with the physical properties of the trajectories. We also introduce a method to estimate the likelihood of each generated trajectory, enabling the integration of DVMS in a streaming system. We hence deploy an extensive evaluation of the interest of our DVMS proposal for a streaming system. To do so, we first introduce a new Python-based 360° streaming simulator that we make available to the community. On real-world user, video, and networking data, we show that predicting multiple trajectories yields higher fairness between the traces, the gains for 20–30% of the users reaching up to 10% in visual quality for the best number K of trajectories to generate.

2023

MMSys 2023

SMART360: Simulating Motion Prediction and Adaptive BitRate STrategies for 360° Video Streaming

Quentin Guimard, and Lucile Sassatelli

In Proceedings of the 14th Conference on ACM Multimedia Systems, Vancouver, BC, Canada, Aug 2023

Abs DOI

Adaptive bitrate (ABR) algorithms are used in streaming media to adjust video or audio quality based on the viewer’s network conditions to provide a smooth playback experience. With the rise of virtual reality (VR) headsets, 360° video streaming is growing rapidly and requires efficient ABR strategies to also adapt the video quality to the user’s head position. However, research in this field is often difficult to compare due to a lack of reproducible simulations. To address this problem, we provide SMART360, a 360° streaming simulation environment to compare motion prediction and adaptive bitrates strategies. We provide sample inputs and baseline algorithms along with the simulator, as well as examples of results and visualizations that can be obtained with SMART360. The code and data are made publicly available.

2022

MMSys 2022

Deep Variational Learning for Multiple Trajectory Prediction of 360° Head Movements

Quentin Guimard, Lucile Sassatelli, and 4 more authors

In Proceedings of the 13th ACM Multimedia Systems Conference, Athlone, Ireland, Aug 2022

Abs DOI

Prediction of head movements in immersive media is key to design efficient streaming systems able to focus the bandwidth budget on visible areas of the content. Numerous proposals have therefore been made in the recent years to predict 360° images and videos. However, the performance of these models is limited by a main characteristic of the head motion data: its intrinsic uncertainty. In this article, we present an approach to generate multiple plausible futures of head motion in 360° videos, given a common past trajectory. Our method provides likelihood estimates of every predicted trajectory, enabling direct integration in streaming optimization. To the best of our knowledge, this is the first work that considers the problem of multiple head motion prediction for 360° video streaming. We first quantify this uncertainty from the data. We then introduce our discrete variational multiple sequence (DVMS) learning framework, which builds on deep latent variable models. We design a training procedure to obtain a flexible and lightweight stochastic prediction model compatible with sequence-to-sequence recurrent neural architectures. Experimental results on 3 different datasets show that our method DVMS outperforms competitors adapted from the self-driving domain by up to 37% on prediction horizons up to 5 sec., at lower computational and memory costs. Finally, we design a method to estimate the respective likelihoods of the multiple predicted trajectories, by exploiting the stationarity of the distribution of the prediction error over the latent space. Experimental results on 3 datasets show the quality of these estimates, and how they depend on the video category.
ICIP 2022

On The Link Between Emotion, Attention And Content In Virtual Immersive Environments

Quentin Guimard, Florent Robert, and 6 more authors

In 2022 IEEE International Conference on Image Processing (ICIP), Aug 2022

Abs DOI

While immersive media have been shown to generate more intense emotions, saliency information has been shown to be a key component for the assessment of their quality, owing to the various portions of the sphere (viewports) a user can attend. In this article, we investigate the tri-partite connection between user attention, user emotion and visual content in immersive environments. To do so, we present a new dataset enabling the analysis of different types of saliency, both low-level and high-level, in connection with the user’s state in 360 ◦ videos. Head and gaze movements are recorded along with self-reports and continuous physiological measurements of emotions. We then study how the accuracy of saliency estimators in predicting user attention depends on user-reported and physiologically-sensed emotional perceptions. Our results show that high-level saliency better predicts user attention for higher levels of arousal. We discuss how this work serves as a first step to understand and predict user attention and intents in immersive interactive environments.
MMVE 2022

Effects of Emotions on Head Motion Predictability in 360° Videos

Quentin Guimard, and Lucile Sassatelli

In Proceedings of the 14th International Workshop on Immersive Mixed and Virtual Environment Systems, Athlone, Ireland, Aug 2022

Abs DOI

While 360° videos watched in a VR headset are gaining in popularity, it is necessary to lower the required bandwidth to stream these immersive videos and obtain a satisfying quality of experience. Doing so requires predicting the user’s head motion in advance, which has been tackled by a number of recent prediction methods considering the video content and the user’s past motion. However, human motion is a complex process that can depend on many more parameters, including the type of attentional phase the user is currently in, and their emotions, which can be difficult to capture. This is the first article to investigate the effects of user emotions on the predictability of head motion, in connection with video-centric parameters. We formulate and verify hypotheses, and construct a structural equation model of emotion, motion and predictability. We show that the prediction error is higher for higher valence ratings, and that this relationship is mediated by head speed. We also show that the prediction error is lower for higher arousal, but that spatial information moderates the effect of arousal on predictability. This work opens the path to better capture important factors in human motion, to help improve the training process of head motion predictors.
MMSys 2022

PEM360: A Dataset of 360° Videos with Continuous Physiological Measurements, Subjective Emotional Ratings and Motion Traces

Quentin Guimard, Florent Robert, and 6 more authors

In Proceedings of the 13th ACM Multimedia Systems Conference, Athlone, Ireland, Aug 2022

Abs DOI

From a user perspective, immersive content can elicit more intense emotions than flat-screen presentations. From a system perspective, efficient storage and distribution remain challenging, and must consider user attention. Understanding the connection between user attention, user emotions and immersive content is therefore key. In this article, we present a new dataset, PEM360 of user head movements and gaze recordings in 360° videos, along with self-reported emotional ratings of valence and arousal, and continuous physiological measurement of electrodermal activity and heart rate. The stimuli are selected to enable the spatiotemporal analysis of the connection between content, user motion and emotion. We describe and provide a set of software tools to process the various data modalities, and introduce a joint instantaneous visualization of user attention and emotion we name Emotional maps. We exemplify new types of analyses the PEM360 dataset can enable. The entire data and code are made available in a reproducible framework.
MMSys 2022

Machine Learning-Based Strategies for Streaming and Experiencing 3DoF Virtual Reality: Research Proposal

Quentin Guimard, and Lucile Sassatelli

In Proceedings of the 13th ACM Multimedia Systems Conference, Athlone, Ireland, Aug 2022

Abs DOI

This paper contains the research proposal of Quentin Guimard that was presented at the MMSys 2022 doctoral symposium.The development of 360° videos experienced in virtual reality (VR) is hindered by network, cybersickness, and content perception challenges. Many levers have already been proposed to address these challenges, but separately. This PhD thesis intends to jointly address these issues by dynamically controlling levers and making quality decisions, with a view to improving the VR streaming experience.This paper describes the steps necessary to the building of such approach, by separating work that has already been achieved over the course of this PhD from tasks that are still left to do. First results are also presented.

2021

arXiv

Early Detection of Security-Relevant Bug Reports using Machine Learning: How Far Are We?

Arthur D. Sawadogo, Quentin Guimard, and 4 more authors

Aug 2021

Abs DOI

Bug reports are common artefacts in software development. They serve as the main channel for users to communicate to developers information about the issues that they encounter when using released versions of software programs. In the descriptions of issues, however, a user may, intentionally or not, expose a vulnerability. In a typical maintenance scenario, such security-relevant bug reports are prioritised by the development team when preparing corrective patches. Nevertheless, when security relevance is not immediately expressed (e.g., via a tag) or rapidly identified by triaging teams, the open security-relevant bug report can become a critical leak of sensitive information that attackers can leverage to perform zero-day attacks. To support practitioners in triaging bug reports, the research community has proposed a number of approaches for the detection of security-relevant bug reports. In recent years, approaches in this respect based on machine learning have been reported with promising performance. Our work focuses on such approaches, and revisits their building blocks to provide a comprehensive view on the current achievements. To that end, we built a large experimental dataset and performed extensive experiments with variations in feature sets and learning algorithms. Eventually, our study highlights different approach configurations that yield best performing classifiers.