Text-driven Video Acceleration

De Souza Ramos, Washington and Soriano Marcolino, Leandro and Nascimento, Erickson R. (2024) Text-driven Video Acceleration. In: 37th Conference on Graphics, Patterns and Images (SIBGRAPI) : Workshop of Theses and Dissertations (WTD). UNSPECIFIED, BRA, pp. 35-41.

WashingtonRamos_2024_WTD_SIBGRAPI_CameraReady.pdf - Accepted Version
Available under License Creative Commons Attribution.

Abstract

From the dawn of the digital revolution until today, the volume of data has grown exponentially, especially images and videos. Smartphones and wearable devices with large storage capacity and long battery life contribute to continuous recording and massive uploads to social media. This rapid increase in visual data, combined with users' limited time, demands methods that produce shorter videos conveying the same information. Semantic Fast-Forwarding reduces viewing time by adaptively accelerating videos and slowing down for relevant segments. However, current methods require predefined visual concepts or user supervision, which is costly and time-consuming. This work explores using textual data to create text-driven fast-forwarding methods that generate semantically meaningful videos without explicit user input. Our proposed approaches outperform baselines, achieving F1 Score improvements of up to 12.8 percentage points over the best competitors. Comprehensive user and ablation studies, along with quantitative and qualitative evaluations, confirm their superiority. Visual results are available at https://youtu.be/cOYqumJQOY and https://youtu.be/u6ODTv7-9C4 .

Item Type:
Contribution in Book/Report/Proceedings
Uncontrolled Keywords:
Research Output Funding/no_not_funded
Subjects:
No - not funded
ID Code:
224129
Deposited On:
04 Dec 2024 15:35
Refereed?:
Yes
Published?:
Published
Last Modified:
12 Dec 2024 00:47