In the spirit of unity and during times of racial strife, it is important to hear the voices of the disadvantaged and ...
Abstract: Semantic text matching is a fundamental task in Natural Language Processing, with existing methods mainly focusing on short texts. However, handling long texts remains a challenge, as ...
Text-Based Editing is one of those genuinely transformative technologies that comes along once in a while. How is it likely to change the editing workflow? And are there any downsides? Shiv ...
Abstract: Text-guided 3D face synthesis has achieved remarkable results by leveraging text-to-image (T2I) diffusion models. However, most existing works focus solely on the direct generation, ...
TL;DR: Here, we propose FlowDirector, a training- and inversion-free framework for text-guided video editing, enabling precise object edits and temporal consistency through new spatial correction and ...
VoiceCraft is a token infilling neural codec language model that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data including ...