Voice-generation technology, also known as text-to-speech (TTS), enables machines to synthesize human-like speech, making digital communication more inclusive and accessible. What ...
This project demonstrates how to track the ball in a video of a tennis game by training a custom YOLO detection model. The model is trained not only for ball detection but also interpolation to ...
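The excerpt above is cut off before it says what the interpolation is for. One common use in ball tracking is filling in frames where the detector missed the ball, and the sketch below illustrates that idea only. It assumes the detections have already been reduced to one ball-center point per frame, with `None` for a missed frame. The function name `interpolate_track` and the sample data are hypothetical and not taken from the project.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) ball center in pixels


def interpolate_track(track: List[Optional[Point]]) -> List[Optional[Point]]:
    """Linearly fill missed frames (None) that lie between two detections.

    Gaps before the first detection or after the last one are left as None,
    because there is no second endpoint to interpolate toward.
    """
    filled = list(track)
    # Indices of frames where the (assumed) detector found the ball.
    known = [i for i, p in enumerate(track) if p is not None]
    for a, b in zip(known, known[1:]):
        (xa, ya), (xb, yb) = track[a], track[b]
        for i in range(a + 1, b):
            t = (i - a) / (b - a)  # fraction of the way from frame a to frame b
            filled[i] = (xa + t * (xb - xa), ya + t * (yb - ya))
    return filled


# Hypothetical per-frame detections: frames 1 and 3 were missed.
detections = [(0.0, 0.0), None, (20.0, 40.0), None]
print(interpolate_track(detections))
```

Linear interpolation is a reasonable choice only for short gaps. A tennis ball follows a curved path and changes direction at each bounce, so longer gaps would likely need a motion model instead.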
Abstract: An artist's style can be quickly imitated by fine-tuning a text-to-image model on the artist's artworks, which raises serious copyright concerns. Scholars have proposed many watermarking ...
Each test case in Paircomp contains two similar prompts with subtle differences. By comparing the accuracy of the images generated by the model for each prompt, we evaluate whether the model has ...
Felice Frankel is a photographer and researcher in the Department of Chemical Engineering at the Massachusetts Institute of Technology in Cambridge. Her upcoming book is Phenomenal Moments. Flashes of ...
Abstract: Artificial Intelligence Generated Content (AIGC) has created a fertile ground for image steganography. Existing Coverless Image Steganography (CIS) methods rely on image semantics to encode ...