Each voice note contains an embedded audio player with the original recording, a summary, action tasks, key points, and the full transcript. The action tasks sometimes feel a bit overconfident, but ...
[2024/4/23] We have added an audio-grounding feature that tracks the sound-making object within the video's soundtrack.
[2023/5/12] We have authored a technical report for SAM-Track.
[2023/5/7] We ...