This repo serves as the starting point for the tutorial from the Ultravox documentation.

Debug Logging: showDebugMessages=true turns on some additional console logging.

Speaker Mute Toggle ...
TL;DR (1) - Add an adaptive mask onto the image to enhance LVLM performance.
TL;DR (2) - The mask is generated by an auxiliary LVLM based on the relevance between the image regions and the query.

🔧 The ...
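
To make the idea in the two TL;DR points concrete, here is a minimal sketch of applying a relevance-based mask to an image. It assumes the per-region relevance scores (values in [0, 1]) have already been produced by the auxiliary LVLM; how they are computed, and how the mask is actually blended in this project, is not stated here. The function name `apply_relevance_mask`, the `floor` parameter, and the simple brightness-scaling rule are all hypothetical choices made for illustration.

```python
def apply_relevance_mask(image, relevance, floor=0.3):
    """Dim each pixel according to the relevance of the region it falls in.

    image:     H x W grid (list of rows) of grayscale pixel values.
    relevance: Ph x Pw grid of scores in [0, 1], one per image region
               (assumed to come from the auxiliary model).
    floor:     brightness kept for fully irrelevant regions (an assumption,
               so that masked regions are dimmed rather than erased).
    """
    height, width = len(image), len(image[0])
    patch_rows, patch_cols = len(relevance), len(relevance[0])

    masked = []
    for y in range(height):
        # Nearest-neighbour lookup: which region row does this pixel row belong to?
        py = min(y * patch_rows // height, patch_rows - 1)
        row = []
        for x in range(width):
            px = min(x * patch_cols // width, patch_cols - 1)
            # Clamp the score so malformed inputs cannot brighten or invert pixels.
            score = min(max(relevance[py][px], 0.0), 1.0)
            weight = floor + (1.0 - floor) * score
            row.append(image[y][x] * weight)
        masked.append(row)
    return masked


# Toy example: a uniform 4x4 image split into a 2x2 grid of regions.
image = [[100.0] * 4 for _ in range(4)]
relevance = [[1.0, 0.0],
             [0.5, 1.0]]
masked = apply_relevance_mask(image, relevance, floor=0.2)
```

In this toy run the top-left and bottom-right regions keep their full brightness, the top-right region is dimmed to the floor, and the bottom-left region lands in between. A real pipeline would operate on RGB arrays and would likely smooth the mask at region boundaries.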