Abstract: Recognizing surgical gestures in real-time is a stepping stone towards automated activity recognition, skill assessment, intra-operative assistance, and eventually surgical automation. The ...
we introduce OmniMMI, a comprehensive multi-modal interaction benchmark tailored for OmniLLMs in streaming video contexts. OmniMMI encompasses over 1,121 interactive videos and 2,290 questions, ...