LLaVA-3D could perform both 2D and 3D vision-language tasks. The left block (b) shows that compared with previous 3D LMMs, our LLaVA-3D achieves state-of-the-art performance across a wide range of 3D ...
Abstract: 3D multi-object tracking methods utilize object motion, image, and point cloud information to compute similarities between different objects, facilitating cross-frame data association. In ...
Abstract: 3D object detection from images, one of the fundamental and challenging problems in autonomous driving, has received increasing attention from both industry and academia in recent years.
Variants: Each model includes all high-resolution variants, resulting in a total of 1,659,097 3D instances. Specialized Subsets: For the rigged and animated categories of models, we further obtain the ...