Please use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/58501

Title : Domain adaptation using randomized knowledge for monocular 6DoF pose estimation
Author : CUNHA, Kelvin Batista da
Keywords : Pose estimation; Object detection; Domain randomization
Issue date : 6 Jun 2024
Publisher : Universidade Federal de Pernambuco
Citation : CUNHA, Kelvin Batista da. Domain adaptation using randomized knowledge for monocular 6DoF pose estimation. 2024. Tese (Doutorado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2024.
Abstract : The 6DoF (six-degrees-of-freedom) pose of rigid objects is pivotal in solving various tasks within computer vision, facilitating seamless interaction between physical and virtual elements. Recent advancements in vision-based pose estimation, particularly through deep learning (DL), have significantly enhanced accuracy. DL models are adept at extracting intricate scene details, empowering them to discern and adapt to diverse scenarios with efficiency. DL methodologies are also highly versatile, capable of assimilating various input types. Noteworthy is their ability to distill object features exclusively from RGB data, fitting models that exhibit real-time performance across a spectrum of devices. This capability not only streamlines computational requirements but also broadens the applicability of such models in real-world settings. However, DL often requires extensive datasets tailored to specific target distributions. Acquiring, annotating, and maintaining such datasets is not only costly and time-consuming but also susceptible to inaccuracies, failing to fully encapsulate the application domain. Our initial studies analyzed the impact of distribution shifts on 6DoF pose estimation, revealing models' reliance on training data and their susceptibility to real-world challenges (i.e., generalization to the test set). Variations rarely encountered during training, such as changes in object appearance (e.g., size, color, geometry), environmental conditions (e.g., illumination, motion speed, occlusion), and camera hardware (i.e., when the model is trained with one camera but tested with a different one), can drastically affect model accuracy. To address this challenge, we propose a pipeline that generates a diverse array of synthetic sequences using CAD models of objects. By randomizing scene elements in each frame, even if conditions appear incoherent or surrealistic, we can train supervised models using simulated data, thereby reducing the dependency on labeled real data and enabling adaptation to continuous transformations in the target distribution. Furthermore, we extended our pipeline by introducing a novel strategy based on photo-realistic randomized synthetic generation to mitigate target domain variations within monocular deep 6DoF pose estimation while preserving source features to reduce the domain gap. Leveraging a combination of NeRF (Neural Radiance Fields) reconstruction and domain randomization techniques, our approach demonstrates the feasibility of achieving accurate pose estimation models with reduced reliance on real data. Finally, we propose a CAD-free 6DoF pose estimation pipeline using randomized frames for object tracking, seamlessly integrating object detection and optical flow. As an additional contribution, we propose C3PO, a cross-device dataset organized for each device according to different challenges in pose estimation. The dataset includes more than 100,000 full RGB images with pose annotations for three 3D-printed objects and three different cameras, addressing issues such as occlusion, illumination changes, motion blur, color variation, and scale variation. Using C3PO, we can assess the method's performance in the face of different isolated challenges to analyze the impact of randomized data in each variation. Comprehensive experiments against state-of-the-art methods on publicly available datasets, including LINEMOD, LINEMOD-Occlusion, C3PO, and HomebrewedDB, indicate the validity of our approach. Emphasizing the impact of randomization in addressing challenges associated with domain variations, such as changes in environmental lighting, motion blur, and object occlusion, underscores the significance of our contributions.
URI : https://repositorio.ufpe.br/handle/123456789/58501
Appears in collections: Teses de Doutorado - Ciência da Computação

Files in this item:
File : TESE Kelvin Batista Da Cunha.pdf
Size : 13.72 MB
Format : Adobe PDF


This item is protected by original copyright



This item is licensed under a Creative Commons License