Domain adaptation using randomized knowledge for monocular 6DoF pose estimation

CUNHA, Kelvin Batista da

Use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/58501

Title:	Domain adaptation using randomized knowledge for monocular 6DoF pose estimation
Author(s):	CUNHA, Kelvin Batista da
Keywords:	Pose estimation; Object detection; Domain randomization
Issue date:	6-Jun-2024
Publisher:	Universidade Federal de Pernambuco
Citation:	CUNHA, Kelvin Batista da. Domain adaptation using randomized knowledge for monocular 6DoF pose estimation. 2024. Thesis (Doctorate in Computer Science) – Universidade Federal de Pernambuco, Recife, 2024.
Abstract:	The 6DoF (six-degrees-of-freedom) pose of rigid objects is pivotal in solving various computer vision tasks, facilitating seamless interaction between physical and virtual elements. Recent advances in vision-based pose estimation, particularly through deep learning (DL), have significantly improved accuracy. DL models are adept at extracting intricate scene details, empowering them to discern and adapt to diverse scenarios efficiently. Moreover, DL methodologies are exceptionally versatile and can assimilate various input types. Noteworthy is their ability to distill object features exclusively from RGB data, yielding models that run in real time across a spectrum of devices. This capability not only reduces computational requirements but also broadens the applicability of such models in real-world settings. However, DL often requires extensive datasets tailored to specific target distributions. Acquiring, annotating, and maintaining such datasets is not only costly and time-consuming but also susceptible to inaccuracies, and the resulting data may fail to fully capture the application domain. Our initial studies analyzed the impact of distribution shifts on 6DoF pose estimation, revealing the models' reliance on training data and their susceptibility to real-world challenges (i.e., generalization on the test set). Variations rarely encountered during training, such as changes in object appearance (e.g., size, color, geometry), environmental conditions (e.g., illumination, motion speed, occlusion), and camera hardware (i.e., when the model is trained with one camera but tested with a different one), can drastically affect model accuracy. To address this challenge, we propose a pipeline that generates a diverse array of synthetic sequences using CAD models of objects. By randomizing scene elements in each frame, even if conditions appear incoherent or surrealistic, we can train supervised models using simulated data, thereby reducing the dependency on labeled real data and enabling adaptation to continuous transformations in the target distribution. Furthermore, we extended our pipeline by introducing a novel strategy based on photo-realistic randomized synthetic generation to mitigate target domain variations within monocular deep 6DoF pose estimation while preserving source features to reduce the domain gap. Leveraging a combination of NeRF (Neural Radiance Fields) reconstruction and domain randomization techniques, our approach demonstrates the feasibility of achieving accurate pose estimation models with reduced reliance on real data. Finally, we propose a CAD-free 6DoF pose estimation pipeline using randomized frames for object tracking, integrating object detection and optical flow. As an additional contribution, we propose C3PO, a cross-device dataset organized for each device according to different challenges in pose estimation. The dataset includes more than 100,000 full RGB images with pose annotations for three 3D-printed objects and three different cameras, addressing issues such as occlusion, illumination changes, motion blur, color variation, and scale variation. Using C3PO, we can assess a method's performance under different isolated challenges and analyze the impact of randomized data on each variation. Comprehensive experiments against state-of-the-art methods on publicly available datasets, including LINEMOD, LINEMOD-Occlusion, C3PO, and HomebrewedDB, indicate the validity of our approach. The results emphasize the impact of randomization in addressing challenges associated with domain variations, such as changes in environmental lighting, motion blur, and object occlusion, which underscores the significance of our contributions.
URI:	https://repositorio.ufpe.br/handle/123456789/58501
Appears in collections:	Doctoral Theses - Computer Science

Files associated with this item:

File	Description	Size	Format
TESE Kelvin Batista Da Cunha.pdf		13.72 MB	Adobe PDF

This file is protected by copyright

This item is licensed under a Creative Commons License