MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare (2212.06870v1)

Published 13 Dec 2022 in cs.CV and cs.RO

Abstract: We introduce MegaPose, a method to estimate the 6D pose of novel objects, that is, objects unseen during training. At inference time, the method only assumes knowledge of (i) a region of interest displaying the object in the image and (ii) a CAD model of the observed object. The contributions of this work are threefold. First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects. The shape and coordinate system of the novel object are provided as inputs to the network by rendering multiple synthetic views of the object's CAD model. Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner. Third, we introduce a large-scale synthetic dataset of photorealistic images of thousands of objects with diverse visual and shape properties and show that this diversity is crucial to obtain good generalization performance on novel objects. We train our approach on this large synthetic dataset and apply it without retraining to hundreds of novel objects in real images from several pose estimation benchmarks. Our approach achieves state-of-the-art performance on the ModelNet and YCB-Video datasets. An extensive evaluation on the 7 core datasets of the BOP challenge demonstrates that our approach achieves performance competitive with existing approaches that require access to the target objects during training. Code, dataset and trained models are available on the project page: https://megapose6d.github.io/.

Overview of the CVPR Proceedings LaTeX Author Guidelines

The document under review presents the LaTeX author guidelines for the Conference on Computer Vision and Pattern Recognition (CVPR) proceedings. It tells researchers how to prepare a manuscript for submission, and following it is required to meet the prescribed submission and formatting standards, which keep accepted papers uniform and readable.

Key Details and Formatting Instructions

The guidelines are structured to help authors produce manuscripts that meet CVPR's requirements. The document begins by outlining the language and dual submission policies. Manuscripts must be written in English, reflecting the conference's international scope. The dual submission policy requires that papers under consideration elsewhere be disclosed and appropriately managed.

A central section of the document covers paper length. The manuscript must not exceed eight pages, excluding references. Overlength papers are not reviewed at all, so the limit is enforced without exception. This strictness keeps review conditions equitable and ensures that all submissions are assessed on a comparable basis.

The document further describes the ruler printed in the margins of the review version, which lets reviewers refer to specific lines in their comments; the ruler is removed in the final camera-ready version. Authors are urged to make sure the submission system's Paper ID is visible on the review version of their manuscript, so that the paper remains traceable during the blind review process.

Mathematics and Cross-referencing

The guidelines also address the formatting of mathematical elements within the text. Authors are instructed to number all sections and displayed equations for easy reference. The guidelines highlight using specific commands, such as \cref and \Cref, to facilitate cross-referencing within the document, aiding clarity for readers and reviewers alike.
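As a minimal sketch of the cross-referencing commands mentioned above, the following standalone LaTeX file numbers a section and a displayed equation and refers to both. It assumes the plain article class with the cleveref package loaded directly rather than the CVPR style file, and the label names and the equation itself are hypothetical placeholders.

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{hyperref}
\usepackage{cleveref} % cleveref should be loaded after hyperref

\begin{document}

\section{Method}
\label{sec:method} % hypothetical label name

% A numbered displayed equation with a label.
\begin{equation}
  y = a x + b
  \label{eq:line} % hypothetical label name
\end{equation}

% \cref inserts the reference type and number, for use mid-sentence.
The slope $a$ in \cref{eq:line} is estimated from data.

% \Cref is the capitalized variant, for use at the start of a sentence.
\Cref{sec:method} describes the estimation procedure.

\end{document}
```

Because the commands generate the reference type ("equation", "section") themselves, the wording of a reference stays consistent when a label is moved to a different kind of object.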

Blind Review Process

The double-blind review process is another critical aspect discussed. Authors are reminded that anonymizing their submissions does not entail removing citations to their own work, but rather avoiding first-person references when doing so. This practice preserves the integrity of the review process while allowing the acknowledgment of prior relevant work.
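The distinction can be illustrated with a short LaTeX sketch. The author name, citation key, and bibliography entry below are hypothetical placeholders, and the file assumes the plain article class rather than the CVPR style file.

```latex
\documentclass{article}

\begin{document}

% Avoid in a review version: the first-person wording identifies the authors.
%   In our previous work~\cite{smith2020}, we showed ...

% Prefer: cite the same work in the third person, like any other paper.
Smith et al.~\cite{smith2020} showed that the approach generalizes.

\begin{thebibliography}{1}
% Hypothetical entry used only to make the example compile.
\bibitem{smith2020} A.~Smith and B.~Jones. A placeholder title. 2020.
\end{thebibliography}

\end{document}
```

The citation itself stays in the reference list in both versions; only the phrasing around it changes.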

Practical Implications for Publication

In practice, these guidelines streamline manuscript preparation for authors, which in turn simplifies submission and review. By enforcing these formatting and submission criteria, CVPR maintains a standard that improves the readability and professional appearance of the papers in its proceedings.

Future Considerations

While the document is focused on the procedural aspects of manuscript submission, it indirectly impacts the broader field of computer vision research. By maintaining high submission standards, CVPR continues to cultivate a repository of high-quality research, which serves as a critical resource for further advancements in the field. As AI and computer vision technologies evolve, these guidelines will likely adapt to include new formats or submission content, such as potentially incorporating multimedia elements alongside traditional text-based submissions.

In summary, this document serves as an essential resource for researchers in computer vision, guiding them through the intricacies of CVPR's submission process and ensuring their work meets the high standards required for presentation at this prestigious conference.

Authors (10)

Yann Labbé
Lucas Manuelli
Arsalan Mousavian
Stephen Tyree
Stan Birchfield
Jonathan Tremblay
Justin Carpentier
Mathieu Aubry
Dieter Fox
Josef Sivic

Citations (94)
