Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation (2303.11579v2)

Published 21 Mar 2023 in cs.CV

Abstract: In this paper, a novel Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed for probabilistic 3D human pose estimation. On the one hand, D3DP generates multiple possible 3D pose hypotheses for a single 2D observation. It gradually diffuses the ground truth 3D poses to a random distribution, and learns a denoiser conditioned on 2D keypoints to recover the uncontaminated 3D poses. The proposed D3DP is compatible with existing 3D pose estimators and supports users to balance efficiency and accuracy during inference through two customizable parameters. On the other hand, JPMA is proposed to assemble multiple hypotheses generated by D3DP into a single 3D pose for practical use. It reprojects 3D pose hypotheses to the 2D camera plane, selects the best hypothesis joint-by-joint based on the reprojection errors, and combines the selected joints into the final pose. The proposed JPMA conducts aggregation at the joint level and makes use of the 2D prior information, both of which have been overlooked by previous approaches. Extensive experiments on Human3.6M and MPI-INF-3DHP datasets show that our method outperforms the state-of-the-art deterministic and probabilistic approaches by 1.5% and 8.9%, respectively. Code is available at https://github.com/paTRICK-swk/D3DP.
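The joint-wise aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the pinhole camera model, and the `focal` and `center` intrinsics are assumptions made for illustration, and the hypotheses are assumed to be expressed in camera coordinates with positive depth.

```python
import numpy as np


def project_to_image(poses_3d, focal, center):
    """Pinhole projection (assumed camera model): (..., J, 3) -> (..., J, 2)."""
    xy = poses_3d[..., :2]
    z = poses_3d[..., 2:3]  # depth, assumed strictly positive
    return focal * xy / z + center


def aggregate_joint_wise(hypotheses_3d, keypoints_2d, focal, center):
    """Pick, for each joint, the hypothesis with the lowest reprojection error.

    hypotheses_3d: (H, J, 3) array of H candidate poses with J joints each.
    keypoints_2d:  (J, 2) array of observed 2D keypoints in pixels.
    Returns a single (J, 3) pose assembled from the selected joints.
    """
    projected = project_to_image(hypotheses_3d, focal, center)   # (H, J, 2)
    errors = np.linalg.norm(projected - keypoints_2d, axis=-1)   # (H, J)
    best = errors.argmin(axis=0)                                 # (J,)
    joints = np.arange(hypotheses_3d.shape[1])
    return hypotheses_3d[best, joints]                           # (J, 3)
```

Given a stack of hypotheses and the matching 2D keypoints, `aggregate_joint_wise` returns one pose whose joints may come from different hypotheses, which is the difference between joint-level and pose-level selection.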

Citations (65)

Summary

The paper outlines precise formatting specifications, including two-column layouts and consistent type-styles, to standardize ICCV submissions.
It emphasizes strict page limitations and structured manuscript organization to ensure submissions meet review requirements.
The guidelines also enforce blind review integrity and clear referencing practices to support unbiased and coherent scholarly evaluation.

Overview of the ICCV Proceedings Author Guidelines

This paper delineates the author guidelines for submitting manuscripts to the International Conference on Computer Vision (ICCV) proceedings. It provides a comprehensive framework designed to standardize submissions across various sections and categories, aimed at ensuring uniformity and ease of processing. The content covers an array of elements critical to authors, including formatting directives, submission constraints, and blind review policies.

Key Aspects of Submission

The paper emphasizes several fundamental elements crucial for authors preparing submissions:

Formatting Specifications: The paper gives specific formatting instructions covering the two-column layout, margins, and type styles. It requires Times or Times Roman typefaces for uniform, readable text, along with clear headings and consistent paragraph indentation.
Page Limitations and Structure: Authors must adhere to an eight-page limit, excluding references, with provisions for using smaller fonts in figures and graphs to stay within that limit. Submissions that do not comply are not reviewed.
Style Guide Deviations: The document acknowledges past issues such as the non-use of tape for artwork attachments, reflecting evolving standards in typesetting and manuscript presentation as digital platforms become prevalent.

Review and Submission Guidelines

The instructions extend into specifics of the review process:

Blind Review Preparation: Authors should anonymize their submissions while still citing previous work so that it remains identifiable. The manuscript should avoid possessive pronouns such as "my" or "our" when referring to previous publications, so that authorship stays hidden during blind review.
Dual Submissions and Parallel Works: The guidelines address the handling of dual submission scenarios, stressing the necessity to differentiate works while maintaining anonymity.
Use of Tooling and Technological References: There is detailed direction regarding referencing tools or technologies, ensuring submissions relate cutting-edge methodologies without compromising the blind review process.

Practical Implications and Speculation

The detailed nature of these guidelines has implications for both the understanding and the practice of manuscript preparation. From a practical standpoint, they help authors structure and present complex visual computing research in a consistent way across the community.

There is continued evolution in submission protocols, reflecting a response to technological advancements and increased accessibility to collaborative tools. Future development may include more advanced digital methods for managing reviews, such as automated style checking or real-time collaboration with built-in compliance features.

The predominant benefit lies in fostering coherence within extensive conference proceedings, thus facilitating better interpretation, discussion, and advancement across research projects. The guidelines ensure that authors present their findings with clarity and uniformity, thereby contributing significantly to the field’s overall progression.

In conclusion, the ICCV author guidelines paper serves as an indispensable resource for researchers intending to submit high-quality, compliant contributions to the field of computer vision, promoting a cohesive scholarly communication practice.

GitHub

GitHub - paTRICK-swk/D3DP: [ICCV2023] The PyTorch implementation for "Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation" (180 stars)