Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis
Abstract: Point cloud analysis has achieved outstanding performance by transferring point cloud pre-trained models. However, existing methods for model adaptation usually update all model parameters, i.e., the full fine-tuning paradigm, which is inefficient due to its high computational cost (e.g., training GPU memory) and massive storage requirements. In this paper, we study parameter-efficient transfer learning for point cloud analysis, seeking an ideal trade-off between task performance and parameter efficiency. To achieve this goal, we freeze the parameters of the default pre-trained models and propose the Dynamic Adapter, which generates a dynamic scale for each token according to the token's significance to the downstream task. We further seamlessly integrate the Dynamic Adapter with Prompt Tuning (DAPT) by constructing Internal Prompts, which capture instance-specific features for interaction. Extensive experiments on five challenging datasets demonstrate that the proposed DAPT achieves superior performance compared to full fine-tuning counterparts while significantly reducing trainable parameters and training GPU memory by 95% and 35%, respectively. Code is available at https://github.com/LMD0311/DAPT.
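The core idea of the Dynamic Adapter — a lightweight bottleneck module whose residual update is scaled per token by a learned significance score — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the parameter names (`W_down`, `W_up`, `w_scale`), the sigmoid gating, and the plain ReLU bottleneck are assumptions for exposition; see the linked repository for the actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, bottleneck, num_tokens = 8, 2, 5

# Hypothetical parameters: a frozen backbone would surround this module;
# only these small matrices would be trainable.
W_down = rng.normal(scale=0.02, size=(dim, bottleneck))   # down-projection
W_up = rng.normal(scale=0.02, size=(bottleneck, dim))     # up-projection
w_scale = rng.normal(scale=0.02, size=(dim, 1))           # per-token scale head

def dynamic_adapter(tokens):
    """Residual adapter whose update is gated per token by a learned scale."""
    h = np.maximum(tokens @ W_down, 0.0)                  # bottleneck + ReLU
    delta = h @ W_up                                      # project back to dim
    scale = 1.0 / (1.0 + np.exp(-(tokens @ w_scale)))     # sigmoid, shape (N, 1)
    return tokens + scale * delta                         # token-wise scaled residual

x = rng.normal(size=(num_tokens, dim))
y = dynamic_adapter(x)
print(y.shape)  # (5, 8)
```

Because the scale is computed from each token individually, tokens the head deems unimportant for the downstream task receive a near-zero adapter update, which is the "dynamic" property the abstract refers to.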