Close-Fitting Dressing Assistance Using Semantic-Based Visual Attention
The paper "Close-Fitting Dressing Assistance Based on State Estimation of Feet and Garments with Semantic-Based Visual Attention" addresses the burgeoning challenge of providing autonomous dressing assistance for aging populations, a problem exacerbated by an impending shortage of caregivers. Focusing on the specific task of robot-assisted sock dressing, the paper introduces a method that incorporates multi-modal state estimation with semantic-based visual attention to enhance the dexterity and adaptability of robotic systems in dressing tasks, particularly involving close-fitting garments.
Methodology
The authors present a novel approach that combines semantic understanding of visual input with conventional force/torque feedback to improve dressing success rates. They employ foundation models, SAM (Segment Anything Model) for semantic segmentation and DAM (Depth Anything Model) for monocular depth estimation, to generate reliable, adaptive dressing motions that remain robust to variation in foot size, shape, and flexibility across individuals, without relying solely on RGB imagery.
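To make the perception step concrete, below is a minimal sketch of how SAM masks and DAM depth might be combined into a foot-focused observation. The checkpoint name, Hugging Face model ID, point prompt, and depth-masking step are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np
from PIL import Image
from transformers import pipeline                      # Depth Anything via HF
from segment_anything import sam_model_registry, SamPredictor

# Assumed checkpoint/model IDs -- substitute whatever you have locally.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)
depth_estimator = pipeline("depth-estimation",
                           model="LiheYoung/depth-anything-small-hf")

def perceive(rgb: np.ndarray, foot_point: np.ndarray):
    """Return a (mask, masked depth) pair focused on the foot region.

    rgb        -- HxWx3 uint8 image
    foot_point -- (2,) pixel coordinate assumed to lie on the foot
    """
    predictor.set_image(rgb)
    # Prompt SAM with one foreground point (label 1) on the foot.
    masks, scores, _ = predictor.predict(
        point_coords=foot_point[None, :],
        point_labels=np.array([1]),
        multimask_output=False,
    )
    mask = masks[0]                                    # (H, W) boolean mask

    # The pipeline's "depth" output is an 8-bit relative depth image
    # resized to the input resolution -- adequate for this sketch.
    depth = np.asarray(depth_estimator(Image.fromarray(rgb))["depth"],
                       dtype=np.float32)
    depth[~mask] = 0.0                                 # attend to the foot only
    return mask, depth
```

Masking the depth map with the semantic mask is one simple way to realize "semantic-based" attention: downstream layers see geometry only for the concept of interest rather than for the whole scene.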
The system's architecture integrates the following components (a brief code sketch follows the list):
- Semantic Segmentation: Semantic masks extracted with SAM let the robot attend to object concepts rather than raw visual appearance.
- Depth Estimation: DAM supplies per-pixel depth, sharpening the spatial understanding needed for precise dressing motions.
- Attention Mechanisms: Visual and somatosensory attention, realized with SKNet-style modules, ensures efficient feature extraction and responsive handling of the sock during complex movements.
- Hierarchical LSTM: A hierarchical LSTM captures temporal dynamics and inter-modal dependencies, enabling seamless transitions between the phases of dressing.
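As referenced above, the following compact PyTorch sketch shows how an SKNet-style selective-kernel attention block and a two-level LSTM could be wired together. The layer sizes, modality split, and fusion scheme are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SelectiveKernelAttention(nn.Module):
    """SKNet-style soft selection between two receptive-field branches."""
    def __init__(self, channels: int):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2)
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // 4), nn.ReLU(),
            nn.Linear(channels // 4, 2),
        )

    def forward(self, x):                        # x: (B, C, H, W)
        u3, u5 = self.branch3(x), self.branch5(x)
        s = (u3 + u5).mean(dim=(2, 3))           # global descriptor, (B, C)
        w = self.gate(s).softmax(dim=-1)         # branch weights, (B, 2)
        w3 = w[:, 0].view(-1, 1, 1, 1)
        w5 = w[:, 1].view(-1, 1, 1, 1)
        return w3 * u3 + w5 * u5                 # attention-weighted fusion

class HierarchicalPolicy(nn.Module):
    """Lower LSTMs per modality; an upper LSTM fuses them into commands."""
    def __init__(self, vis_dim=64, ft_dim=6, hid=128, act_dim=7):
        super().__init__()
        self.vision_lstm = nn.LSTM(vis_dim, hid, batch_first=True)
        self.force_lstm = nn.LSTM(ft_dim, hid, batch_first=True)
        self.upper_lstm = nn.LSTM(2 * hid, hid, batch_first=True)
        self.head = nn.Linear(hid, act_dim)

    def forward(self, vis_seq, ft_seq):
        v, _ = self.vision_lstm(vis_seq)         # (B, T, hid)
        f, _ = self.force_lstm(ft_seq)           # (B, T, hid)
        z, _ = self.upper_lstm(torch.cat([v, f], dim=-1))
        return self.head(z)                      # per-timestep joint commands

# Example: a batch of 2 sequences, 50 timesteps -> (2, 50, 7) commands.
policy = HierarchicalPolicy()
out = policy(torch.randn(2, 50, 64), torch.randn(2, 50, 6))
```

The lower LSTMs track each modality's dynamics independently, while the upper LSTM learns inter-modal dependencies, mirroring the role the paper assigns to its hierarchical LSTM.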
Experimental Validation
The paper validates the proposed method through extensive testing with mannequins and human subjects. Notably, the robot successfully dressed socks on ten diverse participants, outperforming contemporary methods such as Action Chunking with Transformers (ACT) and Diffusion Policy (DP). The system achieved success rates of 84% against known backgrounds and 74% against unknown backgrounds, demonstrating improved robustness to environmental changes and individual differences compared with ACT (66% and 0%, respectively, under the same conditions) and DP (which failed to complete the task).
Results & Discussion
The experimental results highlight the model's superior generalization and robustness. The attention mechanism effectively manages spatial and depth-related complexity, ensuring smooth, adaptive dressing motions even in untrained environments. The hierarchical LSTM further helps maintain stability across varying foot sizes, as demonstrated by the tactile data analysis.
The ablation studies clarify the contribution of each component: the complete architecture achieves a 100% success rate, whereas performance degrades when key components such as DAM or SAM are omitted.
Implications and Future Directions
The implications of this research span healthcare robotics, particularly autonomous caregiving for elderly and disabled people. Future work may refine the motion-planning phases, especially the insertion step, through continued improvements in simulation and reinforcement learning, further enhancing the precision of dressing actions. Extending the system to accommodate dynamic human motion during dressing could also improve real-world applicability and reliability.
This paper contributes to the expanding domain of assistive robotics, providing a foundation for future work in developing humanoid robots capable of handling intricate physical tasks with a high degree of autonomy and adaptability.