
Contact-Guided Curriculum Learning

Updated 2 July 2026
  • CGCL is a curriculum learning approach that methodically integrates contact feedback to guide multimodal policy training in robotics.
  • It employs structured stages—such as visual corruption and guided force application—to ensure robust sensor fusion and improved task robustness.
  • Quantitative results demonstrate significant gains in success rates and efficiency, validating CGCL's effectiveness in contact-rich robotic applications.

Contact-Guided Curriculum Learning (CGCL) is a methodology that leverages structured curricula and carefully staged integration of contact information—such as force feedback or guided contact-based forces—to accelerate and robustify policy learning in robotics and control. CGCL systematically guides the learning agent through progressively more difficult environments and sensorimotor conditions by modulating contact signals, environment complexity, and non-visual feedback, often via reinforcement or behavior cloning paradigms. This structured exposure ensures that policies develop appropriate reliance on all available modalities, especially in scenarios where contact phenomena are indispensable for generalization and success.

1. Core Principles and Definition

CGCL is characterized by stagewise training regimes designed to optimize exploitation of contact-related signals during policy learning. The underlying principle is to manipulate the quality or magnitude of contact feedback—either by curriculum-based input corruption (e.g., visual blur to enforce early force reliance) or by guided force application (e.g., external stabilizing forces gradually removed). CGCL thus creates an environment where the agent must initially depend on robust, task-relevant contact signals, and then, as the training curriculum progresses, smoothly blend these with more ambiguous or unreliable sensory cues (Liu et al., 24 Feb 2025, Tidd et al., 2020).

2. Methodological Frameworks

Two canonical instantiations of CGCL are FACTR (Force-Attending Curriculum Training) and Guided Curriculum Learning for bipedal walking:

  • FACTR introduces a curriculum in which visual input is corrupted by a controlled blur, decayed according to a schedule $\gamma(n)$ (linear, cosine, constant, exponential, or step). This discourages early overfitting to vision and guides attention to force feedback. As training advances and $\gamma(n)$ decreases, policies re-incorporate vision, achieving balanced multimodal competence (Liu et al., 24 Feb 2025).
  • Guided Curriculum Learning for Walking employs a three-stage approach: (1) Terrain difficulty is increased; (2) Magnitude of external PD-based guiding forces is annealed; (3) Random base perturbations are progressively applied. Expert references or hand-designed trajectories serve as priors, and each curriculum transition occurs only after specified robustness or success criteria are met (Tidd et al., 2020).

These frameworks can be generalized to other contact-rich domains by varying what is modulated (visual reliability, terrain challenge, force availability, etc.) and how contact guidance is implemented.
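
The guided-force idea can be illustrated with a minimal sketch, assuming a scalar PD law and a linear annealing of the guide scale over a fixed number of curriculum steps. The names `PDGains`, `guide_force`, and `guide_scale`, and the linear annealing rule, are assumptions made for illustration; they are not taken from Tidd et al.

```python
from dataclasses import dataclass


@dataclass
class PDGains:
    kp: float  # proportional gain (hypothetical, user-chosen)
    kd: float  # derivative gain (hypothetical, user-chosen)


def guide_scale(step: int, num_steps: int = 10) -> float:
    """Curriculum scale in [0, 1]: 1 at step 0, 0 once `num_steps` is reached.

    Linear annealing is an assumption of this sketch.
    """
    if step < 0 or num_steps <= 0:
        raise ValueError("step must be >= 0 and num_steps must be > 0")
    return max(0.0, 1.0 - step / num_steps)


def guide_force(pos: float, vel: float, target_pos: float, target_vel: float,
                gains: PDGains, scale: float) -> float:
    """PD force pulling the state toward a reference, weighted by the curriculum scale.

    The force is meant to be added to the simulated dynamics only;
    it is not part of the policy's observation.
    """
    if not 0.0 <= scale <= 1.0:
        raise ValueError("scale must lie in [0, 1]")
    pd_term = gains.kp * (target_pos - pos) + gains.kd * (target_vel - vel)
    return scale * pd_term
```

Indexing the scale by an integer curriculum step, rather than repeatedly subtracting a fixed decrement, avoids floating-point drift, so the guide force is exactly zero at the final stage.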

3. Network Architecture and Algorithmic Details

CGCL approaches are typically realized with architectures capable of integrating heterogeneous sensory streams:

  • FACTR Network: A pre-trained 12-layer Vision Transformer (ViT) encodes the visual stream, outputting $M = 196$ latent tokens. A force encoder ($g_\psi$), realized as a 2-layer MLP, processes low-pass filtered joint torques to produce a $1 \times d$ force token. In the policy transformer $\pi_\theta$, vision and force tokens are concatenated and processed by 6 encoder and 6 decoder layers (hidden size 512, 8 heads), with separate decoder cross-attention matrices for vision and force, $\alpha_V^{(\ell)}$ and $\alpha_F^{(\ell)}$. Monitoring these weights reveals a shift from heavy force attention early in training (with high blur) to increasingly balanced multimodal attention as visual blur decays (Liu et al., 24 Feb 2025).
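  • Algorithmic Outline for FACTR: at each training iteration $n$, evaluate the schedule $\gamma(n)$, blur the visual input with strength $\gamma(n)$, encode the blurred image and the filtered joint torques into vision and force tokens, and update the policy $\pi_\theta$ on those tokens. As $\gamma(n)$ decays, the policy is trained on progressively cleaner vision.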

  • Guided Curriculum Learning for Walking: The policy operates over state $s_t = [rs_t, I_t]$, where $rs_t$ encapsulates proprioceptive and contact variables and $I_t$ is a perception input (e.g., a 48×48 depth image). Guided forces, comprising base-stabilizing and joint-level PD controllers, are injected into the dynamics but not observed by the policy itself, ensuring contact-reliant skill acquisition. The curriculum advances only when success thresholds, such as three consecutive task completions, are achieved (Tidd et al., 2020).
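
One way to monitor the balance between vision and force attention is to compare the total attention mass placed on each modality. The sketch below assumes the cross-attention weights for one decoder query are available as two plain lists, one over the vision tokens and one over the force token(s). The function name `force_attention_share` and this aggregation rule are illustrative assumptions, not the authors' diagnostic.

```python
def force_attention_share(alpha_v, alpha_f):
    """Fraction of total cross-attention mass assigned to force tokens.

    alpha_v: non-negative weights over the vision tokens (e.g. 196 values).
    alpha_f: non-negative weights over the force token(s) (e.g. 1 value).
    Returns a value in [0, 1]; values near 1 mean the query attends mostly to force.
    """
    vision_mass = sum(alpha_v)
    force_mass = sum(alpha_f)
    total = vision_mass + force_mass
    if total <= 0:
        raise ValueError("attention weights must have positive total mass")
    return force_mass / total
```

Logging this share per layer over training would be expected to show high values while the blur is strong and a drift toward a more balanced split as the blur decays, if the behaviour described above holds.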

4. Curriculum Schedules and Contact Modulation

CGCL curricula adopt well-defined progression strategies:

  • Visual/Latent Corruption Schedules (FACTR):
    • Linear decay of $\gamma(n)$.
    • Cosine decay of $\gamma(n)$.
    • Decay from the maximum corruption level to 0 over a fixed number of iterations, optionally with a warmup phase retaining maximal corruption (Liu et al., 24 Feb 2025).
  • Force and Terrain Curricula (Walking):
    • Terrain: a discrete set of terrains ordered from easiest to hardest.
    • Guiding Forces: the guiding-force magnitude is decayed each time the success criterion is met.
    • Perturbations: the magnitude of the random forces applied to the robot base is progressively adjusted (Tidd et al., 2020).
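
A corruption schedule of this kind can be sketched as a single function. The closed forms used here (a straight line and a half-cosine from `gamma_max` down to 0) are the standard textbook shapes and are assumed for illustration; the exact expressions, symbol names, and default values in the FACTR paper may differ. `blur_schedule`, `gamma_max`, `total_iters`, and `warmup` are hypothetical names.

```python
import math


def blur_schedule(n, total_iters, gamma_max=1.0, kind="linear", warmup=0):
    """Blur strength at iteration n, decaying from gamma_max to 0.

    During the first `warmup` iterations the blur stays at gamma_max.
    `kind` selects the assumed decay shape: "linear", "cosine", or "constant".
    """
    if total_iters <= warmup:
        raise ValueError("total_iters must exceed warmup")
    if n < warmup:
        return gamma_max
    # Fraction of the decay phase completed, clipped to [0, 1].
    progress = min(1.0, (n - warmup) / (total_iters - warmup))
    if kind == "linear":
        return gamma_max * (1.0 - progress)
    if kind == "cosine":
        return gamma_max * 0.5 * (1.0 + math.cos(math.pi * progress))
    if kind == "constant":
        return gamma_max
    raise ValueError(f"unknown schedule kind: {kind}")
```

The returned value would then set the strength of the blur applied to each training image, for example as the standard deviation of a Gaussian kernel; that mapping is also an assumption of this sketch.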

Curriculum transitions are success-based, not merely time-based; only after sustained task completion does the difficulty advance.
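
Success-based gating can be sketched as a small counter that advances the difficulty level only after a run of consecutive successes. The class name `SuccessGatedCurriculum` and its interface are hypothetical; the default streak of three follows the criterion quoted for the walking setup.

```python
class SuccessGatedCurriculum:
    """Advance one difficulty level after `required_streak` consecutive successes."""

    def __init__(self, num_levels, required_streak=3):
        if num_levels < 1 or required_streak < 1:
            raise ValueError("num_levels and required_streak must be >= 1")
        self.num_levels = num_levels
        self.required_streak = required_streak
        self.level = 0    # current difficulty index, 0 = easiest
        self.streak = 0   # consecutive successes at the current level

    def report(self, success):
        """Record one episode outcome; return True if the level advanced."""
        if not success:
            self.streak = 0  # any failure resets the streak
            return False
        self.streak += 1
        at_last_level = self.level >= self.num_levels - 1
        if self.streak >= self.required_streak and not at_last_level:
            self.level += 1
            self.streak = 0
            return True
        return False
```

A multi-axis curriculum (terrain, then guide force, then perturbation) could chain one such counter per axis, starting the next axis once the previous one reaches its final level.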

5. Quantitative Results and Evaluation Metrics

CGCL consistently outperforms non-curriculum or naively multimodal policies across a range of contact-rich settings.

  • FACTR (Contact Manipulation Tasks):
    • Four tasks: two-arm box lift, non-prehensile pivot, fruit pick-and-place, dough rolling.
    • Success rates on held-out objects: vision-only (21.3%), vision+force w/o curriculum (61.2%), FACTR (87.5%). This corresponds to a relative improvement of approximately 43% over the non-curriculum multimodal policy.
    • Task-specific gains: e.g., box lifting success 91.7% (FACTR) vs. 58.3% (no curriculum), dough rolling 80.0% (FACTR) vs. 0% (vision-only).
    • Teleoperation metrics: FACTR's force-feedback leader arm increases user completion rates by 64.7%, reduces time by 37.4%, and improves ease-of-use scores by 83.3% compared to passive teleop (Liu et al., 24 Feb 2025).
  • Guided Curriculum Learning for Walking:
    • Main metric: total distance traversed on test terrains (no external forces, maximal difficulty).
    • With all three stages (terrain, force, perturbation), coverage rates reach 99.9% (flat), 72.3% (gap), and 58.5% (hurdle), among other terrains.
    • Ablations: omitting any stage of the curriculum causes a substantial performance collapse; e.g., gap-terrain coverage drops from 72.3% (full CGCL) to 1.5%–12.8% with reduced curricula (Tidd et al., 2020).

| Setting | Baseline | Non-Curriculum | CGCL / FACTR |
|---|---|---|---|
| Box Lift (success, %) | 31.7 | 58.3 | 91.7 |
| Dough Rolling (success, %) | 0.0 | 70.0 | 80.0 |
| Walking Flat (%) | 79.4 | – | 99.9 |
| Walking Gap (%) | 12.8 | – | 72.3 |
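
The relative improvement quoted for FACTR over the non-curriculum multimodal policy follows directly from the reported held-out success rates (87.5% vs. 61.2%):

```latex
\[
\frac{87.5 - 61.2}{61.2} = \frac{26.3}{61.2} \approx 0.43
\]
```

This is a relative gain of about 43%, corresponding to an absolute gain of 26.3 percentage points.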

6. Implementation Parameters and Experimental Setups

FACTR employs a batch size of 128, the AdamW optimizer with cosine learning rate decay, and approximately 40,000–50,000 training steps. The network runs on a single RTX 4090 GPU and completes training in 3–5 hours (Liu et al., 24 Feb 2025). The Guided Curriculum Walking setup uses PPO with standard hyperparameters (batch size 64), depth images at 20 Hz, and 10 curriculum steps per axis; the curriculum advances only after three consecutive full-length trials succeed at the current complexity level (Tidd et al., 2020).
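
For reference, the reported settings can be collected into two small configuration records. This is only a convenience sketch: the class and field names are hypothetical, and the learning rates are left unset because no usable value is given in this text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FactrTrainConfig:
    batch_size: int = 128
    optimizer: str = "AdamW"
    lr_schedule: str = "cosine"
    learning_rate: Optional[float] = None  # not recoverable here; set explicitly
    train_steps_range: Tuple[int, int] = (40_000, 50_000)  # approximate range


@dataclass(frozen=True)
class WalkingTrainConfig:
    algorithm: str = "PPO"
    batch_size: int = 64
    learning_rate: Optional[float] = None  # not recoverable here; set explicitly
    depth_image_rate_hz: int = 20
    curriculum_steps_per_axis: int = 10
    required_consecutive_successes: int = 3
```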

7. Context, Significance, and Extensions

CGCL demonstrates that separating out contact-relevant curricula—either through sensory input manipulation or via physical guidance/perturbation schedules—yields robust, sample-efficient learning without excessive domain or reward engineering. The methodology is extensible to diverse domains of contact-rich robotics (e.g., push-recovery, manipulation, stair-climbing), where reliable physical interaction with dynamic, uncertain environments is critical. Notably, the modularity of CGCL allows practitioners to tailor curriculum axes (task difficulty, modality corruption, guidance types) to the demands of their specific domain, provided well-posed success criteria and transition protocols are defined.

The explicit accounting for contact dynamics, phased guidance, and disturbance training present in CGCL distinguishes it from generic curriculum learning and imbues policies with substantially improved generalization and robustness, as evidenced by ablation and transfer studies (Liu et al., 24 Feb 2025, Tidd et al., 2020).
