Fractional Transfer Learning (FTL)
- Fractional Transfer Learning is a method that blends pretrained network weights with random initialization using a tunable fraction ω to control knowledge transfer.
- It employs a convex combination mechanism that balances retaining useful prior information while avoiding negative transfer in new tasks.
- Empirical studies in deep model-based RL with Dreamer demonstrate improved sample efficiency and asymptotic returns on various continuous-control tasks.
Fractional Transfer Learning (FTL) is a method for parameter-based transfer learning that mixes a fraction of a source network’s pretrained weights with a random initialization to seed learning in a new task. Rather than defaulting to full parameter transfer (ω = 1) or pure random initialization (ω = 0), FTL “blends” source parameters with a tunable fractional coefficient ω. This approach enables explicit control over the amount of knowledge reused from prior tasks, mitigating the information loss of full randomization while avoiding the negative transfer that can result from indiscriminate full reuse. FTL has been specifically evaluated in the context of deep model-based reinforcement learning using the Dreamer algorithm, demonstrating substantial improvements in sample efficiency and learning performance across multi-source visual continuous-control tasks (Sasso et al., 2021).
1. Formal Definition and Mechanism
Fractional Transfer Learning operates by initializing each target network layer as a convex combination of the corresponding source layer’s pretrained weights and a new random initialization. If W_S denotes the pretrained source weights, W_R a freshly generated random tensor of equal shape, and ω ∈ [0, 1] the transfer fraction, the FTL initialization is

W_init = ω · W_S + (1 − ω) · W_R

This formulation recovers the statistical properties of random initialization (when ω = 0) and exact parameter reuse (when ω = 1), while intermediate values tune how much prior knowledge is retained. The technique is directly compatible with standard initialization schemes such as Glorot or Kaiming for W_R.
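The blend above can be sketched directly in NumPy. This is a minimal illustration, not the reference implementation; the Glorot-style bound used for the random component is one possible choice:

```python
import numpy as np

def ftl_init(w_source: np.ndarray, omega: float, rng=None) -> np.ndarray:
    """Blend pretrained source weights with a fresh random tensor.

    omega = 0 recovers pure random initialization; omega = 1 recovers
    full parameter transfer; intermediate values interpolate linearly.
    """
    rng = np.random.default_rng(rng)
    # Glorot/Xavier-style fan-based bound for the random component
    # (an illustrative choice of initialization scheme).
    fan_in, fan_out = w_source.shape[0], w_source.shape[-1]
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    w_random = rng.uniform(-limit, limit, size=w_source.shape)
    return omega * w_source + (1.0 - omega) * w_random

# Sanity checks at the endpoints:
w_src = np.ones((4, 3))
assert np.allclose(ftl_init(w_src, 1.0), w_src)          # full transfer
assert np.abs(ftl_init(w_src, 0.0, rng=0)).max() < 1.0   # pure random, Glorot-bounded
```

Because the operation is a per-element linear interpolation, it applies unchanged to weight matrices, convolutional kernels, and bias vectors of any shape.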
2. Motivation: Balancing Knowledge Retention and Flexibility
Traditional parameter transfer strategies in neural networks, particularly in reinforcement learning (RL), often treat transfer as all-or-nothing: parameters are either fully reused or entirely discarded. This dichotomy leads to two primary drawbacks:
- Loss of Useful Information: Pure randomization discards all structure acquired by the source network, eliminating potential accelerants for early-stage optimization, especially where partial feature reuse would be beneficial.
- Overfitting and Interference: Full transfer of parameters (notably in output layers such as reward and value heads) can codify task-specific biases. Adaptation to new, divergent reward functions or value structures may then be hindered by the optimizer’s need to “unlearn” this bias.
FTL provides a principled compromise, preserving prior knowledge in proportion to ω, thereby aiding sample efficiency while offering a safeguard against interference from incompatible representations. Empirically, FTL has been shown to enhance both initial “jumpstart” performance and asymptotic returns in tasks with shared partial structure (Sasso et al., 2021).
3. Application within Dreamer and Component-wise Strategy
Dreamer comprises (i) a variational encoder/decoder (CNN-based VAE), (ii) a recurrent state-space model (RSSM) for dynamics, (iii) a reward predictor, (iv) an actor network, and (v) a value network. Integration of FTL into Dreamer proceeds on a per-layer, per-component basis, guided by task and architectural compatibility:
| Component | Transfer Strategy | Rationale |
|---|---|---|
| Encoder/Decoder CNNs (VAE) | Full transfer (ω = 1) | Latent representations likely generalize across related visual tasks |
| RSSM transition model | Full transfer (ω = 1) | Core dynamics benefit from reuse when physical laws are similar |
| Reward model (last layer) | Fractional (0 < ω < 1) | Reward mapping is task-dependent; blending preserves flexibility |
| Value model (last layer) | Fractional (0 < ω < 1) | Value head is sensitive to the new reward structure; blending advisable |
| Preceding layers (reward/value) | Full transfer (ω = 1) | Shared “feature extraction” layers are more generalizable |
| Actor last layer, input-to-RSSM | Pure random (ω = 0) | Task dimension misalignment requires fresh initialization |
All parameters are made fully trainable post-initialization; FTL does not enforce any freezing. The initialization and training sequence for FTL-Dreamer is detailed in Algorithm 1 of (Sasso et al., 2021).
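The per-component strategy in the table can be expressed as a configuration map from component name to transfer fraction. A minimal NumPy sketch; the component names and the ω = 0.2 value for the fractional heads are illustrative, following the setting used in the reported experiments:

```python
import numpy as np

# Hypothetical per-component transfer fractions mirroring the table above.
TRANSFER_PLAN = {
    "encoder":          1.0,   # full transfer: visual features generalize
    "rssm_transition":  1.0,   # full transfer: shared dynamics
    "reward_head_last": 0.2,   # fractional: task-dependent reward mapping
    "value_head_last":  0.2,   # fractional: sensitive to new reward structure
    "actor_last":       0.0,   # pure random: task dimension misalignment
}

def blend(source: np.ndarray, random: np.ndarray, omega: float) -> np.ndarray:
    """Convex combination of source and random weights."""
    return omega * source + (1.0 - omega) * random

def init_target(source_params: dict, random_params: dict,
                plan: dict = TRANSFER_PLAN) -> dict:
    """Build target-network parameters component by component."""
    return {name: blend(source_params[name], random_params[name], plan[name])
            for name in plan}
```

All blended parameters remain fully trainable afterwards; the plan only controls initialization, not freezing.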
4. Hyperparameterization of the Fractional Coefficient
The fractional transfer coefficient ω is treated as a hyperparameter, analogous to a learning rate or dropout rate. Selection is component-specific but global within a head: all last-layer parameters of a given head share the same ω. A grid search over candidate values of ω allows empirical tuning; small fractions (e.g., ω = 0.2, as used in the results reported below) performed well.
Potential extensions include layer-wise or adaptive schedules for ω (e.g., via meta-learning or sensitivity analysis), which could further mitigate negative transfer and optimize knowledge reuse.
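A grid search over ω can be sketched as follows; `train_and_evaluate` is a hypothetical stand-in for a full FTL-Dreamer training run that returns mean episode return, and the grid values are illustrative:

```python
def grid_search_omega(train_and_evaluate,
                      grid=(0.0, 0.1, 0.2, 0.3, 0.5, 1.0)):
    """Evaluate each candidate transfer fraction and return the best.

    `train_and_evaluate(omega)` is assumed to run FTL initialization
    with the given fraction, train to completion, and return a scalar
    performance measure (e.g., mean episode return).
    """
    results = {omega: train_and_evaluate(omega) for omega in grid}
    best = max(results, key=results.get)
    return best, results
```

In practice each evaluation is a full RL training run, so the grid is kept coarse; adaptive schemes would aim to avoid this cost.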
5. Experimental Protocol and Baseline Comparison
Empirical evaluation encompasses six PyBullet continuous-control tasks with visual inputs: HalfCheetah, Hopper, Walker2D, Ant, InvertedPendulum, and InvertedDoublePendulum. Transfer is multi-source: a Dreamer agent is pretrained jointly on two, three, or four source tasks, transferred to a target task via the FTL initialization (Algorithm 1), and then trained on the target task for a fixed budget of environment steps.
Baselines are:
- DREAMER-Scratch: Random initialization, identical architecture and hyperparameters.
- DREAMER-RandInitLast: Identical to FTL, but the last reward/value layers are purely random (ω = 0).
Performance is assessed by episode return early in training (jumpstart), mean return over the full training run, and mean return over the final training steps (asymptotic performance).
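These three metrics can be computed from a logged curve of episode returns. A minimal NumPy sketch; the window fractions for the jumpstart and asymptotic windows are assumptions for illustration, not values from the paper:

```python
import numpy as np

def summarize_returns(returns, jumpstart_frac=0.05, asymptotic_frac=0.1):
    """Summarize a training curve with the three protocol metrics.

    jumpstart: mean return over the first `jumpstart_frac` of training;
    mean:      mean return over the whole run;
    asymptotic: mean return over the final `asymptotic_frac` of training.
    Window fractions are illustrative assumptions.
    """
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    k_js = max(1, int(n * jumpstart_frac))
    k_as = max(1, int(n * asymptotic_frac))
    return {
        "jumpstart": returns[:k_js].mean(),
        "mean": returns.mean(),
        "asymptotic": returns[-k_as:].mean(),
    }
```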
6. Empirical Findings
The following tables summarize the effect of FTL (ω = 0.2, two source tasks) compared to DREAMER-Scratch. Reported values are episode returns (mean ± standard deviation):

Table 1: Average Episode Return (mean over training)

| Task | FTL (sources = 2, ω = 0.2) | Baseline Scratch |
|---|---|---|
| HalfCheetah | — | — |
| Hopper | — | — |
| Walker2D | — | — |
| InvertedPendulum | — | — |
| InvertedDoublePendulum | — | — |
| Ant | — | — |
Table 2: Asymptotic Episode Return (final training steps)

| Task | FTL (sources = 2, ω = 0.2) | Baseline Scratch |
|---|---|---|
| HalfCheetah | — | — |
| Hopper | — | — |
| Walker2D | — | — |
| InvertedPendulum | — | — |
| InvertedDoublePendulum | — | — |
| Ant | — | — |
FTL yields substantial gains in both overall and asymptotic performance on HalfCheetah, Hopper, Walker2D, and both pendulum tasks. Negative transfer is observed on Ant, consistent with the general finding that transfer benefit depends on the degree of task similarity. The random-init last-layer condition often improves over training from scratch but is consistently inferior to the fractional strategy (Sasso et al., 2021).
7. Limitations and Prospective Directions
Several limitations and potential extensions emerge:
- Negative Transfer: With highly dissimilar tasks, e.g., transferring to Ant from locomotion sources, even fractional reuse can reduce performance.
- Static Fraction Assignment: A single global ω per head may not capture the optimal transfer amount for every layer or target-task combination. Adaptive or layer-wise strategies could further reduce harmful transfer.
- Dynamics Model Transfer: The current implementation fully transfers the RSSM; however, selective or fractional transfer for dynamics weights may be beneficial for tasks with differing physical structure.
- Broader Applicability: While demonstrated in multi-source model-based RL, FTL’s underlying mechanism is applicable to single-source settings and may generalize to supervised learning transfer, motivating further study.
Fractional Transfer Learning provides a pragmatic and effective mechanism for leveraging partial task similarity in neural network-based RL, positioned between complete reuse and full re-randomization, with measurable benefits in data efficiency and policy quality across representative continuous-control environments (Sasso et al., 2021).