FLNet: Modular Architectures for Data-Efficient AI

Updated 14 January 2026

FLNet is a name shared by several modular deep learning architectures designed for high-fidelity, data-efficient prediction in areas such as flood damage assessment, federated routability estimation, and facial animation synthesis.
Each variant relies on specialized modules (super-resolution followed by segmentation, federated optimization without batch normalization, or dual-stream warping and learning) to target domain-specific challenges.
The reported results match or improve on the respective baselines while addressing data scarcity, privacy, and temporal consistency across these heterogeneous applications.

FLNet refers to a set of distinct deep learning architectures, each sharing a common focus on modular network design for high-fidelity, data-efficient prediction or synthesis in domains with acute data challenges. Notably, the term FLNet has appeared in contexts as disparate as flood-induced agricultural damage assessment using super-resolved satellite imagery (Ghosal et al., 7 Jan 2026), federated learning for routability estimation in electronic design automation (EDA) (Pan et al., 2022), and landmark-driven facial animation synthesis (Gu et al., 2019). The following article provides a domain-specific encyclopedic account of these major FLNet variants, with an emphasis on their core formulations, architectures, loss functions, experimental protocols, and empirical results.

1. FLNet for Flood-Induced Agriculture Damage Assessment

The variant of FLNet presented in (Ghosal et al., 7 Jan 2026) addresses rapid post-disaster crop damage mapping under data and cost constraints, improving granularity and accuracy over traditional manual and satellite-based approaches.

Problem Formulation

FLNet targets semantic segmentation of flood-induced damage on croplands, with the following formulation:

Inputs: Pre-flood and post-flood Sentinel-2 NDVI imagery at 10 m ground sampling distance (GSD), denoted as $X_{\text{pre}, \text{S2}}, X_{\text{post}, \text{S2}} \in \mathbb{R}^{H \times W}$ .
Intermediate Objective: Super-resolution to 3 m GSD, yielding estimated high-resolution pairs $\hat{Y}_{\text{pre}}, \hat{Y}_{\text{post}} \in \mathbb{R}^{\hat{H} \times \hat{W}}$ , with $\hat{H} \approx \frac{10\,\mathrm{m}}{3\,\mathrm{m}} H$ and $\hat{W} \approx \frac{10\,\mathrm{m}}{3\,\mathrm{m}} W$ (each side grows by a factor of about 3.33).
Final Output: Pixel-wise segmentation map $L \in \{0, 1, 2\}^{\hat{H} \times \hat{W}}$ , indicating {No, Partial, Full} crop damage.

The process decomposes into: (a) Super-resolution (SR): $\hat{Y} = \mathrm{SR}(X; \theta_{\mathrm{SR}}) + \epsilon$ , where $\theta_{\mathrm{SR}}$ parameterizes the EDSR module. (b) Change quantification and classification: $\Delta\mathrm{NDVI} = \hat{Y}_{\text{pre}} - \hat{Y}_{\text{post}}$ , followed by $L = G_\phi(\Delta\mathrm{NDVI})$ , where $G_\phi$ is a UNet segmentation head.
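The shape bookkeeping and the change-quantification step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `sr_grid_shape` and `delta_ndvi` are hypothetical, grids are nested lists rather than tensors, and the SR network and UNet themselves are not modeled.

```python
def sr_grid_shape(h, w, src_gsd=10.0, dst_gsd=3.0):
    """Approximate grid size after resampling from src_gsd to dst_gsd (metres per pixel)."""
    scale = src_gsd / dst_gsd  # about 3.33 for Sentinel-2 (10 m) to 3 m
    return round(h * scale), round(w * scale)


def delta_ndvi(pre, post):
    """Pixel-wise NDVI drop (pre minus post) for two equally sized 2-D grids."""
    same_rows = len(pre) == len(post)
    if not same_rows or any(len(a) != len(b) for a, b in zip(pre, post)):
        raise ValueError("pre and post grids must have the same shape")
    return [[p - q for p, q in zip(row_pre, row_post)]
            for row_pre, row_post in zip(pre, post)]
```

For example, a 300×300 Sentinel-2 tile maps to roughly a 1000×1000 grid at 3 m, and a pixel whose NDVI falls from 0.75 to 0.25 contributes a ΔNDVI of 0.5.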

Network Architecture

FLNet consists of two sequential subnets:

EDSR Super-Resolution Module: Single-band NDVI input upsampled via bicubic interpolation. Initial 1×1 convolution lifts channels to 64. Sixteen residual blocks each apply Conv(64→64, k=3×3) → ReLU → Conv(64→64, k=3×3) with skip connections and no batch normalization. Final 3×3 convolution and dual stacked pixel-shuffle layers (×2 and ×2, respectively) realize the net ×3.33 upsampling (10 m to 3 m). Final 3×3 convolution outputs the SR NDVI estimate.
UNet Segmentation Head: The encoder comprises four down-sampling blocks (each Conv(1→32, k=3), ReLU ×2, MaxPool), a bottleneck (Conv(32→64), ReLU, Conv(64→64), ReLU), and a symmetric decoder with upsampling, skip connections, and two Conv(32→32, k=3), ReLU blocks per stage. A final 1×1 convolution projects to three channels, followed by a softmax for damage-class probabilities.
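To make the pixel-shuffle upsampling step concrete, the following is a minimal pure-Python sketch of the rearrangement a single pixel-shuffle layer performs. It assumes the common channel ordering in which `r*r` consecutive channels feed one output channel; the function name `pixel_shuffle` and the nested-list representation are illustrative choices, not the authors' code.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r)."""
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r output cell
            for j in range(r):      # column offset inside each r x r output cell
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for z in range(w):
                        out[c][y * r + i][z * r + j] = src[y][z]
    return out
```

Four 1×1 input channels with `r=2` therefore become one 2×2 output channel: spatial resolution is traded for channel depth without any interpolation.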

Loss Functions and Optimization

Three core objectives guide training:

Super-Resolution Loss:

$\mathcal{L}_{\mathrm{SR}}(\theta_{\mathrm{SR}}) = \| Y_{\mathrm{PS}} - \mathrm{SR}(X_{\mathrm{S2}}; \theta_{\mathrm{SR}}) \|_1$ , where $Y_{\mathrm{PS}}$ is PlanetScope NDVI at 3 m.

Classification Loss:

$\mathcal{L}_{\text{cls}}(\phi) = -\sum_i \sum_c w_c\, \mathbf{1}(L_i = c)\,\log p_{i,c}$ , with class weights for imbalance (emphasizing "Full Damage").

Combined Loss (for multi-phase or end-to-end regimes):

$\mathcal{L} = \lambda_{\mathrm{SR}} \mathcal{L}_{\mathrm{SR}} + \lambda_{\text{cls}} \mathcal{L}_{\text{cls}}$ . Training is sequential: first the SR module ( $\lambda_{\mathrm{SR}} = 1$ ), then the UNet ( $\lambda_{\text{cls}} = 1$ ).
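The two loss terms and their weighted combination can be written out directly from the formulas above. The sketch below uses flat Python lists, and the class weights in the usage note are made-up values chosen only to show the heavier weighting of "Full Damage"; the source does not report the actual weights.

```python
import math


def l1_loss(target, prediction):
    """L1 norm of the residual between target and predicted NDVI values."""
    return sum(abs(t - p) for t, p in zip(target, prediction))


def weighted_cross_entropy(probs, labels, class_weights):
    """Return -sum_i w_{L_i} * log p_{i, L_i} over all pixels i."""
    total = 0.0
    for p_i, label in zip(probs, labels):
        total -= class_weights[label] * math.log(p_i[label])
    return total


def combined_loss(loss_sr, loss_cls, lambda_sr=1.0, lambda_cls=1.0):
    """Weighted sum of the super-resolution and classification losses."""
    return lambda_sr * loss_sr + lambda_cls * loss_cls
```

With hypothetical weights `[1.0, 1.0, 2.0]`, a "Full Damage" pixel predicted at probability 0.5 contributes twice the penalty of a "No Damage" pixel predicted at the same probability. Setting `lambda_cls=0.0` or `lambda_sr=0.0` reproduces the two sequential training phases.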

Dataset and Experimental Setup

FLNet is developed on the BFCD-22 (Bihar Flood Cropland Damage, 2022) dataset:

Region: Muzaffarpur, Bihar, India, October 2022 flood event.
Sources: 12 pre/post Sentinel-2 tile pairs, 12 co-registered PlanetScope scenes (3 m RGB+NIR).
Processing: Cloud/shadow masking (S2 L2A, PS UDM2), orthorectification, co-registration at 3 m, NDVI computation, cropping to cropland mask.
Chips: ~1,200 256×256 patches at 3 m, 80/20 train/test split.
Label Generation: ΔNDVI thresholding on PlanetScope, followed by morphological smoothing to produce the segmentation ground truth.
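Threshold-based label generation can be illustrated as follows. The cut-offs `partial_thr=0.2` and `full_thr=0.4` are placeholders invented for this sketch, since the source does not state the thresholds, and the morphological smoothing step is omitted.

```python
NO_DAMAGE, PARTIAL_DAMAGE, FULL_DAMAGE = 0, 1, 2


def label_from_delta(delta, partial_thr=0.2, full_thr=0.4):
    """Map one NDVI drop to a damage class using two hypothetical thresholds."""
    if delta >= full_thr:
        return FULL_DAMAGE
    if delta >= partial_thr:
        return PARTIAL_DAMAGE
    return NO_DAMAGE


def label_map(delta_grid, partial_thr=0.2, full_thr=0.4):
    """Apply the thresholds to every pixel of a 2-D NDVI-difference grid."""
    return [[label_from_delta(d, partial_thr, full_thr) for d in row]
            for row in delta_grid]
```

Because a harvest also lowers NDVI, any fixed threshold of this kind cannot by itself distinguish flood damage from normal crop cycles.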

Training uses an NVIDIA A100 GPU, PyTorch, Adam optimizer with learning rates $1 \times 10^{-4}$ (EDSR) and $1 \times 10^{-3}$ (UNet), early stopping, ReduceLROnPlateau scheduler, and augmentations (random flips, ±10° rotations).

Empirical Results

Super-Resolution Fidelity:

PSNR: pre-flood 21.10 dB, post-flood 20.77 dB; SSIM: pre 0.860, post 0.748, benchmarked against PlanetScope.
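For reference, PSNR is derived from the mean squared error between the super-resolved estimate and the reference image. The sketch below is a generic implementation; the `data_range` default of 1.0 is an assumption, as the source does not say how NDVI values were scaled before evaluation.

```python
import math


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D grids."""
    ref = [v for row in reference for v in row]
    est = [v for row in estimate for v in row]
    if len(ref) != len(est) or not ref:
        raise ValueError("grids must be non-empty and have the same size")
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)
```

A mean squared error of 0.01 on unit-range data corresponds to 20 dB, which gives a feel for the roughly 21 dB figures reported here.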

Damage Classification F1:

Source	No Damage	Partial Damage	Full Damage
Sentinel-2	1.00	0.98	0.83
PlanetScope	0.98	0.90	0.89
SR (FLNet)	0.99	0.96	0.89

The super-resolved input reaches a "Full Damage" F1 of 0.89, matching the PlanetScope reference and improving on native Sentinel-2 (0.83).

Discussion and Implications

Super-resolution sharpens field boundaries and suppresses mixed-pixel errors in ΔNDVI, raising F1 for the hardest class ("Full Damage") by 0.06 absolute over native Sentinel-2 (0.83 to 0.89). FLNet enables scalable, real-time (~0.5 s per chip), cost-effective damage mapping from free Sentinel-2 data. Key limitations include residual cloud and misalignment artifacts and reliance on threshold-driven ground truth generation, which may conflate flood damage with harvest cycles. Anticipated enhancements include SAR fusion, Transformer-based SR, and integrated cloud removal (Ghosal et al., 7 Jan 2026).

2. FLNet for Federated Routability Estimation in EDA

In (Pan et al., 2022), FLNet denotes a federated learning-compatible, fully-convolutional architecture tailored to routability hotspot prediction across confidential, decentralized IC design datasets.

Design Principles and Problem Motivations

Challenge: EDA ML suffers from data fragmentation—design data are private, proprietary, and insufficiently diverse at any single site.
Federated Learning Approach: Multiple clients (companies) collaboratively train a shared model for routability estimation without exchanging raw layout data.
Heterogeneity Consideration: Circuit styles across clients yield significant feature distribution shifts (non-IID), which can degrade naive federated model performance.
Architectural Compactness: FLNet omits batch normalization, keeps the parameter count minimal, and uses large convolutional kernels to maximize the receptive field while avoiding layers whose statistics are fragile to synchronize across sites.

Architecture

FLNet adopts a shallow, wide receptive-field structure:

Layer	Kernel Size	Channels Output	Activation
input_conv	9×9	64	ReLU
output_conv	9×9	1	None

Input feature maps encode cell-density, RUDY wire-density, and blockage information on grid sites; the network processes spatial context over (9+9−1)=17-pixel neighborhoods without pooling.
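The 17-pixel figure follows from the standard receptive-field recurrence for stacked stride-1 convolutions, where each layer with kernel size k adds k − 1 pixels. A small sketch, with `receptive_field` as a hypothetical helper name:

```python
def receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1, undilated convolutions without pooling."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1  # each layer widens the window by k - 1 pixels
    return rf
```

For the two 9×9 layers in the table, `receptive_field([9, 9])` gives 17. Reaching the same span with 3×3 kernels would take eight layers.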

Federated Optimization Algorithms

FedAvg: Standard aggregation: each client updates model via local SGD; server weight-averages by sample count.
FedProx: Augments client loss with a quadratic proximity penalty to the global model, controlling drift: $\min_w F_k(w) + \frac{\mu}{2}\|w - w^t\|^2$ , stabilizing convergence under non-IID features.
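Both algorithms reduce to short formulas once model weights are treated as flat vectors. The sketch below shows sample-count-weighted averaging for FedAvg and the proximal term (and its gradient) for FedProx; the function names are illustrative, and the local loss value and gradient are assumed to be supplied by the client's own training code.

```python
def fedavg(client_weights, sample_counts):
    """Server step: average client weight vectors, weighted by local sample count."""
    total = sum(sample_counts)
    dim = len(client_weights[0])
    return [sum(n * w[i] for w, n in zip(client_weights, sample_counts)) / total
            for i in range(dim)]


def fedprox_objective(local_loss, w, w_global, mu):
    """Client objective: F_k(w) + (mu / 2) * ||w - w_global||^2."""
    prox = 0.5 * mu * sum((a - b) ** 2 for a, b in zip(w, w_global))
    return local_loss + prox


def fedprox_gradient(local_grad, w, w_global, mu):
    """Gradient of the FedProx objective: grad F_k(w) + mu * (w - w_global)."""
    return [g + mu * (a - b) for g, a, b in zip(local_grad, w, w_global)]
```

The extra gradient term `mu * (w - w_global)` pulls each client back toward the current global model, which is what limits drift when client data are non-IID. Setting `mu=0` recovers plain local training as used in FedAvg.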

Personalization Mechanisms

Local Fine-tuning: After federated training produces the global model $w^T$ , each client adapts it locally on its own data.
FedProx-LG: Split weights into global and local partitions; only global parameters synchronize.
Cluster-based Schemes: IFCA dynamically clusters clients by minimizing local loss over k prototypes; assigned clustering uses static segmentation based on benchmark suite.
α-Portion Synchronization: Each client updates via a convex combination of own and aggregated weights, controlled by α.
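As an illustration of α-portion synchronization, the update is a per-parameter convex combination. The sketch assumes that α weights the client's own parameters and 1 − α the aggregated ones; the source does not say which side α applies to, so the convention may be reversed in the original work.

```python
def alpha_sync(local_w, aggregated_w, alpha):
    """Blend local and aggregated weights: alpha * local + (1 - alpha) * aggregated."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * l + (1.0 - alpha) * g
            for l, g in zip(local_w, aggregated_w)]
```

Under this convention `alpha=0` reduces to ordinary global synchronization and `alpha=1` to purely local training, with `alpha=0.5` (the setting in the results table) sitting halfway between.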

Dataset and Partitioning

Benchmark: 74 public designs (ITC’99, ISCAS’89, IWLS’05, ISPD’15), 7131 total placement samples.
Client Groups: K=9 clients by suite; ~70% of each client for training, 30% for testing. Per-client train/test size: 175–812/84–348 samples.

Results

ROC AUC (mean across clients):

Method	Avg. ROC AUC
Local baseline	0.72
Centralized (upper bound)	0.81
FedProx Global	0.78
FedProx + Fine-tuning	0.80
IFCA	0.77
Assigned Clustering	0.78
α-Portion Sync (α=0.5)	0.78

FedProx + fine-tuning (0.80) approaches the centralized upper bound (0.81), an 11% relative improvement over local-only training (0.72). FLNet substantially outperforms prior architectures such as RouteNet and PROS under federated (non-IID) updates; under FedProx, these older models underperform even the local-only baseline (AUC 0.65 for RouteNet, 0.62 for PROS).

Practical Considerations

Efficiency: Each round, clients transmit ≈0.4 MB of weights (≈ $10^5$ parameters), suitable for commodity network connections; total 50 rounds.
Robustness: Batch-norm removal avoids inter-client synchronization issues.
Privacy: All raw data remains local; secure aggregation and (optionally) differential privacy can be adopted.
Convergence: $\mu = 10^{-4}$ in FedProx, 100 local steps per round, learning rate $2 \times 10^{-4}$ , weight decay $1 \times 10^{-5}$ ; 50 rounds typically suffice to reach near-final performance.
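The communication figures can be checked with simple arithmetic. The sketch assumes 32-bit floating-point weights (4 bytes per parameter) and decimal megabytes; neither assumption is stated in the source, although together they reproduce the quoted 0.4 MB.

```python
def payload_megabytes(num_params, bytes_per_param=4):
    """Size of one weight upload in MB (10**6 bytes), assuming float32 by default."""
    return num_params * bytes_per_param / 1e6


def total_upload_megabytes(num_params, rounds, bytes_per_param=4):
    """Cumulative per-client upload over all federated rounds."""
    return rounds * payload_megabytes(num_params, bytes_per_param)
```

With about 10^5 parameters this gives 0.4 MB per round and roughly 20 MB of upload per client over 50 rounds.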

3. FLNet in Landmark-Driven Facial Animation Synthesis

FLNet (Gu et al., 2019) in the context of facial animation synthesis adopts a two-stream design to maximize texture fidelity and temporal consistency when generating talking face videos from multiple source images.

Architecture and Data Processing

Inputs: Bank of N=5 source images of the same identity spanning mouth-opening poses, each with 68 facial landmarks; per-source landmark-difference maps encode motion priors.
Fetching Stream:
- Attentive blending masks $(M_w)$ and warping fields $(W_w)$ generated via single-layer convolutions and spatial softmax.
- Bilinear warps reassemble source patches, yielding $I_w$ .
Learning Stream:
- Shared encoder–decoder, final conv layer computes residual hallucinations $I_a$ to synthesize unseen or occluded features.
Fusion: Output composes $I_o^c = (1 - V)\odot I_a^c + V\odot I_w^c$ , where mask $V$ is learned.
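The fusion rule is a per-pixel blend of the two streams. A minimal sketch for a single channel, using nested lists and the hypothetical function name `fuse`:

```python
def fuse(v, i_a, i_w):
    """Per-pixel blend I_o = (1 - V) * I_a + V * I_w for one image channel."""
    return [[(1.0 - m) * a + m * w for m, a, w in zip(row_v, row_a, row_w)]
            for row_v, row_a, row_w in zip(v, i_a, i_w)]
```

Where the mask is 1 the output copies the warped source pixel, where it is 0 the output uses the learning stream's synthesized pixel, and intermediate values interpolate between the two.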

Loss Functions and Objectives

Loss Terms:
- Merging mask sparsity: $\mathcal{L}_{S_V} = \|V\|_1$
- Total variation on $V, M_w, W_w$
- Weighted L1 reconstructions for $I_w,I_o$
- VGG16-based perceptual loss for high-level feature similarity
- Hinge adversarial (PatchGAN D with spectral normalization)
Full Objective:

$\mathcal{L} = \lambda_S \mathcal{L}_{S_V} + \lambda_{TV}(\mathcal{L}_{TV_V} + \mathcal{L}_{TV_{ww}} + 0.1\mathcal{L}_{TV_{wm}}) + \lambda_{rec}\mathcal{L}_{rec} + \lambda_p \mathcal{L}_p + \lambda_{adv} \mathcal{L}_{adv}$
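Several of these terms have compact closed forms. The sketch below shows the L1 mask sparsity, an anisotropic total-variation penalty, and the standard hinge formulation of the adversarial loss. The exact TV variant and the reduction (sum versus mean) used by the authors are not given in the source, so those choices are assumptions.

```python
def mask_sparsity(v):
    """L1 norm of the merging mask V."""
    return sum(abs(x) for row in v for x in row)


def total_variation(v):
    """Anisotropic TV: sum of absolute differences between neighbouring pixels."""
    height, width = len(v), len(v[0])
    tv = 0.0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                tv += abs(v[y][x + 1] - v[y][x])
            if y + 1 < height:
                tv += abs(v[y + 1][x] - v[y][x])
    return tv


def hinge_d_loss(real_scores, fake_scores):
    """Discriminator hinge loss: E[max(0, 1 - D(real))] + E[max(0, 1 + D(fake))]."""
    real = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real + fake


def hinge_g_loss(fake_scores):
    """Generator hinge loss: -E[D(fake)]."""
    return -sum(fake_scores) / len(fake_scores)
```

In a PatchGAN setting the score lists would hold one discriminator output per image patch rather than one per image.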

Training Protocol and Datasets

Datasets:
- TCD-TIMIT (62 speakers, 6,913 videos, high texture detail)
- FaceForensics (1,004 YouTube videos, diverse conditions)
Preprocessing:
- 68 landmark detection (Dlib), cropping to 224×224 centered on nose or eye.
- For each target frame, select five source frames spanning mouth states.
Optimization:
- Adam( $\beta_1=0.5, \beta_2=0.999$ ), lr $10^{-4}$ .
- 5 discriminator steps per generator step.

Empirical Results

FLNet surpasses GANimation and X2Face on L1 and FID metrics (lower is better for both):

Method	TCD-TIMIT L1	TCD-TIMIT FID	FaceForensics L1	FF FID
GANimation	10.86	59.65	16.19	47.99
X2Face	8.31	30.50	11.05	23.98
FLNet	7.99	17.07	10.20	20.62

Qualitatively, FLNet preserves fine teeth and lip structures by warping, while synthesizing occluded regions (e.g., closed eyes) via its learning stream. Temporal smoothness and identity consistency are achieved without explicit alignment or post-processing.

Ablation

Ablation analyses confirm that both multi-image warping (the fetching stream) and the appearance-generating learning stream are critical:

Warping-only: fails to hallucinate occluded parts; appearance-only: produces overly smooth, less faithful details.
Full FLNet achieves best quantitative and visual results.

4. Commonalities and Thematic Synthesis

The "FLNet" moniker has denoted distinct architectures with shared emphases on:

Data-Efficiency and Robustness: Compactness (e.g., no batch-norm in federated FLNet) and targeted upsampling (agricultural FLNet) for robustness in low-data or high-heterogeneity contexts.
Modularity: Clear task decomposition—for example, super-resolution + segmentation (agriculture), fetching + learning (face synthesis), or global/local split (federated EDA).
Fidelity-Centric Objectives: Losses and design mechanisms in each FLNet variant are crafted to maximize output fidelity to reference data (whether damage boundaries, facial contents, or IC layout hotspots).

These design principles underlie the gains each FLNet variant reports over prior art in settings constrained by data incompleteness, privacy, cost, or spatial/temporal resolution.

5. Future Directions

For agricultural FLNet, anticipated extensions include integration with all-weather SAR data, Transformer-based backbones for super-resolution, and learned cloud removal (Ghosal et al., 7 Jan 2026). Federated FLNet may benefit from further personalization algorithms, communication compression, and formal security guarantees (Pan et al., 2022). Facial synthesis FLNet may adopt higher-order appearance priors, larger source banks, or self-supervised correspondence learning (Gu et al., 2019). A plausible implication is that the overarching FLNet paradigm will continue to expand where modular architectures can exploit complementary sources of information, whether across modalities, domains, or clients, while preserving data fidelity and operational efficiency.

References (3)

FLNet: Flood-Induced Agriculture Damage Assessment using Super Resolution of Satellite Images (2026)

Towards Collaborative Intelligence: Routability Estimation based on Decentralized Private Data (2022)

FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis (2019)
