LoF Scale: Marine Biofouling Assessment

Updated 4 February 2026

Level of Fouling is a standardized six-point ranking system that categorizes marine biofouling based on visible slime and macrofouling coverage.
Automated assessments use image classification, semantic segmentation, and multimodal LLMs to predict LoF categories and quantify fouling extent.
Hybrid pipelines combining quantitative imaging with LLM-driven interpretability aim to improve transparency at class boundaries, where dataset imbalance remains a challenge.

The Level of Fouling (LoF) scale is a standardized six-point ranking system used to categorize the severity of marine biofouling on vessel hulls and related submerged surfaces. The system quantifies fouling in terms of the presence of slime (biofilm) and the proportion of the surface covered by visible macrofouling organisms. The LoF scale serves as a benchmark for ecological risk assessment, for the management of biofouling operations, and for the development and evaluation of automated biofouling detection and classification systems (Hamilton et al., 28 Jan 2026).

1. Definition and Formal Structure of the LoF Scale

The LoF scale, as defined by Davidson et al. (2019), employs six discrete categories (0–5). Each category corresponds to explicit criteria based on visible slime and the percentage cover by macrofouling organisms:

LoF	Description	Macrofouling Coverage
0	No slime, no macrofouling	0%
1	Slime layer present, no visible macrofouling	0%
2	Sparse macrofouling (patchy or isolated)	1–5%
3	Moderate macrofouling patches	6–15%
4	Extensive macrofouling (majority still clean)	16–40%
5	Heavy macrofouling (very heavy coverage)	41–100%

Assignment to a LoF category follows a binary-decision flow: (1) Is slime visible? If no, LoF 0; if yes, proceed to (2) Is macrofouling present? If no, LoF 1; if yes, estimate the macrofouling coverage and assign LoF 2–5 according to the thresholds above (Hamilton et al., 28 Jan 2026).

2. Mathematical Quantification and Decision Rules

Automated measurement of LoF relies on pixel-wise estimates of surface coverage, typically from semantic segmentation masks. The coverage for macrofouling is given by:

$\mathrm{Coverage}_\mathrm{macro} = \frac{\sum_{p \in \mathcal{M}_\mathrm{macro}} 1}{\sum_{p \in \mathcal{M}_\mathrm{hull}} 1} \times 100\%$

where $\mathcal{M}_\mathrm{macro}$ is the set of pixels predicted as macrofouling, and $\mathcal{M}_\mathrm{hull}$ the set corresponding to the entire hull (including Clean, Slime, Macrofouling). An identical formulation applies to slime coverage.
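The coverage formula can be sketched in a few lines of plain Python. This is a minimal illustration, not the source's implementation: the integer class ids (`WATER`, `CLEAN`, `SLIME`, `MACRO`), the function name `coverage_percentages`, and the toy mask are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical integer class ids; the source names the classes but not their encoding.
WATER, CLEAN, SLIME, MACRO = 0, 1, 2, 3
HULL_CLASSES = (CLEAN, SLIME, MACRO)  # water pixels are excluded from the denominator


def coverage_percentages(mask):
    """Return (slime %, macrofouling %) of hull pixels for a 2D label mask."""
    counts = Counter(label for row in mask for label in row)
    hull_pixels = sum(counts[c] for c in HULL_CLASSES)
    if hull_pixels == 0:
        raise ValueError("mask contains no hull pixels")
    c_slime = 100.0 * counts[SLIME] / hull_pixels
    c_macro = 100.0 * counts[MACRO] / hull_pixels
    return c_slime, c_macro


# Toy 3x4 mask, invented for illustration: 9 hull pixels, 3 slime, 2 macrofouling.
example_mask = [
    [WATER, WATER, CLEAN, CLEAN],
    [WATER, SLIME, SLIME, MACRO],
    [CLEAN, CLEAN, SLIME, MACRO],
]
c_slime, c_macro = coverage_percentages(example_mask)
print(f"slime: {c_slime:.1f}%  macrofouling: {c_macro:.1f}%")
```

Excluding water from the denominator matters: an image framed mostly on open water would otherwise understate fouling on the hull itself.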

The LoF category $L$ is assigned from coverage values via:

$L = \begin{cases} 0 & \text{if } c_\mathrm{slime}=0 \wedge c_\mathrm{macro}=0 \\ 1 & \text{if } c_\mathrm{slime}>0 \wedge c_\mathrm{macro}=0 \\ 2 & \text{if } 1\%\leq c_\mathrm{macro}\leq 5\% \\ 3 & \text{if } 6\%\leq c_\mathrm{macro}\leq 15\% \\ 4 & \text{if } 16\%\leq c_\mathrm{macro}\leq 40\% \\ 5 & \text{if } c_\mathrm{macro}\geq 41\% \end{cases}$

This mapping underpins all automated LoF assessments in computer vision and machine learning pipelines (Hamilton et al., 28 Jan 2026).
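A minimal Python sketch of this decision rule follows. The stated thresholds are integer ranges, so fractional coverages such as 0.4% or 5.4% fall between cases; the sketch closes those gaps by comparing against upper bounds only, which is an assumption of this example and not something the source specifies. The function name `lof_from_coverage` is likewise hypothetical.

```python
def lof_from_coverage(c_slime, c_macro):
    """Map slime and macrofouling coverage (percent of hull) to a LoF category 0-5."""
    if not (0.0 <= c_slime <= 100.0 and 0.0 <= c_macro <= 100.0):
        raise ValueError("coverage must be a percentage in [0, 100]")
    if c_macro == 0:
        # No macrofouling: slime alone separates LoF 0 from LoF 1.
        return 1 if c_slime > 0 else 0
    # Assumption: only upper bounds are tested, so any nonzero macrofouling
    # is at least LoF 2 and values like 5.4% are not left unassigned.
    if c_macro <= 5:
        return 2
    if c_macro <= 15:
        return 3
    if c_macro <= 40:
        return 4
    return 5


for c_slime, c_macro in [(0, 0), (12, 0), (30, 3), (50, 10), (0, 40), (0, 41)]:
    print(c_slime, c_macro, "->", lof_from_coverage(c_slime, c_macro))
```

Under this gap-closing choice a coverage of 5.4% is assigned LoF 3; rounding to the nearest integer percent before applying the stated ranges would give LoF 2 instead, so the convention should be fixed explicitly in any real pipeline.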

3. Datasets and Label Distributions in Automated LoF Assessment

Automated LoF classification relies on curated, expert-labeled datasets. A documented dataset from the New Zealand Ministry for Primary Industries contains 762 images with the following distribution:

LoF	Image Count
0	7
1	263
2	70
3	113
4	126
5	183

An 80%/20% train/test split was applied across all model evaluations. A significant class imbalance exists, with relatively fewer samples for LoF 0 and 2 (Hamilton et al., 28 Jan 2026). This skew affects model calibration, particularly at intermediate LoF levels.
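To make the imbalance concrete, the short Python sketch below computes each class's share of the 762 images and the approximate number of test images per class under the 80%/20% split. It assumes the split is stratified by class, which the source does not state; the variable names are illustrative.

```python
# Image counts per LoF category, taken from the table above.
LOF_IMAGE_COUNTS = {0: 7, 1: 263, 2: 70, 3: 113, 4: 126, 5: 183}

total = sum(LOF_IMAGE_COUNTS.values())
shares = {lof: count / total for lof, count in LOF_IMAGE_COUNTS.items()}

# Assumption: a stratified split, so each class contributes about 20% of its images.
expected_test = {lof: round(0.2 * count) for lof, count in LOF_IMAGE_COUNTS.items()}

print(f"total images: {total}")
for lof in sorted(LOF_IMAGE_COUNTS):
    print(f"LoF {lof}: {LOF_IMAGE_COUNTS[lof]:4d} images, "
          f"{100 * shares[lof]:5.1f}% of dataset, ~{expected_test[lof]} test images")
```

Under that assumption LoF 0 contributes roughly one test image, so per-class metrics for that category rest on almost no evidence.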

4. Methods for Automated Assessment

Multiple pipelines have been employed for mapping images to LoF scores:

Raw Image Classification: ResNet-18 and ResNet-50 architectures (ImageNet-pretrained) classify raw RGB images using cross-entropy loss. Preprocessing using HSV color channels and Canny edge detection improved accuracy for intermediate LoF classes, with test accuracy rising from 60.22% to 62.72%.
Semantic Segmentation: The SegFormer transformer segmenter outputs pixel-wise labels for Water, Clean, Slime, and Macrofouling. Area proportions are computed, and LoF is inferred via the pixel coverage rule. However, SegFormer tended to output extreme-class predictions (100% slime or macrofouling), leading to over-prediction at LoF 1 and 5 and instability at the intermediate levels.
Evaluation Metrics: Class-wise and overall accuracy, precision, recall, and F1 score were computed as per standard definitions, using true positive, false positive, and false negative counts for each LoF class.
Multimodal LLMs: Multimodal LLMs were evaluated in zero-shot setups, using both baseline and role-framed, structured prompts encoding the LoF definitions and decision tree. Adding official LoF guideline text via retrieval-augmented prompts yielded modest alignment gains (Hamilton et al., 28 Jan 2026).
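The evaluation metrics named above follow the standard one-vs-rest definitions, which can be sketched in plain Python as below. The function names and the toy label lists are invented for illustration and are not results from the source.

```python
def per_class_metrics(y_true, y_pred, classes=range(6)):
    """Precision, recall, and F1 per LoF class from TP/FP/FN counts."""
    metrics = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # Undefined ratios (no predictions or no samples) are reported as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of images whose predicted LoF equals the expert label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels, invented for illustration only.
y_true = [1, 1, 2, 3, 5, 5]
y_pred = [1, 1, 3, 3, 5, 4]
metrics = per_class_metrics(y_true, y_pred)
print("accuracy:", round(accuracy(y_true, y_pred), 3))
print("LoF 3:", metrics[3])
```

Reporting zero for undefined ratios is one common convention; with only seven LoF 0 images in the dataset, such empty cases are likely to occur in practice.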

A summary of classifier and segmentation model performance:

Approach	Accuracy (%)	Notable Characteristics
ResNet-18/50 RGB	60.2	Strong on extremes (LoF 1,5); weaker for LoF 2–4
ResNet-18/50 HSV+	62.7	Improved intermediate LoF separation
SegFormer	Unstable	Over-prediction of extreme categories
LLM (zero-shot)	51.1	Accurate on extremes; over-classification at boundaries

LLM performance was highly prompt-dependent; initial prompts classified only 5.1% of images, while detailed prompts increased coverage to 94.8% (Hamilton et al., 28 Jan 2026).

5. Prompting Strategies and Zero-shot Multimodal LLMs

Vision-enabled LLMs (e.g., GPT-4V via OpenRouter) received input images and contextual system prompts describing the LoF scale and its thresholds. Two templates were examined:

Baseline Prompt: Provided LoF definitions, requested LoF score, justification, and invasive species note.
Final System Prompt: Simulated expert role, specified decision-tree and coverage thresholds, and required outputs for LoF rating, estimated coverage, species, and biosecurity risk.

Injecting official LoF guideline excerpts further grounded the LLM in domain standards. However, this had a limited effect, as detailed prompts were sometimes ignored due to length. Conservative prompt calibration increased LoF 1 precision to 75.5% at a cost to overall accuracy (42.7%). A plausible implication is that heightened cautiousness led to systematic under-classification at higher LoF levels (Hamilton et al., 28 Jan 2026).
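A role-framed prompt of the kind described above might be assembled as in the following Python sketch. The prompt wording, the function names `build_system_prompt` and `parse_lof_rating`, and the reply format are all hypothetical; they are not the prompts or parsing used in the source, and only the category definitions, thresholds, and requested output fields come from the text.

```python
import re

# Category definitions and coverage thresholds as listed in the LoF table.
LOF_LEVELS = [
    (0, "No slime, no macrofouling", "0%"),
    (1, "Slime layer present, no visible macrofouling", "0%"),
    (2, "Sparse macrofouling (patchy or isolated)", "1-5%"),
    (3, "Moderate macrofouling patches", "6-15%"),
    (4, "Extensive macrofouling (majority still clean)", "16-40%"),
    (5, "Heavy macrofouling (very heavy coverage)", "41-100%"),
]


def build_system_prompt(guideline_excerpt=None):
    """Assemble a role-framed system prompt; the wording is illustrative only."""
    lines = [
        "You are an expert marine biofouling assessor.",
        "Rate the hull image on the Level of Fouling (LoF) scale:",
    ]
    lines += [f"LoF {n}: {desc} (macrofouling cover {cover})"
              for n, desc, cover in LOF_LEVELS]
    lines += [
        "Decision tree: if no slime is visible, answer LoF 0;",
        "if slime is visible but no macrofouling, answer LoF 1;",
        "otherwise estimate macrofouling coverage and choose LoF 2-5.",
        "Reply with: LoF rating, Estimated coverage, Species, Biosecurity risk.",
    ]
    if guideline_excerpt:
        # Optional retrieval-augmented grounding text.
        lines += ["Reference guideline excerpt:", guideline_excerpt]
    return "\n".join(lines)


def parse_lof_rating(reply):
    """Extract the first LoF rating (0-5) from a reply, or None if absent."""
    match = re.search(r"LoF\s*(?:rating)?\s*[:=]?\s*([0-5])\b", reply, re.IGNORECASE)
    return int(match.group(1)) if match else None


print(build_system_prompt())
print(parse_lof_rating("LoF rating: 3\nEstimated coverage: 10%"))
```

Returning `None` for replies without a usable rating keeps unclassified images visible in the evaluation instead of silently assigning them a default category.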

6. Hybrid and Integrated Assessment Pipelines

Hybrid strategies exploit complementary strengths of segmentation and LLM approaches. The pipeline involves:

Segmenting the image (SegFormer) to compute $c_\mathrm{slime}$ and $c_\mathrm{macro}$.
Assigning a preliminary LoF via the deterministic decision rule.
Supplying the original image and coverage percentages to a multimodal LLM with a domain-formulated prompt.
Having the LLM refine estimates near class borders and deliver textual justifications, species identification, and a risk rating.

Mathematically, the hybrid LoF estimate is:

$\hat L_\mathrm{hybrid} = \mathrm{LLM}\left(I, [c_\mathrm{slime}, c_\mathrm{macro}], \mathrm{Prompt}\right)$

with $I$ denoting the image input. Such integration allows rigorous coverage-based assignments supplemented by LLM interpretability, with improved transparency at LoF class boundaries (Hamilton et al., 28 Jan 2026).
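The data flow of the hybrid estimate can be sketched in Python with the segmenter, decision rule, and LLM passed in as callables. Everything here is a stand-in: `hybrid_lof` and the `fake_*` functions are hypothetical names, and the stubs return fixed values so the sketch runs without any model or API.

```python
def hybrid_lof(image, prompt, segment, decision_rule, llm):
    """Run segmentation, the deterministic rule, then LLM refinement."""
    c_slime, c_macro = segment(image)              # coverage percentages
    preliminary = decision_rule(c_slime, c_macro)  # rule-based LoF
    refined = llm(image, [c_slime, c_macro], prompt)  # LLM(I, [c_slime, c_macro], Prompt)
    return {
        "c_slime": c_slime,
        "c_macro": c_macro,
        "preliminary_lof": preliminary,
        "hybrid_lof": refined,
    }


# Stand-in components, invented for illustration; none of these are real APIs.
def fake_segment(image):
    return 20.0, 15.5  # pretend coverage just above the LoF 3/4 threshold


def fake_rule(c_slime, c_macro):
    if c_macro == 0:
        return 1 if c_slime > 0 else 0
    return 2 if c_macro <= 5 else 3 if c_macro <= 15 else 4 if c_macro <= 40 else 5


def fake_llm(image, coverages, prompt):
    return 3  # pretend the LLM judged the borderline case as LoF 3


result = hybrid_lof("hull.jpg", "system prompt text", fake_segment, fake_rule, fake_llm)
print(result)
```

Keeping the preliminary and refined ratings side by side makes disagreements at class boundaries easy to audit.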

7. Significance and Limitations

The LoF scale enables reproducible, standardized biofouling severity assessment critical for regulatory, ecological, and operational contexts. Automated systems grounded in the LoF framework face challenges at class boundaries due to image framing, variability in fouling appearance, and dataset imbalance. Computer vision classifiers are robust at extremes but may misclassify intermediate categories; segmentation models offer explainability at the expense of stability; and LLMs provide textual reasoning, but their outputs are sensitive to prompt design.

The convergence of quantitative segmentation with LLM-driven interpretability via hybrid models offers a promising direction for scalable, explainable, and accurate marine biofouling assessment on the LoF scale (Hamilton et al., 28 Jan 2026).

References (1)

Automated Marine Biofouling Assessment: Benchmarking Computer Vision and Multimodal LLMs on the Level of Fouling Scale (2026)
