BLOOM Carbon Footprint: Lifecycle Analysis
- BLOOM is a 176-billion-parameter language model evaluated through a holistic lifecycle carbon assessment that quantifies dynamic (active compute), idle, and embodied emissions.
- The framework integrates direct energy measurements with manufacturing amortization, utilizing the Green Algorithms methodology and findings from Luccioni et al. (2022).
- Comparative benchmarks indicate BLOOM’s 50.5 t CO₂eq footprint is notably lower than that of comparable models, owing to a low-carbon grid and efficient system utilization.
BLOOM, a 176-billion-parameter LLM, was developed within the BigScience workshop to advance open-access large language models. The environmental impact of such large-scale models has received growing attention, particularly the energy and carbon costs of training, deployment, and hardware manufacturing. BLOOM’s carbon footprint is quantified here through a holistic lifecycle assessment, combining direct energy-based emissions from training and inference with amortized embodied emissions from hardware production. The framework and primary results derive from Luccioni et al. (2022), a comprehensive report on BLOOM’s emissions, and from the “Green Algorithms” methodology for generalizable carbon assessment in computational science (Lannelongue et al., 2020).
1. Emission Quantification: High-Level Figures
The total carbon footprint of BLOOM can be decomposed into three principal categories: dynamic (active compute), idle (system overhead), and embodied (manufacturing). For BLOOM’s final training run:
- Dynamic emissions (active GPU power): 24.7 t CO₂eq
- Calculated from 1,082,990 GPU·h × 0.4 kW/GPU = 433,196 kWh,
- applying the French grid carbon intensity of 57 g CO₂eq/kWh.
- Idle emissions (cluster overhead): 14.6 t CO₂eq
- Resulting from an additional 256,646 kWh attributed to idle power overhead on the Jean Zay cluster.
- Embodied emissions (manufacturing amortization): 11.2 t CO₂eq
- Amortized over the total training time, based on manufacturer data for servers and GPUs.
- Full lifecycle total: 50.5 t CO₂eq
Dynamic (active computation) represents 49% of BLOOM’s lifecycle emissions, idle 29%, and embodied 22%. This distribution aligns with established server product LCAs, where manufacturing accounts for 20–30% of total footprint and use-phase for 70–80%.
| Component | Emissions (t CO₂eq) | Fraction (%) |
|---|---|---|
| Dynamic | 24.7 | 49 |
| Idle | 14.6 | 29 |
| Embodied | 11.2 | 22 |
| Total | 50.5 | 100 |
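As a cross-check, the decomposition above can be reproduced from the raw figures in this report (GPU hours and idle energy from this section, the 400 W TDP and 57 g CO₂eq/kWh grid intensity from Section 3); a minimal sketch:

```python
# Reproduce BLOOM's lifecycle-emission decomposition from the raw figures.
GPU_HOURS = 1_082_990            # final training run, GPU-hours
TDP_KW = 0.4                     # Nvidia A100 thermal design power, kW
IDLE_KWH = 256_646               # measured idle overhead on Jean Zay, kWh
CI_FR = 57 / 1_000_000           # French grid intensity, t CO2eq per kWh
EMBODIED_T = 11.2                # amortized manufacturing emissions, t CO2eq

dyn_kwh = GPU_HOURS * TDP_KW     # 433,196 kWh of active GPU energy
c_dyn = dyn_kwh * CI_FR          # dynamic emissions, t CO2eq
c_idle = IDLE_KWH * CI_FR        # idle emissions, t CO2eq
total = c_dyn + c_idle + EMBODIED_T

for name, c in [("dynamic", c_dyn), ("idle", c_idle), ("embodied", EMBODIED_T)]:
    print(f"{name}: {c:.1f} t CO2eq ({100 * c / total:.0f}%)")
print(f"total: {total:.1f} t CO2eq")
```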
2. Methodological Framework and Core Formulae
Lifecycle carbon assessment for BLOOM employs a combination of direct measurements and standard carbon accounting formulas. The foundational relationship is:
C = E × CI

where C = CO₂-equivalent emissions (kg CO₂eq), E = energy consumed (kWh), and CI = carbon intensity of the electricity supply (kg CO₂eq/kWh).
Additional key formulas include:
- GPU dynamic: E_dyn = H_GPU × P_TDP, so C_dyn = E_dyn × CI
  (H_GPU = total GPU hours, P_TDP = GPU thermal design power)
- Idle consumption: C_idle = E_idle × CI, where E_idle is the measured idle-overhead energy (kWh)
- Embodied emissions amortization: C_emb = C_hw × t_train / (T_life × U), with C_hw (hardware manufacturing emissions), T_life (lifetime hours), and U (utilization fraction)
- Total full lifecycle: C_total = C_dyn + C_idle + C_emb
  (the embodied term is added directly, as it is pre-computed in CO₂eq rather than derived from energy)
For broader comparability, the Green Algorithms framework (Lannelongue et al., 2020) generalizes this to:

C = t × (n_c × P_c × u_c + n_m × P_m) × PUE × CI

where t is runtime (h), n_c and P_c the number and power draw of processing cores, u_c the core usage factor, n_m and P_m the memory size (GB) and per-GB power draw, and PUE the datacenter power usage effectiveness.
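A Python sketch of this generalized estimator (parameter names are mine, and the memory power figure and PUE in the example are illustrative assumptions, not the calculator’s reference values):

```python
def green_algorithms_footprint(runtime_h, n_cores, core_power_w, core_usage,
                               mem_gb, mem_power_w_per_gb, pue, ci_g_per_kwh):
    """Footprint in the spirit of Lannelongue et al. (2020):
    energy = runtime x (core draw x usage + memory draw) x PUE,
    carbon = energy x grid carbon intensity.
    Returns (energy_kwh, carbon_kg_co2eq)."""
    draw_kw = (n_cores * core_power_w * core_usage
               + mem_gb * mem_power_w_per_gb) / 1000.0
    energy_kwh = runtime_h * draw_kw * pue
    return energy_kwh, energy_kwh * ci_g_per_kwh / 1000.0

# Illustrative: an assumed 8-GPU A100 node, 512 GB RAM, 24 h on the French grid.
kwh, kg = green_algorithms_footprint(
    runtime_h=24, n_cores=8, core_power_w=400, core_usage=1.0,
    mem_gb=512, mem_power_w_per_gb=0.3725, pue=1.2, ci_g_per_kwh=57)
print(f"{kwh:.1f} kWh, {kg:.2f} kg CO2eq")
```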
3. Measurement Procedures and Parameters
3.1 Hardware and Facility Parameters
- GPU: Nvidia A100 80 GB (TDP = 400 W)
- Cluster: Jean Zay at IDRIS/CNRS (France)
- Grid carbon intensity: 57 g CO₂eq/kWh
Manufacturing emissions are based on proxy data: HPE Apollo/ProLiant server ≈ 2,500 kg CO₂eq; Nvidia A100 GPU ≈ 150 kg CO₂eq (lower bound).
3.2 Energy and Emission Accounting
- Training: 1,082,990 GPU·h tracked and translated into energy/kWh
- Idle overhead: Empirically measured on cluster infrastructure (256,646 kWh, ≈ 59% additional relative to dynamic GPU energy)
- Inference: Tracked via CodeCarbon for real-time API serving; measurement period 18 days using 16× A100s in GCP us-central1
3.3 Grid Selection
- Training uses 57 gCO₂eq/kWh (France),
- Inference in us-central1 uses 394 gCO₂eq/kWh,
- Embodied emissions rely on public LCA and manufacturer sources.
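Combining the manufacturing proxies above with the amortization assumptions listed under limitations (6-year replacement cycle, 85% utilization) closely reproduces the 11.2 t embodied figure; note the 8-GPUs-per-server split below is my assumption for Jean Zay’s A100 nodes, not a figure stated in this report:

```python
# Amortized embodied emissions for BLOOM's training run (sketch).
SERVER_KG = 2_500                 # HPE server manufacturing proxy, kg CO2eq
GPU_KG = 150                      # A100 manufacturing lower bound, kg CO2eq
GPUS_PER_SERVER = 8               # assumed node configuration
LIFETIME_H = 6 * 365 * 24         # 6-year hardware replacement cycle
UTILIZATION = 0.85                # assumed cluster utilization fraction
GPU_HOURS = 1_082_990             # final training run

per_gpu_kg = SERVER_KG / GPUS_PER_SERVER + GPU_KG   # embodied kg per GPU
kg_per_gpu_hour = per_gpu_kg / (LIFETIME_H * UTILIZATION)
embodied_t = kg_per_gpu_hour * GPU_HOURS / 1000.0
print(f"embodied: {embodied_t:.1f} t CO2eq")
```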
4. Inference-API Emissions
Inference-related emissions of BLOOM were quantified by monitoring power usage over an 18-day window:
- Compute configuration: 16× A100 40 GB (GCP, us-central1 region)
- Total requests: 230,768 (≈ 534 req/h)
- Total energy: 914 kWh (GPU: 75.3%, RAM: 22.7%, CPU: 2.0%)
- Per-query energy: ≈ 3.96 × 10⁻³ kWh/query (3.96 Wh/query)
- Per-query emissions: ≈ 1.56 g CO₂eq/query
- Per-day inference emissions: ≈ 20 kg CO₂eq/day
Keeping the model resident in memory regardless of request load incurs a baseline draw of ≈ 1.7 kWh per hour.
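The per-query and per-day figures follow directly from the measured totals; a quick check:

```python
# Derive per-query and per-day inference figures from the 18-day measurement.
TOTAL_KWH = 914.0                 # total energy over the window
REQUESTS = 230_768                # API requests served
DAYS = 18                         # measurement window
CI_USC1 = 394.0                   # GCP us-central1 grid, g CO2eq per kWh

wh_per_query = TOTAL_KWH * 1000 / REQUESTS        # Wh per request
g_per_query = wh_per_query / 1000 * CI_USC1       # g CO2eq per request
kg_per_day = TOTAL_KWH / DAYS * CI_USC1 / 1000    # kg CO2eq per day

print(f"{wh_per_query:.2f} Wh/query, {g_per_query:.2f} g CO2eq/query, "
      f"{kg_per_day:.1f} kg CO2eq/day")
```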
5. Uncertainty, Scope, and Limitations
Several systematic uncertainties affect carbon accounting:
- Grid carbon intensity varies by time/location; results use annual or regional averages.
- GPU power draw assessed by TDP; real-time consumption may differ.
- CPU/memory/network overheads: approximations used (e.g., CPU ≈ 1/40 of GPU), some datacenter processes only partially included.
- Embodied emissions for GPUs lack definitive public LCA; figures utilize documented lower bounds.
- Lifetime/amortization parameters: estimates assume 6-year hardware replacement, 85% utilization.
- Scope exclusions: upstream supply-chain logistics, full datacenter PUE accounting, end-of-life impacts, and secondary infrastructure impacts are not included in the reported total.
- Inference sampling: Measurements from one cloud/provider instance; results may not generalize to alternate platforms.
6. Comparative Benchmarks
BLOOM’s carbon footprint is substantially lower than that of U.S.-trained models of similar scale due to the low-carbon French grid. Comparative dynamic-only emissions (training phase):
| Model | Params | E_train (MWh) | CI (g CO₂eq/kWh) | C_dyn (t CO₂eq) |
|---|---|---|---|---|
| GPT-3 | 175 B | 1287 | 429 | 502 |
| Gopher | 280 B | 1066 | 330 | 352 |
| OPT-175B | 175 B | 324 | 231 | 75 |
| BLOOM | 176 B | 433 | 57 | 25 |
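Most of the table’s last column is simply E_train × CI; a quick check (the tabulated GPT-3 figure, 502 t, is below the simple product 1,287 MWh × 429 g/kWh ≈ 552 t, so that row is excluded here):

```python
# Dynamic training emissions as E_train x CI (MWh x g/kWh = kg CO2eq).
models = {                        # name: (E_train in MWh, CI in g CO2eq/kWh)
    "Gopher": (1066, 330),
    "OPT-175B": (324, 231),
    "BLOOM": (433, 57),
}
c_dyn_t = {name: mwh * ci / 1000 for name, (mwh, ci) in models.items()}
for name, tonnes in c_dyn_t.items():
    print(f"{name}: {tonnes:.0f} t CO2eq")
```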
BLOOM’s dynamic training emissions (24.7 t CO₂eq) are roughly a third of OPT’s (≈ 75 t) and less than 5% of GPT-3’s (502 t CO₂eq), primarily due to lower grid carbon intensity and efficient system utilization; its full lifecycle footprint is 50.5 t CO₂eq. The BigScience workshop’s aggregate training and evaluation emissions reached 66.3 t CO₂eq, with the final BLOOM training run accounting for 24.7 t CO₂eq (≈ 37%).
7. Strategies for Emission Reduction and Reporting
Adherence to best practices for carbon footprint minimization is integral for large-scale ML projects:
- Algorithm/hardware efficiency: Employing mixed-precision training (FP16/TF32) and profiling jobs rigorously so they request only the memory they need.
- Experiment management: Reducing hyperparameter exploration and debugging at full scale; applying a pragmatic scaling factor (as in Green Algorithms) to account for repeated runs.
- Facility selection: Prioritizing low-PUE datacenters and regions with low-carbon grids.
- Transparent reporting: Providing complete disclosure of all lifecycle parameters as per Green Algorithms recommendations (Lannelongue et al., 2020).
- Emissions offsetting: Sourcing accredited offsets for any residual CO₂eq.
The methodologies established in Luccioni et al. (2022) and Lannelongue et al. (2020) provide templates for standardized emissions reporting in computational research, facilitating ongoing reduction and benchmarking of the environmental impacts of large-scale LLM development.