Methodology

How We Calculate AI Carbon Footprints

Transparent, research-backed methodology prioritizing independent data over unverified provider claims.

Last Updated: December 2025

⚠️ The Data Problem We're Solving

AI companies don't disclose their actual energy consumption. The numbers you see from providers are unverified marketing claims.

| Source | Simple Query | Reasoning Query | Credibility |
| --- | --- | --- | --- |
| Google (self-reported) | 0.24 Wh | Not disclosed | ⚠️ Unverified |
| OpenAI (Sam Altman) | 0.34 Wh | Not disclosed | ⚠️ No methodology |
| EPRI (independent) | 2.9 Wh | — | ✅ Third-party |
| Hugging Face (Dec 2025) | ~50 Wh | 5,000-10,000 Wh | ✅ Peer-reviewed |

Provider-reported figures run 8-500x lower than independent measurements. We prioritize independent sources.

Our Approach: Independent Research First

We build our calculator on data we can verify, not marketing claims we can't.

🥇 Primary Sources

Independent Research

  • Hugging Face AI Energy Score (December 2025) — 40 models tested with CodeCarbon real-time tracking. Found reasoning models use 30-500x more energy.
  • Electric Power Research Institute (EPRI) — 2.9 Wh per ChatGPT query, 8.5x higher than OpenAI claims.
  • "How Hungry is AI?" (May 2025) — Academic benchmark (arxiv.org/abs/2505.09598) with standardized methodology.
🥈 Secondary Sources

Provider Data (With Caveats)

  • Google (Aug 2025): 0.24 Wh — Unverified, uses "market-based" accounting
  • OpenAI (Jun 2025): 0.34 Wh — No methodology provided, no model specified
  • Mistral LCA Report: 0.3-0.5 Wh — More transparent, limited scope

Critical Findings (December 2025)

🔴 Reasoning Models: The Hidden Energy Cost

The Hugging Face AI Energy Score project (Sasha Luccioni & Boris Gamazaychikov) found that enabling "deep thinking" or reasoning modes causes energy consumption to increase dramatically:

| Model | Reasoning OFF | Reasoning ON | Multiplier |
| --- | --- | --- | --- |
| DeepSeek R1 | 50 Wh | 7,626 Wh | 152x |
| Microsoft Phi 4 | 18 Wh | 9,462 Wh | 525x |
| OpenAI GPT (high) | — | 8,504 Wh | — |

What this means: A single complex reasoning query can consume more energy than running your refrigerator for a day. Our calculator reflects these real-world findings.
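As a sanity check, the multipliers in the table follow directly from the ON/OFF figures. A minimal sketch, using only the numbers quoted in this document (the dictionary name and structure are illustrative):

```python
# Reproduce the reasoning-mode multipliers from the Hugging Face
# AI Energy Score figures quoted above: (reasoning OFF Wh, reasoning ON Wh).
measurements = {
    "DeepSeek R1": (50, 7626),
    "Microsoft Phi 4": (18, 9462),
}

for model, (off_wh, on_wh) in measurements.items():
    # Integer division matches the truncated multipliers shown in the table.
    print(f"{model}: {on_wh // off_wh}x")  # 152x, 525x
```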

🌱 Eco-Efficiency Rankings

Not all AI models are equal. The "How Hungry is AI?" benchmark uses Data Envelopment Analysis (DEA) to score efficiency:

| Rank | Model | DEA Score | Notes |
| --- | --- | --- | --- |
| 1st | Claude-3.7 Sonnet | 0.886 | Best reasoning/efficiency balance |
| 2nd | o4-mini (high) | 0.867 | Good for reasoning tasks |
| 3rd | o3-mini | 0.840 | Efficient reasoning |
| Last | DeepSeek-R1 | 0.058 | High capability, poor efficiency |

Our Calculation Formula

CO₂e (grams) = Energy (Wh) × Grid Intensity (g/kWh) × PUE / 1000

Energy (Wh)

Model-specific consumption based on independent benchmarks. Varies by query length, model type, and reasoning mode.

Grid Intensity (g/kWh)

Regional carbon intensity of electricity. US average: 386g. California: 210g. France: 50g.

PUE (Power Usage Effectiveness)

Data center overhead. Google: 1.09. Azure: 1.18. Enterprise average: 1.50.
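The formula maps directly to code. A minimal sketch using the grid-intensity and PUE figures listed in this section (the function name is illustrative):

```python
def co2e_grams(energy_wh: float, grid_intensity_g_per_kwh: float, pue: float) -> float:
    """CO2e (grams) = Energy (Wh) x Grid Intensity (g/kWh) x PUE / 1000.

    Dividing by 1000 converts Wh to kWh so the units cancel to grams.
    """
    return energy_wh * grid_intensity_g_per_kwh * pue / 1000

# A 1.0 Wh simple query on the US average grid (386 g/kWh) at Google's PUE (1.09):
print(co2e_grams(1.0, 386, 1.09))  # ~0.42 g CO2e

# The same query on the French grid (50 g/kWh) emits roughly 8x less:
print(co2e_grams(1.0, 50, 1.09))   # ~0.05 g CO2e
```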

Calculator Values We Use

We use mid-range conservative estimates with wide uncertainty bands to be honest about what we don't know.

| Query Type | Low Estimate | High Estimate | Default Used |
| --- | --- | --- | --- |
| Simple text (GPT-4o, Claude) | 0.25 Wh | 3.0 Wh | 1.0 Wh |
| Complex/long query | 1.0 Wh | 10.0 Wh | 5.0 Wh |
| Reasoning (o1, o3, DeepSeek-R1) | 20 Wh | 10,000 Wh | 500 Wh |
| Image generation | 2.0 Wh | 5.0 Wh | 3.0 Wh |

Why wide ranges? No standardized measurement methodology exists. Provider claims lack verification. We'd rather be honest about uncertainty than falsely precise.
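The defaults and bands can be encoded as a simple lookup. A sketch under stated assumptions: the names are illustrative, and the 40% used here is just the midpoint of the ±30-50% uncertainty this methodology applies:

```python
# Per-query energy estimates in Wh: (low, default, high), from the table above.
ESTIMATES_WH = {
    "simple_text": (0.25, 1.0, 3.0),
    "complex": (1.0, 5.0, 10.0),
    "reasoning": (20.0, 500.0, 10000.0),
    "image": (2.0, 3.0, 5.0),
}

def energy_range(query_type, uncertainty=0.40):
    """Widen the default estimate by +/-uncertainty, returning (low, high) in Wh."""
    _, default, _ = ESTIMATES_WH[query_type]
    return default * (1 - uncertainty), default * (1 + uncertainty)

print(energy_range("simple_text"))  # (0.6, 1.4)
```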

What We Don't Know (Honest Uncertainty)

  • Exact inference costs: Providers don't publish actual per-query energy consumption
  • Hardware variations: A100 vs H100 vs custom TPUs have different efficiencies
  • Batch utilization: How efficiently queries are batched affects real energy use
  • Training vs inference split: We focus on inference (what users control)
  • Water usage: Data centers use 0.5-1.0 liters per 1000 tokens for cooling

We apply ±30-50% uncertainty to all estimates. This isn't a weakness—it's honest science.

Update Schedule

We re-verify our data quarterly and update whenever significant new research is published.

  • December 2025: Added Hugging Face AI Energy Score findings, updated reasoning model multipliers
  • August 2025: Added Google methodology paper data (with caveats)
  • May 2025: Integrated "How Hungry is AI?" academic benchmark

Next review: March 2026 (or sooner if major research is published)

Full Source List

Independent Research (Higher Credibility)

  1. Hugging Face AI Energy Score Project (December 2025) — Sasha Luccioni & Boris Gamazaychikov. 40 models tested. fortune.com/2025/12/05/ai-reasoning-energy-problem
  2. Jegham et al. "How Hungry is AI?" (May 2025) — Academic benchmark with DEA efficiency scoring. arxiv.org/abs/2505.09598
  3. Electric Power Research Institute — 2.9 Wh per ChatGPT query
  4. Epoch AI (February 2025) — Independent nonprofit, 0.3-0.4 Wh baseline estimate
  5. University of Rhode Island — GPT-5 complex query estimates: 18-40 Wh

Provider-Reported (Use With Caution)

  1. Google (August 2025) — 0.24 Wh/query (unverified, market-based accounting)
  2. OpenAI/Sam Altman (June 2025) — 0.34 Wh/query (NO methodology provided)

Critical Analysis

  1. Bloomberg "How Tech Companies Are Obscuring AI's Real Carbon Footprint" (2024)
  2. Policy Review "Big Tech's 2025 Sustainability Reports" — "obfuscation-by-complexity"
  3. MIT Technology Review "AI Energy Footprint" (2025)

Ready to Offset Your AI Footprint?

Now that you understand the real impact, take action.