Open Data Schema for Energy
Data Quality

Data Completeness Is a First-Class Signal (Not a Reporting Afterthought)

Missingness and timestamp integrity are operational signals that directly change decisions, not cleanup work for month-end reporting.

Suppose you run a quarterly energy report and two sites show identical generation totals. Both look healthy. But one site has 98% interval completeness while the other has chronic gaps that the quarterly total hides. Without explicit completeness metrics, those two sites look equally trustworthy, and any decision built on that assumption rests on evidence you never checked.

Completeness should be managed as a first-class signal in the same operating layer as transform success and validation health. Here's how to make that happen.

Completeness Is About Decision Reliability

The same problem scales up. Two portfolios can report identical energy totals yet differ sharply in reliability if one of them has chronic missing intervals. Without explicit completeness metrics, you treat both as equally trustworthy even when they are not.

Contract-First Precondition

Completeness scoring is only meaningful after transform and validation. If records are malformed or semantically implausible, interval counts alone become misleading, because a record that fails validation still occupies an interval slot and inflates the observed count.

raw payload -> ODSE transform -> schema validation -> semantic validation -> completeness scoring -> analytics
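To illustrate why the ordering matters, here is a minimal Python sketch that counts intervals before and after validation. The field names (`timestamp`, `kwh`), both check functions, and the `max_kwh` bound are hypothetical stand-ins for illustration, not the ODSE validators themselves.

```python
def is_schema_valid(record: dict) -> bool:
    """Hypothetical schema check: required fields exist with the expected types."""
    return isinstance(record.get("timestamp"), str) and isinstance(
        record.get("kwh"), (int, float)
    )


def is_semantically_plausible(record: dict, max_kwh: float) -> bool:
    """Hypothetical semantic check: interval energy lies within physical bounds."""
    return 0 <= record["kwh"] <= max_kwh


def count_valid_intervals(records: list[dict], max_kwh: float) -> int:
    """Count distinct interval timestamps among records that pass both checks."""
    valid = [
        r for r in records
        if is_schema_valid(r) and is_semantically_plausible(r, max_kwh)
    ]
    return len({r["timestamp"] for r in valid})


records = [
    {"timestamp": "2024-03-01T00:00:00Z", "kwh": 1.2},
    {"timestamp": "2024-03-01T00:15:00Z"},               # malformed: no kwh
    {"timestamp": "2024-03-01T00:30:00Z", "kwh": -4.0},  # implausible: negative
    {"timestamp": "2024-03-01T00:45:00Z", "kwh": 1.1},
]
raw_count = len(records)                                  # 4 records arrived
valid_count = count_valid_intervals(records, max_kwh=50)  # only 2 are usable
```

Scoring completeness on the raw count would report four of four intervals present, while only two are fit for analytics.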

What to Measure

Expected vs Observed Intervals

For each site and day, compute expected interval count from your configured cadence, then compare with observed valid records.
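One way to sketch this comparison in Python, assuming timezone-aware UTC timestamps, a cadence that divides a 24-hour day evenly, and timestamps that have already passed validation. The function names are illustrative and not part of any ODSE API.

```python
from datetime import date, datetime, timedelta, timezone


def expected_interval_count(cadence_minutes: int) -> int:
    """Number of intervals in one 24-hour UTC day at the given cadence."""
    if cadence_minutes <= 0 or (24 * 60) % cadence_minutes != 0:
        raise ValueError("cadence must be positive and divide a day evenly")
    return (24 * 60) // cadence_minutes


def daily_completeness(valid_timestamps, day: date, cadence_minutes: int) -> float:
    """Share of expected interval slots on `day` that hold a valid record."""
    expected = expected_interval_count(cadence_minutes)
    day_start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    step = timedelta(minutes=cadence_minutes)
    observed_slots = set()
    for ts in valid_timestamps:
        offset = ts - day_start
        if timedelta(0) <= offset < timedelta(days=1):
            # Duplicate records for the same slot collapse into one entry.
            observed_slots.add(offset // step)
    return len(observed_slots) / expected


# Usage: a 15-minute cadence with a 3-hour outage (12 missing slots of 96).
start = datetime(2024, 3, 1, tzinfo=timezone.utc)
stamps = [start + timedelta(minutes=15 * i) for i in range(96) if not 40 <= i < 52]
print(daily_completeness(stamps, date(2024, 3, 1), 15))  # 0.875
```

Counting distinct slots rather than raw records keeps duplicate deliveries from inflating the score.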

Gap Window Distribution

Track not only percentage missing but also contiguous gap windows. A single 3-hour outage carries different risk than scattered 5-minute gaps.
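The sketch below shows one possible way to extract contiguous gap windows, assuming timestamps are aligned to the cadence and the evaluation window is given explicitly. The names `gap_windows` and `summarize_gaps` are hypothetical helpers for illustration.

```python
from datetime import datetime, timedelta, timezone


def gap_windows(valid_timestamps, window_start, window_end, cadence_minutes):
    """Return (gap_start, missing_slot_count) for each contiguous run of
    missing slots in the half-open window [window_start, window_end)."""
    step = timedelta(minutes=cadence_minutes)
    observed = set(valid_timestamps)
    gaps = []
    run_start, run_length = None, 0
    slot = window_start
    while slot < window_end:
        if slot in observed:
            if run_length:
                gaps.append((run_start, run_length))
            run_start, run_length = None, 0
        else:
            if run_length == 0:
                run_start = slot
            run_length += 1
        slot += step
    if run_length:  # a gap that runs to the end of the window
        gaps.append((run_start, run_length))
    return gaps


def summarize_gaps(gaps, cadence_minutes):
    """Reduce gap windows to a few distribution figures."""
    durations = [count * cadence_minutes for _, count in gaps]
    return {
        "gap_count": len(durations),
        "longest_gap_minutes": max(durations, default=0),
        "total_missing_minutes": sum(durations),
    }


# Two days with the same 180 missing minutes but very different shapes.
start = datetime(2024, 3, 1, tzinfo=timezone.utc)
end = start + timedelta(days=1)
slots = [start + timedelta(minutes=5 * i) for i in range(288)]
outage = [s for i, s in enumerate(slots) if not 100 <= i < 136]  # one 3-hour gap
scattered = [s for i, s in enumerate(slots) if i % 8 != 0]       # 36 single gaps

outage_summary = summarize_gaps(gap_windows(outage, start, end, 5), 5)
scattered_summary = summarize_gaps(gap_windows(scattered, start, end, 5), 5)
```

Both days score the same percentage missing, yet `outage_summary` reports a single 180-minute gap while `scattered_summary` reports 36 gaps of 5 minutes each.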

Timeliness Lag

Measure delayed arrivals separately from permanent loss. Late data impacts your operational decisions even if backfills eventually restore totals.
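As a rough sketch, lateness and loss can be separated if each record carries both its interval timestamp and the time it arrived. The `max_lag` tolerance and the `arrival_times` mapping are assumptions for illustration, and "missing" here only means "not yet arrived at evaluation time".

```python
from datetime import datetime, timedelta, timezone


def classify_arrivals(expected_slots, arrival_times, max_lag):
    """Classify each expected slot as on_time, late, or missing.

    arrival_times maps an interval timestamp to the time its record arrived.
    Returns the per-class counts and the observed lags for arrived records.
    """
    counts = {"on_time": 0, "late": 0, "missing": 0}
    lags = []
    for slot in expected_slots:
        arrived = arrival_times.get(slot)
        if arrived is None:
            counts["missing"] += 1
            continue
        lag = arrived - slot
        lags.append(lag)
        counts["on_time" if lag <= max_lag else "late"] += 1
    return counts, lags


# Usage: four hourly slots, one delivered three hours late, one never delivered.
t0 = datetime(2024, 3, 1, tzinfo=timezone.utc)
expected = [t0 + timedelta(hours=h) for h in range(4)]
arrivals = {
    expected[0]: expected[0] + timedelta(minutes=2),
    expected[1]: expected[1] + timedelta(hours=3),     # late backfill
    expected[3]: expected[3] + timedelta(minutes=10),  # expected[2] never arrived
}
counts, lags = classify_arrivals(expected, arrivals, max_lag=timedelta(minutes=15))
# counts == {"on_time": 2, "late": 1, "missing": 1}
```

Keeping the late count apart from the missing count lets a backfilled day still show that decisions made in the meantime ran on incomplete data.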

Practical Controls

Suggested Dashboard Panel

A dedicated completeness panel separates source-quality incidents from asset-performance incidents and speeds targeted remediation.

Example Gate Logic

In this example, suppose your low-confidence threshold is 98% and your export-blocking threshold is 95%:

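# completeness_score is a fraction between 0 and 1.
# Below 0.98: keep the output but flag it as low confidence.
if completeness_score < 0.98:
    mark_output_as_low_confidence()

# Below 0.95: also block the compliance export (both checks fire).
if completeness_score < 0.95:
    block_compliance_export()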

Common Mistakes

Operating rule: A metric without completeness context is a confidence claim without evidence. Publish both together.

Adoption Sequence

Completeness is not a data team hygiene task. It is a core part of your operational truth and should be treated with the same rigor as any primary KPI.

