Normalize

Turn messy files into clean data.

CSV, Excel, and JSON files rarely arrive clean. Normalize detects their structure, lets you confirm it, and produces a typed, validated dataset your pipeline can trust.

Upload

Bring a CSV, Excel, or JSON file into Normalize.

Confirm structure

Normalize samples the file and proposes how each column should be interpreted: types, date formats, null tokens, numeric separators. You confirm before any transformation runs.
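
To make the proposal step concrete, here is a minimal Python sketch of how a column type might be guessed from a sample. It is an illustration only, not Normalize's engine: `propose_column`, the null-token list, and the candidate date formats are all assumptions made for this example.

```python
import re
from datetime import datetime

# Hypothetical defaults for this sketch; the real tokens and formats may differ.
NULL_TOKENS = {"", "na", "n/a", "nan", "null", "none", "-"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y")

INTEGER_RE = re.compile(r"[+-]?\d+")
DECIMAL_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)")


def _matching_date_format(values):
    """Return the first candidate format that parses every value, else None."""
    for fmt in DATE_FORMATS:
        try:
            for value in values:
                datetime.strptime(value, fmt)
            return fmt
        except ValueError:
            continue
    return None


def propose_column(sample):
    """Propose a type for one column from a sample of raw strings."""
    values = [v.strip() for v in sample if v.strip().lower() not in NULL_TOKENS]
    nulls = len(sample) - len(values)
    if not values:
        return {"type": "string", "nulls": nulls}
    if all(INTEGER_RE.fullmatch(v) for v in values):
        return {"type": "integer", "nulls": nulls}
    if all(DECIMAL_RE.fullmatch(v) for v in values):
        return {"type": "decimal", "nulls": nulls}
    fmt = _matching_date_format(values)
    if fmt:
        return {"type": "date", "format": fmt, "nulls": nulls}
    return {"type": "string", "nulls": nulls}


proposal = propose_column(["2024-01-31", "2024-02-01", "N/A"])
# proposal == {"type": "date", "format": "%Y-%m-%d", "nulls": 1}
```

A proposal like this is only a guess. A value such as 01/02/2024 parses under both day-first and month-first formats, which is why the confirmation step matters.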

Output settings

Set your output format and normalization rules. Normalize applies them consistently across every row, resolving nulls, standardizing types, and recording every parse issue with its row and column.
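
The idea of applying rules to every row while recording each parse issue can be sketched as follows. This is a simplified illustration under assumed names (`ParseIssue`, `normalize_rows`, and the null tokens are hypothetical), not the actual Normalize pipeline.

```python
from dataclasses import dataclass

# Assumed null tokens for this sketch.
NULL_TOKENS = {"", "na", "n/a", "null"}


@dataclass
class ParseIssue:
    row: int      # 1-based data row number
    column: str
    value: str    # the raw text that failed to parse
    reason: str


def normalize_rows(rows, converters):
    """Apply one converter per column to every row and collect failures."""
    clean, issues = [], []
    for row_number, row in enumerate(rows, start=1):
        out = {}
        for column, convert in converters.items():
            raw = (row.get(column) or "").strip()
            if raw.lower() in NULL_TOKENS:
                out[column] = None  # null tokens resolve to a real null
                continue
            try:
                out[column] = convert(raw)
            except ValueError as exc:
                out[column] = None  # keep the row, record the problem
                issues.append(ParseIssue(row_number, column, raw, str(exc)))
        clean.append(out)
    return clean, issues


rows = [{"id": "1", "price": "9.50"}, {"id": "x", "price": "N/A"}]
clean, issues = normalize_rows(rows, {"id": int, "price": float})
# clean  == [{"id": 1, "price": 9.5}, {"id": None, "price": None}]
# issues has one entry: row 2, column "id", value "x"
```

In this sketch a failed value becomes a null and the run continues, so one bad cell never discards a whole row. Whether Normalize makes the same choice is not stated here.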

Download

Download the normalized dataset as CSV, Excel, JSON, or Parquet. Every output includes a quality report, a trace artifact, and a deterministic fingerprint.
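
One common way to build a deterministic fingerprint is to hash a canonical serialization of the output. The sketch below uses SHA-256 over sorted-key JSON. This is an assumption about the general technique, not a description of how Normalize computes its fingerprint, and `fingerprint` is a hypothetical name.

```python
import hashlib
import json


def fingerprint(rows):
    """SHA-256 over a canonical JSON encoding of each row, in row order."""
    digest = hashlib.sha256()
    for row in rows:
        # Sorted keys and fixed separators make the encoding independent
        # of dict insertion order and whitespace choices.
        line = json.dumps(row, sort_keys=True, separators=(",", ":"),
                          ensure_ascii=False)
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


a = fingerprint([{"id": 1, "name": "Ada"}])
b = fingerprint([{"name": "Ada", "id": 1}])
# a == b: key order does not affect the result
```

Under this scheme the same normalized rows always hash to the same value, so two runs can be compared without diffing the files.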

Scale

Normalize processes datasets of up to 10 million rows. No batching, no scripts, no splitting files.

Type support

Every column gets a confirmed semantic type before conversion runs. Supported types: strings, booleans, dates, times, integers, decimals, currencies, percentages, fractions, signed values, and accounting notation.
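
As an illustration of what converting some of these types involves, the sketch below parses signed values, accounting notation, currencies, and percentages into Python's `Decimal`. It covers only part of the list above. The function name `parse_amount`, its parameters, and the currency symbol set are assumptions for this example, not Normalize's API.

```python
from decimal import Decimal, InvalidOperation

CURRENCY_SYMBOLS = "$€£¥"  # assumed set for this sketch


def parse_amount(text, thousands=",", decimal_point="."):
    """Parse signed, accounting, currency, or percentage text into a Decimal."""
    s = text.strip()
    negative = False
    if s.startswith("(") and s.endswith(")"):  # accounting notation: (1,234.50)
        negative = True
        s = s[1:-1].strip()
    s = s.lstrip(CURRENCY_SYMBOLS + " ")
    if s.startswith(("+", "-")):
        negative = negative or s[0] == "-"
        s = s[1:]
    s = s.lstrip(CURRENCY_SYMBOLS + " ")
    is_percent = s.endswith("%")
    if is_percent:
        s = s[:-1].strip()
    # Drop the thousands separator, then standardize the decimal point.
    s = s.replace(thousands, "").replace(decimal_point, ".")
    try:
        value = Decimal(s)
    except InvalidOperation:
        raise ValueError(f"cannot parse amount: {text!r}") from None
    if is_percent:
        value = value / Decimal(100)  # 12.5% becomes 0.125
    return -value if negative else value


parse_amount("(1,234.50)")   # Decimal("-1234.50")
parse_amount("12.5%")        # Decimal("0.125")
parse_amount("€1.234,50", thousands=".", decimal_point=",")  # Decimal("1234.50")
```

The separators are explicit parameters because 1.234,50 and 1,234.50 cannot be told apart reliably from a single value. That ambiguity is one reason to confirm numeric separators per column before conversion.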

Output artifacts

Every run produces a normalized dataset, a full trace artifact, and a JSON manifest. Together they are enough to audit, replay, or verify any output.

Normalize is open source. The engine, normalization rules, and pipeline are available on GitHub.

The hosted interface is in early access.

Leave your email and we'll let you know when it opens, or explore the source on GitHub now.