JSON Schema Tutorial: Validate JSON Like a Pro (2026 Guide)
Syntax checking vs structural checking
JSON validation answers a single question: is this document parseable JSON? That's necessary but rarely enough. A document can be syntactically perfect — every brace closed, every quote matched — and still be wrong for your application. The required email field might be missing. The age might be a string when your code expects a number. An unknown top-level key might indicate a stale client.
JSON Schema fills that gap. It's a standardised vocabulary for describing the *structure* of valid JSON: required fields, allowed types, value ranges, string patterns, array constraints, conditional dependencies. Once you have a schema, a validator can tell you not just "this parses" but "this has the right shape" — with a useful error message when it doesn't.
Think of it as a contract between systems. The schema is the rulebook; your JSON is the document being checked against it. Cross every boundary with a validation step and you eliminate an entire class of bugs.
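To make the distinction concrete, here is a minimal sketch using only Python's standard library. The sample payload and the hand-written `shape_problems` helper are illustrative assumptions, not part of any library; a schema validator replaces that helper with declarative rules.

```python
import json

# Hypothetical payload: well-formed JSON, but wrong for an application
# that expects a required "email" and a numeric "age".
raw = '{"name": "Ada", "age": "36"}'

def shape_problems(doc):
    """Hand-rolled structural checks (the kind of rule a schema declares)."""
    problems = []
    if "email" not in doc:
        problems.append("missing required field: email")
    age = doc.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if age is not None and (isinstance(age, bool) or not isinstance(age, (int, float))):
        problems.append("age must be a number")
    return problems

doc = json.loads(raw)           # syntax check: succeeds
problems = shape_problems(doc)  # structural check: finds two problems
print(problems)
```

Parsing succeeds, yet the structural check still reports a missing field and a wrongly typed one.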
A minimal example
Here's a schema describing a simple user record:
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "User",
  "type": "object",
  "properties": {
    "name": { "type": "string", "minLength": 1 },
    "age": { "type": "integer", "minimum": 0 },
    "email": { "type": "string", "format": "email" }
  },
  "required": ["name", "email"]
}
In plain English: every document must be an object. name must be a non-empty string, age (optional) must be a non-negative integer, and email must be a string in email format. name and email are required. Because additionalProperties is not set, extra fields are allowed.
Now valid documents:
{ "name": "Ada", "email": "ada@example.com" }
{ "name": "Ada", "email": "ada@example.com", "age": 200 }
Now invalid documents:
{ "name": "Ada" } // missing email
{ "name": "", "email": "ada@example.com" } // empty name
{ "name": "Ada", "email": "not-an-email" } // bad email format
{ "name": "Ada", "email": "x@y", "age": -1 } // negative age
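A validator that enforces format catches all four. The missing email, the empty name and the negative age fail under any validator; the bad email is rejected only when format checking is switched on, because draft 2020-12 treats format as an annotation by default.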
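To show how those keywords translate into checks, here is a toy sketch in plain Python. `user_errors` is a hypothetical helper written for this guide, not a real validator: it mirrors only the keywords in the User schema, skips `format`, and ignores corner cases such as 1.0 counting as an integer.

```python
def user_errors(doc):
    """Toy checker mirroring the User schema's keywords; not a real validator."""
    if not isinstance(doc, dict):            # "type": "object"
        return ["document must be an object"]
    errors = []
    for field in ("name", "email"):          # "required": ["name", "email"]
        if field not in doc:
            errors.append(f"{field} is required")
    if "name" in doc:
        name = doc["name"]
        if not isinstance(name, str):        # "type": "string"
            errors.append("name must be a string")
        elif len(name) < 1:                  # "minLength": 1
            errors.append("name must not be empty")
    if "email" in doc and not isinstance(doc["email"], str):
        errors.append("email must be a string")
    # "format": "email" is deliberately not checked in this sketch.
    if "age" in doc:
        age = doc["age"]
        # "type": "integer" (bool is an int subclass in Python, so exclude it)
        if isinstance(age, bool) or not isinstance(age, int):
            errors.append("age must be an integer")
        elif age < 0:                        # "minimum": 0
            errors.append("age must be >= 0")
    return errors

print(user_errors({"name": "Ada"}))                             # ['email is required']
print(user_errors({"name": "Ada", "email": "x@y", "age": -1}))  # ['age must be >= 0']
```

A real validator does the same work generically, driven by the schema instead of hard-coded field names.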
The core keywords you'll use every day
There are dozens of keywords in JSON Schema, but ten cover most real-world use:
- type: the basic data type: string, number, integer, boolean, array, object, null. Can also be an array of types for "one of these": ["string", "null"].
- required: an array of property names that must be present.
- properties: an object whose keys are field names and whose values are sub-schemas for those fields.
- additionalProperties: set to false to reject unknown keys, true (default) to allow them, or a schema to constrain unknown keys' values.
- enum: restrict a value to one of a fixed list: { "enum": ["draft", "published", "archived"] }.
- const: like enum with one element: { "const": "v1" }.
- pattern: a regular expression that strings must match. The specification recommends the ECMA-262 (JavaScript) dialect, but many validators use their host language's regex engine, so stay within the common subset.
- minLength / maxLength / minimum / maximum / minItems / maxItems: value range constraints.
- items: schema for elements of an array.
- $ref: reference another schema, allowing reuse.
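One detail of `pattern` that often surprises people: the specification does not anchor the regex implicitly, so a pattern matches if it is found anywhere in the string. The sketch below imitates that with Python's `re.search`. The SKU-style pattern is a made-up example, and Python's regex dialect only approximates ECMA-262 for simple expressions like this one.

```python
import re

def matches_pattern(pattern, value):
    """Mimic JSON Schema `pattern`: a match anywhere in the string counts."""
    return re.search(pattern, value) is not None

loose = "[A-Z]{3}-[0-9]{4}"      # unanchored: accepts surrounding junk
strict = "^[A-Z]{3}-[0-9]{4}$"   # anchored: the whole string must match

print(matches_pattern(loose, "xxABC-1234yy"))   # True
print(matches_pattern(strict, "xxABC-1234yy"))  # False
print(matches_pattern(strict, "ABC-1234"))      # True
```

If the whole value must match, write the `^` and `$` anchors into the pattern yourself.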
Arrays
{
  "type": "array",
  "items": { "type": "string" },
  "minItems": 1,
  "maxItems": 10,
  "uniqueItems": true
}
Validates an array of 1-10 unique strings.
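Spelled out as plain Python, the four constraints amount to roughly the following. This is a toy sketch with a hypothetical helper name, limited to arrays of strings; a real validator compares items using JSON equality across all types.

```python
def string_array_errors(value):
    """Toy check for: items are strings, minItems 1, maxItems 10, uniqueItems."""
    if not isinstance(value, list):
        return ["value must be an array"]
    errors = []
    if len(value) < 1:                                    # "minItems": 1
        errors.append("needs at least 1 item")
    if len(value) > 10:                                   # "maxItems": 10
        errors.append("allows at most 10 items")
    if not all(isinstance(item, str) for item in value):  # "items"
        errors.append("every item must be a string")
    elif len(set(value)) != len(value):                   # "uniqueItems": true
        errors.append("items must be unique")
    return errors

print(string_array_errors(["red", "green"]))  # []
print(string_array_errors(["red", "red"]))    # ['items must be unique']
```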
For arrays of objects, use a sub-schema:
{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "id": { "type": "integer" },
      "name": { "type": "string" }
    },
    "required": ["id", "name"]
  }
}
Reusing schemas with $ref
For anything non-trivial, you'll define types once and reference them in many places. JSON Schema uses $defs for this (introduced in draft 2019-09 and kept in 2020-12; draft-07 and earlier use definitions):
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$defs": {
    "Money": {
      "type": "object",
      "properties": {
        "amount": { "type": "integer", "minimum": 0 },
        "currency": { "type": "string", "minLength": 3, "maxLength": 3 }
      },
      "required": ["amount", "currency"]
    }
  },
  "type": "object",
  "properties": {
    "subtotal": { "$ref": "#/$defs/Money" },
    "tax": { "$ref": "#/$defs/Money" },
    "total": { "$ref": "#/$defs/Money" }
  }
}
Three fields, one definition. Change the Money shape once and every reference updates.
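The part of `$ref` after the `#` is a JSON Pointer into the same document. As a rough illustration of what a validator does with it, the sketch below walks such a pointer by hand. `resolve_local_ref` is a hypothetical helper that handles only same-document pointers and skips URI percent-decoding, and the schema is a trimmed-down copy of the one above.

```python
def resolve_local_ref(root, ref):
    """Resolve a same-document reference such as '#/$defs/Money'."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled here")
    pointer = ref[1:]
    if pointer == "":
        return root  # "#" alone refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("only JSON Pointer fragments are handled here")
    node = root
    for token in pointer.split("/")[1:]:
        # JSON Pointer escapes: ~1 means '/', ~0 means '~' (decode ~1 first).
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

schema = {
    "$defs": {"Money": {"type": "object", "required": ["amount", "currency"]}},
    "properties": {
        "subtotal": {"$ref": "#/$defs/Money"},
        "total": {"$ref": "#/$defs/Money"},
    },
}

money = resolve_local_ref(schema, schema["properties"]["total"]["$ref"])
print(money["required"])  # ['amount', 'currency']
```

Both properties resolve to the same Money object, which is why one change to the definition reaches every reference.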
Conditional schemas
For complex business rules, if/then/else handles "when X is true, also require Y":
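{
  "type": "object",
  "properties": {
    "shipping": { "enum": ["standard", "express"] },
    "address": { "type": "string" },
    "trackingId": { "type": "string" }
  },
  "if": {
    "properties": { "shipping": { "const": "express" } },
    "required": ["shipping"]
  },
  "then": { "required": ["trackingId"] }
}
If shipping is "express", trackingId is required. The required inside if matters: properties alone also passes when shipping is absent, which would make trackingId mandatory for documents with no shipping field at all.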
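A subtlety worth checking in any if/then rule: `properties` only constrains keys that are present, so an `if` built from `properties` alone also passes when the key is missing. The sketch below models that behaviour in plain Python. The helper names are hypothetical and this is not a validator; it only shows how adding `required` inside `if` narrows the condition.

```python
def const_holds(doc, key, expected):
    """`properties: {key: {const: expected}}` passes when key is absent."""
    return key not in doc or doc[key] == expected

def tracking_required(doc, guard_with_required):
    """Evaluate the `if` clause, optionally with `required: ["shipping"]`."""
    if_passes = const_holds(doc, "shipping", "express")
    if guard_with_required:
        if_passes = if_passes and "shipping" in doc
    return if_passes  # when `if` passes, `then` demands trackingId

order_without_shipping = {"address": "1 Main St"}
print(tracking_required(order_without_shipping, guard_with_required=False))  # True
print(tracking_required(order_without_shipping, guard_with_required=True))   # False
print(tracking_required({"shipping": "express"}, guard_with_required=True))  # True
```

Without the guard, an order that never mentions shipping is still asked for a trackingId.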
allOf, anyOf and oneOf cover most other compositional cases.
Validating in Node.js with Ajv
Ajv is the most widely used JSON Schema validator in the JavaScript ecosystem. It compiles each schema into a fast validation function that you can reuse across documents.
npm install ajv ajv-formats
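// The schemas in this guide declare draft 2020-12, so use Ajv's 2020 build;
// the default `ajv` export targets draft-07. Some native ESM setups may
// need the explicit path 'ajv/dist/2020.js'.
import Ajv2020 from 'ajv/dist/2020'
import addFormats from 'ajv-formats'

const ajv = new Ajv2020({ allErrors: true })
addFormats(ajv)

// `schema` and `data` are assumed to be defined elsewhere.
const validate = ajv.compile(schema)
const valid = validate(data)
if (!valid) {
  console.error(validate.errors)
}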
allErrors: true reports every problem instead of stopping at the first. For user-facing forms this is what you want; for high-throughput services, leave it off and short-circuit on the first error.
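ajv-formats adds support for the standard format keywords (email, uri, date-time, uuid and so on). Without it, Ajv v8 does not know these formats: in its default strict mode it refuses to compile a schema that uses them, and with strict mode off it ignores them.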
Validating in Python with jsonschema
pip install jsonschema
from jsonschema import validate, ValidationError

# `data` and `schema` are assumed to be defined elsewhere.
try:
    validate(instance=data, schema=schema)
except ValidationError as e:
    print(f"Validation failed: {e.message}")
    print(f"At path: {list(e.absolute_path)}")
For faster validation, look at fastjsonschema, which generates Python validation code from the schema, much as Ajv generates JavaScript functions.
Where to put schemas in your codebase
Three patterns work well:
- Inline: small schemas live next to the code that uses them. Fine for internal APIs with a small surface.
- One file per resource: schemas/user.json, schemas/order.json. Easy to find, easy to diff.
- OpenAPI specification: for public REST APIs, generate JSON Schemas from your OpenAPI spec so both stay in sync.
Whichever pattern you pick, version the schemas. A schema change is an API change. Treat v1/user.json and v2/user.json as separate files and migrate clients deliberately.
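One way to make the version explicit in code is a small loader that takes the version as an argument. This is only a sketch: the `schemas/<version>/<resource>.json` layout and the `load_schema` name are assumptions chosen for illustration, so adapt them to wherever your schema files live.

```python
import json
from pathlib import Path

# Assumed (hypothetical) layout: schemas/v1/user.json, schemas/v2/user.json
SCHEMA_ROOT = Path("schemas")

def load_schema(version, resource, root=SCHEMA_ROOT):
    """Load <root>/<version>/<resource>.json and return it as a dict."""
    path = Path(root) / version / f"{resource}.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

A handler for the v1 API would then call `load_schema("v1", "user")`, and a v2 handler `load_schema("v2", "user")`, so each endpoint states which contract it validates against.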
Generating types from schemas
The killer feature of JSON Schema for TypeScript projects is generating types automatically. json-schema-to-typescript reads the User schema above and outputs an interface along these lines:
interface User {
  name: string
  email: string
  age?: number
}
Now your code is type-safe against the same schema your validator uses. As long as the types are regenerated whenever the schema changes, drift between "the API says" and "the code says" becomes far less likely, because both are derived from the same source.
Python has datamodel-code-generator for the equivalent (Pydantic models from JSON Schema). Other ecosystems have similar tools.
Best practices
- Start strict, loosen when needed. Begin with additionalProperties: false and required fields, then relax as real-world data forces you to. This is much easier than the reverse: every loosening is a non-breaking change, every tightening is breaking.
- Use $ref for reusable types. Define Address, Money, Timestamp once. Schema-level DRY pays off the same way code-level DRY does.
- Validate at the boundary. API request handlers, queue consumers, file importers — anywhere data enters your trusted code. Once data is inside, repeated validation is wasted CPU.
- Don't validate the same shape twice. Pick the boundary deliberately. If a service receives validated data from another internal service, re-validating is paranoid; if it receives data from a public endpoint, validate aggressively.
- Include $id and $schema. $schema tells validators which draft of JSON Schema to use. $id gives the schema a stable URL for $ref to target. Both make schemas more portable.
- Test your schemas. Treat them like code. A test suite that runs known-good and known-bad documents against the schema catches mistakes before production does.
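For the last point, a schema test can be as small as a table of known-good and known-bad documents. The harness below is a sketch that works with any callable returning True or False. `failing_cases` and the stand-in `has_name_and_email` are hypothetical names; in practice you would pass a thin wrapper around your real validator.

```python
def failing_cases(is_valid, good_docs, bad_docs):
    """Return descriptions of cases where the validator disagrees with us."""
    failures = []
    for doc in good_docs:
        if not is_valid(doc):
            failures.append(f"rejected a known-good document: {doc!r}")
    for doc in bad_docs:
        if is_valid(doc):
            failures.append(f"accepted a known-bad document: {doc!r}")
    return failures

# Stand-in validator for the demo only; swap in your real one.
def has_name_and_email(doc):
    return isinstance(doc, dict) and "name" in doc and "email" in doc

good = [{"name": "Ada", "email": "ada@example.com"}]
bad = [{"name": "Ada"}, {}]

print(failing_cases(has_name_and_email, good, bad))  # []
```

An empty list means the schema agrees with every example; anything else names the document that exposed the mistake.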
When NOT to use JSON Schema
JSON Schema is the right tool when:
- The data crosses a process or network boundary
- Multiple languages need to agree on the format
- You want runtime validation, not just compile-time
It's overkill when:
- The data lives entirely inside one TypeScript or Pydantic codebase, where types do the job faster
- The structure is so flexible that a schema would be a wall of anyOf (consider whether the data should be flatter)
- You only need parseability, not shape; use a JSON validator instead
Try it now
Sketch the shape of one of your API responses, write a schema with the keywords above, and run a real example through Ajv or jsonschema. Most teams find the first schema reveals one or two subtle assumptions they had been carrying implicitly — once those are written down, the API gets clearer for everyone.
And before you write the schema: paste your sample JSON into the JSONNeat formatter to make sure it's syntactically valid. A misformatted document will fail every schema and waste an hour of debugging on the wrong layer.