Written by the JSONNeat maintainer

JSON Schema Tutorial: Validate JSON Like a Pro (2026 Guide)

Syntax checking vs structural checking

JSON validation answers a single question: is this document parseable JSON? That's necessary but rarely enough. A document can be syntactically perfect — every brace closed, every quote matched — and still be wrong for your application. The required email field might be missing. The age might be a string when your code expects a number. An unknown top-level key might indicate a stale client.

JSON Schema fills that gap. It's a standardised vocabulary for describing the *structure* of valid JSON: required fields, allowed types, value ranges, string patterns, array constraints, conditional dependencies. Once you have a schema, a validator can tell you not just "this parses" but "this has the right shape" — with a useful error message when it doesn't.

Think of it as a contract between systems. The schema is the rulebook; your JSON is the document being checked against it. Cross every boundary with a validation step and you eliminate an entire class of bugs.
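To make the gap concrete, here is a minimal Python sketch using only the standard library. The payload and the two hand-written checks are illustrative assumptions; the point is that the parser accepts a document that is missing email and carries age as a string.

```python
import json

# Hypothetical payload: syntactically valid JSON, structurally wrong.
raw = '{"name": "Ada", "age": "36"}'

doc = json.loads(raw)  # parses without error

# The parser has no opinion on shape, so these problems slip through.
problems = []
if "email" not in doc:
    problems.append("missing required field: email")
if not isinstance(doc.get("age"), int):
    problems.append("age is not an integer")

print(problems)
```

Hand-written checks like these are what a schema replaces: the rules move out of the code and into a document that every system can share.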

A minimal example

Here's a schema describing a simple user record:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "User",
  "type": "object",
  "properties": {
    "name":  { "type": "string", "minLength": 1 },
    "age":   { "type": "integer", "minimum": 0 },
    "email": { "type": "string", "format": "email" }
  },
  "required": ["name", "email"]
}

In plain English: every document must be an object. name must be a non-empty string, age is optional but must be a non-negative integer when present, and email must be a string formatted as an email address. Both name and email are required. Keys other than these three are still allowed, because additionalProperties defaults to true.

Now valid documents:

{ "name": "Ada", "email": "ada@example.com" }
{ "name": "Ada", "email": "ada@example.com", "age": 200 }

Now invalid documents:

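{ "name": "Ada" }                                          // missing email
{ "name": "", "email": "ada@example.com" }                 // empty name
{ "name": "Ada", "email": "not-an-email" }                 // bad email format
{ "name": "Ada", "email": "ada@example.com", "age": -1 }   // negative age

The schema catches all four, with one caveat: in draft 2020-12, format is an annotation by default, so the bad-email case is rejected only when the validator has format checking switched on. (The // comments are notes for this article; JSON itself has no comment syntax.)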

The core keywords you'll use every day

There are dozens of keywords in JSON Schema, but ten cover most real-world use:

  • type — the basic data type: string, number, integer, boolean, array, object, null. Can also be an array of types for "one of these": ["string", "null"].
  • required — an array of property names that must be present.
  • properties — an object whose keys are field names and whose values are sub-schemas for those fields.
  • additionalProperties — set to false to reject unknown keys, true (default) to allow them, or a schema to constrain unknown keys' values.
  • enum — restrict a value to one of a fixed list: { "enum": ["draft", "published", "archived"] }.
  • const — like enum with one element: { "const": "v1" }.
  • pattern — regex that strings must match. JavaScript regex syntax, not Python or PCRE.
  • minLength / maxLength / minimum / maximum / minItems / maxItems — value range constraints.
  • items — schema for elements of an array.
  • $ref — reference another schema, allowing reuse.

Arrays

{
  "type": "array",
  "items": { "type": "string" },
  "minItems": 1,
  "maxItems": 10,
  "uniqueItems": true
}

Validates an array of 1-10 unique strings.

For arrays of objects, use a sub-schema:

{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "id":    { "type": "integer" },
      "name":  { "type": "string" }
    },
    "required": ["id", "name"]
  }
}

Reusing schemas with $ref

For anything non-trivial, you'll define types once and reference them in many places. JSON Schema uses $defs (in draft 2020-12; definitions in older drafts) for this:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$defs": {
    "Money": {
      "type": "object",
      "properties": {
        "amount":   { "type": "integer", "minimum": 0 },
        "currency": { "type": "string", "minLength": 3, "maxLength": 3 }
      },
      "required": ["amount", "currency"]
    }
  },
  "type": "object",
  "properties": {
    "subtotal": { "$ref": "#/$defs/Money" },
    "tax":      { "$ref": "#/$defs/Money" },
    "total":    { "$ref": "#/$defs/Money" }
  }
}

Three fields, one definition. Change the Money shape once and every reference updates.
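The part of a $ref after the # is a JSON Pointer into the same document. As a rough illustration of what a validator does with it, here is a stdlib-only sketch. resolve_local_ref is a hypothetical helper, and it handles only same-document pointer references, not remote URIs or lookups by $id.

```python
def resolve_local_ref(root, ref: str):
    """Follow a same-document $ref such as "#/$defs/Money"."""
    if ref == "#":
        return root
    if not ref.startswith("#/"):
        raise ValueError("only same-document JSON Pointer references are handled")
    node = root
    for token in ref[2:].split("/"):
        # JSON Pointer escapes: "~1" stands for "/", "~0" stands for "~".
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

schema = {
    "$defs": {
        "Money": {"type": "object", "required": ["amount", "currency"]},
    },
    "properties": {
        "subtotal": {"$ref": "#/$defs/Money"},
        "total": {"$ref": "#/$defs/Money"},
    },
}

subtotal = resolve_local_ref(schema, schema["properties"]["subtotal"]["$ref"])
total = resolve_local_ref(schema, schema["properties"]["total"]["$ref"])
print(subtotal is total)  # both references land on the same definition
```

Because every reference resolves to the one Money object, there is no copy to fall out of date.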

Conditional schemas

For complex business rules, if/then/else handles "when X is true, also require Y":

{
  "type": "object",
  "properties": {
    "shipping":  { "enum": ["standard", "express"] },
    "address":   { "type": "string" },
    "trackingId":{ "type": "string" }
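  },
  "if": {
    "properties": { "shipping": { "const": "express" } },
    "required": ["shipping"]
  },
  "then": { "required": ["trackingId"] }
}

If shipping is "express", trackingId is required. The required inside if matters: properties only constrains keys that are present, so without it a document that omits shipping entirely would also satisfy the if and be forced to include trackingId.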

allOf, anyOf and oneOf cover most other compositional cases.

Validating in Node.js with Ajv

Ajv is the most widely used JSON Schema validator in the JavaScript ecosystem. It compiles each schema into a fast validation function that you can reuse across documents.

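npm install ajv ajv-formats

// The schemas in this guide declare draft 2020-12, which needs Ajv's 2020 build;
// the default 'ajv' export targets draft-07.
import Ajv2020 from 'ajv/dist/2020'
import addFormats from 'ajv-formats'

const ajv = new Ajv2020({ allErrors: true })
addFormats(ajv)

// Compile once, then reuse the function for every document.
const validate = ajv.compile(schema)
const valid = validate(data)
if (!valid) {
  console.error(validate.errors)
}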

allErrors: true reports every problem instead of stopping at the first. For user-facing forms this is what you want; for high-throughput services, leave it off and short-circuit on the first error.

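ajv-formats adds the standard format validators (email, uri, date-time, uuid and so on). Ajv v8 ships without them, and in its default strict mode it throws an "unknown format" error when compiling a schema that uses one, so either add the plugin or turn format validation off with validateFormats: false.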

Validating in Python with jsonschema

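pip install jsonschema

from jsonschema import FormatChecker, ValidationError, validate

try:
    # format_checker is opt-in; without it "format": "email" is not enforced.
    validate(instance=data, schema=schema, format_checker=FormatChecker())
except ValidationError as e:
    print(f"Validation failed: {e.message}")
    print(f"At path: {list(e.absolute_path)}")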

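For faster validation, look at fastjsonschema, which generates Python code from the schema, much as Ajv generates JavaScript. Check its draft support before adopting it (at the time of writing it targets draft-07 and earlier), and note that it stops at the first error, so jsonschema remains the better choice when you need detailed error reports.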

Where to put schemas in your codebase

Three patterns work well:

  1. Inline: small schemas live next to the code that uses them. Fine for internal APIs with a small surface.
  2. One file per resource: schemas/user.json, schemas/order.json. Easy to find, easy to diff.
  3. OpenAPI specification: for public REST APIs, generate JSON Schemas from your OpenAPI spec so both stay in sync.

Whichever pattern you pick, version the schemas. A schema change is an API change. Treat v1/user.json and v2/user.json as separate files and migrate clients deliberately.
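One way to make the version explicit in code is to look schemas up by resource name and version rather than by raw path. A sketch using only the standard library; the schemas/v1/user.json layout is an assumption that combines the two naming examples above, and load_schema and SCHEMA_ROOT are hypothetical names.

```python
import json
from functools import lru_cache
from pathlib import Path

SCHEMA_ROOT = Path("schemas")  # assumed layout: schemas/v1/user.json, schemas/v2/user.json

@lru_cache(maxsize=None)
def load_schema(resource: str, version: int, root: Path = SCHEMA_ROOT) -> dict:
    """Load <root>/v<version>/<resource>.json, caching the parsed result."""
    path = root / f"v{version}" / f"{resource}.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

A handler for the v2 API would then call load_schema("user", 2), which keeps the version visible at the call site and makes an accidental mix of v1 and v2 schemas easy to spot in review.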

Generating types from schemas

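The killer feature of JSON Schema for TypeScript projects is generating types automatically. For the User schema above, json-schema-to-typescript produces roughly the following (the real output is an exported interface and, because the schema allows additional properties, also includes an index signature):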

interface User {
  name: string
  email: string
  age?: number
}

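Now your code is type-checked against the same schema your validator uses. As long as the types are regenerated whenever the schema changes (for example, as a build step), drift between "the API says" and "the code says" is caught at compile time, because both derive from the same source.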

Python has datamodel-code-generator for the equivalent (Pydantic models from JSON Schema). Other ecosystems have similar tools.

Best practices

  1. Start strict, loosen when needed. Begin with additionalProperties: false and required fields, then relax as real-world data forces you to. This is much easier than the reverse: every loosening is a non-breaking change, every tightening is breaking.
  2. Use $ref for reusable types. Define Address, Money, Timestamp once. Schema-level DRY pays off the same way code-level DRY does.
  3. Validate at the boundary. API request handlers, queue consumers, file importers: anywhere data enters your trusted code. Once data is inside, repeated validation is wasted CPU.
  4. Don't validate the same shape twice. Pick the boundary deliberately. If a service receives validated data from another internal service, re-validating is paranoid; if it receives data from a public endpoint, validate aggressively.
  5. Include $id and $schema. $schema tells validators which draft of JSON Schema to use. $id gives the schema a stable URL for $ref to target. Both make schemas more portable.
  6. Test your schemas. Treat them like code. A test suite that runs known-good and known-bad documents against the schema catches mistakes before production does.

When NOT to use JSON Schema

JSON Schema is the right tool when:

  • The data crosses a process or network boundary
  • Multiple languages need to agree on the format
  • You want runtime validation, not just compile-time

It's overkill when:

  • The data lives entirely inside one TypeScript or Pydantic codebase — types do the job faster
  • The structure is so flexible that a schema would be a wall of anyOf (consider whether the data should be flatter)
  • You only need parseability, not shape — use a JSON validator instead

Try it now

Sketch the shape of one of your API responses, write a schema with the keywords above, and run a real example through Ajv or jsonschema. Most teams find the first schema reveals one or two subtle assumptions they had been carrying implicitly — once those are written down, the API gets clearer for everyone.

And before you write the schema: paste your sample JSON into the JSONNeat formatter to make sure it's syntactically valid. A document that does not parse never reaches schema validation, so a stray comma can cost you an hour of debugging on the wrong layer.

