JSV


A JSON Schema Validation library for Elixir with full support for the latest JSON Schema specification.

Documentation

The API documentation is available on hexdocs.pm.

This document describes general considerations and recipes to use the library.

Getting started

Installation

def deps do
  [
    {:jsv, "~> 0.4"}
  ]
end

Additional dependencies can be added to support more features:

def deps do
  [
    # Optional libraries for better format validation support

    # Email validation
    {:mail_address, "~> 1.0"},

    # URI, IRI, JSON-pointers validation
    {:abnf_parsec, "~> 1.0"},


    # Optional libraries to decode schemas resolved from http
    # prior to Elixir 1.18

    {:jason, "~> 1.0"},
    # OR
    {:poison, "~> 6.0 or ~> 5.0"}
  ]
end

Basic usage

The following snippet describes the general usage of the library in any context.

The rest of the documentation describes how to use JSV in the context of an application.

# 1. Define a schema
schema = %{
  type: :object,
  properties: %{
    name: %{type: :string}
  },
  required: [:name]
}

# 2. Build the validation root
root = JSV.build!(schema)

# 3. Validate the data
case JSV.validate(%{"name" => "Alice"}, root) do
  {:ok, data} ->
    {:ok, data}

  {:error, validation_error} ->
    # Errors can be cast to a JSON-compatible data structure to send them as
    # an API response or for logging purposes.
    {:error, JSON.encode!(JSV.normalize_error(validation_error))}
end

Core concepts

Input schema format

"Raw schemas" are schemas defined in Elixir data structures such as %{"type" => "integer"}.

JSV does not accept JSON strings. You will need to decode the JSON strings before giving them to the build function. There are three different possible formats for a schema:

  1. A boolean. Booleans are valid schemas that accept anything (true) or reject everything (false).

  2. A map with binary keys and values such as %{"type" => "integer"}.

  3. A map with atom keys and/or values such as %{type: :integer}.

    The JSV.Schema struct can be used for autocompletion and provides a special behaviour over a raw map with atoms: any nil value found in the struct will be ignored.

    Raw maps and other structs have their nil values kept and treated as-is (it's generally invalid in a JSON schema).

    The :__struct__ property of other structs is safely ignored.

Atoms are converted to binaries internally, so it is technically possible to mix atoms and binaries as map keys or values. However, the behaviour for duplicate keys is not defined by the library. Example: %{"type" => "string", :type => "integer"}.
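To picture what "converted to binaries" means, here is a minimal sketch of such a conversion in plain Elixir. The SchemaNormalizeSketch module is hypothetical and is not how JSV implements it; it only illustrates why two keys such as "type" and :type end up competing for the same slot.

```elixir
defmodule SchemaNormalizeSketch do
  # Hypothetical helper for illustration only; not part of JSV.

  # Booleans and nil are atoms in Elixir but are kept as-is.
  def normalize(value) when is_boolean(value) or is_nil(value), do: value
  def normalize(value) when is_atom(value), do: Atom.to_string(value)
  def normalize(value) when is_list(value), do: Enum.map(value, &normalize/1)

  # Plain maps: convert atom keys, then recurse into the values.
  def normalize(value) when is_map(value) and not is_struct(value) do
    Map.new(value, fn {key, sub} -> {normalize_key(key), normalize(sub)} end)
  end

  # Anything else (numbers, binaries, structs) is returned unchanged.
  def normalize(value), do: value

  defp normalize_key(key) when is_atom(key), do: Atom.to_string(key)
  defp normalize_key(key), do: key
end

SchemaNormalizeSketch.normalize(%{type: :object, required: [:name]})
# => %{"type" => "object", "required" => ["name"]}
```

With mixed keys, both "type" and :type collapse to the same binary key, so only one of the two values survives. Which one is kept should not be relied upon.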

Resolvers overview

In order to build schemas properly, JSV needs to resolve the schema as a first step.

Resolving means fetching any remote resource whose data is needed but not available locally. In practice, these are $schema, $ref or $dynamicRef properties pointing to an absolute URI.

Those URIs are generally URLs with the http:// or https:// scheme, but other custom schemes can be used, and there are many ways to fetch HTTP resources in Elixir.

For security reasons, the default resolver, JSV.Resolver.Embedded, ships official meta-schemas as part of the source code and can only resolve those schemas.

For convenience reasons, a resolver that can fetch from the web is provided (JSV.Resolver.Httpc) but it needs to be manually declared by users of the JSV library. Refer to the documentation of this module for more information.

Custom resolvers can be defined for more advanced use cases.

Meta-schemas: Introduction to vocabularies

You can safely skip this section if you are not interested in the inner workings of the modern JSON schema specification.

JSV was built in compliance with the vocabulary mechanism of JSON schema, to support custom and optional keywords in the schemas.

Here is what happens when validating with the latest specification:

  1. The well-known and official schema https://json-schema.org/draft/2020-12/schema defines the following vocabulary:

    {
      "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": true,
        "https://json-schema.org/draft/2020-12/vocab/applicator": true,
        "https://json-schema.org/draft/2020-12/vocab/unevaluated": true,
        "https://json-schema.org/draft/2020-12/vocab/validation": true,
        "https://json-schema.org/draft/2020-12/vocab/meta-data": true,
        "https://json-schema.org/draft/2020-12/vocab/format-annotation": true,
        "https://json-schema.org/draft/2020-12/vocab/content": true
      }
    }

    The vocabulary is split into several parts, here one per object property. More information can be found on the official website.

  2. Libraries such as JSV must map this vocabulary to implementations. For instance, in JSV, the https://json-schema.org/draft/2020-12/vocab/validation part that defines the type keyword is implemented with the JSV.Vocabulary.V202012.Validation Elixir module.

  3. We can declare a schema that would like to use the type keyword. To let the library know what implementation to use for that keyword, the schema declares the https://json-schema.org/draft/2020-12/schema as its meta-schema using the $schema keyword.

    JSV will use that exact value if the $schema keyword is not specified.

    {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "type": "integer"
    }

    This tells the library to pull the vocabulary from the meta-schema and apply it to the schema.

  4. As JSV is compliant, it will use its implementation of https://json-schema.org/draft/2020-12/vocab/validation to handle the type keyword and validate data types.

    This also means that you can use a custom meta schema to skip some parts of the vocabulary, or add your own.
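The mapping from vocabulary URIs to implementations described in step 2 can be pictured with a small lookup table. This is a sketch in plain Elixir, not JSV's internal code: VocabularySketch is a hypothetical module, and only JSV.Vocabulary.V202012.Validation is taken from this document. The other implementation modules are left out rather than guessed.

```elixir
defmodule VocabularySketch do
  # Hypothetical lookup table. Only the validation entry is named in this
  # document; the module is referenced by name and never called here.
  @implementations %{
    "https://json-schema.org/draft/2020-12/vocab/validation" =>
      JSV.Vocabulary.V202012.Validation
  }

  # Returns {modules, unknown_uris} for the vocabularies declared by a
  # decoded meta-schema.
  def split(meta_schema) do
    uris =
      meta_schema
      |> Map.get("$vocabulary", %{})
      |> Map.keys()
      |> Enum.sort()

    {known, unknown} = Enum.split_with(uris, &Map.has_key?(@implementations, &1))
    {Enum.map(known, &Map.fetch!(@implementations, &1)), unknown}
  end
end

meta_schema = %{
  "$vocabulary" => %{
    "https://json-schema.org/draft/2020-12/vocab/core" => true,
    "https://json-schema.org/draft/2020-12/vocab/validation" => true
  }
}

VocabularySketch.split(meta_schema)
# => {[JSV.Vocabulary.V202012.Validation],
#     ["https://json-schema.org/draft/2020-12/vocab/core"]}
```

A custom meta-schema that omits a vocabulary URI simply produces no implementation for the keywords that vocabulary defines.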

Building schemas

In this chapter we will see how to build schemas from raw resources. The examples will mention the JSV.build/2 or JSV.build!/2 functions interchangeably. Everything described here applies to both.

Schemas are built according to their meta-schema vocabulary. JSV will assume that the $schema value is "https://json-schema.org/draft/2020-12/schema" by default if not provided.

Once built, a schema is converted into a JSV.Root, an internal representation of the schema that can be used to perform validation.

Enable or disable format validation

By default, the https://json-schema.org/draft/2020-12/schema meta schema does not perform format validation. This is counterintuitive, but it means that the following code will return {:ok, "not a date"}:

schema =
  JSON.decode!("""
  {
    "type": "string",
    "format": "date"
  }
  """)

root = JSV.build!(schema)

JSV.validate("not a date", root)

To always enable format validation when building a root schema, provide the formats: true option to JSV.build/2:

JSV.build(raw_schema, formats: true)

This is another reason to wrap JSV.build with a custom builder module!

Note that format validation is determined at build time. There is no way to change whether it is performed once the root schema is built.
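The difference between the two build modes can be checked side by side. The snippet below is a small sketch that assumes the :jsv dependency from the installation section is available. It uses an atom-based schema instead of decoded JSON and only relies on the functions and the formats: true option described in this document.

```elixir
# Same schema as above, written with atoms.
schema = %{type: :string, format: :date}

# Format validation is decided here, at build time.
lenient = JSV.build!(schema)
strict = JSV.build!(schema, formats: true)

# Without format validation, any string is accepted.
{:ok, "not a date"} = JSV.validate("not a date", lenient)

# With formats: true, the same value is rejected.
{:error, _} = JSV.validate("not a date", strict)

# A well-formed date is accepted by the strict root.
{:ok, _} = JSV.validate("2024-01-31", strict)
```

Since the choice is baked into the root, an application that needs both behaviours has to build and keep two roots.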

You can also enable format validation by using the JSON Schema specification semantics, though we strongly advise just using the :formats option and calling it a day.

For format validation to be enabled, a schema should declare the https://json-schema.org/draft/2020-12/vocab/format-assertion vocabulary instead of the https://json-schema.org/draft/2020-12/vocab/format-annotation one that is included by default in the https://json-schema.org/draft/2020-12/schema meta schema.

So, first we would declare a new meta schema:

{
    "$id": "custom://with-formats-on/",
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": true,
        "https://json-schema.org/draft/2020-12/vocab/format-assertion": true
    },
    "$dynamicAnchor": "meta",
    "allOf": [
        { "$ref": "https://json-schema.org/draft/2020-12/meta/core" },
        { "$ref": "https://json-schema.org/draft/2020-12/meta/format-assertion" }
    ]
}

This example is taken from the JSON Schema Test Suite codebase and does not include all the vocabularies, only the core vocabulary and the format assertion vocabulary.

Then we would declare our schema using that vocabulary to perform validation. Of course our resolver must be able to resolve the given URL for the new $schema property.

schema =
  JSON.decode!("""
  {
    "$schema": "custom://with-formats-on/",
    "type": "string",
    "format": "date"
  }
  """)

root = JSV.build!(schema, resolver: ...)

With this new meta-schema, JSV.validate/2 would return an error tuple without needing the formats: true option.

{:error, _} = JSV.validate("hello", root)

In this case, it is also possible to disable the validation for schemas that use a meta-schema where the assertion vocabulary is declared:

JSV.build(raw_schema, formats: false)

Custom build modules

With that in mind, we suggest defining a custom module to wrap the JSV.build/2 function, so the resolver, formats and vocabularies are defined only once.

That module could be implemented like this:

defmodule MyApp.SchemaBuilder do
  def build_schema!(raw_schema) do
    JSV.build!(raw_schema, resolver: MyApp.SchemaResolver, formats: true)
  end
end

Compile-time builds

It is strongly encouraged to build schemas at compile time, in order to avoid repeating the build step for no good reason.

For instance, if we have this function that should validate external data:

# Do not do this

def validate_order(order) do
  root =
    "priv/schemas/order.schema.json"
    |> File.read!()
    |> JSON.decode!()
    |> MyApp.SchemaBuilder.build_schema!()

  case JSV.validate(order, root) do
    {:ok, _} -> OrderHandler.handle_order(order)
    {:error, _} = err -> err
  end
end

The schema will be built each time the function is called. Building a schema is actually pretty fast but it is a waste of resources nevertheless.

One could do the following to get a net performance gain:

# Do this instead

@order_schema "priv/schemas/order.schema.json"
              |> File.read!()
              |> JSON.decode!()
              |> MyApp.SchemaBuilder.build_schema!()

defp order_schema, do: @order_schema

def validate_order(order) do
  case JSV.validate(order, order_schema()) do
    {:ok, _} -> OrderHandler.handle_order(order)
    {:error, _} = err -> err
  end
end

You can also define a module where all your schemas are built and exported as functions:

defmodule MyApp.Schemas do
  schemas = [
    order: "tmp/order.schema.json",
    shipping: "tmp/shipping.schema.json"
  ]

  Enum.each(schemas, fn {fun, path} ->
    root =
      path
      |> File.read!()
      |> JSON.decode!()
      |> MyApp.SchemaBuilder.build_schema!()

    def unquote(fun)() do
      unquote(Macro.escape(root))
    end
  end)
end

...and use it elsewhere:

def validate_order(order) do
  case JSV.validate(order, MyApp.Schemas.order()) do
    {:ok, _} -> OrderHandler.handle_order(order)
    {:error, _} = err -> err
  end
end

Validation

To validate a term, call the JSV.validate/3 function like so:

JSV.validate(data, root_schema, opts)

General considerations

JSV supports all keywords of the 2020-12 specification except:

Formats

JSV supports multiple formats out of the box with its default implementation, but some are only available under certain conditions that will be specified for each format.

The following listing describes the condition for support and return value type for these default implementations. You can override those implementations by providing your own, as well as providing new formats. This will be described later in this document.

Also, note that by default, JSV format validation will return the original value, that is, the string form of the data. Some format validators can also cast the string to a more interesting data structure, for instance converting a date string to a Date struct. You can enable returning casted values by passing the cast_formats: true option to JSV.validate/3.

The listing below describes the values returned with that option enabled.

Important: Some formats require the abnf_parsec library to be available. You may add it as a dependency in your application and it will be used automatically.
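As an illustration of the cast_formats option, the following sketch validates the same date string twice. It assumes the :jsv dependency is available and that the default date implementation casts to a Date struct, as the paragraph above suggests. Treat the second pattern as an expectation to verify against your installed version.

```elixir
# Format validation must be enabled at build time for casting to apply.
root = JSV.build!(%{type: :string, format: :date}, formats: true)

# By default the original string is returned.
{:ok, "2024-01-31"} = JSV.validate("2024-01-31", root)

# With cast_formats: true the validator may return a richer value.
# Assumption: the "date" format yields a Date struct.
{:ok, %Date{} = date} = JSV.validate("2024-01-31", root, cast_formats: true)
```

Downstream code can then use the cast value directly, for example with Date.diff/2, instead of parsing the string a second time.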

date

date-time

duration

email

hostname

ipv4

ipv6

iri

iri-reference

json-pointer

regex

relative-json-pointer

time

unknown

uri

uri-reference

uri-template

uuid

Custom formats

In order to provide custom formats, or to override default implementations for formats, you may provide a list of modules as the value for the :formats options of JSV.build/2. Such modules must implement the JSV.FormatValidator behaviour.

For instance:

defmodule CustomFormats do
  @behaviour JSV.FormatValidator

  @impl true
  def supported_formats do
    ["greeting"]
  end

  @impl true
  def validate_cast("greeting", data) do
    case data do
      "hello " <> name -> {:ok, %Greeting{name: name}}
      _ -> {:error, :invalid_greeting}
    end
  end
end

With this module you can now call the builder with it:

JSV.build!(raw_schema, formats: [CustomFormats])

Note that this will disable all other formats. If you need to still support the default formats, a helper is available:

JSV.build!(raw_schema,
  formats: [CustomFormats | JSV.default_format_validator_modules()]
)

Format validation modules are checked during the build phase, in order. So you can override any format defined by a module that comes later in the list, including the default modules.

Struct schemas

Schemas can be used to define structs.

For instance, with this module definition:

defmodule MyApp.UserSchema do
  require JSV

  JSV.defschema(%{
    type: :object,
    properties: %{
      name: %{type: :string, default: ""},
      age: %{type: :integer, default: 0}
    }
  })
end

A struct will be defined with the appropriate default values:

iex> %MyApp.UserSchema{}
%MyApp.UserSchema{name: "", age: 0}

The module can be used as a schema to build a validator root and cast data to the corresponding struct:

iex> {:ok, root} = JSV.build(MyApp.UserSchema)
iex> data = %{"name" => "Alice"}
iex> JSV.validate(data, root)
{:ok, %MyApp.UserSchema{name: "Alice", age: 0}}

Casting to struct can be disabled by passing cast_structs: false into the options of JSV.validate/3.

The module can also be used in other schemas:

%{
  type: :object,
  properties: %{
    name: %{type: :string},
    owner: MyApp.UserSchema
  }
}

Resolvers

The JSV.build/2 and JSV.build!/2 functions accept a :resolver option that takes one or multiple JSV.Resolver implementations.

JSV will try each one in order to resolve a schema by its URI.

The JSV.Resolver.Embedded and JSV.Resolver.Internal are always enabled to support well-known URIs like https://json-schema.org/draft/2020-12/schema and module-based structs. They are tried last unless you provide them explicitly in a specific order in the option.
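The "try each resolver in order" rule can be pictured with the following sketch in plain Elixir. It is not JSV's internal code: ResolverChainSketch and its two nested resolvers are hypothetical, and they only mimic the resolve/2 shape (returning {:ok, schema} or {:error, reason}) used by the custom resolver shown later in this document.

```elixir
defmodule ResolverChainSketch do
  # Two hypothetical resolvers, each handling a single URI scheme.
  defmodule OnlyA do
    def resolve("a://" <> _ = uri, _opts), do: {:ok, %{"$id" => uri}}
    def resolve(uri, _opts), do: {:error, {:not_a, uri}}
  end

  defmodule OnlyB do
    def resolve("b://" <> _ = uri, _opts), do: {:ok, %{"$id" => uri}}
    def resolve(uri, _opts), do: {:error, {:not_b, uri}}
  end

  # Tries each resolver in list order and stops at the first success.
  # When all fail, the reasons are returned in the order they occurred.
  def resolve(uri, resolvers) do
    result =
      Enum.reduce_while(resolvers, {:error, []}, fn resolver, {:error, reasons} ->
        case resolver.resolve(uri, []) do
          {:ok, schema} -> {:halt, {:ok, schema}}
          {:error, reason} -> {:cont, {:error, [reason | reasons]}}
        end
      end)

    case result do
      {:ok, schema} -> {:ok, schema}
      {:error, reasons} -> {:error, Enum.reverse(reasons)}
    end
  end
end

alias ResolverChainSketch.{OnlyA, OnlyB}

ResolverChainSketch.resolve("b://example", [OnlyA, OnlyB])
# => {:ok, %{"$id" => "b://example"}}
```

Because the first success wins, a resolver placed earlier in the list takes precedence over later ones for any URI they can both resolve.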

Custom resolvers

Users are encouraged to write their own resolver to support advanced use cases.

To load schemas from a local directory, the JSV.Resolver.Local module can be used:

defmodule MyApp.LocalResolver do
  use JSV.Resolver.Local, source: [
    "priv/schemas",
    "priv/messaging/schemas",
    "priv/special.schema.json"
  ]
end

To resolve schemas from the web, you can use the JSV.Resolver.Httpc resolver, or implement your own web fetching resolver with an HTTP library like Req:

defmodule MyApp.WebResolver do
  @behaviour JSV.Resolver

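  def resolve("https://" <> _ = uri, _opts) do
    # Delegate known meta schemas to the embedded resolver
    with {:error, {:not_embedded, _}} <- JSV.Resolver.Embedded.resolve(uri, []),
         {:ok, %{status: 200, body: schema}} <- Req.get(uri) do
      {:ok, schema}
    else
      # Do not let a non-200 response pass as a resolved schema
      {:ok, %Req.Response{status: status}} -> {:error, {:http_status, status}}
      # Embedded hits and other errors are returned unchanged
      other -> other
    end
  end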

  def resolve(uri, _) do
    {:error, {:not_an_https_url, uri}}
  end
end

As mentioned above, you can pass both resolvers when needed:

root = JSV.build!(schema, resolver: [MyApp.LocalResolver, MyApp.WebResolver])

Development

Contributing

Pull requests are welcome given appropriate tests and documentation.

Roadmap