Put some JSON Schema in your life

Fact: JSON is the _de facto_ serialization standard of the web, deal with it. It is not just a matter of personal choice: you can deliver a full stack without ever changing your data structure.
07.02.2014
Diego Caponera

Important: there is no such thing as a JSON Object. JSON stands for JavaScript Object Notation and, trivially speaking, it is a way to store data in a [long] string.

Of course JSON gets deserialized into JavaScript objects during the lifecycle of the application [e.g. once on the server, after retrieving it from the database, and once on the client, after receiving it in an HTTP response]. As both the client and the server are JavaScript applications, the deserialization process is fast and harmless.
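A minimal sketch of that round trip, using only the standard JSON.stringify and JSON.parse functions. The user object is made-up sample data, not something taken from a real application:

```javascript
// A plain JavaScript object (hypothetical sample data).
const user = { id: 1, username: "johndoe", roles: ["ROLE_USER"] };

// Serialization: the object becomes a string, ready to be stored or sent.
const serialized = JSON.stringify(user);
console.log(typeof serialized); // "string"

// Deserialization: the string becomes a new, independent object again.
const restored = JSON.parse(serialized);
console.log(restored.username); // "johndoe"
console.log(restored === user); // false: same data, different object
```

What travels between the layers is always the string; objects only exist at the two ends.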

You consume what you store, you store what you consume.

Now that we agree on the spoken language, we should give it some rules in order to make it consistent across all the layers it will be carried through.

Why is it useful?

Just as our beloved XML had its own Schema [why are W3C pages always so ugly and '90s-ish?], JSON needed one too, for the same reasons:

  • describing existing data formats;
  • generating clear, human- and machine-readable documentation;
  • having complete structural validation, useful for automated testing and validating client-submitted data.

So here it is: our draft!

Concrete scenarios that come to my mind:

  1. an application needs both a client-side UI and a RESTful API to be developed at the same time. First the team agrees on the JSON Schema of the data, then both sides can work separately on their prototypes, mocking responses and requests according to the aforementioned schema;

  2. an application is heavily form-oriented, and thus requires both automated form generation and validation: it wouldn’t be complicated to generate HTML5 <form>s out of the schema rules, nor to validate HTTP requests against them once the user submits.

A concrete example

Let’s go through the schema definition of an all-star entity, the User, whose most common outfit is:

{
  "id" : 1,
  "firstName" : "John",
  "lastName" : "Doe",
  "username" : "johndoe",
  "email" : "john@doe.com",
  "phone" : "+49 30 6098388 0",
  "age" : 29,
  "roles" : ["ROLE_ADMIN", "ROLE_USER"]
}

Objections? So let’s start compiling our schema:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "User",
  "description": "A user of our application",
  "type": "object",
  "properties": {
    "id": {
      "description": "The unique identifier for a user",
      "type": "integer"
    }
  },
  "required": ["id"]
}

We just stated that we are describing a User entity of type object, whose properties so far include only the id, which is an integer and is of course required.
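To make the type keyword tangible, here is a minimal hand-rolled check for the two types used so far. It is only a sketch, not a real validator: matchesType is a hypothetical helper, and anything other than object and integer is deliberately rejected.

```javascript
// Hypothetical helper: does `value` satisfy a JSON Schema "type" keyword?
// Only the two types used in the schema so far are covered.
function matchesType(type, value) {
  if (type === "integer") {
    // JavaScript has a single number type, so integers need an explicit test.
    return Number.isInteger(value);
  }
  if (type === "object") {
    // typeof null and typeof [] are both "object", so rule them out.
    return typeof value === "object" && value !== null && !Array.isArray(value);
  }
  throw new Error("Unsupported type in this sketch: " + type);
}

console.log(matchesType("object", { id: 1 })); // true
console.log(matchesType("object", [1]));       // false
console.log(matchesType("integer", 1));        // true
console.log(matchesType("integer", 1.5));      // false
console.log(matchesType("integer", "1"));      // false
```

Note that the string "1" is not an integer for the schema, even though JavaScript would happily coerce it.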

Then let’s add further details:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "User",
  "description": "A user of our application",
  "type": "object",
  "properties": {
    "id": {
      "description": "The unique identifier for a user",
      "type": "integer"
    },
    "firstName" : {
      "description" : "User's first name",
      "type" : "string"
    },
    "lastName" : {
      "description" : "User's last name",
      "type" : "string"
    },
    "email" : {
      "description" : "User's email",
      "type" : "string"
    },
    "phone" : {
      "description" : "User's phone number",
      "type" : "string"
    }
  },
  "required": ["id", "firstName", "lastName", "email"]
}

The phone is not required, so we leave it out of the schema's required array.
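In practice the required keyword boils down to a presence check. A minimal sketch, where missingRequired is a hypothetical helper name and the two sample users are invented for the occasion:

```javascript
// The "required" list from the schema above.
const schema = { required: ["id", "firstName", "lastName", "email"] };

// Hypothetical helper: names of required properties absent from `instance`.
function missingRequired(schema, instance) {
  return (schema.required || []).filter(function (name) {
    return !Object.prototype.hasOwnProperty.call(instance, name);
  });
}

// No phone: still fine, because "phone" is not listed as required.
const withoutPhone = { id: 1, firstName: "John", lastName: "Doe", email: "john@doe.com" };
console.log(missingRequired(schema, withoutPhone)); // []

// No email: reported, because "email" is required.
const withoutEmail = { id: 1, firstName: "John", lastName: "Doe", phone: "+49 30 6098388 0" };
console.log(missingRequired(schema, withoutEmail)); // [ 'email' ]
```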

What about age?

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "User",
  "description": "A user of our application",
  "type": "object",
  "properties": {
    "id": {
      "description": "The unique identifier for a user",
      "type": "integer"
    },
    "firstName" : {
      "description" : "User's first name",
      "type" : "string"
    },
    "lastName" : {
      "description" : "User's last name",
      "type" : "string"
    },
    "email" : {
      "description" : "User's email",
      "type" : "string"
    },
    "phone" : {
      "description" : "User's phone number",
      "type" : "string"
    },
    "age" : {
      "description" : "User's age",
      "type" : "integer",
      "minimum" : 14
    }
  },
  "required": ["id", "firstName", "lastName", "email"]
}

Let’s agree that 14 would be a reasonable minimum value for age.
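Here is what that rule means for a few candidate values, as a hand-rolled sketch. checkAge is a hypothetical helper and the error messages are made up; a real validator would word them differently.

```javascript
// The "age" rule from the schema above.
const ageRule = { type: "integer", minimum: 14 };

// Hypothetical helper: list the reasons why `value` breaks the rule.
function checkAge(rule, value) {
  const errors = [];
  if (!Number.isInteger(value)) {
    errors.push("age must be an integer");
  } else if (value < rule.minimum) {
    // "minimum" is inclusive by default: the bound itself is allowed.
    errors.push("age must be at least " + rule.minimum);
  }
  return errors;
}

console.log(checkAge(ageRule, 29));   // []
console.log(checkAge(ageRule, 14));   // []
console.log(checkAge(ageRule, 13));   // [ 'age must be at least 14' ]
console.log(checkAge(ageRule, "29")); // [ 'age must be an integer' ]
```

The last call is the reason why an age serialized as the string "29" would not pass: the schema asks for a number, not for something that looks like one.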

We just have the roles left over, so:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "User",
  "description": "A user of our application",
  "type": "object",
  "properties": {
    "id": {
      "description": "The unique identifier for a user",
      "type": "integer"
    },
    "firstName" : {
      "description" : "User's first name",
      "type" : "string"
    },
    "lastName" : {
      "description" : "User's last name",
      "type" : "string"
    },
    "email" : {
      "description" : "User's email",
      "type" : "string"
    },
    "phone" : {
      "description" : "User's phone number",
      "type" : "string"
    },
    "age" : {
      "description" : "User's age",
      "type" : "integer",
      "minimum" : 14
    },
    "roles" : {
      "description" : "Roles granted to user",
      "type" : "array",
      "items" : {
        "type" : "string"
      },
      "minItems" : 1,
      "uniqueItems" : true
    }
  },
  "required": ["id", "firstName", "lastName", "email"]
}

So roles are an array of strings: there must be at least one of them, and the array must not contain any duplicates, of course.
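The three array rules can be checked by hand in a few lines. Again a sketch under assumptions: checkRoles is a hypothetical helper with invented messages, and the Set shortcut for uniqueness only works because the items are plain strings.

```javascript
// The "roles" rule from the schema above.
const rolesRule = {
  type: "array",
  items: { type: "string" },
  minItems: 1,
  uniqueItems: true
};

// Hypothetical helper: list the reasons why `value` breaks the rule.
function checkRoles(rule, value) {
  if (!Array.isArray(value)) return ["roles must be an array"];
  const errors = [];
  if (!value.every(function (item) { return typeof item === "string"; })) {
    errors.push("every role must be a string");
  }
  if (value.length < rule.minItems) {
    errors.push("at least " + rule.minItems + " role is required");
  }
  // A Set drops duplicates, so a smaller size means something was repeated.
  // This shortcut is fine for strings; objects would need a deep comparison.
  if (rule.uniqueItems && new Set(value).size !== value.length) {
    errors.push("roles must not contain duplicates");
  }
  return errors;
}

console.log(checkRoles(rolesRule, ["ROLE_ADMIN", "ROLE_USER"])); // []
console.log(checkRoles(rolesRule, []));                          // [ 'at least 1 role is required' ]
console.log(checkRoles(rolesRule, ["ROLE_USER", "ROLE_USER"]));  // [ 'roles must not contain duplicates' ]
console.log(checkRoles(rolesRule, [1]));                         // [ 'every role must be a string' ]
```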

The process is linear, human-readable and -understandable. But what about machines?

JSON Schema used in libraries

Ok, we wrote our schemas and the team is very happy to stick to them, but how can they take part of the work off our shoulders? Some neat examples follow.

JSON Form is a client-side JavaScript library that generates HTML forms out of a JSON Schema.
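To give a rough idea of what form generation involves, here is a hand-rolled sketch. It is not JSON Form's API: schemaToForm is a hypothetical function, the schema is a trimmed-down copy of ours, and the mapping from schema types to input types is my own assumption.

```javascript
// A trimmed-down copy of the User schema from above.
const userSchema = {
  title: "User",
  type: "object",
  properties: {
    firstName: { description: "User's first name", type: "string" },
    age: { description: "User's age", type: "integer", minimum: 14 }
  },
  required: ["firstName"]
};

// Hypothetical function: build an HTML5 form string from the properties.
// Assumption: integers map to number inputs, everything else to text inputs.
// No HTML escaping is done, so this is for illustration only.
function schemaToForm(schema) {
  const required = schema.required || [];
  const fields = Object.keys(schema.properties).map(function (name) {
    const rule = schema.properties[name];
    const attrs = [
      'name="' + name + '"',
      'type="' + (rule.type === "integer" ? "number" : "text") + '"'
    ];
    if (typeof rule.minimum === "number") attrs.push('min="' + rule.minimum + '"');
    if (required.indexOf(name) !== -1) attrs.push("required");
    return "  <label>" + rule.description + " <input " + attrs.join(" ") + "></label>";
  });
  return "<form>\n" + fields.join("\n") + "\n</form>";
}

console.log(schemaToForm(userSchema));
```

The required and minimum rules end up as native HTML5 attributes, so the browser already enforces part of the schema before anything is submitted.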

JaySchema is a comprehensive JSON Schema validator for Node.js.

JSON Schema Random generates valid instances from a given schema file.
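The idea behind instance generation, which is also what makes mocking API responses possible, can be sketched by hand. This is not the library's API: sampleInstance is a hypothetical helper that only understands the keywords used in this article, and the trimmed-down schema leaves out uniqueItems because the sketch does not enforce it.

```javascript
// A trimmed-down copy of the User schema from above.
const userSchema = {
  type: "object",
  properties: {
    id: { type: "integer" },
    email: { type: "string" },
    age: { type: "integer", minimum: 14 },
    roles: { type: "array", items: { type: "string" }, minItems: 1 }
  },
  required: ["id", "email"]
};

// Hypothetical helper: build a random value that satisfies `schema`.
function sampleInstance(schema) {
  switch (schema.type) {
    case "integer": {
      // Start from the minimum (or 0) and add a random non-negative offset.
      const min = typeof schema.minimum === "number" ? schema.minimum : 0;
      return min + Math.floor(Math.random() * 100);
    }
    case "string":
      // Random base-36 characters; the content is meaningless on purpose.
      return Math.random().toString(36).slice(2, 10);
    case "array": {
      // At least minItems elements, plus up to two more.
      const length = (schema.minItems || 0) + Math.floor(Math.random() * 3);
      return Array.from({ length: length }, function () {
        return sampleInstance(schema.items);
      });
    }
    case "object": {
      // For simplicity every property is filled, required or not.
      const instance = {};
      Object.keys(schema.properties).forEach(function (name) {
        instance[name] = sampleInstance(schema.properties[name]);
      });
      return instance;
    }
    default:
      throw new Error("Unsupported type in this sketch: " + schema.type);
  }
}

const mock = sampleInstance(userSchema);
console.log(JSON.stringify(mock, null, 2));
```

The values are random on every run, so the output differs each time; what stays constant is that the result respects the types and bounds declared in the schema.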


Put some JSON Schema in your life/project: you will not regret it.