How to use JSON-LD


Written by: Allie Tatarian

JSON-LD (JSON for Linking Data) is more than just a data exchange syntax. The JSON-LD specification also includes a set of transformation algorithms to help you create linked data that is easy for machines to use and for humans to read and write.

Linked Data

JSON-LD is one of several W3C-recommended serializations of the Resource Description Framework (RDF), which makes it a good fit for creating linked data. Linking data makes data and information more accessible to both machines and people. It can help simplify knowledge management or even help search engines find information more efficiently. Google already uses JSON-LD and linked data: if you Google something like the name of a celebrity, the distance to the sun, or the capital of Nepal, you might see a box with information on the subject before you even get to any links. These are called "knowledge panels," and they are created using JSON-LD and linked data.

That being said, you don't need to know anything about W3C, RDF, or the semantic web to take advantage of JSON-LD and linked data! The next few sections show examples of how to use the transformation algorithms. But first, let's talk about the @context section, which is what makes linking data possible.

@context section

The @context section of a JSON-LD file is what sets it apart from plain JSON and makes both linking data and all of the transformation algorithms possible. The @context defines which external vocabularies will be used in your JSON-LD file: both the base/default vocabulary (using the @vocab tag) and additional vocabularies that can be used as CURIEs (compact URIs). In addition, the @context section can be used to make aliases for vocabulary terms. For instance, in one of our example files, the MetaSat concept "telecommunicationsNetwork" is aliased to "groundStationNetwork." This can improve clarity in certain contexts (in the example file, the only type of telecommunications network recorded was the SatNOGS ground station network), or help your files conform more closely to your own vocabularies.

Because of the many existing linked data vocabularies and high potential for aliasing, the @context section of a file may become long and unwieldy. In these cases, you can link out to an external context file. An external context file is a JSON-LD file with only an @context section that can be hosted remotely, separately from the file you are working with. For example, MetaSat has an external context file hosted on our GitLab repository. This file, or any other external context, can be linked to your file using the @import tag. When using @import, the line "@version": 1.1 must be included in the @context section, because the @import tag was introduced with JSON-LD version 1.1.

An @context section linked to an external context might look like this:

{
  "@context": {
    "@version": 1.1,
    "@import": "https://gitlab.com/metasat/metasat-toolkit/-/raw/master/context.jsonld",
    "@vocab": "https://schema.space/metasat/"
  }
}

Anything defined alongside the @import tag in the same @context, such as the @vocab tag in this example, will supersede whatever is in the external context.
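One way to picture this override behaviour is as a dictionary merge in which the local entries win. The sketch below is plain Python, not a JSON-LD processor: the contents of `imported_context` are made up for illustration (they are not MetaSat's real context file), and `merge_contexts` is a hypothetical helper name.

```python
import json

# Hypothetical contents of an external context file (illustration only).
imported_context = {
    "@vocab": "https://example.org/other-vocab/",
    "groundStationNetwork": "https://schema.space/metasat/telecommunicationsNetwork",
}

# Entries written locally, next to @import, in the document's own @context.
local_context = {
    "@vocab": "https://schema.space/metasat/",
}

def merge_contexts(imported, local):
    """Start from the imported entries, then let local entries override them."""
    merged = dict(imported)
    merged.update(local)
    return merged

effective_context = merge_contexts(imported_context, local_context)
print(json.dumps(effective_context, indent=2))
```

In this toy merge the local @vocab replaces the imported one, while the imported alias is kept because nothing local redefines it.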

Once our @context section is filled out, our JSON-LD file is capable of linking data. In addition, the @context section is crucial for the set of transformation algorithms that unlock the full potential of JSON-LD.

Transformation algorithms

The magic of JSON-LD lies in the fact that human-readable keys can be transformed into machine-readable URIs (Uniform Resource Identifiers). The human-readable form is called the "compacted" form, and the machine-readable form is called the "expanded" form.

In addition, JSON-LD can be either "flattened" or "framed." A framed document includes nesting, looks more hierarchical, and is easier for humans to understand. Machines prefer the non-nested "flattened" form, which is faster for them to process. Luckily, there are algorithms to flatten and frame JSON-LD, too!

These algorithms all exist as part of the JSON-LD specification, and can be used in several programming environments. You can find them all on the JSON-LD homepage. You can also try out any of these transformations in the JSON-LD playground.

Expansion

Expanding your JSON-LD relies on the @context section at the beginning of the document. Here's an example of a human-written JSON-LD document, which describes the song that spent the most weeks at number one on the Billboard Hot 100 music chart:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "song": "https://schema.org/MusicRecording",
    "artist": "https://schema.org/byArtist",
    "title": "https://schema.org/name"
  },
  "@type": "song",
  "@id": "https://www.wikidata.org/wiki/Q62587323",
  "title": "Old Town Road",
  "artist": "Lil Nas X"
}

As a human, you know that this is describing a song called "Old Town Road" by the artist Lil Nas X. But a machine doesn't know what any of that means! This is why you use the expansion algorithm: It takes what is in the @context section and uses it to produce something that is easier for a program to understand by replacing the plaintext keys with URIs. In this example, "song" is replaced with "https://schema.org/MusicRecording," "title" is replaced with "https://schema.org/name," and "artist" is replaced with "https://schema.org/byArtist":

[
  {
    "@id": "https://www.wikidata.org/wiki/Q62587323",
    "@type": [
      "https://schema.org/MusicRecording"
    ],
    "https://schema.org/byArtist": [
      {
        "@value": "Lil Nas X"
      }
    ],
    "https://schema.org/name": [
      {
        "@value": "Old Town Road"
      }
    ]
  }
]

This output file is called an expanded file. @id, @type, and @value are not expanded because they are keywords defined by the JSON-LD specification itself, so every JSON-LD processor already understands them.
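To make the key replacement concrete, here is a minimal Python sketch of the idea behind expansion. It is a toy, not the algorithm from the specification: it assumes a single flat object whose values are plain strings, and the names `expand_term` and `toy_expand` are invented for this example. For real documents, a full JSON-LD library is the safer choice.

```python
import json

document = {
    "@context": {
        "@vocab": "https://schema.org/",
        "song": "https://schema.org/MusicRecording",
        "artist": "https://schema.org/byArtist",
        "title": "https://schema.org/name",
    },
    "@type": "song",
    "@id": "https://www.wikidata.org/wiki/Q62587323",
    "title": "Old Town Road",
    "artist": "Lil Nas X",
}

def expand_term(term, context):
    """Turn a short key into a full URI using the context."""
    if term.startswith("@"):
        return term                           # keywords are left alone
    if term in context:
        return context[term]                  # explicit alias from the context
    if "://" in term:
        return term                           # already a full URI
    return context.get("@vocab", "") + term   # fall back to the default vocabulary

def toy_expand(doc):
    """Expand one flat object with string values (toy version)."""
    context = doc.get("@context", {})
    node = {}
    for key, value in doc.items():
        if key == "@context":
            continue                          # the context is consumed, not copied
        if key == "@id":
            node["@id"] = value
        elif key == "@type":
            node["@type"] = [expand_term(value, context)]
        else:
            node[expand_term(key, context)] = [{"@value": value}]
    return [node]

expanded = toy_expand(document)
print(json.dumps(expanded, indent=2))
```

For this one document, the sketch yields the same structure as the expanded file above; anything more involved (nesting, typed values, language tags) is outside what it handles.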

Compaction

Let's say you want to take the file that I've written for "Old Town Road" and incorporate it in your database of hit music singles. However, your database doesn't use the terms I've made up; it uses the terms defined by schema.org. That's where compaction comes in!

The compaction algorithm requires two inputs: The expanded JSON-LD you see above, and a new @context section. Since you are using schema.org terms, your new context will just be:

{
  "@context": {
    "@vocab": "https://schema.org/"
  }
}

When you run the two files through the compaction algorithm, your output will be:

{
  "@context": {
    "@vocab": "https://schema.org/"
  },
  "@id": "https://www.wikidata.org/wiki/Q62587323",
  "@type": "MusicRecording",
  "name": "Old Town Road",
  "byArtist": "Lil Nas X"
}

Now, this compacted file is both human-readable and ready to populate your database.
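The reverse direction can be sketched in the same spirit. The Python below is a toy stand-in for compaction, assuming a single expanded node with plain string values; `compact_iri` and `toy_compact` are hypothetical names, and the real algorithm covers many more cases.

```python
import json

expanded = [{
    "@id": "https://www.wikidata.org/wiki/Q62587323",
    "@type": ["https://schema.org/MusicRecording"],
    "https://schema.org/byArtist": [{"@value": "Lil Nas X"}],
    "https://schema.org/name": [{"@value": "Old Town Road"}],
}]

new_context = {"@vocab": "https://schema.org/"}

def compact_iri(iri, context):
    """Shorten a full URI: prefer an explicit alias, else strip the @vocab prefix."""
    for term, target in context.items():
        if not term.startswith("@") and target == iri:
            return term
    vocab = context.get("@vocab")
    if vocab and iri.startswith(vocab):
        return iri[len(vocab):]
    return iri

def toy_compact(expanded_doc, context):
    """Compact the first node of an expanded document (toy version)."""
    node = expanded_doc[0]
    out = {"@context": context}
    for key, values in node.items():
        if key == "@id":
            out["@id"] = values
        elif key == "@type":
            types = [compact_iri(t, context) for t in values]
            out["@type"] = types[0] if len(types) == 1 else types
        else:
            plain = [v["@value"] for v in values]
            # Single-item arrays collapse back to a bare value.
            out[compact_iri(key, context)] = plain[0] if len(plain) == 1 else plain
    return out

compacted = toy_compact(expanded, new_context)
print(json.dumps(compacted, indent=2))
```

Passing a different context, for example one that aliases "title" to "https://schema.org/name", would produce different short keys from the same expanded input, which is the point of compaction.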

Flattening

Flattening is used to remove any nesting from a JSON-LD document, which can make its processing faster. Before we go any further, let's add on to our original document:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "song": "https://schema.org/MusicRecording",
    "artist": "https://schema.org/byArtist",
    "title": "https://schema.org/name"
  },
  "@type": "song",
  "@id": "https://www.wikidata.org/wiki/Q62587323",
  "title": "Old Town Road",
  "artist": {
    "@id": "https://www.wikidata.org/wiki/Q62591281",
    "@type": "Person",
    "name": "Montero Lamar Hill",
    "alternateName": "Lil Nas X"
  }
}

Here, we've added a little more information about the artist, including an @id (in the form of a Wikidata URL), the artist's stage name, and his given name.

The additional information after the "artist" key is nested—the curly braces mean that this is a separate, new JSON object. Nesting in this way makes it easy for humans to understand, but can take longer for machines to parse, since they have to separate out the individual JSON objects.

To make their job easier, we can run this code through a flattening algorithm. This algorithm separates out the individual JSON objects. The inputs are a nested document, like the one above, and, optionally, a new @context. If you want the output to immediately be machine-readable, don't include a new context. If you don't want the context to change, you can just copy the context you've already written.

The output of the algorithm, with no @context added, will look like this:

{
  "@graph": [
    {
      "@id": "https://www.wikidata.org/wiki/Q62587323",
      "@type": "https://schema.org/MusicRecording",
      "https://schema.org/name": "Old Town Road",
      "https://schema.org/byArtist": {
        "@id": "https://www.wikidata.org/wiki/Q62591281"
      }
    },
    {
      "@id": "https://www.wikidata.org/wiki/Q62591281",
      "@type": "https://schema.org/Person",
      "https://schema.org/alternateName": "Lil Nas X",
      "https://schema.org/name": "Montero Lamar Hill"
    }
  ]
}

This is a flattened document. Although this file is flattened, and the JSON objects separated, you can see that they still link together: The @id of the "byArtist" key in the first object is the same as the @id of the second object!
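The hoisting step can be illustrated with a short recursive function. This is only a sketch under simplifying assumptions: every nested object already has an @id, the keys are already full URIs, and values are single items rather than arrays. The name `toy_flatten` is made up for this example, and the specification's flattening algorithm does considerably more.

```python
import json

nested = {
    "@id": "https://www.wikidata.org/wiki/Q62587323",
    "@type": "https://schema.org/MusicRecording",
    "https://schema.org/name": "Old Town Road",
    "https://schema.org/byArtist": {
        "@id": "https://www.wikidata.org/wiki/Q62591281",
        "@type": "https://schema.org/Person",
        "https://schema.org/name": "Montero Lamar Hill",
        "https://schema.org/alternateName": "Lil Nas X",
    },
}

def toy_flatten(node, graph=None):
    """Collect every object with an @id at the top level, keyed by @id."""
    if graph is None:
        graph = {}
    flat = graph.setdefault(node["@id"], {"@id": node["@id"]})
    for key, value in node.items():
        if key == "@id":
            continue
        if isinstance(value, dict) and "@id" in value:
            toy_flatten(value, graph)            # hoist the nested object
            flat[key] = {"@id": value["@id"]}    # leave only a reference behind
        else:
            flat[key] = value
    return graph

flattened = {"@graph": list(toy_flatten(nested).values())}
print(json.dumps(flattened, indent=2))
```

The song keeps only a reference to the artist's @id, and the artist becomes a second top-level entry in @graph.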

You might have noticed that @graph was added to the beginning of the new file. This just means that multiple, non-hierarchical JSON items are in this file. The syntax is:

{
  "@graph": [
    {
    JSON object 1...
    },
    {
    JSON object 2...
    },
    {
    JSON object 3, etc...
    }
  ]
}

Notice that there are square brackets after the @graph, instead of curly braces, since this is an array of JSON objects.

Framing

The opposite of flattening is framing. A framing algorithm takes a flattened document and a "frame document" as inputs, and outputs a framed document (be careful with the wording here: a "frame document" is not the same as a "framed document," as you will see below).

Let's start by adding a little to our flattened document. We'll add the @context section back, too:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "song": "https://schema.org/MusicComposition",
    "artist": "https://schema.org/provider",
    "title": "https://schema.org/name"
  },
  "@graph": [
    {
      "@id": "https://www.wikidata.org/wiki/Q62587323",
      "@type": "song",
      "title": "Old Town Road",
      "inAlbum": {
        "@id": "https://www.wikidata.org/wiki/Q64220899"
      },
      "artist": {
        "@id": "https://www.wikidata.org/wiki/Q62591281"
      }
    },
    {
      "@id": "https://www.wikidata.org/wiki/Q62591281",
      "@type": "Person",
      "alternateName": "Lil Nas X",
      "title": "Montero Lamar Hill"
    },
    {
      "@id": "https://www.wikidata.org/wiki/Q64220899",
      "@type": "MusicAlbum",
      "name": "7"
    }
  ]
}

This time we added some information about the album the song is found in. Now we have three separate JSON objects: One for the song, one for the artist, and one for the album. For a machine, this makes perfect sense, since the song is linked with both the artist and album by URIs. But for a human, it can be a little hard to link together.

You can fix this easily by using a frame document and the framing algorithm. The frame document defines a structure that can be imposed on the original JSON-LD document. Here, since we want both artist and album nested under the song, our frame document might look like this:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "song": "https://schema.org/MusicComposition",
    "artist": "https://schema.org/provider",
    "title": "https://schema.org/name"
  },
  "@type": "song",
  "inAlbum": {
    "@type": "MusicAlbum"
  },
  "artist": {
    "@type": "Person"
  }
}

Notice that the frame document still has a @context section. The @context of the frame document can be the same as or different from the original document.

What the framing algorithm will do is link objects together by their @id. If the "song" object has an "inAlbum" key with a URI for a value, the algorithm will look for an object with the same @id and a @type of "MusicAlbum." The nesting will not work if the @ids don't match or if the object is of the incorrect @type.

The framing algorithm can nest multiple objects of the same @type—for example, you can connect multiple albums or multiple artists in the resulting framed document.

When we run the original document and the frame through the framing algorithm, we get the following output:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "song": "https://schema.org/MusicComposition",
    "artist": "https://schema.org/provider",
    "title": "https://schema.org/name"
  },
  "@graph": [
    {
      "@id": "https://www.wikidata.org/wiki/Q62587323",
      "@type": "song",
      "inAlbum": {
        "@id": "https://www.wikidata.org/wiki/Q64220899",
        "@type": "MusicAlbum",
        "title": "7"
      },
      "title": "Old Town Road",
      "artist": {
        "@id": "https://www.wikidata.org/wiki/Q62591281",
        "@type": "Person",
        "alternateName": "Lil Nas X",
        "title": "Montero Lamar Hill"
      }
    }
  ]
}

This is our framed document. It's framed with a D because it has had a frame imposed on it.
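The matching-by-@id step can be sketched in a few lines of Python. This is a toy illustration rather than the framing algorithm from the specification: it ignores the @context entirely and works on the keys exactly as written, so the album's name is written with the "title" alias here. The function name `toy_frame` is hypothetical.

```python
import json

graph = [
    {
        "@id": "https://www.wikidata.org/wiki/Q62587323",
        "@type": "song",
        "title": "Old Town Road",
        "inAlbum": {"@id": "https://www.wikidata.org/wiki/Q64220899"},
        "artist": {"@id": "https://www.wikidata.org/wiki/Q62591281"},
    },
    {
        "@id": "https://www.wikidata.org/wiki/Q62591281",
        "@type": "Person",
        "alternateName": "Lil Nas X",
        "title": "Montero Lamar Hill",
    },
    {
        "@id": "https://www.wikidata.org/wiki/Q64220899",
        "@type": "MusicAlbum",
        "title": "7",
    },
]

frame = {
    "@type": "song",
    "inAlbum": {"@type": "MusicAlbum"},
    "artist": {"@type": "Person"},
}

def toy_frame(nodes, frame_doc):
    """Nest referenced objects under nodes whose @type matches the frame."""
    by_id = {node["@id"]: node for node in nodes}
    results = []
    for node in nodes:
        if node.get("@type") != frame_doc["@type"]:
            continue                              # only top-level matches are kept
        framed_node = dict(node)
        for key, sub_frame in frame_doc.items():
            if key.startswith("@"):
                continue
            ref = node.get(key)
            if not isinstance(ref, dict):
                continue
            target = by_id.get(ref.get("@id"))
            # Embed only when the @id resolves and the @type agrees with the frame.
            if target is not None and target.get("@type") == sub_frame.get("@type"):
                framed_node[key] = dict(target)
        results.append(framed_node)
    return {"@graph": results}

framed = toy_frame(graph, frame)
print(json.dumps(framed, indent=2))
```

If a reference points at an @id that is missing from the graph, or at an object of a different @type, this sketch simply leaves the bare reference in place.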

Keep in mind that this document and the original document in this section are ontologically identical. The only difference is that the first one is easier for machines to read, and the final one is easier for humans. They can be transformed into each other using the framing and flattening algorithms with no information loss.

In the future, we hope to provide example frames to make it easy to impose structure on your flat MetaSat documents.