This document describes a simple JSON Schema vocabulary that can be used to validate JSON text sequences as specified by RFC 7464.
It defines two keywords which allow applying a JSON schema to individual elements in a JSON text sequence (hereafter “sequence”) and producing an annotation of the element-wise result of application of the schema.
Although a sequence as a whole is neither a standard JSON type nor valid JSON, this document prescribes a loose “stream” type below, whose concrete representation is left largely to the programming language or implementation.
The ID for this vocabulary is https://python-jsonschema.github.io/vocab-json-seq/ (the URI to this document).
A draft 2020-12 meta-schema which includes this vocabulary has been defined for convenience.
The $id for the meta-schema is https://python-jsonschema.github.io/vocab-json-seq/meta.json, and it can also be found at this address.
jsonseq Keyword
The jsonseq keyword is a JSON Schema annotation whose value MUST be a valid JSON Schema.
Applying the keyword to a stream instance MUST produce a single annotation result, itself a new stream. The contents of the annotation stream MUST be the corresponding result of applying the schema element-wise to each element of the sequence.
Validating an empty stream (one which contains no elements) against this keyword produces an empty stream annotation, as does validating a non-stream instance.
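The element-wise behaviour described above can be sketched in Python as a generator. This is a minimal illustration, not a reference implementation: the name apply_jsonseq and the is_valid callable are hypothetical, and the sketch assumes that a “stream” is represented either as a list (a JSON array) or as a Python iterator.

```python
from collections.abc import Iterator
from typing import Any, Callable


def apply_jsonseq(is_valid: Callable[[Any], bool], instance: Any) -> Iterator[bool]:
    """Lazily yield one validation result per element of a stream.

    `is_valid` stands in for "apply the jsonseq subschema to one element".
    """
    # Assumption: lists (JSON arrays) and Python iterators count as streams.
    if not isinstance(instance, (list, Iterator)):
        return  # a non-stream instance produces an empty annotation stream
    for element in instance:
        yield is_valid(element)
```

Because the function is a generator, results are produced only as the caller consumes them, so the input stream is never fully materialized.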
The jsonseq keyword MUST be processed contextually in accordance with the draft of the schema in which it is used. For example, if jsonseq is used in a schema that declares draft 2019-09, then its schema value must be processed using the rules specified by the 2019-09 specification.
Implementations MAY choose to support formats notionally similar to JSON Text Sequences, such as jsonl, which delimits elements with newlines rather than record separators but is otherwise quite similar.
If such support is present, implementations SHOULD use the same jsonseq keyword to apply validation to streams containing jsonl data.
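To illustrate how close the two formats are, the following Python sketch reads both into lazy streams of parsed values. The function names iter_json_seq and iter_jsonl are hypothetical. The sketch assumes the whole input is already available as a string and simply raises on a malformed element, whereas a robust RFC 7464 parser would be expected to recover from truncated texts.

```python
import json
from collections.abc import Iterator
from typing import Any

RS = "\x1e"  # ASCII record separator, which RFC 7464 places before each JSON text


def iter_json_seq(text: str) -> Iterator[Any]:
    """Yield each JSON text of an RFC 7464 sequence (RS-prefixed, LF-terminated)."""
    for chunk in text.split(RS):
        if chunk.strip():  # skip the empty piece before the first RS
            yield json.loads(chunk)


def iter_jsonl(text: str) -> Iterator[Any]:
    """Yield each JSON text of newline-delimited input, skipping blank lines."""
    # split("\n") is used instead of splitlines(), which would also split on
    # characters such as U+2028 that may legally appear inside JSON strings.
    for line in text.split("\n"):
        if line.strip():
            yield json.loads(line)
```

Both functions return the same kind of iterator, so a single jsonseq implementation could consume either.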
The processing of the jsonseq keyword, and indeed of JSON Text Sequences themselves, depends on the abstract notion of a “stream”.
This document does not define the specific implementation of streams. A programming language or implementation with lazy iterable support SHOULD represent streams using this language feature.
Implementations MUST also consider JSON array values to be streams for the purpose of the keywords defined in this vocabulary.
Schema authors who do not wish to allow array-valued instances are RECOMMENDED to use existing JSON Schema mechanisms to exclude them (e.g. {"not": {"type": "array"}}).
JSON Schema does not allow external vocabularies to define additional data types via the type keyword.
It does, however, allow for the application of JSON Schema to types beyond those provided by JSON.
A streamType keyword is therefore introduced below, which can be used to assert that a value is a stream in the sense defined here.
(Editor’s note: the definition of streamType may be moved to a separate vocabulary in the future.)
streamType Keyword
The value of the streamType keyword MUST be a boolean, or the value null.
When true, validation MUST succeed if the instance is a stream, and fail otherwise.
When false, validation MUST fail if the instance is a stream, and succeed otherwise.
When null, validation always succeeds.
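The three cases above reduce to a small truth table, sketched here in Python. The names is_stream and check_stream_type are hypothetical, and the definition of a stream used here (a list or a Python iterator) is an assumption of this sketch, since this document leaves the representation to the implementation.

```python
from collections.abc import Iterator
from typing import Any, Optional


def is_stream(instance: Any) -> bool:
    # Assumption: lists (JSON arrays) and Python iterators count as streams.
    # Strings and dicts are iterable but are not iterators, so they are excluded.
    return isinstance(instance, (list, Iterator))


def check_stream_type(keyword_value: Optional[bool], instance: Any) -> bool:
    """Return whether `instance` satisfies streamType with the given value."""
    if keyword_value is None:
        return True  # null: validation always succeeds
    # true requires a stream; false requires a non-stream.
    return is_stream(instance) == keyword_value
```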
Consider the following schema, using the two keywords above, which asserts that elements of a sequence are objects whose foo property is an integer at most 10:
{
  "$schema": "https://python-jsonschema.github.io/vocab-json-seq/meta.json",
  "streamType": true,
  "jsonseq": {
    "type": "object",
    "properties": {
      "foo": {
        "type": "integer",
        "maximum": 10
      }
    }
  }
}
For simplicity, we delimit elements of a JSON Text Sequence below using newlines. Consider the sequence:
{}
{}
{"foo": 12}
{"foo": 8}
{"foo": {}}
{"foo": 1}
{}
The aforementioned schema, when applied to this sequence, should produce a corresponding sequence of validation results:
true
true
false
true
false
true
true
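As an illustration only, the results above can be reproduced with a short Python sketch. It does not use a JSON Schema implementation: element_is_valid is a hypothetical, hand-written stand-in for the example's jsonseq subschema, and results and SEQUENCE are likewise names introduced here.

```python
import json
from collections.abc import Iterator
from typing import Any


def element_is_valid(element: Any) -> bool:
    """Hand-written stand-in for the example's jsonseq subschema."""
    if not isinstance(element, dict):
        return False  # "type": "object"
    if "foo" not in element:
        return True  # "properties" does not make "foo" required
    foo = element["foo"]
    # JSON booleans are not integers, although bool subclasses int in Python.
    # Simplification: floats such as 1.0 are rejected here.
    if isinstance(foo, bool) or not isinstance(foo, int):
        return False  # "type": "integer"
    return foo <= 10  # "maximum": 10


def results(lines: Iterator[str]) -> Iterator[bool]:
    """Lazily parse and check each newline-delimited element."""
    for line in lines:
        yield element_is_valid(json.loads(line))


SEQUENCE = ['{}', '{}', '{"foo": 12}', '{"foo": 8}', '{"foo": {}}', '{"foo": 1}', '{}']
print(list(results(iter(SEQUENCE))))
# prints [True, True, False, True, False, True, True]
```

The third element fails because 12 exceeds the maximum, and the fifth fails because its foo value is an object rather than an integer.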
Implementations of JSON Schema which do not support types beyond those present in JSON will likely find this vocabulary difficult to implement.
In particular, implementing this vocabulary requires a statically-typed implementation’s validate entry point to operate on a union type (of JSON or stream), or requires equivalent language functionality.
This consideration may not be unique to this vocabulary: any vocabulary that introduces a new non-JSON-native type may require changing the signature of an implementation’s validation methods.