Get all keys in a list of objects with jq

18 Aug 2021 in TIL

tl;dr: jq -n '[inputs[] | keys[]] | unique' input.json

I recently had a set of JSON documents that I needed to insert into a database to analyse. To do so, I needed to know every possible key name so that I could build a schema the data would fit into.

My data set was a few thousand entries, but they looked something like the following, where a field was only present if it had a value:

json
[
  {
    "name": "Michael"
  },
  {
    "hello": "world",
    "name": "Alice"
  },
  {
    "hello": "everyone",
    "name": "Bob"
  }
]

In order to build my schema I needed to know that the keys name and hello existed in the documents.

By default jq reads its input automatically and runs the filter once for each JSON value it finds. I wanted to control that step myself so that I could work on the whole input at once, so I ran jq with -n (--null-input), which skips the automatic read and uses null as the filter's only input.

With -n in place, the inputs builtin reads the JSON values on demand, so jq -n 'inputs' prints my JSON input back out. After that it was a case of building up my jq expression as needed:
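To make the effect of -n concrete, here is a minimal sketch. It assumes the sample document above is saved as input.json (written in compact form here), and the variable names are only for illustration.

```shell
# Recreate the sample document from the post in compact form.
cat > input.json <<'EOF'
[{"name":"Michael"},{"hello":"world","name":"Alice"},{"hello":"everyone","name":"Bob"}]
EOF

# With -n, jq does not read input.json up front, so '.' is just null.
null_result=$(jq -n '.' input.json)
echo "$null_result"

# 'inputs' explicitly pulls each JSON value from the file; -c keeps it on one line.
whole_doc=$(jq -c -n 'inputs' input.json)
echo "$whole_doc"
```

The first command prints null even though a file was named, which shows that nothing is read until inputs asks for it.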

  • jq -n 'inputs[]' to read the input and unwrap the outer array into individual objects
  • jq -n 'inputs[] | keys[]' to fetch the keys from each object and unwrap that result too
  • jq -n '[inputs[] | keys[]] | unique' to collect all of the keys into a single array and reduce it to its unique values
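The steps above can be tried one at a time. This is a sketch that again assumes the sample document lives in input.json; the step1, step2 and step3 variables are hypothetical names used only to hold each result.

```shell
# Recreate the sample document from the post in compact form.
cat > input.json <<'EOF'
[{"name":"Michael"},{"hello":"world","name":"Alice"},{"hello":"everyone","name":"Bob"}]
EOF

# Step 1: unwrap the outer array, giving one object per line.
step1=$(jq -c -n 'inputs[]' input.json)

# Step 2: emit every key of every object as raw text, duplicates included.
step2=$(jq -r -n 'inputs[] | keys[]' input.json)

# Step 3: collect the stream of keys into one array, then de-duplicate it.
step3=$(jq -c -n '[inputs[] | keys[]] | unique' input.json)

printf '%s\n' "$step1" "$step2" "$step3"
```

The square brackets in the last step matter: unique works on an array, so without them it would receive one string at a time and jq would report an error instead of a list of keys.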

Putting it all together, you get the following:

bash
jq -n '[inputs[] | keys[]] | unique' input.json
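Because inputs reads every remaining JSON value, the same expression should also cover data that is split across several files. The sketch below assumes two hypothetical files, part1.json and part2.json, each holding an array of objects like the sample above.

```shell
# Two hypothetical files, each containing an array of objects.
cat > part1.json <<'EOF'
[{"name":"Michael"}]
EOF
cat > part2.json <<'EOF'
[{"hello":"world","name":"Alice"}]
EOF

# inputs walks through both files, so the keys from each end up in one array.
all_keys=$(jq -c -n '[inputs[] | keys[]] | unique' part1.json part2.json)
echo "$all_keys"
```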