Extract substring of specific values with JQ
29 Oct 2014 in TIL
This is probably something very specific to my use case, but I was working with some JSON that looked something like this:
```json
{
  "something": {
    "Identifying Key": [
      {"foo": "a.b.c", "bar": "First Three"},
      {"foo": "a.b.d", "bar": "Second Three"}
    ],
    "Another Key": [
      {"foo": "z.b.c", "bar": "First Three, Take Two"},
      {"foo": "z.b.d", "bar": "Second Three, Take Two"}
    ]
  }
}
```
From this, I wanted to extract everything before the first period in every instance of the `foo` key. Like I said, very specific use case.
To do it, I built the following jq expression.
```bash
cat data.json | jq -r '.something | map(.[].foo | split(".")[0]) | unique | join("\n")'
```
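To try the expression without a `data.json` on disk, here is a minimal sketch that pipes the sample JSON to jq on stdin. The shell variables `data` and `result` are names made up for this example, and it assumes `jq` is installed and on your `PATH`.

```bash
# Sample document from this post, stored inline so the sketch is self-contained.
data='{"something": {"Identifying Key": [{"foo": "a.b.c","bar": "First Three"},{"foo": "a.b.d","bar": "Second Three"}],"Another Key": [{"foo": "z.b.c","bar": "First Three, Take Two"},{"foo": "z.b.d","bar": "Second Three, Take Two"}]}}'

# Same filter as above, reading from stdin instead of data.json.
result=$(printf '%s\n' "$data" | jq -r '.something | map(.[].foo | split(".")[0]) | unique | join("\n")')

printf '%s\n' "$result"
# a
# z
```

Both arrays contribute their prefixes (`a`, `a`, `z`, `z`), and `unique` collapses them to one entry each.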
Breaking that down:
```bash
# Select the top-level namespace
cat data.json | jq -r '.something'

# Map over the value of every key (map drops the keys and returns an array)
cat data.json | jq -r '.something | map(.)'

# From each of those elements, step into each array
cat data.json | jq -r '.something | map(.[])'

# And extract the foo key
cat data.json | jq -r '.something | map(.[].foo)'

# Inside the map, use "split" to split the string on a "."
cat data.json | jq -r '.something | map(.[].foo | split("."))'

# And only select the first element in the returned arrays
cat data.json | jq -r '.something | map(.[].foo | split(".")[0])'

# We only want to know about unique keys
cat data.json | jq -r '.something | map(.[].foo | split(".")[0]) | unique'

# And we want it as a string, not an array
cat data.json | jq -r '.something | map(.[].foo | split(".")[0]) | unique | join("\n")'
```
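The same steps can probably be written a little more compactly. The sketch below is a variation, not the expression I actually used: it collects the values with `[...]` instead of `map`, and prints the array elements with `.[]` instead of `join("\n")`. The `data` variable holds a trimmed copy of the sample (the `bar` keys are dropped), and both `data` and `result` are names made up for this example.

```bash
# Trimmed copy of the sample JSON: only the "foo" keys are kept.
data='{"something":{"Identifying Key":[{"foo":"a.b.c"},{"foo":"a.b.d"}],"Another Key":[{"foo":"z.b.c"},{"foo":"z.b.d"}]}}'

# .something[][] steps into every array under every key,
# [ ... ] gathers the prefixes into one array, and
# unique | .[] emits each distinct prefix on its own line (thanks to -r).
result=$(printf '%s\n' "$data" | jq -r '[.something[][].foo | split(".")[0]] | unique | .[]')

printf '%s\n' "$result"
# a
# z
```

Which form reads better is a matter of taste. The `map` version mirrors the step-by-step breakdown above more closely.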
And with that, I have a unique list of values extracted from a specific key in an object. As I mentioned, it's very specific to my use case, but it might help someone in the future.