Working With JSON Objects Using SQL
Recently, I have been teaching myself SQL to complete a project I have and one of the biggest problems I ran into was dealing with JSON objects in our companies data warehouse. I had no idea how to access the specific key pair values in the object, let alone access values within a JSON object wrapped in an array..
None of the StackOverflow threads or any online resource really helped or explained this concept in a way for me to understand. So after two-ish hours of playing around with my queries, I finally got it to work. In this small guide, hopefully I can explain this in a much more simple way for everyone to understand. Let's get to it!
The UNNEST Function
First it was essential for me to know about the UNNEST
function and what it allowed me to do. Simply put, if there are values that you need and they are nested in an array, the UNNEST
function allows us to pass in the array as an argument and lets us access these nested values. Let's see an example.
Say we have a column which is named values
and has a value that looks like this:
['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred']
If we wanted to get each individual value from this array, we would write a SQL query that looks like this:
SELECT *
FROM UNNEST(values) AS flattened_values;
This query would then give us a new column named flattened_values
with each individual string in its own row like so:
Unnesting an Array of Objects
Easy enough right? Now that we understand that, lets move into unnesting an array of objects.
Lets use a similar example but this time, some of those strings are going to be in their own objects, wrapped in an array like so:
[{"name": "foo", "last_name": "bar"}]
In this case, we would need to make use of the UNNEST
function as well as the JSON_EXTRACT_SCALAR
function. Both of these functions working together help us reach into the array and extract a particular value from the object based on its key.
First we will need to use the UNNEST
function to access the array and set that as a variable using the AS clause.
SELECT *
FROM UNNEST(names) AS flattened_names
Now that we have the name of the column that contains the array, which is set to flattened_names
, we can select specific keys and grab their values using the JSON_EXTRACT_SCALAR
function like so:
SELECT
JSON_EXTRACT_SCALAR(flattened_names, '$.name') as first_name,
JSON_EXTRACT_SCALAR(flattened_names, '$.last_name') as last_name
FROM UNNEST(names) AS flattened_names
So what is exactly happening here? Well since we have the array set to the name of flattened_names
we can access it through the JSON_EXTRACT_SCALAR
function. This function extracts a scalar value which could be a string, number, or boolean. The first parameter we give the function is the array in which we want it to look at. In our case, we want it to look into the flattened_names
array. The second parameter is going to be the name of the key whose value we want to extract. We prefix the key name with a "$.{key_name_here}". This effectively would be equivalent to accessing this key in an object with syntax like flattened_names.name
.
Now that we've accessed the array, and extracted the individual values, running the query would then result in this:
Conclusion
So there you have it! My super simple run-down of something I learned in SQL and hopefully that helped some of you out there. If there is anything you wanna discuss on this, please post em down in the comments and if you liked this, make sure to give it a reaction ☺️
Hit me up on Twitter to stay up-to-date on when a new blog post comes out! Thanks for reading.