Redshift – How to flatten JSON data in redshift

input

[a,b,c,d]
[b,c]
[f,g,h]

output

a
b
c
d
b
c
f
g
h

 

with NS AS (

select 1 as n union all

select 2 union all

select 3 union all

select 4 union all

select 5 union all

select 6 union all

select 7 union all

select 8 union all

select 9 union all

)

select SPLIT_PART(input,‘,’, NS.n) AS output from NS

join bi_userdb.test on NS.n <= REGEXP_COUNT(input, ‘,’) + 1

 

Note –

Number Sequence (NS) is a CTE that return a list of number from 1 to n. hence we need to make sure that number in NS is always grater than count of value in input column.

As i was analyzing google analytics data for car selling e-commerce company; I was not sure how much records is going to be in each column hence I have created a number sequence table sequence table called vd_seq_200 ( it has 200 value in it ) – it is based on maximum listing on e-commerce site.

select * from userdb_test.vd_seq_200 limit 10

0
1
2
3
4
5
6
7
8
9
10

Also, data was in JSON format.

Here is solution for JSON data

SELECT
JSON_EXTRACT_ARRAY_ELEMENT_TEXT(makeid, seq.seq) AS makeid
, JSON_EXTRACT_ARRAY_ELEMENT_TEXT(modelid, seq.seq) AS modelid
, JSON_EXTRACT_ARRAY_ELEMENT_TEXT(trimid, seq.seq) AS trimid
FROM user_testdb.f_ga_daily_test t1, user_testdb.vd_seq_200 AS seq
WHERE seq.seq < JSON_ARRAY_LENGTH(makeid) limit 100

Input record –

makeid modelid trimid
[“2227″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2232″,”2232″,”2227″,”2232″,”2227″,”2227″,”2227”] [“2258″,”2258″,”2258″,”2258″,”2258”] [“2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277”]
[“2227″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2232″,”2232″,”2227″,”2232″,”2227″,”2227″,”2227”] [“2258″,”2258″,”2258″,”2258″,”2258”] [“2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277”]
[“2227″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2232″,”2232″,”2227″,”2232″,”2227″,”2227″,”2227”] [“2258″,”2258″,”2258″,”2258″,”2258”] [“2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277”]
[“2227″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2232″,”2232″,”2227″,”2232″,”2227″,”2227″,”2227”] [“2258″,”2258″,”2258″,”2258″,”2258”] [“2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277”]
[“2227″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2232″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2227″,”2232″,”2232″,”2227″,”2232″,”2227″,”2227″,”2227”] [“2258″,”2258″,”2258″,”2258″,”2258”] [“2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277″,”2277”]

 

OUTPUT

 

makeid modelid trimid
2227 2258 2277
2232 2258 2277
2232 2258 2277
2232 2258 2277
2232 2258 2277
2232 2277
2232 2277
2232 2277
2232 2277

 

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s