Querying a dynamic JSON array

Hi there,
We're having trouble reading JSON and hoping for some assistance. We have some very large customer profile data files in JSON format. Within the JSON are several arrays, such as one for contact details. For example, a JSON element "contact_details" contains an array of contact data such as:
[{"context":"personal", "channel":"mail", "address":"123 blob street"}
,{"context":"personal", "channel":"phone", "number":"+99012345678123"}
,{"context":"business", "channel":"email", "emailaddress":"blob.blab@dribble.com"}]

Our goal is to ingest the JSON data into a MemSQL table with one row per customer, for example with columns:

  • first_name
  • last_name
  • email_address
  • email_context

Details such as first_name and last_name are easy to reference using JSON key paths, but the contact_details array has to be referenced with an index number for the respective element. E.g. in the example above, the email detail is at index 2 of the array, so we would need a key path such as contact_details::2::emailaddress.
However, for another customer profile the email address might be at index 0 or index 4, depending on how many contact details they have.

So our question is: is there a way to query the JSON data in MemSQL to find the array element that has "channel" = "email" and then reference the "emailaddress" value from that element?
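For reference, this is the lookup logic we're after, sketched in Python against the sample array above (the function name find_email is just for illustration, not part of any MemSQL API):

```python
import json

# The sample contact_details array from the profile JSON above.
contact_details = json.loads("""
[{"context":"personal", "channel":"mail", "address":"123 blob street"}
,{"context":"personal", "channel":"phone", "number":"+99012345678123"}
,{"context":"business", "channel":"email", "emailaddress":"blob.blab@dribble.com"}]
""")

def find_email(details):
    # Return (emailaddress, context) of the first element whose channel is
    # "email", regardless of where it sits in the array.
    for entry in details:
        if entry.get("channel") == "email":
            return entry.get("emailaddress"), entry.get("context")
    return None, None

email, context = find_email(contact_details)
# email   -> "blob.blab@dribble.com"
# context -> "business"
```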

Thanks

You can accomplish this with the following SQL (assuming a table t with a JSON column a holding the contact_details array):

select tab.table_col::$emailaddress as email_address, tab.table_col::$context as email_context
from t join table(json_to_array(t.a)) tab
where tab.table_col::$channel = 'email'

Hi Rob, thank you very much for the suggestion. We tried it out, but realised that the TABLE function is not available to us as we are on v7.0.12. I'll look at upgrading soon, but in the meantime is there an alternative to using the TABLE function?

There are more manual ways to accomplish this on 7.0, perhaps via a UDF. However, I'd recommend upgrading, as it will be much easier with the TABLE function.
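Until the upgrade, another workaround is to flatten the array client-side before (or after) ingest, using any MySQL-protocol client. A minimal sketch in Python, assuming profile JSON shaped like the example earlier in the thread (flatten_profile and the key names are illustrative, not a MemSQL feature):

```python
import json

def flatten_profile(profile_json):
    # Given one customer's raw profile JSON (as stored in the MemSQL column),
    # produce the flat row described above: first_name, last_name,
    # email_address, email_context.  Key names follow the sample data;
    # adjust them to the real schema.
    p = json.loads(profile_json)
    row = {
        "first_name": p.get("first_name"),
        "last_name": p.get("last_name"),
        "email_address": None,
        "email_context": None,
    }
    for entry in p.get("contact_details", []):
        if entry.get("channel") == "email":
            row["email_address"] = entry.get("emailaddress")
            row["email_context"] = entry.get("context")
            break
    return row

# In practice you would SELECT the raw JSON over the MySQL protocol, run
# flatten_profile on each row, and INSERT the flat rows into the target table.
```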