[
  {
    "id_source": "data",
    "title": "data",
    "action": "data",
    "dimension": "data",
    "date": "data",
    "user": {
      "phone": "data"
    }
  },
  {
    "id_source": "data",
    "title": "data",
    "action": "data",
    "dimension": "data",
    "date": "data",
    "user": {
      "phone": "data"
    }
  }
]
Hey @luis.machado,
You can do a lot of this processing directly in SQL. Here's an example:
SELECT parsed[1].user.phone
FROM (
  SELECT CAST(json_parse(json) AS ARRAY(ROW(id_source VARCHAR, title VARCHAR, action VARCHAR, dimension VARCHAR, date VARCHAR, user ROW(phone VARCHAR)))) AS parsed
  FROM (VALUES '[' ||
    '{"id_source":"data","title":"data","action":"data","dimension":"data","date":"data","user":{"phone":"data1"}},' ||
    '{"id_source":"data","title":"data","action":"data","dimension":"data","date":"data","user":{"phone":"data2"}}' ||
    ']'
  ) t(json)
);
This will return “data1”.
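If you want the phone from every array element rather than just the first, a minimal sketch of the same query combined with CROSS JOIN UNNEST (which expands each element's fields into columns; the column aliases, including usr, are just names I picked) would be:

-- expand each array element into its own output row
SELECT usr.phone
FROM (
  SELECT CAST(json_parse(json) AS ARRAY(ROW(id_source VARCHAR, title VARCHAR, action VARCHAR, dimension VARCHAR, date VARCHAR, user ROW(phone VARCHAR)))) AS parsed
  FROM (VALUES '[' ||
    '{"id_source":"data","title":"data","action":"data","dimension":"data","date":"data","user":{"phone":"data1"}},' ||
    '{"id_source":"data","title":"data","action":"data","dimension":"data","date":"data","user":{"phone":"data2"}}' ||
    ']'
  ) t(json)
) AS p
CROSS JOIN UNNEST(p.parsed) AS u(id_source, title, action, dimension, date, usr);

That should give one row per element, i.e. "data1" and "data2".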
While this is fine for prototyping, I would always recommend cleaning the data externally; this type of processing is going to be slow at scale.
Also, if you're asking about reading directly from a JSON file, that's not possible. You would need Trino to parse the entire JSON file and bypass the SerDe, which is not scalable, i.e. it's not a permanent solution or one I would deploy in any production environment.
But it's handy for playing around with the data when you're exploring it like this.