Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
This is a mock data of my input :
This has to be split and transformed to this form :
Then have to search a databse, say "db1" and find the levels :
Then for the final output, data should look like :
Exclude column is just a yes or no..if there is any negative value, exclude is set to Y or else N
how can we achieve the solution? open to both python recipe solution as well as visual recipe solutions
Thanks in advance
Operating system used: windows
Operating system used: windows
Hi @info-rchitect ,
Not quite sure if i get your question right. But the column, for eg, account might have 2 or 3 values eperated by a ',' delimiter. that has to be split into seperate rows. these values will be given to us & just have to find the level from seperate databse & transform it into the final form
I would assume you don't know when CSV data will show up vs a single value in each of the three columns. This is valid Snowflake SQL that will accomplish the task:
create or replace temporary table mytable(rule varchar(50), account varchar(50), entity varchar(50), audit varchar(50)); insert into mytable values ('R1', 'a1,a2', 'e1', null), ('R2', null, 'e2', '-c2'); with account_data as ( select distinct rule, split(account, ',') as account_array from mytable ), account_data_flattened as ( select rule, 'Account' as dimension, to_char(a1.value) as value from account_data, table(flatten(account_data.account_array)) a1 ), entity_data as ( select distinct rule, split(entity, ',') as entity_array from mytable ), entity_data_flattened as ( select rule, 'Entity' as dimension, to_char(a1.value) as value from entity_data, table(flatten(entity_data.entity_array)) a1 ), audit_data as ( select distinct rule, split(audit, ',') as audit_array from mytable ), audit_data_flattened as ( select rule, 'Audit' as dimension, to_char(a1.value) as value from audit_data, table(flatten(audit_data.audit_array)) a1 ) select * from account_data_flattened union all select * from entity_data_flattened union all select * from audit_data_flattened;
I take that back, this is much better:
with unpivoted as ( select * from mytable unpivot(value for dimension in (Account, Entity, Audit)) ), formatted as ( select rule, dimension, split(value, ',') as value_array from unpivoted ) select rule, dimension, to_char(a1.value) as value from formatted, table(flatten(formatted.value_array)) a1;
and even better:
with unpivoted as ( select * from mytable unpivot(value for dimension in (Account, Entity, Audit)) ) select rule, dimension, records.value::varchar as value from unpivoted, lateral flatten (input => split(value, ','), outer => true) records;
@info-rchitect Thanks for the reply.
let me make it more clear
from this table, we have to transform the table into this form first:
after that, the table needs to expanded. all the columns, except Rule (this is just a mock data, in reality many columns will be present) will become under a new columns called Dimension, ie, dimension will contain account, entity, audit etc. The values a1,a2,e1 etc will come under new column named as value. so the output has to be like this :
Hope its clear
@shahas71 So the middle table will actually be used and people/processes will select data from it? Also, do you only want these manipulations done in Python and not on the database itself? We can use Python to manipulate the database if needed (what I typically prefer).
One more question: Why dos the middle table have to exist before the final table? It is much easier to use the SQL I posted above to make the final table first and then make the middle table from the final table.