Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Added on September 7, 2022 1:50PM
Likes: 0
Replies: 2
Hi,
I'm trying to find a way to filter a data set using a separate list of items.
Here's the scenario:
My company is a wholesale distributor of products to retail stores. I'm analyzing sales data which contains customer (store), and details about the products sold to them, including brand, category, etc.
Some brands we sell are of particular interest to us, and others are being phased out, or exist only in historical records.
I would like to filter my dataset of all these purchases by a list of brands that we care about.
I wondered if there might be a way to establish a sort of array variable that I could update from time to time, but I couldn't find any documentation about using variables this way.
I could also upload a separate dataset containing the list of brands. But from there I'm not sure what recipe would help me utilize the list of "focus" brands to filter the larger dataset.
Anyone have any guidance on this?
Operating system used: MacOS Monterey v12.4
Hi @wjkelly
,
If the list is long, and you have it on a text file to start with, it might be a good idea to upload the file as a new dataset, and then do an inner join with the dataset you want to filter, using the "Join with" recipe. When the brands change, you just need to change the text file, and update it on your project.
But also, you could use a solution with variables:
(in the last screenshot I used the column "Band" as the one that needed to be filtered by the "brands" array... sorry for the typo).
Hope this helps!
@Ignacio_Toledo
Thanks for that. The inner join did the trick really quick. I'll also explore the variable stored as a json array.