Fold multiple columns by pattern
This processor takes values from multiple columns and folds them into one line per column, like the Fold multiple columns processor. It is a variant of that processor in which the columns to fold are selected by a pattern instead of a list. If the pattern has a capture group, the captured portion of the column name is used instead of the full column name.
The processor only creates a line for a folded column when that column is non-empty.
For example, using “tag_(.*)” as the column to fold pattern:
| name | n_connection | tag_1 | tag_2 | tag_3 |
|---|---|---|---|---|
| Florian | 16570 | bigdata | python | puns |
becomes
| name | n_connection | tag | rank |
|---|---|---|---|
| Florian | 16570 | bigdata | 1 |
| Florian | 16570 | python | 2 |
| Florian | 16570 | puns | 3 |
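The folding shown above can be approximated in plain Python. This is only a minimal sketch of the behaviour described on this page, not the processor's implementation: the function name `fold_by_pattern` and its parameters are hypothetical, the pattern is assumed to match the full column name, and Python's `re` module stands in for whatever regular expression engine the processor uses.

```python
import re

def fold_by_pattern(rows, pattern, name_column, value_column):
    """Fold columns whose full name matches `pattern` into one row per non-empty value.

    Hypothetical helper for illustration; assumes full-name matching.
    """
    regex = re.compile(pattern)
    folded = []
    for row in rows:
        # Columns that do not match the pattern are copied to every output row.
        kept = {c: v for c, v in row.items() if not regex.fullmatch(c)}
        for column, value in row.items():
            match = regex.fullmatch(column)
            if match is None or value in (None, ""):
                continue  # not a folded column, or empty: no output row
            # Use the first capture group when the pattern has one, else the full name.
            label = match.group(1) if regex.groups else column
            folded.append({**kept, value_column: value, name_column: label})
    return folded

rows = [{"name": "Florian", "n_connection": 16570,
         "tag_1": "bigdata", "tag_2": "python", "tag_3": "puns"}]
result = fold_by_pattern(rows, r"tag_(.*)", name_column="rank", value_column="tag")
for r in result:
    print(r)
```

Each printed dictionary corresponds to one row of the folded table, with the captured suffix of the column name stored in `rank`.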
With capture groups
As another example, consider the following dataset of quarterly scores:
| person | age | Q1_score | Q2_score | Q3_score |
|---|---|---|---|---|
| John | 24 | 3 | 4 | 6 |
| Sidney | 31 | | 6 | 9 |
| Bill | 33 | 1 | | 4 |
Applying the “Fold multiple columns by pattern” processor with the pattern “.*_score” generates the following result:
| person | age | quarter | score |
|---|---|---|---|
| John | 24 | Q1_score | 3 |
| John | 24 | Q2_score | 4 |
| John | 24 | Q3_score | 6 |
| Sidney | 31 | Q2_score | 6 |
| Sidney | 31 | Q3_score | 9 |
| Bill | 33 | Q1_score | 1 |
| Bill | 33 | Q3_score | 4 |
When the pattern contains a capture group, the captured portion of the folded column’s name is used. On the same dataset, using the pattern “(.*)_score” would produce:
| person | age | quarter | score |
|---|---|---|---|
| John | 24 | Q1 | 3 |
| John | 24 | Q2 | 4 |
| John | 24 | Q3 | 6 |
| Sidney | 31 | Q2 | 6 |
| Sidney | 31 | Q3 | 9 |
| Bill | 33 | Q1 | 1 |
| Bill | 33 | Q3 | 4 |
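The difference between the two patterns comes down to which part of the column name is kept as the label. The short sketch below illustrates this with Python's `re` module; the helper `folded_labels` is hypothetical, and full-name matching is an assumption rather than documented behaviour of the processor.

```python
import re

columns = ["person", "age", "Q1_score", "Q2_score", "Q3_score"]

def folded_labels(pattern, columns):
    """Map each column matched by `pattern` to the label written in the folded output."""
    regex = re.compile(pattern)
    labels = {}
    for column in columns:
        match = regex.fullmatch(column)  # assumption: the whole name must match
        if match:
            # With a capture group, keep only the captured part of the name.
            labels[column] = match.group(1) if regex.groups else column
    return labels

without_group = folded_labels(r".*_score", columns)
with_group = folded_labels(r"(.*)_score", columns)
print(without_group)
print(with_group)
```

In both cases `person` and `age` are left out of the mapping because they do not match, so they stay as regular columns.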
For more details on reshaping, please see Reshaping.
Smart Pattern
Smart Pattern can help you write the regular expression. To open it, click on ‘Find with Smart Pattern’.
In the ‘Smart Pattern’ window, you can highlight the portion of the column name that you wish to use. To use a pattern in the processor, select it and click on ‘OK’.