Fold multiple columns by pattern

This processor takes values from multiple columns and transforms them into one line per column. It is a variant of the Fold multiple columns processor in which the columns to fold are selected by a pattern instead of a list. If the pattern has a capture group, the captured portion of the column name is used instead of the full column name.

The processor only creates lines for non-empty columns.

For example, using “tag_(.*)” as the pattern for the columns to fold:

| name    | n_connection | tag_1   | tag_2  | tag_3 |
|---------|--------------|---------|--------|-------|
| Florian | 16570        | bigdata | python | puns  |

becomes

| name    | n_connection | tag     | rank |
|---------|--------------|---------|------|
| Florian | 16570        | bigdata | 1    |
| Florian | 16570        | python  | 2    |
| Florian | 16570        | puns    | 3    |
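The folding behaviour described above can be approximated outside the product with a few lines of plain Python. This is only a minimal sketch, not the processor's implementation: the function `fold_by_pattern` and its parameters `name_col` and `value_col` are hypothetical names chosen for illustration, and it assumes that the pattern must match the whole column name and that empty cells are `None` or an empty string.

```python
import re

def fold_by_pattern(rows, pattern, name_col, value_col):
    """Fold every column whose name matches `pattern` into one line per non-empty value.

    Hypothetical helper: assumes full-name matching and None/"" as empty cells.
    """
    regex = re.compile(pattern)
    folded = []
    for row in rows:
        # Decide, column by column, whether it is folded or kept as-is.
        matches = {col: regex.fullmatch(col) for col in row}
        kept = {col: val for col, val in row.items() if matches[col] is None}
        for col, match in matches.items():
            if match is None:
                continue
            value = row[col]
            if value is None or value == "":
                continue  # only non-empty cells produce a line
            # With a capture group, the captured part replaces the full column name.
            label = match.group(1) if regex.groups else col
            folded.append({**kept, value_col: value, name_col: label})
    return folded

rows = [{"name": "Florian", "n_connection": 16570,
         "tag_1": "bigdata", "tag_2": "python", "tag_3": "puns"}]
result = fold_by_pattern(rows, r"tag_(.*)", name_col="rank", value_col="tag")
for line in result:
    print(line)
```

In this sketch the captured text is kept as a string, so the rank values are "1", "2" and "3" rather than numbers.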

With capture groups

Another example: with the following dataset representing quarterly scores:

| person | age | Q1_score | Q2_score | Q3_score |
|--------|-----|----------|----------|----------|
| John   | 24  | 3        | 4        | 6        |
| Sidney | 31  |          | 6        | 9        |
| Bill   | 33  | 1        |          | 4        |

Applying the “Fold multiple columns by pattern” processor with the pattern “.*_score” generates the following result. No line is created for the empty cells:

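| person | age | quarter  | score |
|--------|-----|----------|-------|
| John   | 24  | Q1_score | 3     |
| John   | 24  | Q2_score | 4     |
| John   | 24  | Q3_score | 6     |
| Sidney | 31  | Q2_score | 6     |
| Sidney | 31  | Q3_score | 9     |
| Bill   | 33  | Q1_score | 1     |
| Bill   | 33  | Q3_score | 4     |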

When the pattern contains a capture group, the captured portion of the folded column’s name is used. On the same dataset, using the pattern “(.*)_score” would produce:

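| person | age | quarter | score |
|--------|-----|---------|-------|
| John   | 24  | Q1      | 3     |
| John   | 24  | Q2      | 4     |
| John   | 24  | Q3      | 6     |
| Sidney | 31  | Q2      | 6     |
| Sidney | 31  | Q3      | 9     |
| Bill   | 33  | Q1      | 1     |
| Bill   | 33  | Q3      | 4     |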
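The difference between the two patterns comes down to how the value of the “quarter” column is derived from each column name. The short sketch below illustrates this with Python's `re` module; `fold_label` is a hypothetical helper, and matching against the whole column name is an assumption made for the illustration rather than documented behaviour.

```python
import re

def fold_label(pattern, column):
    """Return the label used for a folded column, or None if the column is not folded."""
    regex = re.compile(pattern)
    match = regex.fullmatch(column)  # assumption: the pattern must match the whole name
    if match is None:
        return None
    # First capture group if the pattern defines one, otherwise the full column name.
    return match.group(1) if regex.groups else column

columns = ["person", "age", "Q1_score", "Q2_score", "Q3_score"]
plain = {c: fold_label(r".*_score", c) for c in columns}
captured = {c: fold_label(r"(.*)_score", c) for c in columns}
print(plain)
print(captured)
```

Columns mapped to `None` (here “person” and “age”) are left untouched and repeated on every generated line.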

For more details on reshaping, please see Reshaping.

Smart Pattern

You can get help from Smart Pattern to write your regular expression. Click on ‘Find with Smart Pattern’.

In the ‘Smart Pattern’ window, you can highlight the portion of the column name that you wish to use. To use a pattern in the processor, select it and click on ‘OK’.