I need to develop a deep learning model on sequential data. My dataset has two features, Column-1 and Column-2, and both contain sequential data. Each cell holds a list whose values are in chronological order. Please check the reference image below for clarity.
For example, for the first record, the list [a,b,c,d,e,f] in Column-1 holds the values at six time points.
I need to prepare the data for an LSTM model. I already have a Python function that receives the DataFrame as input, transforms both Column-1 and Column-2, and returns a 3-dimensional NumPy array of shape (num_samples, time_sequence, num_features).
For the example dataset above, one-hot encoding of Column-1 creates 6 columns, so the final NumPy array would have shape (3, 6, 7).
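To make the target shape concrete, here is a minimal sketch of such a transformation. It is not the function mentioned above. The name `to_lstm_array`, the sample values, and the assumptions that Column-2 is numeric and that every sequence has the same length are all hypothetical, since the reference image is not available here.

```python
import numpy as np
import pandas as pd


def to_lstm_array(df, cat_col="Column-1", num_col="Column-2"):
    """Hypothetical helper: build a (num_samples, time_sequence, num_features) array.

    Assumes every list has the same length and that num_col holds numbers.
    """
    # Vocabulary of categories seen anywhere in the categorical column.
    vocab = sorted({value for seq in df[cat_col] for value in seq})
    index = {value: i for i, value in enumerate(vocab)}

    num_samples = len(df)
    time_sequence = len(df[cat_col].iloc[0])
    num_features = len(vocab) + 1  # one-hot columns plus the numeric column

    out = np.zeros((num_samples, time_sequence, num_features), dtype=np.float32)
    for i, (cats, nums) in enumerate(zip(df[cat_col], df[num_col])):
        for t, (cat, num) in enumerate(zip(cats, nums)):
            out[i, t, index[cat]] = 1.0  # one-hot encode the category
            out[i, t, -1] = num          # numeric value goes in the last feature
    return out


# Made-up data: 3 records, 6 time points, 6 distinct categories.
df = pd.DataFrame({
    "Column-1": [list("abcdef"), list("bcdefa"), list("fedcba")],
    "Column-2": [[1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6, 7], [6, 5, 4, 3, 2, 1]],
})
X = to_lstm_array(df)
print(X.shape)
```

With this made-up data, the six one-hot columns plus the single numeric column give seven features per time step.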
I have the following questions:
Operating system used: Red Hat Enterprise Linux
That's correct: the transform method must return a pandas DataFrame, a 2-D NumPy array, or a scipy.sparse.csr_matrix containing the preprocessed result. A single processor may output several numerical features, corresponding to several columns of the output. However, returning a 3-D array is not currently supported and is a feature request. Please see the following documentation for details: https://doc.dataiku.com/dss/latest/machine-learning/features-handling/custom.html#implementing-a-cus...
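One possible workaround, offered only as a sketch and not as a documented Dataiku feature, is to flatten the time and feature axes into a single 2-D array and restore the 3-D shape later in the model code. Whether your model code can apply that reshape is an assumption here. The array below is placeholder data with the shape from the example.

```python
import numpy as np

# Placeholder 3-D array with the shape from the example: (3, 6, 7).
num_samples, time_sequence, num_features = 3, 6, 7
X3 = np.arange(
    num_samples * time_sequence * num_features, dtype=np.float32
).reshape(num_samples, time_sequence, num_features)

# Flatten time and feature axes so each sample becomes one row of 42 columns.
X2 = X3.reshape(num_samples, time_sequence * num_features)
print(X2.shape)

# Later (for example, just before the LSTM layer), restore the 3-D shape.
restored = X2.reshape(-1, time_sequence, num_features)
print(np.array_equal(restored, X3))
```

The round trip is lossless because NumPy reshapes in row-major order by default, so the same `time_sequence` and `num_features` values must be used on both sides.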
Please let us know if you have any further questions.