
Sharepoint Plugin and DataType issue

Solved!
indy2005

Hi,

I am reading in data from SharePoint. One number is coming over as a %-formatted decimal (e.g. 94.5%), and other numbers are coming over formatted with thousands separators.

No matter what I set in the schema of the dataset, when I read it in as a dataframe I get a lot of (i.e. all) NaNs, which suggests the values from SharePoint can't be converted to a decimal. The dataframe's dtypes show the column as float64, but the formatted values are evidently not being parsed according to the dataset definition, so everything ends up as NaN.

AlexT
Dataiker

Hi @indy2005,

In this case, you can manually set the column type to string in the schema of your SharePoint input dataset.

Then pass infer_with_pandas=False to get_dataframe() when reading the dataset in Python, and convert the columns in your Python code. https://www.w3schools.com/python/ref_string_format.asp

For example, to convert 30% to 0.3 you can do:

df['col'] = df['col'].str.rstrip('%').astype('float') / 100.0
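
The one-liner above covers the percentage column. For the columns with thousands separators, a rough sketch of a helper that handles both formats could look like the following. The function name parse_formatted_number is made up for illustration, and it assumes "," is the thousands separator and "." is the decimal point, which may not match your SharePoint locale.

```python
import math


def parse_formatted_number(text):
    """Convert strings such as '94.5%' or '1,234,567.8' to a float.

    Hypothetical helper: assumes ',' is the thousands separator and
    '.' is the decimal point. Missing or empty values become NaN.
    """
    if text is None:
        return math.nan
    # Drop surrounding whitespace and thousands separators.
    cleaned = str(text).strip().replace(",", "")
    if not cleaned:
        return math.nan
    if cleaned.endswith("%"):
        # Percentages become fractions: '94.5%' -> 0.945
        return float(cleaned[:-1]) / 100.0
    return float(cleaned)
```

If the columns were read as strings, this could then be applied per column with something like df['col'] = df['col'].map(parse_formatted_number).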


Let me know if that helps!

 
