Hello Community,
I need help with a scenario in which a CSV/Excel file is updated on an SFTP server, and the updated file has to be uploaded to Dataiku on a daily basis. What is the way to achieve this?
Any help would be much appreciated.
Thank you.
Hi @vinayk,
Is it a single file or multiple files? Does the name change?
Using the download recipe might be a good option, but I prefer to use a Python recipe to do the trick.
While you give us some more details, maybe you can also find useful information in this post: Import dynamic dataset from SFTP
Cheers!
Thank you @Ignacio_Toledo, it is a single file and the name remains the same.
Hi @vinayk,
OK, so, three options:
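import dataiku
import paramiko
import pandas as pd

# Open an SSH connection to the SFTP server
client_sftp = paramiko.SSHClient()
# Accept unknown host keys automatically (convenient, but skips host verification)
client_sftp.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client_sftp.connect('host.name',
                    username='user',
                    password='password')
sftp = client_sftp.open_sftp()

# if the file is csv:
with sftp.open('path_to_the_csv_file/filename.csv') as remote_file:
    tmp_df = pd.read_csv(remote_file)
client_sftp.close()

# Write the DataFrame to the recipe's output dataset
result_dataset = dataiku.Dataset("result_dataset")
result_dataset.write_with_schema(tmp_df)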
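Since the file could also be an Excel workbook, here is a minimal sketch of how the read step might pick the pandas reader from the file extension. The names `READERS`, `pick_reader` and `read_remote_table` are hypothetical helpers made up for this example, and I am assuming an Excel engine such as openpyxl is installed in the code environment for `.xlsx` files.

```python
import io
import os

# Hypothetical mapping from file extension to the pandas reader to use
READERS = {".csv": "read_csv", ".xlsx": "read_excel", ".xls": "read_excel"}


def pick_reader(filename):
    """Return the name of the pandas reader function for this file name."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in READERS:
        raise ValueError(f"Unsupported file type: {extension!r}")
    return READERS[extension]


def read_remote_table(remote_file, filename):
    """Load a binary file-like object (e.g. from sftp.open) into a DataFrame."""
    import pandas as pd  # imported here so pick_reader works without pandas

    # Copy the remote content into memory so pandas gets a seekable buffer
    buffer = io.BytesIO(remote_file.read())
    return getattr(pd, pick_reader(filename))(buffer)
```

In the recipe, the read step would then be something like `tmp_df = read_remote_table(sftp.open(path), path)`, with `path` being the remote file path.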
Hope this helps!
Thank you @Ignacio_Toledo, this helps.