
Move File

Solved!
Kyest00
Level 2

Hi,

 

I am trying to move a file from SFTP to a local folder. I tried Download/Export Visuals and Python, but with no success.
Do you know the easiest way to do that?

 

 

 


Operating system used: Windows



0 Kudos
1 Solution
tgb417
Neuron

@Kyest00 

Glad to hear that you got things working.

I tend to create a connection for each of my SFTP servers.

Then I will typically create a managed folder using the SFTP connection. (See white Raw data folder below.)

From the managed folder icon, there is an option to create a dataset in the right-hand pull-out menu. (See the blue raw_data folder below.)

Then I tend to use a recipe to clean up the dataset for analysis, and at the same time I set the output of the recipe to a local PostgreSQL database server.

(Image: a DSS flow that starts with a managed folder)

 

Be a little bit careful with this configuration. If you use a managed folder and dataset with an SFTP connection, and you have permission to delete files from the SFTP server, then deleting the folder in Dataiku DSS can delete the files on the SFTP server. Use discretion.
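If the goal is to land the raw files on a local disk rather than in a database, here is a minimal sketch of one way to do it from a Python recipe or notebook. It assumes the managed folder sits on the SFTP connection as described above, and it uses the `list_paths_in_partition()` and `get_download_stream()` methods of `dataiku.Folder`. The helper name `copy_folder_to_local`, the folder name `raw_data`, and the destination path are placeholders, not anything from my actual setup. Keep in mind that "local" here means the disk of the machine running DSS (the VM), not the Windows computer running the browser.

```python
import os
import shutil


def copy_folder_to_local(folder, dest_dir):
    """Copy every file in a managed folder into dest_dir on the DSS host.

    `folder` is expected to behave like dataiku.Folder: it must provide
    list_paths_in_partition() and get_download_stream(path).
    Returns the list of local paths that were written.
    """
    copied = []
    for path in folder.list_paths_in_partition():
        # Managed-folder paths start with "/", so strip it before joining.
        target = os.path.join(dest_dir, path.lstrip("/"))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        # Stream the remote file to disk without loading it all into memory.
        with folder.get_download_stream(path) as stream, open(target, "wb") as out:
            shutil.copyfileobj(stream, out)
        copied.append(target)
    return copied


# Inside DSS (folder name and destination are placeholders):
#   import dataiku
#   raw = dataiku.Folder("raw_data")
#   copy_folder_to_local(raw, "/tmp/sftp_copy")
```

This only reads from the SFTP side, so it does not touch the deletion risk mentioned above.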

--Tom


5 Replies
Kyest00
Level 2
Author

Note that I am trying to create an ETL data flow, and this should be one of the steps.

0 Kudos
tgb417
Neuron

@Kyest00 

Welcome to the Dataiku Community.  So great to have you as part of the community.

I have run SFTP connections from Dataiku, so I know that this can be done, and I find it a very useful tool. That said, first-time setup can be a bit tricky.

In your post you don’t say a lot about your SFTP server and Dataiku setups. So, I’m going to be guessing here and providing general recommendations and pointers to documentation.

You mention that you are on Windows. Dataiku DSS will not run directly on a Windows computer. It only runs on a Unix-like operating system such as Linux or macOS. Therefore, I can guess that you are working from a virtual machine (VM), WSL, an on-site hosted Dataiku DSS, or Dataiku Online.

Depending on the platform you are working from, here is what I would try. I’d first log in to the operating system running DSS as the account that runs Dataiku DSS. Then I’d check whether I can connect to the SFTP site from outside Dataiku DSS. If you are running DSS on a local virtual machine, or have a Linux computer somewhere, try to make an SFTP connection from there using the account that runs Dataiku. In that situation I’d use some other SFTP tool that I’m familiar with, like FileZilla, to see whether you can connect from the Linux OS running DSS to your SFTP site. My guess is that you will run into some kind of problem. These can come from a variety of sources, often related to network setup such as firewalls. (Those firewall and network setups might all be within your local computer.) If you do not control the networks involved, because you are trying to connect either to an internal SFTP site from an externally hosted Dataiku or, vice versa, to an external SFTP site from an internally hosted Dataiku, you may need to bring others into the conversation. Regardless, you need to be able to confirm that the Linux-like OS on which Dataiku runs can make a connection to your SFTP data.
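If no SFTP client is handy on the DSS machine, a quick reachability check can be sketched in Python using only the standard library. This is a rough first test, not a full SFTP login: it only confirms that a TCP connection to the server's port can be opened from the machine running DSS, which is usually where firewall problems show up. The function name `can_reach`, the host name, and port 22 are assumptions for illustration; use your own server's values.

```python
import socket


def can_reach(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    This does not authenticate or speak SFTP; it only checks that the
    network path (DNS, routing, firewalls) allows a connection.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


# Run this as the account that runs DSS (host name is a placeholder):
#   print(can_reach("sftp.example.com", 22))
```

A `False` result points at the network or DNS rather than at Dataiku. A `True` result still leaves credentials and SFTP permissions to verify.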

Once that is done, take the credentials and other settings used to make that work and create a DSS connection under the Admin part of DSS. Instructions can be found here: https://doc.dataiku.com/dss/latest/connecting/scp-sftp.html

Once that is done, the SFTP connection acts very much like any other file-based data source in DSS. It may be a little bit slow, but it is well worth having as a tool.

If you are using Dataiku Online, you might also want to submit a support ticket. The support team can be very helpful.

Hope that helps a bit. If you would like further help from community members, we will likely need some further information about the setup. (Before sharing such information, do consider whether it’s OK to share, given any confidentiality concerns.) Otherwise, the Dataiku Support team may be a better group to discuss this issue with.

--Tom
0 Kudos
Kyest00
Level 2
Author

Hi Tom,

Thank you for your advice!
My bad for not providing enough details.

I managed to create the source dataset from the SFTP file and get the data into Dataiku. I use the VM and access Dataiku in the browser.

My question is: what is the easiest way to create a step that downloads the file from SFTP to my local computer?

 

 

 

 

tgb417
Neuron

@Kyest00 ,

Thanks for accepting my comments as a “solution”.  Glad to help.

That said, I suspect that others may do things in other ways. I’d enjoy hearing others’ thoughts on this topic.

--Tom
0 Kudos