-
Python Pipeline
How do I skip the first two rows directly in the pipeline Dataiku created?
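In a Dataiku Python recipe the usual answer is to read the input with pandas and pass `skiprows=2` to `pd.read_csv`. A minimal stdlib sketch of the same idea, using a hypothetical helper name and toy data:

```python
import csv
import io
from itertools import islice

def read_skipping_first_rows(fileobj, n_skip=2):
    """Read CSV rows from a file-like object, dropping the first n_skip rows."""
    reader = csv.reader(fileobj)
    return list(islice(reader, n_skip, None))

# Toy input: the first two rows are junk we want to drop.
raw = io.StringIO("junk1\njunk2\na,1\nb,2\n")
rows = read_skipping_first_rows(raw, n_skip=2)
# rows == [["a", "1"], ["b", "2"]]
```

The same skip can also be done visually in a Prepare recipe by deleting the first rows, but doing it at read time avoids materializing them at all.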
-
Dataset running
Hello. I have a question about building data. I am following a tutorial: I ran about half of the datasets and recipes in the overall flow, shut down the server, and planned to continue the next day. The flow itself is still intact, but sometimes the data in the datasets has disappeared. When I shut down the server and reconnect, do I have to rebuild everything?
-
OData plugin brings back nulls as None
Hi – we are using the OData plugin (protocol version 3.0) to connect to a dataset that our vendor maintains; version 4.0 does not work. When the dataset is viewed in Power BI, fields show nulls, but when I bring the dataset into Dataiku all null fields are changed to the word None. Is this by design, or is there a way to change…
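If the connector is stringifying nulls as the literal text "None", one post-hoc workaround is to map that sentinel back to a real null in a Python recipe before writing the output. A minimal sketch, assuming records arrive as dicts of strings (the helper name is hypothetical):

```python
def restore_nulls(rows, sentinel="None"):
    """Replace the string sentinel (e.g. "None") with a real null in each record."""
    return [
        {k: (None if v == sentinel else v) for k, v in row.items()}
        for row in rows
    ]

# Toy records imitating what the connector returns.
records = [{"id": "1", "city": "None"}, {"id": "2", "city": "Paris"}]
cleaned = restore_nulls(records)
# cleaned[0]["city"] is None; cleaned[1]["city"] == "Paris"
```

With pandas, `df.replace("None", None)` achieves the same cleanup in one call; either way, downstream recipes then see genuine empty cells instead of the text "None".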
-
"Finger Printing" files in a Managed Folder
I have a managed folder on an SFTP Dataiku connection with lots of files (hundreds of thousands to millions of files). I'm able to open the connection and get basic file details. #... input_folder = dataiku.Folder("AAAAAAAA") paths = input_folder.list_paths_in_partition() #... path_details = [] for path in paths:…
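For fingerprinting, each path in a managed folder can be opened as a stream with `Folder.get_download_stream(path)` and hashed incrementally, so even huge files never sit fully in memory. A minimal sketch of the hashing part, shown here on an in-memory stream since it is independent of Dataiku:

```python
import hashlib
import io

def fingerprint_stream(stream, algo="md5", chunk_size=1 << 20):
    """Hash a binary file-like stream in 1 MiB chunks; returns the hex digest."""
    h = hashlib.new(algo)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

# Stand-in for folder.get_download_stream(path) on a real file.
digest = fingerprint_stream(io.BytesIO(b"hello world"))
```

At the scale of millions of files, hashing every byte is expensive; a cheaper first-pass fingerprint is the (size, modification time) pair already available from the folder's path details, falling back to a content hash only on collisions.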
-
Python recipe with Partitioned input and output
Hi, greetings... I am trying to connect a Python recipe with multiple inputs, a couple of which have to be partitioned so that I can transform part of the data and then write it to the output dataset. Things I need to do: - Partition the input dataset on a dimension dynamically. (The list of partition…
-
SFTP Site with .Zip files (with more than just data in the .zip file)
I'm receiving data from an external partner. They have set up an SFTP file server for me to get the data. They .zip the .tsv files that I'm expecting. However, they also add other documents to the .zip file that are not the data I need for my process. Basically, a data dictionary for the data they are providing. From this…
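One common pattern for this is a Python recipe that reads the archive from the input folder and keeps only the `.tsv` members, ignoring the data dictionary and any other extras. A minimal stdlib sketch (the helper name and member names are illustrative):

```python
import io
import zipfile

def extract_tsv_members(zip_bytes):
    """Return {member_name: bytes} for only the .tsv members of a zip archive."""
    out = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".tsv"):
                out[name] = zf.read(name)
    return out

# Build a toy archive containing the data plus an unwanted data dictionary.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("data.tsv", "a\tb\n1\t2\n")
    zf.writestr("dictionary.docx", "not data")
members = extract_tsv_members(buf.getvalue())
# members keeps only "data.tsv"
```

In Dataiku, the zip bytes would come from a managed folder on the SFTP connection, and the filtered `.tsv` payloads would be written to an output folder or parsed into a dataset.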
-
Copy data from a MySQL database to Vertica
Hi, let me explain my problem a little. The data I use comes from a MySQL database where I have read-only access. For my work, I use a Vertica database. The first operation is to copy the data from MySQL to Vertica. I simply use the DSS synchronization recipe. But the problem is that I have a database of several hundred…
-
Querying multiple databases in the same query with SQLExecutor2
Hi, I am trying to use SQLExecutor2 in Python to pull in a dataset from a query. The query uses multiple tables from different databases, as in the example below. Even though I am specifying the DB in the query, I am still getting the error 'invalid object name'. Is it possible to query multiple databases and output a dataframe?…
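On SQL Server, 'invalid object name' usually means the object is not visible from the connection's default database, so every cross-database table needs to be fully qualified as database.schema.table, and the connection's credentials must have access to both databases. A sketch of building such a query (the database and table names here are hypothetical):

```python
def qualified(db, schema, table):
    """Fully qualify a SQL Server object name as [db].[schema].[table]."""
    return f"[{db}].[{schema}].[{table}]"

# Hypothetical databases and tables, purely for illustration.
sql = (
    f"SELECT a.id, b.label "
    f"FROM {qualified('SalesDB', 'dbo', 'orders')} a "
    f"JOIN {qualified('RefDB', 'dbo', 'labels')} b ON a.id = b.id"
)
```

The resulting string would then be passed to `SQLExecutor2(connection=...).query_to_df(sql)`; the executor runs the statement on a single connection, so cross-database joins work only if that one server hosts (or links to) both databases.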
-
Reading a file with 40M+ records using PySpark
Hello, I'm trying to read a file with 40M+ records and around 70 columns. After reading the file, when I try to display the record count using the df.count() method, it takes a very long time to execute. Last time I checked, the statement had been running for 30+ minutes with no output. I'm very new to the Dataiku…
-
Run recipe
Hi Team, I have downloaded a file from AWS S3 and created a recipe (rename and add a new column), but I'm getting an error while executing the script: com.dataiku.dip.datasets.fs.HTTPDatasetHandler cannot be cast to com.dataiku.dip.datasets.fs.AbstractFSDatasetHandler. Logs may contain additional information. Additional technical…