Writing null value to MongoDB
How do I write a null value into MongoDB? It seems like DSS only writes it as a string. I am using a Python script and putting Python None into a pandas DataFrame. MongoDB will not register the key if it's an empty string; it needs to be a null data type.
Answers
Hi @Desmond,

Indeed, it's not possible to store null values in a MongoDB document using a DataFrame. Python None values are treated as missing attributes, which this NoSQL store allows.

However, if your column is numerical, you can force writing a null value by setting it to NaN. To do so, you need to use a DatasetWriter. Here is a code sample:
import dataiku
import numpy as np
dataset = dataiku.Dataset("dataset_name")
dataset.write_schema([{'name' : 'name', 'type' : 'string'},{'name' : 'age', 'type' : 'bigint'}], dropAndCreate=True)
writer = dataset.get_writer()
writer.write_row_array(['Amy', 52])
writer.write_row_array(['Hannah', np.nan])
writer.close()

Otherwise, if you want to store a null value in a string column, you'll have to write the documents manually without going through the Dataiku API. The following code sample can help: it uses the PyMongo Python package to write to a MongoDB database without authentication:
import dataiku
import pymongo
client = dataiku.api_client()
dataset = dataiku.Dataset("dataset_name")
conn = client.get_connection(dataset.get_config()['params']['connection'])
myclient = pymongo.MongoClient(host = conn.get_info()['params']['host'], port = conn.get_info()['params']['port'])
mydb = myclient[conn.get_info()['params']['db']]
mycol = mydb[dataset.get_config()['params']['collection']]
dataset.write_schema([{'name' : 'name', 'type' : 'string'},{'name' : 'age', 'type' : 'bigint'}], dropAndCreate=True)
mylist = [
{ "name": "Amy", "age": 52},
{ "name": "Hannah", "age": None},
{ "name": None, "age": 45},
{ "name": "Richard", "age": 31}
]
x = mycol.insert_many(mylist)
print('Inserted {} rows.'.format(len(x.inserted_ids)))

Have a good day!
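If the rows come from a pandas DataFrame rather than a hand-written list, missing values usually show up as float NaN instead of None. The sketch below is one possible way to convert them before calling insert_many. The helper name nan_to_none is hypothetical (it is not part of Dataiku or PyMongo), and it assumes the rows are plain dicts, for example the output of df.to_dict(orient="records").

```python
import math


def nan_to_none(rows):
    """Return copies of the row dicts with float NaN values replaced by None.

    Hypothetical helper for illustration; not a Dataiku or PyMongo function.
    """
    cleaned = []
    for row in rows:
        cleaned.append({
            # NaN is a float that math.isnan() detects; everything else is kept as-is
            key: None if isinstance(value, float) and math.isnan(value) else value
            for key, value in row.items()
        })
    return cleaned


# Example rows, shaped like the output of df.to_dict(orient="records")
rows = [
    {"name": "Amy", "age": 52.0},
    {"name": "Hannah", "age": float("nan")},
    {"name": None, "age": 45.0},
]
records = nan_to_none(rows)
```

The resulting records list can then be passed to mycol.insert_many(records) as in the PyMongo sample above, so that the keys are stored with an explicit null rather than NaN.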