
ChatGPT: your mind will explode!

Turribeach

If you haven't heard of ChatGPT, get ready to have your mind explode! This is a chatbot that has been trained to respond to questions using natural language. The most interesting thing is that it understands code and APIs and can suggest code to you with an explanation. See the sample below:

 

Screenshot 2022-12-13 at 2.09.20 pm.png

2 Replies
tgb417

Here is a little more code written for Dataiku by ChatGPT

# Import the necessary packages
import dataiku
import datetime
import os
import pandas as pd
import zlib

# Get the project variables as a dictionary
project_variables = dataiku.get_custom_variables()

# Access the value of the file path project variable
base_path = project_variables["base_path"]

# Expand the '~' character in the file path to the full path to your home directory
base_path = os.path.expanduser(base_path)

# List the names of all the files and directories in the directory
file_names = os.listdir(base_path)

# Create an empty list to store the full file paths, last modified dates, and calculated CRC values
file_crcs = []

# Iterate over the list of file names
for file_name in file_names:
    # Create the full file path
    file_path = os.path.join(base_path, file_name)

    # Check if the path is a file
    if not os.path.isdir(file_path):
        # Get the last modified timestamp of the file
        last_modified_timestamp = os.path.getmtime(file_path)

        # Convert the last modified timestamp to a human-readable date and time
        last_modified_date = datetime.datetime.fromtimestamp(last_modified_timestamp)

        # Open the file in binary mode
        with open(file_path, "rb") as file:
            # Read the file in binary mode and calculate the CRC value
            crc = zlib.crc32(file.read())

        # Store the full file path, last modified date, and calculated CRC value in the list
        file_crcs.append([file_path, last_modified_date, crc])

# Create a Pandas DataFrame from the list of file paths, last modified dates, and CRC values
df = pd.DataFrame(file_crcs, columns=["File Path", "Last Modified Date", "CRC Value"])

# Get the Dataiku dataset object for the 'File_CRC_Values' dataset
dataset = dataiku.Dataset("File_CRC_Values")

# Write the Pandas DataFrame to the Dataiku dataset, setting the dataset schema from the DataFrame
dataset.write_with_schema(df)
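
One thing to watch in the script above: file.read() pulls each file fully into memory before the CRC is calculated, which could be a problem for large files. Here is a minimal sketch of a chunked alternative, assuming only the standard zlib module. The names crc32_of_file and chunk_size are made up for illustration and are not part of the Dataiku API.

```python
import zlib


def crc32_of_file(file_path, chunk_size=1024 * 1024):
    """Return the CRC-32 of a file, reading it in fixed-size chunks.

    Hypothetical helper for illustration; chunk_size (1 MiB here) is an
    arbitrary assumption, not a recommended value.
    """
    crc = 0
    with open(file_path, "rb") as file:
        # iter() with a sentinel keeps calling read() until it returns b""
        for chunk in iter(lambda: file.read(chunk_size), b""):
            # Passing the running value back in continues the same checksum
            crc = zlib.crc32(chunk, crc)
    return crc
```

In the loop above, the line crc = zlib.crc32(file.read()) could then be swapped for crc = crc32_of_file(file_path), which should give the same value without holding the whole file in memory.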
--Tom
Turribeach
Author

It's really amazing that it can be trained on so many languages. Enjoy it while it lasts! I can't see it being free for too long, as their infrastructure costs must be huge (they claim a query costs them 10x what a Google search query costs). So I think this will become a paid service. Having said that, I wouldn't mind paying something for a service like this.
