Scraping PySpark jobs without input datasets

zjacobs23
Level 1

Hello,

I am currently creating PySpark jobs that have no defined input datasets in my PySpark notebooks. The tables are queried with Spark SQL inside the notebook itself. Is there a way to list all the tables referenced in those Spark SQL statements across multiple projects? See the screenshot below. The 'SELECT * FROM DB.TABLE' statement is where I am trying to capture the datasets being used. As the screenshot shows, the PySpark notebook has no inputs.
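To make the goal concrete, here is a rough sketch of the kind of scraping I mean. It assumes the notebook code is already available as a plain Python string (how to pull that code out of each project is the open part of the question), and that queries are passed to `spark.sql(...)` as string literals. The `extract_tables` helper and both regexes are hypothetical illustrations, not part of any Dataiku or Spark API.

```python
import re

# Assumption: queries are written as literals inside spark.sql(...).
# Captures the text between matching quotes (triple or single-line).
SQL_CALL = re.compile(
    r"spark\.sql\(\s*[fFrR]?(\"\"\"|'''|\"|')(.*?)\1",
    re.DOTALL,
)

# Captures the identifier that follows FROM or JOIN, e.g. DB.TABLE.
# Subqueries such as "FROM (SELECT ..." are skipped because "(" is
# not a valid identifier start.
TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)",
    re.IGNORECASE,
)


def extract_tables(source):
    """Return the sorted, de-duplicated table names found in spark.sql calls."""
    tables = set()
    for match in SQL_CALL.finditer(source):
        tables.update(TABLE_REF.findall(match.group(2)))
    return sorted(tables)


# Hypothetical notebook code used only to exercise the helper.
notebook_source = '''
df = spark.sql("SELECT * FROM DB.TABLE")
other = spark.sql("""
    SELECT a.id FROM sales.orders a
    JOIN sales.customers c ON a.cid = c.id
""")
'''

print(extract_tables(notebook_source))
```

This is only a text-matching heuristic: it would miss queries assembled from variables at runtime, and it would report CTE names as if they were tables.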

 
