Issue with Rendering Images in HTML from Temporary Folder path in Dataiku
Hi everyone,
I am facing an issue while trying to render images in an HTML file that I generated from a Word document using the mammoth library in Python in Dataiku. Here's what I have done so far:
a) Extracted images from the Word document and saved them into a managed folder in Dataiku named "images", as I didn't want to go with the base64 embedding approach.
b) Converted the docx file to HTML using the mammoth library in Python.
c) Downloaded the images from the managed folder to a temporary folder using Python.
d) Updated the <img src> tags in my HTML file to reference the images using the temporary folder's path.
e) When published to the wiki in Dataiku, the HTML file renders the text correctly, but the images do not show up. Instead, I see a placeholder or broken image icon.
I suspect it has something to do with how I am referencing the images in the <img src> tag or the accessibility of the temporary folder.
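To check that suspicion, here is a minimal sketch (standard library only) that lists the <img src> values pointing into the server's temporary directory. The names ImgSrcCollector and find_temp_dir_sources are hypothetical helpers of my own, and the check assumes the temporary paths start with tempfile.gettempdir(). As far as I understand, such paths exist only on the machine that runs the Python code, so a browser would have no way to fetch them.

```python
import os
import tempfile
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML string."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.sources.append(src)


def find_temp_dir_sources(html):
    """Return the img src values located under the system temp directory.

    Assumption: the temporary folder was created with tempfile.mkdtemp(),
    so its path starts with tempfile.gettempdir().
    """
    collector = ImgSrcCollector()
    collector.feed(html)
    temp_root = tempfile.gettempdir()
    return [src for src in collector.sources if src.startswith(temp_root)]


if __name__ == "__main__":
    # Hypothetical example: one temp-folder path and one regular URL
    example_src = os.path.join(tempfile.gettempdir(), "tmpabc", "image1.png")
    example_html = (
        '<p>Text</p><img src="' + example_src + '">'
        '<img src="https://example.com/logo.png"/>'
    )
    print(find_temp_dir_sources(example_html))
```

If this returns a non-empty list for the HTML that gets published, the wiki page is referencing files on the server's filesystem rather than URLs the browser can request.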
Code to download images from managed folder to temp folder:
import os
import tempfile
import dataiku

# Create a temporary directory on the machine running this code
temp_dir = tempfile.mkdtemp()
folder = dataiku.Folder("Shared Resources")
files = folder.list_paths_in_partition()
# Keep only the files stored under /images in the managed folder
image_files = [file for file in files if file.startswith("/images")]
print(image_files)
for file in image_files:
    # Copy each image from the managed folder into the temporary directory
    with folder.get_download_stream(file) as stream:
        file_path = os.path.join(temp_dir, os.path.basename(file))
        print(file_path)
        with open(file_path, "wb") as local_file:
            local_file.write(stream.read())
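from bs4 import BeautifulSoup

# html_content is the HTML string produced by mammoth
soup = BeautifulSoup(html_content, "html.parser")
# Point each <img> tag at the matching file in the temporary directory
for idx, img_tag in enumerate(soup.find_all("img")):
    if idx < len(image_files):
        img_tag['src'] = os.path.join(temp_dir, os.path.basename(image_files[idx]))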
project = dataiku.api_client().get_project("Model Resources")
wiki_page_name = "wiki_demo"
try:
    # Try to retrieve the page if it exists
    wiki_page = project.get_wiki().get_article(wiki_page_name)
    print("Updating existing wiki page.")
except Exception:
    # If the page doesn't exist, create it
    wiki_page = project.get_wiki().create_article(wiki_page_name)
    print("Creating a new wiki page")
# Replace the article body with the updated HTML and save it
article_data = wiki_page.get_data()
article_data.set_body(soup.prettify())
article_data.save()
Has anyone encountered a similar issue or has suggestions on how to ensure the images are rendered correctly in the browser?
Any help or pointers would be greatly appreciated.
Thanks in advance!