Error with Tensorflow & GPU

Solved!
Wave
Level 1
I have been trying to use my GPU (RTX 3090) to run some TensorFlow models. I tried different environments, including Conda, and I have installed and reinstalled CUDA 10 and cuDNN 7 a few times without much success.

I do see data loading into the GPU memory, but no computation happens, and then I get the following error:

Failed to train : <class 'tensorflow.python.framework.errors_impl.InternalError'> : 2 root error(s) found. (0) Internal: Blas GEMM launch failed : a.shape=(100, 64), b.shape=(64, 64), m=100, n=64, k=64 [[{{node dense_2/MatMul}}]] (1) Internal: Blas GEMM launch failed : a.shape=(100, 64), b.shape=(64, 64), m=100, n=64, k=64 [[{{node dense_2/MatMul}}]] [[Mean/_53]] 0 successful operations. 0 derived errors ignored.

I would appreciate some support to get the GPU running.  

3 Replies
CoreyS
Community Manager
Hi, @Wave! Can you provide any further details on the thread (for example, your DSS version) to help users find a solution? Also, can you let us know if you've tried any fixes already? This should lead to a quicker response from the community.
Looking for more resources to help you use DSS effectively and upskill your knowledge? Check out these great resources: Dataiku Academy | Documentation | Knowledge Base

A reply answered your question? Mark as ‘Accepted Solution’ to help others like you!
Wave
Level 1
Author

Hi @CoreyS, I eventually managed to fix this. It seems to be a complication specific to the RTX 30XX cards.

In case someone else has a similar issue, this is the guidance I followed:

https://www.pugetsystems.com/labs/hpc/How-To-Install-TensorFlow-1-15-for-NVIDIA-RTX30-GPUs-without-d...
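For anyone following a guide like this, a quick check of whether TensorFlow can actually use the GPU may help before retrying a full training run. The snippet below is only a minimal sketch, not part of the linked guidance: the helper name `gpu_check` is made up for illustration. It relies on `tf.test.is_gpu_available()`, which exists in TensorFlow 1.x and is deprecated but still present in 2.x. A positive result only means the device can be opened; it does not prove that the "Blas GEMM launch failed" error is gone, so a short training run is still the real test.

```python
def gpu_check():
    """Return a short status string saying whether TensorFlow can use a GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in the active environment.
        return "tensorflow not installed"

    # Available in TF 1.x; deprecated but still present in TF 2.x.
    available = tf.test.is_gpu_available()
    state = "available" if available else "NOT available"
    return "tensorflow {}: GPU {}".format(tf.__version__, state)


print(gpu_check())
```

Run it inside the same Conda environment that the DSS code environment uses, otherwise it reports on a different installation.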

Below is a screenshot of the packages installed using a Conda environment.

In addition, I had to manually downgrade h5py (with pip), because by default the installation pulled in a newer version that has some issues.

Screenshot 2021-04-08 at 18.22.12.png
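Since the screenshot does not carry over as text, here is a hedged sketch of how one might list the relevant package versions in an environment and flag an h5py release that may be too new. The post does not say which h5py version caused trouble; the threshold of major version 3 is an assumption based on commonly reported incompatibilities between h5py 3.x and older TensorFlow/Keras builds. The package names and helper functions are also illustrative (an NVIDIA-built TensorFlow may be registered under a different distribution name), and `importlib.metadata` requires Python 3.8 or later.

```python
from importlib.metadata import PackageNotFoundError, version  # Python 3.8+

# Assumption: h5py 3.x is the "higher" release that causes trouble. The post
# does not name a version, so adjust this threshold to your own findings.
FIRST_PROBLEM_MAJOR = 3


def is_too_new(version_string, first_problem_major=FIRST_PROBLEM_MAJOR):
    """Return True if the major version is at or above the threshold."""
    major = int(version_string.split(".")[0])
    return major >= first_problem_major


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Illustrative package names; extend the tuple for your own environment.
    for name in ("tensorflow", "h5py"):
        print(name, installed_version(name) or "not installed")

    h5py_version = installed_version("h5py")
    if h5py_version and is_too_new(h5py_version):
        print("h5py may be too new for this TensorFlow build; "
              "consider pinning an older release with pip.")
```

If the check flags h5py, pinning an older release with pip in the same environment is the kind of manual downgrade described above.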


CoreyS
Community Manager
Thank you for sharing this with everyone!