Error with TensorFlow & GPU
I have been trying to use my GPU (RTX 3090) to run some TensorFlow models. I tried different environments, including Conda, and I have installed and reinstalled CUDA 10 and cuDNN 7 several times without much success.
I can see data loading into the GPU memory, but no computation takes place, and then I get the following error:
Failed to train : <class 'tensorflow.python.framework.errors_impl.InternalError'> : 2 root error(s) found.
  (0) Internal: Blas GEMM launch failed : a.shape=(100, 64), b.shape=(64, 64), m=100, n=64, k=64
     [[{{node dense_2/MatMul}}]]
  (1) Internal: Blas GEMM launch failed : a.shape=(100, 64), b.shape=(64, 64), m=100, n=64, k=64
     [[{{node dense_2/MatMul}}]]
     [[Mean/_53]]
0 successful operations.
0 derived errors ignored.
I would appreciate some support to get the GPU running.
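To separate a driver or library problem from a problem in the model itself, it can help to run a single matrix multiply with the same shapes as in the error message, outside of any training job. The following is a minimal sketch, assuming TensorFlow 2.1 or later with eager execution (the thread does not state the TensorFlow version); the function name gpu_matmul_check is made up for this example.

```python
def gpu_matmul_check():
    """Run one small matrix multiply on the first GPU and report what happened.

    Assumes TensorFlow 2.1+ (eager execution); older versions expose a
    different API for listing devices.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow is not installed in this environment"

    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        return "tensorflow %s does not see any GPU" % tf.__version__

    try:
        with tf.device("/GPU:0"):
            # Same shapes as in the reported error: (100, 64) x (64, 64).
            a = tf.random.uniform((100, 64))
            b = tf.random.uniform((64, 64))
            c = tf.linalg.matmul(a, b)
            # Copy the result to host memory so the GPU work has to finish.
            result = c.numpy()
        return "matmul succeeded on GPU, result shape %s" % (result.shape,)
    except tf.errors.InternalError as exc:
        return "matmul failed on GPU: %s" % exc


if __name__ == "__main__":
    print(gpu_matmul_check())
```

If this small check already fails with the same Blas GEMM message, the cause is more likely the CUDA, cuDNN or driver setup than the model.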
Best Answer
Hi @CoreyS, I eventually managed to fix this. It seems to be a complication specific to the RTX 30XX cards. In case someone else has similar issues, this is the guidance I followed:
Below is a screenshot of the packages installed using a Conda environment.
In addition, I had to manually downgrade h5py (with pip), because by default the installation pulled in a newer version that has some issues.
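For anyone comparing their own environment against this one, the installed package versions can be listed from Python before and after changing h5py. This is only a sketch: it assumes Python 3.8 or later (for importlib.metadata), the helper name installed_versions is made up for this example, and the package list is illustrative, since the thread does not state which versions were actually used.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Illustrative list; adjust it to the packages in your own environment.
    report = installed_versions(["tensorflow", "tensorflow-gpu", "h5py", "numpy"])
    for name, version in report.items():
        print("%s: %s" % (name, version or "not installed"))
```

Running it inside the activated Conda environment shows which h5py version pip or Conda actually resolved.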
Answers
CoreyS
Hi @Wave! Can you provide any further details on the thread to help users find a solution (for example, your DSS version)? Also, can you let us know if you've tried any fixes already? This should lead to a quicker response from the community.
CoreyS
Thank you for sharing this with everyone!