Python import from Library

Erlebacher
Erlebacher Registered Posts: 82 ✭✭
I am new to Dataiku and am using the free version on my Macbook M1 laptop. I have created a python recipe and include a file from a library I created. The library has the following structure:
```
Python/
__init__.py
rankfm_ge/
__init__.py
codeA.py
codeB.py
```
Inside my Jupyter notebook I import `codeA.py` as follows:
``` python
import rankfm_ge.codeA.py
```
and I get an error because `codeA.py` imports `codeB.py` as follows:
``` python
import codeB
```
Of course, my actual code is more complex. How do I solve this problem? Any help is appreciated!
Operating system used: MacOSX 11.6 M1
Operating system used: MacOS 11.52, M1 chip

Best Answer

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker
    edited July 2024 Answer ✓

    Hi @Erlebacher,

    To clarify, is this what your library structure looks like?

    [Image: image.png]

    If so, you can import the libraries using the following code:

    # /python/rankfm_ge/codeA.py
    from . import codeB
    
    def some_function():
        # Do stuff with codeB
        codeB.foo()

    # Notebook
    from rankfm_ge import codeA
    
    # Do stuff with codeA
    codeA.some_function()

    `from . import codeB` is a relative import. For more information, see Package Relative Imports.
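
    As a rough, self-contained sketch of the same layout, the script below builds a throwaway copy of the package in a temporary directory and imports it. The contents of `codeB.foo()` and the temporary directory are assumptions made for illustration; only the `rankfm_ge`, `codeA`, `codeB` and `some_function` names come from this thread.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the layout from this thread in a temp directory.
root = Path(tempfile.mkdtemp())
pkg = root / "rankfm_ge"
pkg.mkdir()
(pkg / "__init__.py").write_text("")

# Hypothetical codeB contents, for illustration only.
(pkg / "codeB.py").write_text("def foo():\n    return 'hello, world'\n")

# codeA uses the relative import to reach its sibling module.
(pkg / "codeA.py").write_text(
    "from . import codeB\n\n"
    "def some_function():\n"
    "    return codeB.foo()\n"
)

# Put the PARENT of rankfm_ge on the import path, then import via the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
codeA = importlib.import_module("rankfm_ge.codeA")

result = codeA.some_function()
print(result)
```

    The relative import resolves because `codeA` is loaded as `rankfm_ge.codeA`, so Python knows which package `.` refers to.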

    Thanks,

    Zach

Answers

  • Erlebacher
    Erlebacher Registered Posts: 82 ✭✭

    Thank you! I have adopted your solution.

  • Erlebacher
    Erlebacher Registered Posts: 82 ✭✭
    edited July 2024

    Hi @ZachM,

    Your solution worked in Dataiku, but not in the terminal on my Mac. With the configuration you gave above, the statement in codeA.py:

    from . import codeB

    works in Dataiku, but not in the terminal. In the terminal, the "from ." is not necessary. This seems contrary to what I have read in several documents online. I am seeking a solution that works in both cases, if possible. Thanks for any insight.

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker
    edited July 2024

    Hi @Erlebacher,

    The library works for me if I import it from outside of DSS. Here's how I'm running it:

    $ ls -R .
    .:
    rankfm_ge
    
    ./rankfm_ge:
    __init__.py  __pycache__  codeA.py  codeB.py
    
    ./rankfm_ge/__pycache__:
    __init__.cpython-310.pyc  codeA.cpython-310.pyc  codeB.cpython-310.pyc
    
    $ python
    Python 3.10.6 (main, Sep  8 2022, 18:07:02) [Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from rankfm_ge import codeA
    >>> codeA.some_function()
    hello, world
    >>>

    codeA.py is the same as in my previous post. It contains the relative import.

    Could you show me how you're running it from the terminal?

    Thanks, Zach

  • Erlebacher
    Erlebacher Registered Posts: 82 ✭✭

    Are you invoking python from inside the rankfm_ge folder? Recall that both codeA and codeB are in this folder and codeA is importing codeB.

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker

    I'm running Python from the parent directory of rankfm_ge.

  • Erlebacher
    Erlebacher Registered Posts: 82 ✭✭

    That is why it worked. In my case, both files are in the same folder.

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker
    edited July 2024

    Are you trying to run a file in the package directly? Files in packages (a directory that contains __init__.py) are meant to be imported, not run directly.
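
    To illustrate the difference, here is a small experiment that runs the same file both ways in a subprocess. It is only a sketch: the package is recreated in a temporary directory and the body of `codeB.foo()` is made up.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Recreate the package in a temp directory (codeB contents are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "rankfm_ge"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "codeB.py").write_text("def foo():\n    return 'hello, world'\n")
(pkg / "codeA.py").write_text(
    "from . import codeB\n\n"
    "def some_function():\n"
    "    return codeB.foo()\n"
)

# Case 1: run codeA.py as a script from inside the package folder.
direct = subprocess.run(
    [sys.executable, "codeA.py"], cwd=pkg, capture_output=True, text=True
)

# Case 2: import codeA as part of the package from the parent folder.
imported = subprocess.run(
    [sys.executable, "-c",
     "from rankfm_ge import codeA; print(codeA.some_function())"],
    cwd=root, capture_output=True, text=True,
)

print("direct run exit code:", direct.returncode)
print("direct run error:", direct.stderr.strip().splitlines()[-1])
print("package import output:", imported.stdout.strip())
```

    In the first case the file runs as a top-level script with no parent package, so the relative import has nothing to resolve against and fails with an ImportError. In the second case the same file is loaded as `rankfm_ge.codeA` and the import succeeds.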

    If you want to run the package directly, you can add a __main__.py file to the package directory (rankfm_ge), and then run it using the following command:

    # Must be run from the parent dir of rankfm_ge
    python -m rankfm_ge

    This will execute the __main__.py file. The file can look like this:

    # rankfm_ge/__main__.py
    from . import codeA
    
    # Do stuff with codeA
    codeA.some_function()
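
    A quick way to try the `__main__.py` approach without touching your real project is sketched below. It assumes a temporary copy of the package, a made-up `codeB.foo()`, and a `__main__.py` that prints the result so there is something to check.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Recreate the package in a temp directory (codeB contents are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "rankfm_ge"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "codeB.py").write_text("def foo():\n    return 'hello, world'\n")
(pkg / "codeA.py").write_text(
    "from . import codeB\n\n"
    "def some_function():\n"
    "    return codeB.foo()\n"
)

# __main__.py is what `python -m rankfm_ge` executes.
(pkg / "__main__.py").write_text(
    "from . import codeA\n\n"
    "print(codeA.some_function())\n"
)

# Run the package with -m from the parent directory of rankfm_ge.
run = subprocess.run(
    [sys.executable, "-m", "rankfm_ge"],
    cwd=root, capture_output=True, text=True,
)
print(run.stdout.strip())
```

    Because `-m` runs the code as part of the `rankfm_ge` package, the relative imports in both `__main__.py` and `codeA.py` resolve.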

    If you want to be able to run/import the module without being in the parent directory, you can add the parent directory of rankfm_ge to the PYTHONPATH (this is what DSS does), or you can package the project so that you can install it (see below docs).
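
    The PYTHONPATH option can be tried out with the sketch below. It runs Python from an unrelated working directory, once with the parent of `rankfm_ge` on PYTHONPATH and once without. The temporary directories, the `elsewhere` name and the body of `codeB.foo()` are assumptions for illustration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Recreate the package in a temp directory (codeB contents are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "rankfm_ge"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "codeB.py").write_text("def foo():\n    return 'hello, world'\n")
(pkg / "codeA.py").write_text(
    "from . import codeB\n\n"
    "def some_function():\n"
    "    return codeB.foo()\n"
)

# An unrelated working directory that does not contain rankfm_ge.
elsewhere = Path(tempfile.mkdtemp())
snippet = "from rankfm_ge import codeA; print(codeA.some_function())"

# With the parent of rankfm_ge on PYTHONPATH, the import works from anywhere.
env_with = {**os.environ, "PYTHONPATH": str(root)}
with_path = subprocess.run(
    [sys.executable, "-c", snippet],
    cwd=elsewhere, env=env_with, capture_output=True, text=True,
)

# Without it, Python cannot find the package from this directory.
env_without = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
without_path = subprocess.run(
    [sys.executable, "-c", snippet],
    cwd=elsewhere, env=env_without, capture_output=True, text=True,
)

print("with PYTHONPATH:", with_path.stdout.strip())
print("without PYTHONPATH, exit code:", without_path.returncode)
```

    In a shell, the equivalent is to export PYTHONPATH with the parent directory of `rankfm_ge` before starting Python.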

    Reference docs:

  • Erlebacher
    Erlebacher Registered Posts: 82 ✭✭

    No, I am not running code manually. Here is a link to an example structure on my laptop that works. So now I do not understand the problem when I execute my downloaded Dataiku project on my laptop. Must investigate further.

    https://drive.google.com/file/d/1riz5eMr7rc1fPuqfubqBajND15cFd7Fv/view?usp=sharing

  • dlastrazeneca
    dlastrazeneca Registered Posts: 4
    edited July 2024

    I'm sorry, but I'm still not getting my notebooks to import any custom code.

    I have these files in my 'Library Editor':

    lib/python/__init__.py
    lib/python/src/__init__.py
    lib/python/src/config.py

    When I try to import from config.py in a notebook, I get the error "No module named 'src'" :(
    Here's the code I try to run:

    from src import config

    Please, any tips?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,168 Neuron

    Hi. This thread is from 2022 and has already been marked as solved. Please start a new thread. Thanks.
