It's time for me to get a new laptop.
I've been using a Macintosh laptop for a number of years now and generally like the system. The installation of DSS community edition is fairly straightforward on a Mac.
That said, I'm open to a new configuration, and ready to go back and even look at the basic question: Mac vs PC (or Linux?). I know that in the past the suggestion from Dataiku for DSS on a PC was to use Linux in VirtualBox. (This did not work for me when I last tried it back in 2017.)
I just found this post from @Alex_Reutter about using Windows 10 WSL (Windows Subsystem for Linux). It's been a while since this post was originally made. If you are around @Alex_Reutter I would love to hear what your current experience is. Is WSL a viable alternative to a Macintosh for DSS on a MS Windows PC? How's the performance?
Now to the question of what I'm looking for from my new computer:
It appears that I can't get a CUDA / PyTorch supported GPU on a Macintosh. (Thanks Apple & Nvidia.) Something like the new Surface Book 3, on the other hand, provides an Nvidia GPU built in. However, it is generally designed to run Windows.
I'm wondering if anyone has any experience with WSL 2 and DSS on MS Windows 10. I've also noted a presentation at MS Build 2020 about the forthcoming GPU support for WSL 2.
Or do folks have any ideas about how to get generally unlimited access to a GPU from a Macintosh without lots of incremental costs?
Thanks for any thoughts you might share.
This is a somewhat peculiar question, and we as Dataikers can't really stand for one faction or another, so we'll leave this to the user community. It will be fun to see what users have to say.
From a very personal point of view, I'm sad to see you don't mention the Linux option in your post. Is there any particular reason? If it's because you're not familiar, fear not: nowadays Linux desktop distributions are very user-friendly (I'm a mac-borrowed Linux user myself).
The main issues with Linux laptops have always been:
When it comes to the kind of user community DSS is relevant to, Ubuntu generally is a very good move because you will surely find a Python package compiled for it, with all the libraries required. From this point of view, Linux really is many steps ahead of Windows. Did I mention that most of the time, if your machine fails, you can take your HDD and mount it in a new Linux laptop with similar characteristics and it will just work, or at least let you retrieve your data (restrictions apply, ofc)? Try that with Windows.
Finally, in your specific case (willing to use GPU), the hardest part is to find a laptop with a decent graphics card supported by the OS. There are a lot of super cool and slick laptops with amazing hardware out there, but the vendor just doesn't invest in Linux support, so you're on your own there. It might work just fine, it might not (I have a lot of experience with this).
Luckily enough, there are vendors like System76 that build interesting machines from the hardware point of view (they are just not as cool and slick as those of other big vendors, but they work just fine) along with an OS that is tailored for them and supports the GPU they carry onboard. In keeping with the very best open-source mentality, you can actually use their OS (for free) on another laptop you might prefer and still leverage the GPU (provided it's the same kind, ofc). I haven't tried System76 machines myself yet, but the internet is full of happy users.
Architect @ Dataiku
Excellent point about Linux.
In many ways, I’m actually OS agnostic. At various times I’ve had three or more computers at my desk: a PC, a Macintosh, and often a Sun SPARCstation. For many years I ran Ubuntu Linux with PC VMs running on top, because I needed to support PCs but wanted to have my usual terminal commands when I needed or wanted them. I’ve used KDE, GNOME, and I’m sure a few other desktop environments over the years. In my experience with a Linux host at my desk running a PC VM, I found that I spent most of my time in Windows. In general, I found the window managers and applications for Linux less refined, and my primary work was in support of the mainline PC applications.
I’ve not tried elementary OS. I may take a look. How smoothly does DSS install in that environment?
Yes, at least historically, Windows has been a challenge with anything Unix-like. Over the last few years they have had this Terminal that will run at least a Microsoft version of a bash or csh shell, and WSL. However, I suspect that there may be compromises there.
One of the reasons I’ve liked Macintosh in the past is the Mach kernel and the Unix-like features of the OS. Historically for me, the Macintosh computers have allowed me to run MS Windows in a VM, or on the base hardware if the full resources of the computer were needed and I was willing to waste a reboot. Linux could also be run in a VM but with a significant performance hit.
Unfortunately, for many years one could not run a Mac OS on a PC. I might consider a Hackintosh as a VM. But again I hear there are a bunch of compromises with less than fully functional results on the Macintosh side.
VMs, yes, they are useful to me. The times I have tried to use Wine, I found it to be mostly sour grapes.
I’ll take a look at System76. Thanks for the pointer.
What are others using?
As DSS users you have each likely had to think through these multiple OS challenges. What are your current personal laptop 💻 computer conclusions: PC, Macintosh, or Linux?
If it is not Linux on the laptop, where and how are you running DSS? On a desktop computer or server at home or work, in the cloud, or on a Raspberry Pi cluster?
If DSS is running in Linux on the laptop, is Linux the base OS of the computer? If so, which computer do you like? If not, is DSS running on a VM, and if so, set up in what way?
Thanks to everyone for your thoughts.
Hi @tgb417, I was mostly playing with WSL to see if I could get a basic install of Dataiku up and running on a Windows machine b/c I was having difficulty with VirtualBox. I didn't do any real work on that instance, so I can't comment on its viability.
Right now, I have DSS running on my MacBook (using the osx tar, not the dmg) as well as on some Linux machines. The experience is very similar, but if I were choosing a new laptop primarily for working with DSS, I would be hesitant to choose anything that wasn't running Linux b/c while my experience with DSS on macOS has been very good, DSS is not native to it.
@Alex_Reutter what additional benefits are you getting from using the tarball version of DSS on your Macintosh? If I get a new Mac, I'm inclined to re-install from scratch. Also, on a related point, how do you choose to install your Python: Anaconda, Homebrew, something else?
Finally, in reading and thinking about this thread: if I were to pick a laptop primarily for DSS, I think one might look at one of the high-end "gaming laptops" with an Nvidia GPU. However, the size of most of these reminds me of the laptops from the late 1990s or very early 2000s. Today these laptops are really more of a "desktop replacement" computer than a "walk around to the coffee shop every day" laptop.
I'm really looking for something smaller, and I'm coming to believe that for now I'm going to have to rent appropriate hardware from one of the hardware/platform-as-a-service providers like Amazon, Azure, Paperspace, maybe others.
Slick design + NVidia GPU + Linux-friendly (because that's the only real way to leverage GPUs from Python+DSS) is indeed a complicated equation to solve.
If travel-friendliness isn't an absolute requirement, renting GPUs and settling for a lighter/slicker laptop is probably your best bet.
Hi @tgb417, the tarball gives me more flexibility for setting up a DSS instance on my mac.
It's been a little while since I installed python. IIRC, 2.7 came installed on the macbook. For a little while, I used Anaconda to manage a python3 installation on my macbook, but eventually ran into a situation I couldn't resolve, and ended up tearing it out and reinstalling python3 with homebrew...
@Alex_Reutter, interesting... Yeah, I think that I'm now getting to the same place with the Anaconda Navigator software on my Mac. It has been a lovely set of "training wheels" for me. It resolves many of the library dependency, compiling, and sourcing challenges. But there are some configurations that Anaconda just does not seem to be able to resolve. There are recent libraries that I want to try that it does not have. And if I go and install those libraries with pip or install.packages(), then the Anaconda dependency resolver seems to get into trouble. Anaconda does provide me some other packages. For example, things like QGIS can be installed. But in my experience so far it is not a replacement for an apt-get type Linux package installer. I'm hearing about, and have experienced, Anaconda and Homebrew not playing nicely together. These are part of the reason that, if I go the Macintosh direction, I'm considering reinstalling from scratch and dropping Anaconda in favor of Homebrew.
What non-dmg flexibilities are you actually using with the tarball? Are you moving the location of the DSS_HOME out of ~/Library/DataScienceStudio/dss_home? If so, where? Are there other flexibilities that you value?
So I have my new Mac Mini. I'm happy with Homebrew as the App Installer.
However, I'm not clear about the best way to get Python 3.6 installed in a way that DSS can use in a clean way.
Homebrew seems to be installing 3.9.x right now. And there seem to be questions about how to get an older version of Python 3.6 with Homebrew.
Right now I can run Python 3.9 from the terminal as python3.
But I cannot install any code environments in DSS that are version 3.x, with or without conda. The latest version that seems to be supported in the drop-down menu below is 3.6. I've got a feeling I'm missing something. When I try to set up a v3.6 Python in the following manner:
I get the following error
I'd agree with this error message: I've not installed Python 3.6 on this computer.
Can DSS use the Python 3.9 already installed on the computer and being maintained by Homebrew? It does not show up in the drop-down menu. If not, what would be the best way to install a supported Python 3.6? Should I run one of the "magic" Dataiku installation scripts?
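For anyone following along, a quick way to see which versioned interpreters a shell can actually find is sketched below. This is only an illustration: the function name `find_pythons` and the list of versions are made up for the example, and whether DSS looks interpreters up by names like `python3.6` on the PATH is an assumption to check against the DSS documentation.

```shell
#!/bin/sh
# List which versioned Python interpreters are visible on the current PATH.
# Assumption: the tool of interest locates interpreters by names such as
# "python3.6"; the version list below is illustrative only.
find_pythons() {
    for version in 2.7 3.6 3.7 3.8 3.9; do
        # "command -v" prints the resolved path and fails if nothing is found.
        if location=$(command -v "python$version" 2>/dev/null); then
            printf 'python%s -> %s\n' "$version" "$location"
        else
            printf 'python%s -> not found\n' "$version"
        fi
    done
}

find_pythons
```

If `python3.6` shows as not found here, that is at least consistent with 3.6 code environments failing to build while `python3` (3.9) works fine in the terminal.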
I currently have the standard Mac Installer for DSS 8.0.5 installed on this computer as a Homebrew Cask. I did notice that the Mac Installer for 9.0.0 installed a version of Python 3.6 as part of its bundle. Is it best to just upgrade to DSS 9.0 manually rather than waiting for the Homebrew Cask upgrade? I could try to install the latest version of Python 3.6 available on python.org.
Lots of different options here. Just wondering what experience folks have and what is likely the cleanest way to set up a newish computer.
After completing what looks like a successful pyenv install and getting the latest version of Python 3.6 installed (Python 3.6.13), when I go to the Terminal and enter the following commands, I get the results below:
Mac-mini ~ % which python
Mac-mini ~ % python --version
Note: the above is altered; [user_name] is a replacement for my actual user name.
So Python looks to be installed ok. And things like qgis installed by homebrew which use python 3.9 seem to be still running OK.
However, DSS even after a re-start is still not able to find Python. I'm getting the same error as before.
Do I have to run a re-install script or something for DSS to find the python versions I've installed after I installed DSS?
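One thing that may be worth ruling out (an assumption, not a confirmed diagnosis) is that the pyenv shims directory is on the PATH of the interactive shell but not of the process that starts DSS. The sketch below checks the current shell only. The helper name `path_contains` is hypothetical, and the shims location assumes pyenv's default root of `~/.pyenv` unless `PYENV_ROOT` is set.

```shell
#!/bin/sh
# Report whether a directory appears in a colon-separated, PATH-style string.
# Usage: path_contains "$PATH" /some/dir
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: default pyenv layout, with shims under $PYENV_ROOT/shims.
shims_dir="${PYENV_ROOT:-$HOME/.pyenv}/shims"

if path_contains "$PATH" "$shims_dir"; then
    echo "pyenv shims are on PATH: $shims_dir"
else
    echo "pyenv shims are NOT on PATH: $shims_dir"
fi

# Show where python3.6 resolves from in this shell, if anywhere.
command -v python3.6 || echo "python3.6 does not resolve in this shell"
```

Running the same check from whatever environment launches DSS would show whether that environment sees the same PATH as the terminal.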
having to go through:
    # Stop DSS
    DATA_DIR/bin/dss stop
    # Save the list of locally-installed packages
    DATA_DIR/bin/pip freeze -l >dss-local-packages.txt
    # Remove the virtualenv, keeping backup
    mv DATA_DIR/pyenv DATA_DIR/pyenv.backup
    # Reinstall DSS (upgrade mode), choosing the underlying base Python to use
    dataiku-dss-VERSION/installer.sh -d DATA_DIR -u [-P BASE_PYTHON]
    # Review and possibly edit the list of locally-installed packages
    vi dss-local-packages.txt
    # Reinstall local packages
    DATA_DIR/bin/pip install -r dss-local-packages.txt
    # Start DSS
    DATA_DIR/bin/dss start
    # When everything is considered stable, remove the backup
    rm -rf DATA_DIR/pyenv.backup
Feels like I did something wrong. I’m not clear where I might have messed up. Any idea how to do a Homebrew Python and DSS install that just works from the start?
It’s getting late here so I’m not going to do anything more on this tonight.
Check this post out:
Doesn't solve your GPU requirement, since nothing in ML is going to run on Metal, but it could be a good option for general Dataiku use.
Windows User / VMWare Linux user here
A couple of notes: the GPU is a hard requirement, as DSS is native to Linux and at this time there is no way to do GPU passthrough from a Windows host to a Linux guest. If you are comfortable in Linux and are OK with not having all the power-saving features, you could go with something like what I'm using, a Lenovo P53, with either Ubuntu, CentOS or Fedora as your primary OS, and perhaps run a VM with anything else.
For overall ease, it's tough to beat macOS, but you get lower specs at a higher price. That said, your GPU setup should be much easier.
There is one other potential option, but it's some time away. Microsoft has been working with Ubuntu on WSL 2 and recently committed to bringing GPUs to WSL through a kernel driver that allows a client Linux instance to instantiate and use the host Windows system's GPU. To me, this is a game-changer. Microsoft has said this will be at least "a few months." I wouldn't get my hopes too high, and the first iterations of this will likely need some kinks ironed out. That said, the #1 WSL request has been GPU support, so Microsoft definitely has reasons to get this done, as this has been a major stumbling block for more people to use Windows as their primary development platform.
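Once that support is available, a check along these lines could tell whether a Linux session is running under WSL and whether an NVIDIA driver is visible. It is a minimal sketch: the function name `is_wsl_kernel` is made up for the example, and the test relies on WSL kernels reporting "microsoft" in their version string. `nvidia-smi -L` simply lists the GPUs the driver reports.

```shell
#!/bin/sh
# Classify a kernel version string: WSL kernels mention "microsoft"
# (WSL 1 capitalizes it, WSL 2 does not), so compare case-insensitively.
is_wsl_kernel() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *microsoft*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/version exists on Linux (including WSL); fall back to uname elsewhere.
kernel_info=$(cat /proc/version 2>/dev/null || uname -r)

if is_wsl_kernel "$kernel_info"; then
    echo "Running under WSL"
else
    echo "Not running under WSL"
fi

# nvidia-smi ships with the NVIDIA driver; list the GPUs it can see, if any.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || echo "nvidia-smi is present but reported no usable GPU"
else
    echo "nvidia-smi not found"
fi
```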
Finally, you could take Clement's advice. If your GPU needs are fairly inconsistent and transitory then you could grab something fun and sleek and just rent GPU time as needed. Hope this helps!
Thanks for your response. Your comments are super helpful.
I too heard at Microsoft Build (5/19/2020) that the folks working on WSL 2 are working on GPU pass-through: "WSL will support GPU Compute workflows". I believe that they showed a working demo. This looks promising for the kind of work I'm considering. I agree that it would be a game-changer.
Good suggestion, the Lenovo P53 has a bunch of things to like. I see that the 8th Generation Intel® Core™ i7 processor is a 10-25 watt chip. Are you aware of any significant thermal throttling or crazy loud fans with ML-style loads? It sounds like you have not been able to find drivers to support all of the power management features that the hardware seems to provide.
You suggested that the MacBook GPU setup should be much easier. I agree that for macOS apps this will be true.
I seem to have found a way in principle to get Neural Network loads to work on the Macintosh GPU.
During my research, I found PlaidML, also discussed in this post for Macintosh, with installation instructions here. The install instructions seem to indicate that support for OpenCL 1.2 is required. I'm also seeing Apple move away from OpenCL in favor of their own Metal, and on their list of OpenCL-compliant computers I'm not seeing the latest generation of devices.
Am I going down the right path with PlaidML as a way to get GPU access for ML loads on a Mac? Is there an easier way to do this? I'd be very interested to know anyone's thoughts.
Late to the party here. I've been running LinuxMint as my primary desk/laptop systems since Mint 18. Prior to that I ran Mandrake and/or PCLinuxOS. All worked pretty well, but LinuxMint has been rock solid since 2017 for me.