How to install Spark and Sparkling Water via VirtualBox

Ourkid123uk Registered Posts: 3 ✭✭✭✭


I have VirtualBox up and running and have downloaded the Spark file "spark-2.4.3-bin-hadoop2.7.tgz".

I'm trying to install this and then Sparkling Water, but I'm really struggling with how to do this.

Whenever I try any commands, it says "command not found".

I have VERY limited knowledge of working at a Linux command prompt. What are my next steps in order to install Spark onto my virtual machine?

Thanks for looking



  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker

    To be very honest, this is almost impossible without some knowledge of the Linux command line.

    Also, your Spark and/or your H2O will not actually be distributed, which significantly limits the benefit they bring compared to simple in-memory machine learning. What is it that you want to do, more precisely?
  • Ourkid123uk
    Ourkid123uk Registered Posts: 3 ✭✭✭✭
    Hi Clement!

    Thanks for taking your time to reply.

    I don't mind learning about Linux and how to do this, and I've posted some questions on a Linux forum to start me off.

    I basically have the free version and I am using it on a small to medium-sized data set in memory.

    I just wanted to use the Naive Bayes algorithm and have it integrated into the DSS workflow.
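
For anyone landing here with the same "command not found" problem: that message usually means the shell cannot find the program on its PATH, which is expected right after downloading a .tgz, since nothing has been unpacked yet. Below is a minimal sketch of the unpack-and-PATH step only, not a full Spark or Sparkling Water setup. The function name install_spark and the target directory are made up for illustration, and the sketch assumes the archive unpacks into a folder named after the file minus ".tgz".

```shell
#!/bin/sh
# Sketch only: unpack a downloaded Spark archive and put its bin/ on PATH.
# install_spark is a hypothetical helper name, not part of Spark or DSS.
install_spark() {
    tgz="$1"      # path to the downloaded archive
    target="$2"   # directory to unpack into (an assumed location)

    if [ ! -f "$tgz" ]; then
        echo "archive not found: $tgz" >&2
        return 1
    fi

    mkdir -p "$target"
    # -x extract, -z gunzip, -f archive file, -C directory to extract into
    tar -xzf "$tgz" -C "$target" || return 1

    # Assumption: the archive's top-level folder is the file name minus .tgz
    SPARK_HOME="$target/$(basename "$tgz" .tgz)"
    export SPARK_HOME
    # Let the shell find spark-shell, spark-submit, etc. in this session
    PATH="$SPARK_HOME/bin:$PATH"
    export PATH
    echo "SPARK_HOME=$SPARK_HOME"
}
```

With the file from the question sitting in the home directory, the call would look like: install_spark "$HOME/spark-2.4.3-bin-hadoop2.7.tgz" "$HOME/opt". The PATH change only lasts for the current terminal session, and Spark also needs a Java runtime installed on the VM before spark-shell will start.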