I'm trying to import the KMeans and Vectors classes from spark.mllib. The platform is IBM Cloud (DSX) with Python 3.5 and a Jupyter Notebook.
I've tried:
import org.apache.spark.mllib.linalg.Vectors
import apache.spark.mllib.linalg.Vectors
import spark.mllib.linalg.Vectors
I've found several examples/tutorials where the first import works for the author. I was able to confirm that the spark library itself isn't loaded in the environment. Normally, I would download the package and then import it, but being new to VMs, I'm not sure how to make this happen.
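For reference, here is a minimal sketch of the kind of check I mean, run from a notebook cell. The helper name is_importable is just something I made up for illustration, and the module names in the loop are guesses on my part (I'm assuming the Python bindings may be packaged as pyspark rather than spark), so treat it as a rough probe rather than a definitive test:

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Candidate names are assumptions; only top-level names are probed here,
# because looking up a dotted name would import its parent package first.
for name in ("pyspark", "spark"):
    print(name, "available" if is_importable(name) else "not found")
```

This only reports whether the kernel can see a module by that name. It says nothing about whether a working Spark installation sits behind it.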
I've also tried pip install spark without luck. It throws an error that reads:
The following command must be run outside of the IPython shell:
$ pip install spark
The Python package manager (pip) can only be used from outside of IPython.
Please reissue the `pip` command in a separate terminal or command prompt.
But this is in a VM where I don't see the ability to externally access the CLI.
I did find this, but I don't think I have a mismatch problem -- the issue on importing into DSX is covered but I can't quite interpret it for my situation.
I think this is the actual issue I'm having, but it is for SparkR and not Python.