I am trying to import the bitarray library (https://pypi.python.org/pypi/bitarray/0.8.1) into a SparkContext. To do this I have zipped up the contents of the bitarray folder and then tried to add it to my Python files. However, even after I push the library to the nodes, my RDD cannot find the library. Here is my code:
zip bitarray.zip bitarray-0.8.1/bitarray/*
# Check the contents of the zip file
unzip -l bitarray.zip
Archive:  bitarray.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
   143455  2015-11-06 02:07   bitarray/_bitarray.so
     4440  2015-11-06 02:06   bitarray/__init__.py
     6224  2015-11-06 02:07   bitarray/__init__.pyc
    68516  2015-11-06 02:06   bitarray/test_bitarray.py
    78976  2015-11-06 02:07   bitarray/test_bitarray.pyc
---------                     -------
   301611                     5 files
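For reference, here is a minimal local sketch of the layout I believe sc.addPyFile expects: Python's zipimport only finds a package when its directory sits at the zip root (demo_pkg below is a hypothetical stand-in for bitarray):

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway zip whose package directory sits at the zip root --
# the layout zipimport (and hence sc.addPyFile) needs.
# "demo_pkg" is a hypothetical stand-in for the bitarray package.
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "demo.zip")
with zipfile.ZipFile(zip_path, "w") as z:
    z.writestr("demo_pkg/__init__.py", "VALUE = 42\n")

sys.path.insert(0, zip_path)
import demo_pkg  # works because demo_pkg/ is at the zip root

print(demo_pkg.VALUE)  # -> 42
```

One caveat I am not sure about: zipimport handles pure-Python modules, but I don't think compiled extensions like _bitarray.so can be loaded from inside a zip at all.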
Then in Spark:
import os
import sys
# Environment
import findspark
findspark.init("/home/utils/spark-1.6.0/")
import pyspark
sparkConf = pyspark.SparkConf()
sparkConf.set("spark.executor.instances", "2") 
sparkConf.set("spark.executor.memory", "10g")
sparkConf.set("spark.executor.cores", "2")
sc = pyspark.SparkContext(conf = sparkConf)
from pyspark.sql import SQLContext
from pyspark.sql.types import *
from pyspark.sql import HiveContext
from pyspark.sql.types import StructType, StructField, StringType
from pyspark.sql.functions import udf
hiveContext = HiveContext(sc)
PYBLOOM_LIB = '/home/ryandevera/pybloom.zip'
sys.path.append(PYBLOOM_LIB)
sc.addPyFile(PYBLOOM_LIB)
from pybloom import BloomFilter
f = BloomFilter(capacity=1000, error_rate=0.001)
x = sc.parallelize([(1,("hello",4)),(2,("goodbye",5)),(3,("hey",6)),(4,("test",7))],2)
def bloom_filter_spark(iterator):
    for id,_ in iterator:
        f.add(id)
    yield (None, f)
x.mapPartitions(bloom_filter_spark).take(1)
This yields the error:
ImportError: pybloom requires bitarray >= 0.3.4
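The partition logic itself seems fine outside Spark; this pure-Python sketch (with a set standing in for the BloomFilter) mirrors what mapPartitions does with one partition's iterator and runs without error:

```python
# Pure-Python sketch of the partition logic above, no Spark required.
# A set stands in for the BloomFilter just to check the iteration shape.
def bloom_filter_sketch(iterator):
    seen = set()
    for record_id, _ in iterator:
        seen.add(record_id)
    yield (None, seen)

# One hypothetical partition, mirroring the RDD x in the question.
partition = iter([(1, ("hello", 4)), (2, ("goodbye", 5))])
result = list(bloom_filter_sketch(partition))
print(result)  # [(None, {1, 2})]
```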
I am not sure where I am going wrong. Any help would be greatly appreciated!
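In case it helps, here is a small diagnostic I could run to see what the executors can actually import (it assumes the sc created above; the local dry run just reports what this interpreter sees):

```python
def probe(_):
    # Report whether this interpreter can import bitarray; on the cluster,
    # each executor runs this against its own environment.
    try:
        import bitarray
        yield ("ok", getattr(bitarray, "__file__", "built-in"))
    except ImportError as exc:
        yield ("import failed", str(exc))

# Local dry run; on the cluster I would use:
#   sc.parallelize([0], 1).mapPartitions(probe).collect()
status = next(probe(None))
print(status[0])
```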