I am trying to trim whitespace from every column in my dataframe. Some values have leading or trailing spaces around words, and some cells contain nothing but whitespace (for example "   "). I want all of that stripped.
I read this post which gave a great way to accomplish this:
data_frame_trimmed = data_frame.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
However, I frequently get the following:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-9-31d35db1d48c> in <module>
      1 df = (pd.read_csv('C:\\Users\\wundermahn\Desktop\\aggregated_po_data.csv',
----> 2                     encoding = "ISO-8859-1", low_memory=False).apply(lambda x: x.str.strip() if (x.dtype == "object") else x))
      3 print(df.shape)
      4 
      5 label = df['class']
c:\python367-64\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   6876             kwds=kwds,
   6877         )
-> 6878         return op.get_result()
   6879 
   6880     def applymap(self, func) -> "DataFrame":
c:\python367-64\lib\site-packages\pandas\core\apply.py in get_result(self)
    184             return self.apply_raw()
    185 
--> 186         return self.apply_standard()
    187 
    188     def apply_empty_result(self):
c:\python367-64\lib\site-packages\pandas\core\apply.py in apply_standard(self)
    294             try:
    295                 result = libreduction.compute_reduction(
--> 296                     values, self.f, axis=self.axis, dummy=dummy, labels=labels
    297                 )
    298             except ValueError as err:
pandas\_libs\reduction.pyx in pandas._libs.reduction.compute_reduction()
pandas\_libs\reduction.pyx in pandas._libs.reduction.Reducer.get_result()
<ipython-input-9-31d35db1d48c> in <lambda>(x)
      1 df = (pd.read_csv('C:\\Users\\wundermahn\Desktop\\aggregated_data.csv',
----> 2                     encoding = "ISO-8859-1", low_memory=False).apply(lambda x: x.str.strip() if (x.dtype == "object") else x))
      3 print(df.shape)
      4 
      5 label = df['ON_TIME']
c:\python367-64\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5268             or name in self._accessors
   5269         ):
-> 5270             return object.__getattribute__(self, name)
   5271         else:
   5272             if self._info_axis._can_hold_identifiers_and_holds_name(name):
c:\python367-64\lib\site-packages\pandas\core\accessor.py in __get__(self, obj, cls)
    185             # we're accessing the attribute of the class, i.e., Dataset.geo
    186             return self._accessor
--> 187         accessor_obj = self._accessor(obj)
    188         # Replace the property with the accessor object. Inspired by:
    189         # http://www.pydanny.com/cached-property.html
c:\python367-64\lib\site-packages\pandas\core\strings.py in __init__(self, data)
   2039 
   2040     def __init__(self, data):
-> 2041         self._inferred_dtype = self._validate(data)
   2042         self._is_categorical = is_categorical_dtype(data)
   2043         self._is_string = data.dtype.name == "string"
c:\python367-64\lib\site-packages\pandas\core\strings.py in _validate(data)
   2096 
   2097         if inferred_dtype not in allowed_types:
-> 2098             raise AttributeError("Can only use .str accessor with string values!")
   2099         return inferred_dtype
   2100 
**AttributeError: Can only use .str accessor with string values!**
So, in trying to find a workaround, I stumbled upon this post, which suggests using:
data_frame_trimmed = data_frame.apply(lambda x: x.str.strip() if x.dtype == "str" else x)
But that doesn't strip cells that contain only spaces or tabs.
How can I efficiently strip all kinds of whitespace? Ultimately, I am going to drop columns with more than 50% null values.
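For reference, here is a minimal sketch of the behaviour I am after. It is not a definitive solution. It assumes the error comes from object columns that hold non-string values, so it checks each cell with `isinstance` instead of relying on the column dtype. It also assumes that whitespace-only cells should become NaN so they count as null later. The helper names (`clean_cell`, `strip_whitespace`, `drop_mostly_null`) and the sample frame are made up for illustration.

```python
import numpy as np
import pandas as pd


def clean_cell(value):
    """Strip one cell; whitespace-only strings become NaN (hypothetical helper)."""
    if isinstance(value, str):
        stripped = value.strip()  # removes spaces, tabs, newlines at both ends
        return stripped if stripped else np.nan
    return value  # numbers, None, NaN, dates pass through unchanged


def strip_whitespace(frame):
    """Return a copy with every non-numeric column cleaned cell by cell."""
    out = frame.copy()
    for col in out.columns:
        if not pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].map(clean_cell)
    return out


def drop_mostly_null(frame, max_null_fraction=0.5):
    """Keep only columns whose share of nulls is at most max_null_fraction."""
    null_fraction = frame.isna().mean()
    return frame.loc[:, null_fraction <= max_null_fraction]


# Made-up sample data, not taken from the real CSV.
sample = pd.DataFrame({
    "name": ["  alice", "bob  ", " carol "],
    "mixed": [" x ", 1, None],
    "blank": ["   ", "\t", "ok"],
    "num": [1, 2, 3],
})
cleaned = strip_whitespace(sample)
result = drop_mostly_null(cleaned)
```

In this sample, the "blank" column ends up two-thirds null and is dropped, while "mixed" keeps its integer untouched. The cell-by-cell `map` is slower than the vectorized `.str.strip()`, so this is the behaviour I want rather than necessarily the efficient way to get it.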