I have clustered a DataFrame and then used groupby to group it by the resulting 'clusters' value:
clusterGroup = df1.groupby('clusters')
Each group in clusterGroup has multiple rows (and ~30 columns), and I need to create a new DataFrame with a single row per group that represents that group's cluster center. I'm using KMeans to do this, specifically its "cluster_centers_" attribute. The idea was to loop through each group, calculate the cluster center, and append it to a new DataFrame called logCenters.
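As an aside on the idea itself: when only one center per group is wanted, the k-means center of a group is just the column-wise mean of its rows, so a plain groupby mean should give the same single row per group without a loop. A minimal sketch, assuming a toy frame in place of df1 (the column names "a" and "b" and all values are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for df1: two feature columns plus the 'clusters' label.
df1 = pd.DataFrame(
    {
        "a": [0.0, 0.0, 113.0, 114.0, 114.0],
        "b": [105.0, 103.0, 0.0, 0.0, 0.0],
        "clusters": [14, 14, 26, 26, 26],
    },
    index=[493, 494, 510, 513, 516],
)

# One row per cluster label; each value is the mean of that feature
# over the rows in the group, which is what a single k-means center is.
logCenters = df1.groupby("clusters").mean()
print(logCenters)
```

This only applies to the one-center-per-group case; with more than one center per group, KMeans would still be needed.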
df1.head()
9367    13575   13577   13578   13580   13585   13587   13588   13589   13707   13708   13719   13722   13725   13817   13819   14894   20326   20379   20384   20431   20433   22337   22346   22386   22388   22391   clusters
493 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 105.0   0.0 0.0 0.0 0.0 0.0 0.0 112.0   0.0 107.0   0.0 0.0 0.0 14
510 0.0 0.0 0.0 113.0   0.0 0.0 111.0   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 105.0   0.0 0.0 0.0 0.0 0.0 26
513 0.0 0.0 0.0 114.0   0.0 0.0 106.0   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 106.0   0.0 0.0 0.0 0.0 0.0 26
516 0.0 0.0 0.0 114.0   0.0 0.0 111.0   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 108.0   0.0 0.0 0.0 0.0 0.0 26
519 0.0 0.0 0.0 113.0   0.0 0.0 113.0   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 109.0   0.0 0.0 0.0 0.0 0.0 26
This is the loop I'm running:
from sklearn.cluster import KMeans
K = 1
logCenters = []
for x in clusterGroup:
    kmeans_model = KMeans(n_clusters=K).fit(x)
    centers = np.array(kmeans_model.cluster_centers_)
    logCenters.append(centers)
The error I get when running this loop is:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-108-148e4053f5fb> in <module>()
      3 logCenters = []
      4 for x in clusterGroup:
----> 5     kmeans_model = KMeans(n_clusters=K).fit(x)
      6     centers = np.array(kmeans_model.cluster_centers_)
      7     logCenters.append(centers)
/home/nbuser/anaconda3_23/lib/python3.4/site-packages/sklearn/cluster/k_means_.py in fit(self, X, y)
    878         """
    879         random_state = check_random_state(self.random_state)
--> 880         X = self._check_fit_data(X)
    881 
    882         self.cluster_centers_, self.labels_, self.inertia_, self.n_iter_ = \
/home/nbuser/anaconda3_23/lib/python3.4/site-packages/sklearn/cluster/k_means_.py in _check_fit_data(self, X)
    852     def _check_fit_data(self, X):
    853         """Verify that the number of samples given is larger than k"""
--> 854         X = check_array(X, accept_sparse='csr', dtype=[np.float64, np.float32])
    855         if X.shape[0] < self.n_clusters:
    856             raise ValueError("n_samples=%d should be >= n_clusters=%d" % (
/home/nbuser/anaconda3_23/lib/python3.4/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    380                                       force_all_finite)
    381     else:
--> 382         array = np.array(array, dtype=dtype, order=order, copy=copy)
    383 
    384         if ensure_2d:
ValueError: setting an array element with a sequence.
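A likely cause, judging from the traceback: iterating over a GroupBy object yields (key, sub-DataFrame) pairs rather than DataFrames, so fit(x) receives a tuple and NumPy cannot turn it into a float array. Below is a minimal sketch of the loop with the pair unpacked. It assumes a toy frame in place of df1 (column names "a" and "b" and all values are hypothetical), and it also assumes the 'clusters' label should be left out of the fit, which the original loop does not do. The n_init and random_state arguments are set only to make the run repeatable.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical stand-in for df1: two feature columns plus the 'clusters' label.
df1 = pd.DataFrame(
    {
        "a": [0.0, 0.0, 113.0, 114.0, 114.0],
        "b": [105.0, 103.0, 0.0, 0.0, 0.0],
        "clusters": [14, 14, 26, 26, 26],
    },
    index=[493, 494, 510, 513, 516],
)

K = 1
rows = {}
# Each iteration yields (group key, sub-DataFrame), so unpack both.
for label, group in df1.groupby("clusters"):
    # Assumption: fit on the feature columns only, not on the label itself.
    features = group.drop(columns="clusters")
    model = KMeans(n_clusters=K, n_init=10, random_state=0).fit(features.to_numpy())
    # cluster_centers_ has shape (K, n_features); with K = 1 take the only row.
    rows[label] = pd.Series(model.cluster_centers_[0], index=features.columns)

# Dict of Series -> one column per label; transpose to get one row per label.
logCenters = pd.DataFrame(rows).T
logCenters.index.name = "clusters"
print(logCenters)
```

Collecting the centers in a dict and building the DataFrame once at the end avoids appending to a DataFrame row by row.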
 
    