I am trying to assign new coordinates to a xarray DataArray's multiIndex.
I have a dataArray which contains 2 main dimensions ('longitude', 'latitude), and a single multiindex ('states').
Here is the DataArray structe:
print(dataArray)
<xarray.DataArray (longitude: 5000, latitude: 3000)>
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]])
Coordinates:
  * longitude  (longitude) float64 -145.0 -145.0 -144.9 ... -15.05 -15.03 -15.0
  * latitude   (latitude) float64 -85.0 -84.96 -84.93 ... 24.93 24.96 25.0
    states     (latitude, longitude) float64 nan nan nan nan ... nan nan nan nan
The 'states' multiindex contains solely interger values, and I would like to convert them, or add a second multiIndex with "named coordinates" (i.e.: USA, Italy, Germany, Brazil...).
Once with a named "states" multiindex, one could easily select a given state by its proper name - from the available index.
Below is a reproducible script. It has been taken from here :
import pandas as pd
pd.set_option('display.width', 50000)
pd.set_option('display.max_rows', 50000)
pd.set_option('display.max_columns', 5000)
import geopandas
from rasterio import features
from affine import Affine
import numpy as np
import xarray as xr
from cartopy.io import shapereader
def transform_from_latlon(lat, lon):
    lat = np.asarray(lat)
    lon = np.asarray(lon)
    trans = Affine.translation(lon[0], lat[0])
    scale = Affine.scale(lon[1] - lon[0], lat[1] - lat[0])
    return trans * scale
def rasterize(shapes, coords, fill=np.nan, **kwargs):
    """Rasterize a list of (geometry, fill_value) tuples onto the given
    xray coordinates. This only works for 1d latitude and longitude
    arrays.
    """
    transform = transform_from_latlon(coords['latitude'], coords['longitude'])
    out_shape = (len(coords['latitude']), len(coords['longitude']))
    raster = features.rasterize(shapes, out_shape=out_shape,
                                fill=fill, transform=transform,
                                dtype=float, **kwargs)
    return xr.DataArray(raster, coords=coords, dims=('latitude', 'longitude'))
if '__main__' == __name__:
    # this shapefile is from natural earth data
    # http://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-1-states-provinces/
    resolution = '10m'
    category = 'cultural'
    name = 'admin_0_countries'
    shpfilename = shapereader.natural_earth(resolution, category, name)
    # read the shapefile using geopandas
    states = geopandas.read_file(shpfilename)
    South_America = states[states['SUBREGION'] == 'South America'].reset_index(drop=True)
    state_ids = {k: i for i, k in enumerate(South_America['NAME_LONG'])}
    shapes = [(shape, n) for n, shape in enumerate(South_America.geometry)]
    LONGITUDE = np.linspace(-145, -15, num=5000)
    LATITUDE = np.linspace(-85, 25, num=3000)
    ds = xr.DataArray(coords=(LONGITUDE, LATITUDE), dims=['longitude', 'latitude'])
    ds['states'] = rasterize(shapes, ds.coords)
    # trying to assign new coordinates to the dimension:
    try:
        ds = ds.assign_coords(states = South_America['NAME_LONG'])
    except ValueError:
        print("message error", "cannot add coordinates with new dimensions to a DataArray")
    # ds = ds.expand_dims({'names':South_America['NAME_LONG']})  # --> this does not work
    Array = np.random.randn(LATITUDE.size, LONGITUDE.size)
    dArray_Brazil = xr.DataArray(Array, coords=(LATITUDE, LONGITUDE), dims=['latitude', 'longitude'])
    import matplotlib.pyplot as plt
    quadmash = dArray_Brazil.plot()
    ax = ds.states.where(ds.states != 'Brazil').plot(ax=quadmash.axes)
    plt.show()
ideally, I would like to have a DataArray structure as one of the two options below:
option 1)
<xarray.DataArray (longitude: 5000, latitude: 3000)>
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]])
Coordinates:
  * longitude  (longitude) float64 -145.0 -145.0 -144.9 ... -15.05 -15.03 -15.0
  * latitude   (latitude) float64 -85.0 -84.96 -84.93 ... 24.93 24.96 25.0
    states     (latitude, longitude) string Brazil, USA Germany ...
option 2)
 <xarray.DataArray (longitude: 5000, latitude: 3000)>
    array([[nan, nan, nan, ..., nan, nan, nan],
           [nan, nan, nan, ..., nan, nan, nan],
           [nan, nan, nan, ..., nan, nan, nan],
           ...,
           [nan, nan, nan, ..., nan, nan, nan],
           [nan, nan, nan, ..., nan, nan, nan],
           [nan, nan, nan, ..., nan, nan, nan]])
    Coordinates:
      * longitude  (longitude) float64 -145.0 -145.0 -144.9 ... -15.05 -15.03 -15.0
      * latitude   (latitude) float64 -85.0 -84.96 -84.93 ... 24.93 24.96 25.0
        states     (latitude, longitude) float64 nan nan nan nan ... nan nan nan nan
        Named_states     (latitude, longitude) string Brazil, USA Germany ...