I imagine your iterative way is something like this:
In [204]: dd = {
...: 'distro': {0: 2.42, 3: 2.56},
...: 'constant': 4.55,
...: 'size': 10,
...: }
In [205]: dd
Out[205]: {'constant': 4.55, 'distro': {0: 2.42, 3: 2.56}, 'size': 10}
In [207]: x = np.zeros(dd['size'])
In [208]: x[:] = dd['constant']
In [210]: for i,v in dd['distro'].items():
...: x[i] = v
In [211]: x
Out[211]: array([ 2.42, 4.55, 4.55, 2.56, 4.55, 4.55, 4.55, 4.55, 4.55, 4.55])
An alternative to the x[:] assignment is x.fill(dd['constant']), but I don't think there's much difference in speed.
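np.full collapses the allocate-and-fill steps into one call, so the whole iterative version can be written as:

```python
import numpy as np

dd = {'distro': {0: 2.42, 3: 2.56}, 'constant': 4.55, 'size': 10}

# allocate the array already filled with the constant
x = np.full(dd['size'], dd['constant'])
# then overwrite the sparse entries
for i, v in dd['distro'].items():
    x[i] = v
```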
Here's a way of setting values from the dictionary without explicit iteration:
In [221]: ddvals = np.array(list(dd['distro'].items()),dtype='i,f')
In [222]: ddvals
Out[222]:
array([(0, 2.42000008), (3, 2.55999994)],
dtype=[('f0', '<i4'), ('f1', '<f4')])
In [223]: x[ddvals['f0']]=ddvals['f1']
In [224]: x
Out[224]:
array([ 2.42000008, 4.55 , 4.55 , 2.55999994, 4.55 ,
4.55 , 4.55 , 4.55 , 4.55 , 4.55 ])
or without the structured array:
In [225]: vals = np.array(list(dd['distro'].items()))
In [226]: vals
Out[226]:
array([[ 0. , 2.42],
[ 3. , 2.56]])
In [227]: x[vals[:,0]] = vals[:,1]
...
IndexError: arrays used as indices must be of integer (or boolean) type
In [228]: x[vals[:,0].astype(int)] = vals[:,1]
In [229]: x
Out[229]: array([ 2.42, 4.55, 4.55, 2.56, 4.55, 4.55, 4.55, 4.55, 4.55, 4.55])
The dictionary items() (or list(items()) in PY3, where it returns a view) gives a list of key/value tuples. Newer numpy versions don't allow floats as indices, so we have to add a few steps to preserve the integer key values.
This might be the simplest:
x[list(dd['distro'].keys())] = list(dd['distro'].values())
(keys, values and items are guaranteed to iterate in the same order, as long as the dictionary isn't modified in between).
For this small case I suspect the plain iterative approach is faster. But for something much larger, one of the latter approaches is probably better. I can't predict where the crossover occurs.
scipy.sparse makes 2d matrices. It does not implement any sort of constant fill. (Pandas sparse does have such a fill.) We could certainly construct a sparse matrix from dd['size'] and dd['distro']. But I don't know if it will offer any speed advantages.
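As a sketch, a single-row sparse matrix can be built straight from the dictionary using the (data, (row, col)) form of the coo constructor; note the constant is lost, since sparse formats only store the explicit entries:

```python
import numpy as np
from scipy import sparse

dd = {'distro': {0: 2.42, 3: 2.56}, 'constant': 4.55, 'size': 10}

rows = np.zeros(len(dd['distro']), dtype=int)           # all in row 0
cols = np.array(list(dd['distro'].keys()), dtype=int)
vals = np.array(list(dd['distro'].values()))

# (data, (row, col)) form of the coo constructor
M = sparse.coo_matrix((vals, (rows, cols)), shape=(1, dd['size']))
```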
And if Tensorflow is your real target, then you may need to look more at its construction methods. Maybe you don't need to pass through numpy or sparse at all.
This x, without the constant, can be represented as a scipy sparse matrix with:
In [247]: Xo = sparse.coo_matrix([x])
In [248]: Xo
Out[248]:
<1x10 sparse matrix of type '<class 'numpy.float64'>'
with 2 stored elements in COOrdinate format>
Its key attributes are:
In [249]: Xo.data
Out[249]: array([ 2.42, 2.56])
In [250]: Xo.row
Out[250]: array([0, 0], dtype=int32)
In [251]: Xo.col
Out[251]: array([0, 3], dtype=int32)
In [252]: Xo.shape
Out[252]: (1, 10)
Xr = Xo.tocsr() gives the csr format, which is similar, except the row attribute is replaced with an indptr array, which has one value per row (+1), so it doesn't grow with the number of non-zero terms. csr is used for most sparse math.
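To make the indptr idea concrete, here is the same 1x10 example converted to csr; with one row and two stored values, indptr is just [0, 2], marking where that row's entries start and end in the data array:

```python
import numpy as np
from scipy import sparse

x = np.array([2.42, 0, 0, 2.56, 0, 0, 0, 0, 0, 0])
Xr = sparse.coo_matrix([x]).tocsr()

# nrows+1 entries; row i's data lives in data[indptr[i]:indptr[i+1]]
print(Xr.indptr)    # [0 2]
print(Xr.indices)   # column indices: [0 3]
print(Xr.data)      # [2.42 2.56]
```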
There is also a dok format, which is actually a dictionary subclass:
In [258]: dict(Xo.todok())
Out[258]: {(0, 0): 2.4199999999999999, (0, 3): 2.5600000000000001}
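Since dok is a dict subclass keyed by (row, col) tuples, a 1-row matrix can also be filled straight from the dictionary, skipping the dense array entirely:

```python
from scipy import sparse

dd = {'distro': {0: 2.42, 3: 2.56}, 'size': 10}

# dok_matrix behaves like a dict of (row, col) -> value
D = sparse.dok_matrix((1, dd['size']))
for k, v in dd['distro'].items():
    D[0, k] = v
```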
If the input is valid json, you will need to convert the index keys to integers.
In [281]: jstr
Out[281]: '{"distro": {"0": 2.42, "3": 2.56}, "constant": 4.55, "size": 10}'
In [282]: jdd = json.loads(jstr)
In [283]: jdd
Out[283]: {'constant': 4.55, 'distro': {'0': 2.42, '3': 2.56}, 'size': 10}
In [284]: list(jdd['distro'].keys())
Out[284]: ['0', '3']
In [285]: np.array(list(jdd['distro'].keys()),int)
Out[285]: array([0, 3])
In [286]: np.array(list(jdd['distro'].values()))
Out[286]: array([ 2.42, 2.56])
My impression from SO searches is that json.loads is as fast as, if not faster than, eval. It has to parse a much simpler syntax.
python eval vs ast.literal_eval vs JSON decode
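The two parsers accept slightly different syntax: json needs double-quoted string keys, while ast.literal_eval handles the Python dict repr, integer keys included. A small sketch of the difference:

```python
import ast
import json

jstr = '{"distro": {"0": 2.42, "3": 2.56}, "size": 10}'
pstr = "{'distro': {0: 2.42, 3: 2.56}, 'size': 10}"

from_json = json.loads(jstr)        # string keys: '0', '3'
from_eval = ast.literal_eval(pstr)  # integer keys: 0, 3
```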
If you can process the json strings and store them in some sort of intermediate data structure, there are several possibilities. How 'sparse' are these vectors? If the dictionary has values for nearly all of the 1000 'size' entries, it may be best to build the full numpy array and save that (e.g. with the np.save/load pair).
If it is sparse (say 10% of the values being non-constant), then saving the 2 index and value arrays may make more sense (Out[285] and Out[286]). Either keep them separate, or join them in the kind of structured array I produced earlier.
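A minimal sketch of that save/load round trip, using np.savez to keep the two arrays together (an in-memory buffer stands in for a real file path here):

```python
import io
import numpy as np

idx = np.array([0, 3])
vals = np.array([2.42, 2.56])

# savez bundles both arrays into one .npz archive
buf = io.BytesIO()
np.savez(buf, idx=idx, vals=vals)
buf.seek(0)

# reload and rebuild the dense vector from the constant
loaded = np.load(buf)
x = np.full(10, 4.55)
x[loaded['idx']] = loaded['vals']
```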