As of Python 3.6, random.choices can be used to return a list of elements of a specified size from the given population, with optional weights.
random.choices(population, weights=None, *, cum_weights=None, k=1)
population : the sequence to sample from; its elements need not be unique. (If empty, raises IndexError)
weights : relative weights used to make selections.
cum_weights : cumulative weights used to make selections.
k : length of the returned list. (Default: 1)
A few caveats:
1) It uses weighted sampling with replacement, so a drawn item is returned to the population and can be selected again. The absolute values in the weights sequence do not matter; only their relative ratios do.
Unlike np.random.choice, which accepts only probabilities as weights and requires them to sum to 1, there are no such restrictions here. As long as the weights are of a numeric type that interoperates with floats (int, float or Fraction, but not Decimal), they will work.
>>> import random
# weights being integers
>>> random.choices(["white", "green", "red"], [12, 12, 4], k=10)
['green', 'red', 'green', 'white', 'white', 'white', 'green', 'white', 'red', 'white']
# weights being floats
>>> random.choices(["white", "green", "red"], [.12, .12, .04], k=10)
['white', 'white', 'green', 'green', 'red', 'red', 'white', 'green', 'white', 'green']
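# weights being fractions (a bare 12/100 evaluates to a float, so use fractions.Fraction)
>>> from fractions import Fraction
>>> random.choices(["white", "green", "red"], [Fraction(12, 100), Fraction(12, 100), Fraction(4, 100)], k=10)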
['green', 'green', 'white', 'red', 'green', 'red', 'white', 'green', 'green', 'green']
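To see that only the ratios matter, here is a rough sketch that draws a large sample with two differently scaled weight lists and compares the observed shares. The helper name observed_shares, the seed and the sample size are my own choices for illustration, not part of the random module.

```python
import random
from collections import Counter

colors = ["white", "green", "red"]
rng = random.Random(0)  # seeded only so the run is repeatable

def observed_shares(weights, k=28_000):
    """Draw k items with the given weights and return each color's share."""
    counts = Counter(rng.choices(colors, weights=weights, k=k))
    return {color: counts[color] / k for color in colors}

# Same 12:12:4 ratio, expressed on two different scales
shares_int = observed_shares([12, 12, 4])
shares_float = observed_shares([0.12, 0.12, 0.04])

print(shares_int)
print(shares_float)
```

Both dictionaries should land close to 12/28, 12/28 and 4/28, even though neither weight list sums to 1.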
2) If neither weights nor cum_weights are specified, selections are made with equal probability. If a weights sequence is supplied, it must be the same length as the population sequence.
Specifying both weights and cum_weights raises a TypeError.
>>> random.choices(["white", "green", "red"], k=10)
['white', 'white', 'green', 'red', 'red', 'red', 'white', 'white', 'white', 'green']
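A small sketch of the failure modes from this caveat. The wrapper try_choices is a hypothetical helper I wrote for the demonstration; as far as I know, a length mismatch surfaces as a ValueError, while passing both keyword arguments gives the TypeError mentioned above.

```python
import random

colors = ["white", "green", "red"]

def try_choices(**kwargs):
    """Return the name of the exception random.choices raises, or None."""
    try:
        random.choices(colors, k=3, **kwargs)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None

# Two weights for three items: lengths do not match
mismatch = try_choices(weights=[12, 12])
# weights and cum_weights together are rejected
both = try_choices(weights=[12, 12, 4], cum_weights=[12, 24, 28])
# A well-formed call raises nothing
ok = try_choices(weights=[12, 12, 4])

print(mismatch, both, ok)
```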
3) cum_weights is typically produced with itertools.accumulate, which is handy in such situations.
From the documentation linked:
Internally, the relative weights are converted to cumulative weights
before making selections, so supplying the cumulative weights saves
work.
So supplying either weights=[12, 12, 4] or cum_weights=[12, 24, 28] in our contrived case produces the same outcome, and the latter is somewhat more efficient because it skips the accumulation step.
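As a quick check of that equivalence, the sketch below builds the cumulative weights with itertools.accumulate and draws from two generators seeded identically. The seed value and the variable names are arbitrary choices of mine.

```python
import random
from itertools import accumulate

colors = ["white", "green", "red"]
weights = [12, 12, 4]
cum_weights = list(accumulate(weights))  # running totals: [12, 24, 28]

# Two generators with the same seed produce the same underlying random numbers
rng_a = random.Random(42)
rng_b = random.Random(42)

from_weights = rng_a.choices(colors, weights=weights, k=10)
from_cum = rng_b.choices(colors, cum_weights=cum_weights, k=10)

print(from_weights == from_cum)
```

If you sample repeatedly from the same distribution, computing cum_weights once up front and reusing it avoids redoing the accumulation on every call.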