I'm looking for a reasonable definition of a function weighted_sample that does not return just one random index for a list of given weights (which would be something like
def weighted_choice(weights, random=random):
    """ Given a list of weights [w_0, w_1, ..., w_n-1],
        return an index i in range(n) with probability proportional to w_i. """
    rnd = random.random() * sum(weights)
    for i, w in enumerate(weights):
        if w<0:
            raise ValueError("Negative weight encountered.")
        rnd -= w
        if rnd < 0:
            return i
    raise ValueError("Sum of weights is not positive")
to give a categorical distribution with constant weights) but a random sample of k of those, without replacement, just as random.sample behaves compared to random.choice.
Just as weighted_choice can be written as
lambda weights: random.choice([val for val, cnt in enumerate(weights)
    for i in range(cnt)])
weighted_sample could be written as
lambda weights, k: random.sample([val for val, cnt in enumerate(weights)
    for i in range(cnt)], k)
but I would like a solution that does not require me to unravel the weights into a (possibly huge) list.
Edit: If there are any nice algorithms that give me back a histogram/list of frequencies (in the same format as the argument weights) instead of a sequence of indices, that would also be very useful.
 
     
     
     
     
    