I want to calculate the probability of every value in a DataFrame column according to the column's own distribution. For example, my data looks like this:
    data
0      1
1      1
2      2
3      3
4      2
5      2
6      7
7      8
8      3
9      4
10     1
And the output I expect looks like this:
    data       pro
0      1  0.155015
1      1  0.155015
2      2  0.181213
3      3  0.157379
4      2  0.181213
5      2  0.181213
6      7  0.048717
7      8  0.044892
8      3  0.157379
9      4  0.106164
10     1  0.155015
I also referred to another question (How to compute the probability ...) and used it to produce the example above. My code is as follows:
import pandas as pd
import scipy.stats

samples = [1,1,2,3,2,2,7,8,3,4,1]
samples = pd.DataFrame(samples,columns=['data'])
print(samples)
# Fit a Gaussian KDE to the column, then evaluate the density at every row
kde = scipy.stats.gaussian_kde(samples['data'].tolist())
samples['pro'] = kde.pdf(samples['data'].tolist())
print(samples)
My problem is that when the column is very long, this becomes slow. Is there a better way to do it in pandas? Thanks in advance.
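
One idea, shown only as a rough sketch: since repeated values share the same density, the KDE could be evaluated once per distinct value and the results mapped back onto the rows. This assumes the column has many duplicates (as in the sample data); it would not help if nearly every value is unique. The names `unique_vals` and `density` are just illustrative.

```python
import pandas as pd
import scipy.stats

samples = pd.DataFrame({'data': [1, 1, 2, 3, 2, 2, 7, 8, 3, 4, 1]})

# Fit the KDE on the full column so duplicates still weight the estimate.
kde = scipy.stats.gaussian_kde(samples['data'].to_numpy())

# Evaluate the density only once per distinct value.
unique_vals = samples['data'].unique()
density = pd.Series(kde.pdf(unique_vals), index=unique_vals)

# Map the per-value densities back onto every row.
samples['pro'] = samples['data'].map(density)
print(samples)
```

The fit still uses all rows, so the `pro` column should match what the original code produces; only the number of evaluation points shrinks.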