I have a dataset X such that X.shape yields (10000, 9). I want to randomly mark 70% of its rows as training data with the following code:
import numpy as np

X = np.random.normal(size=(10000, 9))
train_fraction = 0.7  # fraction of X that will be marked as train data
train_size = int(X.shape[0] * train_fraction)  # fraction converted to a row count
test_size = X.shape[0] - train_size  # remaining rows will be marked as test data
train_ind = np.asarray([False] * X.shape[0])
train_ind[np.random.randint(low=X.shape[0], size=(train_size,))] = True  # intended to mark True at 70% of the places
The problem is that np.sum(train_ind) does not give the expected value of 7000. Instead it changes from run to run and gives values such as 5033.
I initially thought that np.random.randint(low = X.shape[0], size = (train_size,)) might be the culprit, but np.random.randint(low = X.shape[0], size = (train_size,)).shape gives (7000,), as expected.
Where am I going wrong?
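
One possible explanation, offered as a sketch rather than a definitive diagnosis: np.random.randint draws indices with replacement, so the 7000 indices can contain repeats, and setting the same position to True twice still counts only once. The check below counts the distinct indices and, as one alternative, draws the indices with np.random.choice and replace=False. The names n_rows, idx_with_repl, idx_no_repl and the two mask variables, as well as the fixed seed, are illustrative assumptions and not part of the code above.

```python
import numpy as np

n_rows = 10000      # matches X.shape[0] in the question
train_size = 7000   # 70% of the rows

np.random.seed(0)   # arbitrary seed, only to make this sketch repeatable

# Indices drawn the same way as in the question: sampling WITH replacement.
idx_with_repl = np.random.randint(low=n_rows, size=(train_size,))
n_unique = np.unique(idx_with_repl).size  # number of distinct positions hit

mask_with_repl = np.zeros(n_rows, dtype=bool)
mask_with_repl[idx_with_repl] = True      # repeated indices set the same cell again

# Alternative: sample WITHOUT replacement, so all 7000 indices are distinct.
idx_no_repl = np.random.choice(n_rows, size=train_size, replace=False)
mask_no_repl = np.zeros(n_rows, dtype=bool)
mask_no_repl[idx_no_repl] = True

print(idx_with_repl.shape)                # (7000,), although some values repeat
print(n_unique, mask_with_repl.sum())     # both numbers are equal and below 7000
print(mask_no_repl.sum())                 # 7000
```

If repeated indices are indeed the cause, the count from the original approach should average about 10000 * (1 - exp(-0.7)), which is roughly 5034 and is consistent with the 5033 reported above. Shuffling with np.random.permutation(n_rows) and taking the first train_size entries would be another way to get distinct indices.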
 
    