A computer has 2 physical cores and 4 logical cores (for example, a computer with an i5-3210M processor).
When program A runs, htop shows that it uses 100% of one logical core while the other three are nearly idle. The throughput in this case is X.
My question is: if I run 4 instances of A, one on each of the 4 logical cores, will the total throughput be 4X or 2X? What if I run only two instances?
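
One way to check this empirically is to time 1, 2, and 4 simultaneous instances and compare the totals against the single-instance baseline. Below is a minimal sketch, assuming A can be approximated by a purely CPU-bound loop. `WORKLOAD` and `run_instances` are hypothetical names used only for illustration, and the real program A may behave differently (for example, if it is memory-bound). The sketch does not pin instances to specific logical cores, so placement is left to the OS scheduler, and process start-up time is included in the measurement.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for program A: a purely CPU-bound loop.
WORKLOAD = "x = 0\nfor i in range({n}):\n    x += i * i\n"


def run_instances(count, iterations=2_000_000):
    """Start `count` copies of the workload at once.

    Returns the combined throughput in loop iterations per second.
    """
    code = WORKLOAD.format(n=iterations)
    start = time.perf_counter()
    # Launch every instance first, then wait, so they run concurrently.
    procs = [subprocess.Popen([sys.executable, "-c", code]) for _ in range(count)]
    for p in procs:
        p.wait()
    elapsed = time.perf_counter() - start
    return count * iterations / elapsed


if __name__ == "__main__":
    baseline = run_instances(1)  # this is "X"
    for count in (1, 2, 4):
        ratio = run_instances(count) / baseline
        print(f"{count} instance(s): {ratio:.2f}X")
```

The printed ratios show the total throughput as a multiple of X on the machine where the script runs. Repeating the runs a few times and raising `iterations` should reduce the noise from start-up cost.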