Two things:
- 10000 is not a lot on a modern computer. That loop will probably run in less than a millisecond, which is below the resolution of clock(). The start and end values are then identical, so the computed time is zero.
 
- If you aren't using the result of non_parallel, it's possible that the entire loop will be optimized out by the compiler.
 
Most likely, you just need a more expensive loop. Try increasing ARRAY_SIZE to something much larger.
Here's a test on my machine with a larger array size:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ARRAY_SIZE 100000000
int main(){
    clock_t start, end;
    // Note: vec is never filled in, so its contents are indeterminate; only the timing matters here.
    double *non_parallel = (double*)malloc(ARRAY_SIZE * sizeof(double));
    double *vec          = (double*)malloc(ARRAY_SIZE * sizeof(double));
    start = clock();
    for(int i = 0; i < ARRAY_SIZE; i++) 
    {
        non_parallel[i] = vec[i] * vec[i];
    }
    end = clock();
    printf( "Number of seconds: %f\n", (end-start)/(double)CLOCKS_PER_SEC );
    free(non_parallel);
    free(vec);
    return 0;
}
Output:
Number of seconds: 0.446000