I don't know about Metal specifically, but in ordinary C you'd want to put f and byteArray inside a union.
Here's some sample code:
#include <stdio.h>
#include <stdint.h>
union float_byte {
    float f;
    uint8_t byteArray[sizeof(float)];
};
union float_byte u;
void
dotest(float f)
{
    u.f = f;
    printf("%.6f",u.f);
    for (size_t idx = 0;  idx < sizeof(u.byteArray);  ++idx)
        printf(" %2.2X",u.byteArray[idx]);
    printf("\n");
}
int
main(void)
{
    dotest(2.1);
    dotest(7.6328);
    return 0;
}
Here's the program output:
2.100000 66 66 06 40
7.632800 E6 3F F4 40
UPDATE:
even today, isn't it still technically UB to read a union member that wasn't the last one written to? Although, it sounds like this is widely supported now with implementation-defined behavior. One of many related questions: stackoverflow.com/questions/2310483/… – yano
No, it's not UB for a number of reasons.
It might be "implementation-defined" behavior, but only in the sense that the byte values depend on the processor's endianness and the in-memory format of a [32-bit] float, and that only matters if we wish to interpret the bytes in byteArray.
But, AFAICT, that doesn't affect OP's issue since the point was just to get a byte buffer [for binary/serialization of the data?].
If one wanted to interpret the data (e.g. designing a DIY F.P. S/W implementation), then the format and endianness of the float would have to be known. The format is [probably] IEEE 754, and the byte order is that of the processor.
But, using the union to just get a byte pointer, there is no issue. It's not even "implementation defined" behavior.
It's just about the same as:
float my_f = f;
uint8_t *byteArray = (uint8_t *) &my_f;
And it works because it has to work [by design]: the union [as used here] is a common idiom dating back to the 1970s, so it has to be supported.
If we had:
void
funcA(float f)
{
    u.f = f;
}
void
funcB(void)
{
    for (size_t idx = 0;  idx < sizeof(u.byteArray);  ++idx)
        printf(" %2.2X",u.byteArray[idx]);
    printf("\n");
}
For fun, assume funcA and funcB are in separate .c files. When funcA is called it changes the memory of u [in a predictable way].
Nothing in between changes the layout of u.
Then, we call funcB. The layout of the bytes in byteArray will be the same/predictable data.
This is similar to and works the same way as writing the float to a file as binary data:
#include <unistd.h>
int fd;
void
writefloat(float f)
{
    float my_f = f;
    write(fd,&my_f,sizeof(float));
}
void
writefloat2(float f)
{
    write(fd,&f,sizeof(float));
}
void
writefloat3(float f)
{
    write(fd,&f,sizeof(f));
}
Perhaps this would be easier to see if we used uint32_t instead of float. We could do an endian test. [Note: this is crude and doesn't account for oddball byte orders such as the PDP-11's mixed-endian layout]:
#include <stdio.h>
#include <stdint.h>
union uint_byte {
    uint32_t i;
    uint8_t b[sizeof(uint32_t)];
};
union uint_byte u;
int
main(void)
{
    u.i = 0x01020304;
    if (u.b[0] == 0x04)
        printf("cpu is little-endian\n");
    else
        printf("cpu is big-endian\n");
    return 0;
}