Here is an example piece of code:
#include <stdint.h> 
#include <iostream>
typedef struct {
    uint16_t low;
    uint16_t high;
} __attribute__((packed)) A;
typedef uint32_t B;
int main() {
    //simply to make the answer unknowable at compile time
    uint16_t input;
    std::cin >> input;
    A a = {15, input};
    B b = 0x000f0000 + input;
    // a and b hold the same two 16-bit values (15 and input)
    int resultA = a.low - a.high;
    int resultB = (b & 0xffff) - ((b >> 16) & 0xffff);
    //use the variables so the optimiser doesn't get rid of everything
    return resultA+resultB;
}
Both resultA and resultB do essentially the same calculation - but which is faster (assuming the values aren't known at compile time)?
I tried using Compiler Explorer to look at the output, and I got something - but with any optimisation enabled, no matter what I tried, it outsmarted me and optimised the whole calculation away. At first it optimised everything away simply because the results weren't used, so I added the cin read to make the values unknowable at compile time - but then I couldn't figure out how it was getting the answer at all (I think it still managed to work it out at compile time somehow?).
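As an aside, the one way I know of to reliably stop the compiler discarding a result is to write it through a volatile object - a volatile store can't be removed, so whatever feeds it has to be computed. A minimal sketch (just the idea, not the exact code I posted above):

#include <cstdint>
#include <iostream>
// volatile: stores to this object can't be optimised away
volatile int sink;
int main() {
    uint16_t input;
    std::cin >> input;                                  // value unknown at compile time
    uint32_t b = 0x000f0000 + input;
    int result = (b & 0xffff) - ((b >> 16) & 0xffff);   // the shift/mask version
    sink = result;                                      // forces the calculation to survive
    return 0;
}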
Here is the output of Compiler Explorer with no optimisation flag:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 32
        mov     dword ptr [rbp - 4], 0
        movabs  rdi, offset std::cin
        lea     rsi, [rbp - 6]
        call    std::basic_istream<char, std::char_traits<char> >::operator>>(unsigned short&)
        mov     word ptr [rbp - 16], 15
        mov     ax, word ptr [rbp - 6]
        mov     word ptr [rbp - 14], ax
        movzx   eax, word ptr [rbp - 6]
        add     eax, 983040
        mov     dword ptr [rbp - 20], eax
# Begin calculating result A
        movzx   eax, word ptr [rbp - 16]
        movzx   ecx, word ptr [rbp - 14]
        sub     eax, ecx
        mov     dword ptr [rbp - 24], eax
# End of calculation
# Begin calculating result B
        mov     eax, dword ptr [rbp - 20]
        mov     edx, dword ptr [rbp - 20]
        shr     edx, 16
        mov     ecx, 65535
        sub     ecx, edx
        and     eax, ecx
        and     eax, 65535
        mov     dword ptr [rbp - 28], eax
# End of calculation
        mov     eax, dword ptr [rbp - 24]
        add     eax, dword ptr [rbp - 28]
        add     rsp, 32
        pop     rbp
        ret
I will also post the -O1 output, but I can't make any sense of it (I'm quite new to low-level assembly).
main:                                   # @main
        push    rax
        lea     rsi, [rsp + 6]
        mov     edi, offset std::cin
        call    std::basic_istream<char, std::char_traits<char> >::operator>>(unsigned short&)
        movzx   ecx, word ptr [rsp + 6]
        mov     eax, ecx
        and     eax, -16
        sub     eax, ecx
        add     eax, 15
        pop     rcx
        ret
Something to consider: while doing operations on the two halves is slightly harder with the integer, simply accessing the whole thing as an integer is easier than with the struct (which you'd have to reassemble with bitshifts, I think?). Does this make a difference?
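To spell out what I mean by "convert with bitshifts", here's a sketch of the conversion in each direction (assuming the packed struct lines up with a little-endian 32-bit value, as I believe it does on x86):

#include <cstdint>
typedef struct {
    uint16_t low;
    uint16_t high;
} __attribute__((packed)) A;
// struct -> whole 32-bit value: needs a shift and an OR
uint32_t whole_from_struct(A a) {
    return ((uint32_t)a.high << 16) | a.low;
}
// integer -> halves: needs a mask and a shift
uint16_t low_half(uint32_t b)  { return b & 0xffff; }
uint16_t high_half(uint32_t b) { return b >> 16; }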
This originally came up in the context of memory, where I saw someone map a memory address to a struct with one field for the low bits and one for the high bits. I thought this couldn't possibly be faster than simply using an integer of the right size and bitshifting whenever you need the low or high bits. In this specific situation - which is faster?
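To make the comparison concrete, here's roughly what the two styles look like (type and function names are just made up for illustration):

#include <cstdint>
typedef struct {
    uint16_t low;
    uint16_t high;
} __attribute__((packed)) Reg;
// style 1: overlay a struct on the memory-mapped value and read a field
uint16_t read_low_via_struct(volatile Reg *reg) {
    return reg->low;
}
// style 2: treat it as one 32-bit integer and mask (or shift for the high half)
uint16_t read_low_via_int(volatile uint32_t *reg) {
    return *reg & 0xffff;
}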
[Why did I add C to the tag list? While the example code I used is in C++, the concept of struct vs plain integer applies just as much to C]