0

I've recently been working on wifi sniffer using ESP8266, in order to sniff wifi packets there is a function called wifi_set_promiscuous_rx_cb(wifi_sniffer_packet_handler) it takes a callback function as a parameter and passes buffer pointer, which has the packet info and length of the packet as parameters to the callback function , wifi_sniffer_packet_handler(uint8_t *buff, uint16_t len) is the call back function. i am not understanding what these two statements are doing const wifi_promiscuous_pkt_t *ppkt = (wifi_promiscuous_pkt_t *)buff; const wifi_ieee80211_packet_t *ipkt = (wifi_ieee80211_packet_t *)ppkt->payload; wifi_promiscuous_pk_t is a structure

 typedef struct{
    wifi_pkt_rx_ctrl_t rx_ctrl; /**< metadata header */
    uint8_t payload[0];       /**< Data or management payload. Length of payload is described by rx_ctrl.sig_len. Type of content determined by packet type argument of callback. */
} wifi_promiscuous_pkt_t;

and wifi_ieee80211_packet_t is another structure

typedef struct
{
    wifi_ieee80211_mac_hdr_t hdr;
    uint8_t payload[2]; /* network data ended with 4 bytes csum (CRC32) */
} wifi_ieee80211_packet_t;

how the data in the *buff is assigned to these structures, are the above statements responsible for the assignment i've seen many stackoverflow questions and many other threads regarding this but none of the posts clarified my doubt

yajna Rohit
  • 21
  • 1
  • 3
  • 2
    Choose one language, C or C++, and delete the other tag. Do not tag both C and C++ except when asking about differences or interactions between the two languages. – Eric Postpischil Jun 04 '21 at 09:35
  • This seems tobe C code (or not well written C++ code). In C++ you should write this as `autoppkt = reinterpret_cast(buff);` You can use `reinterpret_cast` to inspect a type bytewise and the reverse conversion happening here is possible too, see https://en.cppreference.com/w/cpp/language/reinterpret_cast – fabian Jun 04 '21 at 09:40

3 Answers3

0

The packet coming from the radio presumably has a format like this:

| metadata | mac hdr | mac payload                   |

This is stored initially as a uint8_t array pointed to by buff.

Now, you typecast the buffer into the first structure type:

const wifi_promiscuous_pkt_t *ppkt = (wifi_promiscuous_pkt_t *)buff;

This is what our buffer conceptually looks like now:

|<---------------------- buff ---------------------->| buff
|<-- rx_ctrl -->|<------- payload ------------------>| ppkt 

The structure is kind of overlaid on top of the byte array and since the metadata is always a fixed size, ppkt->rx_ctrl will now contain this metadata header.

But what about payload? The declaration indicates payload to be a zero sized array. This is special syntax to tell the compiler that this is a 'placeholder' for an arbitrarily large array of user-managed memory. (Please note that this is an outdated GCC extension and might produce compile errors on most modern compilers. The post C99 approach is to simply leave out the array dimension like this: uint8_t payload[])

Hence ppkt->payload is now an array starting with the first byte beyond the metadata. Since C doesn't bother with bounds checking, you can access any byte of this payload with ppkt->payload[i]

Now, we need to get to the mac payload by parsing out the mac header, which is again of a fixed size. This is accomplished by:

const wifi_ieee80211_packet_t *ipkt = (wifi_ieee80211_packet_t *)ppkt->payload;

In this case, we discard the header (rx_ctrl) and overlay the wifi_ieee80211_packet_t struct on top of the payload portion of the buffer.

Looking at our buffer again, this is what has happened to it:

|<---------------------- buff ---------------------->| buff
|<-- rx_ctrl -->|<------- payload ------------------>| ppkt
                |<-- hdr -->|<----- payload -------->| ipkt

Now you can easily get to the mac payload in the original packet with ipkt->payload.

This is a very common idiom used when parsing packets in networking protocol stacks.

PS: However, please note that typecasting between incompatible types is risky business. Firstly, you have to understand and account for the alignment of structure members. Also, depending on how your structures have been defined, this might result in violation of strict aliasing rules and hence, invoke undefined behaviour on certain compilers.

th33lf
  • 2,177
  • 11
  • 15
  • 1
    While this may be what the code is attempting to do, the behavior of accessing a buffer defined as an array of `uint8_t` through types of `wifi_promiscuous_pkt_t` or `wifi_ieee80211_packet_t` is not defined by the C standard, and any answer should state this and warn about continuing to use the code without addressing the issue. Additionally, the zero-length array is an outdated compiler extension, and this should be explained. – Eric Postpischil Jun 04 '21 at 09:32
  • @EricPostpischil Agree with the zero-length array, updated the post. There is nothing inherently UB about typecasting a byte array into a struct as long as you respect alignment. This is quite standard and used heavily in lots of networking code. – th33lf Jun 04 '21 at 09:59
  • 1
    Accessing an object with an effective type of array of `uint8_t` through an lvalue with structure type violates C 2018 6.5 7, which begins “An object shall have its stored value accessed only by an lvalue expression that has one of the following types…” Violating this rule renders the behavior undefined by the C standard per C 2018 4 2: “If a "shall" or "shall not" requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined…” The fact it is heavily used in networking code does not alter the rules in the C standard. – Eric Postpischil Jun 04 '21 at 10:06
  • 1
    One way of addressing the issue is to ensure that the C implementation being used supports this aliasing. Authors of such programs should provide documentation (in comments in the source code or separate) that draws attention to this issue and the necessity of requiring support from the C implementation (typically the compiler), and users of such programs (those who compile and build them) should check that their compiler does provide such a guarantee beyond the requirements of the C standard. – Eric Postpischil Jun 04 '21 at 10:09
  • The OP is not asking for the best way to do what is being done, he is asking for what the code is doing so there is no reason to downvote my answer because it does answer the question. Also, if the first member of those structs is a char and the other elements account for alignment (or are declared packed) then it DOES NOT violate the standard, since: "an aggregate or union type that includes one of the aforementioned types among its members" is also legal! – th33lf Jun 04 '21 at 10:32
  • @EricPostpischil Added a note on potential strict aliasing violation. – th33lf Jun 04 '21 at 10:40
  • @th33lf *Also, if the first member of those structs is a char and the other elements account for alignment (or are declared packed) then it DOES NOT violate the standard, since: "an aggregate or union type that includes one of the aforementioned types among its members" is also legal!* Are you ***really*** trying to say that just because the structure has a `uint8_t` element in it as its first member, the entire structure can be used to overlay a completely unrelated `uint8_t` (i.e. `unsigned char`) buffer and ***not*** violate strict aliasing? If you say "Yes", you're getting another DV. – Andrew Henle Jun 04 '21 at 10:49
  • @th33lf: Re “he is asking for what the code is doing so there is no reason to downvote”: The code is **not** doing what this answer says it is doing if the compiler does not support aliasing. – Eric Postpischil Jun 04 '21 at 11:31
  • 1
    Re “Also, if the first member of those structs is a char and the other elements account for alignment (or are declared packed) then it DOES NOT violate the standard, since: "an aggregate or union type that includes one of the aforementioned types among its members" is also legal!”: If the code merely accessed the data as a structure (aggregate) that included such a member, maybe. But that is not license to access the members of the structure. E.g., maybe it would be legal to copy the entire structure (accessing the structure as a single object), since the lvalue would be an aggregate type… – Eric Postpischil Jun 04 '21 at 11:35
  • 1
    … However, if you access some member, `p->member`, then the lvalue of `p->member` is (generally) not of aggregate type and does not satisfy this portion of the rule. – Eric Postpischil Jun 04 '21 at 11:36
  • @EricPostpischil I think we are over complicating the discussion here. OP has not shared the details of the header structs and a reasonable assumption to make would be that a compiler for Arduino would support basic type-punning use cases which are very commonly used in embedded systems. Nothing in the post suggests that he is having an issue. OP only wanted to know what the code does, which I tried to explain. Also added the note on strict aliasing and zero sized arrays after you pointed them out. I don't see what is missing in the answer anymore so I'm leaving it at this. – th33lf Jun 04 '21 at 14:18
  • @th33lf [Type-punning that you're advocating isn't even safe on x86 systems](https://stackoverflow.com/questions/47510783/why-does-unaligned-access-to-mmaped-memory-sometimes-segfault-on-amd64/47512025#47512025). My standards are higher than "I haven't ***observed*** it fail. ***Yet***." Because that's what your standard is here: you haven't seen it fail so you think it's safe even though the C standard say's it's UB. That's a very low standard to advocate. – Andrew Henle Jun 04 '21 at 14:57
0

The goal of converting from a pointer to bytes that have been read from the network to a pointer to a structure is to interpret the bytes as that structure type. This relies on compiler features that are not required by the C standard.

The code (wifi_promiscuous_pkt_t *)buff converts the pointer buff to a pointer to the type wifi_promiscuous_pkt_t. This relies on buff being aligned as required for the structure,1 and it requires that the conversion produce a pointer to the same place in memory but with a different type.2

Then const wifi_promiscuous_pkt_t *ppkt = (wifi_promiscuous_pkt_t *)buff; defines a new pointer ppkt and initializes it to point to the same memory buff points to. The goal is to have references such as ppkt->rx_ctrl interpret the data as the data for the wifi_promiscuous_pkt_t structure instead of as the raw uint8_t bytes.

The C standard does not define the behavior here.3 Compilers may support this by defining the behavior themselves. Notably, the compiler needs to support aliasing memory: Accessing memory using a type other than the original type used to define or store data there.

For illustration, let’s say the type of rx_ctrl, wifi_pkt_rx_ctrl_t, is a four-byte int. Then, when these requirements above are met, ppkt->rx_ctrl_t will access the first four bytes at buff and interpret them as if they were an int.

(If those requirements are not met, then a C-standard way to reinterpret the bytes is to make a new object and copy the data into it: wifi_promiscuous_pkt_t temp; memcpy(temp, buff, sizeof temp + extra for the payload length);, after which temp.rx_ctrl will provide the int value. However, depending on circumstances and compiler quality, that can cause a lot of extra memory copying we would like to avoid. Accessing the memory directly is preferable, if it is supported. Another alternative is to read the data directly from the network into the target object instead of into a buffer of bytes.)

After this, const wifi_ieee80211_packet_t *ipkt = (wifi_ieee80211_packet_t *)ppkt->payload; does the same thing with ppkt->payload instead of buff. Once ppkt has been set up, ppkt->payload points to the bytes starting after the rx_ctrl member.4 This pointer is then converted to point to the type wifi_ieee80211_packet_t, and ipkt is initialized with the new pointer.

As before, the intent is that ipkt->hdr and ipkt->payload will refer to the data in the first structure’s payload area as if that data contained a wifi_ieee80211_packet_t structure. However, there is another problem here. As noted above, buff may pointed to well-aligned memory if it was allocated with malloc. But ipkt does not point to the same place as buff. It points to ppkt->payload, which is some number of bytes after the start because it follows the rx_ctrl member. So we do not know, from this code alone, that ipkt is properly aligned for a wifi_ieee80211_packet_t. Possibly it is, if the size of the rx_ctrl member is a multiple of the alignment requirement for the hdr member.

Without comments in the code speaking to this, this requirement may have been neglected. The code might work because rx_ctrl is an okay size for this to work, but it might generate an alignment trap. If not now, then possibly in the future when the types involved change.

Footnotes

1 If buff points to the start of memory allocated with malloc or another standard memory allocation routine, it will be correctly aligned for any fundamental type. Or, if it was defined as an array, correct alignment can be requested with the standard _Alignas keyword or possibly with a compiler extension. If it was merely defined as an ordinary array of uint8_t, is not guaranteed to be aligned as required for this use.

2 For general pointer-to-object conversions, the standard only requires that a pointer converted to another pointer-to-object type and then back to its original type points to the original object. It does not guarantee the pointer is otherwise usable in the other type. But this is common in C implementations.

3 In general, accessing an array of uint8_t using other types violates the rule in C 2018 6.5 7, which says that memory which has been defined with a certain type, or, for dynamic memory, assigned with a certain type, shall be accessed only by certain “matching” types or by a “character” (byte) type.

4 Formally, ppkt->payload designates an array, and that array is automatically converted to a pointer. Also, the array is declared to have zero elements. That is not defined by the C standard, but GCC and other compilers supported it as a way to allow variable length data at the end of a structure. Since C 1999, the standard way to do this is to declare the member with [] instead of [0].

Eric Postpischil
  • 195,579
  • 13
  • 168
  • 312
0

When I used the structure:

|<-- rx_ctrl -->|<------- payload ------------------>|

I get a misalignment in the payload data by a shift of 2 bytes to the left. When I replace rx_ctrl part with a field e.g. unsigned frame_ctrl:16 I do not see this shift. I do have to typecast frame_ctrl to rx_ctrl to get the details.

I assume this is because I use esp32-S2 which assumes teh rx_ctrl is 32 bits regardless of its definition as 16 bits.

Yu Hao
  • 119,891
  • 44
  • 235
  • 294