Networking isn't quite as simple as it seems. Think of a layer cake: there's a lot happening under the surface. It's important to understand that communication has two levels: physical and logical.
The LAN technologies we're talking about are defined by IEEE 802 standards: Ethernet is governed by 802.3, WiFi by 802.11.
The first layer is Physical, OSI Model Layer 1. Here we have only electricity, radio waves and light: actual physical cables, antennas and interfaces. Everything on this layer is dumb; the devices have no awareness of each other whatsoever.
On top of physical is the Data Link Layer, OSI Model Layer 2. This is where the ones and zeros are organized into frames before the physical layer turns them into modulations of the signal, and vice versa. This is the minimum requirement for devices to be aware of each other, and to be able to communicate at all. Because multiple interfaces can be connected to a single L2 network segment, each interface must have a unique L2 identifier, i.e. a hardware address: the MAC address.
The Protocol Data Unit (PDU) on this layer is called a frame, and its simplified structure is:
+--------------+-----------------+---------+------------+
| SenderL2Addr | RecipientL2Addr | Payload | ErrorCheck |
+--------------+-----------------+---------+------------+
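To make the diagram a bit more concrete, here's a rough Python sketch of building and checking such a frame. It follows the simplified field order of the diagram, not the real 802.3 layout (real Ethernet puts the recipient address first and also carries a type/length field). The function names and the example MAC addresses are made up for illustration, and the error check is just a plain CRC-32 from zlib.

```python
import struct
import zlib


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    raw = bytes.fromhex(mac.replace(":", ""))
    if len(raw) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return raw


def build_frame(sender: str, recipient: str, payload: bytes) -> bytes:
    """Lay out Sender | Recipient | Payload | ErrorCheck, as in the diagram."""
    body = mac_to_bytes(sender) + mac_to_bytes(recipient) + payload
    # Append a 4-byte CRC-32 over everything that precedes it.
    return body + struct.pack("!I", zlib.crc32(body))


def parse_frame(frame: bytes):
    """Split a frame into (sender, recipient, payload); raise if the check fails."""
    body = frame[:-4]
    (checksum,) = struct.unpack("!I", frame[-4:])
    if zlib.crc32(body) != checksum:
        raise ValueError("frame corrupted in transit")
    return body[:6], body[6:12], body[12:]


# Hypothetical locally administered addresses, purely for the example.
frame = build_frame("02:00:00:00:00:01", "02:00:00:00:00:02", b"hello")
sender, recipient, payload = parse_frame(frame)
```

The receiving interface does the same thing in reverse: it recomputes the check, drops the frame if it doesn't match, and only then looks at the addresses.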
L2 frames can only be forwarded within the same subnet and broadcast domain. Devices in the same subnet and broadcast domain communicate using L2 addressing.
Next one up is the Network Layer, OSI Model Layer 3. This layer handles transmissions between networks. To allow communication on this layer, each device must have a unique L3 identifier: the IP address.
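The split between "same subnet, use L2 directly" and "other network, go through L3" can be sketched with Python's standard ipaddress module. This is only an illustration of the decision a sender makes. The function name is hypothetical, the addresses are examples, and the assumption that there's a single default gateway (router) for everything off-subnet is mine.

```python
import ipaddress


def next_hop(own_interface: str, destination: str, gateway: str) -> str:
    """Return the IP whose L2 address the outgoing frame should be sent to.

    own_interface is the sender's address with prefix, e.g. '192.168.1.10/24'.
    gateway is the assumed default router for anything outside the subnet.
    """
    iface = ipaddress.ip_interface(own_interface)
    dst = ipaddress.ip_address(destination)
    if dst in iface.network:
        # Same subnet: the frame goes straight to the destination's interface.
        return str(dst)
    # Different network: the frame goes to the router, which forwards the packet.
    return gateway


local = next_hop("192.168.1.10/24", "192.168.1.20", "192.168.1.1")
remote = next_hop("192.168.1.10/24", "203.0.113.5", "192.168.1.1")
```

Here `local` is the destination itself and `remote` is the gateway. In both cases the L3 destination address stays the same; only the L2 recipient differs.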
The protocols on this layer neither have nor need any knowledge of the physical medium, and they don't care about physical transmission; that's L2's job. The PDU on this layer is called a packet, and its simplified structure is:
+--------------+-----------------+---------+------------+
| SenderL3Addr | RecipientL3Addr | Payload | ErrorCheck |
+--------------+-----------------+---------+------------+
During transmission the L3 packet is encapsulated in the L2 frame as its payload. So when the data hits the wire, the frame actually looks like this:
+-----------+-----------+-----------+-----------+---------+--------+
| SndL2Addr | RcpL2Addr | SndL3Addr | RcpL3Addr | Payload | ErrChk |
+-----------+-----------+-----------+-----------+---------+--------+
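Here's a small Python sketch of that encapsulation, assuming IPv4 addresses, the simplified field order from the diagram, and a single CRC-32 error check at the end of the frame. The function names and all the addresses are made up for illustration; real IP and Ethernet headers carry a lot more fields than this.

```python
import ipaddress
import struct
import zlib


def build_packet(src_ip: str, dst_ip: str, data: bytes) -> bytes:
    """Simplified L3 packet: SenderL3Addr | RecipientL3Addr | Payload."""
    src = ipaddress.ip_address(src_ip).packed  # 4 bytes for IPv4
    dst = ipaddress.ip_address(dst_ip).packed
    return src + dst + data


def encapsulate(src_mac: bytes, dst_mac: bytes, packet: bytes) -> bytes:
    """Wrap the whole L3 packet as the payload of a simplified L2 frame."""
    body = src_mac + dst_mac + packet
    return body + struct.pack("!I", zlib.crc32(body))


packet = build_packet("192.168.1.10", "192.168.1.20", b"hello")
frame = encapsulate(bytes.fromhex("020000000001"),
                    bytes.fromhex("020000000002"),
                    packet)

# Field positions on the "wire", matching the diagram above:
snd_l2, rcp_l2 = frame[0:6], frame[6:12]
snd_l3, rcp_l3 = frame[12:16], frame[16:20]
payload, err_chk = frame[20:-4], frame[-4:]
```

Note that L2 never looks inside the packet. To the frame, the IP addresses and the data are all just payload bytes.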
Because IP is an L3 protocol, it has no method for handling actual data forwarding on the physical level. So without L2 protocols data simply cannot move between systems.
Of course, it would be theoretically possible to remove L2 and its addressing from the picture entirely. However, that means we're not talking about 802.3 and 802.11 anymore. New standards would be required, all protocols would need to be rewritten from the ground up, new chipsets would need to be designed and manufacturing processes changed to produce them, then new devices would need to be designed, with new manufacturing processes to produce those...
Excluding certain enterprise-class networking devices, L3 is handled in software but L2 in hardware, i.e. chips. That means your phone wouldn't be able to communicate over WiFi, as its WiFi hardware is designed according to the 802.11 standard.
The only realistic way for the transition to happen would be through the natural device retirement cycle. That's a slllloooowwwww process.
The idea of redesigning networking this way is orders of magnitude more demanding than IPv6 adoption. IPv6 was introduced in 1995, but took off only about a decade ago, and global adoption is still around 40%. So while I can imagine a future where MAC addressing is scrapped (tho' I don't really see a reason why), it's VERY far in the future indeed.