It's tempting to say that __dict__ has to be a descriptor because implementing it as a __dict__ entry would require finding the __dict__ before you can find the __dict__. But Python already bypasses normal attribute lookup to find an object's __dict__ when looking up other attributes, so that argument isn't as compelling as it initially sounds. If the descriptors were replaced with a '__dict__' key in every __dict__, __dict__ would still be findable.
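To make the bypass concrete, here is a small sketch, assuming CPython 3 (the type name getset_descriptor is a CPython detail). Foo and Shadowed are hypothetical classes made up for the illustration. The second class shadows the __dict__ attribute with a property, and ordinary attribute lookup still works, because it never goes through the attribute named __dict__.

```python
class Foo:
    pass

obj = Foo()
obj.x = 1

# The class namespace holds a descriptor named '__dict__';
# the instance's own dict has no such key.
descr = vars(Foo)["__dict__"]
print(type(descr).__name__)          # CPython reports 'getset_descriptor'
print("__dict__" in obj.__dict__)    # False

# Calling the descriptor by hand returns the very same dict object.
print(descr.__get__(obj, Foo) is obj.__dict__)


class Shadowed:
    # Hypothetical: hide the usual descriptor behind a property that lies.
    __dict__ = property(lambda self: {"fake": True})

s = Shadowed()
s.y = 2

# The attribute named __dict__ now returns the fake mapping...
print(s.__dict__)
# ...but attribute lookup reads the real dict slot directly, so s.y still works.
print(s.y)
```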
There are some space savings from not having a '__dict__' key in every __dict__, but that's not the big reason. There's also time saved by not having to set a '__dict__' key, and time and space saved by not creating a circular reference. These benefits are all nice, but they're probably still smaller than the main reason.
The big thing requiring __dict__ to be a descriptor is handling attempts to reassign or delete an object's __dict__. If attribute lookup for __dict__ went through a __dict__ key, then reassigning someobj.__dict__ would reassign the dict key without changing what dict Python actually looks in to find someobj's attributes. __dict__ needs to be a descriptor so it stays in sync with the actual C-level struct slot Python looks in to find an object's dict.
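A rough demonstration of that sync, assuming CPython 3 (Foo is a hypothetical class used only for illustration): assigning to obj.__dict__ goes through the descriptor and swaps the dict Python really uses, whereas storing a '__dict__' key inside the dict just adds an ordinary entry and changes nothing about lookup.

```python
class Foo:
    pass

obj = Foo()
obj.x = 1
old = obj.__dict__

# Reassignment goes through the descriptor and replaces the real dict.
obj.__dict__ = {"y": 2}
print(obj.y)                 # found in the new dict
print(hasattr(obj, "x"))     # False: the old dict is no longer consulted
print(old)                   # the old dict itself is untouched

# By contrast, a '__dict__' key is just another entry in the dict.
obj.__dict__["__dict__"] = {"z": 3}
print(hasattr(obj, "z"))     # False: lookup ignores that key

# The descriptor also validates what gets assigned.
try:
    obj.__dict__ = 42
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False

# Deletion is handled too; on CPython the next access yields a fresh, empty dict.
del obj.__dict__
print(obj.__dict__)
```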