Remember that C++ takes the "don't pay for what you don't use" philosophy to heart. Consider, for instance, an imaginary platform that uses an opaque type to represent a mutex; let's call that type mutex_t. If the interface that operates on that mutex takes mutex_t* arguments, for instance void mutex_init(mutex_t* mutex); to 'construct' a mutex, it might very well be the case that the address of the mutex is what uniquely identifies it. If so, mutex_t is not copyable:
mutex_t kaboom()
{
    mutex_t mutex;
    mutex_init(&mutex);
    return mutex; // disaster
}
When doing mutex_t mutex = kaboom(); there is no guarantee that &mutex in the caller has the same value as &mutex inside the function block.
When the day comes that an implementor wants to write an std::mutex for that platform, a requirement that the type be movable would mean that the internal mutex_t has to be placed in dynamically allocated memory, with all the associated penalties.
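To make that cost concrete, here is a minimal sketch of what such a movable wrapper might look like. Everything in it is hypothetical: mutex_t and mutex_init are stand-ins for the imaginary platform API above (the stand-in simply records its own address to model "identity is the address"), and movable_mutex is an illustrative name, not how any real standard library implements std::mutex.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for the imaginary platform API described above.
// The mutex remembers the address it was initialized at, modelling a
// platform where that address is the mutex's identity.
struct mutex_t { const mutex_t* self = nullptr; };
void mutex_init(mutex_t* mutex) { mutex->self = mutex; }

// Hypothetical movable wrapper: the native mutex must live on the heap so
// that its address stays fixed while the wrapper object itself is moved.
class movable_mutex {
public:
    movable_mutex() : handle_(new mutex_t)
    {
        mutex_init(handle_.get());
        assert(valid());
    }
    movable_mutex(movable_mutex&&) = default;
    movable_mutex& operator=(movable_mutex&&) = default;

    const mutex_t* native_handle() const { return handle_.get(); }

    // True while the native mutex still sits at the address it was
    // initialized at (false for a moved-from wrapper).
    bool valid() const { return handle_ && handle_->self == handle_.get(); }

private:
    std::unique_ptr<mutex_t> handle_;  // one heap allocation per mutex
};

// Moves the wrapper and checks that the native mutex did not change address.
bool identity_survives_move()
{
    movable_mutex a;
    const mutex_t* before = a.native_handle();
    movable_mutex b = std::move(a);
    return b.valid() && b.native_handle() == before;
}
```

Every movable_mutex pays for one allocation and one extra indirection, whether or not it is ever moved.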
On the other hand, while right now std::mutex is not movable, it is very easy to 'return' one from a function: return an std::unique_ptr<std::mutex> instead. This still pays the cost of dynamic allocation, but only in one spot. All the other code that doesn't need to move an std::mutex doesn't have to pay for it.
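A minimal sketch of the std::unique_ptr approach, assuming C++14 for std::make_unique. The names make_mutex and address_survives_move are made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <type_traits>
#include <utility>

// std::mutex itself can be neither copied nor moved.
static_assert(!std::is_copy_constructible<std::mutex>::value, "not copyable");
static_assert(!std::is_move_constructible<std::mutex>::value, "not movable");

// Hypothetical factory: the mutex lives on the heap, so only the owning
// pointer is moved out of the function, never the mutex itself.
std::unique_ptr<std::mutex> make_mutex()
{
    return std::make_unique<std::mutex>();
}

// Moves the owning pointer around and checks that the mutex stayed put.
bool address_survives_move()
{
    std::unique_ptr<std::mutex> first = make_mutex();
    std::mutex* const address = first.get();

    std::unique_ptr<std::mutex> second = std::move(first);
    assert(first == nullptr);  // a moved-from unique_ptr is empty

    std::lock_guard<std::mutex> lock(*second);  // still a usable mutex
    return second.get() == address;
}
```

Only callers that go through make_mutex pay for the allocation; a plain std::mutex member or local variable costs nothing extra.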
In other words, since moving a mutex isn't a core operation of what a mutex is about, not requiring std::mutex to be movable doesn't remove any functionality (thanks to the non-movable T => movable std::unique_ptr<T> transformation) and will incur minimal overhead costs over using the native type directly.
std::thread could have been similarly specified to not be movable, which would have reduced its typical lifetime to two states: running (associated with a thread of execution), after the call to the constructor that starts the thread; and detached/joined (associated with no thread of execution), after a call to join or detach. As I understand it, an std::vector<std::thread> would still have been usable since the type would have been EmplaceConstructible.
edit: Incorrect! The type would still need to be movable (std::vector has to move its elements when it reallocates, after all). So to me that's rationale enough: it's typical to put std::thread into containers like std::vector and std::deque, so the functionality is welcome for that type.
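A small sketch of that typical use, to show where movability comes in. The function name run_workers is made up for illustration; the vector deliberately skips reserve() so that it has to reallocate, and therefore move the already-started std::thread objects, as it grows.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <type_traits>
#include <vector>

// std::thread is movable but not copyable.
static_assert(std::is_move_constructible<std::thread>::value, "movable");
static_assert(!std::is_copy_constructible<std::thread>::value, "not copyable");

// Starts `count` threads, stores them in a vector, joins them all, and
// returns how many of them ran.
int run_workers(int count)
{
    assert(count >= 0);
    std::atomic<int> finished{0};

    std::vector<std::thread> workers;  // no reserve(): growth reallocates
    for (int i = 0; i < count; ++i) {
        // Constructs the thread in place; earlier elements are moved
        // whenever the vector runs out of capacity.
        workers.emplace_back([&finished] { ++finished; });
    }

    for (std::thread& worker : workers) {
        worker.join();
    }
    return finished.load();
}
```

Calling run_workers(8) starts eight threads and returns 8 once they have all been joined.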