You wrote the solution in your own question; the limit you imposed on yourself is quite low:
I've tried to run btrfs balance start -m and btrfs balance start
-musage=x where x varies from 0 to 50 but none of them helped.
The 50 indicates that you are allowing chunks with up to 100-50=50% unused (wasted) space. If you put 60, you are saying you only want chunks with at most 40% wasted space, so chunks with more free space than that will be merged and freed.
Just use a bigger number. That number indicates what percentage of its space a chunk must have in use; any chunk with lower usage than that is merged with others into new chunks, freeing the old ones.
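To make the threshold idea concrete, here is a rough sketch in shell. It is only a simplified model of the usage filter as I understand it, not the kernel's exact logic, and `select_chunks` is a made-up helper name. It takes a threshold followed by a list of chunk fill percentages and prints the ones that would be picked for merging.

```shell
# Simplified model (an assumption, not the kernel's exact logic):
# a chunk is picked for relocation when its fill percentage is
# below the threshold given to usage=.
select_chunks() {
    threshold=$1
    shift
    for fill in "$@"; do
        if [ "$fill" -lt "$threshold" ]; then
            echo "$fill"
        fi
    done
}

# Four imaginary chunks filled to 10%, 45%, 55% and 80%.
select_chunks 50 10 45 55 80
select_chunks 60 10 45 55 80
```

With 50 the chunk at 55% is left alone; with 60 it is picked up too, which is why a bigger number frees more chunks.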
Just try with 55, 60, 65, 70 ... 85, 90, 95, 100 until you get the desired result.
Or, if you have plenty of time, just use 100 directly; that way practically all chunks will be relocated and as few chunks as possible will be used.
Putting 100 does not mean every chunk (or every chunk but one) will be 100% full. It means all chunks except one will be filled as far as possible, so as many chunks as possible get freed, at the cost of moving around a lot of data / metadata. That is why everyone recommends raising the value in small increments: you move as little data / metadata as possible until you are happy with the wasted space.
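The stepping can be sketched as a small shell helper. This is only a sketch: `balance_steps` is a made-up name, `/mnt` is an assumed mount point, and it only prints the commands instead of running them, so you can read them first and run each one by hand (as root).

```shell
# Hypothetical helper: prints (does not run) one balance command per step.
# Arguments: mount point, first usage value, last usage value, increment.
balance_steps() {
    mountpoint=$1
    usage=$2
    stop=$3
    step=$4
    while [ "$usage" -le "$stop" ]; do
        echo "btrfs balance start -musage=$usage $mountpoint"
        usage=$((usage + step))
    done
}

# /mnt is an assumption; replace it with your real mount point.
balance_steps /mnt 55 70 5
```

Between steps you can check with `btrfs filesystem df` on the same mount point whether enough space has been freed, and stop as soon as you are happy with it.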
I hope one day the documentation will be clearer for novice users like me, and not only for advanced users. It took me a while to discover that the chunk size is 1GiB for data. I was writing a small (<1KiB) file on a new Btrfs RAID 1 (two devices) and, wow, there was 2GiB less free space (one on each device). I thought all my data was getting lost, since I wrote more files and the free size did not change. They were all being written into one whole chunk (really two chunks, one on each device). Eventually I understood that there is a pre-allocation in units of 1GiB.
If a chunk is not full, it still takes 1GiB of space. So if you have two chunks that are 75% full, you are wasting 25% of two 1GiB chunks, that is 256MiB on each chunk, 512MiB in total. Since I am talking about RAID 1, the same happens on both devices, so the total waste is 1GiB out of 4GiB (2 chunks * 1GiB * 2 devices), that is 25% wasted space.
But since you are putting 50 as the value, you are accepting 100%-50%=50% wasted space. If you put 75, then 100%-75%=25%, so only 25% wasted space. And so on.
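The same arithmetic as a small shell sketch, in case you want to plug in your own numbers. `wasted_mib` is a made-up helper, and it assumes that every chunk is 1GiB (1024MiB) and that all chunks are equally full, which is a simplification.

```shell
# Hypothetical helper: wasted MiB for equally filled 1 GiB chunks.
# Arguments: chunks per device, fill percentage, number of devices.
wasted_mib() {
    chunks=$1
    fill_percent=$2
    devices=$3
    chunk_mib=1024   # assumed 1 GiB data chunk size
    echo $(( chunks * devices * chunk_mib * (100 - fill_percent) / 100 ))
}

# Two chunks at 75% on a two-device RAID 1.
wasted_mib 2 75 2
```

For the example above this prints 1024, that is 1GiB wasted out of the 4GiB allocated.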
If you want to minimize wasted space, use a high value such as 99 or 100, but be aware that it implies a lot of data movement because of CoW (Copy on Write). Take extra caution with SSD / NVMe drives, and extra care with USB flash drives, memory cards, etc.
I hope this helps you and others to understand.
Note: If someone knows how to force Btrfs not to allocate new chunks until the existing chunks are full, that would be great for me to know! I mean without manually running a balance.