
I've seen some similar questions, but they deal with files under 1GB, and the answers generally recommend services such as Dropbox, S3 and SkyDrive. These do not appear to be suitable for my needs.

I have a very large dataset (Dota 2 public matchmaking history) which, in its raw form in MongoDB (without indexes), is around 800GB. By dumping it and compressing with 7-Zip at the Ultra level, I can get it down to around 9-10% of its original size, which is roughly 80GB for distribution. I am looking for a way to make these compressed files publicly available, but I am unsure of the best way to distribute them. I can split the files into smaller pieces by dumping with a query; this has a negligible impact on the compression ratio.
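To illustrate what I mean by dumping with a query, here is a minimal sketch that only builds the `mongodump` command lines for consecutive ID ranges (it does not run them). The database name `dota2`, the collection name `matches`, the field `match_id`, and the range sizes are all placeholders, not my actual schema.

```python
import json

def mongodump_commands(db, collection, id_field, start, stop, step, out_prefix):
    """Build one mongodump argument list per [lo, hi) range of id_field."""
    commands = []
    for lo in range(start, stop, step):
        hi = min(lo + step, stop)
        # Each chunk selects documents with lo <= id_field < hi.
        query = json.dumps({id_field: {"$gte": lo, "$lt": hi}})
        commands.append([
            "mongodump",
            "--db", db,
            "--collection", collection,
            "--query", query,
            "--out", f"{out_prefix}_{lo}_{hi}",
        ])
    return commands

# Placeholder names and ranges, for illustration only.
cmds = mongodump_commands("dota2", "matches", "match_id", 0, 3_000_000, 1_000_000, "dump")
for cmd in cmds:
    print(" ".join(cmd))
```

Each resulting dump directory could then be compressed separately, so that people can download only the ranges they need.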

My home internet has a very slow upload speed (1.3Mbps max, often throttled), so I would prefer not to seed a torrent from my home connection.
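For a rough sense of scale, this is the back-of-the-envelope calculation behind that. It assumes decimal units (1GB = 10^9 bytes, 1Mbps = 10^6 bits per second), the full 1.3Mbps sustained with no throttling, and no protocol overhead, so the real figure would be worse.

```python
def upload_days(size_gb, mbps):
    """Days needed to upload size_gb gigabytes once at mbps megabits per second."""
    bits = size_gb * 1e9 * 8          # decimal gigabytes to bits
    seconds = bits / (mbps * 1e6)     # megabits per second to bits per second
    return seconds / 86400            # seconds to days

days = upload_days(80, 1.3)
print(f"{days:.1f} days")  # prints "5.7 days"
```

That is almost six days just to send a single complete copy to one peer.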

What is the best way of distributing this dataset? Could there be a way to further compress the dataset?

EDIT: Since this question has been marked as a duplicate, I don't think I can answer it anymore. I'm not sure how anyone thinks that this is a duplicate of a question where the accepted answer is Dropbox, but for anyone who stumbles across this question, my best option seems to be as follows:

Use BitTorrent as the transfer protocol, but host the files with a "seedbox" provider. These appear to be VPS providers focused on providing bandwidth and storage space for heavy users of the BitTorrent protocol. On average, enough space and bandwidth for my needs can be had for around $10 a month. To get the files onto the hosting provider, I will copy them to an external drive and then upload them over FTP from multiple locations where I have access to faster internet connections.

Oliver Salzburg
Charles A
