
I require 64GB to fit an entire dataset in memory for deep learning, but have only 12GB of RAM. Virtual memory being the next-best alternative, I learned it can be effectively increased by increasing the pagefile size - but this source suggests doing so would increase system instability.

All other sources state the contrary, noting only reduced SSD lifespan, which isn't a problem - but I'd rather not take chances. That said, is there a limit to how much the pagefile size can be increased without causing instability?


Additional info: Windows 10, 26GB OS-allocated pagefile (need 52GB + c, where c = a safe minimum of headroom)
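For sizing that headroom, one quick sanity check is to read the system's commit limit (physical RAM plus pagefile) via the GlobalMemoryStatusEx API. A minimal sketch, assuming CPython running on Windows; the printed figures are whatever your system reports, not values from this question:

```python
import ctypes

class MEMORYSTATUSEX(ctypes.Structure):
    _fields_ = [
        ("dwLength", ctypes.c_ulong),
        ("dwMemoryLoad", ctypes.c_ulong),
        ("ullTotalPhys", ctypes.c_ulonglong),
        ("ullAvailPhys", ctypes.c_ulonglong),
        ("ullTotalPageFile", ctypes.c_ulonglong),
        ("ullAvailPageFile", ctypes.c_ulonglong),
        ("ullTotalVirtual", ctypes.c_ulonglong),
        ("ullAvailVirtual", ctypes.c_ulonglong),
        ("ullAvailExtendedVirtual", ctypes.c_ulonglong),
    ]

status = MEMORYSTATUSEX()
status.dwLength = ctypes.sizeof(MEMORYSTATUSEX)
ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status))

GB = 1024 ** 3
print(f"Physical RAM:                  {status.ullTotalPhys / GB:.1f} GB")
# ullTotalPageFile is the current commit limit, i.e. roughly RAM + pagefile
print(f"Commit limit (RAM + pagefile): {status.ullTotalPageFile / GB:.1f} GB")
print(f"Commit still available:        {status.ullAvailPageFile / GB:.1f} GB")
```

If the commit limit already exceeds the target working set plus a margin, a program can at least allocate the memory; whether it runs acceptably is the separate swapping question discussed in the answers below.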


PRE-ANSWER: I proceeded as described here, with ~70GB of memory-mapped data; the average data-load speedup is 42-fold. I suspect this figure could be pushed to ~130, though I won't work on that now unless someone answers this. Lastly, the approach is sustainable and won't degrade the SSD, as the usage is 99.9%+ reads. I will post a full answer with details eventually.
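The OP hasn't posted the implementation yet; below is only a minimal sketch of the memory-mapping idea, assuming the dataset is stored as a single .npy file (hypothetical name) and accessed with numpy:

```python
import numpy as np

# Map the on-disk array instead of reading it into RAM; the OS faults in
# only the 4K pages that are actually touched.
data = np.load("dataset.npy", mmap_mode="r")   # hypothetical file name

# Copy one batch into an ordinary in-memory array for training; the rest
# of the ~70GB stays on disk until accessed.
batch = np.array(data[0:1024])
```

Because the mapping is opened read-only, no page is ever dirtied, which is what keeps the SSD wear down to essentially read traffic.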

2 Answers


The page file supports swapping (a.k.a. paging) 4K blocks of data in RAM, called pages, out to disk and back.
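For reference, the page size and the mapping granularity are system constants you can query from Python's mmap module; nothing here is specific to this setup:

```python
import mmap

print(mmap.PAGESIZE)                # typically 4096 bytes on x86/x64 Windows
print(mmap.ALLOCATIONGRANULARITY)   # typically 65536 bytes; file-mapping offsets align to this
```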

Code that the CPU is running must live in physical RAM. Also, Windows, like other OSes, uses "unused" RAM to cache disk I/O until it is flushed (and if disk data is only read and re-read, it might stay in "unused" RAM for a long time).

In a multitasking operating system, there may be some code that is owned by tasks that are waiting on some event that hasn't happened recently, like user input. It helps system performance to page this out to a disk file and call it back in when the events happen, so that code that is actually doing something on your computer can leverage the free RAM.

Of course, the operating system can also page out code that might actually be doing something but has lower priority, if a sudden request comes in for more memory than the system has. In most cases this is better than denying the program's memory request outright, provided the request doesn't exceed the available physical RAM by too much.

At some point, if you keep allocating memory that isn't there, your program will be competing with basic Windows services and other programs running on your computer. Plus, you've removed all the unused RAM, so disk I/O won't be cached at all. You will experience a massive decrease in performance that will affect all processes on the system, including system ones.
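To see how close a run gets to that point, it can help to watch physical RAM and pagefile usage while the job is running. A rough sketch using the third-party psutil package (my addition, not part of the answer):

```python
import time
import psutil  # third-party: pip install psutil

proc = psutil.Process()  # run this inside (or pointed at) the training process
while True:
    vm = psutil.virtual_memory()   # physical RAM
    sm = psutil.swap_memory()      # on Windows this reflects the pagefile
    rss_gb = proc.memory_info().rss / 1024 ** 3
    print(f"RAM {vm.percent:4.1f}% | pagefile {sm.percent:4.1f}% "
          f"| process RSS {rss_gb:5.1f} GB")
    time.sleep(5)
```

Sustained high RAM usage together with a steadily growing pagefile percentage is the signature of the thrashing described above.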

The instability described can come from basic Windows functions becoming unresponsive because they are shuttling back and forth between disk and RAM, swapping against your machine-learning program and other programs. For example, clicking on a desktop icon may take minutes to get a response. So you might think the system has frozen completely when it is really just swapping like crazy and will eventually respond.

LawrenceC

It sounds like your program is going to be jumping all over that dataset while it runs, which will cause a tremendous amount of swapping. You point out that a fast SSD can reach 10% of RAM speed--but your program might want 100 bytes of data while the system reads 4096 bytes off the disk to deliver them. That 10% doesn't mean it merely takes 10x as long to run.
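Back-of-the-envelope numbers for that read amplification, assuming 4096-byte pages and 100-byte records (illustrative figures only):

```python
PAGE_SIZE = 4096     # bytes the OS faults in per access
RECORD_SIZE = 100    # bytes the program actually wanted

# Random access: each record likely lands on a different, not-yet-resident page,
# so roughly 41x more data moves off the disk than was requested.
amplification = PAGE_SIZE / RECORD_SIZE          # ~41x

# Sequential access: ~40 records share each page, so one page fault serves
# many requests and the amplification mostly disappears.
records_per_page = PAGE_SIZE // RECORD_SIZE      # 40
print(amplification, records_per_page)
```

This is why access pattern, not just raw SSD bandwidth, decides whether paging a dataset is merely slower or catastrophically slower.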

Furthermore, if your program is modifying the data as it works with it, things get far worse--dirty pages get written. If there's much modification of data you'll deplete the drive of spare blocks, and your write speed can become truly atrocious. (A page must be erased before it can be written. Last I knew, that was an operation measured in a substantial number of milliseconds, although I'm not finding current data. Normally a drive keeps a supply of empty pages around to handle writes, but when writes come in faster than it can wipe pages, the pool gets depleted and a write must wait until the eraser finishes with a page.)
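One way to sidestep the dirty-page problem (a sketch of my own, not from the answer): map the dataset read-only or copy-on-write, so modified pages are never written back to the data file on the SSD; any changes live in private copies backed by the pagefile instead. Assuming numpy and a hypothetical .npy file:

```python
import numpy as np

# mode "r": pure read-only mapping -- pages are never dirtied, and nothing
# is ever written back to this file on the SSD.
data = np.load("dataset.npy", mmap_mode="r")     # hypothetical file name

# mode "c": copy-on-write -- writes are allowed, but land in private
# in-memory copies (backed by the pagefile), never in the file itself.
scratch = np.load("dataset.npy", mmap_mode="c")
scratch[0] = 0.0   # modifies the private copy only; dataset.npy is untouched
```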