
It is conventional wisdom¹ that each time you spin a hard disk down and back up, you shave some time off its life expectancy.

The topic has been discussed before.

Common explanations for why spindowns and spinups are harmful are that they induce more stress on the mechanical parts than ordinary running, and that they cause heat variations that are harmful to the device mechanics.

Is there any data showing quantitatively how bad a spin cycle is? That is, how much life expectancy does a spin cycle cost? Or, more practically, if I know that I'm not going to need a disk for X seconds, how large should X be to warrant spinning down?

¹ But conventional wisdom has been wrong before; for example, it is commonly held that hard disks should be kept as cool as possible, but the one published study on the topic shows that cooler drives actually fail more. This study is no help here since all the disks surveyed were powered on 24/7.

5 Answers


I am not aware of any studies on the subject, but I do know what the SMART data tells me:

For one particular drive (a WD Scorpio Blue 2.5"), a start-stop count of ~200,000 or a load-cycle count of ~600,000 corresponds to a SMART value of 0 (i.e. the disk is at the end of its life according to SMART). (This is a laptop drive; laptop drives are built to handle more spindowns than desktop drives.)
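Those budgets can be turned into a rough "fraction of life used" figure from the raw SMART counters. This is only a sketch: the budget numbers are the ones quoted above, and the example readings passed in are hypothetical.

```python
# Fraction of the drive's cycle budget consumed, using the Scorpio Blue
# budgets quoted above. The raw counts passed in are hypothetical readings.
START_STOP_BUDGET = 200_000   # start-stop count at SMART value 0
LOAD_CYCLE_BUDGET = 600_000   # load-cycle count at SMART value 0

def budget_used(start_stop_raw: int, load_cycle_raw: int) -> float:
    """Return the larger (worse) of the two consumed fractions."""
    return max(start_stop_raw / START_STOP_BUDGET,
               load_cycle_raw / LOAD_CYCLE_BUDGET)

# A drive with 50,000 start/stops and 300,000 load cycles is halfway
# through its load-cycle budget:
print(budget_used(50_000, 300_000))  # -> 0.5
```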

As these values come from the manufacturer, I assume they represent the manufacturer's best guesstimate of what their drives can handle. Lacking independent data, I'd be inclined to think the manufacturer's guess is better than mine, so you could do worse than using those numbers to calculate X.
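To put a rough number on X: if you assume the cycle budget and the powered-on design life draw from the same pool of life expectancy, each cycle "costs" design-life divided by budget hours of life, and spinning down only pays off when the idle period exceeds that cost. A minimal sketch, where the 5-year powered-on design life is my assumption (not a manufacturer figure) combined with the 600,000-cycle budget above:

```python
# Break-even spindown time: one extra load cycle "costs" a slice of the
# drive's life, so the idle period should at least exceed that slice.
# The 5-year powered-on design life is an assumed figure for illustration.
DESIGN_LIFE_HOURS = 5 * 365 * 24   # assumed powered-on design life
LOAD_CYCLE_BUDGET = 600_000        # load cycles at SMART value 0 (see above)

def break_even_seconds(life_hours: float = DESIGN_LIFE_HOURS,
                       cycle_budget: int = LOAD_CYCLE_BUDGET) -> float:
    """Seconds of idle time needed before one extra cycle pays for itself."""
    return life_hours / cycle_budget * 3600

print(round(break_even_seconds()))  # -> 263
```

Under these assumptions X comes out around four and a half minutes; a desktop drive with a much smaller cycle budget would push it considerably higher.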

j-g-faustus

The Google study does mention the effect of power cycles:

Power Cycles. The power cycles indicator counts the number of times a drive is powered up and down. In a server-class deployment, in which drives are powered continuously, we do not expect to reach high enough power cycle counts to see any effects on failure rates. Our results find that for drives aged up to two years, this is true, there is no significant correlation between failures and high power cycles count. But for drives 3 years and older, higher power cycle counts can increase the absolute failure rate by over 2%. We believe this is due more to our population mix than to aging effects. Moreover, this correlation could be the effect (not the cause) of troubled machines that require many repair iterations and thus many power cycles to be fixed.


I suspect the difficulty you'll have in finding literature on this subject is that disk failure research is done mostly in commercial datacentres, where the latency involved in spinning disks down is unacceptable.

That said, I found this paper from the IEEE. The authors propose letting the second disk in a RAID 1 array spin down until it's absolutely needed. They term this RAREE (Reliability Aware Energy Efficient Approach). Though it's not the quantitative data you are seeking, their approach seems to assume that spinning down the second disk will extend the lifetime of the array overall.

grw

That Google study is probably the best you're going to get for the temperature question. I doubt anybody's collected as much data on as many different types of drives in the same environment.

Cooler drives do NOT "fail more". If you get too cold, you're going to have higher failure rates; too much of a good thing... isn't. The next graph down shows that three years in, at over 45 Celsius your failure rate is 3x what it would be 5-10 degrees cooler. Heat and friction are BAD for quickly-moving machinery. That's not going to change.

I suspect there aren't many studies on the subject because it's not a gray area. For the excellent reasons given in other posts, it's just plain physics.

Kara Marfia

The S.M.A.R.T. counters are a good average reference forecast by the manufacturer, but they are usually overridden by external factors... or even by a loose screw in the drive.

Then there is the spinup/spindown cycle itself, which consumes more energy than staying spinning would for a good number of seconds, and it costs time as well (this varies a lot between old and new drives, and between low-cost and better drives)... among other factors...

You can see an analogy with a fluorescent lamp, whose initial ignition consumes more energy than several minutes of operation...
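The lamp analogy can be made concrete with a break-even calculation on the energy side. Every wattage and the surge duration below is an illustrative assumption for a 3.5" drive, not a measurement:

```python
# Energy break-even for spinning down, in the spirit of the lamp analogy.
# Every figure here is an illustrative assumption, not a measured value.
SPINUP_POWER_W  = 25.0   # assumed peak draw during the spinup surge
SPINUP_TIME_S   = 10.0   # assumed duration of the surge
IDLE_POWER_W    = 5.0    # assumed draw while spinning idle
STANDBY_POWER_W = 0.8    # assumed draw while spun down

def energy_break_even_seconds() -> float:
    """Spun-down seconds after which the spindown saves net energy."""
    surge_extra_joules = (SPINUP_POWER_W - IDLE_POWER_W) * SPINUP_TIME_S
    savings_watts = IDLE_POWER_W - STANDBY_POWER_W
    return surge_extra_joules / savings_watts

print(round(energy_break_even_seconds(), 1))  # -> 47.6
```

On these numbers the energy side breaks even in well under a minute, so for energy the lamp analogy holds; it is the mechanical wear discussed above that dominates the decision.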

Spinning down is only productive if the system's software can keep operating purely from memory/cache for a long period; on current systems (many processes/daemons/services) this usually only happens if you tweak your system heavily.

The quality of the power supplied to the drive is highly important, and contributes a great deal to a healthy drive...

The RAID point is not quite clear... if we want a 2nd/3rd/nth disk to be very well protected, the ideal would be a mirror disk that is only activated at specific intervals and sleeps until the next activation, staying spun down for long periods...

In my experience I have found drives 10+ years old working perfectly, and drives one year old with several problems (interface, spindle motor, arm actuator, and surfaces).

I could say much more about this, from materials to vibration and thermal conditions, but to keep it short: the magnetic surface properties are also a big player in this equation... and are often the factor that determines the quality of the HD.

ZEE