2

I have a 1 TB HDD in my laptop; it is about 2 years old. Recently I started noticing random hang-ups and freezes, so I checked the HDD's health. The first time I checked there were 502 bad sectors, and the count has kept increasing every day: over 3 days it jumped to 702. Is this a bad sign? Does it mean the drive might fail soon?


UPD

After installing Speccy: the SMART status shows Warning, but every attribute is Good, and the Reallocated Sectors Count has increased to 750.

UPD

It increased to 807

4 Answers

4

I treat SMART like this (based on 20 years experience in data recovery):

  • If SMART says all is well, do not for one minute think your drive can not die the next minute.

  • If SMART however gives reason for caution, act as if the drive will die soon.

IOW, drives die without SMART ever warning, but some SMART values can actually help us determine whether something is going on with a drive. Examples are the values for reallocated sectors and sectors pending reallocation. These are sectors the drive could not read whatever it tried.

AFAIK the RAW value for these attributes is simply a count. If we see 0xFF we know 255 reallocations took place, simple as that. Some manufacturers may employ more complex RAW values for certain attributes (example), but not for these in my experience.
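As a minimal sketch of that reading, assuming the RAW value really is a plain counter for these attributes, converting the hex display to a count is a one-liner. The function name `raw_to_count` is made up for illustration.

```python
def raw_to_count(raw_hex: str) -> int:
    """Interpret a SMART RAW value shown in hex as a plain integer count.

    Assumption: the attribute's RAW value is a simple counter, as described
    above for reallocated and pending sectors. Vendor-specific compound
    RAW values would need different handling.
    """
    return int(raw_hex, 16)


print(raw_to_count("0xFF"))          # 255
print(raw_to_count("000000000327"))  # 807
```

The second call shows how a count of 807, the latest figure in the question, would look in a zero-padded hex column.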

In the question we not only see a lot of reallocations (this is arbitrary, and some say 700 or so reallocations isn't a lot), we also see them increase rapidly. IMHO both the number and the rate are alarming, which is why I believe the drive is dying.
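To put a number on the rate, here is a small sketch using only the two dated figures from the question (502 sectors, then 702 three days later). The helper name `sectors_per_day` is hypothetical.

```python
def sectors_per_day(first_count: int, last_count: int, days: float) -> float:
    """Average growth in reallocated sectors per day between two readings."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (last_count - first_count) / days


# Figures from the question: 502 sectors, then 702 three days later.
rate = sectors_per_day(502, 702, 3)
print(f"{rate:.1f} reallocated sectors per day")  # 66.7 reallocated sectors per day
```

The later readings of 750 and 807 are not included because the question does not say how many days separated them.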

If we consider a patient with a wound, it may take some time for blood pressure to drop below critical values. But if we observe at the same time patient is losing a lot of blood we're not going to wait until his blood pressure is below the critical value, we act immediately and try to prevent the situation from getting worse.

Each time a drive encounters a sector it cannot read, and for which ECC cannot correct the data, the drive initiates a so-called error correcting procedure. The OS cannot do anything but sit these out, which can cause apparent hang-ups. These procedures take at least seconds for each sector and may take up to 20 seconds.
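A rough upper bound on how much stalling this can add up to, as a sketch. It assumes every problem sector triggers exactly one procedure of the maximum 20-second length, which is a simplification and not something the drive reports. The function name is hypothetical.

```python
def worst_case_stall_seconds(unreadable_sectors: int,
                             seconds_per_sector: float = 20.0) -> float:
    """Upper bound on time spent waiting on error correcting procedures.

    Assumption: one procedure per unreadable sector, each taking the
    maximum duration mentioned above (20 seconds).
    """
    return unreadable_sectors * seconds_per_sector


# The question reports 200 new bad sectors over 3 days (502 -> 702).
stall = worst_case_stall_seconds(200)
print(f"up to {stall / 60:.0f} minutes of stalls")  # up to 67 minutes of stalls
```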

You will push 'dying drives' closer to the edge simply by reading from them, so you had better make each read count and not waste reads on disk surface scans. A data recovery engineer would hook up such a drive to a specialized hardware disk imager and skip bad sectors as much as possible. The closest we can get to this specialized hardware is probably the open source tool HDDSuperClone.

So if you need the data from this drive my advice would be to clone it ASAP using this software. If you don't I'd replace it.

EDIT: It seems this may be an SMR drive. Once an SMR drive actually fails, it is often problematic to recover data from, even for a data recovery lab.

0

There perhaps is too much discussion here around interpreting your SMART values, rather than discussing what is necessary to determine the condition of your disk.

In 30 years of diagnosing bad hard disks rarely ever did I use SMART values to do it. Indeed, it didn't even exist back then. That's not to say there isn't value with the SMART stats, but you can see the kind of confusion it causes. The problem with SMART is that each manufacturer implements it in different ways, and it's nearly impossible to know for sure what you're looking at unless the manufacturer's own specification is documented, or their own tool interprets them.

My suggestion is that you perform a surface scan on the disk. A surface scan, sometimes referred to as a "long test" in some tools, will physically read (and optionally write) every physical sector on the disk, helping to determine the condition of the drive. This type of test is primarily used on hard disks and has little value on SSDs, although I was once able to detect a bad SSD with a surface scan when it was passing other diagnostic tools.

First, a warning. If your drive has bad sectors, this test will find them. If your drive is failing, it may have hundreds of bad sectors. Running this test on a failing drive will make it worse. Data recovery and backup is priority one! Rescue your data FIRST with a backup or clone of the disk BEFORE running surface scans on it.

Now, there are many tools which can perform these tests. Most manufacturers release their own tools which can do this type of test, and there are other commercial and freeware products available. Back when I did this on a regular basis, I most recently used a tool called HDD Regenerator. Before that we used a tool called SpinRite. In addition there are Seagate SeaTools, Western Digital Data LifeGuard Tools, and many others. However, I fall back on a freeware tool called HDDScan for easy Windows-based tests, especially for end users.

So, how do you use HDDScan to determine if your drive is good or bad?

Get the tool and start the test:

  • Navigate to https://hddscan.com and download the .zip file of the tool.
  • The tool does not require installation on your computer. Instead, you can decompress the downloaded .zip file by using your favorite tool, or by right-clicking the downloaded .zip file and choosing 'Extract All,' and then follow the instructions.
  • You should now have the decompressed HDDScan files in a folder on your computer.
  • Double-click the HDDScan.exe file to run the application. HDDScan needs administrative access on your computer and you will be prompted to allow the application to make changes to your computer.
  • Accept the license agreement.
  • The first page you are presented with is the drive and test selection page. Choose your drive from the dropdown, and choose the 'TESTS' button. Then choose the 'VERIFY' test.
  • The next page will allow you to choose the sector range you want to test. In this case, it defaults to the whole disk and you can click the right-arrow to continue. Before starting the test, be sure to close all other applications down on your computer to obtain more accurate results and avoid transient read/writes that can throw the sector read times off.
  • The test will start immediately and you will see the task in the task list view.
  • Double-click the running task to open the live view. You can pause and stop the test here as well.

How do you interpret these results?

First, allow the test to complete. It will take a significant amount of time. However, you should monitor it occasionally. As mentioned earlier, this test will find bad sectors if they exist. And if it starts finding a lot of them (>10), you can stop the test. The drive is failing. There is no sense in continuing to beat it up.

When the test is complete you can review the stats. The test status window has three tabs: Graph, Map, and Report.

The Graph Tab. This tab displays the testing speed in KB/s during the course of the test. We expect the drive to maintain a fairly consistent read speed throughout the test. Transient spikes may not indicate a problem and could be artifacts of other disk access by Windows during the scan. It is also worth noting that because the outer tracks of a platter pass under the head faster than the inner tracks, you may see a ramp up or ramp down effect in speed over the course of the test.

Extended bursts of decreased read speed are a clear indication the drive may be having trouble reading the disk surfaces.

The Map Tab. This view is perhaps the most useful. Here you see a live view of the status of each sector that is read including the time it took to read the sector, and if any bad sectors are detected. In this view, we are primarily interested in the statistics on the right hand side.


This chart gives you the number of sectors read at certain speeds during the test. By far, most of your reads should take less than 10ms on a properly working drive. By default, any sector which takes longer than 50ms to read will create a log entry on the 'Report' tab. Sectors which take longer than 50ms to read are not necessarily bad. Again, because this is on a running Windows system, your drive may be actively used during the test, which will affect read speeds. However, if you begin to see a large number of sectors, especially consecutive ones, taking over 150ms or, worse, over 500ms, then this is a pretty clear indicator the drive is having issues reading this area of the disk.
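The thresholds above can be sketched as a small bucketing function. This is only an illustration of the 10/50/150/500 ms boundaries mentioned here, not HDDScan's own code; the bucket labels and the sample read times are made up.

```python
from collections import Counter


def classify_read_time(ms: float) -> str:
    """Bucket a sector read time using the thresholds described above."""
    if ms < 10:
        return "fast"        # what most reads should look like
    if ms <= 50:
        return "ok"          # not logged by default
    if ms <= 150:
        return "logged"      # over 50 ms: appears on the Report tab
    if ms <= 500:
        return "slow"        # over 150 ms: worth worrying about
    return "very slow"       # over 500 ms


def summarize(read_times_ms):
    """Count how many reads fall into each bucket."""
    return Counter(classify_read_time(ms) for ms in read_times_ms)


# Hypothetical read times in milliseconds, for illustration only.
sample = [3, 4, 7, 12, 60, 180, 200, 650]
print(summarize(sample))
```

A healthy drive would put nearly everything in the "fast" bucket; runs of consecutive "slow" or "very slow" reads are the pattern to watch for.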

Finally, the number of 'Bads' is the number of bad sectors detected. These are sectors the drive was unable to read, and the data in them is most likely lost. While slow read times can indicate issues, the bad sector count is a clear indicator the drive has physical damage to its disk surfaces.

The Report Tab.

This tab displays a log of all events of interest. Whether that be bad sectors or sectors which took an unusually long time to read, this log will show you a summary of things you might need to be concerned about on the drive.

What indicates a bad drive?

There is a little room for personal experience and preference here. However, a general rule of thumb is that slow reading sectors (>150ms) and bad sectors are an indication of physical issues on the drive. But, it is difficult to set hard and fast rules here. Drives do have a pool of spare sectors specifically for handling bad sectors. Drives will automatically lock out bad sectors and remap them to good sectors. To a point, these minor failures are expected and handled by the drive without any user intervention. I have reason to think (but am not sure) that if ANY bad sectors show up in this test, the drive has already exhausted its spare pool of sectors. So, determining when to replace a drive is sometimes decided by your level of risk tolerance.

Here is how I interpret test results.

  • If the drive has one or two bad sectors detected on the disk, I get concerned the drive is starting to fail. But I also understand that the drive may have experienced a single event (such as a drop or bang) that damaged that specific area, and it very well may continue to operate just fine. This is especially true if it is 2 or 3 consecutive bad sectors. However, there have been numerous times when this type of test found 1 or 2 bad sectors and, even after returning the drive to service, it failed shortly after. So this is a scenario where you need to decide what your risk tolerance is, back up often, and possibly continue to monitor the drive rather than replace it.
  • If the drive has numerous bad sectors, let's say 10 or more, especially if they are spread around on the drive, the drive is failing. It's time to replace it.
  • If the drive has numerous (greater than 10) slow reading sectors (>150ms), especially when consecutive, this can be an indication of problems on the drive. If no bad sectors are found, I would lean towards continuing to monitor in the future. However, when coupled with bad sectors, these slow reading areas are nearly as clear an indicator of physical damage as the bad sectors are and should be counted the same.
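The three rules above can be sketched as a small decision function. This is a loose encoding, not a standard. The function name and verdict strings are hypothetical. Two choices here are assumptions that go beyond the list: counts of 3 to 9 bad sectors are treated like the "one or two" case, and slow sectors are simply added to the bad sector count whenever at least one bad sector exists.

```python
def assess_drive(bad_sectors: int, slow_sectors: int) -> str:
    """Rough verdict from a surface scan, following the rules of thumb above.

    slow_sectors means sectors that took longer than 150 ms to read.
    """
    if bad_sectors > 0:
        # Assumption: when coupled with bad sectors, slow sectors
        # are counted the same as bad ones.
        effective = bad_sectors + slow_sectors
        if effective >= 10:
            return "replace"
        # Assumption: 3-9 bad sectors are handled like 1-2.
        return "concerned"
    if slow_sectors > 10:
        return "monitor"
    return "ok"


print(assess_drive(bad_sectors=2, slow_sectors=0))    # concerned
print(assess_drive(bad_sectors=12, slow_sectors=0))   # replace
print(assess_drive(bad_sectors=0, slow_sectors=15))   # monitor
```

"concerned" here means back up often and decide according to your risk tolerance; it is not a clean bill of health.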

Ultimately, if it was MY drive, and any bad sectors are detected I would replace the drive immediately. Drives that are in tip top shape never have slow reading or bad sectors show up in these tests. In fact, many manufacturer diagnostic tools will fail the drive if ANY bad sectors are detected.

Finally, if you are really interested in SMART values, it would be a neat experiment to record the values before this test and then look at them again after the test. This test will force the drive to read every available sector, so if there are any problems, SMART should be detecting them.
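A minimal sketch of that before/after comparison, assuming you have copied the RAW values from your SMART tool into two dictionaries by hand. The function name, attribute names, and numbers below are illustrative only, not measurements from an actual scan.

```python
def smart_changes(before: dict, after: dict) -> dict:
    """Return attributes whose value changed, as {name: (before, after)}."""
    return {
        name: (before.get(name), value)
        for name, value in after.items()
        if before.get(name) != value
    }


# Illustrative snapshots typed in by hand (hypothetical values).
before = {"Reallocated Sectors Count": 750, "Current Pending Sector Count": 0}
after = {"Reallocated Sectors Count": 807, "Current Pending Sector Count": 4}

for name, (old, new) in smart_changes(before, after).items():
    print(f"{name}: {old} -> {new}")
```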

Appleoddity
  • 11,970
-2

Below are two answers, before and after seeing the SMART attributes. One describes the disk as dying soon, and the second as not perfect but still in a working condition.

This has precipitated an argument here between people who believe in SMART attributes and people who don't.

As usual, the truth is probably in-between. The disk should be watched for further degradation, but at the moment there is no indication that it will fail in the very near future.

A product like Speccy, which analyzes the SMART attributes, is preferable to a tool that just reports the raw data and leaves it to us to argue about.

Regarding the number of Reallocated sectors: this is not the same as, and is less serious than, Unrecoverable sectors. Modern disks are fabricated with thousands of spare sectors, meaning they are made to recover from such problems. The point of no return arrives when these spare sectors are exhausted and no more sectors can be remapped. A disk that starts showing Unrecoverable sectors, with their number growing, should be replaced.


Yes, the disk is failing. About a hundred bad sectors per day is extremely worrying.

Save your data before it fails completely.

Please add a screenshot of where you see the number of bad sectors, so I can be sure of my prognosis.


According to the screenshot, your disk is in very good shape, with no errors at all.

You were misled by the SMART attributes. The values 100, 200 and sometimes 253 are normalized values that mean "no errors". These are the initial values of most SMART indicators, and errors cause them to go down to zero. The Raw values are mostly to be ignored - they are usually divided into bit-fields, so treating them as integers is meaningless.

DO NOT replace the disk - there is nothing wrong with it.

harrymc
  • 498,455
-2

Your reallocated sector count is not 750. For SMART values, 100 is "normal", lower is worse, and below the indicated threshold means "failing". The raw value is not standardized at all.

In some harddisks, the raw value for this attribute is really the number of reallocated sectors, but it can be also some compound value where different bitfields that are part of the value mean different things (that's also why the value is displayed in hex).

So if no value is going below 100, and in particular if there are no other values going down that indicate read errors etc. which are the reason sectors get reallocated, you don't have to worry at all. The warning is lying to you.
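The reading of normalized values described in this answer can be sketched as follows. The function name and the example numbers are hypothetical, and treating a value exactly equal to the threshold as failing is an assumption; the text above only says "below". Note that this check deliberately ignores the raw value, which is exactly the premise of this answer.

```python
def attribute_status(normalized: int, threshold: int) -> str:
    """Classify a SMART attribute from its normalized value and threshold.

    Assumption: a normalized value at or below the threshold counts
    as failing.
    """
    return "failing" if normalized <= threshold else "ok"


# Hypothetical example: normalized value 100 with a threshold of 36.
print(attribute_status(100, 36))  # ok
print(attribute_status(20, 36))   # failing
```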

"I started noticing random hang up and freezes"

This could have other causes, such as a cable that is not seated correctly. Did you investigate the system errors that led to the hang-ups and freezes?

dirkt
  • 17,461