Date:      Tue, 5 May 2020 11:33:41 +0100
From:      Arthur Chance <freebsd@qeng-ho.org>
To:        Christoph Kukulies <kuku@kukulies.org>, Polytropon <freebsd@edvax.de>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: g_vfs_done() ada2p1 error = 5
Message-ID:  <24144a81-1f6e-206b-73ab-85846ce6db50@qeng-ho.org>
In-Reply-To: <E652115F-96EC-4B94-BA49-AB7356161D5C@kukulies.org>
References:  <2ED6F7F2-0F70-49F0-AF9F-E6E4CE11B2C3@kukulies.org> <20200505105033.ff69a110.freebsd@edvax.de> <E652115F-96EC-4B94-BA49-AB7356161D5C@kukulies.org>

On 05/05/2020 10:17, Christoph Kukulies wrote:
> 
> 
>> On 05.05.2020 at 10:50, Polytropon <freebsd@edvax.de> wrote:
>>
>> On Tue, 5 May 2020 10:42:56 +0200, Christoph Kukulies wrote:
>>> I have a GPT partitioned SSD here which was serving as a boot
>>> volume of my previously installed
>>> FreeBSD 8.0 and now, as I have 12.1 installed and booting from
>>> that I have mounted this SSD into
>>> my running system.
>>>
>>> In my system log I’m seeing the following error message (and when
>>> I try to mount that partition, I’m getting an
>>> INPUT/OUTPUT error also):
>>>
>>> g_vfs_done():ada2p1[READ(offset=262144, length=8192)]error = 5
>>>
>>> ada2 at ahcich2 bus 0 scbus2 target 0 lun 0
>>> [...]
>>>
>>> What does this error message mean? How do I get rid of it?
>>
>> Could it be that the SSD has reached the end of its lifetime?
> 
> 
> Hope not. Was trusting that FreeBSD takes care of saving SSDs from degrading over time.

FreeBSD is good, but no OS can prevent wear and tear. SSDs have a finite
lifetime, and all an OS can do is avoid shortening it.

>> I've seen similar messages on regular hard disks which were
>> about to die... Can you check the SSD with smartctl and see
>> if there is something suspicious?
>>
> 
> 
> The output of smartctl is overwhelming :) Happen to know what I should look for?

The attributes section is often the place to start. You can see just
that by giving the -A flag to smartctl.

> smartctl 7.1 2019-12-30 r5022 [FreeBSD 12.1-RELEASE amd64] (local build)
> Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Samsung based SSDs
> Device Model:     Samsung SSD 840 EVO 120GB
> Serial Number:    S1D5NSAD978704E
> LU WWN Device Id: 5 002538 8a002c09c
> Firmware Version: EXT0AB0Q
> User Capacity:    120,034,123,776 bytes [120 GB]
> Sector Size:      512 bytes logical/physical
> Rotation Rate:    Solid State Device
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
> SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
> Local Time is:    Tue May  5 11:11:45 2020 CEST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x00)	Offline data collection activity
> 					was never started.
> 					Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0)	The previous self-test routine completed
> 					without error or no self-test has ever 
> 					been run.
> Total time to complete Offline 
> data collection: 		( 4200) seconds.
> Offline data collection
> capabilities: 			 (0x53) SMART execute Offline immediate.
> 					Auto Offline data collection on/off support.
> 					Suspend Offline collection upon new
> 					command.
> 					No Offline surface scan supported.
> 					Self-test supported.
> 					No Conveyance Self-test supported.
> 					Selective Self-test supported.
> SMART capabilities:            (0x0003)	Saves SMART data before entering
> 					power-saving mode.
> 					Supports SMART auto save timer.
> Error logging capability:        (0x01)	Error logging supported.
> 					General Purpose Logging supported.
> Short self-test routine 
> recommended polling time: 	 (   2) minutes.
> Extended self-test routine
> recommended polling time: 	 (  70) minutes.
> SCT capabilities: 	       (0x003d)	SCT Status supported.
> 					SCT Error Recovery Control supported.
> 					SCT Feature Control supported.
> 					SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 1
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
>   9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       42753
>  12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       69
> 177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       2
> 179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
> 181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
> 182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
> 183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
> 187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
> 190 Airflow_Temperature_Cel 0x0032   073   057   000    Old_age   Always       -       27
> 195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
> 199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
> 235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       51
> 241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       1078954976

Attributes 177, 179, 181, 182, 183 & 241 are the interesting ones for
SSDs. You've not used any of your spare blocks, and you've written ~1
billion LBAs (roughly 550 GB, assuming 512 bytes per LBA). For comparison
I've got an 840 PRO in one of my machines and that's showing a wear
leveling count of 17 and ~4 billion LBAs written.
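
If you'd rather not pick those rows out by eye, here is a rough Python
sketch that filters the attribute table and converts attribute 241 to
gigabytes. The function names are mine, the sample rows are copied from
your output, and the 512-byte LBA size is an assumption based on the
"Sector Size" line in your report.

```python
# Attribute IDs singled out in this message as the interesting ones for SSDs.
SSD_ATTRIBUTE_IDS = {177, 179, 181, 182, 183, 241}

# A few rows copied from the quoted `smartctl -A` style output.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       2
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       1078954976
"""


def parse_attributes(text):
    """Return {id: (name, raw_value)} for lines that look like attribute rows."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # An attribute row has 10 columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[int(fields[0])] = (fields[1], int(fields[9]))
    return attrs


def lbas_to_gb(lbas, lba_size=512):
    """Convert an LBA count to decimal gigabytes (lba_size is an assumption)."""
    return lbas * lba_size / 1e9


attrs = parse_attributes(SAMPLE)
ssd_attrs = {k: v for k, v in attrs.items() if k in SSD_ATTRIBUTE_IDS}
for attr_id, (name, raw) in sorted(ssd_attrs.items()):
    print(attr_id, name, raw)

written_gb = lbas_to_gb(attrs[241][1])
print(f"about {written_gb:.0f} GB written")
```

To use it on a live system, replace SAMPLE with the text you get from
running smartctl with -A on the drive.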

From Samsung's documentation (found via an online forum):

ID # 177 Wear Leveling Count

This attribute represents the number of media program and erase
operations (the number of times a block has been erased). This value is
directly related to the lifetime of the SSD. The raw value of this
attribute shows the total count of P/E Cycles.


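MLC NAND cells are often quoted as good for up to ~10,000 erase cycles
before they stop reliably holding charge. The 840 EVO uses TLC NAND,
which is usually rated lower, on the order of ~1,000 cycles.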

Given your wear leveling count is 2 you should be a long way off end of
life.
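
As a back-of-envelope check, here is a small Python sketch of that
arithmetic. It takes the raw Wear_Leveling_Count as the P/E cycle count,
as the Samsung text above describes. The rated cycle figures are
assumptions, not values the drive reports: 10,000 is the figure mentioned
above, and 1,000 is a more pessimistic guess of mine.

```python
def wear_fraction(pe_cycles_used, rated_pe_cycles):
    """Fraction of the assumed rated P/E cycles already consumed."""
    if rated_pe_cycles <= 0:
        raise ValueError("rated_pe_cycles must be positive")
    return pe_cycles_used / rated_pe_cycles


# Raw value of attribute 177 from the quoted smartctl output.
WEAR_LEVELING_COUNT = 2

# Both ratings are assumptions used only for illustration.
for rated in (1_000, 10_000):
    used = wear_fraction(WEAR_LEVELING_COUNT, rated)
    print(f"rated for {rated} cycles: {used:.2%} used")
```

Even with the pessimistic rating, the drive comes out at a fraction of a
percent of its rated erase cycles.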

> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> SMART Selective self-test log data structure revision number 1
>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>     1        0        0  Not_testing
>     2        0        0  Not_testing
>     3        0        0  Not_testing
>     4        0        0  Not_testing
>     5        0        0  Not_testing
> Selective self-test flags (0x0):
>   After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 



-- 
Fat Earther: One who believes the world is round but has put on too
much weight round the middle.


