Channel: Raspberry Pi Forums

Troubleshooting • Unstable NVME disk

Hello,
I am seeing strange behavior with a 1TB NVMe disk (Crucial CT1000P2SSD8) in a Lemorele enclosure: https://www.amazon.it/gp/product/B08TWW ... UTF8&psc=1

It works fine for days, and I have quirks in place for the disk, but after a random time (usually 2-4 days) it starts logging "reset SuperSpeed USB device" and stops working.

The disk is under heavy load because LVM uses it as a cache: no less than 50MB/s, 24 hours a day.
I have five other 3.5" disks and one 2.5" SSD working without issues.
To fix it I have to shut down and start again; a plain restart often doesn't work. The enclosure seems to be the problem.

Has anyone else faced this issue?
Is there a more stable NVMe-to-USB enclosure?

Thanks
Jan 24 05:09:45 localhost kernel: [233471.311952] usb 2-1.4: reset SuperSpeed USB device number 9 using xhci_hcd
Jan 24 05:09:45 localhost kernel: [233471.368948] usb 2-1.4: Enable of device-initiated U1 failed.
Jan 24 05:09:45 localhost kernel: [233471.370024] usb 2-1.4: Enable of device-initiated U2 failed.
.....
Jan 24 05:09:46 localhost kernel: [233472.063893] usb 2-1.4: reset SuperSpeed USB device number 9 using xhci_hcd
Jan 24 05:09:46 localhost kernel: [233472.118972] usb 2-1.4: Enable of device-initiated U1 failed.
Jan 24 05:09:46 localhost kernel: [233472.120175] usb 2-1.4: Enable of device-initiated U2 failed.
Jan 24 05:09:46 localhost kernel: [233472.120528] sd 6:0:0:0: [sdk] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=31s
Jan 24 05:09:46 localhost kernel: [233472.120537] sd 6:0:0:0: [sdk] tag#0 CDB: opcode=0x28 28 00 2c 73 f7 88 00 00 08 00
Jan 24 05:09:46 localhost kernel: [233472.120543] I/O error, dev sdk, sector 745797512 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Jan 24 05:09:46 localhost kernel: [233472.120597] EXT4-fs warning (device dm-27): dx_probe:931: inode #11403296: lblock 512: comm nfsd: error -5 reading directory block
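
For anyone reading the log above: the CDB line can be decoded by hand to check that the failed command and the reported sector agree. The sketch below assumes the standard 10-byte SCSI READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8); the function name decode_read10 is only an illustration, not part of any tool.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    raw = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(raw) != 10 or raw[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    return {
        # Bytes 2..5: logical block address, big-endian
        "lba": int.from_bytes(raw[2:6], "big"),
        # Bytes 7..8: number of blocks to transfer, big-endian
        "blocks": int.from_bytes(raw[7:9], "big"),
    }


# CDB copied from the kernel log line above
info = decode_read10("28 00 2c 73 f7 88 00 00 08 00")
print(info)  # {'lba': 745797512, 'blocks': 8}
```

The decoded LBA equals the sector in the "I/O error, dev sdk, sector 745797512" line, and 8 blocks of 512 bytes is a single 4 KiB read, so the error is an ordinary small read that timed out (cmd_age=31s) around the USB resets rather than an unusual command.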


root@xxxxxx:~# smartctl -d sntrealtek -x /dev/sdi1
smartctl 7.2 2020-12-30 r5155 [aarch64-linux-6.1.21-v8+] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number: CT1000P2SSD8
Serial Number: 2114E59386C9
Firmware Version: P2CR033
PCI Vendor/Subsystem ID: 0xc0a9
IEEE OUI Identifier: 0x6479a7
Total NVM Capacity: 1,000,204,886,016 [1.00 TB]
Unallocated NVM Capacity: 0
Controller ID: 1
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1,000,204,886,016 [1.00 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 6479a7 4a0000003d
Local Time is: Wed Jan 24 07:59:36 2024 CET
Firmware Updates (0x12): 1 Slot, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005e): Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0e): Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size: 64 Pages
Warning Comp. Temp. Threshold: 70 Celsius
Critical Comp. Temp. Threshold: 85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.50W       -        -    0  0  0  0        0       0
 1 +     1.90W       -        -    1  1  1  1        0       0
 2 +     1.50W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3     5000    1900
 4 -   0.0020W       -        -    4  4  4  4    13000  100000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 53 Celsius
Available Spare: 100%
Available Spare Threshold: 5%
Percentage Used: 51%
Data Units Read: 1,991,407,179 [1.01 PB]
Data Units Written: 541,034,167 [277 TB]
Host Read Commands: 58,370,007,172
Host Write Commands: 4,055,861,276
Controller Busy Time: 204,239
Power Cycles: 33,741
Power On Hours: 16,270
Unsafe Shutdowns: 748
Media and Data Integrity Errors: 0
Error Information Log Entries: 72
Warning Comp. Temperature Time: 2593
Critical Comp. Temperature Time: 0
Thermal Temp. 1 Transition Count: 434
Thermal Temp. 2 Transition Count: 5
Thermal Temp. 1 Total Time: 8005424
Thermal Temp. 2 Total Time: 164971

Warning: NVMe Get Log truncated to 0x200 bytes, 0x200 bytes zero filled
Error Information (NVMe Log 0x01, 16 of 16 entries)
No Errors Logged
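
The raw counters above are easier to judge in everyday units. This is a rough sketch, assuming the units defined by the NVMe specification for the SMART/Health log: one data unit is 1000 blocks of 512 bytes, Warning Comp. Temperature Time is in minutes, and the Thermal Temp. Total Time fields are in seconds. The variable names are illustrative, and the numbers are copied from the smartctl output.

```python
DATA_UNIT_BYTES = 512 * 1000  # assumed NVMe data unit: 1000 x 512-byte blocks


def data_units_to_bytes(units: int) -> int:
    """Convert an NVMe 'Data Units' counter to bytes."""
    return units * DATA_UNIT_BYTES


# Values copied from the SMART/Health section above
power_on_hours = 16_270
written_tb = data_units_to_bytes(541_034_167) / 1e12
warning_temp_hours = 2_593 / 60      # minutes -> hours
throttle1_hours = 8_005_424 / 3600   # seconds -> hours
throttle2_hours = 164_971 / 3600     # seconds -> hours

print(f"written:               {written_tb:.0f} TB")  # 277 TB, same as smartctl
print(f"above warning temp:    {warning_temp_hours:.1f} h")
print(f"thermal state 1:       {throttle1_hours:.1f} h")
print(f"thermal state 2:       {throttle2_hours:.1f} h")
print(f"share of power-on time in thermal state 1: "
      f"{throttle1_hours / power_on_hours:.1%}")
```

If those unit assumptions hold, the drive has spent a noticeable share of its power-on time in thermal management, which may be worth ruling out alongside the enclosure itself.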

Statistics: Posted by mcanto — Wed Jan 24, 2024 7:07 am