Has the drive's electronics gone haywire?

Greetings, dear community members!

Last night, while trying to copy some large files onto one of a virtual machine's partitions, the copy failed. The host's logs show this:

Nov 13 23:52:29 reiss kernel: [2599179.840101] ata5: lost interrupt (Status 0x50)
Nov 13 23:52:29 reiss kernel: [2599179.840140] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 13 23:52:29 reiss kernel: [2599179.840156] ata5.00: failed command: WRITE DMA EXT
Nov 13 23:52:29 reiss kernel: [2599179.840165] ata5.00: cmd 35/00:58:20:df:35/00:00:54:00:00/e0 tag 0 dma 45056 out
Nov 13 23:52:29 reiss kernel: [2599179.840166] res 40/00:01:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov 13 23:52:29 reiss kernel: [2599179.840176] ata5.00: status: { DRDY }
Nov 13 23:52:29 reiss kernel: [2599179.840222] ata5: soft resetting link
Nov 13 23:52:34 reiss kernel: [2599185.004130] ata5.00: qc timeout (cmd 0x27)
Nov 13 23:52:34 reiss kernel: [2599185.004136] ata5.00: failed to read native max address (err_mask=0x4)
Nov 13 23:52:34 reiss kernel: [2599185.004140] ata5.00: HPA support seems broken, skipping HPA handling
Nov 13 23:52:34 reiss kernel: [2599185.004143] ata5.00: revalidation failed (errno=-5)
Nov 13 23:52:34 reiss kernel: [2599185.004200] ata5: soft resetting link
Nov 13 23:52:34 reiss kernel: [2599185.176485] ata5.00: configured for UDMA/100
Nov 13 23:52:34 reiss kernel: [2599185.176502] ata5: EH complete
Nov 13 23:53:05 reiss kernel: [2599215.840096] ata5: lost interrupt (Status 0x50)
Nov 13 23:53:05 reiss kernel: [2599215.840137] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 13 23:53:05 reiss kernel: [2599215.840152] ata5.00: failed command: WRITE DMA EXT
Nov 13 23:53:05 reiss kernel: [2599215.840161] ata5.00: cmd 35/00:58:20:df:35/00:00:54:00:00/e0 tag 0 dma 45056 out
Nov 13 23:53:05 reiss kernel: [2599215.840163] res 40/00:01:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov 13 23:53:05 reiss kernel: [2599215.840171] ata5.00: status: { DRDY }
Nov 13 23:53:05 reiss kernel: [2599215.840219] ata5: soft resetting link
Nov 13 23:53:05 reiss kernel: [2599216.020449] ata5.00: configured for UDMA/100
Nov 13 23:53:05 reiss kernel: [2599216.020464] ata5: EH complete
Nov 13 23:53:36 reiss kernel: [2599246.880101] ata5: lost interrupt (Status 0x50)
Nov 13 23:53:36 reiss kernel: [2599246.880141] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 13 23:53:36 reiss kernel: [2599246.880156] ata5.00: failed command: WRITE DMA EXT
Nov 13 23:53:36 reiss kernel: [2599246.880166] ata5.00: cmd 35/00:58:20:df:35/00:00:54:00:00/e0 tag 0 dma 45056 out
Nov 13 23:53:36 reiss kernel: [2599246.880167] res 40/00:01:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov 13 23:53:36 reiss kernel: [2599246.880176] ata5.00: status: { DRDY }
Nov 13 23:53:36 reiss kernel: [2599246.880222] ata5: soft resetting link
Nov 13 23:53:36 reiss kernel: [2599247.060459] ata5.00: configured for UDMA/100
Nov 13 23:53:36 reiss kernel: [2599247.060475] ata5: EH complete

... and so on, for many screens.

The VM's logs show disk write errors across a large number of sectors.

The drive's S.M.A.R.T. status reports the following:

smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-3-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     ST2000DM001-9YN164
Serial Number:    Z1E0N13L
LU WWN Device Id: 5 000c50 04d7a86db
Firmware Version: CC4B
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Nov 14 13:00:19 2012 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 245) Self-test routine in progress...
                                        50% of test remaining.
Total time to complete Offline
data collection:                (  584) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 220) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   116   100   006    Pre-fail  Always       -       115806880
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       7
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       211070
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       2474
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       7
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   062   061   045    Old_age   Always       -       38 (Min/Max 32/38)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       3
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       182
194 Temperature_Celsius     0x0022   038   040   000    Old_age   Always       -       38 (0 27 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       101094940213271
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       190605094874
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       5068997233

SMART Error Log Version: 1
No Errors Logged

Am I right in thinking that this is an electronics glitch in the drive hanging on the 5th SATA channel, given that there are no bad sectors and the drive has never been knocked or overheated?
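
For anyone who wants to double-check the "no bad sectors" part against the attribute table above, here is a minimal Python sketch. The set of attribute IDs it watches is only a common rule of thumb (an assumption, not an official smartmontools list), and the function name `nonzero_watched` is made up for this example. The sample rows are copied from the output above.

```python
import re

# Attribute IDs often watched for media or link trouble.
# ASSUMPTION: this selection is a rule of thumb, not an official list.
WATCHED_IDS = {5, 187, 188, 197, 198, 199}

# One attribute row of `smartctl -A` style output:
# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
# Only the first integer of the raw column is captured.
ROW = re.compile(
    r"^[ \t]*(\d+)[ \t]+(\S+)[ \t]+0x[0-9a-fA-F]{4}"
    r"[ \t]+(\d+)[ \t]+(\d+)[ \t]+(\d+)"
    r"[ \t]+\S+[ \t]+\S+[ \t]+\S+[ \t]+(\d+)",
    re.MULTILINE,
)

# Rows copied from the smartctl output in this post.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x000f   116   100   006    Pre-fail  Always       -       115806880
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""


def nonzero_watched(text):
    """Return {attribute_name: raw_value} for watched attributes with raw != 0."""
    found = {}
    for match in ROW.finditer(text):
        attr_id = int(match.group(1))
        raw = int(match.group(6))
        if attr_id in WATCHED_IDS and raw != 0:
            found[match.group(2)] = raw
    return found


if __name__ == "__main__":
    print(nonzero_watched(SAMPLE))
```

On the rows above it returns an empty dict, which only says that these counters are all zero. It does not by itself tell the drive's electronics apart from a cable or controller problem.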

UPD: I completely forgot to mention: two other (1 TB) drives sit right next to it in the same drive cage, in a mirror. No complaints about them at all, and they are in heavy use, since the VM images live there.
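
Going back to the kernel log: the `cmd` line can be decoded to see which sectors the failing write targets. The sketch below assumes the usual libata layout of that line (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device); the helper name `decode_cmd` is invented for this example.

```python
import re

# Twelve hex bytes separated by ':' or '/', as printed after "cmd ".
# ASSUMPTION: field order follows the usual libata error report layout.
CMD = re.compile(r"cmd ((?:[0-9a-f]{2}[:/]){11}[0-9a-f]{2})")


def decode_cmd(line):
    """Decode an LBA48 taskfile from a libata 'cmd' log line, or return None."""
    match = CMD.search(line)
    if match is None:
        return None
    command, low, hob, _device = match.group(1).split("/")
    _feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in hob.split(":")
    )
    # LBA48: the "hob" (high order byte) registers hold bits 24..47.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    # Sector count is 16 bits wide in LBA48 commands.
    sectors = (hob_nsect << 8) | nsect
    return {"command": int(command, 16), "lba": lba, "sectors": sectors}


if __name__ == "__main__":
    line = ("ata5.00: cmd 35/00:58:20:df:35/00:00:54:00:00/e0 "
            "tag 0 dma 45056 out")
    info = decode_cmd(line)
    print(hex(info["command"]), info["lba"], info["sectors"])
```

For the line from the log this gives command 0x35 (WRITE DMA EXT), LBA 1412816672 and 88 sectors; 88 * 512 = 45056 bytes, which matches the "dma 45056 out" in the same line. The `cmd` line is identical in all three failures quoted above, so it is the same write being retried each time.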