Introduction

Real-time monitoring lets OVH track the state of your server, while S.M.A.R.T. collects the key information about the hard disk. Using cron, the server regularly sends this information to our monitoring interface, so that when a fault is detected we can fix it quickly.

SmartMon is a disk analysis tool that checks the disk's most important physical characteristics. It works in two ways: through the smartd daemon, which checks the disk periodically, and through the smartctl command, which queries it on demand.

Enabling / installing SmartMon

Log in to the server as root. It is important to perform this step as root: if you log in as a regular user and use sudo, the installation may fail and smartctl will not work. Once logged in, install the latest version by updating your server:

[root@delirium root]# wget ftp://ftp.ovh.net/made-in-ovh/release/patch-all.sh -O patch-all.sh; sh patch-all.sh
Connection to ftp.ovh.net:21...Connect!
Session starting under Anonymous...Established session!
==> SYST ... complete.
==> PWD ... complete.
==> TYPE I ... complete.
==> CWD /made-in-ovh/release/1.58-1.59 ... complete.
==> PASV ... complete.
==> LIST ... complete.
0K @ 84.96 KB/s
10:48:32 (84.96 KB/s) - `.listing' backup [87]
`.listing' deleted.
--10:48:32-- ftp://ftp.ovh.net/made-in-ovh/release/1.58-1.59/smartmontools-5.33-1.i386.rpm
=> `smartmontools-5.33-1.i386.rpm'
==> CWD isn't required.
==> PASV ... complete.
==> RETR smartmontools-5.33-1.i386.rpm ... complete.
Length: 342,512
0K .......... .......... .......... .......... .......... 14% @ 5.43 MB/s
50K .......... .......... .......... .......... .......... 29% @ 6.98 MB/s
100K .......... .......... .......... .......... .......... 44% @ 8.14 MB/s
150K .......... .......... .......... .......... .......... 59% @ 8.14 MB/s
200K .......... .......... .......... .......... .......... 74% @ 8.14 MB/s
250K .......... .......... .......... .......... .......... 89% @ 8.14 MB/s
300K .......... .......... .......... .... 100% @ 16.84 MB/s
10:48:32 (7.60 MB/s) - `smartmontools-5.33-1.i386.rpm' backup [342512]
End --10:48:32--
Download: 342,512 bits in 1 file
Preparing... ########################################### [100%]
1:smartmontools ########################################### [100%]
Shutting down smartd: [ OK ]
Starting smartd: [ OK ]
Restarted smartd services
smartd will continue to start up on system boot
Shutting down smartd: [ OK ]
Starting smartd: [ OK ]

If the whole process completes without errors, you will see the message:

Use smartctl -h to get a usage summary.

SMART is now enabled and has been added to the cron jobs.

Smartd

From now on, smartd regularly checks the hard disk, passes the information to the RTM log, and also records it in your system log:

[root@delirium /]# cat /var/log/messages | grep smartd
Mar 17 10:48:34 delirium smartd[990]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Mar 17 10:48:34 delirium smartd[990]: Device: /dev/hda, opened
Mar 17 10:48:34 delirium smartd[990]: Device: /dev/hda, found in smartd database.
Mar 17 10:48:35 delirium smartd[990]: Device: /dev/hda, is SMART capable. Adding to "monitor" list.
Mar 17 10:48:35 delirium smartd[990]: Device: /dev/hdb, opened
Mar 17 10:48:35 delirium smartd[990]: Device: /dev/hdb, not ATA, no IDENTIFY DEVICE Structure
Mar 17 10:48:35 delirium smartd[990]: Monitoring 1 ATA and 0 SCSI devices
Mar 17 10:48:35 delirium smartd: Lancement smartd succeeded
Mar 17 10:48:35 delirium smartd[2421]: smartd has fork()ed into background mode. New PID=2421.
Mar 17 13:48:35 delirium smartd[2421]: Device: /dev/hda, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 246 to 247
Mar 17 15:48:35 delirium smartd[2421]: Device: /dev/hda, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 247 to 246
Mar 17 17:18:35 delirium smartd[2421]: Device: /dev/hda, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 246 to 247

How should these entries be read? The hard disk keeps reporting a value that moves between 246 and 247, which is normal. If the value suddenly moved far outside this range, for example dropping from 247 towards the attribute's threshold, that would indicate abnormal behaviour. The next section gives more information about these values.

Tips

You can receive all of this information by email. To do so, add or change one line in the /etc/smartd.conf file:

[root@delirium /]# pico /etc/smartd.conf
# A very silent check. Only report SMART health status if it fails
# But send an email in this case
# /dev/hdc -H -m admin@example.com

To enter your own address and receive the emails, add the following line to the file (it is a configuration line, not a command):

/dev/hda -H -m jusu@adresas.com

As noted at the start of this guide, smartctl must be run with root privileges. Let us look at what else the command can do. The main smartctl options are listed below.

[root@delirium /]# smartctl -h
smartctl version 5.33 [i386-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Usage: smartctl [options] device

  -h, --help, --usage   Display this help and exit
  -i, --info            Show identity information for device
  -a, --all             Show all SMART information for device

Commands are entered in the following form:

[root@delirium /]# smartctl -i /dev/hda

This command shows information about the disk:

=== START OF INFORMATION SECTION ===
Device Model:     Maxtor 6E040L0
Serial Number:    E1KTPXFE
Firmware Version: NAR61590
User Capacity:    41,110,142,976 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
Local Time is:    Thu Mar 17 22:21:52 2005 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

To view the information the tool has collected, use the -a option:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82)
    Offline data collection activity was completed without error.
    Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0)
    The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (1021) seconds.
Offline data collection capabilities: (0x5b)
    SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    No Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    No General Purpose Logging support.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 17) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   252   252   063    Pre-fail  Always   -           2463
  4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always   -           18
  5 Reallocated_Sector_Ct   0x0033   253   253   063    Pre-fail  Always   -           0
  6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline  -           0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always   -           0
  8 Seek_Time_Performance   0x0027   247   238   187    Pre-fail  Always   -           46214
  9 Power_On_Minutes        0x0032   241   241   000    Old_age   Always   -           950h+09m
 10 Spin_Retry_Count        0x002b   252   252   157    Pre-fail  Always   -           0
 11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   253   253   000    Old_age   Always   -           22
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always   -           13
193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always   -           72
194 Temperature_Celsius     0x0032   253   253   000    Old_age   Always   -           31
195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always   -           25095
196 Reallocated_Event_Count 0x0008   253   253   000    Old_age   Offline  -           0
197 Current_Pending_Sector  0x0008   253   253   000    Old_age   Offline  -           0
198 Offline_Uncorrectable   0x0008   253   253   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x0008   199   199   000    Old_age   Offline  -           0
200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always   -           0
201 Soft_Read_Error_Rate    0x000a   251   138   000    Old_age   Always   -           1746
202 TA_Increase_Count       0x000a   253   252   000    Old_age   Always   -           0
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always   -           137
204 Shock_Count_Write_Opern 0x000a   253   252   000    Old_age   Always   -           0
205 Shock_Rate_Write_Opern  0x000a   253   252   000    Old_age   Always   -           0
207 Spin_High_Current       0x002a   252   252   000    Old_age   Always   -           0
208 Spin_Buzz               0x002a   252   252   000    Old_age   Always   -           0
209 Offline_Seek_Performnce 0x0024   187   183   000    Old_age   Offline  -           0
 99 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline  -           0
100 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline  -           0
101 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline  -           0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
[root@delirium /]#

The next step is to understand this output: the disk's uptime, its temperature and, most importantly, any errors. For that, pay most attention to the last two columns, WHEN_FAILED and RAW_VALUE, and to the error log section:

SMART Error Log Version: 1
No Errors Logged

Example:

  5 Reallocated_Sector_Ct   0x0033   016   016   063    Pre-fail  Always   FAILING_NOW 598

Here the normalized value (016) has fallen below the threshold (063), so the attribute is marked FAILING_NOW. The raw value, which for this attribute is typically the number of reallocated sectors, is 598. This disk needs to be checked. If the number rises quickly, back up your data and contact support.

On Debian, installation also requires root privileges. The package name depends on the Debian version. The example below uses Debian unstable. If you are using the OVH release, use apt-get install smartsuite instead. The procedure and the commands remain the same.

23:19 root@revolution / # apt-get install smartmontools
Reading package lists... Done
Building dependency tree.. Done
The following NEW packages will be installed:
  smartmontools
0 upgraded, 1 newly installed, 0 to remove and 60 not upgraded.
Need to get 222kB of archives.
After this operation, 508kB of additional disk space will be used.
Get: 1 http://ftp.fr.debian.org unstable/main smartmontools 5.32-3 [222kB]
222kB received in 0s (272kB/s)
Selecting previously deselected package smartmontools.
(Reading database... 67466 files and directories currently installed.)
Unpacking smartmontools (from .../smartmontools_5.32-3_i386.deb) ...
Setting up smartmontools (5.32-3) ...
Not starting S.M.A.R.T. daemon smartd, disabled via /etc/default/smartmontools

As you can see, the daemon was not started automatically. You need to edit the file /etc/default/smartmontools:

23:20 root@revolution /# pico /etc/default/smartmontools

# Defaults for smartmontools initscript (/etc/init.d/smartmontools)
# This is a POSIX shell fragment

# list of devices you want to explicitly enable S.M.A.R.T. for
# not needed if the device is monitored by smartd
enable_smart="/dev/hda /dev/hdb"

# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
#smartd_opts="--interval=1800"

Set enable_smart to your own disks and uncomment the start_smartd line so that smartd starts at system boot. Save the changes and start the service:

23:21 root@revolution /# /etc/init.d/smartmontools start
Enabling S.M.A.R.T. for: /dev/hda /dev/hdb.
Starting S.M.A.R.T. daemon: smartd.

23:21 root@revolution /# smartctl -a /dev/hda
smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen

That's it! smartmontools is a very useful tool that is easy to use. However, it does not replace backups. OVH offers additional backup services. For more details on smartmontools, see the official website.
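
As a complement to reading the smartctl -a report by eye, the check of the WHEN_FAILED column and of VALUE against THRESH can be sketched in shell. This is only a minimal sketch, not an OVH or smartmontools tool: the function name check_smart_attrs is hypothetical, and it assumes the ten-column attribute table layout shown in this guide (other smartctl versions or disks may print it differently).

```shell
#!/bin/sh
# Sketch: flag SMART attributes that need attention in a `smartctl -a` report.
# check_smart_attrs is a hypothetical helper name, not part of smartmontools.
# It reads the report on stdin and prints one line per suspicious attribute.
check_smart_attrs() {
    awk '
        # Attribute rows: numeric ID, hex FLAG in column 3, at least 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        $1 ~ /^[0-9]+$/ && $3 ~ /^0x[0-9a-fA-F]+$/ && NF >= 10 {
            value = $4 + 0
            thresh = $6 + 0
            if ($9 != "-") {
                # The disk itself reports a failure (for example FAILING_NOW).
                printf "%s %s: WHEN_FAILED=%s raw=%s\n", $1, $2, $9, $10
            } else if (thresh > 0 && value <= thresh) {
                # Normalized value has reached the threshold.
                printf "%s %s: value %d at or below threshold %d\n", $1, $2, value, thresh
            }
        }
    '
}
```

It could be used, as root, with: smartctl -a /dev/hda | check_smart_attrs. No output means that no attribute in the table was flagged; it does not replace reading the error log section or keeping backups.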