Check disk failure and send alert
When no disks have failed, the output of storcli is:

<syntaxhighlight>
[root@disk-test-node1 ~]# storcli64 /c0/vall show
Controller = 0
Status = Success
Description = None
Virtual Drives :
==============
-------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC     Size Name
-------------------------------------------------------------
0/0   RAID6 Optl  RW     No      RWBD  -   ON  1.063 TB
-------------------------------------------------------------
Cac=CacheCade|Rec=Recovery|OfLn=OffLine|Pdgd=Partially Degraded|dgrd=Degraded
Optl=Optimal|RO=Read Only|RW=Read Write|HD=Hidden|B=Blocked|Consist=Consistent|
R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack|
AWB=Always WriteBack|WT=WriteThrough|C=Cached IO|D=Direct IO|sCC=Scheduled
Check Consistency
</syntaxhighlight>

To add a health check in Bright:
1- Adding the healthcheck.

<syntaxhighlight>
# cmsh
% monitoring healthchecks
% add <healthcheck_name>
% set command <path_to_your_script>
% commit
</syntaxhighlight>
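The command should point at an executable script. Below is a minimal sketch of what <path_to_your_script> could contain, assuming the storcli64 binary is in the PATH and that Bright healthcheck scripts report their result by printing PASS or FAIL on stdout (verify both against your installation):

<syntaxhighlight>
#!/bin/bash
# Minimal RAID healthcheck sketch: report PASS while the virtual drive
# on controller 0 is in the Optimal (Optl) state, FAIL otherwise.
# The spaces around Optl keep the legend line (Optl=Optimal) from matching.
# With more than one virtual drive, check instead that no row reports a
# non-Optimal state.
if storcli64 /c0/vall show | grep -q ' Optl ' ; then
    echo PASS
else
    echo FAIL
fi
</syntaxhighlight>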
2- Configuring the healthcheck

<syntaxhighlight>
% monitoring setup healthconf <category_name>
% add <healthcheck_name>
% set checkinterval <interval>
% commit
</syntaxhighlight>

You can then attach a fail action to the healthcheck, such as sending an email alert or powering the node off. You can find more information about metrics and monitoring in Bright in chapter 9 of the Bright 7.0 administrator manual.
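As an illustration, a fail action can be attached from the same healthconf submode. The session below is a sketch that assumes the healthcheck configuration exposes a failactions property; check the available properties and the predefined action names (such as the e-mail and power-off actions mentioned above) with cmsh's help before relying on it:

<syntaxhighlight>
% monitoring setup healthconf <category_name>
% use <healthcheck_name>
% set failactions <action_name>
% commit
</syntaxhighlight>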
When a disk failure occurs, the output is:

<syntaxhighlight>
[root@disk-test-node1 ~]# storcli64 /c0/vall show
Controller = 0
Status = Success
Description = None
Virtual Drives :
==============
-------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC     Size Name
-------------------------------------------------------------
0/0   RAID6 Pdgd  RW     No      RWBD  -   ON  1.063 TB
-------------------------------------------------------------
Cac=CacheCade|Rec=Recovery|OfLn=OffLine|Pdgd=Partially Degraded|dgrd=Degraded
Optl=Optimal|RO=Read Only|RW=Read Write|HD=Hidden|B=Blocked|Consist=Consistent|
R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack|
AWB=Always WriteBack|WT=WriteThrough|C=Cached IO|D=Direct IO|sCC=Scheduled
Check Consistency
</syntaxhighlight>

To check whether any disk failure has occurred, we can grep for the Optimal (Optl) state; the spaces in the pattern stop the legend line (Optl=Optimal) from matching, so no output means a drive has failed:

<syntaxhighlight>
[root@disk-test-node1 ~]# storcli64 /c0/vall show |grep '\ Optl\ '
0/0   RAID6 Optl  RW     No      RWBD  -   ON  1.063 TB
</syntaxhighlight>
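Since grep exits with a non-zero status when nothing matches, the same check can drive a simple ad-hoc alert outside of Bright. A sketch, assuming a mail command such as mailx is installed; the recipient address is a placeholder:

<syntaxhighlight>
# E-mail an alert when no virtual drive is in the Optimal state.
storcli64 /c0/vall show | grep -q ' Optl ' || \
    echo "RAID degraded on $(hostname)" | mail -s "disk failure alert" root@example.com
</syntaxhighlight>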