Ever fancied a Nagios plugin that checks every sensor on a host without any hassle? Try this one. It collects each sensor’s input value and compares it to that sensor’s threshold (the script reads the threshold values from the system by itself). The plugin raises a warning when the ratio of the input value to the threshold value is 0.8 or more (you can change this with the -w option), and it reports a critical state when the ratio is greater than or equal to 1 (you can change that too, with the -c option, although I wouldn’t recommend it).
Oh, I almost forgot: you need to have the lm_sensors utility installed.
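Before wiring the plugin into Nagios, it may be worth confirming that the sensors binary is actually reachable. The snippet below is only a sketch: have_command is a hypothetical helper, not part of the plugin. Note that the script calls /bin/sensors by absolute path, so the location printed here should be compared against that.

```shell
#!/bin/bash
# Hypothetical helper (not part of the plugin): succeeds when the given
# command can be found in PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command sensors; then
    echo "sensors found at $(command -v sensors)"
else
    echo "sensors not found, install lm_sensors first" >&2
fi
```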
#!/bin/bash
# Needs gawk: match() is called with an array as its third argument.

# Standard Nagios plugin return codes
CODE_OK=0
CODE_WARNING=1
CODE_CRITICAL=2
CODE_UNKNOWN=3

WARNING_THRESHOLD="80%"
CRITICAL_THRESHOLD="100%"
DEBUG=0

while getopts ":C:S:c:w:dh" OPT; do
    case "${OPT}" in
        C)
            CHIP="${OPTARG}"
            ;;
        S)
            SENSORS="${OPTARG}"
            ;;
        c)
            CRITICAL_THRESHOLD="${OPTARG}"
            ;;
        w)
            WARNING_THRESHOLD="${OPTARG}"
            ;;
        d)
            DEBUG=1
            ;;
        h)
            cat <<__USAGE_END__ >&2
$0 [-C <chip>] [-S <sensors>] [-w <warn>] [-c <crit>] [-d] [-h]
    -C <chip>     - to read sensors on certain chip
    -S <sensors>  - to read sensors matching certain regexp
    -w <warning>  - warning rate threshold (80% by default)
    -c <critical> - critical rate threshold (100% by default)
    -d            - to print debug info
    -h            - you've just done it :)
__USAGE_END__
            exit ${CODE_UNKNOWN}
            ;;
        \?)
            echo "Invalid option: -${OPTARG}, see -h" >&2
            exit ${CODE_UNKNOWN}
            ;;
        :)
            echo "Option -${OPTARG} requires some argument, see -h" >&2
            exit ${CODE_UNKNOWN}
            ;;
    esac
done

if [ -z "${CHIP}" ]; then
    DATA=$(/bin/sensors -u)
else
    DATA=$(/bin/sensors -u "${CHIP}")
fi

if [ $? -ne 0 ]; then
    echo "Can't get sensors' data" >&2
    exit ${CODE_UNKNOWN}
fi

echo "${DATA}" | awk \
    -v CriticalT="${CRITICAL_THRESHOLD}" \
    -v WarningT="${WARNING_THRESHOLD}" \
    -v Sensors="${SENSORS}" \
    -v CodeOK="${CODE_OK}" \
    -v CodeUnknown="${CODE_UNKNOWN}" \
    -v CodeWarning="${CODE_WARNING}" \
    -v CodeCritical="${CODE_CRITICAL}" \
    -v Debug=${DEBUG} \
'
# Print the offending sensor together with the limit it has reached.
function Croak(Sensor, Input, Threshold) {
    printf("%s sensor shows %0.3f whilst its threshold is %0.3f (%0.0f%%)\n",
        Sensor, Input, Threshold, Input / Threshold * 100)
}

# Evaluate one complete sensor section and exit on the first problem found.
function Finish(Sensor, Input, Threshold, Alarm) {
    if (Debug) {
        printf("Finishing the %s sensor: input is %0.3f threshold is %0.3f alarm flag is %0.3f\n",
            Sensor, Input, Threshold, Alarm)
    }

    # Skip sensors that do not match the -S regexp.
    if (length(Sensors)) {
        if (Sensor !~ Sensors) {
            return
        }
    }

    if (Alarm + 0 > 0) {
        printf("%s sensor in alarm state\n", Sensor)
        exit CodeCritical
    }

    # "NN%" is a ratio of the system threshold, a bare number is an absolute limit.
    if (match(CriticalT, /^([[:digit:]]+)%$/, matched)) {
        if ((length(Threshold) > 0) && (Threshold + 0 != 0) && (Input / Threshold * 100 >= matched[1] + 0)) {
            Croak(Sensor, Input, Threshold)
            exit CodeCritical
        }
    } else {
        if (Input + 0 >= CriticalT + 0) {
            Croak(Sensor, Input, CriticalT)
            exit CodeCritical
        }
    }

    if (match(WarningT, /^([[:digit:]]+)%$/, matched)) {
        if ((length(Threshold) > 0) && (Threshold + 0 != 0) && (Input / Threshold * 100 >= matched[1] + 0)) {
            Croak(Sensor, Input, Threshold)
            exit CodeWarning
        }
    } else {
        if (Input + 0 >= WarningT + 0) {
            Croak(Sensor, Input, WarningT)
            exit CodeWarning
        }
    }
}

{
    # A "Label:" line opens a new sensor section, so finish the previous one.
    if (match($0, /^([^:]+):$/, matched)) {
        SensorNew = matched[1]
        if (length(Sensor) > 0) {
            Finish(Sensor, Input, Threshold, Alarm)
        }
        Sensor = SensorNew
        Input = ""
        Threshold = ""
        Alarm = ""
    }

    # Indented "name: value" lines carry the readings of the current sensor.
    if (match($0, /^[[:space:]]+([^:]+): ([0-9.-]+)$/, matched)) {
        Parameter = matched[1]
        Value = matched[2]
        if (Parameter ~ /_input$/) {
            Input = Value
        }
        # Keep the highest of the max/crit/emergency limits (numeric comparison).
        if (Parameter ~ /_(max|crit|emergency)$/) {
            if (Threshold == "" || Value + 0 > Threshold + 0) {
                Threshold = Value
            }
        }
        if (Parameter ~ /_alarm$/) {
            Alarm = Value
        }
    }
}

END {
    if (length(Sensor) > 0) {
        Finish(Sensor, Input, Threshold, Alarm)
    }
}'

STATUS="${?}"
if [ "${STATUS}" -lt 4 ]; then
    exit "${STATUS}"
else
    exit "${CODE_UNKNOWN}"
fi
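To make the parsing step easier to follow, here is a stripped-down sketch of the same idea in portable awk. The sample text imitates the layout that sensors -u prints (chip name, adapter line, a "Label:" line per sensor, then indented "name: value" pairs). The readings are made up, and sample_output and summarize are hypothetical names that do not appear in the plugin.

```shell
#!/bin/bash
# Made-up readings in the layout printed by "sensors -u".
sample_output() {
cat <<'EOF'
coretemp-isa-0000
Adapter: ISA adapter
Core 0:
  temp2_input: 45.000
  temp2_max: 80.000
  temp2_crit: 100.000
  temp2_crit_alarm: 0.000
EOF
}

# Print one line per sensor: its label, its input value and the highest
# of its max/crit/emergency limits.
summarize() {
    awk '
        function emit() {
            if (sensor != "")
                printf "%s input=%s threshold=%s\n", sensor, input, threshold
            input = ""
            threshold = ""
        }
        # A "Label:" line opens a new sensor section.
        /^[^:]+:$/ {
            emit()
            sensor = substr($0, 1, length($0) - 1)
            next
        }
        /^[ \t]+[^:]+_input: / {
            input = $NF + 0
        }
        /^[ \t]+[^:]+_(max|crit|emergency): / {
            if (threshold == "" || $NF + 0 > threshold)
                threshold = $NF + 0
        }
        END { emit() }
    '
}

sample_output | summarize
```

With this sample the sketch reports an input of 45 against a threshold of 100, because the crit limit is higher than the max limit and the highest one wins, as in the plugin.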
Here are some examples:
check_all_sensors.sh
This command checks every sensor on every chip. It raises the critical status when the input value is greater than or equal to 100% of the threshold value, and the warning status when the input value is greater than or equal to 80% of it. All threshold values are read from the system.
check_all_sensors.sh -c 90% -w 50%
It’s almost the same, but the critical ratio is 90% and the warning ratio is 50%.
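The percentage mode boils down to a single division. The sketch below shows it in isolation with the 90%/50% limits from this example; classify_ratio is a hypothetical helper rather than part of the plugin, and the readings are invented.

```shell
#!/bin/bash
# Hypothetical helper: classify a reading by its ratio to the threshold.
# Arguments: input, threshold, warning percent, critical percent.
classify_ratio() {
    awk -v input="$1" -v threshold="$2" -v warn="$3" -v crit="$4" 'BEGIN {
        # A ratio cannot be computed without a usable threshold.
        if (threshold + 0 == 0) { print "UNKNOWN"; exit }
        ratio = input / threshold * 100
        if (ratio >= crit + 0)      print "CRITICAL"
        else if (ratio >= warn + 0) print "WARNING"
        else                        print "OK"
    }'
}

classify_ratio 45 100 50 90   # 45% of the threshold: OK
classify_ratio 60 100 50 90   # 60% of the threshold: WARNING
classify_ratio 95 100 50 90   # 95% of the threshold: CRITICAL
```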
check_all_sensors.sh -C zaloopa -S '^Temperature [0-9]+$' -c 90 -w 75
Only the temperature sensors on the zaloopa chip are checked. The regexp is passed without surrounding slashes, because the script hands it to awk as a plain string. The critical status is raised when the absolute input value is greater than or equal to 90 degrees, and the warning status when it is greater than or equal to 75 degrees.
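The difference between "90%" and "90" is easy to miss, so here is a small sketch of how the two forms could be told apart. It assumes the same rule the script uses (a number followed by a percent sign is a ratio, anything else is an absolute value); exceeds is a hypothetical helper and the numbers are invented.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the reading reaches the limit.
# $1 = input value, $2 = threshold reported by the system,
# $3 = user limit, either a ratio such as "90%" or an absolute value such as "90".
exceeds() {
    awk -v input="$1" -v threshold="$2" -v limit="$3" 'BEGIN {
        if (limit ~ /^[0-9]+%$/) {
            # Ratio mode: compare input/threshold with the percentage.
            sub(/%$/, "", limit)
            hit = (threshold + 0 != 0 && input / threshold * 100 >= limit + 0)
        } else {
            # Absolute mode: the system threshold is ignored.
            hit = (input + 0 >= limit + 0)
        }
        exit(hit ? 0 : 1)
    }'
}

# A reading of 92 against a system threshold of 105:
exceeds 92 105 "90"  && echo "92 reaches the absolute limit of 90"
exceeds 92 105 "90%" || echo "92 of 105 stays below 90%"
```

The same reading is therefore critical with -c 90 but not with -c 90%, which is worth keeping in mind when mixing the two styles.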
I’ve also saved it as a Gist.