OmniCheck was originally implemented on HP/UX 9.05, and has been successfully ported to all current revisions of the following operating systems:
NOTE: may not be used with files that consist of
multiple physical lines.
The theory goes that a single component of a farm can endure a failure
without causing adverse impact to the farm as a whole.
See 'production' above for true and false values. When the farm value is true, the effect is the same as if the production value is false.
NOTE: not available in persistent mode.
NOTE: not available in persistent mode.
The word
The word
By default, the same account name is used to form both the email-to-pager
address and the follow-up email address. If this is undesirable, you can
split the two addresses in the following manner:
Here, any page generated by this action will go to pager@pager.foo.com,
while the accompanying mail will go to mail@foo.net. This value
can also be set in the
The 'file' action has the ability to
interpret the values of the parenthesized data within each matched
log entry, and use that data to alter the filename being opened for
appending.
The 'exec' action has the ability to
interpret the values of the parenthesized data within each matched
log entry, and use that data to alter the script name and/or the
parameters passed to the script. In these situations, the script
will be invoked one time per matched log entry, whereas the default
behavior is to pass all matching log entries to a single invocation
of the script.
When the
The valid relations are:
There must be a space after the word
Actions can be coded to only activate when a specific
organization is using the rule file. This feature reads the
'organization' configuration file entry to test against the
login in the rule file. In the following example, the FOO
organization will get "host issue" mail, the BAR team will
get a "fix me" page to their oncall, and everyone else
will ignore the pattern:
If there are actions outside, or after, an
Note: pattern-action interactions are now functional, so that patterns like
this:
Then, you need to add a threshold to your actions:
On each iteration of OmniCheck, the number of matches for all patterns
that have been tagged will be stored in a
If the preface control word 'under' is used, and the number of matches for a
particular pattern (including those matched within the current iteration) is
less than or equal to the quantity per time unit specified in the action, then
the action is invoked; otherwise, it is not.
The default control is 'over'.
Changes from last version
Installation
Management via iterative process (cron)
mkdir /opt/omnicheck # or wherever you wish
mkdir /opt/omnicheck/logs # or wherever you wish
cd {install_dir}
vi configfile
vi {install_dir}/rules.{nodename}
/usr/local/bin/perl {install_dir}/omnicheck -F {install_dir}/configfile
Command-Line Options
The command-line options are as follows:
-F /oap/omnicheck.config
SIGHUP
.
Configuration
The configuration file contains all the necessary information
to properly run an instance of OmniCheck. Below is a list
of the available entries:
process: syslog
process: sapi-apps
home: /oap
home: /opt/omnicheck
name: foobar
name: NODE
#!
)
@
)
file: /usr/adm/syslog/syslog.log
file: #!/bin/df -k
file: /oap/logs/*.err
file: @/opt/omnicheck/file.list
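The four example values above suggest four kinds of 'file' entry. The sketch below is not OmniCheck's own code; it is a hypothetical Python helper (classify_file_entry is an invented name) that sorts a value into one of those kinds. It assumes that a leading #! means "run this command and monitor its output" and that wildcard characters mean a shell-style glob; only the meaning of the leading @ (a filelist) is stated in this document.

```python
def classify_file_entry(value):
    """Return (kind, target) for the value of a 'file:' entry.

    Hypothetical helper: the kinds are inferred from the examples,
    not taken from OmniCheck itself.
    """
    value = value.strip()
    if value.startswith("#!"):
        # Assumption: the rest of the value is a command whose output is monitored.
        return ("command", value[2:])
    if value.startswith("@"):
        # A filelist: a file that names the files to monitor.
        return ("filelist", value[1:])
    if any(ch in value for ch in "*?["):
        # Assumption: wildcard characters are expanded like a shell glob.
        return ("glob", value)
    return ("path", value)


if __name__ == "__main__":
    for entry in ("/usr/adm/syslog/syslog.log", "#!/bin/df -k",
                  "/oap/logs/*.err", "@/opt/omnicheck/file.list"):
        print(entry, "->", classify_file_entry(entry))
```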
file
configuration file entry.
OmniCheck will parse any data that was written to this file after the last
run, but before the file was rotated. This feature only works when a
single file is being monitored in the block.
oldfile: /usr/adm/oSYSLOG
oldfile: /oap/logs/.old/process-a.err
oldfile: #!/oap/calc_oldfile_name.sh
See the section on integrating oldfiles into filelists for documentation
on specifying oldfiles for files within a filelist.
Flags to control OmniCheck's function:
production: no # or 0 or off
production: yes # or 1 or on
farm: no # or 0 or off
farm: yes # or 1 or on
maint: no # or 0 or off
maint: yes # or 1 or on
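As an illustration of how the flag values might be read, the sketch below maps yes/1/on to true and no/0/off to false, and applies the stated rule that a true 'farm' value has the same effect as a false 'production' value. The function names are hypothetical, and the case-insensitive comparison is an assumption rather than documented behavior.

```python
TRUE_VALUES = {"yes", "1", "on"}
FALSE_VALUES = {"no", "0", "off"}


def parse_flag(value):
    """Convert a flag value such as 'yes' or 'off' to a bool."""
    text = value.strip().lower()  # assumption: case does not matter
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    raise ValueError("not a recognized flag value: %r" % value)


def effective_production(production, farm):
    """A true 'farm' flag has the same effect as a false 'production' flag."""
    return parse_flag(production) and not parse_flag(farm)


if __name__ == "__main__":
    print(effective_production("yes", "no"))
    print(effective_production("yes", "on"))
```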
Required Entries for sending mail or pages
smtphost: localhost
smtphost: relay.mail.here.com
smtphost: /usr/lib/sendmail -t
pagerhost: pager.foo.com
pagerhost: page.mail.here.com
admin: jblow
admin: jblow@here.com
admin: /oap/omnicheck_admin
admin: #!/opt/omnicheck/get_admin.sh
oncall: jblow
oncall: jblow@here.com
oncall: /oap/omnicheck_oncall
oncall: #!/opt/omnicheck/get_oncall.sh
organization: QA_Team
organization: NorthAm.Prod
organization: Foobar
fqdn: foobar.db.foo.com
Method of integrating oldfiles into filelists
Configuration data for 'oldfiles' can be added to the contents of a
filelist (the 'file' configuration value starts with an at-sign, @).
Only single files can have their oldfiles specified in
this manner.
syslog.log{tab}syslog.log.0
syslog.log{tab}syslog.log.gz
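A filelist in this format could be read as sketched below: each line holds a file name, optionally followed by a tab and the name of its oldfile. parse_filelist is a hypothetical helper, and skipping blank lines is an assumption.

```python
def parse_filelist(text):
    """Return a list of (file, oldfile) pairs; oldfile is None when absent."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # Split on the first tab only; anything after it is the oldfile.
        name, _, oldfile = line.partition("\t")
        entries.append((name.strip(), oldfile.strip() or None))
    return entries


if __name__ == "__main__":
    sample = "syslog.log\tsyslog.log.0\nmessages\n"
    for name, oldfile in parse_filelist(sample):
        print(name, oldfile)
```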
Rule files
Rule files are the core of OmniCheck: they provide the patterns
to use, and the actions to take when a pattern matches against the
data being monitored. Each rule in the file must be separated by
one or more blank lines, and consists of two parts: the pattern
and the actions.
Patterns
The pattern follows Perl's regular expression syntax, with some additional
features. The following are in order of precedence.
The patterns must follow proper Perl regular expression syntax. Any
occurrence of these special characters in the data to monitor must be
escaped with backslashes
Abra##pattern
actions
20030124##pattern
actions
! detected co-location site
actions
pattern-a +3 pattern-b
actions
pattern-a
+3 pattern-b
actions
pattern-a ... pattern-b
actions
pattern-a
... pattern-b
actions
pattern-a && pattern-b
actions
pattern-a
&& pattern-b
actions
pattern-a|pattern-b
actions
pattern-a
|| pattern-b
actions
\
in the pattern:
Any pattern with non-escaped special characters will be considered
corrupted, will not be used by OmniCheck, and will be noted in
the report file (if enabled) and/or in debug output (if enabled).
Actions
The actions are the list of what to do when a pattern matches. The
available actions are:
mail admin ; test message
admin
will be translated to the value of
the admin
in the configuration file.
page oncall ; system reboot
oncall
will be translated to the value of
the oncall
in the configuration file.
page pager@pager.foo.com/mail@foo.net ; split addresses
oncall
configuration file entry.
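The split-address form of the 'page' target can be pictured with the hypothetical helper below: the part before the slash receives the page and the part after it receives the follow-up mail, and a target without a slash is used for both. The sketch applies only to address-style targets; it does not resolve the oncall keyword or file-based values.

```python
def split_page_target(target):
    """Return (pager_address, mail_address) for a 'page' action target."""
    pager, slash, mail = target.partition("/")
    if not slash:
        # Default behavior: the same name is used for the page and the mail.
        return (target, target)
    return (pager, mail)


if __name__ == "__main__":
    print(split_page_target("pager@pager.foo.com/mail@foo.net"))
    print(split_page_target("jblow"))
```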
file /usr/adm/logs/separate.log
file /usr/adm/logs/file_$1.log
file /usr/adm/logs/file_@1.log
exec /usr/local/bin/process_data.sh
exec /usr/local/bin/new_output.sh -d @1 -m @2
exec --ignore /usr/local/bin/new_output.sh -d @1 -m @2
--ignore
option is used, the script does not
receive the matched log entries as STDIN. This is to allow
the external script/program to run without needing to manage the
matching log entries if it is not designed to do so.
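The difference between the 'exec' forms can be sketched as a plan of invocations. The helper below is an illustration, not OmniCheck's implementation: it returns one (command line, STDIN text) pair per planned run, without running anything. It assumes that a command containing @1, @2, and so on is run once per matched entry with only that entry on STDIN, that a command without them is run once with all matched entries on STDIN, and that --ignore means no STDIN at all. It performs no shell quoting of the captured values.

```python
import re


def fill(command, captures):
    """Replace @1, @2, ... with the captured values (1-based)."""
    return re.sub(r"@(\d+)", lambda m: captures[int(m.group(1)) - 1], command)


def plan_exec(command, matches, ignore=False):
    """matches is a list of (log_line, captures) pairs.

    Returns a list of (command_line, stdin_text) pairs; stdin_text is
    None when the script should not receive the matched entries.
    """
    plans = []
    if re.search(r"@\d+", command):
        # Captured data in the command: one invocation per matched entry.
        for line, captures in matches:
            stdin_text = None if ignore else line + "\n"
            plans.append((fill(command, captures), stdin_text))
    else:
        # Default: a single invocation that receives every matched entry.
        stdin_text = None if ignore else "".join(line + "\n" for line, _ in matches)
        plans.append((command, stdin_text))
    return plans


if __name__ == "__main__":
    found = [("err a 1", ["a", "1"]), ("err b 2", ["b", "2"])]
    for plan in plan_exec("/usr/local/bin/new_output.sh -d @1 -m @2", found, ignore=True):
        print(plan)
```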
modify --prepend "this" ;
modify --prepend "#!/output/of/script args" ;
modify --append "that" ;
modify --append "#!/output/of/script args" ;
modify --replace "this" "that" ;
modify --replace "this" "#!/output/of/script args" ;
modify --replace "regex" "that" ;
modify --replace "regex" "#!/output/of/script args" ;
Using --replace to simulate --prepend:
modify --replace "^" "that" ;
modify --replace "^" "#!/output/of/script args" ;
Using --replace to simulate --append:
modify --replace "$" "that" ;
modify --replace "$" "#!/output/of/script args" ;
Instances of "this" and "that" represent simple text strings; "regex"
represents a Perl regular expression; and /output/of/script
represents some external program. The args of the script can
contain the same $1, $2 variables as other actions:
see Pattern-action interaction.
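The equivalence between --replace with "^" or "$" and --prepend or --append can be checked with a small sketch. It uses Python's re module in place of Perl, treats the first --replace argument as a regular expression in every case, and replaces every occurrence, all of which are assumptions. The "#!" script form is not covered, and modify is a hypothetical function name.

```python
import re


def modify(line, mode, first, second=None):
    """Apply a prepend, append, or replace modification to one log line."""
    if mode == "prepend":
        return first + line
    if mode == "append":
        return line + first
    if mode == "replace":
        # A function is used as the replacement so that backslashes in
        # 'second' are inserted literally.
        return re.sub(first, lambda m: second, line)
    raise ValueError("unknown mode: %r" % mode)


if __name__ == "__main__":
    entry = "disk full"
    print(modify(entry, "prepend", "WARN: "))
    print(modify(entry, "replace", "^", "WARN: "))
    print(modify(entry, "append", " (checked)"))
    print(modify(entry, "replace", "$", " (checked)"))
```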
ignore juser ; junk messages
Altering when actions act
OmniCheck can be instructed to take a specified action only
if a specific number of lines match the pattern. This is known as a threshold
within OmniCheck, and its syntax is this:
if >= 10 mail admin ; test messages
if
and after
the numeric value. Space between the relation and numeric
value is optional.
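The spacing rules for a count threshold can be expressed as a regular expression, as in the sketch below. The set of relations accepted here (>=, <=, >, <, ==, !=) is an assumption made for illustration; only >= appears in the example above. threshold_action is a hypothetical helper that returns the action text when the match count satisfies the relation.

```python
import operator
import re

# Assumed set of relations; only ">=" is confirmed by the example.
RELATIONS = {
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
    "==": operator.eq, "!=": operator.ne,
}

# A space is required after "if" and after the number; the space between
# the relation and the number is optional.
THRESHOLD_RE = re.compile(r"^if\s+(>=|<=|==|!=|>|<)\s*(\d+)\s+(.+)$")


def threshold_action(line, match_count):
    """Return the action text if the threshold is met, otherwise None."""
    m = THRESHOLD_RE.match(line.strip())
    if m is None:
        raise ValueError("not a count threshold: %r" % line)
    relation, limit, action = m.groups()
    if RELATIONS[relation](match_count, int(limit)):
        return action
    return None


if __name__ == "__main__":
    print(threshold_action("if >= 10 mail admin ; test messages", 12))
    print(threshold_action("if >=10 mail admin ; test messages", 9))
```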
<pattern>
if org eq "FOO" mail admin-team ; host issue
elsif org eq 'BAR' page oncall ; fix me
else ignore admin ; not important
endif
Use either single (') or double (") quotes to surround the
organization value within the rule.
if
block, they
will always take effect. In this example, only instances
used by the FOO organization will get the "host issue" mail,
but all instances will send the "fix tomorrow" mail:
<pattern>
if org eq "FOO" mail admin-team ; host issue
endif
mail admin ; fix tomorrow
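The way organization-sensitive blocks select actions can be sketched as follows. This is an illustrative reading of the examples, not OmniCheck's parser: the first branch whose organization matches wins, 'else' applies when no branch matched, and any action outside the block always takes effect. The helper name, the one-action-per-branch layout, and the case-sensitive comparison are assumptions.

```python
import re

# Matches: if org eq "FOO" <action>   or   elsif org eq 'BAR' <action>
BRANCH_RE = re.compile(r"""^(if|elsif)\s+org\s+eq\s+(["'])(.*?)\2\s+(.+)$""")


def select_actions(lines, organization):
    """Return the actions that apply to the given organization."""
    selected = []
    in_block = False
    branch_taken = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        m = BRANCH_RE.match(line)
        if m:
            if m.group(1) == "if":
                in_block, branch_taken = True, False
            if not branch_taken and m.group(3) == organization:
                selected.append(m.group(4))
                branch_taken = True
        elif in_block and line.startswith("else "):
            if not branch_taken:
                selected.append(line[len("else "):].strip())
                branch_taken = True
        elif line == "endif":
            in_block = False
        else:
            # Actions outside an if block always take effect.
            selected.append(line)
    return selected


if __name__ == "__main__":
    rule = [
        'if org eq "FOO" mail admin-team ; host issue',
        "elsif org eq 'BAR' page oncall ; fix me",
        "else ignore admin ; not important",
        "endif",
        "mail admin ; fix tomorrow",
    ]
    print(select_actions(rule, "BAR"))
```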
NOTE: Organization-sensitive actions and thresholded
actions are currently mutually exclusive, i.e., you cannot do this:
if (org eq "FOO" && >= 10) mail admin ; FOO and over 10
if (>= 10 || org eq "BAR") mail admin ; over 10 or BAR
If this is a feature that is requested, the effort will be applied
to work out the parsing logic. For now, however, you can do one
or the other.
Pattern-action interaction
OmniCheck can capture sections of the lines that match the pattern and
use them in the actions, for example as pieces of the subject of a mail.
Parentheses are used to surround the section of the pattern to capture, then
numbered variables ($1, $2, etc.) are used to insert the captured values into
the action:
ftpd\[\d+\]: FTP LOGIN FROM (\d+\.\d+\.\d+\.\d+) as (\w+)
mail admin ; FTP from $1 as $2
Error: (\w+) ... Description: "([^"]+)"
will now capture the data within two parentheses and provide it as expected.
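The capture-and-substitute step can be tried out with the sketch below, which uses Python's re module as a stand-in for Perl regular expressions (the FTP pattern from the example behaves the same in both). expand_subject is a hypothetical helper and the sample log line is invented for the demonstration.

```python
import re


def expand_subject(pattern, template, log_line):
    """Fill $1, $2, ... in the template from the pattern's captures.

    Returns None when the pattern does not match the log line.
    """
    match = re.search(pattern, log_line)
    if match is None:
        return None

    def replace(ref):
        index = int(ref.group(1))
        if 1 <= index <= match.re.groups:
            return match.group(index) or ""
        return ref.group(0)  # leave unknown references untouched

    return re.sub(r"\$(\d+)", replace, template)


if __name__ == "__main__":
    ftp_pattern = r"ftpd\[\d+\]: FTP LOGIN FROM (\d+\.\d+\.\d+\.\d+) as (\w+)"
    sample = "Jan 24 10:00:01 host ftpd[1234]: FTP LOGIN FROM 10.1.2.3 as jblow"
    print(expand_subject(ftp_pattern, "FTP from $1 as $2", sample))
```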
Thresholding Use
To use the thresholding feature, you need to tag the pattern with a label.
This label needs to be two or more alphanumeric
characters long, the first being alphabetic (think variable name), followed
by a double pound sign (##), similar to the pattern expiration feature.
Alpha##foobar
if 30/day mail admin ; lots of foobar
if over 30/day mail admin ; lots of foobar
if under 30/day mail admin ; not enough foobar
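The label and the rate-style threshold shown above could be parsed as in the following sketch. The label rule (two or more alphanumeric characters, the first alphabetic, followed by ##) and the default control of 'over' come from this document; the function names are hypothetical, and the time unit is kept as plain text because only "day" appears in the examples.

```python
import re

# Two or more alphanumeric characters, the first alphabetic, then "##".
LABEL_RE = re.compile(r"^([A-Za-z][A-Za-z0-9]+)##(.*)$")
# if [over|under] <quantity>/<unit> <action>
RATE_RE = re.compile(r"^if\s+(?:(over|under)\s+)?(\d+)/(\w+)\s+(.+)$")


def split_label(pattern_line):
    """Return (label, pattern); label is None when the line has no label."""
    m = LABEL_RE.match(pattern_line)
    if m is None:
        return (None, pattern_line)
    return (m.group(1), m.group(2))


def parse_rate_threshold(action_line):
    """Return the parts of a rate threshold, or None if the line is not one."""
    m = RATE_RE.match(action_line.strip())
    if m is None:
        return None
    control, quantity, unit, action = m.groups()
    return {
        "control": control or "over",  # 'over' is the default control
        "quantity": int(quantity),
        "unit": unit,
        "action": action,
    }


if __name__ == "__main__":
    print(split_label("Alpha##foobar"))
    print(parse_rate_threshold("if under 30/day mail admin ; not enough foobar"))
```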
.thresh
file in
the tmpdir
directory, timestamped to when the match occurred.
Also, the .thresh
file is kept manageable by trimming off data
entries older than twice the maximum threshold time value used in the rule files.
If the preface control word 'over' is used, and the number of matches for a
particular pattern (including those matched within the current iteration) is
greater than or equal to the quantity per time unit specified in the action,
then the action is invoked; otherwise, it is not.
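The 'over' and 'under' decisions, together with the trimming of old entries, can be sketched with plain timestamps. This is an illustration under assumptions: the time unit is treated as a trailing window measured in seconds back from the current time, and both function names are hypothetical. The list of timestamps is expected to include the matches from the current iteration.

```python
def threshold_fires(timestamps, now, window_seconds, quantity, control="over"):
    """Decide whether a thresholded action should be invoked."""
    # Count the matches that fall inside the trailing time window.
    count = sum(1 for t in timestamps if now - t <= window_seconds)
    if control == "over":
        return count >= quantity
    if control == "under":
        return count <= quantity
    raise ValueError("control must be 'over' or 'under'")


def trim_history(timestamps, now, max_window_seconds):
    """Drop entries older than twice the largest threshold window."""
    return [t for t in timestamps if now - t <= 2 * max_window_seconds]


if __name__ == "__main__":
    DAY = 86400
    now = 1000000
    stamps = [now - 3 * DAY, now - 100, now - 50, now]
    print(threshold_fires(stamps, now, DAY, 3, "over"))
    print(threshold_fires(stamps, now, DAY, 2, "under"))
    print(trim_history(stamps, now, DAY))
```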