SMF

Solutions in SolveWare subject SMF are designed to automate management of the System Management Facilities (SMF). SMF is a standard MVS feature that collects and records system and job-related information. This information is kept in SMF datasets to be used later as input for user-written report programs.

All messages (DO SHOUT actions) in SMF rules are sent to an INCONTROL user named U-SYSADMIN. A user with this name must be defined in the IOA Dynamic Destination table (CTMDEST). For more information, see the Dynamic Destination Table chapter of the INCONTROL for z/OS Administrator Guide.

In many cases, rule definitions make use of the inverse IN condition feature. This feature activates rules only if the specified IN conditions are not set. For more information about using inverse IN conditions, see Inverse IN Conditions.
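The effect of an inverse IN condition can be illustrated with a minimal Python sketch. It is a model of the behavior described above, not Control-O code: the function name rule_is_active and the in-memory set standing in for the conditions file are assumptions made for illustration.

```python
def rule_is_active(inverse_in_conditions, conditions_set):
    """Return True when none of the rule's inverse IN conditions is currently set."""
    return not any(cond in conditions_set for cond in inverse_in_conditions)


# Hypothetical snapshot of the conditions that are currently set.
conditions_set = {("CTO-IEE363I-THRESH", "STAT")}

# A rule defined with IN !CTO-IEE363I-THRESH STAT is inactive while the condition exists.
print(rule_is_active([("CTO-IEE363I-THRESH", "STAT")], conditions_set))  # False

# A rule whose inverse IN condition is not set remains active.
print(rule_is_active([("CTO-IEE364I-THRESH", "STAT")], conditions_set))  # True
```

Deleting the condition from the set makes the first rule active again, which is how the solutions in this subject reactivate rules.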

Solutions Provided

SolveWare subject SMF contains the following solutions:

  • Copy and Clean SYS1.MANx

    Handles situations in which SMF dataset SYS1.MANx becomes full. A copy and clean job is triggered automatically to prevent loss of system information.

  • SMF Problem Alerts

    Manages error messages issued by SMF. A message is sent notifying the system administrator of the problem.

  • SMF Rule Thresholds

    Handles exceeded thresholds of all other SMF rules.

Copy and Clean SYS1.MANx

SMF datasets are named SYS1.MANx, where x is any alphabetic or numeric character (that is, from A through Z, or from 0 through 9). At any one time, a single dataset is the current SMF dataset. SMF records information in this dataset until it becomes full.
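The naming convention can be expressed as a pattern. The following Python sketch is illustrative only; the helper name is_smf_dataset is an assumption and is not part of SolveWare.

```python
import re

# SYS1.MANx, where x is one character from A through Z or 0 through 9.
SMF_DATASET_RE = re.compile(r"SYS1\.MAN[A-Z0-9]")


def is_smf_dataset(name):
    """Return True when name follows the SYS1.MANx naming convention."""
    return SMF_DATASET_RE.fullmatch(name) is not None


print(is_smf_dataset("SYS1.MAN1"))     # True
print(is_smf_dataset("SYS1.MANA"))     # True
print(is_smf_dataset("SYS1.PARMLIB"))  # False
```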

When the current SMF dataset becomes full, a switch is automatically performed to an empty dataset, which becomes the current dataset, and a message is issued. A message is also issued if no alternate dataset is available.

In response to either message, the operator must dump and clean full datasets so that they can be used again by SMF. Failing to do so can result in losing important system information.

This solution responds to both messages (which means that SMF datasets were switched or that no datasets are available) by initiating a pre-scheduled job in Control-M to dump and clean the full SMF datasets.

Rules

The Copy and Clean SYS1.MANx solution includes the SMF Dataset Switched, No Empty Dataset Found rule.

Rules Structure

The following table describes the structure of the Copy and Clean SYS1.MANx solution rule.

Table 63 SMF Dataset Switched, No Empty Dataset Found Rule Structure

Item

Description

Title

SMF Dataset Switched, No Empty Dataset Found

Name

IEE362A

Table

SMF

Message

Any of the following messages:

IEE362A SMF ENTER DUMP FOR SYS1.MANx ON volser

IEE361I SMF DATA LOST - NO DATASETS AVAILABLE, DATA BEING BUFFERED TIME = hh.mm.ss

IEE366I NO SMF DATASETS AVAILABLE - DATA BEING BUFFERED TIME= hh.mm.ss

Message Description

  • IEE362A – The current SMF dataset has filled up, or a HALT EOD or SWITCH SMF command has been issued. No more SMF records are written to the old SMF dataset, whose name and volser are indicated in the message.

  • IEE361I, IEE366I – The current SMF dataset has filled up and no empty alternate dataset was found.

Basic Scheduling Parameters

Always schedule this rule.

Runtime Scheduling Parameters

No special considerations

Global Variables

%%DUMP_SYS1MANx
Flag that indicates whether to dump SMF dataset SYS1.MANx. This variable is set for every SMF dataset.

Valid values are:

  • YES

  • NO

Rule Logic

The rule is triggered either when SMF dataset SYS1.MANx is switched, or when no empty SMF dataset is found. Either situation requires a user response to initiate a job or started task that executes utility IFASMFDP. This utility dumps and cleans the full SMF datasets.

The rule issues command D SMF to obtain SMF dataset information from response message IEE949I and then uses the "percentage-full" value of each dataset to determine which SMF datasets are to be dumped. The current SMF dataset is not dumped by the rule.

Flags for each SMF dataset, which indicate individually whether to dump the dataset, are set in Global AutoEdit variables that are referenced by the job JCL. This is done by including the Control-O $GLOBAL member in the job JCL and defining %%LIBSYM and %%MEMSYM AutoEdit control statements.

The rule sets a condition to trigger a pre-scheduled job in Control-M. The job executes utility IFASMFDP that dumps SMF datasets.

The job must be defined as a cyclic job or cyclic started task (STC). It is triggered by adding the prerequisite condition or date CTO-SMFDUMP-GO 0101. (For more details on the job scheduling definition, see Customization in this table.)

Once the rule has been triggered, it is temporarily deactivated by setting its own inverse IN condition. This prevents multiple triggering of the rule caused by messages appearing before the job has finished cleaning the SYS1.MANx datasets.

Once the job has successfully executed, condition CTO-IEE362A-HANDLED is deleted to reactivate the rule. The condition that triggered the cyclic job is also deleted upon completion of the job.

Rule Actions

  • Sets a condition to deactivate the rule temporarily (see Rule Logic in this table).

  • Sets variable %%RESPMSG to IEE949I.

  • Issues operator command DISPLAY SMF.

  • If a response is received, the following actions are performed:

    • Analyzes response message IEE949I and sets Global variable %%DUMP_SYS1MANx for each SMF dataset to either YES (if it needs dumping) or NO (if it does not need dumping).

    • Issues a command to instruct Control-O to write the Global variables.

    • Adds a prerequisite condition to start a job in Control-M to dump the SMF dataset.
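The analysis step in the actions above can be sketched in Python. This is a minimal model under stated assumptions, not the rule's actual implementation: the line layout used for the IEE949I response (prefix, dataset name, volser, size, percentage full, status), the use of the ACTIVE status to identify the current dataset, and the expansion of %%DUMP_SYS1MANx by replacing x with the dataset suffix are all assumptions made for illustration. The value 90 is the current setting mentioned under Customization in this table.

```python
import re

# Assumed layout of one IEE949I data line:
# prefix (P- or S-), dataset name, volser, size in blocks, percentage full, status.
LINE_RE = re.compile(
    r"^\s*[PS]-(?P<name>SYS1\.MAN[A-Z0-9])\s+(?P<volser>\S+)\s+"
    r"(?P<size>\d+)\s+(?P<pct>\d+)\s+(?P<status>.+?)\s*$"
)

DUMP_THRESHOLD = 90  # percentage full; adapt to site requirements


def dump_flags(response_lines, threshold=DUMP_THRESHOLD):
    """Map each dataset's flag variable to YES or NO based on its percentage-full value."""
    flags = {}
    for line in response_lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # title and column-header lines carry no dataset data
        is_current = m["status"].upper().startswith("ACTIVE")  # assumed marker
        needs_dump = (not is_current) and int(m["pct"]) >= threshold
        flags["%%DUMP_SYS1MAN" + m["name"][-1]] = "YES" if needs_dump else "NO"
    return flags


# Hypothetical response text, for illustration only.
sample = [
    "IEE949I 10.15.00 SMF DATA SETS",
    "        NAME        VOLSER SIZE(BLKS) %FULL  STATUS",
    "      P-SYS1.MAN1   SYS001       1200   100  DUMP REQUIRED",
    "      S-SYS1.MAN2   SYS001       1200    95  ACTIVE",
    "      S-SYS1.MAN3   SYS001       1200     0  ALTERNATE",
]

print(dump_flags(sample))
# {'%%DUMP_SYS1MAN1': 'YES', '%%DUMP_SYS1MAN2': 'NO', '%%DUMP_SYS1MAN3': 'NO'}
```

In this sample SYS1.MAN2 is 95% full but is not flagged, because the current SMF dataset is not dumped by the rule.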

Activating the Rule

Once ordered, the rule remains active until one of the messages IEE362A, IEE361I or IEE366I exceeds a predefined threshold. (For more information regarding threshold handling, see SMF Rule Thresholds.)

The rule is also temporarily deactivated when it is triggered and reactivated after the copy and clean job finishes OK (see Rule Logic in this table).

Recommended Mode or Category

If a different automatic mechanism to clean SYS1.MANx (for example, SMF Exit IEFU29) is already implemented, the mechanism must be removed before testing this rule.

During the testing period the rule must be activated in LOG mode. Once you are satisfied with the results of the rule, change the mode to PROD to avoid log messages for the rule.

The SolveWare category for this rule is 2—some customization is required before implementation.

Customization

The rule dumps each dataset whose "percentage-full" value is not less than a certain value. Currently, the rule dumps datasets that are at least 90% full. This value must be adapted (in the rule definition) to site requirements.

A job scheduling definition and JCL for the SYS1.MANx copy and clean job must be created. The SOLVSCHD and SOLVJCL libraries contain a sample scheduling definition and JCL to copy and clean SYS1.MANx. These samples can be adapted to a site's conventions and requirements.

The job must be defined as a cyclic job or cyclic started task (STC) with a MAXWAIT value of 99. It then only needs to be ordered once, but must not be removed manually from the Control-M Active Jobs file. The MAXWAIT value of 99 ensures that the job is never removed from the Active Jobs file by the Control-M New Day procedure.

The cyclic job is always ready for submission. It is triggered by adding the prerequisite condition or date CTO-SMFDUMP-GO 0101. When an execution of the job is completed, this condition is deleted. This prevents cyclic re-invoking of the job and ensures that the job is only invoked again if the rule is triggered again.

Prerequisite condition or date CTO-IEE362A-HANDLED 0101 must be deleted at time of IPL to make sure the rule is active after system startup (see SolveWare Initialization).
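The interaction between the rule, the cyclic job, and the two conditions described in this table can be modeled as follows. This Python sketch only simulates the sequence of condition changes; the ConditionStore class and the function names are hypothetical stand-ins, not Control-O or Control-M interfaces.

```python
HANDLED = ("CTO-IEE362A-HANDLED", "0101")  # inverse IN condition of the rule
DUMP_GO = ("CTO-SMFDUMP-GO", "0101")       # condition that releases the cyclic job


class ConditionStore:
    """Hypothetical in-memory stand-in for the prerequisite conditions."""

    def __init__(self):
        self._conditions = set()

    def add(self, cond):
        self._conditions.add(cond)

    def delete(self, cond):
        self._conditions.discard(cond)

    def exists(self, cond):
        return cond in self._conditions


def on_smf_message(store):
    """Rule side: fire only while the inverse IN condition is not set."""
    if store.exists(HANDLED):
        return False        # rule is temporarily deactivated
    store.add(HANDLED)      # deactivate the rule until the job finishes
    store.add(DUMP_GO)      # release the copy and clean job
    return True


def on_job_ended_ok(store):
    """Job side: stop the cyclic job from re-running and reactivate the rule."""
    store.delete(DUMP_GO)
    store.delete(HANDLED)


def on_ipl(store):
    """Initialization: make sure the rule is active after system startup."""
    store.delete(HANDLED)


store = ConditionStore()
print([on_smf_message(store) for _ in range(3)])  # [True, False, False]
on_job_ended_ok(store)
print(on_smf_message(store))                      # True
```

The second and third messages are ignored because they arrive before the job has finished, which is the multiple-triggering case the inverse IN condition is meant to prevent.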

SMF Alerts

This solution handles various SMF error situations that are not resolved automatically. This solution ensures that the system administrator is notified immediately by Control-O when SMF problems occur, so that the system administrator can locate the source of the error and solve the problem.

Rules

The SMF Alerts solution includes the following rules:

  • SMF Dataset Resides on a Non-Direct Access Volume

  • I/O Error While Writing on SMF Dataset

  • SMF Dataset or SYS1.PARMLIB Cannot Be Opened

Rules Structure

The following tables describe the structures of the SMF Alerts solution rules.

Table 64 SMF Dataset Resides on a Non-Direct Access Volume Rule Structure

Item

Description

Title

SMF Dataset Resides on a Non-Direct Access Volume

Name

IEE363I

Table

SMF

Message

IEE363I SMF ser NOT DIRECT ACCESS

Message Description

SMF dataset SYS1.MANx resides on a non-direct access device and therefore cannot be used by SMF.

Basic Scheduling Parameters

Always schedule this rule.

Runtime Scheduling Parameters

IN !CTO-IEE363I-THRESH STAT

Global Variables

None.

Rule Logic

The rule is triggered when the SYS1.MANx dataset resides on a non-direct access device. Control-O immediately brings the error to the attention of the system administrator, who can then determine the cause of the problem and proceed to solve it.

Rule Actions

Sends a message notifying user U-SYSADMIN of the problem.

Activating the Rule

Once ordered, the rule remains active until message IEE363I exceeds a predefined threshold. (For more information regarding threshold handling, see SMF Rule Thresholds.)

Recommended Mode or Category

This rule must be activated in PROD mode.

The SolveWare category for this rule is 1—little or no customization is required before implementation.

Table 65 I/O Error While Writing on SMF Dataset Rule Structure

Item

Description

Title

I/O Error While Writing on SMF Dataset

Name

IEE364I

Table

SMF

Message

IEE364I SMF {LOGICAL | PHYSICAL} I/O ERROR ON SYS1.MANx {FEEDBACK CODE = fc | error text}

Message Description

An I/O error occurred while writing on the SYS1.MANx dataset indicated in the message.

Basic Scheduling Parameters

Always schedule this rule.

Runtime Scheduling Parameters

No special considerations

Global Variables

None.

Rule Logic

The rule is triggered when an error occurs while writing on SYS1.MANx. Control-O immediately brings the error to the attention of the system administrator, who can then determine the cause of the problem and proceed to solve it.

Rule Actions

Sends a message notifying user U-SYSADMIN of the problem.

Activating the Rule

Once ordered, the rule remains active until message IEE364I exceeds a predefined threshold. (For more information regarding threshold handling, see SMF Rule Thresholds.)

Recommended Mode or Category

This rule must be activated in PROD mode.

The SolveWare category for this rule is 1—little or no customization is required before implementation.

Table 66 SMF Dataset or SYS1.PARMLIB Cannot Be Opened Rule Structure

Item

Description

Title

SMF Dataset or SYS1.PARMLIB Cannot Be Opened

Name

IEE365I

Table

SMF

Message

IEE365I SMF SYS1.{MANx | PARMLIB} NOT OPENED

Message Description

SMF failed to open SMF dataset SYS1.MANx or SYS1.PARMLIB. No records are written to the SMF dataset.

Basic Scheduling Parameters

Always schedule this rule.

Runtime Scheduling Parameters

No special considerations

Global Variables

None.

Rule Logic

The rule is triggered when SYS1.MANx or SYS1.PARMLIB cannot be opened. Control-O immediately brings the error to the attention of the system administrator, who can then determine the cause of the problem and proceed to solve it.

Rule Actions

Sends a message notifying user U-SYSADMIN of the problem.

Activating the Rule

Once ordered, the rule remains active until message IEE365I exceeds a predefined threshold. (For more information regarding threshold handling, see SMF Rule Thresholds.)

Recommended Mode or Category

This rule must be activated in PROD mode.

The SolveWare category for this rule is 1—little or no customization is required before implementation.

SMF Rule Thresholds

This solution handles message overload – situations in which a message appears on the console more times than is acceptable. If an SMF message appears too often on the system console, threshold rules deactivate the relevant SMF rule until the source of the problem is found and the problem corrected.

For more information regarding threshold handling, see SolveWare Implementation Considerations.

Rules

The SMF Rule Thresholds solution includes the following rules:

  • Handling Exceeded SMF Thresholds

  • Resetting SMF Rule Threshold Conditions

Rules Structure

The following tables describe the structures of the SMF Rule Thresholds solution rules.

Table 67 Handling Exceeded SMF Thresholds Rule Structure

Item

Description

Title

Handling Exceeded SMF Thresholds

Name

IEE361I

Table

SMF

Message

Any of the following messages:

  • IEE361I SMF DATA LOST - NO DATASETS AVAILABLE, DATA BEING BUFFERED TIME=hh.mm.ss

  • IEE362A SMF ENTER DUMP FOR SYS1.MANx ON volser

  • IEE363I SMF ser NOT DIRECT ACCESS

  • IEE364I SMF {LOGICAL | PHYSICAL} I/O ERROR ON SYS1.MANx {FEEDBACK CODE = fc | error text}

  • IEE365I SMF SYS1.{MANx | PARMLIB} NOT OPENED

  • IEE366I NO SMF DATASETS AVAILABLE - DATA BEING BUFFERED TIME=hh.mm.ss

Message Description

Descriptions for messages handled by this rule are found in the other rule descriptions belonging to SolveWare subject SMF.

Basic Scheduling Parameters

Always schedule this rule.

Runtime Scheduling Parameters

No special considerations

Global Variables

None.

Rule Logic

To avoid message overload situations, this rule deactivates SMF rules whose messages have exceeded a pre-determined number of appearances in a period of time.

These threshold values are defined for every message included in this rule in the threshold parameters APPEARED ### TIMES IN #### MINUTES.

To synchronize threshold handling correctly, this rule is assigned a higher PRIORITY value than the message rules that it monitors, and has a CONTINUE SEARCH value of Y (Yes).

Deactivation of an SMF rule is achieved by adding the appropriate (inverse) IN prerequisite condition for the rule.

To reactivate a deactivated rule, the threshold conditions must be deleted. This can be done either manually or automatically by Control-O (see the following section).

Threshold conditions must be specified in the IGNORE list of the Control-M CONTDAY procedure (see SolveWare Implementation Considerations).

Rule Actions

  • Notifies user U-SYSADMIN that the message that exceeded its threshold is no longer handled by Control-O.

  • Sets the appropriate IN condition to deactivate the rule. The format of this condition is CTO-msgid-THRESH, where msgid is the message ID of the specific message.
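The idea behind the APPEARED ### TIMES IN #### MINUTES parameters and the CTO-msgid-THRESH condition format can be sketched in Python. The sketch assumes a sliding time window and treats the threshold as reached on the Nth appearance; how Control-O actually counts appearances is not described here, so both points are assumptions. The class and function names, and the values 3 and 10, are hypothetical.

```python
from collections import deque


class AppearanceThreshold:
    """Counts message appearances inside a sliding window (assumed semantics)."""

    def __init__(self, times, minutes):
        self.times = times
        self.window_seconds = minutes * 60
        self._seen = deque()

    def record(self, timestamp):
        """Record one appearance (in seconds); return True when the threshold is reached."""
        self._seen.append(timestamp)
        # Discard appearances that have fallen out of the window.
        while timestamp - self._seen[0] >= self.window_seconds:
            self._seen.popleft()
        return len(self._seen) >= self.times


def threshold_condition(msgid):
    """Build the condition name in the CTO-msgid-THRESH format, with date STAT."""
    return ("CTO-" + msgid + "-THRESH", "STAT")


# Hypothetical setting: APPEARED 3 TIMES IN 10 MINUTES.
threshold = AppearanceThreshold(times=3, minutes=10)
print([threshold.record(t) for t in (0, 60, 700, 720, 730)])
# [False, False, False, False, True]
print(threshold_condition("IEE364I"))
# ('CTO-IEE364I-THRESH', 'STAT')
```

The first two appearances age out of the window before a third one arrives, so the threshold is reached only when three appearances fall within the same ten minutes.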

Activating the Rule

Once scheduled, the rule is triggered whenever one of the above messages exceeds its threshold.

Recommended Mode or Category

During the testing period, the rule must be activated in LOG mode. Once you are satisfied with the results of the rule, change the mode to PROD to avoid log messages for this rule.

The SolveWare category for this rule is 2—some customization is required before implementation.

Customization

Review each SMF message monitored by this rule and determine appropriate threshold values.

For each message included in this rule, adapt the APPEARED ### TIMES IN #### MINUTES values, which specify a number of appearances within a time period, to site requirements.

Table 68 Resetting SMF Rule Threshold Conditions Rule Structure

Item

Description

Title

Resetting SMF Rule Threshold Conditions

Name

RESSMF

Table

SMF

Event

RESSMF

Event Description

This Event rule deletes all threshold conditions for all SMF rules.

Basic Scheduling Parameters

Always schedule this rule.

Runtime Scheduling Parameters

No special considerations

Global Variables

None.

Rule Logic

Using the INTERVAL parameter, this rule periodically deletes the threshold conditions of all SMF rules in order to reactivate SMF rules that exceeded their thresholds. (For further information regarding the resetting of threshold conditions, see Customization in this table.)

Rule Actions

  • Deletes condition or date CTO-IEE361I-THRESH STAT

  • Deletes condition or date CTO-IEE362A-THRESH STAT

  • Deletes condition or date CTO-IEE363I-THRESH STAT

  • Deletes condition or date CTO-IEE364I-THRESH STAT

  • Deletes condition or date CTO-IEE365I-THRESH STAT

  • Deletes condition or date CTO-IEE366I-THRESH STAT
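The six delete actions follow one naming pattern, which the following Python sketch makes explicit. It models the conditions as an in-memory set; the function name reset_thresholds is hypothetical and the sketch does not represent Control-O syntax.

```python
SMF_MESSAGE_IDS = ["IEE361I", "IEE362A", "IEE363I", "IEE364I", "IEE365I", "IEE366I"]


def reset_thresholds(conditions):
    """Delete every SMF threshold condition from the set; return the ones that were set."""
    targets = {("CTO-" + msgid + "-THRESH", "STAT") for msgid in SMF_MESSAGE_IDS}
    removed = conditions & targets
    conditions.difference_update(targets)
    return removed


# Hypothetical state: one threshold condition and one unrelated condition are set.
conditions = {("CTO-IEE363I-THRESH", "STAT"), ("CTO-IEE362A-HANDLED", "0101")}
print(reset_thresholds(conditions))  # {('CTO-IEE363I-THRESH', 'STAT')}
print(conditions)                    # {('CTO-IEE362A-HANDLED', '0101')}
```

Conditions that are not threshold conditions, such as CTO-IEE362A-HANDLED, are left untouched.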

Activating the Rule

Once scheduled, the rule is triggered periodically according to the INTERVAL parameter specification.

Recommended Mode or Category

During the testing period, the rule must be activated in LOG mode. Once you are satisfied with the results of the rule, change the mode to PROD to avoid log messages for this rule.

The SolveWare category for this rule is 2—some customization is required before implementation.

Customization

Adapt the INTERVAL parameter to site requirements:

If you use the INTERVAL parameter without the TIME FROM parameter, the threshold conditions are deleted when the rule is ordered.

Threshold conditions must also be deleted at time of IPL. For more information, see SolveWare Initialization.

Threshold conditions must be specified in the IGNORE list of the Control-M CONTDAY procedure. For details, see SolveWare Implementation Considerations.