Performance: Getting the Most Out of WebSphere MQ
What is Performance
WebSphere MQ applications
Factors that affect WebSphere MQ performance - an overview
Performance Factors and Techniques
Getting the most out of WebSphere MQ first requires you to define performance. This should be followed up with a look at several factors that affect WebSphere MQ performance, and techniques that can be used to improve performance.
What is Performance
The concept of performance has different aspects. When addressing the issue of performance, your first consideration should be to identify the aspect you plan to address.
People associate performance with:
- The measurement of response time. Thus, better performance is defined as the completion of a task in less time (quicker response).
- Moving the work through the system with less impact on the system and other applications.
- Providing the most throughput at peak times.
- Availability. The application has to be available when required.
Based on your objectives, you may address performance similarly, but the trade-offs you choose may vary.
WebSphere MQ applications
Different types of WebSphere MQ applications have different performance characteristics. Several types are provided below.
The asynchronous send application sends messages based on some external activity. Examples might include a stock ticker or an application that reports scores for a sports event.
This application is primarily concerned with throughput, in that it needs to keep up with the rate at which events occur. Whether the message takes seconds or days to reach its destination, once it is sent, it does not affect the application.
Synchronous send and receive
Another common type of WebSphere MQ application is the synchronous application. Technically, this application is not synchronous, but rather it is an asynchronous one that expects a timely response to the sent message. For this application, the key concern is the response time for the reply message(s). If the responding application is remote (on a network), this time includes WebSphere MQ processing on multiple hosts, the processing by the remote application and the network transmission time for both messages.
Given the design of WebSphere MQ, the response may not arrive in a timely manner, and the application design must deal with that possibility. For example, if the application waits indefinitely for the message to arrive, it will consume system resources and could affect other applications.
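One way to avoid an indefinite wait is to bound each wait and let the application decide what to do when the time runs out. The following is a minimal sketch, not MQI code: Python's standard `queue.Queue` stands in for the reply queue, and the function name `wait_for_reply` and the timeout value are assumptions made for illustration. In the MQI, the comparable approach is a get with the MQGMO_WAIT option and a finite wait interval.

```python
import queue

def wait_for_reply(reply_queue, timeout_seconds):
    """Wait a bounded time for a reply; return None if none arrives.

    reply_queue is an in-process stand-in for a WebSphere MQ reply queue.
    """
    try:
        return reply_queue.get(timeout=timeout_seconds)
    except queue.Empty:
        # No reply in time: the caller can retry, report an error, or give up.
        return None

replies = queue.Queue()
late = wait_for_reply(replies, 0.05)      # nothing queued yet, so this times out
replies.put(b"reply-1")
on_time = wait_for_reply(replies, 0.05)   # the queued reply is returned at once
```

The caller regains control after each timeout, so it can release resources or report the delay rather than holding them for an unknown period.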
The server application is complementary to the previous two examples. It processes WebSphere MQ messages; performs local processing, such as accessing a database; and may send a response. Multiple servers may be used to share a portion of the workload.
WebSphere MQ client
The WebSphere MQ client application may be an implementation of any of the previous applications, but introduces a key response-driven component. As there is no local queue manager, all requests must pass over the network to the associated server queue manager. The number of requests, the speed at which the request and response can be transmitted, and the additional processing time on the server are all components of the application's performance.
Factors that affect WebSphere MQ performance - an overview
Many different factors affect performance. Given the diversity of WebSphere MQ environments, a given recommendation may or may not be applicable. Each typically involves a trade-off: providing better performance to the WebSphere MQ application may, for example, degrade another application. It is important that the cost and benefit are understood before making changes.
The usual suspects
The key elements of response time have not changed in more than 20 years.
- The processor time to service the application plus the overhead of the operating environment (in this case, WebSphere MQ and the operating system)
- The time spent waiting for I/O operations (all computers wait at the same speed)
- The time spent transmitting requests over a network
- Any contention for resources required by the application
Each of these key components will be addressed in the following sections.
Adding system resources
The most common approach to improving performance is simply to add resources. For example, you could move the queue manager to a larger server or add memory. With today's price-to-performance ratios, this can provide significant improvements at little cost.
Performance Factors and Techniques
The main component of CPU consumption for a WebSphere MQ application is the type and number of MQI calls issued.
Calls in order of CPU consumption:
- MQCONN - connects to the queue manager, creates required task structures and control blocks
- MQOPEN - opens a specific queue for processing, may lock required resources and acquires control blocks
- MQCLOSE - closes the queue, commits resources, frees locks and releases control blocks
- MQPUT - puts a message to a queue (recovery processing may be required)
- MQGET - gets a message from a queue (recovery processing may be required)
On S/390, most CPU is charged to the calling application. On distributed systems, an agent process is used to communicate with WebSphere MQ, and this process will consume most WebSphere MQ-related CPU.
Avoid unnecessary calls
The best and most obvious way to avoid CPU consumption is to avoid unnecessary MQI calls. For example, consider the server application discussed previously. The application could be designed to trigger the server's start when a message arrives, connect to the queue manager, open the queue, retrieve the message and process the response (opening a second queue), close all queues and disconnect from the queue manager. The process would repeat for the next message, and so on. This may be a good solution for a low arrival rate. However, for higher message arrival rates, there are two alternatives.
First, rather than closing all queues and disconnecting [1], the application could try an additional get with wait [2] from the queue. If another message is already available, it could process this message and avoid additional connect and open calls [3]. This process could be repeated until no unprocessed messages remain, and only then would the server terminate. Second, if the message arrival rate is high enough, rather than using triggering, the application could be permanently active, simply looping on a get with wait call. Note that if the arrival rate is too low, either alternative could result in unnecessary processing.
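The difference between the two designs can be illustrated by simply counting calls. The sketch below is a counting model only and issues no real MQI calls. It assumes one request queue and one reply queue, one reply per request, and one final get that times out when the queue is empty. The function names are hypothetical.

```python
from collections import Counter

def calls_connect_per_message(message_count):
    """Server that connects, opens, gets, puts, closes and disconnects per message."""
    calls = Counter()
    for _ in range(message_count):
        calls["MQCONN"] += 1
        calls["MQOPEN"] += 2      # request queue and reply queue
        calls["MQGET"] += 1
        calls["MQPUT"] += 1
        calls["MQCLOSE"] += 2
        calls["MQDISC"] += 1
    return calls

def calls_drain_loop(message_count):
    """Server that connects once and loops on get with wait until the queue is empty."""
    calls = Counter(MQCONN=1, MQOPEN=2, MQCLOSE=2, MQDISC=1)
    calls["MQGET"] = message_count + 1   # the last get finds no message
    calls["MQPUT"] = message_count
    return calls

per_message = calls_connect_per_message(100)
drained = calls_drain_loop(100)
print(sum(per_message.values()), sum(drained.values()))
```

In this model the saving comes entirely from the connect, open, close and disconnect calls, which the list above places among the most expensive.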
Reduce message size and/or compress messages
Message size is a key component in message processing. While application developers can be coerced into reducing message size, there is no guarantee that they will do so. Traditionally, software solutions to compress messages have had a greater success rate than those that relied on application methodology. As seen in Figure 1, which demonstrates the use of compression software, message size can have a significant impact on CPU time. This is primarily due to data movement within the queue manager. Data must be moved out of the application and into WebSphere MQ buffers. It must be logged if persistent, and may have to be written to and read from physical DASD.
Figure 1—CPU Consumption
Directly related to the reduced CPU, but also due to I/O and network transmission time savings achieved, the elapsed time of the compressed data is significantly lower than that of the native messages.
To achieve these savings, however, messages must be compressed before they are put on the queue.
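As a minimal sketch of compressing in the application before the put, the example below uses Python's standard `zlib` module. It is not one of the compression products discussed here, and the function names and the sample record are made up for illustration. The receiving application must apply the matching decompression after its get.

```python
import zlib

def compress_for_put(payload: bytes) -> bytes:
    """Compress the application data before it is handed to the queue manager."""
    return zlib.compress(payload, 6)

def decompress_after_get(body: bytes) -> bytes:
    """Restore the original application data on the receiving side."""
    return zlib.decompress(body)

# Repetitive sample data (invented) compresses well; real savings depend on content.
record = b"ACCOUNT=0001;STATUS=OPEN;BRANCH=HOUSTON;" * 50
body = compress_for_put(record)
restored = decompress_after_get(body)
print(len(record), len(body))
```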
Reduced number of messages
Is one big message better than several small ones? Opinions vary.
Larger messages are subject to additional processing overhead, whereas each small message incurs a base amount of processing. WebSphere MQ now supports messages up to 100MB, so it is possible to logically join multiple records (taking care not to go overboard). Define messages that make sense from an application point of view, and don't overanalyze message design. If the number of messages is low, the difference in processing for either method will be small. If a large number of messages are being written, combining the messages may result in a significant reduction in processing overhead.
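If records are combined, the receiver needs a way to split them again. One simple convention, shown here as an assumption rather than anything WebSphere MQ prescribes, is to prefix each record with its length. The function names are hypothetical.

```python
import struct

def pack_records(records):
    """Join several records into one message body, each with a 4-byte length prefix."""
    return b"".join(struct.pack(">I", len(r)) + r for r in records)

def unpack_records(body):
    """Split a body produced by pack_records back into the original records."""
    records, offset = [], 0
    while offset < len(body):
        (length,) = struct.unpack_from(">I", body, offset)
        offset += 4
        records.append(body[offset:offset + length])
        offset += length
    return records

original = [b"score:3-1", b"score:3-2", b""]
combined = pack_records(original)
recovered = unpack_records(combined)
```

One put and one get then carry all three records, at the cost of a slightly larger message and some packing logic in both applications.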
Use intermediate commits for large numbers of messages
There are several reasons to commit messages periodically. First, the processing required does not grow linearly with the number of messages; the impact of the final commit increases as the number of messages in the unit of work increases. Second, periodic commits spread the total processing time over a longer period, with less impact on other applications. Third, messages are not visible to other applications until they have been committed, so without intermediate commits the messages appear all at once to the server application, which may be overwhelmed. Of course, each commit point must coincide with a completed unit of work.
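The pattern can be sketched as a loop that commits after a fixed number of puts. The `put` and `commit` callables below are hypothetical stand-ins (in the MQI they would correspond to a put under syncpoint and an MQCMIT call), and the interval of 100 is an arbitrary illustration, not a recommendation.

```python
def put_with_periodic_commit(messages, commit_every, put, commit):
    """Put all messages, committing after every commit_every puts."""
    pending = 0
    for message in messages:
        put(message)
        pending += 1
        if pending == commit_every:
            commit()
            pending = 0
    if pending:
        commit()   # commit the final, partial unit of work

log = []
put_with_periodic_commit(
    range(250), 100,
    put=lambda message: log.append("put"),
    commit=lambda: log.append("commit"),
)
commit_count = log.count("commit")
```

The interval should be chosen so that each commit still falls on an application-level boundary; a fixed count is only appropriate when every message is independent.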
Is it better to have a single queue shared by multiple application instances or individual queues? This is an area of debate, but it typically does not make sense to share queues across different applications. However, it may make sense to share queues within an application domain. For example, the Command MQ for S/390 product from BMC Software supports multiple users connected to a single queue manager. It could have been designed with a unique queue per user, but instead it implements a single queue shared by all users based on correlation ID (CorrelId), resulting in fewer queues to manage.
WebSphere MQ on distributed platforms uses an indexed technique to make this efficient. On S/390 with V1.2 and later, the queue can be defined as indexed. This builds an in-storage index. The index can be based on message ID (MsgId) or CorrelId, but not both. This is not typically a problem, as applications use one or the other. However, if the queue is a priority-based queue, additional processing is required for each message with the same index.
If you have applications displaying this behavior, it is important that you define the associated queue as indexed. Consider a queue with 1,000 messages for application A, followed by 50 messages for application B. To read the 50 messages, application B would actually read the 1,000 messages for application A before hitting any of its own messages. Depending on application design, this could result in a total reference of 50,050 messages to process all 50 messages.
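The 50,050 figure can be reproduced with a small model. The sketch assumes an unindexed queue in which every get starts at the head of the queue and examines each message in turn until it finds one with the wanted identifier. The function name and the "A"/"B" labels are illustrative only.

```python
def messages_referenced(queue, correl_id):
    """Count messages examined when every get scans from the head of the queue."""
    queue = list(queue)
    referenced = 0
    while correl_id in queue:
        for position, owner in enumerate(queue):
            referenced += 1
            if owner == correl_id:
                del queue[position]   # a destructive get removes the match
                break
    return referenced

queue = ["A"] * 1000 + ["B"] * 50
total = messages_referenced(queue, "B")
print(total)
```

Each of the 50 gets passes over the 1,000 messages for application A and then reads one of its own, giving 50 x 1,001 references.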
Defining a queue as indexed adds a minimal amount of additional processing during put processing, but can be noticeable during queue manager restart for large queues.
Additionally, if all access to the queue is by MsgId or CorrelId and message expiry is used, it is possible to fill a queue with expired messages, because WebSphere MQ does not remove an expired message until a get with update is performed against it.
For distributed queue managers, another CPU consumer is process switching. Process switching prevents the corruption of WebSphere MQ due to application program errors.
The queue manager is isolated from the application program through the use of an agent process that executes within the queue manager domain. For each WebSphere MQ call, an IPC is used to switch from the application to the agent. When defined as trusted, the application, the agent and the queue manager are within a common domain. This eliminates the overhead, but leaves the queue manager open to corruption by the application. Thus, it is intended only for truly trusted applications.
Trusted applications are primarily used for the WebSphere MQ channel agents. While from a WebSphere MQ perspective these are application programs, from a customer point of view they are part of WebSphere MQ. These should be configured as trusted, reducing overhead for the channel processing. Because channel exits then execute within the trusted environment, they should be evaluated carefully and must conform to trusted application restrictions. Note that an application must be designed to use trusted binding. For example, it must use an MQCONNX call instead of the standard MQCONN call.
I/O can be a major component of a WebSphere MQ application, with logging as the primary factor. In order to provide guaranteed once-and-once-only delivery, WebSphere MQ must log every persistent message it processes. Additionally, WebSphere MQ must ensure that the log has been committed prior to the work unit's completion.
Queue I/O is typically performed independent from application response time, but could affect device use. When processed within a resource manager, I/O to the queue is not performed unless buffer space is exhausted. Therefore, it is possible for a message to be sent to a queue and read by the processing application without ever being written to the physical queue storage.
Use nonpersistent messages when appropriate
Because logging is only performed for persistent messages, using nonpersistent messages eliminates logging activity. Nonpersistent messages are not guaranteed from a WebSphere MQ perspective. That is, they may never be delivered. Most notably, nonpersistent messages are not maintained across a restart of the queue manager. In some cases, nonpersistent messages make sense. For example, consider an application that sends the current temperature. If a single reporting instance is lost, correction will occur with the next temperature report. However, stock trade messages cannot be lost.
Reduce message size and/or compress messages
Reducing the message size reduces the amount of buffer space required to hold the message, both for message data and for logging.
Use intermediate commits for large numbers of messages
As the number of messages within a unit of work increases, WebSphere MQ cannot maintain all data in internal buffers and will have to spill this information to disk storage. This decreases overall efficiency.
Separate logging from data volumes
To maximize the efficiency of logging, the logging volumes should be separated from the data volumes. On small servers, this may not be practical. However, separation reduces the potential for contention between logging and writing of the queue data. It also eliminates a single point of failure for both the logs and the associated data.
Place logs on low usage volumes
Logs should be placed on low usage volumes with the highest-speed device available. Of course, all I/O should always be done using high-speed devices that have low utilization.
For distributed queue managers, logging comes in two flavors - circular and linear.
Circular is the default and is the easiest to manage, as the logs are simply reused in a circular fashion. However, in some situations recovery may be impossible while in this mode.
Linear logs guarantee recovery but require management to prevent filling up all available disk space. Eventually, linear logs that are no longer required will need to be archived and/or deleted. This should be automated to prevent unexpected outages.
The PATROL for WebSphere MQ product from BMC Software directly addresses this requirement. It optimizes the disk space occupied by linear logs and monitors the queue managers for log-related events that affect system performance, alerting the management console of the event and taking automatic corrective action. Automated actions include pre-allocating additional log space and compressing, archiving or deleting system logs as necessary.
Note that logging parameters are set during queue manager definition. Some attributes can be changed by manually editing the log stanza. This includes setting the directory for the log file location [4]. However, re-creating the queue manager is the only way to change the logging type.
Distribute active and archive logs across many volumes
WebSphere MQ uses two types of logs on S/390 - active logs and archive logs.
WebSphere MQ directly uses active logs to log WebSphere MQ changes. When these logs become full, they are copied to archive logs. The archive logs can be on tape or on disk. The archive logs are typically created as disk GDG data sets. As in circular logging, multiple active logs are required so that as one fills, processing can continue with the next log. The minimum requirement is three active logs.
Depending on the size of these datasets and the duration of the work units, both the active and archive logs may be required for recovery. This should be avoided. WebSphere MQ supports dual logging (two sets of active logs). No two logs should be defined on the same volume. Thus, for dual logging with three active datasets, this would require six volumes. If archival is to disk, these should be on separate devices.
Archive log before busy periods (or during low activity ones)
With WebSphere MQ on S/390, archival requires CPU and I/O to complete. Rather than have this activity be random in nature, schedule archival outside of busy periods. Your system management tool, such as PATROL for WebSphere MQ, should provide a means to do this.
S/390 logger statistics
The logger on WebSphere MQ on S/390 provides many statistics about its operation. PATROL for WebSphere MQ collects and maintains these numbers for historical analysis. Some of the primary statistics are addressed below.
- BackOut efficiency - ratio of reads from archive to total reads. Smaller values are better. Larger numbers indicate that long units of work are exceeding active log capacity.
- BackOut work - ratio of total reads to total writes. Indicates how much work has been backed out.
- Archive log - ratio of total reads delayed by maximum allocated logs (MAXALLC) to total reads; should be less than 1%
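The three ratios above are simple divisions over raw counters. The sketch below shows the arithmetic only; the parameter names are hypothetical and do not correspond to actual statistics field names, and the sample counts are invented.

```python
def logger_ratios(reads_total, reads_from_archive, writes_total, reads_delayed_maxallc):
    """Compute the three logger ratios from raw counts (all names are illustrative)."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "backout_efficiency": ratio(reads_from_archive, reads_total),
        "backout_work": ratio(reads_total, writes_total),
        "archive_log": ratio(reads_delayed_maxallc, reads_total),
    }

stats = logger_ratios(
    reads_total=2000,
    reads_from_archive=50,
    writes_total=100000,
    reads_delayed_maxallc=10,
)
archive_log_ok = stats["archive_log"] < 0.01   # the "less than 1%" guideline
```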
Buffer pools and pagesets
On S/390, WebSphere MQ provides additional buffer and message storage control. This is done through two main structures - buffer pools and pagesets.
The relationship between a queue and the buffer pools and pagesets is indirect. The queue references an associated storage class. The storage class is associated with a pageset. The pageset defines the associated buffer, and is also associated with a VSAM dataset for storage of the messages. See Figure 2 for a depiction of this relationship.
Figure 2—Component relationships
In Figure 2 we see the relationships between the various components. Shown are four queues that are mapped to two unique storage classes. In this case, the storage classes are mapped to two unique pagesets that share the same buffer pool. Thus, messages written to any of the four queues are stored in the same buffer pool (and contend for the same buffers) but are physically stored, as needed, on two datasets.
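The chain of indirection can be modeled with three lookups. This is an illustrative sketch: the queue, storage class, pageset and buffer pool identifiers are hypothetical, and the even split of queues between the two storage classes is an assumption, since the text gives only the counts.

```python
# Queue -> storage class -> pageset -> buffer pool (all identifiers are made up).
STORAGE_CLASS_OF_QUEUE = {"Q1": "SC1", "Q2": "SC1", "Q3": "SC2", "Q4": "SC2"}
PAGESET_OF_STORAGE_CLASS = {"SC1": 1, "SC2": 2}
BUFFER_POOL_OF_PAGESET = {1: 1, 2: 1}

def resolve(queue_name):
    """Follow the indirect path from a queue to its pageset and buffer pool."""
    storage_class = STORAGE_CLASS_OF_QUEUE[queue_name]
    pageset = PAGESET_OF_STORAGE_CLASS[storage_class]
    buffer_pool = BUFFER_POOL_OF_PAGESET[pageset]
    return storage_class, pageset, buffer_pool

pagesets = {resolve(q)[1] for q in STORAGE_CLASS_OF_QUEUE}
buffer_pools = {resolve(q)[2] for q in STORAGE_CLASS_OF_QUEUE}
```

With this layout the four queues resolve to two pagesets but a single buffer pool, which is why they contend for the same buffers.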
Applications should not use buffer pool zero or pageset zero
To avoid contention with WebSphere MQ operation, application messages should not be placed in buffer pool zero or pageset zero. WebSphere MQ objects such as queue definitions are stored as messages in pageset zero. Sharing the buffer pools and pagesets with application data adds contention for buffer space. If the application exhausts available pageset space, it will cause the queue manager to suspend.
Understand message characteristics
Given the limited number of buffers, little flexibility exists with regard to message tuning. However, some characteristics can be addressed. For example, if messages are retrieved as soon as they are put, they will typically only exist in buffers and do not need to be backed by large high-speed pagesets. However, messages that are processed only at periodic intervals do not benefit from buffering. Thus, it is better to separate these two kinds of messages into two buffer pools. Message size could also be a factor in choosing which buffer pool and/or pageset to use. Use storage classes to define queue characteristics. By default, WebSphere MQ defines three storage classes: system, default and remote.
Use multiple pagesets to allow overlapped I/O
An advantage of using multiple pagesets is that WebSphere MQ will overlap I/O processing.
Use DASD fast write
Take advantage of the fastest DASD available.
WebSphere MQ Buffer Pool Manager
The WebSphere MQ Buffer Pool Manager on S/390 provides many operation statistics. PATROL for WebSphere MQ collects and maintains these numbers for historical analysis. Some of the primary statistics are described below.
- Total buffers - the total number of buffers in the buffer pool
- Lowest # bufs - the lowest number of buffers available (should typically be about 15% of the total buffers; less than 5% requires investigation)
- Page get reqs/new get reqs - count of buffer pool get requests for an existing page or a new page
- No bufs - number of times that a get request was suspended because no buffer was available (anything more than zero is bad)
- Page read I/O - the number of requested pages that had to be read from DASD
- Page read - the ratio of page read I/O to existing page gets; the smaller the number, the more efficient the buffer pool
- Page find - the ratio of page accesses that did not find the page in the buffer pool to the total number of page gets; the lower the value, the more efficient the buffer pools
- Asynchw - the ratio of page updates to page writes. The higher the number, the more efficient the asynchronous write processing
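Some of these guidelines can be turned into a simple check. The sketch below is an interpretation of the thresholds stated above (about 15% typical, under 5% needing investigation, any "no bufs" count being bad). The function and parameter names are hypothetical and the sample figures are invented.

```python
def buffer_pool_findings(total_buffers, lowest_free, no_buffer_waits,
                         page_read_io, existing_page_gets):
    """Apply the guideline thresholds to one buffer pool's statistics."""
    findings = []
    free_pct = 100.0 * lowest_free / total_buffers
    if free_pct < 5:
        findings.append("lowest free buffers below 5%: investigate")
    elif free_pct < 15:
        findings.append("lowest free buffers below the typical 15%")
    if no_buffer_waits > 0:
        findings.append("get requests were suspended for lack of buffers")
    page_read_ratio = page_read_io / existing_page_gets if existing_page_gets else 0.0
    return free_pct, page_read_ratio, findings

free_pct, page_read_ratio, findings = buffer_pool_findings(
    total_buffers=1000, lowest_free=40, no_buffer_waits=0,
    page_read_io=20, existing_page_gets=2000,
)
```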
The WebSphere MQ queue manager on S/390 provides statistics on pagesets. PATROL for WebSphere MQ collects and maintains these numbers for historical analysis. Some of the primary statistics are addressed below.
- Data pages - the total amount of formatted space available for storing buffers
- Unused pages - the unused page space available
- Persistent pages - the number of pages allocated to holding persistent messages
- Nonpersistent pages - the number of pages allocated to holding nonpersistent messages
- Pages in use - pages currently in use
- Extended - the number of times the pageset has been dynamically extended (maximum of 123 extensions)
- Restart RBA - indicates the RBA for the oldest log required for restart
WebSphere MQ tuning notes
One aspect of WebSphere MQ is that changes to the buffer pool and pageset definitions require the queue manager to be stopped and restarted. This is not always possible in an operational environment. BMC Software provides OPERTUNE® for MQ, a product that can dynamically tune the system. This can be done automatically to address different characteristics of WebSphere MQ operation. For example, additional buffer space can be allocated for key jobs that only run periodically, but then the buffers return to normal.
WebSphere MQ uses a simple algorithm when managing buffers. BMC Software provides EXTENDED BUFFER MANAGER for MQ (XBM™ for MQ), a product that improves the I/O processing by improving buffer management. XBM uses several techniques to improve buffer efficiency, including:
- Using hyperspace buffers (avoiding paging I/O)
- Ranking cache usage by application
- Pre-fetching buffers if sequential processing is being done
- Allocating additional buffers on the fly if a specific pageset or log is experiencing heavy I/O; buffers are shared across MQ buffer pools, making sizes of these pools less critical
- Managing DB2® and VSAM buffers as well
It is important to note that messages continue to be written to physical DASD, thus avoiding integrity issues.
Benefits of buffer caching
Figure 3—Benefits of buffer caching
Figure 3 demonstrates the value of extended buffer management. With a 100-MB hyperspace cache, the elapsed time of two jobs (one that puts 400,000 messages to a queue and one that gets the messages) was reduced.
Because a key use of WebSphere MQ is tying together different platforms, WebSphere MQ is very dependent on the network. Network speed, network traffic and message volume are all key components. The remaining two elements are WebSphere MQ specific and will be covered in the following sections.
Increase network speed
If network speed is an issue, one option is to make the network faster. With the cost of higher network speeds dropping, higher-speed networks are now an option.
The next step to decrease network transmission time is to reduce the size of the message. Several options exist in this area.
Vendor products to compress the message exist at several levels. Some execute as part of the application, some run as exits to WebSphere MQ, some run within the transport layer and some run external to the platform itself. Each has its own costs and benefits and needs to be justified for specific environments. IBM® supplies a free compression SupportPac that runs as a channel exit on the sender and receiver sides. This SupportPac is available only for a limited set of platforms.
However, a more appropriate place to compress messages is before they reach WebSphere MQ.
As seen previously, compression can provide significant benefit.
WebSphere MQ provides two main channel-tuning parameters. Batch size defines the maximum number of messages sent within a batch. Batching records together reduces the amount of channel processing required, such as for commit processing. If the associated transmission queue is empty, WebSphere MQ will end the batch. However, depending on message arrival rate, you may use "batchint" to indicate that WebSphere MQ should wait before ending the batch. If a message arrives within the specified time span, it is added to the batch, and so on, until the batch size is reached. If the message rate is low, this can cause an unacceptable transmission delay. Consider an application that sends a single message to a remote server and expects a single reply. If the batchint value was one second for both channels, the minimum turnaround for the reply is two seconds. It is important to note that none of the messages become visible on the remote queue manager until the entire batch is sent and committed. Thus, increasing batch size results in message spikes.
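The two-second turnaround and the spike effect can be reproduced with a simplified model of batching. The sketch assumes sorted arrival times in milliseconds, zero transmission and server processing time, and a batch that closes either when it is full or when the batch interval has elapsed since its first message. The function name is hypothetical, and this is a model of the behavior described above, not of the channel implementation.

```python
def visible_times_ms(arrivals_ms, batch_size, batch_interval_ms):
    """Return when each message becomes visible on the remote queue manager."""
    visible = []
    i = 0
    while i < len(arrivals_ms):
        deadline = arrivals_ms[i] + batch_interval_ms
        j = i
        # Add messages to the batch until it is full or the interval expires.
        while (j < len(arrivals_ms) and j - i < batch_size
               and arrivals_ms[j] <= deadline):
            j += 1
        full = (j - i == batch_size)
        commit_time = arrivals_ms[j - 1] if full else deadline
        # Every message in the batch becomes visible at the commit.
        visible.extend([commit_time] * (j - i))
        i = j
    return visible

# A single request and a single reply, each held for a one-second interval.
request_visible = visible_times_ms([0], batch_size=50, batch_interval_ms=1000)[0]
reply_visible = visible_times_ms([request_visible], batch_size=50, batch_interval_ms=1000)[0]

# Three messages that fill a batch all appear together.
spike = visible_times_ms([0, 100, 200], batch_size=3, batch_interval_ms=1000)
```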
Fast message support was introduced in V5.0 and V1.2 of WebSphere MQ, but was distributed as fixes for earlier versions. The basic benefit of using fast message is that nonpersistent messages bypass significant channel processing.
Without fast message support, nonpersistent messages are logged and included within the channel recovery processing. Since nonpersistent messages will be lost in the case of a queue manager restart, the overhead of channel recovery could be considered unnecessary. When active, fast message support transfers the nonpersistent messages over the channel outside the normal batch process. Thus, the message does not count against batch values. Since the message is put outside the batch, it is immediately visible on the receiving side, creating the possibility for nonpersistent messages to be processed out of order with persistent messages. Fast message support can be activated on a channel-by-channel basis and must be active on both sender and receiver channels.
Avoid dead letter queue scenarios
When a situation occurs that requires dead-letter processing, the normal channel flow is disrupted.
If you have an application that expects messages from a remote queue manager, be aware that if the application uses a temporary dynamic queue for the reply queue and the application terminates before the response is received, this will result in a dead-letter situation.
WebSphere MQ clustering eliminates many common causes of dead-letter failures, such as queue not found and queue disabled.
This final section will address some causes of contention for WebSphere MQ applications.
If message arrival > server processing
WebSphere MQ supports multiple servers per queue. Depending on your application design, you may be able to add multiple servers, as demand requires. Customers have implemented automated solutions that increase the number of servers for a queue as the queue depth grows. Likewise, the number of servers decreases when depth drops.
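A depth-driven scaling rule of this kind might look like the sketch below. The function name, the messages-per-server figure and the limits are all assumptions for illustration; the upper limit reflects the fact that a platform can only run so many servers.

```python
import math

def servers_needed(queue_depth, messages_per_server, min_servers, max_servers):
    """Choose a server count that grows and shrinks with queue depth."""
    wanted = math.ceil(queue_depth / messages_per_server)
    return max(min_servers, min(max_servers, wanted))

idle = servers_needed(0, messages_per_server=100, min_servers=1, max_servers=8)
busy = servers_needed(450, messages_per_server=100, min_servers=1, max_servers=8)
capped = servers_needed(5000, messages_per_server=100, min_servers=1, max_servers=8)
```

An automated monitor would evaluate such a rule periodically against the current queue depth and start or stop server instances to match.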
If server resource requirement > platform capacity
Of course, you may find that adding servers exceeds the system's processing capacity (for example, CPU, memory and semaphores). In that case, reducing the number of servers can result in greater throughput.
Additionally, with WebSphere MQ 5.1, you can use clustering to provide servers across multiple systems. This capability requires that your application be able to process across multiple servers. Other benefits of using clustering support, even without using multiple application servers, include easier failure recovery and reduced system management.
Figure 4 shows four queue managers connected in a cluster with an application putting messages. The arrows indicate the flow of the messages. Messages to queue "R" can be processed by three queue managers, while messages to queues "G" and "B" can each be processed by two queue managers. The workload exit supplied with WebSphere MQ uses a round-robin technique to send the messages to the available queue managers. The exception is queue "B", which is local to the application. The local queue will always be selected to receive messages, as this avoids any remote transmission.
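The selection rule described here (prefer a local instance, otherwise rotate among the hosting queue managers) can be sketched as follows. The class name and the assignment of queues to queue managers are hypothetical; the text gives only the number of hosts per queue, and this is not the supplied workload exit.

```python
from itertools import cycle

class ClusterRouter:
    """Pick a destination queue manager: local if possible, else round-robin."""

    def __init__(self, local_qmgr, hosts_by_queue):
        self.local_qmgr = local_qmgr
        self.hosts_by_queue = hosts_by_queue
        self._rotations = {q: cycle(hosts) for q, hosts in hosts_by_queue.items()}

    def choose(self, queue_name):
        if self.local_qmgr in self.hosts_by_queue[queue_name]:
            return self.local_qmgr          # avoids any remote transmission
        return next(self._rotations[queue_name])

# Assumed layout: the application is connected to QM1.
router = ClusterRouter("QM1", {
    "R": ["QM2", "QM3", "QM4"],
    "G": ["QM3", "QM4"],
    "B": ["QM1", "QM2"],
})
r_targets = [router.choose("R") for _ in range(4)]
b_targets = [router.choose("B") for _ in range(3)]
```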
The example below in Figure 5 shows the benefit of clustering in a failure situation. QM3 is no longer available, but at least one server still exists for all queues. When QM3 is restarted, it will be included to serve messages, automatically. Note that it is possible for messages to be "trapped" inside QM3 if they were transferred to QM3 but not processed before it failed. These will not be processed until the queue manager is restarted.
Figure 5—Clustering failover
For S/390 users, CICS (CTS) is a common application environment. Under CICS, WebSphere MQ is allocated eight TCBs (tasks). As WebSphere MQ applications execute, these TCBs are associated with a given application. If more than eight applications require WebSphere MQ services, the additional applications will be suspended until a TCB becomes available. Should this occur, adding an additional CICS region is the only option.
In closing, a key performance component is determining your performance requirements and understanding your application's behavior. For example, implementing fast messages has no benefit if all messages are persistent. Use available tools to measure system use and set a baseline to verify positive results. PATROL for WebSphere MQ covers the entire spectrum of WebSphere MQ management that you can use to improve your operation.
Helping you maintain advantage
BMC Software Professional Services helps your company maintain its competitive advantage through a comprehensive suite of services that includes service level management consulting, installation, implementation, configuration, and customization. Our professional services and education offerings are designed to ensure the ongoing availability of critical business applications, maximize product potential, reduce project risk, deliver IT value to your business, and improve your operations. For more information about BMC Software Professional Services, visit http://www.bmc.com/profserv.
About BMC Software
BMC Software, Inc. [NYSE:BMC], is a leader in enterprise management. The company focuses on Assuring Business Availability® for its customers by helping them proactively improve service, reduce costs, and increase value to their business. BMC Software solutions span enterprise systems, applications, and databases. Founded in 1980, BMC Software has offices worldwide and is a member of the S&P 500, with fiscal year 2002 revenues of approximately $1.3 billion. Visit www.bmc.com to learn more.
[1] If a normal shutdown of the queue manager is initiated, applications that are connected will prevent the queue manager from being terminated.
[2] Get with wait = MQGMO_WAIT. Get with wait should be done using small time values and must be used in conjunction with the MQGMO_FAIL_IF_QUIESCING option.
[3] Care must be taken to ensure that commit processing is performed for long-running applications.
[4] Care must be taken when relocating log files. The queue manager must be down and all existing logs and associated files must be moved to the new location.