Introduction:
IBM Power Systems is an essential platform for companies worldwide: 80% of Fortune 100 companies use it to manage different types of workloads, and in 2023 more than 7,300 companies were using IBM Power.
Let me explain the nomenclature to avoid confusion. IBM i is the operating system for the IBM i platform (Power Platform), and iSeries and Application System/400 (AS/400) are legacy servers. As with all operating systems, IBM i manages hardware and software and provides a user interface.
"At its core, the IBM i platform is designed to adapt to the ever-changing needs of both business and computing. Its defining characteristic, the “integration” represented by the “i” in IBM i, can help you gain more value from advanced technology with fewer resources and higher reliability." Source
"IBM Power® helps customers respond faster to business demands, protect data from core to cloud, and streamline insights and automation while maximizing reliability in a sustainable way. Power can modernize applications and infrastructure with a frictionless hybrid cloud experience to provide the agility companies need." Source
The IBM CIO organization uses many IBM Power systems to host highly critical applications. For this reason, we need excellent monitoring of those systems to be sure everything is working well and performing well.
IBM i Concepts:
I am not an IBM i specialist. However, I took some concepts from the IBM documentation and added the sources so you can go deeper.
For me, the concepts of subsystems, libraries, memory pools, and jobs were sufficient to design a solution to monitor IBM i. When we need more advanced monitoring, we will also need to learn more and support the teams on how to monitor the systems.
What is a subsystem? Concept Source
The subsystem is the workplace for jobs on your system. All user work is done by jobs running in the subsystem, and it is important to monitor this area for slow work performance. From IBM® Navigator for i, you can view jobs and job queues associated with the subsystems. You also have the same functionality with jobs and job queues from any other area that displays jobs and job queues.
A subsystem is a single, predefined operating environment through which the system coordinates the work flow and resource use. The system can contain several subsystems, all operating independently of each other. Subsystems manage resources. The runtime characteristics of a subsystem are defined in an object called a subsystem description. You can use subsystems to support users in a multilingual environment. You should create a separate subsystem for each set of users with differing needs.
What is a library? Concept Source
A library is an object used to group related objects and to find objects by name. Thus, a library is a directory to a group of objects. You can use libraries to perform the following tasks:
* Group certain objects for individual users.
* Group all objects used for an individual application.
* Ensure security.
* Simplify security by having automatic authorization list and public authority assignment for newly created objects.
* Simplify save/restore operations by grouping objects that are saved and restored at the same time into the same library.
* Use multiple libraries for testing.
* Use multiple production libraries.
What is a memory pool? Source
A memory pool is a logical division of main memory or storage that is reserved for processing a job or group of jobs. On your system, all main storage can be divided into logical allocations called memory pools. By default, the system manages the transfer of data and programs into memory pools.
The memory pool from which user jobs get their memory is always the same pool that limits their activity level. (The activity level of a memory pool is the number of threads that can be active at the same time in a memory pool.) Exceptions to this are system jobs (such as Scpf, Qsysarb, and Qlus) that get their memory from the Base pool but use the machine pool activity level. Additionally, subsystem monitors get their memory from the first subsystem description pool but use the machine pool activity level. This allows a subsystem monitor to always be able to run regardless of the activity level setting.
Job Processing
Jobs run specific commands on the IBM i operating system; jobs are how the operating system organizes and processes work. A job should include all the files, commands, and instructions needed to complete the activity.
The image below shows the two ways a job can be executed: interactively or in batch.
Instana for IBM i
With this goal in mind, Instana developed a monitoring sensor for IBM i that allows teams to provide observability for the IBM i operating system and the jobs running there. You can see all IBM i metrics here.
For IBM i, Instana offers remote monitoring, which means we cannot install the Instana agent on the IBM i operating system itself. Instead, we need to install the Instana agent on a compatible platform and configure the connection to the IBM i system in the Instana configuration. Instana IBM i Configuration.
If you don't need to customize anything for IBM i, you can simplify things by using this other YAML file, which is sufficient to start the sensor.
To avoid problems with the sensor, don't leave any fields empty.
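One way to catch empty fields before starting the agent is a small pre-check. What follows is a minimal Python sketch, not part of Instana: the key names (`host`, `user`, `password`, `poll_rate_seconds`) are hypothetical placeholders for whatever fields your sensor configuration actually contains, and the settings are assumed to have already been loaded into a dictionary.

```python
from typing import Any


def find_empty_fields(config: dict[str, Any], prefix: str = "") -> list[str]:
    """Return the dotted path of every field whose value is None or blank."""
    empty: list[str] = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections of the configuration.
            empty.extend(find_empty_fields(value, path))
        elif value is None or (isinstance(value, str) and not value.strip()):
            empty.append(path)
    return empty


# Hypothetical connection settings; the key names are illustrative only.
sensor_config = {
    "host": "ibmi.example.com",
    "user": "MONITOR",
    "password": "",
    "poll_rate_seconds": 30,
}

for field in find_empty_fields(sensor_config):
    print(f"Empty field in sensor configuration: {field}")
```

Running the check as part of a deployment script means a forgotten value is reported by name instead of surfacing later as a sensor problem.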
As you can see in the following diagram:
Make sure your user ID on IBM i has the QSECOFR authority; without it, we cannot collect the data.
As soon as you connect, you will see the IBM i system as an entity in the Infrastructure section. In Analytics you will also see the infrastructure components, in this case IBM i and DB2 on i, and on the host you will see the active sensors.
Some dashboards:
Alerting Configuration:
By default, Instana only shows the top 20 jobs based on CPU consumption; if you need to include jobs outside of the top 20, you should add the corresponding configuration to the YAML file.
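To make the effect of that setting concrete, here is a minimal Python sketch of the selection behaviour as I understand it: keep the top 20 jobs by CPU and append any explicitly requested jobs that fall outside that cut. This is not Instana's implementation; the `Job` class, the `select_jobs` function, and the job names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    subsystem: str
    cpu_percent: float


def select_jobs(jobs, always_include, top_n=20):
    """Keep the top_n jobs by CPU, plus any job named in always_include."""
    ranked = sorted(jobs, key=lambda j: j.cpu_percent, reverse=True)
    top = ranked[:top_n]
    # Jobs below the cut are kept only when they were explicitly requested.
    extra = [j for j in ranked[top_n:] if j.name in always_include]
    return top + extra


# 25 hypothetical jobs with decreasing CPU usage: JOB00 is the busiest.
jobs = [Job(f"JOB{i:02d}", "QBATCH", float(25 - i)) for i in range(25)]

selected = select_jobs(jobs, always_include={"JOB24"})
print(len(selected), selected[-1].name)
```

In this example JOB24 has the lowest CPU usage of all, so it would never appear in the default view, but it is still reported because it was requested by name.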
To monitor the job status for a subsystem, define the value as JobStatus/Subsystem, as shown in the following configuration.yaml. You can also use wildcards to check the status of a job on all subsystems:
In this example, an alert will be triggered if any job in the subsystem QZDASOINIT has the status 'RUN'; an alert will also be triggered if any job on any subsystem has the status DEQW or DLYW.
If you want to check all IBM i job statuses, you can see them here.
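The matching logic behind such JobStatus/Subsystem rules can be sketched in a few lines of Python. This is only an illustration of the rule format described above, not the sensor's code: I am assuming `*` is the wildcard character, and `matching_rules` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Rules in the JobStatus/Subsystem form used in the example above.
RULES = ["RUN/QZDASOINIT", "DEQW/*", "DLYW/*"]


def matching_rules(job_status, job_subsystem, rules):
    """Return every rule that a job with this status and subsystem would trigger."""
    hits = []
    for rule in rules:
        status, subsystem_pattern = rule.split("/", 1)
        # The status must match exactly; the subsystem part may be a wildcard.
        if job_status == status and fnmatchcase(job_subsystem, subsystem_pattern):
            hits.append(rule)
    return hits


print(matching_rules("RUN", "QZDASOINIT", RULES))  # matches the first rule
print(matching_rules("DEQW", "QBATCH", RULES))     # matches the wildcard rule
print(matching_rules("RUN", "QBATCH", RULES))      # matches nothing
```

A job in status RUN triggers an alert only when it belongs to QZDASOINIT, while DEQW and DLYW trigger an alert on any subsystem.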
To monitor the current status of the subsystems, you also need to modify the YAML file, as you can see in the following image. An alert will be triggered if a subsystem is not active.
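The check itself is simple to picture. Here is a minimal Python sketch, assuming the monitored subsystems are listed by name and that each one reports a status string; the `ACTIVE` and `INACTIVE` values and the `inactive_subsystems` helper are my own placeholders, not the sensor's actual output.

```python
def inactive_subsystems(expected, statuses):
    """Return the expected subsystems that are missing or not reported as ACTIVE."""
    return [
        name
        for name in expected
        if statuses.get(name, "MISSING").upper() != "ACTIVE"
    ]


# Subsystems we want to watch, and a hypothetical snapshot of their status.
expected = ["QBATCH", "QINTER", "QSPL"]
statuses = {"QBATCH": "ACTIVE", "QINTER": "INACTIVE"}

for name in inactive_subsystems(expected, statuses):
    print(f"ALERT: subsystem {name} is not active")
```

Treating a subsystem that is absent from the snapshot as not active is a deliberate choice here: a subsystem that never started is as much a problem as one that ended.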
Conclusion:
As I mentioned in the introduction, IBM Power Systems are used by many companies worldwide.
Because of that, we should provide monitoring for this robust platform so that companies can understand how their jobs are being processed, how the systems are performing, and how much capacity is in use, and can be alerted when needed.
There are only a few options on the market for monitoring IBM i systems, and I used IBM Instana as an example of a tool that can satisfy those monitoring needs.
Any suggestions on this topic? Please let me know.
Tiago Dias Generoso is a Distinguished IT Architect | Senior SRE | Master Inventor based in Pocos de Caldas, Brazil. The above article is personal and does not necessarily represent the employer’s positions, strategies or opinions.