Creating a Custom Metric to Check if a Linux Filesystem is Mounted

Some filesystems are critical to a business, such as those used in interfaces. This custom metric group will alert if a filesystem is not mounted.

Create the Bash Script to Check the Filesystem Status

First, we need to create a bash script that takes the filesystem as its input argument and checks whether it is mounted. Create the following script as /sbin/checkfilesystemmounted.sh (owner root, permissions 755). You may put this script somewhere else if you prefer, but be sure to refer to the correct location later on in this post.

#!/bin/bash
# Print the metric with value 1 if the filesystem given as $1 is mounted, otherwise 0
findmnt "$1" >/dev/null && echo \{type:integer, name:FileSystemMounted, value:1\} || echo \{type:integer, name:FileSystemMounted, value:0\}

The findmnt command returns the mount details if the filesystem is mounted. The filesystem is passed as a script argument in variable $1. If the filesystem is mounted, the script returns integer 1. If the filesystem is not mounted, the script returns integer 0. For example, to check your desired filesystem, execute it like this as root:

/sbin/checkfilesystemmounted.sh /the/filesystem/you/want/to/check

The output will be in JSON format. If the filesystem is mounted, the value will be 1, as follows:

{type:integer, name:FileSystemMounted, value:1}
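Called without an argument, findmnt on its own lists all mounted filesystems and exits successfully, so a missing argument could be reported as mounted. The following is a minimal sketch of a more defensive variant, not the script used in the rest of this post. The function names emit_metric and check_mounted, and the MOUNT_CHECK variable that lets you swap out findmnt for a dry run, are illustrative assumptions and not part of saphostctrl or Focused Run.

```shell
# Command used to test the mount. Hypothetical variable: override it for dry runs.
MOUNT_CHECK="${MOUNT_CHECK:-findmnt}"

# Print one metric line in the same format as the script above.
emit_metric() {
  echo "{type:integer, name:FileSystemMounted, value:$1}"
}

# Report 1 if the given path is mounted, 0 if it is not or if no path was given.
check_mounted() {
  path="${1:-}"
  if [ -n "$path" ] && "$MOUNT_CHECK" "$path" >/dev/null 2>&1; then
    emit_metric 1
  else
    emit_metric 0
  fi
}
```

To use it as a script, call check_mounted "$1" as the last line. Treating a missing argument as "not mounted" is a design choice: it raises an alert for a misconfigured variant instead of hiding it.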

The name:FileSystemMounted is the name of the value to be picked up by saphostctrl, as described next.

Create the Custom Operation for saphostctrl

To load these values into Focused Run, we create a custom operation for saphostctrl. Create the following custom operations conf file:

/usr/sap/hostctrl/exe/operations.d/checkfilesystemmounted.conf

This contains:

Command: /sbin/checkfilesystemmounted.sh $[FILESYSTEM]
Workdir: /home/sapadm
Description: Check if filesystem is mounted
ResultConverter: flat
Platform: Unix
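If you roll this out to many hosts, you may prefer to generate the conf file from a script. The sketch below writes the same five lines into a directory of your choice. The function name write_operation_conf is an illustrative assumption. Note the quoted 'EOF' delimiter: without the quotes, bash would treat $[FILESYSTEM] as an old-style arithmetic expansion and the placeholder would be lost.

```shell
# Write the operation definition into the directory given as $1.
write_operation_conf() {
  # The quoted delimiter keeps $[FILESYSTEM] literal.
  cat > "$1/checkfilesystemmounted.conf" <<'EOF'
Command: /sbin/checkfilesystemmounted.sh $[FILESYSTEM]
Workdir: /home/sapadm
Description: Check if filesystem is mounted
ResultConverter: flat
Platform: Unix
EOF
}
```

For example, write_operation_conf /usr/sap/hostctrl/exe/operations.d creates the file at the location named above.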

To test the custom operation, execute the following command:

/usr/sap/hostctrl/exe/saphostctrl -function ExecuteOperation -name checkfilesystemmounted FILESYSTEM=/the/filesystem/you/want/to/check

The result should be as per the following example:

Webmethod returned successfully
Operation ID: 0A02C69098121EDDA68C041B50FE858D

----- Response data ----
description=Check if filesystem is mounted
{type:integer, name:FileSystemMounted, value:1}
exitcode=0
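When you test several hosts or filesystems, reading each response by eye gets tedious. As a rough sketch, the function below reads such a response on standard input and succeeds only if it contains exitcode=0 and a FileSystemMounted value of 1. The function name response_reports_mounted is an illustrative assumption, and the parsing relies only on the response layout shown above.

```shell
# Read saphostctrl ExecuteOperation output on stdin.
# Exit status 0 only if exitcode=0 and the FileSystemMounted value is 1.
response_reports_mounted() {
  awk '
    # Remember the exit code reported by the operation.
    /^exitcode=/ { sub(/^exitcode=/, ""); code = $0 }
    # Pull the digits that follow "value:" out of the metric line.
    /name:FileSystemMounted/ {
      if (match($0, /value:[0-9]+/)) value = substr($0, RSTART + 6, RLENGTH - 6)
    }
    END { exit !(code == "0" && value == "1") }
  '
}
```

You can pipe the saphostctrl test command shown earlier into response_reports_mounted and check its exit status, for example with && echo OK.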

Create the Custom Alert in SAP Focused Run

In Focused Run, we create an alert in a Linux host monitoring template. For example, the alert name is “Interface Filesystem not Mounted”. The Alert should be in Category “Exceptions” and the Severity is up to you. In this case it is 9.

Create the Custom Metric Group in SAP Focused Run

Next, we create the custom Metric Group. A Metric Group allows variants to be created, and each variant corresponds to a filesystem you wish to monitor.

Overview Tab:
  • Name: “Interface Filesystem not Mounted”
  • Category: Exceptions
  • Class: Metric Group
  • Data Type: Integer
  • Technical Name: INTERFACE_FILESYSTEM_NOT_MOUNTED
Data Collection Tab:
  • Data Collector Type: Diagnostic Agent (push)
  • Data Collector Name: OS: ExecuteOperation
  • Collection Interval: 5 Minutes (depending on the criticality)
  • CUSTOM_OPERATION_NAME: checkfilesystemmounted – This corresponds to the custom operation for saphostctrl created earlier
  • METRIC_NAME: FileSystemMounted – This corresponds to the name of the metric in the JSON output by the bash script
  • RETURNFORMAT: JSON – This is the output format of the bash script
Usage Tab:
Threshold Tab:

Because the script returns the numeric value 0 when the filesystem is not mounted, set the threshold to alert when the value is 0.

Assignment Tab:

Assign to the custom alert created earlier.

Add Variants

The variable passed to the saphostctrl operation is “FILESYSTEM”. We can add the rest of the filesystems as individual variants. The format for the operation parameters is as follows:

FILESYSTEM:/the/filesystem/you/want/to/check

You can enter as many filesystems as you like as separate variants.
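If you have a long list of filesystems, typing the parameter strings by hand is error-prone. The small sketch below prints one operation-parameter line per path, ready to paste into the variants. The function name variant_parameters and the paths in the usage line are hypothetical.

```shell
# Print one operation-parameter line per filesystem path given as an argument.
variant_parameters() {
  for fs in "$@"; do
    printf 'FILESYSTEM:%s\n' "$fs"
  done
}
```

For example, variant_parameters /interfaces/inbound /interfaces/outbound prints two lines, one for each variant.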

Activate Alert

Go to the “Metrics, Events, Alerts Hierarchy” tab, and activate System Monitoring.

Testing the Metric

In a non-production environment, unmount a monitored filesystem. With the 5-minute collection interval used above, an alert should be raised at most 5 minutes later.

Monitoring GTS system

This blog focuses on monitoring GTS systems.

Monitoring productive GTS systems

GTS systems are not in frequent use. When they are in use, they play a vital role in import and export business scenarios where goods cross borders.

Once a GTS system is installed, little to no maintenance is usually performed on it and few software changes are made. Basis teams also tend not to look at it very often, since it normally runs stably.

If GTS is unavailable, the ECC scenarios linked to GTS might fail, which can cause severe business disruptions.

For this reason it is important to set up monitoring in FRUN for your GTS system and to configure mail alerts in case of issues. Issues will not occur often, but when they do you can act fast. This also saves the basis team a lot of time otherwise spent checking the GTS system logs (in most cases, the checks find nothing).

When monitoring a productive system, you will need to fine-tune the monitoring templates for:

  • ABAP 7.10 and higher Application template, for the ABAP application
  • ABAP 7.10 and higher Technical instance template, for the ABAP application servers
  • System host template
  • Database template

ABAP application template

Make sure you cover in the ABAP application template the following items:

Availability:

  • Message server HTTP logon
  • System logon check
  • RFC logon check
  • License status
  • Certificates expiry
  • Update status

Performance and system health:

  • Critical number ranges
  • Enqueue lock % filled
  • SICK detection
  • Dumps last hour
  • Update errors last hour
  • Cancelled jobs last hour
  • Long running work processes and jobs (see blog)

Security:

  • Global changeability: the system should be closed for changes
  • Locking of critical users like SAP* and DDIC (see blog)

Fine-tune the metrics so that you are alerted in situations where the system is having issues.

ABAP application server template

Make sure you cover in the ABAP application server template the following items:

Availability:

  • Local RFC logon test
  • Local HTTP logon test
  • Local Logon test
  • Message server disconnects (see blog)

Application server performance and health:

  • Number of critical SM21 messages
  • No more free work processes (see blog)
  • Update response times

You can consider setting up extra custom metrics for the application servers:

System host template

For the system host, the regular CPU, memory and disk template is sufficient. Fine-tune the thresholds to your comfort level.

Database template

Important items of the database template:

  • Database availability
  • Database health checks
  • Backup

Functions monitoring

In addition to the availability and performance monitoring mentioned above, also consider monitoring certain functions:

Monitoring SCM system

This blog focuses on monitoring SCM systems, also known as APO systems.

Monitoring productive SCM systems

SCM systems are often used as logistics optimization systems. They are mainly used in combination with traditional ECC systems. They are less needed in combination with S4HANA systems (or you can use the embedded SCM of HANA).

The core of an SCM system is a BI system. Much of the data uses similar extractors and process chains as a BI system. Hence, follow the tuning recommended for a BI system.

In addition, an SCM system has the LiveCache and the CIF (Core Interface).

LiveCache monitoring

LiveCache normally runs on a MaxDB database.

It is therefore important to activate, assign and fine-tune the metrics for the MaxDB database:

Focus on:

  • Availability
  • Backup
  • Performance

In addition to the database, you also need to activate, assign and fine-tune the LiveCache-specific application template:

This template contains the primary elements to monitor for the LiveCache functions like:

  • Availability of LiveCache as a function
  • Structure check for LiveCache
  • Memory issues for LiveCache specifically

Fine-tune the metrics so that you are alerted in situations where the system is having issues.

CIF monitoring

The CIF is the core interface between the SCM and ECC systems. The interface typically uses RFC and qRFC, and it works in both directions.

Set up monitoring for the CIF-specific RFCs and qRFCs:

Process chain monitoring

SCM uses process chains. To monitor process chains, read this dedicated blog.

Monitoring BW system

This blog focuses on monitoring BW systems.

Monitoring productive BW systems

BW systems are often used as reporting systems within an SAP landscape.

When monitoring a productive system, you will need to fine-tune the monitoring templates for:

  • ABAP 7.10 and higher Application template, for the ABAP application
  • ABAP 7.10 and higher Technical instance template, for the ABAP application servers
  • System host template
  • Database template

ABAP application template

Make sure you cover in the ABAP application template the following items:

Availability:

  • Message server HTTP logon
  • System logon check
  • RFC logon check
  • License status
  • Certificates expiry
  • Update status

Performance and system health:

  • Critical number ranges
  • SICK detection
  • Dumps last hour
  • Cancelled jobs last hour
  • Long running work processes and jobs (see blog): this is trickier in a BW system, since it can have longer-running extraction and processing jobs

Security:

  • Global changeability: the system should be closed for changes
  • Locking of critical users like SAP* and DDIC (see blog)

Fine-tune the metrics so that you are alerted in situations where the system is having issues.

ABAP application server template

Make sure you cover in the ABAP application server template the following items:

Availability:

  • Local RFC logon test
  • Local HTTP logon test (if any BW web scenario is used)
  • Local Logon test
  • Message server disconnects (see blog)

Application server performance and health:

  • Number of critical SM21 messages
  • No more free work processes (see blog)
  • Update response times

You can consider setting up extra custom metrics for the application servers:

For a BW system, some numbers are typically higher than on an ECC or S4HANA system. Response times of 1.5 seconds would indicate horrible performance on ECC, but are normal on a BW system.

System host template

For the system host, the regular CPU, memory and disk template is sufficient. Fine-tune the thresholds to your comfort level.

Database template

Important items of the database template:

  • Database availability
  • Database health checks
  • Backup

Functions monitoring

In addition to the availability and performance monitoring mentioned above, also consider monitoring certain functions:

Monitoring ECC and S4HANA systems

This blog focuses on monitoring ECC and S4HANA systems.

Monitoring productive ECC and S4HANA systems

ECC and S4HANA systems are at the core of each SAP landscape, and most vital to the business.

When monitoring a productive system, you will need to fine-tune the monitoring templates for:

  • ABAP 7.10 and higher Application template, for the ABAP application
  • ABAP 7.10 and higher Technical instance template, for the ABAP application servers
  • System host template
  • Database template

ABAP application template

Make sure you cover in the ABAP application template the following items:

Availability:

  • Message server HTTP logon
  • System logon check
  • RFC logon check
  • License status
  • Certificates expiry
  • Update status

Performance and system health:

  • Critical number ranges
  • Enqueue lock % filled
  • SICK detection
  • Dumps last hour
  • Update errors last hour
  • Cancelled jobs last hour
  • Long running work processes and jobs (see blog)

Security:

  • Global changeability: the system should be closed for changes
  • Locking of critical users like SAP* and DDIC (see blog)

Fine-tune the metrics so that you are alerted in situations where the system is having issues.

ABAP application server template

Make sure you cover in the ABAP application server template the following items:

Availability:

  • Local RFC logon test
  • Local HTTP logon test
  • Local Logon test
  • Message server disconnects (see blog)

Application server performance and health:

  • Number of critical SM21 messages
  • No more free work processes (see blog)
  • Update response times

You can consider setting up extra custom metrics for the application servers:

System host template

For the system host, the regular CPU, memory and disk template is sufficient. Fine-tune the thresholds to your comfort level.

Database template

Important items of the database template:

  • Database availability
  • Database health checks
  • Backup

Functions monitoring

In addition to the availability and performance monitoring mentioned above, also consider monitoring certain functions: