Creating a Custom Metric to Check SAPRouter SNC Certificate Expiry

This metric and script checks the SAPRouter SNC certificate expiry and gives an alert depending on how many days left until expiry.

Define SAProuter in Focused Run LMDB

Prerequisite: Make sure you have installed the SAP Host Agent and performed Outside Discovery on the SAPRouter host.

Go to the LMDB of you Focused Run system, then go to Single Customer Network:

Switch Namespace and go to “Technical Systems” and choose “Unspecific Standalone Application System” from the drop-down, then hit Create.

Enter:

Enter the Extended SID:

Add Software Component Version:

Search on the software, and then select the correct release, then Add. E.g.:

Save

Add Installation Path and Required LMDB Attributes. NOTE: Find out the relevant software installation path of the SAPRouter on the server.

Perform Simple System Integration

Configure Script and Custom SAPHostCtrl Operation

Prerequisite: The script requires the SAPRouter SNC Certificate to be properly installed, and the PSE credentials in the cred_v2 file for the account that runs the SAPRouter service.

A script is scheduled to run daily as the same user that runs the SAPRouter service. The script reads the expiry date of the from the PSE and writes it to a file in JSON format so that the FRUN metric can read and interpret it. The script by default has SAPRouter home directory as /usr/sap/saprouter or <drive>:\usr\sap\saprouter, and it can be changed. This will be denoted as <SAPROUTER_HOME>. The PSE file variable can also be changed.

Linux/Unix

Create a new executable file in the SAPRouter home directory and copy the code in Appendix A. Ensure it is owned and executable by the user that runs the SAPRouter service and belongs to group sapsys. E.g.,

chown <saprouter user>:sapsys saprouter_expiry_days.sh
chmod 744 saprouter_expiry_days.sh

In the script, adjust the following variables:

  • SAPROUTER_HOME
  • SECUDIR (The SECUDIR folder is where the PSE file resides)
  • SAPROUTER_USER
  • PSE

Test the Script. Switch to the user that runs the SAPRouter service, then execute the script:

/usr/sap/saprouter/saprouter_expiry_days.sh

Look for file NUMDAYS.json. It should contain e.g.:

{type:integer, name:NumDays, value:136}

Schedule script in the root crontab as follows:

# Run SAProuter SNC Certificate expiry check
00 09 * * * su - <saprouter user> -c /usr/sap/saprouter/saprouter_expiry_days.sh

Windows

Create a batch file (*.bat) in the SAPRouter home directory. In the script, adjust the following variables:

  • SAPROUTER_HOME
  • SECUDIR (The SECUDIR folder is where the PSE file resides)
  • PSE

Test the Script. Logon as the user that runs the SAPRouter service. Right-click on the bat script and “run as Administrator”. Alternatively, if you are not logged as the SAPRouter service account, you can “run-as” the user that runs the SAPRouter service and execute the script that way.

runas /user:<domain or hostname>\<SAPRouter service user> <SAPROUTER_HOME>\saprouter_expiry_days.bat

NOTE: The SAPRouter service user requires “Allow log on locally” user rights.

Look for file NUMDAYS.json. It should contain e.g.:

{type:integer, name:NumDays, value:136}

Schedule the Script. Open a command prompt as administrator. Execute the command:

SCHTASKS /CREATE /RU <DOMAIN>\<saprouter user> /RP <password> /SC DAILY /TN "Run SAPRouter SNC Certificate expiry check" /TR "C:\Windows\System32\cmd.exe /C \"<SAPROUTER_HOME>\saprouter_expiry_days.bat\"" /ST 09:00 /RL HIGHEST

Note: The start time e.g., 09:00 is in 24 hour format. You should see:

Run the new task to test it:

SCHTASKS /RUN /TN "Run SAPRouter SNC Certificate expiry check"

You should see:

Check the timestamp of the file NUMDAYS.json.

Create Custom Operation for saphostctrl

To load these values into Focused Run, a custom operation for saphostctrl needs to be created.

Linux/Unix

As Root: Create the following custom operations conf file:

/usr/sap/hostctrl/exe/operations.d/checksnccert.conf

Enter the following into the conf file:

Command: cat /usr/sap/saprouter/NUMDAYS.json
Description: Check number of days to SNC certificate expiry
ResultConverter: flat
Platform: Unix

Test the custom operation as follows:

/usr/sap/hostctrl/exe/saphostctrl -function ExecuteOperation -name checksnccert

Result should be:

Webmethod returned successfully
Operation ID: 06C635D6863A1EEDB6BC5C819EE199D7

----- Response data ----
description=Check number of days to SNC certificate expiry
{type:integer, name:NumDays, value:169}
exitcode=0

Windows

Create the following custom operations conf file:

C:\Program Files\SAP\hostctrl\exe\operations.d\checksnccert.conf

Enter the following into the conf file:

Command: FOR /F "tokens=* delims=" %x in (<SAPROUTER_HOME>\NUMDAYS.json) DO @echo %x
Description: Check number of days to SNC certificate expiry
ResultConverter: flat
Platform: Windows

In a Command Prompt as Administrator, test the custom operation as follows:

"C:\Program Files\SAP\hostctrl\exe\saphostctrl" -function ExecuteOperation -name checksnccert

Result should be:

Webmethod returned successfully
Operation ID: 00155D657F901EEDBCF9E32BC564F964

----- Response data ----
description=Check number of days to SNC certificate expiry
{type:integer, name:NumDays, value:360}
exitcode=0

Create Custom Alert in Focused Run

Create a new monitoring template for SAPRouter here:

Enter into Expert Mode. Create a new Alert with the following settings:

NOTE: If the tick box “Do not Group Individual Occurrences” is ticked, it will alert at each data collection regardless of its previous rating, and not only at a change of rating.

Create Custom Metric In Focused Run

Create a new Metric with the following settings:

Data Collection

All parameters of the Data Collector which are fixed should have the “Configure” box unticked, and the common parameters pre-filled with the correct values.

The “Custom Operation” parameter is the saphostctrl operation “checksnccert” created earlier. The “Metric Name” parameter is the name of the metric in the JSON output file produced by the script, which is “NumDays”.

Usage

Threshold

The threshold picks up the integer as returned by the script; if it falls to below or equals 10 days, then raise a yellow alert, if below or equals 5 days, then raise a red alert. Choose whatever threshold values suit you.

Assignment

Assign the Metric to the Alert:

Activate the Alert

Apply SAPRouter Template and Check Monitoring

TIP: Initially set the collection interval to 5 minutes and apply template to see whether it is collecting data. Then you can set the collection interval back to daily.

You should see this in the Monitoring Application:

Appendix A: saprouter_expiry_days.sh

#!/bin/bash
# Outputs the expiry date of the SNC Certificate and calculates the number of days
# If the number of days falls below threshold, it sends an alert
# Written by Tony Swietochowski

SAPROUTER_USER=saprouter
SAPROUTER_HOME=/usr/sap/saprouter
SECUDIR=$SAPROUTER_HOME/sec
PSE=local.pse
HOSTNAME=$(hostname -f)

# Check  for saprouter user
[[ ! "$USER"=="$SAPROUTER_USER" ]] && echo "This script must be run using the $SAPROUTER_USER user. Exiting." && exit 1

EXPIRYDATE=$(${SAPROUTER_HOME}/sapgenpse get_my_name -p $PSE -n validity 2>&1 | grep NotAfter | awk -F\( '{print$2}' | cut -c -6)
NUMDAYS=$(echo $(( ($(echo $(date --date="$EXPIRYDATE" +%s) - $(date -d $(date +%y%m%d) +%s)) )/86400 )))
# Above method based on https://stackoverflow.com/questions/4946785/how-to-find-the-difference-in-days-between-two-dates

echo $NUMDAYS > $SAPROUTER_HOME/NUMDAYS
echo \{type:integer, name:NumDays, value:$NUMDAYS\} > $SAPROUTER_HOME/NUMDAYS.json

Appendix B: saprouter_expiry_days.bat

@echo off

REM Outputs the expiry date of the SNC Certificate and calculates the number of days
REM If the number of days falls below threshold, it sends an alert
REM Written by Tony Swietochowski

set SAPROUTER_HOME=D:\usr\sap\saprouter
set SECUDIR=%SAPROUTER_HOME%\sec
set SNC_LIB=%SECUDIR%\sapcrypto.dll
set PSE=local.pse

for /f "tokens=2 delims=\" %%i in ('whoami') do set THISUSER=%%i
FOR /F "tokens=* USEBACKQ" %%F IN (`hostname`) DO (SET HOSTNAME=%%F)

chdir /d %SAPROUTER_HOME%

for /f "tokens=2 delims=(" %%a in ('%SAPROUTER_HOME%\sapgenpse.exe get_my_name -p %PSE% -n validity ^2^>^&^1 ^| findstr /l "NotAfter"') do set DATESTRING=(%%a

set expiry_year=20%DATESTRING:~1,2%
set expiry_month=%DATESTRING:~3,2%
set expiry_day=%DATESTRING:~5,2%

set current_year=%date:~-4%
set current_month=%date:~4,2%
set current_day=%date:~7,2%

set "from=%current_month%-%current_day%-%current_year%"
set "to=%expiry_month%-%expiry_day%-%expiry_year%"
echo Wscript.Echo DateDiff("d", "%from%", "%to%") > %TEMP%\tmp.vbs

for /f %%a in ('cscript /nologo %TEMP%\tmp.vbs') do set /a "numdays=%%a"

del %TEMP%\tmp.vbs

echo Number of days to SAPRouter certificate expiry: %numdays% > %SAPROUTER_HOME%\NUMDAYS.log
echo {type:integer, name:NumDays, value:%numdays%} > %SAPROUTER_HOME%\NUMDAYS.json

SLT integration monitoring…

This blog focuses specifically on SLT integration monitoring. Monitoring an SLT system itself is explained in this dedicated blog.

Set up SLT integration scenario

Start the integration and exception monitoring FIORI tile:

On the configuration add the SLT system:

Select SLT as specific scenario:

On the Monitoring part you can filter on a specific source system and/or SLT schema:

On the 3rd tab you can set the Alerting in cases of errors:

Now save and activate. The monitoring is active now.

Next step is to use this system in a model for your scenario:

Using the SLT integration monitor

If you open the FIORI tile and you have selected your scenario, you still need to perform an extra click to go to the SLT monitor:

First you get overview of your system(s):

You need to click on the blue numbers to drill down:

This gives overview of errors, source connection status and target connection status.

You cannot drill down further on this tile. If you see an error, you need to go to your SLT server and start transaction LTRO to see all detailed error and start fixing from there. Transaction LTRO can have errors shown that are not visible in transaction LTRC. Focused Run uses LTRO data.

Monitoring web dispatchers…

This blog will focus on monitoring of standalone web dispatchers. Standalone web dispatchers are used to load balance web traffic towards ABAP and/or JAVA systems. Common use case is to have web dispatcher for a large Netweaver Gateway FIORI installation.

Monitoring productive cloud web dispatchers

Monitoring of web dispatchers focuses on availability and connectivity/performance.

The web dispatcher template contains most needed elements out of the box:

Issues with performance are often caused by limitations set in the web dispatcher configuration. Keep these settings active.

You might want to add specific custom metric to monitor the most important URL for your web dispatcher. Read more in this specific blog.

Next to this setup the normal host monitoring to make sure the file system and CPU of the web dispatcher are not filling up and causing availability issues for the web dispatcher function.

Monitoring non-productive web dispatcher systems

For monitoring non-productive web dispatcher systems, it is normally sufficient to restrict to host and availability monitoring.

Bug fix OSS notes

3373764 – Issue with Content Server on Web dispatcher templates

Monitoring content servers…

Content servers are often used to store attachment and data archiving files. They are technical systems with usually no direct access for end user. End users normally fetch and store data form content server via an ABAP or JAVA application.

Technical setup

The technical setup for monitoring content server in SAP Focused Run is described in detail in a PDF attached to OSS note 3151832 – SAP Content Server 6.40/6.50/7.53 Monitoring with SAP Focus Run. There is no need to repeat here.

The main part of content server monitoring is availability.

ABAP connection to content server monitoring

In some cases both your ABAP stack and content server are up and running, but communication between them is failing on application level. This leads to not working system for end users. Root causes can be firewall issues, certificate issues, or somebody altered settings.

To test the ABAP system connection to content server a custom ABAP program is needed. See this blog. You can schedule the program in batch and set up a new custom metric to capture the system log entry written by the program.

System host template

For system host the regular CPU, memory, disc template is sufficient. Finetune the thresholds to your comfort level.

Database template

Important items of the database template:

  • Database availability
  • Database health checks
  • Backup

In most installations it is chosen to install Content Server with the SAP MaxDB database (similar to LiveCache).

Relevant OSS notes

Batch job monitoring overview SAP Focused Run 4.0…

This blog will explain the use of batch job monitoring in SAP Focused Run 4.0. If you are using older SAP Focused Run 3.0 version, read this blog. If you are on 3.0 and did not use batch job monitoring, then don’t. First upgrade to 4.0 to avoid conversion effort.

For setup of batch job monitoring in SAP Focused Run 4.0, read this blog.

New powerful functions in SAP Focused Run 4.0 on Analytics and Job trending are explained below.

Batch job monitoring

Batch job monitoring in SAP Focused Run is part of Job and Automation monitoring:

After opening the start screen and selecting the scope you get the total overview:

Click on the top round red errors to zoom in to the details (you can’t drill down on the cards below):

Click on the job to zoom in:

Systems overview

Click on the system monitoring button:

On the screen, zoom out on the overview by clicking the blue Systems text top left:

Now you get the overview per system:

Batch job analysis

Batch job analysis is a powerful function. Select it in the menu:

Result screen shows 1 week data by default:

The default sorting is on total run time.

Useful sortings:

  1. Total run time: find the jobs that run long in your system in total. These most likely will also be the ones that cause high load, or business is waiting long for to finish to give results.
  2. Average run time: find the jobs that take on average long time to run. By optimizing the code or batch job variant, the run time can be improved.
  3. Failure rate: find the jobs that fail with a high %. Get the issues known and then address them.
  4. Total executions: some jobs might simply be planned too frequently. Reduce the run frequency.

By clicking on the job trend icon at the end of the line you jump to the trend function.

Job trend function

From the analysis screen or by selecting the Trend graph button you reach the job trend function:

Select the job and it will show the trend for last week:

You can see if execution went fine, or not, and bottom right see average time the job took to complete.

Batch job monitoring setup in Focused Run 4.0…

In Focused Run 4.0 the batch job monitoring was revised. If you are using older version of Focused Run, read this blog on older batch job monitoring setup.

For using batch job monitoring, read this blog.

Open batch job monitoring

Batch job monitoring is now combined with other automation functions like process chain monitoring. Open this tile:

Global settings

For batch job monitoring settings, open the configuration and start with the global settings:

Here you can see the data volume used and set the retention time for how long aggregated data is kept.

You can also set generic rating rules:

Activation per system

In the activation per system select the system and it will open the details:

First switch on the generic activation for each system.

Activation of jobs to monitor

Now you can start creating a job group. First select left Job groups, then the Plus button top right:

Add a job by clicking the plus button and search for the job:

Press Save to add the job to the monitoring.

Grouping logic

You can group jobs per logical block. For example you can group all basis jobs, all Finance jobs, etc. Or you can group jobs per system. Choice is up to you. Please read first the part on alerting. This might make you reconsider the grouping logic.

Adding alerting

The jobs added to the group are monitored. But alerting is a separate action.

Go to the Alerting part of the job group. And an alert. First select the Alert type (critical status, delay, runtime, missing a job). Assign a notification variant (who will get the alert mail), and decide on alert grouping or atomic alerts.

If you do not specify a filter it will apply for the complete group. You can also apply a filter here to select a sub group of the job group.

Based on the alerting you might want to reconsider the grouping.

Relevant OSS notes

Monitoring SAP Cloud Connector…

This blog will focus on monitoring of Cloud Connector systems.

Monitoring productive cloud connector systems

The Cloud connector is used between on premise systems and Cloud solutions provided by SAP.

Monitoring of cloud connector focuses on availability and connectivity.

The cloud connector template contains all the needed elements out of the box:

If your landscape has only 1 cloud connector that is also used for non-productive systems, you might find a lot of issues in the non-productive system. Like expired certificates, channels not working, many logfile entries. If the cloud connector is very important for your business, it is best to split off the productive cloud connector from the non-productive usage.

This way you can apply sharp rule settings for production: even single issue will lead to alert. While on non-production the developers will be making a lot of issues as part of their developer process.

Monitoring non-productive cloud connector systems

In your landscape you might have a non-productive cloud connector that is used for testing purposes. In the non-productive cloud connectors you might apply a different template with less sensitive settings on certificates, logfiles and amount of tunnels that are failing.

Relevant OSS notes

Monitoring ADS (Adobe Document Server)…

This blog will focus on monitoring on Adobe Document Server (ADS)

Monitoring productive ADS systems

When monitoring a productive system, you will need to finetune the monitoring templates for:

  • SAP J2EE 7.20 – 7.50 Application template, for the JAVA application
  • SAP J2EE 7.20 – 7.50 Technical instance template, for the JAVA application servers
  • System host template
  • Database template

JAVA APPLICATION TEMPLATE

Make sure you cover in the JAVA application template the following items:

Availability:

  • JAVA HTTP availability
  • Expiring certificates
  • JAVA license expiry

From the JAVA instance template make sure to cover the following items:

  • J2EE application status
  • Instance HTTP availability and logon
  • JAVA server node status
  • GC (Garbage collection)

Fine tune the metrics so you are alerted on situation where the system is having issues.

ADobe document server template

ADS has a specific Technical instance template.

Make sure you activate it:

Most important here is the first one: ADS availability. Please make sure you are alerted when this one is not available.

System host template

For system host the regular CPU, memory, disc template is sufficient. Finetune the thresholds to your comfort level.

Database template

Important items of the database template:

  • Database availability
  • Database health checks
  • Backup

Creating a Custom Metric to Check if a Linux Filesystem is Mounted…

Some filesystems are critical to a business, such as those used in interfaces. This custom metric group will alert if a filesystem is not mounted.

Create the Bash Script to Check the Filesystem Status

Firstly, we need to create a bash script that takes the filesystem as its input argument and then checks its status. Create the following script called /sbin/checkfilesystemmounted.sh (owner is root, permissions 755). You may put this script somewhere else if you prefer, but be sure to refer to the correct location later on in this post.

#!/bin/bash
findmnt $1 >/dev/null && echo \{type:integer, name:FileSystemMounted, value:1\} || echo \{type:integer, name:FileSystemMounted, value:0\}

The findmnt command returns the mount details if the filesystem is mounted. The filesystem is passed as a script argument in variable $1. If the filesystem is mounted, the script returns integer 1. If the filesystem is not mounted, the script returns integer 0. For example, to check your desired filesystem, execute it like this as root:

/sbin/checkfilesystemmounted.sh /the/filesystem/you/want/to/check

The output will be in JSON format. If the filesystem is mounted, the value will be 1, as follows:

{type:integer, name:FileSystemMounted, value:1}

The name:FileSystemMounted is the name of the value to be picked up by saphostctrl, as described next.

Create the Custom Operation for saphostctrl

To load these values into Focused Run, we create a custom operation for saphostctrl. Create the following custom operations conf file:

/usr/sap/hostctrl/exe/operations.d/checkfilesystemmounted.conf

This contains:

Command: /sbin/checkfilesystemmounted.sh $[FILESYSTEM]
Workdir: /home/sapadm
Description: Check if filesystem is mounted
ResultConverter: flat
Platform: Unix

To test the custom operation, execute the following command:

/usr/sap/hostctrl/exe/saphostctrl -function ExecuteOperation -name checkfilesystemmounted FILESYSTEM=/the/filesystem/you/want/to/check

The result should be as per the following example:

Webmethod returned successfully
Operation ID: 0A02C69098121EDDA68C041B50FE858D

----- Response data ----
description=Check if filesystem is mounted
{type:integer, name:FileSystemMounted, value:1}
exitcode=0

Create the Custom Alert in SAP Focused Run

In Focused Run, we create an alert in a Linux host monitoring template. For example, the alert name is “Interface Filesystem not Mounted”. The Alert should be in Category “Exceptions” and the Severity is up to you. In this case it is 9.

Create the Custom Metric Group in SAP Focused Run

Next, we create the custom Metric Group . A Metric Group allows variants to be created, and each variant corresponds to a filesystem you wish to monitor.

Overview Tab:
  • Name: “Interface Filesystem not Mounted”
  • Category: Exceptions
  • Class: Metric Group
  • Data Type: Integer
  • Technical Name: INTERFACE_FILESYSTEM_NOT_MOUNTED
Data Collection Tab:
  • Data Collector Type: Diagnostic Agent (push)
  • Data Collector Name: OS: ExecuteOperation
  • Collection Interval: 5 Minutes (depending on the criticality)
  • CUSTOM_OPERATION_NAME: checkfileystemmounted – This corresponds to the custom operation for saphostctrl created earlier
  • METRIC_NAME: FileSystemMounted – This corresponds to the name of the metric in the JSON output by the bash script
  • RETURNFORMAT: JSON – This is the output format of the bash script
Usage Tab:
Threshold Tab:

As the script returns a numeric value 0 if the filesystem is not mounted, then the threshold will alert if the value is 0.

Assignment Tab

Assign to the custom alert created earlier.

Add Variants

The variable passed to the saphostctrl operation is “FILESYSTEM”. We can add the rest of the filesystems as individual variants. The format for the operation parameters is as follows:

FILESYSTEM:/the/filesystem/you/want/to/check

For example:

You can enter as many filesystems as you like as separate variants.

Activate Alert

Go to the “Metrics, Events, Alerts Hierarchy” tab, and activate System Monitoring.

Testing the Metric

In a non-Production environment, try to unmount a filesystem, and at most 5 minutes later, there should be an alert produced.

Monitoring EWM systems…

This blog will focus on monitoring on EWM systems.

Monitoring productive EWM systems

EWM systems are at the often used as stand alone systems that make sure logistics and warehousing can keep running at high availability. If the connected ECC or S4HANA system is down, EWM can continue to support logistics operations.

EWM can be older version based on SCM/BI system core. Newer EWM systems are using S4HANA with EWM activated as standalone.

Extra in an EWM system are the use of qRFC and the CIF (Core interface). And many EWM systems have users that interact with the system via ITS GUI based handheld scanners.

CIF monitoring

The CIF is the core interface between SCM and ECC system. The interface typically uses RFC and qRFC. And it is working both ways.

Setup for the CIF specific RFC’s and qRFC’s the monitoring:

Handheld scanners

Many EWM systems are having interaction with scanners via the ITS server. Basically this is a small web page on a scanning device.

Make sure you monitor the availability of the URL’s that the scanners are using. More on URL monitoring can be found in this blog.

ABAP APPLICATION TEMPLATE

Make sure you cover in the ABAP application template the following items:

Availability:

  • Message server HTTP logon
  • System logon check
  • RFC logon check
  • License status
  • Certificates expiry
  • Update status

Performance and system health:

  • Critical number ranges
  • SICK detection
  • Dumps last hour
  • Cancelled jobs last hour
  • Long running work processes and jobs (see blog)

Security:

  • Global changeability should be that the system is closed
  • Locking of critical users like SAP* and DDIC (see blog)

Fine tune the metrics so you are alerted on situation where the system is having issues.

ABAP APPLICATION SERVER TEMPLATE

Make sure you cover in the ABAP application server template the following items:

Availability:

  • Local RFC logon test
  • Local HTTP logon test (if any handheld scenario is used)
  • Local Logon test
  • Message server disconnects (see blog)

Application server performance and health:

  • Amount of critical SM21 messages
  • No more free work processes (see blog)
  • Update response times

You can consider to setup extra custom metrics for the application servers:

System host template

For system host the regular CPU, memory, disc template is sufficient. Finetune the thresholds to your comfort level.

Database template

Important items of the database template:

  • Database availability
  • Database health checks
  • Backup