AIX & Linux OS agent

The LPAR2RRD OS agent is for those who want additional metrics that can be obtained only at the operating system level.
Example graphs: CPU, CPU Queue, Memory, LAN, SAN, SAN IOPS, SAN response time.
More examples on our demo site.

OS agent metrics and features

  • AIX FC errors in graphs (FC physical adapters only)
  • Linux: CPU core and CPU GHz graphs
  • Linux: Total IOPS, Data and Latency graphs
  • SAN multipath monitoring
  • JOB TOP: CPU and memory tracking of running processes, graphed over time
  • OS CPU utilization of user/sys/IO wait/idle in %
  • CPU queue: load average, blocked processes / raw / direct IO
  • Memory utilization of used/FS cache/free memory in MB
  • Paging rate in MB/sec
  • Paging space utilization in %
  • SAN (FC & vSCSI) throughput per adapter
    • data in MB/sec
    • IO/sec
    • response time (latency)
  • LAN (ethernet) throughput per adapter
    • data in MB/sec
    • packet count
  • Total IO throughput (Linux only, v7.20+)
    • IOPS
    • Data in MB/sec
    • response time (latency)
  • Filesystem capacity utilization
  • AIX SEA (Shared Ethernet Adapter) throughput per adapter in MB/sec (IBM Power only)
  • AIX WLM (Workload Manager) monitoring (IBM Power only)
  • AIX AME (Active Memory Expansion) allocation (IBM Power only)
  • Solaris Memory Pools

Operating systems

Implementation

The OS agent is implemented as a simple client/server application.
[Figure: OS agent architecture]
An LPAR2RRD daemon listens on port 8162 (the port officially assigned by IANA to the LPAR2RRD project) on the host where the LPAR2RRD server runs.
Each LPAR has a simple Perl-based agent installed, which is started every minute from the crontab and saves memory and paging statistics into a temporary file.
The agent contacts the server every 5-20 minutes and sends all locally stored data for that period.

Agent prerequisites

  • Perl interpreter. Most Unix/Linux systems include Perl in the base installation.
  • The agent can run under any user account; it does not need any special privileges in the OS.
  • Open TCP communication from each LPAR to the LPAR2RRD server on port 8162.
  • Connections are initiated by the LPARs.
  • Additional disk space on the LPAR2RRD server (about 40 MB per monitored LPAR)
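The network prerequisite above can be checked from an LPAR before the agent is scheduled. The following is a minimal sketch, not part of the agent: `parse_target` and `check_agent_port` are hypothetical helper names, the check relies on Perl's core `IO::Socket::INET` module (Perl being a prerequisite anyway), and it assumes the server is given as a hostname or IPv4 address, optionally followed by `:<PORT>`.

```shell
# Hypothetical helper: split "<server>[:<port>]" into "host port".
# Falls back to the standard LPAR2RRD port 8162 when no port is given.
# Assumption: no IPv6 literals (they contain ':' themselves).
parse_target() {
    case $1 in
        *:*) printf '%s %s\n' "${1%%:*}" "${1##*:}" ;;
        *)   printf '%s %s\n' "$1" 8162 ;;
    esac
}

# Hypothetical helper: return 0 if a TCP connection to the server can be opened.
check_agent_port() {
    # Re-use the positional parameters for the parsed host and port.
    set -- $(parse_target "$1")
    perl -MIO::Socket::INET -e '
        my ($host, $port) = @ARGV;
        my $sock = IO::Socket::INET->new(
            PeerAddr => $host,
            PeerPort => $port,
            Proto    => "tcp",
            Timeout  => 5,
        );
        exit($sock ? 0 : 1);
    ' "$1" "$2"
}

# Usage (replace the placeholder with your server):
#   check_agent_port "<LPAR2RRD server hostname/IP>" && echo "port reachable"
```

A successful connection only shows that the TCP path is open; the -d option of the agent remains the way to test the full data transfer.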

OS agent release notes

Usage

perl lpar2rrd-agent.pl [-s <step in seconds>] [-d] [-c] [-n <NMON_DIR>] [-b <HVMSH_PATH>] [-i <HVM_IP>] <LPAR2RRD server hostname/IP>[:<PORT>]

 -d  forces sending out data immediately to check communication channel (DEBUG purposes)
 -c  agent collects & sends only internal HMC data
 -n  agent sends only NMON data from NMON directory <NMON_DIR>
 -b  path to Hitachi HvmSh API
 -i  IP address of HVM (Hitachi Virtualization Manager)
 -t  <max send time in seconds>
 -s  <step in seconds>; do not set it below 60, and update the crontab line accordingly, e.g. -s 300 means */5 in the crontab minutes field
 -m  use sudo for multipath (only root can run it): "sudo multipath -ll"; put this into sudoers: lpar2rrd  ALL = (root) NOPASSWD: /usr/sbin/multipath -ll

 options -c and -n are mutually exclusive
 options -b and -i are both required for Hitachi agent
 no option - agent collects & sends standard OS agent data
Crontab entry for scheduling (preferably under a non-admin account):
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD server hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
The agent collects data and sends it to the LPAR2RRD server every 5-20 minutes.
If you use a non-standard LPAR2RRD port, append it to the server name after a ':' delimiter:
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD server hostname/IP>:<PORT> > /var/tmp/lpar2rrd-agent.out 2>&1
To send data to several LPAR2RRD server instances (the number is not restricted), list them all:
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD server 1 hostname/IP> <LPAR2RRD server 2 hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
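The -s option requires the crontab schedule to match the step (for example, -s 300 goes with */5 in the minutes field). A small sketch of that mapping follows; `step_to_cron_minutes` is a hypothetical helper, and restricting the step to whole minutes that divide an hour evenly is an assumption of this sketch, not a documented agent rule (the documentation only says not to go below 60).

```shell
# Hypothetical helper: map an agent step (-s, in seconds) to the
# minutes field of a crontab line.
step_to_cron_minutes() {
    step=$1
    case $step in
        ''|*[!0-9]*)
            echo "step must be a whole number of seconds" >&2
            return 1 ;;
    esac
    # The documentation says: do not set the step below 60.
    # Assumption of this sketch: only whole minutes are accepted.
    if [ "$step" -lt 60 ] || [ $((step % 60)) -ne 0 ]; then
        echo "step must be a multiple of 60 and at least 60" >&2
        return 1
    fi
    minutes=$((step / 60))
    # */N in cron only yields even spacing when N divides 60.
    if [ "$minutes" -ge 60 ] || [ $((60 % minutes)) -ne 0 ]; then
        echo "a step of $minutes minutes does not divide an hour evenly" >&2
        return 1
    fi
    if [ "$minutes" -eq 1 ]; then
        printf '%s\n' '*'
    else
        printf '%s\n' "*/$minutes"
    fi
}

# Usage: print the minutes field for a 5-minute step.
#   step_to_cron_minutes 300
```

The printed value replaces the first field of the crontab entry, with the same step passed to the agent through -s.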

NMON usage

Documentation
/usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -n <NMON_DIR> <LPAR2RRD server hostname/IP>  >/var/tmp/lpar2rrd-agent-nmon.out 2>&1
Crontab usage: run it either every 10 minutes to process new data collected by nmon, or once a day to load the whole day at once.
0,10,20,30,40,50 * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -n <NMON_DIR> <LPAR2RRD server hostname/IP> > /var/tmp/lpar2rrd-agent-nmon.out 2>&1
The -n option disables normal agent data collection; only NMON data is collected.
Use two separate crontab lines to load both standard OS agent data and NMON data.

HMC usage

This works only for HMCs connected via the HMC CLI (ssh).
Documentation
. /home/lpar2rrd/lpar2rrd/etc/lpar2rrd.cfg; /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -c <LPAR2RRD server hostname/IP>  >/var/tmp/lpar2rrd-agent-hmc.out 2>&1
crontab usage: run it every 5 minutes
0,5,10,15,20,25,30,35,40,45,50,55 * * * * . /home/lpar2rrd/lpar2rrd/etc/lpar2rrd.cfg; /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -c <LPAR2RRD server hostname/IP>  > /var/tmp/lpar2rrd-agent-hmc.out 2>&1
The -c option disables normal agent data collection; only internal HMC data is collected.
Use two separate crontab lines to load both standard OS agent data and HMC data.
Note: with the -c option the agent needs some environment settings; change the path to the etc/lpar2rrd.cfg file to match your installation.

Hitachi Compute Blade (BladeSymphony) usage

Place this into crontab:
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -b <HVMSH_PATH> -i <HVM_IP> <LPAR2RRD server hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
Use two separate crontab lines to load both standard OS agent data and Hitachi data.

Enhanced setting

  • By default, the agent sends data to the LPAR2RRD server at a random interval of 5-20 minutes.
    You can specify the maximum time before data is sent; the minimum is 5 minutes.
    * * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -t <max send time in seconds> <LPAR2RRD server hostname/IP>
    
  • To skip SAN checks via fcstat (they might cause problems, although this should not happen in v4.50+):
    * * * * * FCSTAT=/bin/true /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD server hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
    
  • By default, only interfaces with an assigned IP address are reported. Setting the LPAR2RRD_LAN_INT environment variable overrides this, and interfaces are then selected by that variable (regular expressions are supported on Linux only). Take care not to stack interfaces from different virtualization levels in one graph; the same traffic could be counted more than once, inflating the reported totals.
    * * * * * LPAR2RRD_LAN_INT="eth.*0$,bond.*,rhevm,9.*" /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD server hostname/IP>  > /var/tmp/lpar2rrd-agent.out 2>&1
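Before putting an LPAR2RRD_LAN_INT value into crontab, it can help to preview which interface names a pattern list would pick up. The sketch below is only an approximation: `iface_selected` is a hypothetical helper, and it assumes the value is a comma-separated list of regular expressions matched unanchored against the interface name, using `grep -E`. The agent's actual matching rules may differ, so treat the result as a sanity check only.

```shell
# Hypothetical helper: return 0 if the interface name matches any pattern
# of a comma-separated list (assumed semantics, see the note above).
iface_selected() {
    name=$1
    patterns=$2
    old_ifs=$IFS
    IFS=,
    set -f    # keep '*' in the patterns from being expanded as a file glob
    for p in $patterns; do
        if printf '%s\n' "$name" | grep -Eq -e "$p"; then
            IFS=$old_ifs
            set +f
            return 0
        fi
    done
    IFS=$old_ifs
    set +f
    return 1
}

# Usage: test a few interface names against the pattern list from the
# crontab example.
#   for i in eth0 eth1 bond1 lo; do
#       iface_selected "$i" 'eth.*0$,bond.*,rhevm,9.*' && echo "$i selected"
#   done
```

Checking the selected names against the virtualization layout is a quick way to spot interfaces that would be stacked in one graph and counted twice.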
    

Debug

  • The -d option forces data to be sent immediately to check the communication channel:
      /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <LPAR2RRD server hostname/IP>
    
  • error log: /var/tmp/lpar2rrd-agent-*.err
  • output log, last run: /var/tmp/lpar2rrd-agent-*.out
  • collected data waiting for sending: /var/tmp/lpar2rrd-agent--.txt
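When troubleshooting, the error logs listed above are usually the first place to look. A minimal sketch for printing the tail of every non-empty error log follows; `show_agent_errors` is a hypothetical helper, and the directory argument exists only so the function can be pointed somewhere other than /var/tmp.

```shell
# Hypothetical helper: print the last lines of each non-empty agent error log.
show_agent_errors() {
    dir=${1:-/var/tmp}
    for f in "$dir"/lpar2rrd-agent-*.err; do
        # Skips empty logs, and the unexpanded pattern when nothing matches.
        [ -s "$f" ] || continue
        echo "== $f"
        tail -n 20 "$f"
    done
}

# Usage:
#   show_agent_errors            # looks in /var/tmp
#   show_agent_errors /some/dir  # looks elsewhere
```

No output means that no agent error log in that directory contains anything.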