VMware Monitoring

Skip the Prerequisites, Web, and LPAR2RRD tabs if you are configuring the Virtual Appliance, Docker, or a container.

Follow the installation procedure for your operating system platform.
VMware monitoring diagram

Prerequisites

  • Create a user on each vCenter with read-only role.

    Create the account; it must be a global one (such as lpar2rrd in the xorux.com domain shown below).
    VMware user rights

    Assign the read-only role to the lpar2rrd@xorux.com user.
    VMware user rights


  • Download Linux VMware Perl SDK in tar.gz format from the VMware site.
    It requires free registration at VMware.
    It does not matter whether you run AIX or Linux, or whether the package is 32-bit or 64-bit; only a few Perl libraries from it are used.
    Always use the latest SDK 7.x.
    Download link: VMware vSphere Perl SDK 7.0U2

  • Allow access from the LPAR2RRD host to vCenter on port 443 TCP (https).
    You can test it via command line like below or during connection test from the UI.
    perl /home/lpar2rrd/lpar2rrd/bin/conntest.pl <vCenter-host> 443  
      Connection to <vCenter-host> on port "443" is ok
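When several vCenters are monitored, the port check above can be repeated in a loop. The following is a minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available on the LPAR2RRD host. The function name check_port and the host names in the usage comment are illustrative and are not part of LPAR2RRD; conntest.pl remains the documented way to test.

```shell
#!/bin/bash
# Hypothetical helper, not shipped with LPAR2RRD.
# Returns 0 if a TCP connection to host:port can be opened within 5 seconds.
check_port() {
    local host=$1 port=$2
    # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Usage (host names are placeholders):
# for vc in vcenter01 vcenter02; do
#     if check_port "$vc" 443; then
#         echo "$vc: port 443 is reachable"
#     else
#         echo "$vc: port 443 is NOT reachable"
#     fi
# done
```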
    

Installation of VMware SDK from the UI

    You can use this method only on the Virtual Appliance.
    Select previously downloaded Linux VMware Perl SDK, and upload it.
    VMware Perl SDK installation 1
    If you get the error "Server version unavailable" during the connection test, check the Troubleshooting section below.

Installation of VMware SDK manually

  • Download VMware-vSphere-Perl-SDK-7.0.0-17698549.i386.tar.gz into /tmp and run the install script:
    # su - lpar2rrd
    cd /home/lpar2rrd/lpar2rrd
    ./bin/vmware_install.sh /tmp
      /tmp/VMware-vSphere-Perl-SDK-7.0.0-17698549.i386.tar.gz found
      Extracting selected package to /home/lpar2rrd/lpar2rrd/vmware-vsphere-cli-distrib ...
      Installing selected libraries and apps to /home/lpar2rrd/lpar2rrd/vmware-lib ...
      ...
      Continue by define VMware hosts and their credentials
      UI: menu ➡ VMware ➡ Configure ➡ Add credentials
    
  • Test the vCenter connection (the last command must print the vCenter time).
    Use double quotes around the username and password only in this test.
    cd /home/lpar2rrd/lpar2rrd
    . etc/lpar2rrd.cfg
    $PERL vmware-lib/apps/connect.pl --version
      vSphere SDK for Perl version: 7.0.0
      Script 'connect.pl' version: 1.0
    $PERL vmware-lib/apps/connect.pl --server <vCenter host> --username "lpar2rrd@your_domain" --password "XXXXX"
      Connection Successful
      Server Time : 2016-02-25T16:28:44.086369Z
    
    If you get output like the one below, follow the Troubleshooting section below.
    Server version unavailable at 'https://<vCenter host>:443/sdk/vimService.wsdl' at /home/lpar2rrd/lpar2rrd/vmware-lib/apps/../VMware/VICommon.pm line 704.
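The two outcomes of the connection test shown above can be told apart automatically. This is a minimal sketch that only looks for the two message strings quoted in this section; the function name classify_connect_output and its verdict texts are hypothetical and are not part of LPAR2RRD or the VMware SDK.

```shell
#!/bin/sh
# Hypothetical helper: reads connect.pl output on stdin and prints a verdict.
classify_connect_output() {
    output=$(cat)
    case $output in
        *"Connection Successful"*)
            echo "ok" ;;
        *"Server version unavailable"*)
            echo "server-version-unavailable: see the Troubleshooting section" ;;
        *)
            echo "unknown: inspect the output manually" ;;
    esac
}

# Usage, with the same placeholders as in the test above:
# $PERL vmware-lib/apps/connect.pl --server <vCenter host> \
#     --username "lpar2rrd@your_domain" --password "XXXXX" 2>&1 | classify_connect_output
```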
    

Configuration

  • Continue by defining VMware hosts (vCenter) in the UI:
    http://your_web_server/lpar2rrd/
    Add credentials for every vCenter, and for every ESXi host that is not managed by a vCenter.
    Settings ➡ VMware ➡ New ➡ place credentials (username in format: lpar2rrd@your_domain)
    Test vCenter connection via "Test" button.

    VMware install

  • Run "Connection test" for each added vCenter to make sure the connection works.

  • If you use the Virtual Appliance, press the "run data load" button and wait a bit.
    Your vCenter appears within a few minutes; refresh the browser with Ctrl-F5. That is all!

  • If you do not use the Virtual Appliance, run ./load.sh, which starts collecting VMware data:
    cd /home/lpar2rrd/lpar2rrd
    ./load.sh 
    
  • You should see VMware hosts in the UI when load.sh finishes (reload the UI in the browser via Ctrl-F5)

Troubleshooting

  • If you get "Server version unavailable" during connection test or during manual connection then:
    cd /home/lpar2rrd/lpar2rrd
    . etc/lpar2rrd.cfg
    $PERL vmware-lib/apps/connect.pl --server <server name> --username "lpar2rrd"
      Server version unavailable at 'https://vcenter01:443/sdk/vimService.wsdl' at /lpar2rrd/lpar2rrd/vmware-lib/apps/..//VMware/VICommon.pm line 734.
    
    1. Make sure that a firewall does not block the traffic at layer 7.

    2. If that is not the case or it did not help, follow these procedures.

  • Data is not being collected and error.log contains this error:
    more /home/lpar2rrd/lpar2rrd/logs/error.log
    Thu Feb 18 10:49:01 2016: vmware name: vcenter01 has not array of hosts ?!? : 
    ...
    
    The user configured for retrieving data from the vCenter probably does not have read access to the whole vCenter, so it cannot access all required performance counters.
    Check the user rights setup with your vCenter administrator.

  • Test vCenter connection from the command line:
    1. Entering the password manually:
      cd /home/lpar2rrd/lpar2rrd
      . etc/lpar2rrd.cfg
      $PERL vmware-lib/apps/connect.pl --server <vCenter host> --username "lpar2rrd@your_domain" --password "XXXXX"
          Connection Successful
          Server Time : 2016-08-02T06:58:26.355767Z
      
    2. Using the already saved password:
      $PERL vmware-lib/apps/connect.pl --server <vCenter host> --username "vCenter read only user" --credstore .vmware/credstore/vicredentials.xml
      
      You should get the same answer with no password request.
      Use the same hostname and username as defined in LPAR2RRD VMware Configuration.

  • In case of a problem check our forum or contact us via support@lpar2rrd.com
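To see which vCenters are affected by the "has not array of hosts" error described above, the log can be filtered for that message. This is a minimal sketch, assuming the log line format is exactly as in the example above; the function name vcenters_without_hosts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: lists the vCenter names reported with
# "has not array of hosts" in the given LPAR2RRD error log.
vcenters_without_hosts() {
    logfile=$1
    # Keep only the name between "vmware name: " and " has not array of hosts"
    sed -n 's/.*vmware name: \(.*\) has not array of hosts.*/\1/p' "$logfile" | sort -u
}

# Usage:
# vcenters_without_hosts /home/lpar2rrd/lpar2rrd/logs/error.log
```

Each name printed is a vCenter whose configured user should be reviewed for read access.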



The OS agent is an add-on feature for monitoring at the operating system level.
It monitors CPU and memory utilization, paging, and LAN and SAN traffic on all adapters.
The OS agent must be deployed to every monitored VM.
The agent is written in Perl and calls basic OS commands such as vmstat and iostat to obtain the required statistics.

OS agent architecture


Prerequisites

  • Perl
  • Open TCP communication between each VM and the LPAR2RRD server on port 8162.
    Connections are initiated from the VM side.
  • Additional disk space on the LPAR2RRD server (about 40 MB per monitored VM)
  • Preferably create a dedicated user lpar2rrd with minimal rights on each VM:
    # useradd -c "LPAR2RRD agent user" -m lpar2rrd
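The disk space prerequisite above scales linearly with the number of monitored VMs. As a rough illustration, using the figure of about 40 MB per VM quoted above (the function name estimate_agent_disk_mb is hypothetical, and the result is only an estimate):

```shell
#!/bin/sh
# Hypothetical helper: estimated extra disk space in MB on the LPAR2RRD server,
# based on the rough figure of 40 MB per monitored VM.
estimate_agent_disk_mb() {
    vm_count=$1
    echo $((vm_count * 40))
}

# Usage:
# estimate_agent_disk_mb 250    # prints 10000 (about 10 GB)
```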
    

OS agent installation (client)

  • Get the latest OS agent from the download page

  • Linux installation under root
    # rpm -Uvh lpar2rrd-agent-5.00-0.noarch.rpm
    # rpm -qa|grep lpar2rrd-agent
      lpar2rrd-agent-5.00-0
    
  • Solaris x86 installation under root:
    # gunzip lpar2rrd-agent-5.00-0.solaris-i86pc.tar.gz
    # tar xf lpar2rrd-agent-5.00-0.solaris-i86pc.tar
    # pkgadd -d .
      The following packages are available:
      1  lpar2rrd-agent     LPAR2RRD OS agent 5.00
                            (i86pc) 5.00
     ...
    
    Solaris upgrade under root:
    # pkgrm lpar2rrd-agent
    # pkgadd -d .
    
  • Schedule the agent to run every minute from the crontab on every VM.
    Place this line into the lpar2rrd crontab:
    # su - lpar2rrd
    crontab -e 
    * * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD-SERVER> > /var/tmp/lpar2rrd-agent.out 2>&1
    
    Replace <LPAR2RRD-SERVER> with the hostname of your LPAR2RRD server.

  • You might need to add the lpar2rrd user to /var/adm/cron/cron.allow (/etc/cron.allow on CentOS 8) as root if "crontab -e" above fails.
    # echo "lpar2rrd" >> /var/adm/cron/cron.allow
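When the agent is rolled out to many VMs, editing each crontab by hand with "crontab -e" becomes tedious. The following is a minimal sketch of a non-interactive alternative that appends the agent line only if it is not already present. The function name ensure_cron_line is hypothetical; it is plain shell and not part of the LPAR2RRD agent package.

```shell
#!/bin/sh
# Hypothetical helper: reads a crontab on stdin and prints it with the given
# line appended, unless an identical line is already present.
ensure_cron_line() {
    line=$1
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -F: fixed string, -x: whole line must match, -q: quiet
    if ! printf '%s\n' "$current" | grep -Fxq -- "$line"; then
        printf '%s\n' "$line"
    fi
}

# Usage as the lpar2rrd user (replace <LPAR2RRD-SERVER> first):
# AGENT_LINE='* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD-SERVER> > /var/tmp/lpar2rrd-agent.out 2>&1'
# crontab -l 2>/dev/null | ensure_cron_line "$AGENT_LINE" | crontab -
```

Running the usage line twice leaves a single agent entry in the crontab, because the second run finds the identical line and adds nothing.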
    

LPAR2RRD server (daemon)

  • Edit etc/lpar2rrd.cfg and set the following (if it is not already set):
    vi /home/lpar2rrd/lpar2rrd/etc/lpar2rrd.cfg
    
    LPAR2RRD_AGENT_DAEMON=1
    
  • The daemon is started when load.sh starts
    ./load.sh
      Starting LPAR2RRD daemon on port:8162
      ...
    
  • Make sure it is running and listening on port 8162:
    ps -ef|grep lpar2rrd-daemon
      lpar2rrd 10617010 1 0 Mar 16 - 0:00 /usr/bin/perl -w /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl
    netstat -an| grep 8162
      tcp4  0  0  *.8162   *.*   LISTEN
    
  • OS agent data graphs will appear in the UI; use Ctrl-F5 to refresh your web browser
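The netstat check above can be scripted so that it returns a plain yes/no answer. This is a minimal sketch, assuming netstat -an prints the local address in the fourth column and the socket state in the last column, as in the sample output above; the function name is_listening is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "netstat -an" output on stdin and succeeds
# if some socket is in LISTEN state on the given local port.
is_listening() {
    awk -v port="$1" '
        # Field 4 is the local address, written as *.8162 in the sample
        # above or as 0.0.0.0:8162 on Linux; the state is the last field.
        $NF == "LISTEN" && $4 ~ ("[.:]" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage:
# netstat -an | is_listening 8162 && echo "daemon is listening on port 8162"
```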

Troubleshooting

  • Client (agent) side:
    • Test if communication through the LAN is allowed.
      telnet  <LPAR2RRD-SERVER> 8162
        Connected to 192.168.1.1   .
        Escape character is '^]'.
      
      This is OK; exit with either Ctrl-C or ^].

    • Check following agent files:
      data store: /var/tmp/lpar2rrd-agent-*.txt
      error log: /var/tmp/lpar2rrd-agent-*.err
      output log: /var/tmp/lpar2rrd-agent.out

    • Run the agent from the command line:
      /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <LPAR2RRD-SERVER>
        ...
        Agent send     : yes : forced by -d 
        Agent send slp: sending wait: 4
        OS/HMC agent working for server: <LPAR2RRD-SERVER>
        store file for sending is /var/tmp/lpar2rrd-agent-<LPAR2RRD-SERVER>-lpar2rrd.txt
      
      This means that data has been sent to the server and all is fine.
      Here is an example where the agent is not able to send data:
      /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <LPAR2RRD-SERVER>
        ...
        Agent send     : yes : forced by -d 
        Agent send slp: sending wait: 1
        OS/HMC agent working for server: <LPAR2RRD-SERVER>
        store file for sending is /var/tmp/lpar2rrd-agent-<LPAR2RRD-SERVER>-lpar2rrd.txt
        Agent timed out after : 50 seconds /opt/lpar2rrd-agent/lpar2rrd-agent.pl:265
      
      It means that the agent could not contact the server.
      Check the network communication, the port, the telnet test above, DNS resolution of the server, etc.

  • Server side:
    • Test whether the daemon on the LPAR2RRD server is running and check the logs:
      ps -ef|grep lpar2rrd-daemon
        lpar2rrd 10617010 1 0 Mar 16 - 0:00 /usr/bin/perl -w /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl
      cd /home/lpar2rrd/lpar2rrd
      tail logs/error.log-daemon
      tail logs/daemon.out
        new server has been found and registered: Linux (lpar=linuxhost01)
        mkdir : /lpar2rrd/data/Linux/no_hmc/linuxhost01/
      
      This means that a new OS agent has been registered from linuxhost01 (Linux stand-alone example).

    • Test whether OS agent data is being stored on the LPAR2RRD server and has a current timestamp:
      cd /home/lpar2rrd/lpar2rrd
      ls -l data/<server name>/*/<VM name>/*mmm
        -rw-r--r-- 2 lpar2rrd staff  7193736 Mar 17 16:16 data/<server name>/no_hmc/<VM name>/cpu.mmm
        -rw-r--r-- 2 lpar2rrd staff  7193736 Mar 17 16:16 data/<server name>/no_hmc/<VM name>/lan-en1.mmm
        -rw-r--r-- 2 lpar2rrd staff 10790264 Mar 17 16:16 data/<server name>/no_hmc/<VM name>/mem.mmm
        -rw-r--r-- 2 lpar2rrd staff  7193736 Mar 17 16:16 data/<server name>/no_hmc/<VM name>/pgs.mmm
        -rw-r--r-- 2 lpar2rrd staff  7193736 Mar 17 16:16 data/<server name>/no_hmc/<VM name>/san-vscsi0.mmm
        -rw-r--r-- 2 lpar2rrd staff  3597208 Mar 17 16:16 data/<server name>/no_hmc/<VM name>/san_resp-vscsi0.mmm
      find data -name mem.mmm -exec ls -l {} \;
        ...
      
  • In case of a problem check our forum or contact us via support@lpar2rrd.com.
    We would need the outputs of the checks above to start troubleshooting.
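The timestamp check on the *.mmm files described above can be narrowed down to the files that stopped being updated. This is a minimal sketch, assuming a find implementation that supports -mmin (GNU find does). The function name stale_mmm_files and the 30-minute threshold in the usage comment are illustrative choices, not LPAR2RRD defaults.

```shell
#!/bin/sh
# Hypothetical helper: prints *.mmm files under the given directory that
# have not been modified within the last N minutes.
stale_mmm_files() {
    data_dir=$1
    minutes=$2
    find "$data_dir" -name '*.mmm' -mmin +"$minutes" -print
}

# Usage: list agent data files not updated for more than 30 minutes
# cd /home/lpar2rrd/lpar2rrd && stale_mmm_files data 30
```

An empty result suggests that every agent data file found has been written recently.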

Notes

    You do not need to upgrade the LPAR2RRD agents with each LPAR2RRD upgrade.
    The release notes state when an agent upgrade is necessary.
    Check the OS agent upgrade steps.

Install LPAR2RRD server

  • Download the latest LPAR2RRD server.
    If you already run an LPAR2RRD instance, upgrade it instead.

  • Install it:
    # su - lpar2rrd
    tar xvf lpar2rrd-7.XX.tar
    cd lpar2rrd-7.XX
    ./install.sh
    cd /home/lpar2rrd/lpar2rrd
    
  • Make sure all Perl modules are in place
    cd /home/lpar2rrd/lpar2rrd
    . etc/lpar2rrd.cfg; $PERL bin/perl_modules_check.pl
    
    If "LWP::Protocol::https" is missing, check this documentation to fix it.

  • Enable Apache authorisation
    su - lpar2rrd
    umask 022
    cd /home/lpar2rrd/lpar2rrd
    cp html/.htaccess www
    cp html/.htaccess lpar2rrd-cgi
    
  • Schedule load.sh to run from the lpar2rrd crontab (the entry might already exist there)
    crontab -l | grep load.sh
     
    
    If the command above prints nothing, add the entry:
    crontab -e
    
    # LPAR2RRD UI
    0,30 * * * * /home/lpar2rrd/lpar2rrd/load.sh > /home/lpar2rrd/lpar2rrd/load.out 2>&1 
    
    Make sure there is just one such entry in the crontab.

  • You might need to add the lpar2rrd user to /var/adm/cron/cron.allow (/etc/cron.allow on CentOS 8) if the crontab command fails.
    Do this as the root user:
    # echo "lpar2rrd" >> /var/adm/cron/cron.allow
    
  • Initial start from cmd line:
    cd /home/lpar2rrd/lpar2rrd
    ./load.sh
    
  • Go to the web UI: http://<your web server>/lpar2rrd/
    Use Ctrl-F5 to refresh the web browser cache.
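The requirement above that the crontab contains just one load.sh entry can be verified with a short check. This is a minimal sketch, assuming the entry contains the path fragment lpar2rrd/load.sh as in the example above; the function name count_load_entries is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads a crontab on stdin and prints the number of
# active (non-comment) entries that run the LPAR2RRD load.sh.
count_load_entries() {
    grep -v '^[[:space:]]*#' | grep -c 'lpar2rrd/load\.sh'
}

# Usage as the lpar2rrd user:
# n=$(crontab -l 2>/dev/null | count_load_entries)
# [ "$n" -eq 1 ] || echo "expected exactly one load.sh entry, found $n"
```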

Troubleshooting

  • If you have any problems with the UI then check:
    (note that the path to Apache logs might be different, search apache logs in /var)
    tail /var/log/httpd/error_log             # Apache error log
    tail /var/log/httpd/access_log            # Apache access log
    tail /var/tmp/lpar2rrd-realt-error.log    # LPAR2RRD CGI-BIN log
    tail /var/tmp/systemd-private*/tmp/lpar2rrd-realt-error.log # LPAR2RRD CGI-BIN log when Linux has private temp enabled
    
  • Test of CGI-BIN setup
    umask 022
    cd /home/lpar2rrd/lpar2rrd/
    cp bin/test-healthcheck-cgi.sh lpar2rrd-cgi/
    
    Open in the web browser: http://<your web server>/lpar2rrd/test.html
    You should see your Apache, LPAR2RRD, and operating system variables; if not, check the Apache logs for related errors.
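Because the Apache log paths mentioned above differ between systems, it can help to tail only the logs that actually exist. This is a minimal sketch; the function name tail_existing_logs and the choice of five lines per file are illustrative and not part of LPAR2RRD.

```shell
#!/bin/sh
# Hypothetical helper: prints the last lines of every listed log file that
# exists and is readable, and silently skips the others.
tail_existing_logs() {
    for log in "$@"; do
        if [ -r "$log" ]; then
            echo "==> $log <=="
            tail -n 5 "$log"
        fi
    done
}

# Usage with the paths named above (the Apache paths may differ on your system):
# tail_existing_logs /var/log/httpd/error_log /var/log/httpd/access_log \
#     /var/tmp/lpar2rrd-realt-error.log /var/tmp/systemd-private*/tmp/lpar2rrd-realt-error.log
```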