Monitoring OpenStack HA Installation
Monitoring is a critical component of OpenStack administration. A firm grasp of reporting allows administrators to assess the state of the OpenStack environment and maintain it without crises.
Infrastructure and Kernel Metrics
Monitoring OpenStack starts with infrastructure and kernel metrics. The following metrics should be monitored on all controller, compute, storage, and network nodes:
- CPU Utilization
- Memory Utilization
- Free Disk Space
- System Load
- Network Utilization
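As a minimal sketch of how some of these metrics could be sampled on a Linux node, the following uses only the Python standard library. It covers memory, free disk space, and system load; CPU and network utilization need two samples taken over an interval and are left out. The function names and the thresholds in LIMITS are illustrative assumptions, not values taken from any OpenStack tool, and MemAvailable is assumed to be present in /proc/meminfo (Linux 3.14 or later).

```python
import os
import shutil

# Illustrative thresholds (assumptions); tune them for your environment.
LIMITS = {
    "memory_used_percent": ("max", 90.0),
    "disk_free_percent": ("min", 10.0),
    "load_per_cpu": ("max", 1.5),
}


def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def memory_used_percent(meminfo):
    """Percentage of memory in use, from MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    return 100.0 * (total - meminfo["MemAvailable"]) / total


def collect_node_metrics(path="/"):
    """Sample memory, disk, and load metrics on the local Linux node."""
    with open("/proc/meminfo") as handle:
        meminfo = parse_meminfo(handle.read())
    disk = shutil.disk_usage(path)
    load1, _, _ = os.getloadavg()
    return {
        "memory_used_percent": memory_used_percent(meminfo),
        "disk_free_percent": 100.0 * disk.free / disk.total,
        "load_per_cpu": load1 / (os.cpu_count() or 1),
    }


def breached(metrics, limits=LIMITS):
    """Return the names of metrics that violate their configured limit."""
    names = []
    for name, (kind, limit) in limits.items():
        value = metrics.get(name)
        if value is None:
            continue
        if (kind == "max" and value > limit) or (kind == "min" and value < limit):
            names.append(name)
    return names
```

Calling breached(collect_node_metrics()) on a node returns the list of metrics that need attention. In practice a monitoring agent would usually gather these values; the sketch only shows where the numbers come from.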
Database Monitoring
Monitoring databases is necessary to increase the availability and performance of the OpenStack environment. Off-the-shelf plug-ins and built-in utilities are available for both MongoDB and MySQL; they provide reliable, real-time reporting of database activity and of the current database state.
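For MySQL, one built-in source of such data is the server status counters. The sketch below assumes the counters were captured as tab-separated text, for example with mysql -N -B -e "SHOW GLOBAL STATUS", and computes how much of the connection limit is in use. The helper names are hypothetical, and max_connections is passed in by the caller because it is a server variable rather than a status counter.

```python
def parse_status(text):
    """Parse tab-separated name/value rows into a dict of strings."""
    status = {}
    for line in text.splitlines():
        name, sep, value = line.partition("\t")
        if sep:
            status[name.strip()] = value.strip()
    return status


def connection_usage_percent(status, max_connections):
    """Share of the connection limit currently in use, in percent."""
    return 100.0 * int(status["Threads_connected"]) / max_connections
```

A value that stays close to 100 suggests that OpenStack services may soon fail to open new database connections.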
RabbitMQ and Memcached Monitoring
Monitoring your RabbitMQ installation is an effective way to catch OpenStack issues before they affect the rest of your environment and, eventually, your users. The “rabbitmqctl cluster_status” command and the RabbitMQ management plug-in provide a starting point for monitoring RabbitMQ metrics.
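The management plug-in exposes an HTTP API, and its /api/queues endpoint returns a JSON list with one entry per queue. The following sketch, which assumes the plug-in is enabled and reachable (port 15672 by default), flags queues that are backing up or that hold messages with no consumer attached. The function names and the max_messages threshold are illustrative assumptions.

```python
import base64
import json
import urllib.request


def fetch_queues(base_url, user, password):
    """Fetch the queue list from the management API using basic auth."""
    request = urllib.request.Request(base_url.rstrip("/") + "/api/queues")
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def find_queue_problems(queues, max_messages=1000):
    """Return (queue name, reason) pairs for queues that look unhealthy."""
    problems = []
    for queue in queues:
        name = queue.get("name", "?")
        messages = queue.get("messages", 0)
        consumers = queue.get("consumers", 0)
        if messages > max_messages:
            problems.append((name, "backlog of %d messages" % messages))
        elif messages > 0 and consumers == 0:
            problems.append((name, "messages waiting but no consumers"))
    return problems
```

A growing backlog on a service queue often means that the consuming OpenStack service has stopped or stalled, which is why queue depth is a useful early signal.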
Memcached is one of the most popular key-value stores, and its contents are completely volatile. As a starting point, include Memcached service monitoring on each controller node.
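A simple service check can use the Memcached text protocol: sending the stats command returns a series of STAT lines terminated by END. The sketch below assumes the default port 11211 and an instance that accepts plain TCP connections; the function names are hypothetical.

```python
import socket


def parse_stats(text):
    """Parse 'STAT <name> <value>' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host, port=11211, timeout=3.0):
    """Send 'stats' to a Memcached server and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break  # server closed the connection early
            data += chunk
    return parse_stats(data.decode("ascii", "replace"))


def hit_ratio(stats):
    """Cache hit ratio between 0 and 1, or None if nothing was requested."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else None
```

A connection error from fetch_stats indicates that the service is down or unreachable, while a reply confirms that it is up and gives counters such as curr_connections and get_hits for trending.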
OpenStack Service Monitoring
To ensure that all OpenStack services are up and running, it is important to verify service status on every controller, compute, storage, and network node.
The openstack-utils package provides important utilities used when deploying and managing an OpenStack environment. These utilities include:
- openstack-config : Sets configuration parameters in various OpenStack configuration files.
- openstack-status : Displays service status information.
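Where a scripted check is preferred, the status of individual services can also be queried through systemd. This is a minimal sketch that assumes the services run as systemd units; the unit names shown in the usage note are examples and vary by distribution and release. The run parameter exists only so that the command runner can be replaced, for instance in tests.

```python
import subprocess


def unit_is_active(unit, run=subprocess.run):
    """Return True if 'systemctl is-active <unit>' reports 'active'."""
    result = run(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() == "active"


def inactive_units(units, run=subprocess.run):
    """Return the units from the given list that are not active."""
    return [unit for unit in units if not unit_is_active(unit, run)]
```

For example, inactive_units(["openstack-nova-api", "openstack-cinder-volume"]) returns the subset of those units that is not running on the local node.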
OpenStack Metrics
Some OpenStack services require additional verification, including but not limited to:
- neutron agent-list : The output table should list all the Neutron agents with the :-) value in the alive column and the True value in the admin_state_up column.
- cinder service-list : Lists all Cinder services and backends with their status.
- httpd status : Covers Keystone, Horizon, and other services that run as WSGI applications.
- heat service-list : Lists the Heat engines and their status.
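These commands print ASCII tables, so a scheduled check can parse the output rather than rely on someone reading it. The sketch below parses such a table into dictionaries and picks out Neutron agents that are not healthy. It assumes that the neutron client marks live agents with :-) in the alive column; the function names are hypothetical.

```python
def parse_cli_table(text):
    """Parse an ASCII table ('| a | b |' rows) into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip +----+ borders and blank lines
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # the first data row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unhealthy_neutron_agents(rows):
    """Return agent rows that are not alive or are administratively down."""
    return [
        row for row in rows
        if row.get("alive") != ":-)" or row.get("admin_state_up") != "True"
    ]
```

The same parse_cli_table helper can be applied to the output of cinder service-list and heat service-list, with a filter on their respective status columns.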
Guest VM Monitoring
Guest VM monitoring capabilities depend entirely on the choice of hypervisor and on what the monitoring tool supports. Out-of-the-box Ceilometer metrics can be used to provide resource tracking and alarming capabilities across all OpenStack core components.
Proactive Scheduled Provisioning Check
Provisioning failures are a real risk in cloud and hyper-converged infrastructure when a service is down. A proactive, automated provisioning check that covers each host aggregate and each Cinder backend can identify provisioning failures early and give administrators a chance to troubleshoot before end users are affected.
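The structure of such a check can be sketched as a loop that attempts one provisioning operation per target and records every failure instead of stopping at the first. The provision callable is hypothetical: in a real check it might boot a small test instance in a host aggregate, or create and delete a test volume on a Cinder backend, and raise an exception when the operation fails.

```python
def run_provisioning_checks(targets, provision):
    """Call provision(target) for each target and collect the failures.

    Returns a dict mapping each failed target to its error message.
    """
    failures = {}
    for target in targets:
        try:
            provision(target)
        except Exception as error:  # record the failure, keep checking
            failures[target] = str(error)
    return failures
```

Running this from a scheduler such as cron, once with the list of host aggregates and once with the list of Cinder backends, produces a report of the targets that currently cannot be provisioned. An empty result means that every target passed.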