ViPR SRM 3.6 – Periodic Maintenance Tasks

Overview

Learn how to perform periodic maintenance checks to ensure optimal performance of the software. Learn how to review log files, keep software up-to-date, and use best practices.

This article is for ViPR SRM administrators or anyone who manages access to the ViPR SRM application. Ensure you are logged in to the ViPR SRM portal using an account with administrator privileges.

Perform the tasks described in the following sections to maintain the health of ViPR SRM.

ViPR SRM System health reports

Review reports to monitor system health.

From the ViPR SRM portal, click EMC M&R Health > Health > Components. Review the following reports.

Web servers (Tomcat)

If values are out of range, increase the memory, the number of vCPUs, or both on the Web Server virtual machine, and continue to monitor utilization.

Databases

If the metric count is above the limit, you may need to install an additional Database/Backend to support more metrics. If desired, EMC Professional Services can assist with this effort.

Backend

Collector managers

Servers

Go to EMC M&R Health > Servers Summary.

Temporary files

Verify that temporary files are automatically deleted.

ViPR SRM uses temporary files to ensure performance. Temporary files should be cleaned up automatically by the system. If those files are not deleted automatically, this might indicate a problem with data insertion. Temporary files are located on each of the backend servers at Backends/APG-Backend/<instance name>/tmp/
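
One rough way to spot temporary files that are not being cleaned up is to list the files in a tmp directory that are older than a chosen age. The following Python sketch does this. The function name find_stale_files, the one-hour default, and the use of modification time to judge staleness are assumptions made for illustration, not documented ViPR SRM behavior.

```python
import os
import time


def find_stale_files(directory, max_age_seconds=3600, now=None):
    """Return sorted paths of regular files older than max_age_seconds.

    The one-hour default is a hypothetical value, not an EMC recommendation.
    """
    now = time.time() if now is None else now
    stale = []
    for entry in os.scandir(directory):
        # Only look at regular files; skip subdirectories and symlinks.
        if entry.is_file(follow_symlinks=False):
            age = now - entry.stat(follow_symlinks=False).st_mtime
            if age > max_age_seconds:
                stale.append(entry.path)
    return sorted(stale)
```

Call find_stale_files with the tmp directory of the backend instance you want to check. A list that keeps growing between runs suggests that files are not being deleted automatically.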

Reviewing log files

Troubleshoot the root cause of any errors indicated in log files.

Procedure

  1. Check WARNING and SEVERE messages in each of the following log files:
    • Collecting/Collector-Manager/Default/logs/collecting-0-0.log (for collection)
    • Web-Servers/Tomcat/Default/logs/apg-tomcat-default.out (for the currently running Tomcat instance)
    • Web-Servers/Tomcat/Default/logs/catalina.<date>.log (the system will indicate if there are any errors)
    • Backends/APG-Backend/Default/logs/cache-0-0.log
  2. For additional assistance, open a Service Request with EMC Support.
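
The check in step 1 can be sketched in Python as a scan for lines that contain the words WARNING or SEVERE. This sketch assumes that each log line carries its severity as one of those literal words; the exact log layout is not described here, and the function names are hypothetical.

```python
import re

# Matches the two severity levels that the procedure asks you to check.
SEVERITY_PATTERN = re.compile(r"\b(WARNING|SEVERE)\b")


def find_problem_lines(lines):
    """Return (line_number, severity, text) for each WARNING or SEVERE line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SEVERITY_PATTERN.search(line)
        if match:
            hits.append((number, match.group(1), line.rstrip("\n")))
    return hits


def scan_log_file(path):
    """Scan one log file; undecodable bytes are replaced instead of raising."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_problem_lines(handle)
```

Run scan_log_file once for each log file listed in step 1, and review the returned lines to find the root cause.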

Keeping software up to date

Check for software updates.

Procedure

  1. Check the ViPR SRM product support page at http://support.emc.com to determine whether a newer version is available.
  2. If a newer version is available, download the update file and follow the instructions in the upgrade document located on the same page.

Systems management frameworks

If ViPR SRM is being monitored by a systems management framework (e.g., VMware monitoring tools), EMC recommends monitoring specific items.

Best practices

Optimize product performance by using best practices in system maintenance.

Alerts

Use the main screen of the Health report to discover which components received alerts. Brown indicates a major alert, yellow indicates a minor alert, and green indicates no alerts.

Figure: Components receiving alerts

Memory and swapping

Linux uses a large share of memory for caching. As a result, more memory is available for use than the free command reports as free.

Be concerned if the free command shows that close to 100% of memory is in use.

Unlike a Linux server, a Windows server does not cache memory. Windows informs you exactly how much memory is in use.

If swap usage is over 2%, check memory usage. In the following example, swap usage is minimal, shown by the flat yellow line at the bottom.
Figure: Memory & Swap Usage
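
On a Linux server, the same memory and swap figures can be read from /proc/meminfo. The following Python sketch computes memory usage with the cache discounted and applies the 2% swap threshold mentioned above. The function names are hypothetical, and relying on the MemAvailable field (with a rough fallback for kernels that lack it) is an assumption of this sketch.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def memory_report(meminfo, swap_limit_percent=2.0):
    """Summarize memory and swap usage from parsed meminfo values."""
    total = meminfo["MemTotal"]
    # MemAvailable already discounts reclaimable cache. If it is missing,
    # fall back to a rough estimate (assumption of this sketch).
    available = meminfo.get(
        "MemAvailable",
        meminfo["MemFree"] + meminfo.get("Buffers", 0) + meminfo.get("Cached", 0),
    )
    swap_total = meminfo.get("SwapTotal", 0)
    swap_used = swap_total - meminfo.get("SwapFree", 0)
    swap_percent = 100.0 * swap_used / swap_total if swap_total else 0.0
    return {
        "memory_used_percent": 100.0 * (total - available) / total,
        "swap_used_percent": swap_percent,
        # True when swap usage is over the limit and memory needs a closer look.
        "check_memory": swap_percent > swap_limit_percent,
    }
```

For example, memory_report(parse_meminfo(open("/proc/meminfo").read())) returns the summary for the local server.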

CPU usage

On a front end or database, be sure CPU usage is in a normal range. Otherwise, performance is degraded for end users.
Figure: CPU Utilization

CPU usage is less critical for collectors because they are not as visible to end users. However, if CPU usage on a collector reaches 100%, collection might fail or might not complete within the collection period you set.

Server load

The value for load average indicates whether processes are waiting for a CPU or whether a CPU is idling some of the time. If the value for load average is greater than the number of CPUs, then processes are waiting for a CPU.

Use the top command to check the server load:
Figure: Server load
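
The comparison between load average and CPU count can also be scripted. The following Python sketch applies the rule described above to the 1-minute load average. The function names and the choice of the 1-minute value are assumptions of this sketch, and os.getloadavg is available only on Unix-like systems.

```python
import os


def cpu_wait_status(load_average, cpu_count):
    """Compare a load average with the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    if load_average > cpu_count:
        return "processes are waiting for a CPU"
    return "no processes are waiting for a CPU"


def current_status():
    """Check the 1-minute load average on this host (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return cpu_wait_status(one_minute, os.cpu_count() or 1)
```

For example, a load average of 5.2 on a server with 4 CPUs means that processes are waiting for a CPU.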

Disk space

All ViPR SRM servers must have disk space for storing data. Use the df command to check available space on a server.
Figure: Disk space

Running out of disk space can cause problems, especially on backend servers, and can cause data loss. Be sure there is plenty of disk space on backend servers and on collectors as well.
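
As an alternative to reading df output by hand, the following Python sketch reports the percentage of free space on the file system that holds a given path. The function names and the 20% minimum are hypothetical values chosen for illustration; no specific free-space threshold is stated here.

```python
import shutil


def free_space_percent(path):
    """Return the percentage of free space on the file system holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def low_on_space(path, minimum_free_percent=20.0):
    """Return True when free space is below the (hypothetical) minimum."""
    return free_space_percent(path) < minimum_free_percent
```

Run low_on_space against the data directories on backend servers and collectors, and pick a minimum that suits your environment.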

Database Metrics Count and Database Properties Count

For Database Metrics Count, a value of 1,000,000 is a safe limit. Depending on your hardware, a higher value may be allowed. Slightly exceeding the limit probably will not create problems. A value of 2,000,000 always causes degradation of performance.

The number of properties in a database should be about twenty times the number of metrics in the database.

To display the database metric count and database properties count, select EMC M&R Health > Health > Components > Databases.
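
The limits above can be expressed as a small check. The following Python sketch classifies a metrics count against the 1,000,000 and 2,000,000 values and computes the ratio of properties to metrics. The function names and the three status labels are assumptions of this sketch; the counts themselves come from the Databases report.

```python
SAFE_METRICS_LIMIT = 1_000_000      # safe limit stated above
DEGRADED_METRICS_LIMIT = 2_000_000  # value stated above to degrade performance
EXPECTED_PROPERTIES_PER_METRIC = 20  # approximate ratio stated above


def classify_metrics_count(metrics_count):
    """Classify a Database Metrics Count against the stated limits."""
    if metrics_count >= DEGRADED_METRICS_LIMIT:
        return "degraded"
    if metrics_count > SAFE_METRICS_LIMIT:
        return "above safe limit"
    return "within safe limit"


def properties_per_metric(properties_count, metrics_count):
    """Return the ratio of properties to metrics for one database."""
    if metrics_count <= 0:
        raise ValueError("metrics_count must be positive")
    return properties_count / metrics_count
```

Compare the result of properties_per_metric with EXPECTED_PROPERTIES_PER_METRIC; a ratio far from 20 is worth investigating.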

Current Queued Files Count

The Current Queued Files Count is the number of files waiting in queues for batch processing by a backend. The value should be less than 20 per backend. If files accumulate for one or more backends, there is a problem inserting data into the database.

To display the Current Queued Files Count, go to EMC M&R Health > Health > Components > Backends.
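
If you record the counts from the Backends report, the following Python sketch flags the backends that have reached the limit of 20 queued files. The function name and the backend names in the example are hypothetical.

```python
QUEUED_FILES_LIMIT = 20  # per backend, from the guidance above


def backends_with_backlog(queued_counts, limit=QUEUED_FILES_LIMIT):
    """Return the names of backends whose queued file count is not below the limit."""
    return sorted(name for name, count in queued_counts.items() if count >= limit)
```

For example, backends_with_backlog({"backend-a": 3, "backend-b": 45}) returns a list that contains only backend-b.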

JVM Sizing Recommendation

The JVM Sizing Recommendation report shows whether a process exceeds memory or lacks enough memory to run smoothly. If the value for Currently Allocated does not match the value for Recommended Allocation for a process, increase the memory for that process and lower the memory allowed to other processes.

To display this report, go to EMC M&R Health > Misc. Reports > JVMs Sizing Recommendation.

Displaying out-of-the-box alerts

Out-of-the-box alerts notify you when a threshold is exceeded. Learn what they are so you can route them to an appropriate destination.

Procedure

  1. Go to Administration > Modules > Alerting.
  2. Select Local Manager > Alert Definitions > EMC M&R Health.
    A list of alerts with their descriptions is displayed.

Editing out-of-the-box alerts

Route each alert to appropriate destinations.

Procedure

  1. In the list of alerts, right-click an alert and select Edit.
    An alerting GUI appears.
  2. On the right, click Action to display destinations.
  3. Drag an action into the editing area.
    A form appears, where you configure the alerting action.
  4. Enter text to configure the action.
  5. Optionally, click Test Action to verify that the action works as expected.
  6. Click Save.
  7. Click Save and enable to activate the alert.