TN WW172 System Health Checks
Description
This article from InSource shows...
- Author: Peter Farrell
- Published: 01/19/2017
- Applies to: Wonderware and Microsoft products
Details
Many warning signs of system failure, such as degrading mechanical components, storage filling up, failing memory, and corruption of infrequently used but important software components, are "silent": they have no obvious symptoms. Your systems can therefore appear to be performing well even as your risk increases, and the result may soon be a catastrophic failure of a mission-critical process. Periodic health checks can uncover and correct such "silent" emerging problems, or prevent them altogether.
System health is important, but few manufacturing organizations have a managed process to evaluate the health status of their systems. For the same reasons that it makes sense to get a regular health check-up from your doctor, you should schedule a deep system health check at least once a year to prevent unplanned, costly, and sometimes catastrophic interruptions of your process.
Hard data about losses due to unplanned interruptions of process control systems as a result of software performance degradation or failure is difficult to corroborate, but it’s clear that the impact is significant and can sometimes be catastrophic for a manufacturing organization.
The following tips can help you get started in planning and performing system health checks for your process control software.
- Designate responsibility to a team of people with skills and domain specialties who can plan, coordinate, perform, and document system health checks of all related and dependent components.
- Formally schedule periodic deep health checks of software systems, and make them a part of your regular maintenance programs.
- Commission an annual software system health check by an independent outside organization with skills and experience in evaluating the specialized software and software environments that your processes depend upon. Call InSource Solutions at 877 467 6872 for more information on how we can assist with this important process.
- Keep your training programs for personnel responsible for system health up to date. Current information on available training is on the InSource Solutions website (www.insourcess.com).
- Attend product update events and webinars from the vendors of your process control software to make sure that you have the latest information about emerging trends and issues that could affect the results of your health checks.
- Perform a system health check after any major software or hardware upgrades.
- Where possible, build in alarms and triggered, actionable responses to deviations in system performance.
- Always consult the team responsible for system health during planned system changes, and before changes are made.
- Include your software and hardware vendors on your health check team to stay current on emerging trends, upcoming changes, and possible obsolescence of components that could degrade system health.
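The tip about building in alarms and triggered responses can be illustrated with a minimal Python sketch. It is not a Wonderware feature or API: the `DeviationAlarm` class, the `scan_time_ms` metric name, and the three-sigma tolerance are all hypothetical, chosen only to show one way of comparing a new sample against a baseline of known-healthy readings and firing a response when it strays too far.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Callable, Sequence


@dataclass
class DeviationAlarm:
    """Hypothetical helper: flags a sample that strays too far from a baseline."""

    name: str
    baseline: Sequence[float]                 # at least two known-healthy samples
    max_sigma: float = 3.0                    # assumed tolerance; tune per metric
    on_alarm: Callable[[str], None] = print   # the triggered response

    def check(self, sample: float) -> bool:
        """Return True (and call on_alarm) when the sample deviates from baseline."""
        center = mean(self.baseline)
        spread = stdev(self.baseline)
        # With a perfectly flat baseline the limit is 0, so any change alarms.
        limit = self.max_sigma * spread
        deviation = abs(sample - center)
        if deviation > limit:
            self.on_alarm(
                f"{self.name}: sample {sample} deviates from "
                f"baseline {center:.1f} by {deviation:.1f}"
            )
            return True
        return False


# Illustrative values only; a real baseline would come from recorded metrics.
alarm = DeviationAlarm("scan_time_ms", [100, 102, 98, 101, 99])
alarm.check(100.5)   # within tolerance, no response
alarm.check(150)     # outside tolerance, on_alarm is called
```

In practice the `on_alarm` callback would be replaced with whatever actionable response the health team has agreed on, such as raising a ticket or notifying the person on call.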
InSource Solutions can play a valuable role in helping you meet these important requirements. You can contact us for more information. The following example checks can be performed on nodes as indicated.
- GR Node
Wonderware Logger – check and resolve all warnings and errors.
Operating System Event Viewer System Log and Application Log – check and resolve all warnings and errors.
SQL Server logs – check and resolve all warnings and errors.
SQL Server – Have a qualified DBA or someone familiar with SQL Server review all important metrics and logs to ensure that performance is sufficient, that logs are healthy, and that all required backups are being performed. Confirm that you have a backup process in place, and that your backups are validated and available when needed. See the following in-depth article for more information:
https://insource.mindtouch.us/Wonderware_(General)/Tech_Notes/TN_WW173_Software_Backups_-_When_Is_a_Backup_Not_a_Backup%3F
Application Server engine performance – Use Wonderware’s Object Viewer to review key engine performance metrics. Following is an example of the metrics that can be reviewed.
- AOS Nodes
Wonderware Logger – check and resolve all warnings and errors.
Operating System Event Viewer System Log and Application Log – check and resolve all warnings and errors.
Application Server engine performance – Use Wonderware’s Object Viewer to review key engine performance metrics (see screenshot of example engine review above).
- Managed and classic InTouch platforms/nodes, nodes acting as IO Server sources, etc.
Same as AOS nodes above.
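The log checks listed above involve finding every warning and error across several logs on several nodes. As a rough aid, the Python sketch below counts flagged entries in a log that has been exported to CSV. It is only a sketch under stated assumptions: the `Level` and `Source` column names, the severity labels, and the `summarize_flagged` function are hypothetical and must be adapted to the export actually being reviewed, and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter
from typing import Iterable

# Assumed severity labels; adjust to match the export being reviewed.
FLAGGED_LEVELS = {"warning", "error", "critical"}


def summarize_flagged(rows: Iterable[str], level_col: str = "Level",
                      source_col: str = "Source") -> Counter:
    """Count flagged entries per (level, source) in a CSV log export."""
    counts: Counter = Counter()
    for record in csv.DictReader(rows):
        level = (record.get(level_col) or "").strip()
        if level.lower() in FLAGGED_LEVELS:
            source = (record.get(source_col) or "unknown").strip()
            counts[(level, source)] += 1
    return counts


# Invented sample rows standing in for an exported log file.
sample = io.StringIO(
    "Level,Source,Message\n"
    "Information,ExampleService,Service started\n"
    "Warning,ExampleDisk,Sample warning text\n"
    "Error,ExampleDatabase,Sample error text\n"
    "Warning,ExampleDisk,Sample warning text\n"
)

# List the most frequent flagged sources first so they can be resolved first.
for (level, source), n in summarize_flagged(sample).most_common():
    print(f"{n:>3}  {level:<8} {source}")
```

To run it against a real export, pass an open file instead of the sample, for example `open(path, newline="", encoding="utf-8")`. A summary like this only helps prioritize; each warning and error still has to be reviewed and resolved.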