Simplistic overview
Let’s assume you are part of the IT service delivery chain for your organization. While all the IT solutions are working fine and customers experience no latency on their transactions, life is good and you get to do interesting things like reading the latest IEEE journals. But when the phone screams with the urgency of a six-month-old baby, things change. You answer politely, but the very impolite voice on the other end is whingeing profusely about a product he or she was about to purchase: halfway through the ordering process the screen froze and threw him or her out of the system, and now he or she is going to lose millions. You have no idea where to look for the problem or which of the 50 support engineers or vendors to phone. Each call impinges on precious time they never have, and they get very frustrated when you phone them about a problem they are not responsible for. Alternatively, vendors bill you exorbitant amounts for unnecessary call-outs, depleting your operational budget in a jiffy. Vendors are never at fault. With cold sweat beading on your forehead you phone Joe, but there is no answer. Now you have no choice but to phone Grumpy, but before you can, the phone rings and the CIO, noticeably irritated, asks whether you know what you are doing, because the head of sales has just called him an idiot. Before he slams down the phone he makes sure you understand that you are responsible for the bad reputation of the company. If this sounds familiar, read on; if not, rather pick up your IEEE journal.
The primary aim of application performance management (APM) is to avoid the above scenario. Imagine instead that, while you are reading your IEEE journal, a soothing sound gently alerts you that a new mail has arrived. The mail states that an anomaly was detected in the order provisioning of sanitary products. It is not a crisis yet, as the end user experience is affected only by additional latency of about 20%. Jack Java has been alerted that the problem manifests in the product selection list method, select_product_list(area, filter). Jack Java has already accepted the incident and estimated that it will be resolved in 20 minutes. The anomaly will affect sanitary product sales, plumbing sales and bathroom enquiries. Looking up at the video wall, you see that the icon representing the bathroom division is yellow. You know this information is also visible to the CIO and the chief of sales. While you are looking at the video wall the icon returns to green and a new mail arrives stating that Jack has restored the service quality. Out of the corner of your eye you see that your service level agreements were unaffected by this incident. Because of the immense pressure you decide to have a coffee, and on the way to the kitchen you run into the CIO, who politely greets you with a wink.
‘If you can monitor it you can manage it’
The core principles of APM
- Know when anything in the IT landscape is about to break or deteriorate (anomaly recognition – preferably predictive)
- Know exactly what is causing the problem (root cause)
- Automatically cure the problem or notify a support resource with the ability to fix the problem
- Manage and track the activities to fix the problem
- Notify management of the problem, indicating the affected systems and services, and provide an estimated time to fix
- Notify business indicating affected business service and target time for restoration
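The first principle, anomaly recognition, can be illustrated with a minimal sketch. It assumes a baseline of recent response times and a hypothetical 20% degradation threshold (the figure is borrowed from the overview above). The function name, parameters and sample values are illustrative assumptions, not taken from any particular APM product.

```python
from statistics import mean


def detect_latency_anomaly(baseline_ms, current_ms, threshold=0.20):
    """Return the relative slowdown if it exceeds the threshold, else None.

    baseline_ms: recent "normal" response times in milliseconds (assumed data).
    current_ms:  latest observed response time in milliseconds.
    threshold:   hypothetical tolerance; 0.20 means 20% slower than baseline.
    """
    if not baseline_ms:
        raise ValueError("baseline_ms must contain at least one sample")
    baseline = mean(baseline_ms)
    slowdown = (current_ms - baseline) / baseline
    return slowdown if slowdown > threshold else None


# Illustrative values only: the baseline averages 100 ms.
baseline = [100, 110, 90, 100]
print(detect_latency_anomaly(baseline, 125))  # 0.25, i.e. 25% slower
print(detect_latency_anomaly(baseline, 105))  # None, within tolerance
```

A real implementation would more likely use a statistical or predictive model than a fixed threshold; the fixed threshold is used here only to keep the idea visible.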
It can be clearly seen that the core principles cover the use case described in the simplistic overview above. Methods exist to tangibly measure the effectiveness of an APM implementation against the core principles. Because it is measurable, it becomes easy to manage the APM implementation with tangible service level agreements (SLAs).
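As a rough illustration of such measurement, the sketch below derives three figures from incident records: mean time to detect, mean time to restore, and the fraction of incidents restored within a target time. The record fields ("started", "detected", "restored"), the 30-minute target and the sample data are all assumptions made for the example, not a prescribed SLA model.

```python
from datetime import datetime, timedelta


def mean_duration(incidents, start_key, end_key):
    """Average time between two timestamps across incident records."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((i[end_key] - i[start_key] for i in incidents), timedelta())
    return total / len(incidents)


def sla_compliance(incidents, target):
    """Fraction of incidents restored within `target` of being detected."""
    if not incidents:
        raise ValueError("at least one incident is required")
    met = sum(1 for i in incidents if i["restored"] - i["detected"] <= target)
    return met / len(incidents)


# Hypothetical incident log.
incidents = [
    {"started": datetime(2024, 1, 1, 9, 0),
     "detected": datetime(2024, 1, 1, 9, 2),
     "restored": datetime(2024, 1, 1, 9, 22)},
    {"started": datetime(2024, 1, 2, 14, 0),
     "detected": datetime(2024, 1, 2, 14, 4),
     "restored": datetime(2024, 1, 2, 14, 44)},
]

time_to_detect = mean_duration(incidents, "started", "detected")
time_to_restore = mean_duration(incidents, "detected", "restored")
compliance = sla_compliance(incidents, timedelta(minutes=30))

print(time_to_detect)   # 0:03:00
print(time_to_restore)  # 0:30:00
print(compliance)       # 0.5
```

Tracking figures like these over time is one way to tell whether detection and remediation are actually improving.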
How can APM be achieved
APM revolves around the following functionality:
- Collection of behaviour information (monitoring) of hardware, networks, systems, databases, solutions, transactions, end user experience and business activities
- Collate and process the collected information
- Evaluate the collected information to establish service risks
- For identified service risks either execute remediation or generate notifications requesting remediation
- De-duplicate and correlate notifications
- Distribute notifications to appropriate resources
- Manage the remediation process
- Present the collected information and identified risks from various angles to diverse audiences (retaining the single pane of glass)
History has shown that it is a daunting task to implement all of the above functionality properly from the outset, so it becomes a journey for the organization. As a result, APM implementations have various levels of maturity, which can be mapped to a maturity model that serves as a yardstick for the APM implementation in the organization. The strategy should remain to implement as much of the above functionality as possible, as soon as possible, in order to maximise the benefits of the investment.
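To make the notification steps in the list above more concrete, here is a minimal sketch of de-duplicating, correlating and distributing notifications. The notification fields, the routing table and the team names are hypothetical; real APM tools implement these steps in their own, far richer ways.

```python
from collections import defaultdict

# Hypothetical routing table: component -> support resource.
ROUTES = {"database": "dba-team", "java-app": "java-team"}


def deduplicate(notifications):
    """Keep the first notification for each (component, symptom) pair."""
    seen = set()
    unique = []
    for n in notifications:
        key = (n["component"], n["symptom"])
        if key not in seen:
            seen.add(key)
            unique.append(n)
    return unique


def correlate(notifications):
    """Group notifications by the business service they affect."""
    groups = defaultdict(list)
    for n in notifications:
        groups[n["service"]].append(n)
    return dict(groups)


def route(notification, default="service-desk"):
    """Pick a support resource; unknown components go to the default."""
    return ROUTES.get(notification["component"], default)


# Illustrative raw notifications, including one exact duplicate.
raw = [
    {"component": "java-app", "symptom": "slow", "service": "ordering"},
    {"component": "java-app", "symptom": "slow", "service": "ordering"},
    {"component": "database", "symptom": "lock-wait", "service": "ordering"},
    {"component": "network", "symptom": "packet-loss", "service": "enquiries"},
]

unique = deduplicate(raw)
by_service = correlate(unique)
recipients = [route(n) for n in unique]

print(len(unique))                        # 3
print(sorted(by_service))                 # ['enquiries', 'ordering']
print(recipients)                         # ['java-team', 'dba-team', 'service-desk']
```

Grouping by business service is only one possible correlation key; topology or time windows are common alternatives.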
Architectural overview of APM solution
Available tools in the APM space
Many different tools exist, all claiming to be APM tools. APM tools are extremely technology sensitive, and any tool should be carefully evaluated to see whether it will cover the APM requirements of your organization. Terminology in this landscape is immature, and misalignment between demand and delivery is common. Because of the perceived complexity it becomes difficult to validate capability against requirements. Fortunately, if you revert to the core principles and the service level agreements measured against them, you have a firm basis for APM success.
Several independent sources can be utilised in order to gain a better understanding of the tool capabilities against your organizational APM requirements. Gartner is a good example of such a reference.
How can Application Performance Management WorX assist you
Drawing on many years of experience in application performance management and diverse IT technologies, we can assess your IT landscape and determine the APM requirements for your organization. As a vendor-agnostic organization we can advise you on suitable products based on your unique APM requirements. We also do architectural assessments and roadmaps to support the longevity of your APM investment. If required, the engagement can be extended to implementation and APM operational services.