Java Performance Monitoring: First Metrics
During your professional life as a developer, you get lots of features to implement and lots of bugs to fix. But sometimes, when you least expect it, one of those hard blockers hits your application right in its heart: performance issues. You start to question the application design and dig around, trying to find the source as fast as you can.
Stop. Take a deep breath and let’s go.
Good metrics you should look for:
- CPU utilisation (kernel vs user). When you’re facing a CPU-intensive computation, you’ll see high user CPU utilisation. When there is a lot of thread management or access control around shared resources, you’ll see a high percentage of CPU time spent in the kernel.
- Scheduler run queue. This metric is good for understanding how saturated the system is. The queue holds the threads that are ready to execute. If its length is 4 times greater than the number of virtual processors, that is something you should really be worried about (too many threads, bad system responsiveness); these are values you don’t want to see again. If the length is continuously greater than the number of virtual processors, you should be looking to reduce it.
- Memory utilisation. Be aware of memory swapping: if you observe any swapping activity greater than zero, that is something to worry about.
- Lock contention. In Java applications this is a hard metric to get, but there is a calculation that gives an approximate value: (csw - icsw) * 80000 / CPU frequency. Here csw - icsw is the number of voluntary context switches per second (total minus involuntary), and 80000 is a rough cost of one such switch in clock cycles. If the result is greater than 3-5%, you should be worried about the synchronisation overhead between threads in your application and try to reduce it.
- Network utilisation. This one makes it easy to figure out whether the network is your bottleneck or not.
- Disk I/O utilisation. Another good metric for understanding whether the performance issue is related to the infrastructure or not.
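To make the run queue thresholds and the lock contention formula from the list above a bit more concrete, here is a minimal Java sketch. It is only an illustration under assumptions: the class, enum and method names are made up for this post, the context switch counts are assumed to be per-second rates, and the CPU frequency is assumed to be given in Hz.

```java
// Sketch only: FirstMetrics, RunQueueStatus and the method names are
// hypothetical helpers for illustration, not part of any library.
final class FirstMetrics {

    enum RunQueueStatus { OK, REDUCE, CRITICAL }

    // Rough cost of one voluntary context switch in clock cycles
    // (the 80000 constant from the formula in the list above).
    static final long CYCLES_PER_SWITCH = 80_000L;

    // Classifies a run queue length against the number of virtual processors:
    // more than 4x the processors is critical, more than 1x should be reduced
    // (assuming the value is observed continuously, not as a single spike).
    static RunQueueStatus classifyRunQueue(double runQueueLength, int virtualProcessors) {
        if (virtualProcessors <= 0) {
            throw new IllegalArgumentException("virtualProcessors must be positive");
        }
        if (runQueueLength > 4.0 * virtualProcessors) {
            return RunQueueStatus.CRITICAL;
        }
        if (runQueueLength > virtualProcessors) {
            return RunQueueStatus.REDUCE;
        }
        return RunQueueStatus.OK;
    }

    // Approximates the percentage of clock cycles spent on voluntary context
    // switches: (csw - icsw) * 80000 / CPU frequency, expressed as a percent.
    // Assumption: csw and icsw are per-second rates, cpuFrequencyHz is in Hz.
    static double lockContentionPercent(long csw, long icsw, double cpuFrequencyHz) {
        if (cpuFrequencyHz <= 0) {
            throw new IllegalArgumentException("cpuFrequencyHz must be positive");
        }
        return (csw - icsw) * (double) CYCLES_PER_SWITCH / cpuFrequencyHz * 100.0;
    }

    public static void main(String[] args) {
        // Made-up sample values, only to show how the helpers are called.
        System.out.println(classifyRunQueue(20, 4));
        System.out.println(lockContentionPercent(3500, 500, 3.0e9));
    }
}
```

With the made-up values in main, a run queue of 20 on 4 virtual processors is above the 4x threshold, and 3000 voluntary switches per second on a 3 GHz processor works out to roughly 8% of the cycles, which is above the 3-5% range mentioned in the list.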
So, basically, this is an initial list of metrics to look at. It is focused on multithreaded applications but is still valid for single-threaded ones. Above all, the most important thing is to make decisions based on a good interpretation of the monitoring data.
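If you want a first bit of monitoring data from inside the JVM itself, the standard java.lang.management API exposes the system load average and the number of available processors. The sketch below treats the load average only as a rough proxy for the scheduler run queue (that is an assumption, the two are not the same number on every platform), and the LoadProbe class and its method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Sketch only: LoadProbe is a hypothetical helper, not a library class.
final class LoadProbe {

    // Divides the load average by the processor count.
    // Returns -1.0 when the load average is not available (negative input).
    static double loadPerProcessor(double loadAverage, int processors) {
        if (processors <= 0) {
            throw new IllegalArgumentException("processors must be positive");
        }
        if (loadAverage < 0) {
            return -1.0;
        }
        return loadAverage / processors;
    }

    // Reads the last-minute system load average and the processor count
    // from the platform MXBean and combines them.
    static double currentLoadPerProcessor() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // getSystemLoadAverage() returns a negative value when the
        // platform does not provide it.
        return loadPerProcessor(os.getSystemLoadAverage(), os.getAvailableProcessors());
    }

    public static void main(String[] args) {
        double ratio = currentLoadPerProcessor();
        if (ratio < 0) {
            System.out.println("Load average is not available on this platform");
        } else {
            System.out.println("Load per processor: " + ratio);
        }
    }
}
```

A ratio that stays above 1 over time points in the same direction as a run queue that is continuously longer than the number of virtual processors, but it is worth confirming with the operating system tools before drawing conclusions.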