Bring it on…
Archive for January, 2010
Whitepaper: Performance test environment on a virtualized platform.
Jan 19th
Author: Chaitanya M Bhatt
After working on a virtualized performance test bed for a certain project, I observed that a few professionals follow a conspicuously incorrect testing process in this context, and I would like to jot down my words of wisdom on the topic here.
When I discuss virtualizing a performance test bed with performance engineers, the common reaction is “It sounds cool.” I wouldn’t disagree with that. Never. Running a virtual guest operating system on a host operating system obviously sounds cool. You capitalize on otherwise ‘idle’ resources by pooling and sharing them. But what engineers forget is that virtualization works best only with certain types of workload. To be specific, it works best with diverse workloads. In a load testing bed, however, the load generator (load injector) machines tend to run a fixed workload during execution, with resource utilization approaching the threshold level while emulating higher load levels.
So, when you plan to create a test environment with load generators running on a virtualized platform, the first thing you ought to know is that VMware, or any other virtualization software for that matter, carries overheads such as excessive context switching, frequent interrupts, shared network resources and high I/O activity. Together these factors have an “irrepressible effect” on your virtualized LoadGenerator machines. Processes on a virtual machine generally run slower than they would on a physical machine, and therefore so do your Vusers. Because of this, the integrity of your test results becomes questionable.
Having said that, I feel one should run load generators on virtual machines only when it is unavoidable (due to a stubborn client, maybe!).
If you go ahead and accept this digression from best practice, then be wary of the limitations, document them, and act carefully in order to minimize the shortcomings of the approach.
Best practices:
Create multiple virtual LoadGenerator machine instances.
Having multiple virtualized LoadGenerators on a host machine is better than having a single virtual LoadGenerator machine, because the hypervisor's scheduler works better with a diverse workload than with a homogeneous one. Remember that virtualization software like VMware, whether it is a ‘Type-1’ or a ‘Type-2’ hypervisor, is designed to support as many virtual instances as possible on a single physical machine. A simple illustration: a load generator on a single virtual machine emulating 100 Vusers is likely to use more CPU than the same 100 Vusers split across two virtual machines, that is, two load generators.
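To make the illustration concrete, here is a minimal sketch of dividing a Vuser count evenly across several load generators. The function name `split_vusers` is hypothetical and not part of LoadRunner; in practice you would assign the resulting counts to each generator in your scenario by hand or through your tool's own configuration.

```python
from __future__ import annotations


def split_vusers(total_vusers: int, generators: int) -> list[int]:
    """Split total_vusers as evenly as possible across the given number
    of load generators (hypothetical helper, for illustration only)."""
    if generators <= 0:
        raise ValueError("generators must be a positive integer")
    if total_vusers < 0:
        raise ValueError("total_vusers must not be negative")
    base, extra = divmod(total_vusers, generators)
    # The first `extra` generators take one additional Vuser each.
    return [base + 1 if i < extra else base for i in range(generators)]


if __name__ == "__main__":
    print(split_vusers(100, 2))
```

The sketch only computes the split; whether two generators really use less CPU than one on your host is something to confirm by monitoring.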
Keep an eye on the CPU utilization.
Ensure that all your virtualized load injector boxes stay within a safe CPU utilization watermark (say, below 80%). This avoids annoying issues in the recorded response times, such as negative response times for transactions, which is an infamous problem in this context! The issue arises because the VMware guest OS clock is synchronized with the host operating system time (which in turn depends on your physical clock). If the CPU is 100% busy while the guest operating system is attempting to sync its clock, the sync process is placed in the processor's run queue, causing a clock drift in the guest operating system, which in turn messes up the transaction response times recorded by LoadRunner.
Monitor the resource metrics of the Load Generator machines while executing tests and make sure none of them is starved of resources.
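The two checks described above can be sketched as a small post-test script. Everything here is an assumption for illustration: the input shapes (a dict of CPU readings per generator, and a list of transaction records with start and end timestamps) are hypothetical and would have to be exported from whatever monitoring and results tools you use. The 80% figure is the watermark suggested in the text.

```python
CPU_WATERMARK = 80.0  # percent, the "say, below 80%" figure from the text


def generators_over_watermark(cpu_samples, watermark=CPU_WATERMARK):
    """cpu_samples maps a generator name to a list of CPU % readings.
    Returns {name: peak} for every generator whose peak exceeded the watermark."""
    flagged = {}
    for name, readings in cpu_samples.items():
        if readings:
            peak = max(readings)
            if peak > watermark:
                flagged[name] = peak
    return flagged


def negative_response_times(transactions):
    """transactions is an iterable of (name, start, end) timestamps in seconds.
    Returns (name, duration) for every record whose end precedes its start,
    the symptom of guest clock drift described above."""
    return [(name, end - start) for name, start, end in transactions if end < start]


if __name__ == "__main__":
    cpu = {"lg1": [40.0, 65.5, 79.0], "lg2": [70.0, 97.5, 88.0]}
    print(generators_over_watermark(cpu))
    print(negative_response_times([("login", 10.0, 10.8), ("search", 20.0, 19.5)]))
```

If either check reports anything, treat the response times from that run with suspicion and consider spreading the load over more generators.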
Never forget to add think time and pacing.
Also make sure that you strictly follow your business requirements rather than simply firing requests, which would be more like a stress test. Incorporate realistic think time in your scripts, with appropriate pacing values. Since the CPU is a shared resource in a virtualized environment, this ensures that no virtual OS instance is unnecessarily deprived of CPU at any point in time.
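One common way to derive a pacing value from a business requirement is simple arithmetic: if N Vusers together must complete T iterations per hour, each Vuser should start one iteration every N × 3600 / T seconds. The sketch below assumes that model; the function names are hypothetical, and the resulting figure is something you would enter in your tool's pacing settings, not a value the tool computes for you.

```python
def pacing_seconds(vusers: int, target_iterations_per_hour: float) -> float:
    """Seconds between iteration starts for each Vuser so that all Vusers
    together reach the target hourly iteration count."""
    if vusers <= 0 or target_iterations_per_hour <= 0:
        raise ValueError("vusers and target must be positive")
    return vusers * 3600.0 / target_iterations_per_hour


def pacing_is_feasible(pacing: float, iteration_duration: float) -> bool:
    """The pacing interval must be at least as long as one full iteration
    (response times plus think time), otherwise the target is unreachable
    with this many Vusers."""
    return pacing >= iteration_duration


if __name__ == "__main__":
    pacing = pacing_seconds(vusers=100, target_iterations_per_hour=6000)
    print(pacing, pacing_is_feasible(pacing, iteration_duration=45.0))
```

If the feasibility check fails, add Vusers rather than removing think time, since dropping think time turns the run into the stress test the text warns against.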
A note on Network performance testing
Jan 7th
Most often it is not necessary to run a dedicated network test per se. I would recommend that engineers perform their routine tests on the AUT (application under test), such as load, stress and endurance tests, with the client-specified workload, and pay attention to network monitoring during the course of those tests. In addition to your load testing tool, I would recommend using tools meant specifically for monitoring network resources, such as netstat and network protocol analyzers like Wireshark.
On the other hand, if the client has a scenario where a massive batch run or backup software moves huge amounts of data across the network for long hours, then I would recommend performing network tests specifically to emulate such scenarios.
For a network test you can set up a Goal-Oriented scenario with a certain throughput as the target. Monitor the throughput and time to first buffer graphs, along with netstat and network protocol analyzers, to find network-related issues.
For a LoadRunner user, the important counters are total bytes/sec, server bytes/sec and connections established; note, however, that it is even more important to understand how to use these counters to identify bottlenecks. Go through the documentation and get a clear understanding of them up front. At a high level, be it LoadRunner, IxChariot or any other tool you use to monitor and collect server metrics, the most important metrics to make sure you collect are:
1. Data Volume: Amount of data sent across the network.
2. Throughput: The speed at which data is sent through the network.
3. Data error rate: A large number of network errors that require retransmission of data will slow down throughput and degrade application performance.
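As a rough sketch, the three metrics above can be derived from per-interval samples as follows. The `Sample` record and its fields are assumptions made for this example; real tools expose these figures under their own counter names, and here the error rate is approximated as retransmitted packets over total packets.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One monitoring interval (hypothetical record layout)."""
    seconds: float     # length of the interval
    bytes_sent: int    # bytes transferred during the interval
    packets: int       # packets sent during the interval
    retransmits: int   # packets that had to be retransmitted


def summarize(samples):
    """Return data volume, average throughput and error rate for a run."""
    total_bytes = sum(s.bytes_sent for s in samples)
    total_seconds = sum(s.seconds for s in samples)
    total_packets = sum(s.packets for s in samples)
    total_retransmits = sum(s.retransmits for s in samples)
    return {
        "data_volume_bytes": total_bytes,
        "throughput_bytes_per_sec": total_bytes / total_seconds if total_seconds else 0.0,
        "error_rate": total_retransmits / total_packets if total_packets else 0.0,
    }


if __name__ == "__main__":
    run = [Sample(10.0, 1000, 10, 1), Sample(10.0, 3000, 30, 3)]
    print(summarize(run))
```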
When you’re trying to break down a performance bottleneck, the first step is to write down a matrix of traffic type against layer/tier. This will help you isolate the part of the network that is causing the problem and then tune it, without resorting to dart-throwing as you struggle to understand your web site bottlenecks.
Example:
Client-server communication
Traffic:
- User HTTP requests
- Server HTML responses
- HTML page elements, such as GIFs, JPEGs, Flash objects, etc.
Server-to-server communication (middle tier)
Traffic:
- HTTP session data sharing within a cluster
- Application database transfers
- Traffic to services nodes (web services)
- Traffic to mail or messaging services
- DNS traffic, etc.
Back-end communication
Traffic:
- Database transfers
- Database-to-application traffic, etc.
During analysis, use the layers above to isolate the portion of the network that has the problem, then troubleshoot the issue by correlating it with the type of traffic observed in that layer.
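The traffic-by-tier matrix can be kept as a small data structure so that, once a suspicious traffic type shows up in a capture, you can look up which tier to investigate. This is only a sketch: the tier and traffic labels are taken from the example above, while the names `TRAFFIC_MATRIX` and `tiers_carrying` are hypothetical.

```python
# Tier -> traffic types, following the example layout in the text.
TRAFFIC_MATRIX = {
    "client-server": [
        "user HTTP requests",
        "server HTML responses",
        "HTML page elements",
    ],
    "server-server (middle tier)": [
        "HTTP session data sharing",
        "application database transfers",
        "traffic to services nodes",
        "traffic to mail or messaging services",
        "DNS traffic",
    ],
    "back end": [
        "database transfers",
        "database-to-application traffic",
    ],
}


def tiers_carrying(traffic_type, matrix=TRAFFIC_MATRIX):
    """Return the tiers in which the given traffic type is expected."""
    return [tier for tier, kinds in matrix.items() if traffic_type in kinds]


if __name__ == "__main__":
    print(tiers_carrying("DNS traffic"))
```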
Commonly observed network performance bottlenecks:
1) Faulty network component causing packet storms in the network.
2) Improperly configured NICs: A node with multiple NICs, in particular, can suffer from improper NIC binding, leaving a few NICs over-utilized and the others under-utilized.
3) Improperly configured load balancer. Example: affinity (IP-based) routing of requests to servers when the requests come from behind a proxy, so that they all appear to originate from the same IP address and land on the same server.
4) Insufficient bandwidth (This problem can be easily detected from the throughput graph when it becomes flat.)
5) A firewall can be a major performance bottleneck: I would recommend that engineers first test the application without the firewall, so that at least one variable is eliminated.
6) Duplex mismatch: This problem occurs when one of the two communicating elements operates in full duplex while the other operates in half duplex. Unlike the full-duplex element, the half-duplex element can either send or receive data packets but cannot do both at the same time, which slows the overall transfer because of heavy packet loss.
7) Excessively chatty protocols: Too many handshake signals are an overhead not just for the application but also for network resources. A protocol analyzer can be very handy for detecting such issues.
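For item 4, the "flat throughput graph" can be spotted programmatically as well as by eye. The sketch below assumes you have exported (Vusers, throughput) pairs in order of increasing load; the function name and the 5% tolerance are arbitrary choices for illustration, and a single noisy sample can trigger it, so treat the result as a hint to look at the graph rather than a verdict.

```python
def find_plateau(points, tolerance=0.05):
    """points: list of (vusers, throughput) pairs in increasing load order.
    Returns the load level after which throughput first grows by less than
    `tolerance` (as a fraction of the previous value), or None if it keeps growing."""
    for (load_a, tput_a), (_load_b, tput_b) in zip(points, points[1:]):
        if tput_a > 0 and (tput_b - tput_a) / tput_a < tolerance:
            return load_a
    return None


if __name__ == "__main__":
    run = [(10, 100.0), (20, 200.0), (30, 290.0), (40, 295.0), (50, 296.0)]
    print(find_plateau(run))
```

A plateau while load keeps rising points to a saturated resource; whether that resource is bandwidth still has to be confirmed against the other items in the list.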