One of my favourite performance testing challenges came from a customer who was implementing a new warehouse management system. His goal: every transaction the warehouse operators performed needed to complete in less than one second.
Our job was to make sure this happened: from hunting down issues to proving that each fix actually made a difference.
Why was this so special?
What made this so challenging? There were a few reasons which I’ll try to summarize here.
Firstly, the nature of a warehouse system means there aren’t many end-users compared to, say, an eCommerce site. We’re not dealing with “classic” problems related to transactional load.
It’s all about analyzing and optimizing the response time for a single user.
Secondly, the systems tend to be quite intelligent and they do a lot of computation with each user click. The principal processes for an operator are put-away (literally putting stuff onto the shelves) and picking (the reverse). Consider the logic that can go into put-away: where are the empty shelves? Where to put what type of goods (frozen, dry, etc.)? Where to stack so that you don’t hide old stock? What’s the most efficient route through the warehouse? A smart system needs to calculate and recalculate with each click.
Adding to our problems: the computation time changes depending on the current inventory levels in the warehouse (i.e. it’s easy to put stuff away in an empty warehouse).
Generating test data for all this is a feat in itself – in an integrated environment there tends to be a long chain of business transactions that must be performed (from customer order through to shipment) before you can even start testing the warehouse processes. Not to mention that a representative shipment could have about 50 items on it – all very painful to set up manually.
Efficiency in testing is crucial
Finally, as we were measuring response times at the sub-second level, we had to measure very precisely, ensure the test conditions were identical from run to run, and repeat the process enough times to be sure we weren’t just looking at statistical variation.
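To illustrate the repetition point, here is a minimal sketch of how repeated measurements of the same transaction can be summarized. The sample values and function names are hypothetical, not from the project; the idea is simply that a percentile reported alongside the mean stops a single outlier from masking (or faking) a sub-second result.

```python
import statistics

def summarize(samples_ms):
    """Summarize repeated response-time measurements (milliseconds)."""
    ordered = sorted(samples_ms)
    # index of the 95th-percentile sample (nearest-rank method)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "mean": statistics.mean(ordered),
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }

# e.g. 20 measurements of the same put-away transaction, one outlier included
samples = [410, 395, 420, 405, 980, 400, 415, 390, 402, 398,
           407, 412, 399, 403, 418, 396, 401, 409, 394, 406]
report = summarize(samples)
sub_second = report["p95"] <= 1000  # goal held even at the 95th percentile
```

With enough repetitions, a stable p95 is much stronger evidence of a sub-second system than any single measurement.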
How did we tackle these problems?
On this project, we had the luxury of a client who really wanted to achieve his goal and who gave us the time and resources to implement what we needed to make it happen. Together, I believe we did seven smart things that ultimately made our project successful.
Stub/virtualize the ERP systems
To get rid of the long test data set-up time, we implemented a form of stubbing/virtualization for the ERP systems feeding the warehouse management system. As we were dealing with a commercial warehouse management system with proprietary interfaces, this wasn’t immediately obvious, but we had access to a great programmer who implemented our stubbing program so we could effortlessly recreate the same customer shipments again and again.
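The essence of the approach can be sketched as follows. This is a simplified, hypothetical stand-in – the real stub spoke the vendor’s proprietary interfaces – but it shows the core idea: replay canned shipment data on demand instead of driving the whole order-to-shipment chain through the ERP.

```python
# Hypothetical message shape: a shipment with 50 line items,
# matching the "about 50 items" representative shipments mentioned above.
CANNED_SHIPMENT = {
    "shipment_id": "SHP-0000",
    "items": [{"sku": f"SKU-{i:03d}", "qty": 10} for i in range(50)],
}

class ErpStub:
    """Stands in for the upstream ERP: no customer orders needed."""

    def __init__(self, template):
        self.template = template
        self.counter = 0

    def next_shipment(self):
        # Emit an identical shipment each time, with a fresh id.
        self.counter += 1
        shipment = dict(self.template)
        shipment["shipment_id"] = f"SHP-{self.counter:04d}"
        return shipment

stub = ErpStub(CANNED_SHIPMENT)
inbound = stub.next_shipment()  # feed this to the system under test
```

Because every test run starts from byte-identical shipment data, differences in response time can be attributed to the system rather than to the inputs.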
Automate the test process
Perhaps a no-brainer, but we used performance testing software to write our test scripts and record the results.
Control the environment
We were given our own testing environment so that we would have total control over the test conditions as well as any software changes. In the custom development and cloud world, this would not seem like a big win, but in the world of integrated ERP systems, this made us feel very privileged.
Maintain realistic stock levels in the test warehouse
The warehouse management architects told us they were very concerned about the stock levels in the warehouse having an impact on the transactional computation time. We researched a real-life warehouse to see how many shelves and how much stock it held, so that we understood what our test warehouse should look like.
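A deterministic seeding routine is one way to hold stock levels constant between runs. The sketch below is an illustration with invented bin/SKU naming, not the project’s actual tooling: it fills a test warehouse to a target occupancy from a fixed random seed, so every nightly run measures against the same inventory picture.

```python
import random

def seed_warehouse(num_bins, fill_ratio, seed=42):
    """Populate a hypothetical test warehouse to a target fill level."""
    random.seed(seed)  # fixed seed: identical stock picture every run
    bins = {f"BIN-{i:05d}": None for i in range(num_bins)}
    occupied = random.sample(list(bins), k=int(num_bins * fill_ratio))
    for i, bin_id in enumerate(occupied):
        bins[bin_id] = {"sku": f"SKU-{i % 500:03d}",
                        "qty": random.randint(1, 40)}
    return bins

# e.g. a 10,000-bin warehouse at 80% occupancy
warehouse = seed_warehouse(num_bins=10_000, fill_ratio=0.8)
filled = sum(1 for contents in warehouse.values() if contents is not None)
```

The fill ratio then becomes an explicit, repeatable test parameter instead of an uncontrolled variable.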
Trace root cause
While it’s relatively easy to find poorly performing transactions, it’s another thing to understand why. Our system had a tracing function that allowed us to record what was happening at the programming level, then deep dive afterwards to diagnose response time issues.
As we were dealing with sub-second response times, we were concerned that the tracing activity itself would interfere with the measurements. So we ran each script twice – first with tracing off (where we would capture the response times), then again with tracing switched on. When we saw any poor results in the morning, we could analyse exactly what had happened together with the system engineers, armed with function-module-level information.
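The two-pass pattern can be sketched like this. The harness and the `fake_putaway` stand-in are hypothetical; the point is that the timing of record always comes from the untraced pass, while the traced pass exists only to feed the diagnosis.

```python
import time

def run_script(transaction, tracing):
    """Run one scripted transaction; return elapsed seconds."""
    start = time.perf_counter()
    transaction(tracing=tracing)
    return time.perf_counter() - start

def nightly_run(scripts):
    """First pass (tracing off) gives the clean timings we report;
    second pass (tracing on) produces the trace used for root cause."""
    results = {}
    for name, script in scripts.items():
        clean = run_script(script, tracing=False)  # timing of record
        run_script(script, tracing=True)           # trace for diagnosis only
        results[name] = clean
    return results

# hypothetical stand-in for a real scripted put-away transaction
def fake_putaway(tracing=False):
    time.sleep(0.001)

timings = nightly_run({"putaway": fake_putaway})
```

Keeping the two concerns in separate passes means the tracing overhead can never contaminate the numbers that get compared against the one-second goal.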
Track progress on a dashboard
As part of our nightly run, we exported the results to a dashboard. This allowed us to track any improvements (or not) implemented by the engineers and to demonstrate the progress we were making.
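As a simple illustration of the export step – the actual dashboard tooling isn’t specified in this write-up, and the CSV layout below is an assumption – appending one dated row per transaction per night is enough to let any charting tool plot the trend:

```python
import csv
import io
from datetime import date

def export_results(results_ms, stream):
    """Append one row per transaction: date, name, response time (ms)."""
    writer = csv.writer(stream)
    for name, response_ms in results_ms.items():
        writer.writerow([date.today().isoformat(), name, response_ms])

# in practice the stream would be a file the dashboard reads;
# a StringIO keeps this sketch self-contained
buffer = io.StringIO()
export_results({"putaway": 412, "picking": 655}, buffer)
rows = buffer.getvalue().splitlines()
```

A growing file of dated rows doubles as an audit trail: every claim of improvement can be traced back to the night it first appeared.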
Document it all
Finally, as part of our commitment to our customer to leave him with a repeatable testing process, we documented all our tools and steps in a beautiful step-by-step methodology package. Something I was glad to have done when I was invited back several months later to run some new tests following a database upgrade!
Reducing response times by 30 to 60%
We focused our improvement efforts on the most frequently used transactions and the ones that fell furthest from the sub-second goal. In particular, we had success with the scanning and put-away transactions, where we were able to reduce response times by 30–60%.
What did we learn from this project? Any performance test engineer will tell you that understanding the root cause of performance issues requires deep systems expertise.
We were fortunate to have access to some serious talent in this area. And even after we’d found the performance problems and understood why they occurred, implementing the fixes was a technical and managerial challenge.
Finally, the biggest takeaway was that it took heavy-weight, top-down commitment from our client to support us with everything we needed to ensure he got his sub-second response times.
Below is the podcast I recorded about this experience.
Please share your own experience with us by leaving a comment below.