Database Performance with High TPS System

High TPS System and Database Performance

Table of Content

Overview
The Challenge
Solution
Hardware Description Used For The Analysis:

OVERVIEW

In today’s world of ever increasing internet data usage from hand held devices, the telecom operators around the world want to optimize their bandwidth usage and also find out patterns which can help them in decision making.

Analyzing the user behavior for their surfing habits and keeping the history for finding the patterns can be a daunting task. Especially when you consider that more and more users are switching from their personal computers to their mobile handsets for their day to day activities such as information searching, visiting social sites, watching videos etc.

To support telecom providers for their needs requires a robust system which can handle high TPS, ranging from 5K to 100K peak TPS.

Applications are known to have achieved such high TPS, but to support this every application requires a robust and high performing database. At the end of the day you will find that most likely the TPS of your application is limited not so much by your application but your database. Your application will perform evenly across varying traffic load but when it comes to the database, they are known to behave differently. A database having 1 million records in a table behaves differently than the one having 10 million records.

For a system where you expect high TPS, you want your database to perform consistently irrespective of the number of records in the database, you need to consider the bottlenecks which can arise if the application and database are not designed properly. A badly designed application/database can bring down average TPS of you system by as much as 1/30 times!

THE CHALLENGE

It is always suggested not to design your application and the database in your development environment for a high TPS system. The simple reason for this is that databases behaves differently in different situations. You need your lab to be able to have records as much as 50 to 500 million in order to know the database behavior and design accordingly.

Now the question comes, why would the database behave differently at different load? The answer is in your table indexes. A typical application involves insert and update mechanism to save data into the database. As we know SQL updates require indexes and in some cases, very large indexes for efficient updates. On the other hand, SQL inserts doesn’t require indexing. For a low TPS system which results in few million records, indexing doesn’t create any bottleneck and the overall performance is consistent.

As your TPS grows, the size of your database will also grow and you will have hundreds of millions of records in your database per day. Now you will see that your application performance will nosedive overnight and you will actually end up getting TPS as low as 1K or even less.

Below is a graph of TPS Vs Table Partition Size Vs Index Size. You will notice how TPS keeps degrading gradually while the Index is growing.

So what caused your application performance to deteriorate to such level?

Table indexes are good to begin with and help you get good performance on relatively smaller database size. Every time you do SQL insert/update the indexing is reorganized to reflect the new records. When you started your application for the first time there was no data in the database and index sizes were in few KBs. As the TPS grew, the size of indexes went into MBs and then few hundred MBs. Even this will be manageable for a G7 machine with 48GB RAM. But then you will observe as the TPS kept growing with time and the no. of records in the database grew, the indexes will run into multiple GBs. Combined with many tables in your database, the total size of your indexes will run into tens of GBs.

There will be a time when database will no longer be able to manage indexes in the RAM itself and the indexes will spill onto disk. Even with fast disk with RAID 10 configuration, you will find that database will end up doing indexing most of the time and little time doing actual data insert/update. Your database disk utilization will be 100% all the time and now your application can no longer sustain the TPS it was giving before and the system throughput will nosedive.

Now you have found the culprit. But it is the necessary evil. It helps you for efficient SQL Select and Updates. If you drop indexes or reduce indexes, the SQL updates won’t be able to work efficiently. If you increase indexes to make updates faster, you will kill the performance of SQL inserts. You will find yourself stuck between a rock and a hard place, you can’t drop/reduce (remove columns) indexes because of SQL updates and you can’t increase indexes (combining more columns) because of SQL inserts.

SOLUTION

Let’s revisit our challenge of “Table indexes creating the bottleneck“ once again to arrive at the solution

Solution for Table Indexes:

Table indexes are required for fast data retrieval and SQL Updates. But while doing SQL inserts indexes don’t offer any benefit. The solutions is that indexes should be created only when all the data has been uploaded into the database and is ready for retrieval. Design your application to do only SQL inserts and drop all the table indexes (Or don’t create them in the first place) while data upload is going on! This solution works wonders for a system which does hundreds of millions of inserts per day. If you badly require indexes then create them once in a day after you have processed previous day. Try not to create indexes which are unnecessarily large because now you don’t need to do SQL updates on this data.

Below is the graph with a system having only SQL inserts on daily partitioned table and no indexing during data upload: