[Solved] MySQL improve performance large tables

EverSQL Database Performance Knowledge Base

MySQL improve performance large tables

Database type:

I am new to MySQl. Please excuse some wrong terminologies. I have a question about indexing and splitting tables in MySQL. I am working on a web server. There are 3 very large tables that are frequently queried. The tables and their fields are as follows:

Alignment: ali_id, chain1_id, chain2_id .....; Seed: seed_id, ali_id, .....; Fragment: seed_id .......

Seed contains one or more records for a given Alignment. Fragment contains one or more records for Seed.

The size of the tables is Alignment - 8.3GB, Seed - 26GB and Fragment - 127GB

These tables have very large indexes: Alignment has index for chain1_id, chain2_id, ali_id, (chain1_id and chain2_id) and (chain2_id and chain1_id). As a result size of index file itself is 27G which is roughly 3 time size of the Alignment table.

Is this appropriate?

Also since the table size is becoming very large, the server keeps on crashing. Is it a good idea to split up the tables into smaller ones. I am confused because I am not sure if multiple select statements will also slow down the server.

Thank you.

Regards, Amit.

Tables were already created by someone else.I guess they were as follows:

create Table Alignment (
 ali_id int(11) PRIMARY KEY, 
 chain_id1 int(11), 
 chain_id2 int(11), 
 param_id smallint(6), 
 date datetime); 

create Table Seed(
 seed_id int(11), 
 Ne smallint(5), 
 rmsd float, 
 ali_id int(11), 
 identics smallint(6)); 

* Seed has 5 more fields, all are smallint(6)

create table Fragment (
 start1 smallint(6), 
 start2 smallint(6), 
 len smallint(6), 
 seed_id bigint(20));

There query that takes long time is:

select a.chain_id2, s.Ne, s.rmsd, s.zN, s.ali_id, s.identics, s.positives, s.nFrg, s.cMatch, s.cont1, s.cont2, s.bMatch, s.back1, s.back2, s.seed_id 

from Alignment AS a, Chain AS c, Seed AS s 

WHERE (a.chain_id1 = c.chain_id) and a.ali_id = s.ali_id and c.pdb_chain = "$pdb_here" and s.zN > $ZLIM;

The number of hits may vary from 100-2000

It is run on an apache server on a Linux machine witn Intel Quad Core @2.5 GHz with 4 GB RAM.

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
Optimal indexes for this query:
ALTER TABLE `Alignment` ADD INDEX `alignment_idx_chain_id1_ali_id` (`chain_id1`,`ali_id`);
ALTER TABLE `Chain` ADD INDEX `chain_idx_pdb_chain_chain_id` (`pdb_chain`,`chain_id`);
ALTER TABLE `Seed` ADD INDEX `seed_idx_ali_id_zn` (`ali_id`,`zN`);
The optimized query:
SELECT
        a.chain_id2,
        s.Ne,
        s.rmsd,
        s.zN,
        s.ali_id,
        s.identics,
        s.positives,
        s.nFrg,
        s.cMatch,
        s.cont1,
        s.cont2,
        s.bMatch,
        s.back1,
        s.back2,
        s.seed_id 
    FROM
        Alignment AS a,
        Chain AS c,
        Seed AS s 
    WHERE
        (
            a.chain_id1 = c.chain_id
        ) 
        AND a.ali_id = s.ali_id 
        AND c.pdb_chain = "$pdb_here" 
        AND s.zN > $ZLIM

Related Articles



* original question posted on StackOverflow here.