[Solved] CakePHP 3.0 Group Two Queries

Still new to CakePHP, and still a bit slow with SQL...

I've managed to use CakePHP's Query Builder to get my query almost where I want it. I've tried various combinations of ->group and ->select, but I can't get these two queries combined properly. Can anyone offer a clue or a new (more appropriate) direction?

My Desired Query:

SELECT
    accounts.id,
    accounts.account_name,
    account_credit_id,
    SUM(credit - debit)
FROM
    (SELECT
        account_credit_id,
        credit,
        debit
    FROM transactions
    UNION
    SELECT
        account_credit_id,
        credit,
        debit
    FROM splits)
AS temp
INNER JOIN accounts
ON account_credit_id = accounts.id
WHERE accounts.account_term_id = 3
GROUP BY account_credit_id
;

Controller (this is a mess and not working):

    $expense_balances = $this->Transactions->find('all')
        ->contain(['AccountCredits'])
        ->where(['AccountCredits.account_term_id' => 3]) // Only Expense Accounts
        ->group(['account_credit_id'])
        ->select(['total' => 'sum(credit - debit)', 'account_credit_id', 'account_name' => 'AccountCredits.account_name']);
    $split_expense_balances = $this->Transactions->Splits->find('all')
        ->contain(['Accounts'])
        ->where(['Accounts.account_term_id' => 3]) // Only Expense Accounts
        ->group(['account_credit_id'])
        ->select(['total' => 'sum(credit - debit)', 'account_credit_id', 'account_name' => 'Accounts.account_name']);

    $all_balances = $expense_balances->union($split_expense_balances)
        ->group(['account_credit_id']);

    debug($all_balances->toArray());

Debug (you can see that the records are grouped on 'account_credit_id', but there are two sets of groups, one from each table):

[
(int) 0 => object(App\Model\Entity\Transaction) {

    'total' => '-72.5',
    'account_credit_id' => (int) 4,
    'account_name' => 'Pets',
    '[new]' => false,
    '[accessible]' => [
        '*' => true
    ],
    '[dirty]' => [],
    '[original]' => [],
    '[virtual]' => [],
    '[errors]' => [],
    '[repository]' => 'Transactions'

},
(int) 1 => object(App\Model\Entity\Transaction) {

    'total' => '-80',
    'account_credit_id' => (int) 5,
    'account_name' => 'Groceries',
    '[new]' => false,
    '[accessible]' => [
        '*' => true
    ],
    '[dirty]' => [],
    '[original]' => [],
    '[virtual]' => [],
    '[errors]' => [],
    '[repository]' => 'Transactions'

},
(int) 2 => object(App\Model\Entity\Transaction) {

    'total' => '-389.44998931884766',
    'account_credit_id' => (int) 2,
    'account_name' => 'Dining',
    '[new]' => false,
    '[accessible]' => [
        '*' => true
    ],
    '[dirty]' => [],
    '[original]' => [],
    '[virtual]' => [],
    '[errors]' => [],
    '[repository]' => 'Transactions'

},
(int) 3 => object(App\Model\Entity\Transaction) {

    'total' => '-118.77000045776367',
    'account_credit_id' => (int) 4,
    'account_name' => 'Pets',
    '[new]' => false,
    '[accessible]' => [
        '*' => true
    ],
    '[dirty]' => [],
    '[original]' => [],
    '[virtual]' => [],
    '[errors]' => [],
    '[repository]' => 'Transactions'

},
(int) 4 => object(App\Model\Entity\Transaction) {

    'total' => '-98.91999816894531',
    'account_credit_id' => (int) 5,
    'account_name' => 'Groceries',
    '[new]' => false,
    '[accessible]' => [
        '*' => true
    ],
    '[dirty]' => [],
    '[original]' => [],
    '[virtual]' => [],
    '[errors]' => [],
    '[repository]' => 'Transactions'

}

]

What I am trying to achieve (grouping by 'account_credit_id' and summing the totals):

[
(int) 0 => object(App\Model\Entity\Transaction) {

    'total' => '-191.27',
    'account_credit_id' => (int) 4,
    'account_name' => 'Pets',
    '[new]' => false,
    '[accessible]' => [
        '*' => true
    ],
    '[dirty]' => [],
    '[original]' => [],
    '[virtual]' => [],
    '[errors]' => [],
    '[repository]' => 'Transactions'

},
(int) 1 => object(App\Model\Entity\Transaction) {

    'total' => '-178.92',
    'account_credit_id' => (int) 5,
    'account_name' => 'Groceries',
    '[new]' => false,
    '[accessible]' => [
        '*' => true
    ],
    '[dirty]' => [],
    '[original]' => [],
    '[virtual]' => [],
    '[errors]' => [],
    '[repository]' => 'Transactions'

},
(int) 2 => object(App\Model\Entity\Transaction) {

    'total' => '-389.44998931884766',
    'account_credit_id' => (int) 2,
    'account_name' => 'Dining',
    '[new]' => false,
    '[accessible]' => [
        '*' => true
    ],
    '[dirty]' => [],
    '[original]' => [],
    '[virtual]' => [],
    '[errors]' => [],
    '[repository]' => 'Transactions'

},

]
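One possible direction, as an untested sketch rather than a confirmed solution: mirror the desired SQL by building the union first and then selecting from it as a derived table, so that grouping and summing happen once over the combined rows. This assumes CakePHP 3.x, where `unionAll()`, `from()` with an `['alias' => $query]` array, and `innerJoin()` are available on the query object. The action name `expenseBalances` is hypothetical. It also uses UNION ALL rather than the UNION in the desired query, so that identical rows are not collapsed before summing. The explicit string keys in `select()` are there so that both inner queries expose the same plain column names to the outer query.

```php
<?php
namespace App\Controller;

class TransactionsController extends AppController
{
    // Hypothetical action name; only the query-building steps matter here.
    public function expenseBalances()
    {
        // Inner queries: select the raw columns only, no grouping yet.
        $transactions = $this->Transactions->find()
            ->select([
                'account_credit_id' => 'Transactions.account_credit_id',
                'credit' => 'Transactions.credit',
                'debit' => 'Transactions.debit',
            ]);

        $splits = $this->Transactions->Splits->find()
            ->select([
                'account_credit_id' => 'Splits.account_credit_id',
                'credit' => 'Splits.credit',
                'debit' => 'Splits.debit',
            ]);

        // UNION ALL keeps identical rows, which matters when summing.
        $combined = $transactions->unionAll($splits);

        // Outer query: use the union as a derived table named "temp",
        // join accounts, then group and sum over the combined rows.
        $allBalances = $this->Transactions->find()
            ->select([
                'account_credit_id' => 'temp.account_credit_id',
                'account_name' => 'Accounts.account_name',
                'total' => 'SUM(temp.credit - temp.debit)',
            ])
            ->from(['temp' => $combined])
            ->innerJoin(
                ['Accounts' => 'accounts'],
                ['Accounts.id = temp.account_credit_id']
            )
            ->where(['Accounts.account_term_id' => 3]) // Only Expense Accounts
            ->group(['temp.account_credit_id', 'Accounts.account_name']);

        debug($allBalances->toArray());
    }
}
```

If this works as intended, each account should appear once, with a total that covers rows from both tables. Checking the generated statement with `debug($allBalances->sql())` against the desired query above would be a sensible first step.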

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.

The optimization process and recommendations:

  1. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
  2. Explicitly ORDER BY After GROUP BY (modified query below): In MySQL versions before 8.0, the database sorts 'GROUP BY col1, col2, ...' queries as if you had also specified 'ORDER BY col1, col2, ...'. If a query includes a GROUP BY clause but you want to avoid the overhead of sorting the result, you can suppress sorting by specifying 'ORDER BY NULL'. MySQL 8.0 and later no longer sort GROUP BY results implicitly.
  3. Prefer Sorting/Grouping By The First Table In Join Order (modified query below): The database can use indexes more efficiently when sorting and grouping using columns from the first table in the join order. The first table is chosen based on the optimizer's prediction of the optimal first table, and is not necessarily the first table shown in the FROM clause.
  4. Use UNION ALL instead of UNION (query line: 14): Always use UNION ALL unless you need to eliminate duplicate records. By using UNION ALL, you'll avoid the expensive distinct operation the database applies when using a UNION clause.
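
For this particular query, item 4 may affect correctness as well as speed: UNION collapses identical rows, so two separate entries with the same account, credit and debit would be counted only once in the SUM. The sketch below illustrates this with an in-memory SQLite database through PDO. The sample rows are made up for the illustration and are not taken from the question, and SQLite stands in for whatever database the application actually uses.

```php
<?php
// Illustrative only: an in-memory SQLite database with made-up rows.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE transactions (account_credit_id INTEGER, credit INTEGER, debit INTEGER)');
$pdo->exec('CREATE TABLE splits (account_credit_id INTEGER, credit INTEGER, debit INTEGER)');

// Two separate debits of the same amount on account 4, plus one split.
$pdo->exec('INSERT INTO transactions VALUES (4, 0, 40), (4, 0, 40)');
$pdo->exec('INSERT INTO splits VALUES (4, 0, 40)');

// Sum credit - debit over both tables, combined with the given operator.
function totalFor(PDO $pdo, string $operator): int
{
    $sql = "SELECT SUM(credit - debit) FROM (
                SELECT account_credit_id, credit, debit FROM transactions
                $operator
                SELECT account_credit_id, credit, debit FROM splits
            ) AS temp";

    return (int) $pdo->query($sql)->fetchColumn();
}

$withUnion = totalFor($pdo, 'UNION');        // identical rows collapse into one
$withUnionAll = totalFor($pdo, 'UNION ALL'); // every row is kept

echo "UNION: $withUnion\n";         // prints "UNION: -40"
echo "UNION ALL: $withUnionAll\n";  // prints "UNION ALL: -120"
```

With three identical rows, UNION keeps a single row, while UNION ALL sums all three.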

Optimal indexes for this query:

ALTER TABLE `accounts` ADD INDEX `accounts_idx_account_id_id` (`account_term_id`,`id`);

The optimized query:

SELECT
        accounts.id,
        accounts.account_name,
        temp.account_credit_id,
        SUM(temp.credit - temp.debit) 
    FROM
        (SELECT
            transactions.account_credit_id,
            transactions.credit,
            transactions.debit 
        FROM
            transactions 
        UNION
        SELECT
            splits.account_credit_id,
            splits.credit,
            splits.debit 
        FROM
            splits
    ) AS temp 
INNER JOIN
    accounts 
        ON temp.account_credit_id = accounts.id 
WHERE
    accounts.account_term_id = 3 
GROUP BY
    accounts.id 
ORDER BY
    NULL

* original question posted on StackOverflow here.