I have invoices, invoices_items, order, and order_items tables. The Invoices and Orders tables contain around 1 million records. The Invoices_items and Orders_items tables contain more than 2 million records. The Items table contains 200,000 records. Now I want to generate a report based on filters such as customers, item categories, and more.
Running on PHP 5.6, MySQL 5.7, and Apache2.
SELECT
`si_items`.`item_id`
, SUM(qty) AS `qty`
, IFNULL(SUM(selling_price * (qty)), 0) AS `salestotal`
, GROUP_CONCAT(si.id) AS `siso_id`
, MAX(si.date_transaction) AS `date_transaction`
FROM
`invoice_items` AS `si_items`
LEFT JOIN `invoice` AS `si`
ON si.id = si_items.parent_id
LEFT JOIN `items`
ON si_items.item_id = items.id
WHERE (
DATE_FORMAT(si.date_transaction, '%Y-%m-%d') BETWEEN '2019-01-01'
AND '2019-02-15'
)
AND (si.approved = 1)
AND (si.deleted = 0)
AND (items.deleted = 0)
GROUP BY `item_id`
UNION
SELECT
`so_items`.`item_id`
, SUM(qty) AS `qty`
, IFNULL(SUM(selling_price * (qty)), 0) AS `salestotal`
, GROUP_CONCAT(so.id) AS `soso_id`
, MAX(so.date_transaction) AS `date_transaction`
FROM
`order_items` AS `so_items`
LEFT JOIN `order` AS `so`
ON so.id = so_items.parent_id
LEFT JOIN `items`
ON so_items.item_id = items.id
WHERE (
DATE_FORMAT(so.date_transaction, '%Y-%m-%d') BETWEEN '2019-01-01'
AND '2019-02-15'
)
AND (so.approved = 1)
AND (so.deleted = 0)
AND (items.deleted = 0)
GROUP BY `item_id`
When I executed this query for a date range of 50 days, it took 1 minute 20 seconds.
Indexes have been added to the tables.
Invoice and Order Tables:
PRIMARY KEY (`id`),
KEY `account_id` (`account_id`),
KEY `approved` (`approved`),
KEY `deleted` (`deleted`),
KEY `finalised` (`finalised`),
KEY `rp_status` (`rp_status`),
KEY `sales_types_id` (`sales_types_id`),
KEY `account_type_id` (`account_type_id`),
KEY `company_id` (`company_id`),
KEY `date_transaction` (`date_transaction`)
Invoices_items and Order_items tables:
PRIMARY KEY (`id`),
KEY `deleted` (`deleted`),
KEY `item_id` (`item_id`),
KEY `parent_id` (`parent_id`),
KEY `vat_id` (`vat_id`),
KEY `qty` (`qty`),
Query image: [EXPLAIN output]
I need to increase performance of this query. How should I proceed?
Show Create Tables
CREATE TABLE `invoice` (
`id` char(36) NOT NULL,
`reference` varchar(25) DEFAULT NULL,
`company_id` char(36) DEFAULT NULL,
`branch_id` char(36) DEFAULT NULL,
`account_id` char(36) DEFAULT NULL,
`contact_id` char(36) DEFAULT NULL,
`transaction_type` varchar(10) DEFAULT NULL,
`sales_types_id` int(11) DEFAULT '0',
`quote_validity` int(11) DEFAULT '0',
`delivery_method_id` int(11) DEFAULT '0',
`sales_representative_id` int(11) DEFAULT '0',
`account_type_id` char(36) DEFAULT NULL,
`vat_exempted` tinyint(1) DEFAULT '0',
`description` text,
`finalised` tinyint(1) DEFAULT '0' COMMENT 'Not Yet finalised - status=1; Need Approval - status = 2; Approved - status = 3',
`approved` tinyint(1) DEFAULT '0',
`approved_user_id` int(11) DEFAULT '0',
`default_sales_location_id` char(36) DEFAULT NULL COMMENT '0-Yes; 1-No',
`generate_do` tinyint(1) DEFAULT '1',
`generate_dn` tinyint(4) DEFAULT '1',
`do_status` tinyint(1) DEFAULT '0',
`cn_status` tinyint(1) DEFAULT '0',
`rp_status` tinyint(1) DEFAULT '0',
`dm_status` tinyint(1) DEFAULT '0',
`currency_id` char(36),
`exchange_rate_id` tinyint(1) DEFAULT '0',
`exchange_rate` double DEFAULT '1',
`date_transaction` datetime DEFAULT NULL,
`date_created` datetime DEFAULT NULL,
`date_modified` datetime DEFAULT NULL,
`created_user_id` int(11) DEFAULT '0',
`modified_user_id` int(11) DEFAULT '0',
`deleted` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `account_id` (`account_id`),
KEY `approved` (`approved`),
KEY `branch_id` (`branch_id`),
KEY `cn_status` (`cn_status`),
KEY `created_user_id` (`created_user_id`),
KEY `date_created` (`date_created`),
KEY `deleted` (`deleted`),
KEY `do_status` (`do_status`),
KEY `finalised` (`finalised`),
KEY `reference` (`reference`),
KEY `rp_status` (`rp_status`),
KEY `sales_types_id` (`sales_types_id`),
KEY `account_type_id` (`account_type_id`),
KEY `company_id` (`company_id`),
KEY `date_transaction` (`date_transaction`),
KEY `default_sales_location_id` (`default_sales_location_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `invoice_items` (
`id` char(36) NOT NULL,
`parent_id` char(36) DEFAULT NULL,
`item_id` char(36) DEFAULT NULL,
`qty` double DEFAULT '0',
`cost_price` double DEFAULT '0',
`list_price` double DEFAULT '0',
`selling_price` double DEFAULT '0',
`unit_price` double DEFAULT '0',
`vat` double DEFAULT '0',
`amount` double DEFAULT '0',
`special_discount` double DEFAULT '0',
`price_change_status` tinyint(1) DEFAULT '0',
`remarks` text,
`vat_id` int(11) DEFAULT '1',
`stock_category_id` tinyint(2) DEFAULT '0' COMMENT '1: Stockable 2: Service',
`is_giftitem` tinyint(1) DEFAULT '0' COMMENT '1: Gift Item 0: NO Gift',
`item_type_status` tinyint(1) DEFAULT '0',
`date_created` datetime DEFAULT NULL,
`date_modified` datetime DEFAULT NULL,
`created_user_id` int(11) DEFAULT '0',
`modified_user_id` int(11) DEFAULT '0',
`deleted` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `deleted` (`deleted`),
KEY `item_id` (`item_id`),
KEY `parent_id` (`parent_id`),
KEY `stock_category_id` (`stock_category_id`),
KEY `item_type_status` (`item_type_status`),
KEY `vat_id` (`vat_id`),
KEY `amount` (`amount`),
KEY `qty` (`qty`),
KEY `unit_price` (`unit_price`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
- For a question asking to improve the performance of an SQL query, it would be wise to add the EXPLAIN output. Besides, it would be extremely wise to simplify the query, taking out insignificant parts and leaving only the code that has the same performance problem. – Your Common Sense, Feb 15, 2019 at 11:27
- @YourCommonSense, added the EXPLAIN query result. – smdhkv, Feb 15, 2019 at 11:37
- I don't get it. EXPLAIN says there are only 100000 rows in invoice_items, and you said there are 2 million. Are 95% of them deleted? – Your Common Sense, Feb 15, 2019 at 12:00
- @YourCommonSense, no. The query was executed for 6 months. – smdhkv, Feb 15, 2019 at 12:07
- The current question title, which states your concerns about the code, is too general to be useful here. Please edit to the site standard, which is for the title to simply state the task accomplished by the code. Please see How to get the best value out of Code Review: Asking Questions for guidance on writing good question titles. – Toby Speight, Feb 15, 2019 at 13:18
2 Answers
You use a date format function in the WHERE clause.
Wrapping the column in a function means that the query cannot use an index on the date column.
Removing the date format function from the WHERE clause will improve the performance of the query.
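A minimal sketch of that change for the invoice half of the query, assuming the single-column index on `date_transaction` shown in the question. Because the column is a DATETIME, the inclusive BETWEEN on formatted dates can be written as a half-open range on the raw column, with the upper bound moved to the day after the last wanted date. The `si_items.` qualifiers on `qty` and `selling_price` are an addition based on the posted table definition.

```sql
-- Same filter as the original, but the column is compared directly,
-- so the optimizer is free to consider the index on date_transaction.
SELECT
    si_items.item_id,
    SUM(si_items.qty) AS qty,
    IFNULL(SUM(si_items.selling_price * si_items.qty), 0) AS salestotal,
    GROUP_CONCAT(si.id) AS siso_id,
    MAX(si.date_transaction) AS date_transaction
FROM invoice_items AS si_items
LEFT JOIN invoice AS si
    ON si.id = si_items.parent_id
LEFT JOIN items
    ON si_items.item_id = items.id
WHERE si.date_transaction >= '2019-01-01'
  AND si.date_transaction <  '2019-02-16'  -- exclusive bound: the day after 2019-02-15
  AND si.approved = 1
  AND si.deleted = 0
  AND items.deleted = 0
GROUP BY si_items.item_id;
```

The order half of the UNION would change the same way. Comparing the EXPLAIN output before and after should show whether the index is actually picked up.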
I do not see any compound KEY, like (date_transaction, approved, deleted).
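As a hedged sketch, such a compound key could be added as follows. The index names are hypothetical, and the `order` table is assumed to have the same three columns as `invoice`, as the question's index listing suggests. The column order follows the suggestion above; since MySQL stops using further key parts for range access after the first range condition, a variant with the equality columns first is included as a comment for comparison.

```sql
-- Index names are hypothetical; column order follows the suggestion above.
ALTER TABLE `invoice`
    ADD INDEX `idx_date_approved_deleted` (`date_transaction`, `approved`, `deleted`);

ALTER TABLE `order`
    ADD INDEX `idx_date_approved_deleted` (`date_transaction`, `approved`, `deleted`);

-- Variant to compare with EXPLAIN: equality columns first, range column last.
-- ALTER TABLE `invoice`
--     ADD INDEX `idx_approved_deleted_date` (`approved`, `deleted`, `date_transaction`);
```

Either index can only help the date filter if the WHERE clause compares `date_transaction` directly rather than through DATE_FORMAT.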
Also, fewer indexes can improve overall speed, counter-intuitive as that may sound.
This index might not help much, though. In that case, reduce the amount of data for the time being.
Experiment with leaving out parts of the query, e.g. both GROUP_CONCATs: GROUP_CONCAT(si.id) AS siso_id
Instead, one could offer a drill-down on the IDs of a single group, fetched later on request.
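A sketch of such a drill-down for the invoice side, assuming the report itself no longer selects the GROUP_CONCAT column. The item ID below is a placeholder, not a real value. Note also that GROUP_CONCAT output is capped by `group_concat_max_len` (1024 by default), so long ID lists in the original report may be truncated anyway.

```sql
-- Fetch the invoice IDs for one item only when the user asks for them.
-- The item_id value is a placeholder for the item selected in the report.
SELECT DISTINCT si.id
FROM invoice_items AS si_items
JOIN invoice AS si
    ON si.id = si_items.parent_id
WHERE si_items.item_id = '00000000-0000-0000-0000-000000000000'
  AND si.date_transaction >= '2019-01-01'
  AND si.date_transaction <  '2019-02-16'  -- exclusive bound: the day after 2019-02-15
  AND si.approved = 1
  AND si.deleted = 0;
```

The `items.deleted = 0` check is omitted here on the assumption that the item was already taken from a report row that passed it. DISTINCT removes the repeats that appear when one invoice lists the same item on several lines.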
One can also consider paging: here it might be enough to offer one page per month, reducing the amount of data requested per page.
Or create an archive table with query results per month.
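One possible shape for such an archive table, as a sketch only: the table name `invoice_item_monthly` and its columns are hypothetical, and how closed months get refreshed (cron job, scheduled event, application code) is left open. The report would then read finished months from this table and aggregate only the current month from the live tables.

```sql
-- Hypothetical summary table: one row per item per calendar month.
CREATE TABLE invoice_item_monthly (
    month_start DATE     NOT NULL,
    item_id     CHAR(36) NOT NULL,
    qty         DOUBLE   NOT NULL DEFAULT 0,
    salestotal  DOUBLE   NOT NULL DEFAULT 0,
    PRIMARY KEY (month_start, item_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Fill one closed month (January 2019) from the live tables.
INSERT INTO invoice_item_monthly (month_start, item_id, qty, salestotal)
SELECT
    '2019-01-01',
    si_items.item_id,
    IFNULL(SUM(si_items.qty), 0),
    IFNULL(SUM(si_items.selling_price * si_items.qty), 0)
FROM invoice_items AS si_items
JOIN invoice AS si
    ON si.id = si_items.parent_id
JOIN items
    ON items.id = si_items.item_id
WHERE si.date_transaction >= '2019-01-01'
  AND si.date_transaction <  '2019-02-01'  -- exclusive bound: first day of the next month
  AND si.approved = 1
  AND si.deleted = 0
  AND items.deleted = 0
GROUP BY si_items.item_id;
```

A matching table for orders would follow the same pattern. Rows for a month would need to be rebuilt if invoices in that month are later approved, deleted, or edited.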