I have two tables in MySQL with the following schema:
CREATE TABLE `open_log` (
`delivery_id` varchar(30) DEFAULT NULL,
`email_id` varchar(50) DEFAULT NULL,
`email_activity` varchar(30) DEFAULT NULL,
`click_url` text,
`email_code` varchar(30) DEFAULT NULL,
`on_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `sent_log` (
`email_id` varchar(50) DEFAULT NULL,
`delivery_id` varchar(50) DEFAULT NULL,
`email_code` varchar(50) DEFAULT NULL,
`delivery_status` varchar(50) DEFAULT NULL,
`tries` int(11) DEFAULT NULL,
`creation_ts` varchar(50) DEFAULT NULL,
`creation_dt` varchar(50) DEFAULT NULL,
`on_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The email_id and delivery_id columns in both tables make up a unique key.
The open_log table has 2.5 million records, whereas the sent_log table has 0.25 million records.
I want to filter the records in the open_log table based on the unique key (email_id and delivery_id).
I wrote the following query:
SELECT * FROM open_log WHERE CONCAT(email_id,'^',delivery_id) IN ( SELECT DISTINCT CONCAT(email_id,'^',delivery_id) FROM sent_log )
The problem is that the query takes too long to execute. I waited an hour for it to complete, but it never finished.
I tried making email_id and delivery_id a composite key, but that didn't help.
Please suggest what I can do to make it faster, given the large amount of data in the tables.
Thanks, Faisal Nasir
1 Answer
First, if email_id and delivery_id together are a unique key, please add a composite primary key on (email_id, delivery_id) to both tables.
Second, the concat is not necessary and will prevent the previous key from being used. Try:
SELECT ol.* FROM open_log ol JOIN sent_log sl ON (ol.email_id, ol.delivery_id) = (sl.email_id, sl.delivery_id)
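A minimal sketch of those two steps, under a few assumptions that are not in the original post: MySQL requires primary key columns to be NOT NULL, so the columns are redefined first (rows with NULL in either column would need to be cleaned up beforehand); open_log gets a plain, non-unique index rather than a primary key, as a cautious choice in case the same email was opened more than once; and the index name idx_email_delivery is hypothetical.

```sql
-- sent_log: (email_id, delivery_id) is stated to be unique, so it can be the primary key.
-- Primary key columns must be NOT NULL, hence the MODIFY clauses.
ALTER TABLE sent_log
  MODIFY email_id varchar(50) NOT NULL,
  MODIFY delivery_id varchar(50) NOT NULL,
  ADD PRIMARY KEY (email_id, delivery_id);

-- open_log: a non-unique composite index (assumption: duplicates may exist here).
-- idx_email_delivery is a hypothetical name.
ALTER TABLE open_log
  ADD INDEX idx_email_delivery (email_id, delivery_id);

-- Check the plan before running the real query; the join on sent_log
-- should show the PRIMARY key being used instead of a full scan per row.
EXPLAIN
SELECT ol.*
FROM open_log ol
JOIN sent_log sl
  ON ol.email_id = sl.email_id
 AND ol.delivery_id = sl.delivery_id;
```

If both columns really are unique in open_log as well, the second statement could be replaced by a primary key in the same way as for sent_log.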
- I like this syntax but the ON ol.email_id = sl.email_id AND ol.delivery_id = sl.delivery_id results in more optimal code than ON (ol.email_id, ol.delivery_id) = (sl.email_id, sl.delivery_id) in some versions. – ypercubeᵀᴹ, Jul 16, 2014 at 19:21
- @ypercube, not true, MySQL has a bug for certain uses of that syntax (I think it was fixed in 5.7?), let's speak clearly. Not in OP's use. – jynus, Jul 16, 2014 at 19:25
- What if the OP uses 5.1 or 5.0? Anyway, I'm not really sure, have to check it out. It may not affect = but only >= and IN comparisons. – ypercubeᵀᴹ, Jul 16, 2014 at 19:32
- @ypercube it will work: pastebin.com/XwgiW7j4 – jynus, Jul 16, 2014 at 19:39
- @ypercube, exactly, the bug I mention is the (a, b) IN ((c, d), (e, f)) case, where indexes are not used but correct results are returned. – jynus, Jul 16, 2014 at 19:40