I have two tables in MySQL with the following schema:

CREATE TABLE `open_log` (
 `delivery_id` varchar(30) DEFAULT NULL,
 `email_id` varchar(50) DEFAULT NULL,
 `email_activity` varchar(30) DEFAULT NULL,
 `click_url` text,
 `email_code` varchar(30) DEFAULT NULL,
 `on_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `sent_log` (
 `email_id` varchar(50) DEFAULT NULL,
 `delivery_id` varchar(50) DEFAULT NULL,
 `email_code` varchar(50) DEFAULT NULL,
 `delivery_status` varchar(50) DEFAULT NULL,
 `tries` int(11) DEFAULT NULL,
 `creation_ts` varchar(50) DEFAULT NULL,
 `creation_dt` varchar(50) DEFAULT NULL,
 `on_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

The email_id and delivery_id columns in both tables make up a unique key.

The open_log table has 2.5 million records, whereas the sent_log table has 0.25 million records.

I want to select the records from the open_log table whose unique key (email_id and delivery_id) also appears in sent_log.

I'm writing the following query.

SELECT * FROM open_log WHERE CONCAT(email_id,'^',delivery_id) IN ( SELECT DISTINCT CONCAT(email_id,'^',delivery_id) FROM sent_log )

The problem is that the query takes too long to execute. I waited an hour for it to complete, but it never finished.

I've tried making email_id and delivery_id a composite key, but that didn't help.

Kindly suggest what I can do to make it faster, given the large amount of data in the tables.

Thanks, Faisal Nasir

1 Answer

  • First, if email_id and delivery_id together form a unique key, add a composite primary key on (email_id, delivery_id) to both tables.
  • Second, the CONCAT is unnecessary and will prevent that key from being used, because the comparison is then made on a computed expression rather than on the indexed columns. Try:

    SELECT ol.* 
    FROM open_log ol 
    JOIN sent_log sl 
    ON (ol.email_id, ol.delivery_id) = (sl.email_id, sl.delivery_id)
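
For the join above to use an index, the tables need a key on (email_id, delivery_id). What follows is a minimal sketch, not a tested migration. It assumes the pair is unique and never NULL in sent_log; the columns are declared DEFAULT NULL, so a primary key would force them to NOT NULL. The index name idx_open_email_delivery is made up for illustration.

```sql
-- Assumption: (email_id, delivery_id) is unique and never NULL in sent_log.
ALTER TABLE sent_log
  ADD PRIMARY KEY (email_id, delivery_id);

-- A non-unique composite index is enough for the join and does not fail
-- if open_log turns out to hold several rows per (email_id, delivery_id).
-- idx_open_email_delivery is a hypothetical name.
ALTER TABLE open_log
  ADD INDEX idx_open_email_delivery (email_id, delivery_id);
```

If open_log is also known to be unique and NULL-free on that pair, the same ADD PRIMARY KEY statement could be used there instead of the plain index.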
    
ypercube™
answered Jul 16, 2014 at 18:54
  • I like this syntax but the ON ol.email_id = sl.email_id AND ol.delivery_id = sl.delivery_id results in more optimal code than ON (ol.email_id, ol.delivery_id) = (sl.email_id, sl.delivery_id) in some versions. Commented Jul 16, 2014 at 19:21
  • @ypercube, not true. MySQL has a bug for certain uses of that syntax (I think it was fixed in 5.7?), so let's speak clearly: it does not affect the OP's use. Commented Jul 16, 2014 at 19:25
  • What if the OP uses 5.1 or 5.0? Anyway, I'm not really sure, have to check it out. It may not affect = but only >= and IN comparisons. Commented Jul 16, 2014 at 19:32
  • @ypercube it will work pastebin.com/XwgiW7j4 Commented Jul 16, 2014 at 19:39
  • @ypercube, exactly: the bug I mention is with (a, b) IN ((c, d), (e, f)), where indexes are not used but the results are still correct. Commented Jul 16, 2014 at 19:40
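
The two join spellings discussed in these comments can be compared directly with EXPLAIN. This is only a sketch: it assumes a composite index on (email_id, delivery_id) already exists on at least sent_log, and the plan output depends on the MySQL version, so none is shown here.

```sql
-- Row-constructor form from the answer.
EXPLAIN
SELECT ol.*
FROM open_log ol
JOIN sent_log sl
  ON (ol.email_id, ol.delivery_id) = (sl.email_id, sl.delivery_id);

-- Column-by-column form suggested in the first comment.
EXPLAIN
SELECT ol.*
FROM open_log ol
JOIN sent_log sl
  ON ol.email_id = sl.email_id
 AND ol.delivery_id = sl.delivery_id;
```

If the key column of both plans names the composite index, the two forms are being optimized the same way on that server; if only the second one does, the column-by-column form is the safer choice there.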
