I can't quite figure this out and am hoping someone can help. I tried something like this, but it doesn't seem to work:

SELECT * 
FROM testtable 
GROUP BY SUBSTRING_INDEX(urlTest,'/',3) 
HAVING COUNT(*)>1

Here is a (simplified) table structure with what I am using:

CREATE TABLE `testtable` (
 `field1` int(11) NOT NULL,
 `field2` datetime NOT NULL,
 `urlTest` varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `testtable` (`field1`, `field2`, `urlTest`) VALUES
(2, '2010-01-01', 'http://test1.com/somethingelse.php?id=1'),
(5, '2010-01-01', 'http://test1.com/'),
(6, '2012-02-02', 'http://test1.com/newscript/something'),
(7, '2013-02-02', 'http://test2.com/newscript/something'),
(8, '2014-02-02', 'http://test3.com/newscript/something'),
(9, '2015-02-02', 'http://test3.com/');
ALTER TABLE `testtable`
 ADD PRIMARY KEY (`field1`);

The output I want is every row whose "base" URL matches that of another row with an identical date, so that I can inspect those rows manually.

In other words, although "test1.com/somethingelse.php?id=1" and "test1.com" are "unique" URL entries, their "base" URL (i.e., "test1.com") is identical, so they count as duplicates because their dates (2010-01-01) are also identical. However, test1.com/newscript/something would not count as a 'duplicate': although its 'base' URL is the same, its date (2012-02-02) differs from that of the other 'test1.com' URLs.

So the output I'd like to get is:

2, 2010-01-01, 'http://test1.com/somethingelse.php?id=1'
5, 2010-01-01, 'http://test1.com/'

(because only the test1.com rows share both the same "base" URL and the same timestamp. The "test3.com" rows share a "base" URL, but since their timestamps differ they would be considered 'unique' URLs.)

How would I accomplish this?

Thanks!

Akina
asked Sep 27, 2019 at 16:18

1 Answer

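Selecting all columns (SELECT *) together with GROUP BY makes no sense: for every column that is not part of the GROUP BY, the server returns one arbitrary value from the rows in the group (or rejects the query outright when ONLY_FULL_GROUP_BY is enabled). Instead, find the duplicated (domain, date) pairs in a subquery and join them back to the table: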

SELECT testtable.*
FROM testtable
JOIN ( SELECT SUBSTRING_INDEX(urlTest,'/',3) AS domain,
 field2
 FROM testtable 
 GROUP BY SUBSTRING_INDEX(urlTest,'/',3), field2 
 HAVING COUNT(*) > 1 ) domains
ON SUBSTRING_INDEX(testtable.urlTest,'/',3) = domains.domain
 AND testtable.field2 = domains.field2
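
To see why this works, it can help to run the pieces separately against the sample data from the question. The sketch below only reuses `testtable` and its columns from the question; the `domain` and `cnt` aliases are illustrative, and the result comments are what the six sample rows should give.

```sql
-- Step 1: what SUBSTRING_INDEX(urlTest, '/', 3) extracts for each row.
-- It keeps everything before the third '/', i.e. scheme plus host.
SELECT field1,
       SUBSTRING_INDEX(urlTest, '/', 3) AS domain,
       field2
FROM testtable
ORDER BY field1;
-- 2  http://test1.com  2010-01-01 00:00:00
-- 5  http://test1.com  2010-01-01 00:00:00
-- 6  http://test1.com  2012-02-02 00:00:00
-- 7  http://test2.com  2013-02-02 00:00:00
-- 8  http://test3.com  2014-02-02 00:00:00
-- 9  http://test3.com  2015-02-02 00:00:00

-- Step 2: the derived table on its own, with the group size shown.
-- Only (domain, field2) pairs that occur more than once survive HAVING.
SELECT SUBSTRING_INDEX(urlTest, '/', 3) AS domain,
       field2,
       COUNT(*) AS cnt
FROM testtable
GROUP BY SUBSTRING_INDEX(urlTest, '/', 3), field2
HAVING COUNT(*) > 1;
-- http://test1.com  2010-01-01 00:00:00  2
```

Joining that single surviving pair back to `testtable` is what returns rows 2 and 5 in full.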

fiddle
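
As an alternative sketch, not part of the original answer: on servers that support window functions (MySQL 8.0 or later, MariaDB 10.2 or later), the same rows can probably be obtained without a self-join by counting rows per (domain, date) partition. The aliases `t`, `counted`, and `cnt` are illustrative names.

```sql
-- Assumes window-function support (MySQL 8.0+ / MariaDB 10.2+).
SELECT field1, field2, urlTest
FROM (
    SELECT t.*,
           -- number of rows sharing this row's base URL and date
           COUNT(*) OVER (
               PARTITION BY SUBSTRING_INDEX(urlTest, '/', 3), field2
           ) AS cnt
    FROM testtable AS t
) AS counted
WHERE cnt > 1
ORDER BY field1;
```

On older versions without window functions, the join against the grouped subquery shown above remains the way to go.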

answered Sep 27, 2019 at 16:50
  • wow! i'm impressed! that is very cool, thanks very much for your help! I've been trying to figure it out for quite some time now! thanks very much! :D Commented Sep 27, 2019 at 17:54
  • Follow-up question (now that this works great and I can see the 'duplicate' URLs): something I didn't anticipate is that some base URLs require a 'longer' SUBSTRING_INDEX. What would be the best way of doing that? Commented Sep 27, 2019 at 18:01
  • I.e., say I have "test1.com", "test2.com" and "test3.com", and I want to add specific domain bases individually, but specify how many slashes there should be for each? i.e., Commented Sep 27, 2019 at 18:02
  • Shoot, sorry, I keep typing and it keeps ending the comment. Anyway, kind of like: test1.com/random and test1.com/otheroutput (so test1.com is "3" slashes); test2.com/base/newrandom and test2.com/base/otheritem (so "4" slashes, because everything up to "/base/" is identical); then, say, test3.com/base/newrandom/variablending001, test3.com/base/newrandom/variablending002 and test3.com/base/newrandom/variablending003 (three variable endings in this case), so "5" slashes... or... it just occurred to me... Commented Sep 27, 2019 at 18:04
  • @user6262902 The new task is only slightly related. I'd recommend creating a new question, with new DDL and data scripts and the desired output. Commented Sep 27, 2019 at 18:04
