My stored procedure performs SELECT ... INTO var
yielding NULL, but if I repeat the same SELECT
myself, I get a value.
Here's the relevant part of the schema
CREATE TABLE UrlAuthority
(
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY
,name VARCHAR(255) NOT NULL COMMENT 'includes TLD suffix'
,UNIQUE(name)
)
ENGINE = INNODB DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE TABLE UrlFqdn
(
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY
,authority BIGINT NOT NULL COMMENT 'references example.com'
,name VARCHAR(255) NOT NULL COMMENT 'host.example.com'
,FOREIGN KEY (authority) REFERENCES UrlAuthority (id)
,UNIQUE (authority, name)
)
ENGINE = INNODB DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE TABLE Url
(
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY
,fqdn BIGINT NOT NULL
,path VARCHAR(255) NOT NULL
,FOREIGN KEY (fqdn) REFERENCES UrlFqdn (id)
,UNIQUE (fqdn, path)
)
ENGINE = INNODB DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
And here's the procedure
CREATE PROCEDURE UrlToId (url TEXT)
BEGIN
DECLARE host TEXT;
DECLARE authority TEXT;
DECLARE tld TEXT;
DECLARE pathtext TEXT;
DECLARE authority_tld TEXT;
DECLARE fqdn TEXT;
DECLARE authority_id BIGINT;
DECLARE fqdn_id BIGINT;
DECLARE url_id BIGINT;
DECLARE err TEXT;
CALL UnpackURL (url, host, authority, tld, pathtext);
SELECT JoinHostNames(authority, tld) INTO authority_tld;
SELECT JoinHostNames(host, authority_tld) INTO fqdn;
INSERT IGNORE INTO UrlAuthority (name) VALUES (authority_tld);
SELECT id
INTO authority_id
FROM UrlAuthority
WHERE name=authority_tld;
-- LOG ('UrlAuthority', authority_tld, authority_id)
INSERT IGNORE INTO UrlFqdn (authority, name)
VALUES (authority_id, fqdn);
SELECT id
INTO fqdn_id
FROM UrlFqdn
WHERE authority=authority_id AND name=fqdn;
-- LOG ('UrlFqdn', fqdn, fqdn_id)
INSERT IGNORE INTO Url (fqdn, path) VALUES (fqdn_id, pathtext);
-- LOG ('Url', fqdn_id, pathtext)
SELECT id FROM Url WHERE fqdn=fqdn_id AND path=pathtext;
END
|
The LOG
lines are a bit of metaprogramming, they insert strings into a debug table.
If I run
call UrlToId('http://mail4.z.uk/foo/bar');
I get these debug messages
UrlAuthority authority_tld=z.uk authority_id=1
UrlFqdn fqdn=mail4.z.uk fqdn_id=NULL
Url fqdn_id=NULL pathtext=/foo/bar
Clearly the problem is that fqdn_id=NULL
however, if I manually repeat the query which does SELECT id INFO fqdn_id
and substitute the values of the variables indicted in the trace:
SELECT id
-- INTO fqdn_id
FROM UrlFqdn
WHERE authority=1 /* authority_id */ AND name='mail4.z.uk' /* fqdn */;
This selects 1
as expected.
The select
gets the data, so why is the variable NULL
afterwards?
1 Answer 1
For a while, this one eluded me. I don't like SELECT ... INTO
, so I really wanted to blame it for your trouble, but no.
Here'e the problem.
SELECT id
INTO fqdn_id
FROM UrlFqdn
WHERE authority=authority_id AND name=fqdn;
How does MySQL interpret this?
SELECT id
INTO fqdn_id
FROM UrlFqdn
WHERE 'mail4' = 1 AND name = 'mail4.z.uk';
/* Impossible WHERE */
/* see DECLARE authority TEXT; */
“A local variable should not have the same name as a table column. If an SQL statement, such as a
SELECT ... INTO
statement, contains a reference to a column and a declared local variable with the same name, MySQL currently interprets the reference as the name of a variable.”— http://dev.mysql.com/doc/refman/5.7/en/local-variable-scope.html
Using the same identifier for a variable and a column name makes a debugging mess, one way or another.
-
1Oh my god THANK YOU. New personal policy: local variable names will have goddamned prefixes.spraff– spraff2016年10月15日 11:55:13 +00:00Commented Oct 15, 2016 at 11:55
-
What's wrong with
SELECT...INTO
by the way? I used to use@foo
but kept getting caught out by silent typos.spraff– spraff2016年10月15日 11:55:35 +00:00Commented Oct 15, 2016 at 11:55 -
See the "I don't like..." link for my write-up about the issue with
SELECT ... INTO
in contrast with the behavior of assignment from a scalar subquery, e.g.SET var = (SELECT ...);
. Unlike the subquery,SELECT ... INTO
leaves the variable's previous value intact rather than setting itNULL
if theSELECT
does not actually match a row. With cursors and loops, where the variable is reused, this can bite you. Also, yes, good idea: I almost always prefix local (program) variables with_
whether they need it or not, to avoid ambiguity.Michael - sqlbot– Michael - sqlbot2016年10月15日 16:31:47 +00:00Commented Oct 15, 2016 at 16:31
SELECT JoinHostNames(host, authority_tld) INTO fqdn;
JoinHostNames('mail','google.com')='mail.google.com'
but it also handles an emptyhost
argument gracefully.INSERT .. ON DUPLICATE KEY UPDATE ..
andLAST_INSERT_ID
; they may simplify your code.