I have a working script that selects image fields in all tables and empty their values if the physical file doesnt exist.
$query1 = "SELECT table_name,column_name
FROM information_schema.columns
WHERE table_schema='schemaname' AND column_name like '%image%' or column_name='video'";
$result1 = mysql_query($query1) or die(mysql_error() . " -- " . $query1);
while($row1 = mysql_fetch_row($result1)){
if (!strpos($row1[0],'backup') > 0){
$sql = "Select COLUMN_NAME FROM information_schema.columns WHERE TABLE_NAME = '".$row1[0]."' AND EXTRA = 'auto_increment'";
$resultcol = mysql_query($sql);
$rowcol = mysql_fetch_row($resultcol);
$query2 = "SELECT " . $row1[1] . ", " .$rowcol[0] . "
FROM " . $row1[0] . "
WHERE " . $row1[1] . " != '' AND " . $row1[1] . " IS NOT NULL
";
echo $query2 . "<br>";
$result2 = mysql_query($query2) or die(mysql_error() . " -- " . $query2);
while ($rowdb = mysql_fetch_row($result2)){
if (!strpos($rowdb[0],'facebook') > 0 && !file_exists($img_root.'/'.$rowdb[0])){
$sql = "UPDATE ".$row1[0]." SET ". $row1[1] . " = '' WHERE " . $rowcol[0]. "= ".$rowdb[1];
echo $sql . "<br><br>";
$delete_count++;
//mysql_query("UPDATE ".$row1[0]." SET ". $row1[1] . " = '' WHERE id = ".$row1["id"]);
}
}
}
}
The script is working fine, but it takes time. I was wondering if there is a smarter (more optimized) way to get the same function.
-
2\$\begingroup\$ For the sake of ease (of writing, reading and thus maintaining) I would recommend using PDO instead of using the MySQL drivers like that. It's more secure too once you start working with data a user can send to your query. Here are the Official PDO Documentation and a nice blog post that can get you started. PDO will also make it very easy to change from MySQL to something else later on. \$\endgroup\$Jeroen– Jeroen2014年03月12日 15:38:12 +00:00Commented Mar 12, 2014 at 15:38
-
\$\begingroup\$ There is probably not much you can do to get it to run any faster. \$\endgroup\$bumperbox– bumperbox2014年03月12日 19:50:17 +00:00Commented Mar 12, 2014 at 19:50
1 Answer 1
The mysql_*
functions are deprecated. Use mysqli
or PDO instead.
Each query that you issue has a rather large overhead: you have to communicate the query to the server; the server has to parse, plan, and execute the query; the results have to be returned to your application. Therefore, you want to avoid running any query in a loop. You're much better off executing fewer complex queries than many simple queries.
First, note that your $query1
has a bug: AND
has higher precedence than OR
, so you need parentheses. Also, you eventually ignore any column named ...backup
, so you might as well filter those out immediately.
SELECT TABLE_NAME AS table_name, COLUMN_NAME AS image_col
FROM INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_SCHEMA = 'schemaname'
AND TABLE_NAME NOT LIKE '%backup%'
AND (COLUMN_NAME LIKE '%image%' OR COLUMN_NAME = 'video');
Let's clarify the naming.
- Your
$row1[0]
is what I calltable_name
- Your
$row1[1]
is what I callimage_col
Notionally, you want to execute this query for each result returned by that $schema
query.
UPDATE `$schema['table_name']`
SET `$schema['image_col']` = ''
WHERE
`$schema['image_col']` != '' AND `$schema['image_col']` IS NOT NULL
AND `$s['image_col']` NOT LIKE '%facebook%'
AND NOT file_exists('$img_root/$schema['image_col']')
Of course, there are two problems with that idea:
- There is no
file_exists()
function in MySQL (and furthermore, if such a function existed, it would be wrong to run it on the database server rather than on the application server, since they could be on different machines with different filesystems). You have to execute many
UPDATE
s, not just one. Furthermore, theUPDATE
queries are dynamic — the table and column names vary.This is a strong indication that your database schema is poorly designed. There should not be a proliferation of tables! You should also not be doing version control or backups by creating tables named
...backup
. I encourage you to follow up with these issues on http://dba.stackexchange.com.
A solution to the first problem is to create a temporary table that lists all images on your website's filesystem, then write the UPDATE
with an anti-join.
$db = new mysqli(...);
$db->query('CREATE TEMPORARY TABLE fs_images ( filename VARCHAR(128) PRIMARY KEY );');
$ins = $db->prepare('INSERT INTO fs_images VALUES (?)');
$file_it = new RecursiveDirectoryIterator($img_root);
foreach (new RecursiveIteratorIterator($file_it) as $f) {
$ins->bind_param('s', basename($f));
$ins->execute();
}
$ins->close();
A solution to the second problem is to use a stored procedure with a loop that executes dynamically composed SQL queries with a cursor.
DELIMITER $$
CREATE PROCEDURE clear_images()
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE table_name VARCHAR(64);
DECLARE image_col VARCHAR(64);
DECLARE schema_cursor CURSOR FOR
SELECT TABLE_NAME AS table_name, COLUMN_NAME AS image_col
FROM INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_SCHEMA = 'schemaname'
AND TABLE_NAME NOT LIKE '%backup%'
AND (COLUMN_NAME LIKE '%image%' OR COLUMN_NAME = 'video');
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
OPEN schema_cursor;
schema_loop: LOOP
FETCH schema_cursor INTO table_name, image_col;
IF done THEN
LEAVE schema_loop;
END IF;
PREPARE stmt FROM CONCAT(
'UPDATE `', @table_name, '`',
' SET `', @image_col, '` = ''''',
' WHERE',
' `', @image_col, '` != '''' AND `', @image_col, '` IS NOT NULL',
' AND `', @image_col, '` NOT LIKE ''%facebook%''',
' AND `', @image_col, '` NOT IN (SELECT filename FROM fs_images);'
);
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END LOOP;
END$$
DELIMITER ;
Anyway, this solution is rather nasty (and untested!) — all as a result of your poor database schema design.