This is a second rev of this SE Question.
How can I add more validation to this .ico file downloader?
It works but I need a way to check the integrity of each step:
- Check Default location ( checked length is greater than 20 ) ( broken for test.com ... why ?
- Check Google API ( validation already added )
- Scrape page for ico location and download directly ( last resort )
And of course general feedback. Note I Use a GET request to test the script invoking debug.
<?php
if($_GET)
{
$obj = new FaviconFinder();
$obj->invokeDebug($_GET);
}
class FaviconFinder
{
// domain before and after redirects
private $domain;
private $real_domain;
// the domain and how it was obtained
private $domain_code = '0';
private $domain_file;
// the favicon and how it was obtained
private $favicon_code = 'z';
private $favicon_file;
private $ext;
// paths local to server and on the internet (URL)
private $path_local_server = "../../favicons/";
private $name_db;
private $path_internet;
/****************************************************************************************************
* invoke
****************************************************************************************************/
public function invoke( $pipe )
{
$domain = $pipe['domain'];
$this->domain = $domain;
if ( $this->googleAPIFound($domain) )
{
$pipe = $this->saveFavicon($pipe, true);
$pipe['favicon'] = 'NULL';
$pipe['favicon_local'] = $this->name_db;
} else if ( $this->defaultFound($domain) )
{
$pipe = $this->saveFavicon($pipe, true);
$pipe['favicon'] = 'NULL';
$pipe['favicon_local'] = $this->name_db;
} else if ( $this->pageFound($domain) && $this->linkFound() && $this->getFavicon($this->path_internet) )
{
$pipe = $this->saveFavicon($pipe);
$pipe['favicon'] = $this->path_internet;
$pipe['favicon_local'] = $this->name_db;
} else {
$pipe['favicon'] = 'NULL';
$pipe['favicon_local'] = 'image_generic.png';
}
$pipe['method'] = $this->domain_code . $this->favicon_code;
return $pipe;
}
/****************************************************************************************************
* defaultFound
****************************************************************************************************/
private function defaultFound ($domain) {
$default_location = 'http://www.' . $domain . '/favicon.ico';
if( $this->getFavicon( $default_location) )
{
$this->domain_code = 'default - 1';
$this->path_internet = $default_location;
return true;
}
return false;
}
/****************************************************************************************************
* googleAPIFound
****************************************************************************************************/
private function googleAPIFound ($domain) {
$favicon = @file_get_contents('https://plus.google.com/_/favicon?domain=' . $domain);
// remove this
$favicon64 = base64_encode($favicon);
if ( hash('md5', $favicon64) == '99fd8ddc4311471625e5756986002b6b' ) {
return false;
}
else {
$this->domain_code = 'google - 1';
$this->favicon_file = $favicon;
return true;
}
}
/****************************************************************************************************
* pageFound
****************************************************************************************************/
private function pageFound ($domain)
{
return $this->pageFoundGet($domain) || $this->pageFoundCurl($domain);
}
private function pageFoundCurl ($domain)
{
$types = array(
"curl - 1"=>$domain,
"curl - 2"=>'www.' . $domain,
"curl - 4"=>'https://www.' . $domain,
"curl - 3"=>'http://www.' . $domain,
"curl - 6"=>'https://' . $domain,
"curl - 5"=>'http://' . $domain
);
foreach ($types as $key => $value) {
$this->domain_file = $this->curlExec($value, true);
if ($this->domain_file)
{
$this->domain_code = $key;
return true;
}
}
return false;
}
private function pageFoundGet( $domain )
{
$types = array(
"file_get - 1"=>$domain,
"file_get - 2"=>'www.' . $domain,
"file_get - 3"=>'http://www.' . $domain,
"file_get - 4"=>'https://www.' . $domain,
"file_get - 5"=>'http://' . $domain,
"file_get - 6"=>'https://' . $domain
);
foreach ($types as $key => $value) {
if ($this->domain_file = $this->fileGetContents( $value ))
{
$this->domain_code = $key;
return true;
}
}
return false;
}
/****************************************************************************************************
* linkFound
****************************************************************************************************/
private function linkFound()
{
$domain = $this->real_domain;
$regex = '#<link\s+(?=[^>]*rel=(?:\'|")(?:shortcut\s)?icon(?:\'|")\s*)(?:[^>]*href=(?:\'|")(.+?)(?:\'|")).*>#i';
$link_found = preg_match( $regex , $this->domain_file, $matches );
if($link_found === 1)
{
$path = $matches[1];
if ( $path[0] === '/' && $path[1] === '/' )
{
$this->favicon_code = 'a';
$this->path_internet = 'http:' . $path;
}
else if( $path[0] === '/' )
{
$this->favicon_code = 'b';
$this->path_internet = 'http://www.' . $domain . $path;
}
else if ( substr($path, 0, 4) === 'http' )
{
$this->favicon_code = 'c';
$this->path_internet = $path;
}
else
{
$this->favicon_code = 'd';
$this->path_internet = 'http://www.' . $domain . '/' . $path;
}
}
else
{
return false;
}
return true;
}
/****************************************************************************************************
* getFavicon
****************************************************************************************************/
private function getFavicon($url)
{
return $this->getFaviconCurl($url) ;
}
private function getFaviconCurl($url)
{
$temp = $this->curlExec( $url, false );
if($temp === false)
{
return false;
}
if(strlen($temp) < 20)
{
return false;
}
$this->favicon_file = $temp;
return true;
}
/****************************************************************************************************
* saveFavicon
****************************************************************************************************/
public function saveFavicon( $pipe, $switch )
{
if ($switch) {
$this->ext = "ico";
} else {
$arr = parse_url($this->path_internet);
$this->ext = pathinfo($arr['path'], PATHINFO_EXTENSION);
}
$name = str_replace('.', '_', $this->domain);
if ($this->ext) {
$name = $name . "." . $this->ext;
}
file_put_contents($this->path_local_server . $name, $this->favicon_file);
$this->name_db = $name;
return $pipe;
}
/****************************************************************************************************
* wrapper functions
****************************************************************************************************/
private function curlExec ($url, $set)
{
$curl = curl_init();
curl_setopt_array($curl, array(
CURLOPT_URL => $url,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_FOLLOWLOCATION => true,
));
$temp = curl_exec($curl);
if ($set) {
$url = curl_getinfo( $curl )['url'];
$url = parse_url($url);
$url = $url['host'];
$this->real_domain = preg_replace('#^www\.(.+\.)#i', '1ドル', $url);
}
curl_close($curl);
return $temp;
}
private function fileGetContents ($value)
{
$opts = array(
'http'=>array(
'follow_location' => true,
'max_redirects' => 20
)
);
$context = stream_context_create($opts);
return @file_get_contents( $value, false, $context );
}
/****************************************************************************************************
* invokeDebug
****************************************************************************************************/
public function invokeDebug($pipe)
{
if ($pipe['domain'])
{
$pipe = $this->invoke($pipe);
echo "<br> domain | " . $pipe['domain'] . "";
echo "<br> domain_code | " . $this->domain_code;
echo "<br> favicon_code | " . $this->favicon_code;
}
if ($this->favicon_file)
{
echo "<br> favicon_file type | " . gettype($this->favicon_file);
echo "<br> favicon_file length | " . strlen($this->favicon_file);
echo "<br> favicon_file | " . $this->favicon_file;
$file64 = base64_encode($this->favicon_file);
echo "<br> <img src= 'data:image/" . $this->ext . ";base64," . $file64 . "'></img>";
}
}
}
-
\$\begingroup\$ tim has rolled back the last edit, because it invalidates the given answers. Please see what you may and may not do after receiving answers . \$\endgroup\$Vogel612– Vogel6122016年05月08日 21:06:37 +00:00Commented May 8, 2016 at 21:06
1 Answer 1
Not a full review, just a couple of things I noticed.
Security: XSS
You definitely want to fix the XSS vulnerability. The most obvious one is where you echo $pipe['domain']
, where $pipe
is just $_GET
. It's difficult to grasp if the other values, such as favicon_file
might possibly contain payloads as well, so just to be sure, I would encode them as well.
Security: Code Execution
It's a bit unclear what exactly favicon_file
might be. But in saveFavicon
, you use the original file extension, instead of using .ico
or at least checking the extension. So if an attacker managed to set favicon_file
to something like favicon.php
, you may be vulnerable.
It seems that this is currently a problem with the linkFound
function, which seems to take any file, regardless of extension, from a remote server. You would also need to trust that plus.google.com only returns .ico files, and that all future icon finder functions only return icons.
Functions
I find your function signatures somewhat confusing. For example, saveFavicon
accepts $pipe
, doesn't do anything with it, and then returns it.
It also has a $switch
value which is very unclear. Generally, boolean values are not ideal as they make functions harder to use. And what is switched?
The arguments for invoke
are also not that clear to me. It accepts $pipe
and then sets a whole bunch of values in this array, and returns it. So it seems that it's purpose is to set values in $pipe
. But really, it's main functionality seems to be the side-effect of downloading the favicon. Most of the set values are also not used (at least in your example).
Structure and Classes
First of, your class does a bit too much for my taste. It searches favicons (via multiple very different methods), it downloads them, and it saves them. Each could be handled by a different class.
What you want to do is save a favicon via X methods. Your invoke code is a bit hard to read though. I would probably create a FaviconFinder
interface (with methods such as findFavicon
or getFaviconInfo
, let GoogleFaviconFinder
, DefaultFaviconFinder
, etc implement it, then have a addFaviconFinder
method, which adds the finder. Then, invoke can iterate over them until one works.
I would also add even more classes. Eg:
$pipe
: This seems to be a magic array holding some input and output values. Arrays such as this are difficult to use (especially without proper documentation). I would probably create some kind of class instead, and give it a proper name (such asFaviconInfo
).favicon_code
: I'm not sure what this is used for, but some kind of class, or at least an array mapping the characters to properly named indices would be nice.- same goes for
domain_code
.
Naming
Many of your variables and function names could be more concrete. For example $pipe
or invoke
, $types
, $key
, $value
, $temp
, $set
, or $matches
are very vague.
-
\$\begingroup\$ I agree is is too big, but keeping for now while troulbeshooting - check rev 3 - I updated some of the names. \$\endgroup\$arc.slate .0– arc.slate .02016年05月08日 20:36:19 +00:00Commented May 8, 2016 at 20:36
-
\$\begingroup\$ please let me know why you did not up-vote. \$\endgroup\$arc.slate .0– arc.slate .02016年05月08日 20:38:04 +00:00Commented May 8, 2016 at 20:38
-
\$\begingroup\$ @arc.slate.0 I don't always upvote questions I answer (I do on the other hand upvote pretty much all answer to my questions). If your asking what could be improved about the question, a couple of things come to mind: the title (describe what the code does, not what you want out of a review), the missing description (just a sentence or maybe two on what your code does), don't ask about why something is broken (it seems off-topic to me), make a bit more clear what exactly you want out of a review (what validation do you mean, exactly? how/why do you want to check the integrity?, etc). \$\endgroup\$tim– tim2016年05月08日 20:45:54 +00:00Commented May 8, 2016 at 20:45
-
\$\begingroup\$ @arc.slate.0 See also How to get the best value out of Code Review - Asking Questions (in addition to the points I mentioned, it also mentions tags). Please also note What you may and may not do after receiving answers, I rolled back your edit. \$\endgroup\$tim– tim2016年05月08日 20:50:13 +00:00Commented May 8, 2016 at 20:50