I've got an array like this, which is the result of a Google Analytics request. I asked for the number of visits over the last three months.
$statsPerMonth
array (size=2)
  '08' => // The month (August)
    array (size=34)
      0 =>
        array (size=3)
          0 => string '08' (length=2) // Month again
          1 => string 'admin.testweb.fr' (length=19) // host
          2 => string '1' (length=1) // amount of visits
      1 =>
        array (size=3)
          0 => string '08' (length=2)
          1 => string 'audigie-espace-auto.reseau-fivestar.fr' (length=38)
          2 => string '6' (length=1)
      2 =>
        array (size=3)
          0 => string '08' (length=2)
          1 => string 'www.audigie-espace-auto.reseau-fivestar.fr' (length=31)
          2 => string '9' (length=1)
      3 =>
        array (size=3)
          0 => string '08' (length=2)
          1 => string 'carrosserie-abberis.reseau-fivestar.fr' (length=38)
          2 => string '7' (length=1)
  '07' =>
    array (size=47)
      0 =>
        array (size=3)
          0 => string '07' (length=2)
          1 => string 'www.anothersite.testweb.fr' (length=13)
          2 => string '1' (length=1)
      1 =>
        array (size=3)
          0 => string '07' (length=2)
          1 => string 'admin.testweb.fr' (length=16)
          2 => string '2' (length=1)
      2 =>
        array (size=3)
          0 => string '07' (length=2)
          1 => string 'admin.testweb.fr' (length=19)
          2 => string '1' (length=1)
      3 =>
        array (size=3)
          0 => string '07' (length=2)
          1 => string 'audigie-espace-auto.reseau-fivestar.fr' (length=38)
          2 => string '20' (length=2)
      4 =>
        array (size=3)
          0 => string '07' (length=2)
          1 => string 'www.admin.testweb.fr' (length=19)
          2 => string '1' (length=1)
This array holds the number of visits for my websites. You can see that the entries ['08'][1] and ['08'][2] refer to the same site (only the 'www.' prefix differs). I want to merge those entries and add their values, in order to get the total number of visits for a site across its two hostnames.
Consider $sites to be an array of Site objects (websites). The getHost() method returns the site's host without the 'www.' prefix, for example 'my-host.fr'. $statsPerMonth is the array described above.
Finally, consider this algorithm:
foreach ($statsPerMonth as $actualMonth => $stats) {
    foreach($sites as $site) {
        $siteHost = $site->getHost();
        foreach ($stats as $row) {
            if (strstr($row['1'], $siteHost)) {
                if(isset($globalStats[$actualMonth][$siteHost])) {
                    $globalStats[$actualMonth][$siteHost] = $globalStats[$actualMonth][$siteHost] + $row['2'];
                } else {
                    $globalStats[$actualMonth][$siteHost] = 0;
                    $globalStats[$actualMonth][$siteHost] = $globalStats[$actualMonth][$siteHost] + $row['2'];
                }
            }
            if(!isset($globalStats[$actualMonth][$siteHost])) {
                $globalStats[$actualMonth][$siteHost] = 0;
            }
        }
    }
}
This algorithm returns the $globalStats array in this form:
array (size=3)
  '08' =>
    array (size=43)
      'carrosserie-la-cascade.reseau-fivestar.fr' => int 1
      'audigie-espace-auto.reseau-fivestar.fr' => int 15
      'carrosserie-abberis-fivestar.fr' => int 16
      'carrosserie-arenales-jonathan.reseau-fivestar.fr' => int 0
  '07' =>
    array (size=43)
      'carrosserie-la-cascade.reseau-fivestar.fr' => int 2
      'audigie-espace-auto.reseau-fivestar.fr' => int 20
      'carrosserie-abberis-fivestar.fr' => int 0
      'carrosserie-arenales-jonathan.reseau-fivestar.fr' => int 4
  '06' =>
    array (size=43)
      'carrosserie-la-cascade.reseau-fivestar.fr' => int 0
      'audigie-espace-auto.reseau-fivestar.fr' => int 29
      'carrosserie-abberis-fivestar.fr' => int 0
      'carrosserie-arenales-jonathan.reseau-fivestar.fr' => int 4
This is exactly what I want but I think we can improve this algorithm to make it more efficient (because the arrays are big). Any ideas about how to make this algorithm better?
Comment (DFord, Sep 25, 2014 at 14:49): Are the sites returned in any specific order?
1 Answer
The innermost loop body can be simplified. Instead of this:
if (strstr($row['1'], $siteHost)) {
    if(isset($globalStats[$actualMonth][$siteHost])) {
        $globalStats[$actualMonth][$siteHost] = $globalStats[$actualMonth][$siteHost] + $row['2'];
    } else {
        $globalStats[$actualMonth][$siteHost] = 0;
        $globalStats[$actualMonth][$siteHost] = $globalStats[$actualMonth][$siteHost] + $row['2'];
    }
}
if(!isset($globalStats[$actualMonth][$siteHost])) {
    $globalStats[$actualMonth][$siteHost] = 0;
}
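I think this is equivalent for every site that has at least one matching row (the original additionally sets sites with no matching row to 0):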
if (strstr($row[1], $siteHost)) {
    if (!isset($globalStats[$actualMonth][$siteHost])) {
        $globalStats[$actualMonth][$siteHost] = 0;
    }
    $globalStats[$actualMonth][$siteHost] += $row[2];
}
(I also removed the quotes around the array indexes 1 and 2. PHP casts numeric string keys such as '1' to integers, so the quotes are not needed.)
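As a quick check of that point, here is a minimal sketch showing that PHP treats '1' and 1 as the same array key. The $row values are copied from the sample data in the question; the variable names $sameHost and $sameCount are only for illustration.

```php
<?php
$row = array('08', 'admin.testweb.fr', '1');

// String keys that look like decimal integers are cast to integer keys,
// so both spellings address the same element.
$sameHost = ($row['1'] === $row[1]);
$sameCount = ($row['2'] === $row[2]);

var_dump($sameHost, $sameCount);
```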
But the bigger improvement will be to cut out the foreach ($sites as $site) loop. Think about it: for each site on your list, you re-process all the rows for the given month. It would be better to process the stats only once. For this you will need a helper function getCanonicalSiteName that returns, for example, audigie-espace-auto.reseau-fivestar.fr for both audigie-espace-auto.reseau-fivestar.fr and www.audigie-espace-auto.reseau-fivestar.fr in the processed array. Something like this (untested):
foreach ($statsPerMonth as $actualMonth => $stats) {
    foreach ($stats as $row) {
        $siteHost = getCanonicalSiteName($row[1]);
        if (!isset($globalStats[$actualMonth][$siteHost])) {
            $globalStats[$actualMonth][$siteHost] = 0;
        }
        $globalStats[$actualMonth][$siteHost] += $row[2];
    }
}
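One possible shape for the helper, as a sketch rather than a definitive implementation: it assumes that the only difference between two hostnames of the same site is a leading "www." and letter case. The wrapper function aggregateVisits and its $knownHosts parameter are hypothetical names introduced here; $knownHosts stands for the list you would build from $site->getHost(), and is used to pre-fill zeros so that sites without visits still show up, as in your current output.

```php
<?php
// Hypothetical helper: lowercases the host and strips one leading "www.".
function getCanonicalSiteName($host)
{
    $host = strtolower(trim($host));
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }
    return $host;
}

// Hypothetical wrapper around the single-pass loop above.
function aggregateVisits(array $statsPerMonth, array $knownHosts = array())
{
    $globalStats = array();
    foreach ($statsPerMonth as $actualMonth => $stats) {
        // Start every known host at 0 so sites with no rows still appear.
        $globalStats[$actualMonth] = array_fill_keys($knownHosts, 0);
        foreach ($stats as $row) {
            $siteHost = getCanonicalSiteName($row[1]);
            if (!isset($globalStats[$actualMonth][$siteHost])) {
                $globalStats[$actualMonth][$siteHost] = 0;
            }
            // Analytics returns the counts as strings, so cast before adding.
            $globalStats[$actualMonth][$siteHost] += (int) $row[2];
        }
    }
    return $globalStats;
}
```

With the two August rows for audigie-espace-auto.reseau-fivestar.fr from the question (6 and 9 visits), this yields 15 for that host. Note that, unlike the original triple loop, hosts that are not in $sites (such as admin.testweb.fr) also end up in the result; if that is unwanted, they would need to be filtered out afterwards.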