I made a routing system for PHP inspired by Symfony's router composed of a few classes.
First I am using Symfony's HTTP Foundations component. Then, I am emulating the classes in the routing component but with almost completely my own implementation (I did copy few lines).
The whole system is not simple so I won't be making it the focus of the question, though, this is the GitHub link, and I would be grateful for a full review (rip me apart).
I will provide the class that matches and parses the routes and I would like to know what I can improve.
There is a parser class:
<?php
namespace Routing;
/**
* Takes an instance of Routing\Route object
* Extracts the variables from wildcards
* Calculates a regex that can be used to match routes with urls
*/
class RouteParser
{
/**
* Asks for a route, extracts info that can be used later
*
* @param Route Routing/Route
*
* @return array Array with parsed values
*/
public function parseRoute(Route $route)
{
$variables = array();
$parsed = self::parse($route->getPath());
return ['variables' => $parsed['variables'], 'matcherReg' => $parsed['regex']];
}
/**
* Takes a string pattern, matches it with regexes
*
* @param string The pattern
*
* @return array Array with parsed values
*/
private static function parse($pattern)
{
$matches = '';
$variables = array();
$pos = 0;
$reg = '#'; //It seems that regex must start and end with a delimiter
$nextText = '';
if($pattern == '/')
{
$reg = '#^[\/]+$#';
return ['variables' => '', 'regex' => $reg];
}
//Check if generated regexes are stored, if so it skips the whole process
if(apc_exists($pattern))
{
$cacheI = apc_fetch($pattern);
return $cacheI;
}
//Extracts the variables enclosed in {}
preg_match_all('#\{\w+\}#', $pattern, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
//Puts each variable in array
//Uses the text before and after to create a regex for the rest of the pattern - $precedingText, $nextText
//If no wildcard is detected in the path it splits it into segments and compiles are regex
foreach ($matches as $match)
{
$varName = substr($match[0][0], 1, -1);
$precedingText = substr($pattern, $pos, $match[0][1] - $pos);
$pos = $match[0][1] + strlen($match[0][0]);
$nxt = $pos - strlen($pattern);
if($nxt == 0) $nxt = strlen($pattern);
$nextText = substr($pattern, $nxt);
$precSegments = explode('/', $precedingText);
$precSegments = array_splice($precSegments, 1);
//Pulls a regex from the preeceding segment, each variable segment is replaced with '\/[a-zA-Z0-9]+'
if(strlen($precedingText) > 1)
{
foreach($precSegments as $key => $value) {
$reg .= '\/';
$reg .= $value;
}
$reg .= '[a-zA-Z0-9]+';
}
else
{
$reg .= '\/[a-zA-Z0-9]+';
}
$nextText = str_replace('/', '\/', $nextText);
if(is_numeric($varName)) {
throw new \Exception('Argument cannot be a number');
}
if (in_array($varName, $variables)) {
throw new \Exception(sprintf('More then one occurrence of variable name "%s".', $varName));
}
$variables[] = $varName;
}
//If no variable names, wildcards are found in pattern : /hello/static/path it will replace it with \/hello\/static\/path
if(count($matches) < 1 && $pattern != '/')
{
$reg .= str_replace('/', '\/', $pattern);
}
$reg = $reg . $nextText;
$reg .= '#';
apc_store($pattern, ['variables' => $variables, 'regex' => $reg]);
return ['variables' => $variables, 'regex' => $reg];
}
}
and the matcher class:
<?php
namespace Routing;
use Symfony\Component\HttpFoundation\Request;
class Matcher
{
/**
* @var RouteCollection
*/
private $routes;
/**
* @var Symfony\Component\HttpFoundation\Request
*/
private $request;
/**
* Constructor.
*
* @param RouteCollection $routes A RouteCollection instance
* @param Symfony\Component\HttpFoundation\Request A Symfony Request
*/
public function __construct(Request $request, RouteCollection $collection)
{
$this->routes = $collection;
$this->request = $request;
}
public function matchRequest()
{
$return = array();
foreach ($this->routes->all() as $name => $route)
{
if(preg_match($route['parsed']['matcherReg'], $this->request->getPathInfo()))
{
if(!in_array($this->request->getMethod(), $route['route']->getHttpMethods()))
{
throw new \Exception(sprintf('Method "%s" not allowed', $this->request->getMethod()), 1);
}
//var_dump($this->request);
return [
'_vars' => $route['parsed']['variables'],
'_controller' => $route['route']->getController(),
'_varValues' => self::realVariables($route['route']->getPath(), $this->request->getPathInfo())
];
}
}
throw new \Exception(sprintf('Route for "%s" not found', $this->request->getPathInfo()), 1);
}
private static function realVariables($routePath, $pathInfo)
{
$i = 0;
$poss = [];
$vars = [];
$routeSegs = explode('/', $routePath);
$segs = explode('/', $pathInfo);
foreach ($routeSegs as $key => $value) {
if(preg_match('#\{\w+\}#', $value))
{
$poss[] = $i;
}
$i++;
}
$segs = array_splice($segs, 1);
foreach ($poss as $key => $index) {
$vars[] = $segs[$index];
}
return $vars;
}
}
They are used in an index.php
, as in this excerpt:
<?php
$request = Request::createFromGlobals();
$response = new Response();
$routes->add('name', new Route(['GET'], 'hi/{p1}/cicki/{p2}', function($p1, $p2) use ($response) {
$response->setContent($p1 . ' - ' . $p2);
}));
try
{
$urlMatcher = new Matcher($request, $routes);
$rez = $urlMatcher->matchRequest();
}
catch(Exception $e) {echo $e;}
call_user_func_array($rez['_controller'], $rez['_varValues']);
$response->send();
So it basically takes a path pattern, with static strings or parameters enclosed in {}, extracts the parameters and generates a regex to be compared to the URL. If it matches then it returns the parameters, their values, the controller (a PHP closure) and it doesn't yet support optional parameters.
Edit: You may notice that I am caching my regexes with apc, so I would like to underline that after a route removal I am not invalidating the cache(A feature I still have to work on).
1 Answer 1
I apologize that nobody has reviewed this code in six years since it was posted. Perhaps you have learned a few things since then but hopefully the review below will still be helpful.
Array syntax
I like that the code takes advantage of the PHP 5.4 feature of short array syntax, though not in every array declaration - e.g. in RouteParser::parseRoute()
the first declaration uses the legacy format while the return
statement includes the shorthand format. There is nothing wrong with this but if this was my code then I would make it consistent.
Docblock format
While it may not exactly have a standard the widely accepted common docblock syntax contains no indentation for inner lines. Most of the docblocks used in the code contain extra indentation levels - e.g.
/** * Takes an instance of Routing\Route object * Extracts the variables from wildcards * Calculates a regex that can be used to match routes with urls */
Typically those would not have extra indentation - so the asterisks on each line are aligned vertically - e.g.
/**
* Takes an instance of Routing\Route object
* Extracts the variables from wildcards
* Calculates a regex that can be used to match routes with urls
*/
New lines for all blocks
Adhering to a standard is not required but it does help anyone reading the code. Many PHP developers follow PSRs like PSR-12. While it wasn't approved until 2019, PSR-12 replaced PSR-2 which was approved in 2012.
I have grown accustom to starting class and method blocks on a new line (per PSR-12 section 4.4) but doing so for other control structures like loops and conditionals feels a bit excessive and is not in-line with PSR-12 section 5.
There MUST be one space between the closing parenthesis and the opening brace
Unused variable
In the method RouteParser::parseRoute()
the variable $variables
is declared but never used. It can be eliminated without effecting anything. This will help avoid confusion for anyone reading the code in the future - which may include yourself.
Method length and simplification
The method RouteParser::parse()
is a bit on the long side. Some of the variables declared declared at the beginning are not used in the early returns and could be declared after those conditional returns - e.g. $matches
, $variables
, $pos
, $nextText
. Removing new lines before opening braces will decrease the length of that method.
Instead of splicing together substrings, a simpler approach would be to use str_replace()
to simply replace what needs to be replaced. The sample below uses positive look-behind and positive look ahead to not capture the curly brackets - this allows those matches to be added directly to the list of variables.
$variables = array();
$reg = '#' . str_replace('/', '\\\\/', $pattern) . '#';
preg_match_all('#(?!\{)\w+(?=\})#', $pattern, $matches);
if (count($matches)) {
$counts = array_count_values($matches[0]);
$duplicates = array_filter($counts, function ($count) {
return $count > 1;
});
if (count($duplicates)) {
throw new \Exception(sprintf('More than one occurrence of variable(s): "%s".', implode(',',array_keys($duplicates))));
}
foreach ($matches[0] as $match) {
$reg = str_replace("{" . $match . "}", '[a-zA-Z0-9]+', $reg);
$variables[] = $match;
}
}
apc_store($pattern, ['variables' => $variables, 'regex' => $reg]);
return ['variables' => $variables, 'regex' => $reg];
And the foreach
loop could also be replaced with a single call to preg_replace_callback()
though it would need to add the matches to $variables
within the callback and some may find that impure for a functional approach.
$reg = preg_replace_callback('#\{\w+\}#', function($matches) use (&$variables) {
$varName = substr($matches[0], 1, -1);
if(is_numeric($varName)) {
throw new \Exception('Argument cannot be a number');
}
if (in_array($varName, $variables)) {
throw new \Exception(sprintf('More then one occurrence of variable name "%s".', $varName));
}
$variables[] = $varName;
return '[a-zA-Z0-9]+';
}, $reg);
It seems that when the path is 'hi/{p1}/cicki/{p2}'
then the regex pattern returned is '#\\/[a-zA-Z0-9]+\\/cicki\\/[a-zA-Z0-9]+#'
(see this demo)- it seems like that should contain the 'hi'
at the beginning but maybe you have a reason to remove that.