I'd like to have neater URLs, no more and no less: I'd like to be able to write /page/subpage instead of something like ?p=page&s=subpage.
I've looked at various PHP frameworks and routing classes to see how they did it. Since most of them come with features I don't need, I wanted to try making my own system using only what's absolutely necessary — indeed, I think this is about as basic as routing can get.
Above all, I'd like to know if I've missed something. The code works, but I'm sure there are edge cases I didn't account for (or security issues; this is PHP after all).
On a side note, is there a difference between "routing" and "URL rewriting"? The terminology isn't all that clear to me.
.htaccess
RewriteEngine on
RewriteBase /routing_test/
RewriteCond %{REQUEST_FILENAME} !-l
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [QSA,L]
inc/functions/routing.php
<?php
define('DEFAULT_PAGE_NAME', 'default');
function getPageName($level)
{
    $requestParams = explode('/', $_SERVER['REQUEST_URI']);
    $scriptPath = explode('/', $_SERVER['SCRIPT_NAME']);
    // remove the base path
    while ($requestParams[0] === $scriptPath[0])
    {
        array_shift($requestParams);
        array_shift($scriptPath);
    }
    if ($level === 0)
    {
        return $requestParams[0] ?: DEFAULT_PAGE_NAME;
    }
    return isset($requestParams[$level]) ? $requestParams[$level] : '';
}
As you can see, it's really only one function that does the work: getPageName($level) gives the name of the "directory" at the specified level.
What you do with those is up to you — in this case, I'm simply including a file with that name.
index.php
<?php
define('BASE_PATH', '/routing_test');
require_once 'inc/functions/routing.php';
?>
<html>
    <head>
        <title>Routing Test</title>
    </head>
    <body>
        <div class="main">
            <?php
            $pageName = getPageName(0);
            $subpageName = getPageName(1);
            if (file_exists($file = 'inc/content/' . $pageName . '.php'))
            {
                include $file;
            }
            else
            {
                echo 'not found';
            }
            ?>
        </div>
    </body>
</html>
inc/content/default.php
<h1>This is the default page</h1>
inc/content/users.php
<?php
if ($subpageName === '')
{
    echo 'Displaying all users.' . '<br />';
}
else
{
    $userId = $subpageName;
    echo 'Displaying user #' . $userId . '.<br />';
}
?>
<a href="<?php echo BASE_PATH . '/' . DEFAULT_PAGE_NAME; ?>">go back</a>
In this example, opening /routing_test/users would show "Displaying all users.", while /routing_test/users/123 would show "Displaying user #123.".
As I said, I'm mainly looking for advice on best practices, security issues and correctness (pretty much exactly the points on the tour page), though general feedback is always welcome.
2 Answers
On a side note, is there a difference between "routing" and "URL rewriting"? The terminology isn't all that clear to me.
Yes. Generally, "routing" describes the process of mapping URLs to code in some form. The standard example would be mapping a URL to a method of a controller.
On the other hand, "rewriting" doesn't map a URL to code, but maps a URL to a different URL.
[But both terms are often not 100% clearly defined, and are sometimes used interchangeably]
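Security: DoS
If I visit your index.php directly, without giving any parameters, I get an infinite loop in getPageName. In that case REQUEST_URI and SCRIPT_NAME explode to identical arrays, so the loop shifts both until they are empty; after that, $requestParams[0] and $scriptPath[0] are both null and the condition stays true.
You shouldn't need a while loop here; removing the script path (without the index.php file name) from the start of the request path should do the same thing.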
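One way to do this without a loop is sketched below. The function name getPathSegments and its two parameters are hypothetical (the original reads $_SERVER directly); passing the values in just makes the behaviour easy to check. It strips the script's directory once and treats a direct request for index.php as "no segments".

```php
<?php
// Hypothetical replacement for the while loop; not part of the original code.
function getPathSegments(string $requestUri, string $scriptName): array
{
    // Keep only the path part; the query string is not part of the route.
    $path = explode('?', $requestUri, 2)[0];

    // Directory of the front controller, e.g. "/routing_test" ("" at the web root).
    $base = rtrim(str_replace('\\', '/', dirname($scriptName)), '/');

    // Strip the base directory once, and only on a whole-segment match.
    if ($base !== '' && strncmp($path . '/', $base . '/', strlen($base) + 1) === 0) {
        $path = (string) substr($path, strlen($base));
    }

    // Split into segments and drop the empty ones produced by leading or doubled slashes.
    $segments = array_values(array_filter(explode('/', $path), function ($s) {
        return $s !== '';
    }));

    // A direct request for the script itself counts as the default page.
    if ($segments !== [] && $segments[0] === basename($scriptName)) {
        array_shift($segments);
    }

    return $segments;
}
```

With this, getPathSegments($_SERVER['REQUEST_URI'], $_SERVER['SCRIPT_NAME']) always terminates, and the page and subpage are simply elements 0 and 1 of the result, if present.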
Security: Directory Traversal
Your index.php file is likely open to directory traversal and LFI on Windows (I don't have a Windows machine to test right now, but using \ as the path separator should work). With current PHP versions, it is restricted to PHP files, but it should probably still be fixed.
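A minimal sketch of one possible fix, assuming page names only ever need letters, digits, underscores and hyphens. The helper name resolveContentFile and the $contentDir parameter are made up for this illustration.

```php
<?php
// Hypothetical helper: returns the file to include for a page name, or null.
function resolveContentFile(string $pageName, string $contentDir): ?string
{
    // Reject anything containing dots, slashes, backslashes or null bytes
    // by allowing only a small, explicit character set.
    if (preg_match('/\A[A-Za-z0-9_-]+\z/', $pageName) !== 1) {
        return null;
    }

    $file = rtrim($contentDir, '/') . '/' . $pageName . '.php';

    // is_file() also excludes directories, unlike file_exists().
    return is_file($file) ? $file : null;
}
```

In index.php this would replace the file_exists() check: include the returned path if it is not null, otherwise print "not found".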
Correctness: URL encoding
Because you use REQUEST_URI, the values you get will be URL encoded. So if I visit /users/foo"bar, I would get "Displaying user #foo%22bar." instead of the expected result. For the example this may be fine, but it will likely cause bugs in the future.
Do note though that the URL encoding is currently all that protects you from XSS (and that the encoding happens client-side, so you should not necessarily trust it).
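As a rough sketch of handling both points: decode each segment after splitting on "/", and escape it when writing it into HTML. The helper names decodeSegment and e are hypothetical; rawurldecode() and htmlspecialchars() are standard PHP functions.

```php
<?php
// Hypothetical helpers, named for this illustration only.
function decodeSegment(string $segment): string
{
    // rawurldecode() decodes %XX sequences but leaves "+" untouched,
    // which is the right behaviour for a path segment.
    return rawurldecode($segment);
}

function e(string $value): string
{
    // Escape for an HTML context so decoded input cannot inject markup.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$userId = decodeSegment('foo%22bar');              // foo"bar
echo 'Displaying user #' . e($userId) . '.';       // Displaying user #foo&quot;bar.
```

Note that a decoded value can contain slashes, backslashes or dots again, so it should be validated before it is used in a file path.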
Approach
If it's just about mapping /page/subpage to ?p=page&s=subpage, I would probably use Apache URL rewriting exclusively (I'm no expert, but it should certainly be possible).
If you want the URL routing functionality, I would probably go with a more extended approach, which lets me define a whitelist of allowed URLs, and maps them to a controller method.
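To illustrate the whitelist idea, here is a minimal sketch in which every reachable page is listed explicitly and mapped to a handler. The dispatch function and the $routes table are assumptions for this example; closures stand in for controller methods, and the handlers return plain text that would still need HTML escaping on output.

```php
<?php
// Hypothetical dispatcher: only pages present in $routes can ever be reached.
function dispatch(array $routes, array $segments): string
{
    $page = $segments[0] ?? 'default';

    if (!isset($routes[$page])) {
        return 'not found';
    }

    // Pass the remaining segments to the handler as its arguments.
    return $routes[$page](array_slice($segments, 1));
}

$routes = [
    'default' => function (array $args): string {
        return 'This is the default page';
    },
    'users' => function (array $args): string {
        if ($args === []) {
            return 'Displaying all users.';
        }
        // Accept numeric IDs only; anything else is treated as unknown.
        if (preg_match('/\A\d+\z/', $args[0]) !== 1) {
            return 'not found';
        }
        return 'Displaying user #' . $args[0] . '.';
    },
];
```

Because the page name is only ever used as an array key, it never reaches the file system, which sidesteps the traversal problem entirely.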
- This is precisely the kind of feedback I was hoping for; thank you! A real eye-opener, too, when it comes to unanticipated cases. (Though, would you mind elaborating how one can manipulate the URL to include any file from the server, even external ones?) If no one else answers, I'll probably accept yours in a bit. – vvye, Mar 30, 2016 at 18:12
- Yes, it is possible with .htaccess alone. – Ismael Miguel, Mar 30, 2016 at 18:17
- @vvye /page\..\..\..\..\..\..\..\somefile/subpage would (probably; again, I don't have Windows to test right now) include somefile.php in the root directory, which is outside the directory you want to include from. This may let an attacker control the control flow of the application and may lead to a number of problems: possible authentication bypass, DoS, etc. If you are using an old version of PHP, an attacker may also try to cut off the trailing .php via %00 (null byte poisoning) or via a number of trailing slashes (path truncation) and thus include any file. – tim, Mar 30, 2016 at 18:28
Your function currently searches for files in a specific directory.
This is good, but it is neither routing nor URL rewriting. At most, I would call it very rudimentary routing.
Consider the following examples:
http://domain.com/show/ - list all users
http://domain.com/show/124 - list specific user (with ID = 124)
The second one cannot be done with your router.
URL Rewriting
URL rewriting is what you get with something like Apache's mod_rewrite.
Using it, you instruct the web server to invoke, say, /users.php or /users.php?id=124.
Routing
Routing, on the other hand, is usually not part of the web server.
It is written in PHP, in a way similar to what you did. There is no web server configuration except the .htaccess you are using. Some servers, such as the web server embedded in PHP, even omit any configuration: they redirect everything to index.php.
Some frameworks provide routers that support just this form:
http://domain.com/controller/method/?query=string
Others provide much richer configuration.
Here is a brief example of a router I implemented for fun 3-4 years ago:
https://github.com/nmmmnu/pfc
protected function factoryRouter(\injector\Injector $injector){
    $ns = __NAMESPACE__ . "\\" . "controllers" . "\\";
    $inj = $injector;
    $r = new \pfc\Framework\Router();
    $r->bind("/", new \pfc\Framework\Route(new \pfc\Framework\Path\Exact("/"), new \pfc\Framework\Controller\Template("home.html.php", array("utf8_test" => "Здравейте, München, Français")) ));
    $r->bind("/complex", new \pfc\Framework\Route(new \pfc\Framework\Path\Exact("/complex"), new \pfc\Framework\Controller\Action($inj, $ns . "MyController::complex") ));
    $r->bind("/json", new \pfc\Framework\Route(new \pfc\Framework\Path\Exact("/json"), new \pfc\Framework\Controller\Action($inj, $ns . "MyController::json") ));
    $r->bind("/show", new \pfc\Framework\Route(new \pfc\Framework\Path\Exact("/show"), new \pfc\Framework\Controller\Action($inj, $ns . "MyController::show") ));
    $r->bind("/show/x", new \pfc\Framework\Route(new \pfc\Framework\Path\Mask("/show/{id}"), new \pfc\Framework\Controller\Action($inj, $ns . "MyController::showDetails") ));
    $r->bind("/404", new \pfc\Framework\Route(new \pfc\Framework\Path\CatchAll("/"), new \pfc\Framework\Controller\Redirect("/") ));
    return $r;
}
Note especially /show/{id}: it matches /show/<any text here>, extracts the <any text here> part and sends it to the controller.
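For readers wondering how such a mask could be matched, here is a small standalone sketch. It is not taken from pfc and makes no claim about how that library implements Path\Mask; the function name matchMask is invented for this example. It compares mask and path segment by segment and collects the {name} placeholders.

```php
<?php
// Hypothetical helper, unrelated to pfc's own implementation.
// Returns the captured parameters on a match, or null if the path does not match.
function matchMask(string $mask, string $path): ?array
{
    $maskParts = explode('/', trim($mask, '/'));
    $pathParts = explode('/', trim($path, '/'));

    // A mask only matches paths with the same number of segments.
    if (count($maskParts) !== count($pathParts)) {
        return null;
    }

    $params = [];
    foreach ($maskParts as $i => $part) {
        if (preg_match('/\A\{(\w+)\}\z/', $part, $m) === 1) {
            // Placeholder segment: it must not be empty; store the decoded value.
            if ($pathParts[$i] === '') {
                return null;
            }
            $params[$m[1]] = rawurldecode($pathParts[$i]);
        } elseif ($part !== $pathParts[$i]) {
            // Literal segment: it must match exactly.
            return null;
        }
    }

    return $params;
}
```

For example, matchMask('/show/{id}', '/show/124') yields ['id' => '124'], while '/show' on its own does not match that mask and would fall through to the exact "/show" route.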