PHP's realpath($path) is insufficient in some cases because it returns false for paths that don't exist on the filesystem.
I need a function that extends realpath($path) so that it also returns a path that doesn't exist yet. Such a function would be used to keep a user within their own filesystem directory and/or to restrict modifications to certain directories and files.
function DesiredRealPath($path_string) {
    // convert backslashes to forward slashes and split into the desired directories
    $desired_path = explode("/", str_replace("\\", "/", $path_string));
    // do the same for the current working directory (the actual directories)
    $real_path = explode("/", str_replace("\\", "/", realpath(".")));
    // if the path string begins with a slash, keep only the root of the actual directories
    if (mb_substr($path_string, 0, 1) == "/" || mb_substr($path_string, 0, 1) == "\\") {
        $real_path = array_slice($real_path, 0, 1); // "/" points to the root directory
    }
    foreach ($desired_path as $desired_element) {
        switch ($desired_element) {
            case "":
                break;
            case ".":
                break;
            case "..": // drop the last actual directory if at least 2 elements are left
                if (count($real_path) >= 2) {
                    array_pop($real_path);
                }
                break;
            default: // push the desired directory onto the actual directories
                array_push($real_path, $desired_element);
                break;
        }
    }
    return implode("/", $real_path); // join the actual directories back into a string
}
I have tested this on both Windows and Linux so far, and the function has determined the precise directory each time, even for nonexistent ones.
My question is: does this function have any flaws or security vulnerabilities?
1 Answer
Because your function allows path traversal, it doesn't restrict access even to system paths. To illustrate the danger, consider the examples below, where realpath('.') points to /web/users/alice:
DesiredRealPath('files/photos'); # It returns `/web/users/alice/files/photos` and this is OK
DesiredRealPath('/etc/passwd'); # It returns `/etc/passwd` and it’s very BAD
DesiredRealPath('../bob/photos'); # It returns `/web/users/bob/photos` and this is also BAD
To protect against a path traversal attack, you should ignore "dot" directories. For example, consider this function:
function basepath($rel_path)
{
    $base = str_replace('\\', '/', realpath('.'));
    $parts = explode('/', str_replace('\\', '/', $rel_path));
    foreach ($parts as $part) {
        // compare against '' explicitly so that a directory named "0" is not dropped
        if ($part !== '' && $part !== '.' && $part !== '..') {
            $base .= "/{$part}";
        }
    }
    return $base;
}
Testing the same paths as in the example above, none of them escapes the /web/users/alice directory:
basepath('files/photos'); # /web/users/alice/files/photos
basepath('/etc/passwd'); # /web/users/alice/etc/passwd
basepath('../bob/photos'); # /web/users/alice/bob/photos
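If silently dropping ".." is too lossy for your use case, an alternative is to resolve ".." as usual and reject any path that would climb above the base directory. The following is only a minimal sketch of that different technique, not part of the function above: resolveWithin() and its parameters are hypothetical names, the base directory is passed in explicitly instead of being read from realpath('.'), and everything happens at the string level, so symbolic links inside the base directory are not resolved.

```php
<?php
// Hypothetical helper (name and signature are assumptions for illustration).
// Resolves $relPath against $base without touching the filesystem and
// returns null when the result would leave $base.
function resolveWithin(string $base, string $relPath): ?string
{
    $base = rtrim(str_replace('\\', '/', $base), '/');
    $stack = [];
    foreach (explode('/', str_replace('\\', '/', $relPath)) as $part) {
        if ($part === '' || $part === '.') {
            continue; // empty segments and "." change nothing
        }
        if ($part === '..') {
            if ($stack === []) {
                return null; // would climb above $base, so reject
            }
            array_pop($stack);
            continue;
        }
        $stack[] = $part;
    }
    if ($stack === []) {
        return $base === '' ? '/' : $base;
    }
    return $base . '/' . implode('/', $stack);
}

$base = '/web/users/alice';
var_dump(resolveWithin($base, 'files/photos'));   // returns "/web/users/alice/files/photos"
var_dump(resolveWithin($base, 'files/../notes')); // returns "/web/users/alice/notes"
var_dump(resolveWithin($base, '/etc/passwd'));    // returns "/web/users/alice/etc/passwd"
var_dump(resolveWithin($base, '../bob/photos'));  // returns null
```

As with basepath(), a leading slash is treated as relative to the base directory. The difference is that an attempted escape is reported to the caller as null instead of being rewritten into a different path.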
For better security, make sure to configure the open_basedir directive correctly.
By the way, if you have a script in which you want to prevent path traversal, add the following at the top of the script:
ini_set('open_basedir', __DIR__);
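To see what the restriction does in practice, here is a small sketch, assuming the script is run from a file on disk and that /etc/passwd lies outside the script's directory (both are assumptions for this illustration). Once open_basedir is set, file functions refuse paths outside the allowed tree: PHP raises a warning and the call fails.

```php
<?php
// Restrict this script to its own directory tree.
ini_set('open_basedir', __DIR__);

// A path inside the allowed tree is still reachable.
$inside = is_readable(__FILE__);

// A path outside the tree is refused: PHP raises a warning (suppressed
// here with @) and file_get_contents() returns false.
$outside = @file_get_contents('/etc/passwd');

var_dump($inside);
var_dump($outside === false);
```

Keep in mind that open_basedir can only be tightened at runtime, not loosened again, so it limits everything the script does afterwards.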
open_basedir is an excellent point; I didn't know about it, and it seems suitable if the only restriction is a single path. DesiredRealPath() itself is not designed to impose any restrictions; its purpose is to tell the server which path is going to be tampered with before any real tampering takes place. – John Doe, Dec 17, 2018 at 10:56
How about return $base . preg_replace('~\.+/~', '', str_replace('\\', '/', $rel_path)); instead of looping? – mickmackusa, Sep 11, 2020 at 21:49
@mickmackusa The main problem is that it allows an escape to the parent directory if you pass /.. as input. It also doesn't properly handle input consisting only of dots, it doesn't trim and normalize slashes, and it doesn't properly join the base path and the relative path. Of course, we can get rid of the loop, but such a simple regular expression is not enough. I think the result won't make the code either more readable or faster. – Victor, Sep 12, 2020 at 6:46
So it also needs ltrim() with a ./ mask? – mickmackusa, Sep 12, 2020 at 7:01
@mickmackusa That is not enough: if you pass something like /X/.. the function should return /web/users/alice/X, not /web/users/alice/X/.. with the dots left in. It should also normalize inputs like X///X/// (in your case it returns something like /web/users/aliceX///X///, but it must be /web/users/alice/X/X). – Victor, Sep 12, 2020 at 7:14