In my case I am saving a URL into my database via Eloquent Model:
namespace App\Models;
use Illuminate\Database\Eloquent\Factories\HasFactory;
use Illuminate\Database\Eloquent\Model;
use Illuminate\Support\Str;
class PageLink extends Model
{
use HasFactory;
protected $table='links';
public function setUrlAttribute(string $url):void
{
$saneUrl = trim($url);
$saneUrl = filter_var($url, FILTER_SANITIZE_URL);
$saneUrl = preg_replace("/javascript:.*/","",$saneUrl);
$saneUrl = htmlspecialchars($url);
$saneUrl = Str::of($url)->stripTags()->toString();
/**
* Regex was found into https://stackoverflow.com/a/36564776
* The idea is to extract the url from attack vectors such as:
*
* "http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/>
*
* This allows me to make the entry safe
*/
preg_match_all('/https?\:\/\/[^\" ]+/i', $saneUrl, $match);
if(is_array($match[0]) && isset($match[0][0])){
$saneUrl=$match[0][0];
}elseif (is_string($match[0])){
$saneUrl=$match[0];
}else {
$saneUrl="";
}
$this->attributes['url']=$saneUrl;
}
}
And in my blade view it is rendered as:
<a href="{{$link->url}}">{{$link->url}}</a>
But also will be provided via AJAX API as well (that is under development). Therefore I want to sanitize my input upon a valid URL. I do not validate it here because my controller does this for me:
use Illuminate\Contracts\Validation\Validator;
use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;
Route::post("/url",function(Request $request){
$validator = Validator::create($request->all(),[
'url'=>"required|url"
],[
'url'=>"Δεν δώσατε έναν έγγυρο σύνδεσμο"
]);
if($validator->fails()){
return new JsonResponse(['msg'=>$validator->error()],400);
}
$link = new Link();
$link->url = $request->get('url');
try{
$link->save();
}catch(\Exception $e){
report($e);
return new JsonResponse(["msg"=>"Link fails to be saved"],500);
}
return new JsonResponse($link,201);
});
And here is a unit test for the model:
namespace Tests\Feature\Unit\Model;
use App\Models\Link;
use Illuminate\Foundation\Testing\RefreshDatabase;
use Illuminate\Foundation\Testing\WithFaker;
use Tests\TestCase;
class UserPageLinkTest extends TestCase
{
use RefreshDatabase;
public static function urlProvider(): array
{
return [
['http://example.com<script>alert("XSS");</script>'],
["http://example.com\" onfocus=\"alert(\"Hello\")\""],
["https://example.com\' onfocus='alert('Hello')'"],
["<script>alert(\"XSS\")</script>"],
["http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/>"],
["javascript:alert(String.fromCharCode(88,83,83))"]
];
}
/**
* @dataProvider urlProvider
*/
public function testUrlSanitizedFromXss($input): void
{
$model = new Link();
$model->url = $input;
$saneUrl = $model->url;
$this->assertNotEquals($saneUrl,$input);
}
}
The idea behind the code is to make a minimal/lightweight CMS that manages some fixed pages.
In my case could I omit any other filter and just use a regular expression pattern and still be safe from XSS attacks in the input?
4 Answers 4
The biggest problem here is that you are going overboard with precautions. Security is important, but it must be logical.
- What
$saneUrl = preg_replace("/javascript:.*/","",$saneUrl);
is for? Given below you will only allowhttps://
schema and thereforejavascript://
won't make it in anyway? - What
$saneUrl = htmlspecialchars($url);
is for? Given you would apply this sanitization with{{$link->url}}
anyway - What
$saneUrl = Str::of($url)->stripTags()->toString();
is for? What HTML tags you expect to remain after applyinghtmlspecialchars
? ;-)
Another problem is that you are trying to use deliberately wrong URLs. Even trying to "extract the URL from attack vectors" WHY? I am genuinely puzzled.
A generally accepted approach is to validate the data, and abort processing if validation fails. When you get an URL like this http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/>
, it shouldn't be processed at all. Return a 422 error or whatever and call it a day.
In your place I would do it like this:
public function setUrlAttribute(string $url):void
{
$url = trim($url);
$invalidUrl = !filter_var($url, FILTER_VALIDATE_URL);
$invalidScheme = !preg_match('~^http~', $url);
if ($invalidUrl || $invalidScheme) {
throw WhateverLaravelThrowsInSuchCase("Invalid url");
}
$this->attributes['url']=$url;
}
That's all.
-
\$\begingroup\$ In other words if any furtger sanitization needed I should do it upon output. (Eg. If I need to display it as Json). Also you say that my controller will catch the
http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/>
atack vector right? \$\endgroup\$Dimitrios Desyllas– Dimitrios Desyllas2024年02月27日 10:32:51 +00:00Commented Feb 27, 2024 at 10:32 -
2\$\begingroup\$ Yes, exactly. Any further sanitization should be done upon output. in case of JSON output it must be json_encode(). In case of HTML output it must be htmlspecualchars() (which is implemented by {{}} in Blade). Yes, this controller will catch
http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/>
because it's invalid url, andfilter_var($url, FILTER_VALIDATE_URL)
will return false. \$\endgroup\$Your Common Sense– Your Common Sense2024年02月27日 10:35:49 +00:00Commented Feb 27, 2024 at 10:35 -
\$\begingroup\$ Ok I note it dowm. \$\endgroup\$Dimitrios Desyllas– Dimitrios Desyllas2024年02月27日 10:38:25 +00:00Commented Feb 27, 2024 at 10:38
Beware of re-assigning a variable. In the setter method setUrlAttribute()
the variable $saneUrl
is over-written numerous times:
public function setUrlAttribute(string $url):void { $saneUrl = trim($url); $saneUrl = filter_var($url, FILTER_SANITIZE_URL); $saneUrl = preg_replace("/javascript:.*/","",$saneUrl); $saneUrl = htmlspecialchars($url); $saneUrl = Str::of($url)->stripTags()->toString();
Each line in that block except the third line uses $url
on the right hand side of the assignment. Perhaps the intention for each line after the first is to reuse $saneUrl
and if that is the case then each line before the last is useless because the last one uses the parameter variable instead of the the modified value from the previous lines.
As some general feedback, I would say that your unit tests are lacking. It looks like you are just checking that your filter function modifies the input in some way? One of the main jobs of tests are to encode in an objective and testable way what the function is supposed to do. Each test input should have a matching expected value to compare to.
In general, I'm pretty wary about "sanitizing" efforts. If your application does not like the input, it should reject it. Accepting user input but silently modifying it to something different just doesn't make sense. I think that goes doubly if you suspect that the intention is malicious. What is the scenario where you are willing to accept input from a user that you think is trying to subvert your application or attack your users?
Maybe you have reasons for doing it this way, but if it was me I would use FILTER_VALIDATE_URL
and have the function fail if it does not validate. You do have to make sure that you properly encode your urls when you write them out to avoid XSS, but that is always true.
-
\$\begingroup\$ So upon model I should assume that my controller does not do any sort of input validation and try also to validate my input. But should thid be done after sanitizing it? \$\endgroup\$Dimitrios Desyllas– Dimitrios Desyllas2024年02月27日 08:54:13 +00:00Commented Feb 27, 2024 at 8:54
For validation, I think you can create a custom validator in Laravel.
Depending on what you want to validate: host, path, query,... you can create and customize the validation you want, and you that with the validator and rules like other rules. Pretty neat!
In that way, it would do the validation you want along other validation like required or url.
JaVaScRiPt:
orjavajavascriptscript:
orjava<new line>script:
This is why @YourCommonSense's answer is so important. Otherwise you'll be playing an endless cat-and-mouse game. \$\endgroup\$