I've written a blog site using Python and Flask. It works a little like Reddit, but instead of having some anonymous users, there are only anonymous users. This site allows users to add posts (after being filtered, of course). I'd like feedback on a couple things:
- Is there any way I can better filter user input for XSS? I only check if the beginning or ending of each word is < or >, which can probably be bypassed fairly easily.
- I store information about each post in a json file. Is this a good way to store this type of information?
As always, feedback in any other areas is accepted and considered.
File Structure
Blog
| __pycache__
| posts
...json files generated by users creating posts...
| templates
| index.html
| config.py (only thing in this file is SECRET_KEY)
| post.py
| server.py
server.py
"""
Main Module for running and managing the blog
"""
import random
import json
import os
from flask import Flask, render_template, request, flash, redirect, url_for
from config import SECRET_KEY
from post import Post
APP = Flask(__name__)
APP.secret_key = SECRET_KEY
POSTS = "posts/"
@APP.route("/", methods=["GET", "POST"])
def index():
    """
    Main Page, responsible for displaying posts and
    adding new posts
    """
    if request.method == "POST":
        content = request.form['content']
        new_post = Post(generate_user(), content)
        if new_post.postable():
            add_post(new_post)
        else:
            error = ""
            if new_post.contains_profanity():
                error += "Your post contains profanity! "
            if new_post.over_max_length():
                error += "Your post is over the max length! "
            if new_post.contains_html():
                error += "Your post contains html! "
            flash(error)
        return redirect(url_for("index"))
    return render_template("index.html", post=get_posts())


def get_posts() -> list:
    """
    Returns all the posts created
    """
    files = os.listdir(POSTS)
    posts = []
    for file in files:
        with open(f"{POSTS}{file}", "r") as user_post:
            data = json.load(user_post)
            posts.append(data)
    return posts


def add_post(new_post) -> None:
    """
    Adds a new post to the posts folder
    """
    data = {
        'USER': new_post.user,
        'UNIQUE ID': new_post.unique_id,
        'CONTENT': new_post.content,
        'DATE POSTED': new_post.date_created
    }
    with open(f"{POSTS}{new_post.unique_id}.json", "w") as file:
        json.dump(data, file, indent=4)


def generate_user() -> str:
    """
    Returns a random number for assigning to an anonymous user
    when they create a post
    """
    return f"Anonymous#{random.randint(1_000_000, 9_999_999)}"


if __name__ == '__main__':
    APP.run(debug=True, host="0.0.0.0", port=80)
post.py
"""
This module is for the sole purpose of containing the
Post class
"""
import random
import datetime
class Post():
    """
    Class for peoples posts
    """

    def __init__(self, user, content):
        self.user = user
        self.unique_id = self.generate_unique_id()
        self.content = content
        self.max_characters = 5000
        now = datetime.datetime.now()
        end = "PM" if now.hour >= 12 else "AM"
        self.date_created = f"{now.month}/{now.day}/{now.year} {now.hour}:{now.minute}:{now.second} {end}"

    def generate_unique_id(self) -> int:
        """
        Returns a unique id for this post
        """
        return random.randint(100_000_000, 999_999_999)

    def contains_profanity(self) -> bool:
        """
        Returns a boolean based on if there is profanity
        in the post. This checks against a VERY basic list
        of profanity
        """
        for word in self.content.split():
            if word.lower() in ["foo", "bar", "word"]:
                return True
        return False

    def over_max_length(self) -> bool:
        """
        Returns a boolean based on if the post exceeds the
        max limit allowed
        """
        return len(self.content) > self.max_characters

    def contains_html(self) -> bool:
        """
        Returns a boolean if there is any html in the
        post. This checks against a very basic list of
        html, ones that are most common for XSS
        """
        for word in self.content.split():
            if word[0] == "<" or word[-1] == ">":
                return True
        return False

    def postable(self) -> bool:
        """
        Uses all of these class methods to determine if
        this post is allowed to be posted
        """
        return not self.contains_html() and \
            not self.over_max_length() and \
            not self.contains_profanity()
index.html
<!DOCTYPE html>
<html lang="en-US">
<head>
<title>Blog</title>
<style type="text/css">
fieldset {
border-radius: 10px;
border-width: 5px;
}
input[type=text] {
border-color: yellow;
border-radius: 5px;
width: 150px;
height: 25px;
font-size: 17px;
}
body {
background: pink;
}
textarea {
min-height: 100px;
max-width: 500px;
min-width: 500px;
max-width: 1000px;
font-size: 16px;
border-radius: 3px;
}
button[type=submit] {
margin-top: 5px;
height: 40px;
width: 120px;
border-radius: 10px;
font-size: 20px;
background-color: yellow;
color: black;
letter-spacing: 1px;
border-color: black;
}
button[type=submit]:hover {
background-color: purple;
color: white;
cursor: pointer;
}
#user { font-size: 20px; }
#date { font-size: 15px; }
#content {
margin-top: 10px;
font-size: 14px;
}
</style>
</head>
<body>
<!-- Posts Start -->
{% for p in post %}
<fieldset>
<div id="user"><b>{{ p['USER'] }} - {{ p['DATE POSTED'] }}</b></div>
<div id="content">{{ p['CONTENT'] }}</div>
</fieldset>
{% endfor %}
<!-- Posts End -->
<!-- New Post Start -->
<form action="" method="post">
<fieldset>
<h3>Create New Post</h3>
<textarea placeholder="Enter Content Here" name="content"></textarea><br>
<button type="submit"><b>Post</b></button>
</fieldset>
</form>
<!-- New Post End -->
<!-- Errors Start -->
{% for message in get_flashed_messages() %}
{{ message }}
{% endfor %}
<!-- Errors End -->
</body>
</html>
How the JSON data is stored
{
"USER": "Anonymous#8147466",
"UNIQUE ID": 766866833,
"CONTENT": "This is my post!",
"DATE POSTED": "9/27/2019 12:50:7 PM"
}
Site after above post is made
<!DOCTYPE html>
<html lang="en-US">
<head>
<title>Blog</title>
<style type="text/css">
fieldset {
border-radius: 10px;
border-width: 5px;
}
input[type=text] {
border-color: yellow;
border-radius: 5px;
width: 150px;
height: 25px;
font-size: 17px;
}
body {
background: pink;
}
textarea {
min-height: 100px;
max-width: 500px;
min-width: 500px;
max-width: 1000px;
font-size: 16px;
border-radius: 3px;
}
button[type=submit] {
margin-top: 5px;
height: 40px;
width: 120px;
border-radius: 10px;
font-size: 20px;
background-color: yellow;
color: black;
letter-spacing: 1px;
border-color: black;
}
button[type=submit]:hover {
background-color: purple;
color: white;
cursor: pointer;
}
#user { font-size: 20px; }
#date { font-size: 15px; }
#content {
margin-top: 10px;
font-size: 14px;
}
</style>
</head>
<body>
<!-- Posts Start -->
<fieldset>
<div id="user"><b>Anonymous#8147466 - 9/27/2019 12:50:7 PM</b></div>
<div id="content">This is my post!</div>
</fieldset>
<!-- Posts End -->
<!-- New Post Start -->
<form action="" method="post">
<fieldset>
<h3>Create New Post</h3>
<textarea placeholder="Enter Content Here" name="content"></textarea><br>
<button type="submit"><b>Post</b></button>
</fieldset>
</form>
<!-- New Post End -->
<!-- Errors Start -->
<!-- Errors End -->
</body>
</html>
- Do you have an example usage with this? It isn't immediately clear to me how this would produce the intended result. – Mast, Sep 27, 2019 at 12:19
- @Mast I added the source code of the website after a post is made, if that makes anything more clear. – Ben A, Sep 27, 2019 at 16:53
- That's the result, but how is the function used? – Mast, Sep 27, 2019 at 17:32
- @Mast I'm not quite sure what you mean. What function are you inquiring about? – Ben A, Sep 27, 2019 at 21:23
- @dfhwze That's why I edited my question removing the profanity, with Mast providing some sample words instead. – Ben A, Sep 28, 2019 at 10:10
3 Answers
Date formatting
self.date_created = f"{now.month}/{now.day}/{now.year} {now.hour}:{now.minute}:{now.second} {end}"
should be
from datetime import datetime
...
datetime.now().strftime('%m/%d/%Y %I:%M:%S %p')
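To illustrate the difference, here is a small check with a fixed timestamp (chosen only for illustration, it is not from the question). The hand-built f-string keeps the 24-hour value and skips zero padding, while `%I` and `%p` give a padded 12-hour clock. The text produced by `%p` depends on the locale, so the second output assumes the default C/English locale.

```python
from datetime import datetime

# Fixed timestamp, purely illustrative: 27 Sep 2019, 13:05:07.
now = datetime(2019, 9, 27, 13, 5, 7)

# The question's approach: 24-hour value combined with a manual AM/PM suffix.
end = "PM" if now.hour >= 12 else "AM"
manual = f"{now.month}/{now.day}/{now.year} {now.hour}:{now.minute}:{now.second} {end}"

# strftime: %I is the zero-padded 12-hour clock, %p the AM/PM marker.
formatted = now.strftime('%m/%d/%Y %I:%M:%S %p')

print(manual)     # 9/27/2019 13:5:7 PM
print(formatted)  # 09/27/2019 01:05:07 PM (in a C/English locale)
```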
Generators for logic
for word in self.content.split():
    if word.lower() in ["phooey", "shucks", "rascal"]:
        return True
return False
can be
return any(
    word.lower() in {'shut', 'the', 'front', 'door'}
    for word in self.content.split()
)
Note the use of a set instead of a list for membership tests.
Boolean factorization
return not self.contains_html() and \
    not self.over_max_length() and \
    not self.contains_profanity()
can be
return not (
    self.contains_html() or
    self.over_max_length() or
    self.contains_profanity()
)
Inline styles
You should really consider removing your styles from the index head and putting them into a separate .css file. Among other things, it'll improve caching behaviour.
Your code looks okay to me. For your questions:
- Flask uses the Jinja template engine, which automatically escapes values rendered in .html templates. So {{ p['CONTENT'] }} is already protected against XSS, as long as you don't disable autoescaping or mark the value as safe. Writing your own XSS filter involves a lot of edge cases, so leave it to the template engine.
- For a toy app, storing data in a json file is fine. In a real production environment, you should use a proper database such as MySQL, because once you have a large amount of data, reading every file on each request becomes very slow.
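To make the first point concrete, here is a small sketch. `contains_html` is copied from the question as a free function, and the standard library's `html.escape` stands in for the escaping Jinja applies to `{{ ... }}` expressions (Jinja itself relies on MarkupSafe, whose output differs slightly for quote characters). The payload is a made-up example.

```python
import html


def contains_html(content: str) -> bool:
    """The question's check: flags words starting with '<' or ending with '>'."""
    for word in content.split():
        if word[0] == "<" or word[-1] == ">":
            return True
    return False


# Hypothetical payload: the tag sits in the middle of a single "word".
payload = "a<script>alert(1)</script>b"

print(contains_html(payload))  # False: the input filter is bypassed
print(html.escape(payload))    # a&lt;script&gt;alert(1)&lt;/script&gt;b
```

Because the escaping happens when the value is rendered, the payload shows up as plain text in the page even though the input filter missed it.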
I'd make the GET and POST routes separate functions. That keeps your project clearer as it grows.
Instead of error +=, I'd make errors a list and append to it. It may not be an obvious win in your case, but imagine you later want to separate the errors some other way, e.g. with HTML.
It would also be much simpler if postable served as the validation and returned the reasons why a post is not postable. Then there would be no need to run each check twice.
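One possible shape for this, as a rough sketch rather than a drop-in replacement: a single function that runs each check once and returns the reasons as a list. The name `validation_errors` and the module-level constants are assumptions for illustration; the checks and messages themselves are taken from the question.

```python
MAX_CHARACTERS = 5000
PROFANITY = {"foo", "bar", "word"}  # placeholder words from the question


def validation_errors(content: str) -> list:
    """Return the reasons a post is rejected; an empty list means postable."""
    errors = []
    words = content.split()
    if any(word.lower() in PROFANITY for word in words):
        errors.append("Your post contains profanity!")
    if len(content) > MAX_CHARACTERS:
        errors.append("Your post is over the max length!")
    if any(word[0] == "<" or word[-1] == ">" for word in words):
        errors.append("Your post contains html!")
    return errors


print(validation_errors("This is my post!"))  # []
print(validation_errors("foo <b>"))
# ['Your post contains profanity!', 'Your post contains html!']
```

The view can then save the post when the list is empty and otherwise flash each entry, or join them with whatever separator it needs.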
JSON keys that contain spaces are not JavaScript-friendly, so it would be better to use something like:
"datePosted": "2019-09-27T12:50:07"
Note the ISO format as well: with a front-end library like Moment.js, it will be easy to turn it into "N days ago", another timezone, or another format later.
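On the Python side this could look like the sketch below. Only `datePosted` comes from the suggestion above; the other camelCase key names are assumptions, and the values are the sample post from the question.

```python
import json
from datetime import datetime

# Fixed timestamp matching the sample post in the question.
created = datetime(2019, 9, 27, 12, 50, 7)

data = {
    "user": "Anonymous#8147466",
    "uniqueId": 766866833,
    "content": "This is my post!",
    "datePosted": created.isoformat(),  # '2019-09-27T12:50:07'
}

print(json.dumps(data, indent=4))

# The ISO string parses back without a custom format string.
parsed = datetime.fromisoformat(data["datePosted"])
```

A side benefit is that ISO strings sort chronologically, which helps if the posts ever need to be ordered by date.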
open(f"{POSTS}{file}", "r") is not good practice from a security point of view. Ensure the path that results from the concatenation still lies within the directory it is intended to be in. (See some answers here: https://stackoverflow.com/questions/6803505/does-my-code-prevent-directory-traversal )
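A minimal sketch of such a check, assuming a hypothetical helper named `safe_post_path` (it is not part of the question's code). In the posted code the file names come from os.listdir and a server-generated ID, so this matters mostly once a file name can be influenced by a request.

```python
import os

POSTS = "posts"  # same directory as in the question, without the trailing slash


def safe_post_path(filename: str) -> str:
    """Return the absolute path for filename, or raise if it escapes POSTS."""
    base = os.path.realpath(POSTS)
    candidate = os.path.realpath(os.path.join(base, filename))
    # commonpath equals base only when candidate is base itself or inside it.
    if candidate == base or os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"{filename!r} is not a file inside the posts directory")
    return candidate


print(safe_post_path("766866833.json"))  # absolute path ending in posts/766866833.json

try:
    safe_post_path("../config.py")
except ValueError as exc:
    print(exc)  # '../config.py' is not a file inside the posts directory
```

Resolving with `os.path.realpath` before comparing means that `..` segments and symlinks are taken into account, which a plain string prefix check would miss.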