I am currently in the process of writing a system application that will be operated by multiple users. This will require individual users to access the application while limiting access to those that shouldn't be using it. The application will have several areas of access, which all users will not have access to. At this time, it may change in the future, the application is more or less a proof of concept and will be storing user information in a DB.
What I have here is what I'm calling the Operator Library. This part of my code is strictly how to access the portion of the DB that interacts with the users table. There will be other libraries that will be written to access the DB related to specific tables in a very similar manner to the below.
UPDATE:
looking at the documentation, it seems they recommend using a with statement for handling opening and using individual sessions. I've also added some simple exception handling.
Anything else I should be adding?
DBOperatorLibary.py
from Models import Operator, engine
from sqlalchemy.orm import sessionmaker
local_session = sessionmaker(autocommit=False, autoflush=False, bind=engine)
def create_operator(username, password, privilage):
ret = None
with local_session() as session:
operator = Operator(username=username, privilage=privilage)
if operator.password == "":
operator.password = password
session.add(operator)
try:
session.commit()
session.refresh(operator)
ret = operator
except Exception as e:
ret = e.args
return ret
def get_operators():
with local_session() as session:
operators = session.query(Operator).all()
return operators
def update_operator(operator_id, username, password, privilage):
ret = None
with local_session() as session:
operator = session.query(Operator).get(operator_id)
operator.username = username
operator.password = password
operator.privilage = privilage
try:
session.commit()
session.refresh(operator)
ret = operator
except Exception as e:
ret = e.args
return ret
def delete_operator(operator_id):
ret = None
with local_session() as session:
operator = session.query(Operator).get(operator_id)
session.delete(operator)
try:
session.commit()
ret = operator
except Exception as e:
ret = e.args
return ret
def validate_operator_password(username, password):
with local_session() as session:
operator = (
session.query(Operator)
.filter(Operator.username == username)
.first()
)
if operator is None:
return False
return operator.check_password(password)
if __name__ == "__main__":
...
Models.py
from BaseModel import Base
from OperatorModel import Operator
from RecordModels import Record
from sqlalchemy import create_engine
engine = create_engine("sqlite:///db/application.db", echo=True)
Base.metadata.create_all(bind=engine)
BaseModel.py
from sqlalchemy.orm import DeclarativeBase, MappedAsDataclass
class Base(MappedAsDataclass, DeclarativeBase):
"""subclasses will be converted to dataclasses"""
OperatorModel.py
from datetime import datetime
from BaseModel import Base
from sqlalchemy import func
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import Mapped, mapped_column
from werkzeug.security import check_password_hash, generate_password_hash
class Operator(Base):
__tablename__ = "operators"
oid: Mapped[int] = mapped_column(primary_key=True, init=False)
created_on: Mapped[datetime] = mapped_column(
init=False, server_default=func.now()
)
username: Mapped[str]
privilage: Mapped[str]
_password: Mapped[str] = ""
@hybrid_property
def password(self):
return self._password
@password.setter
def password(self, password):
self._password = generate_password_hash(password)
return self._password
def check_password(self, password):
return check_password_hash(self._password, password)
def __repr__(self):
return f"User Record {self.oid}\
\n\tUsername: {self.username} \
\n\tPrivilage: {self.privilage} \
\n\tCreated On: {self.created_on}"
-
1\$\begingroup\$ I have rolled back Rev 5 → 3. Please see What should I do when someone answers my question? \$\endgroup\$Sᴀᴍ Onᴇᴌᴀ– Sᴀᴍ Onᴇᴌᴀ ♦2023年12月05日 19:42:37 +00:00Commented Dec 5, 2023 at 19:42
1 Answer 1
The with
looks good.
operator = Operator(username=username, privilage=privilage)
nit: Please use conventional spelling of "privilege".
type stability
ret = operator
except Exception as e:
ret = e.args
I'm not crazy about the Public API offered by create_operator
, there.
You have created a "difficult to consume" API;
caller must distinguish between those two cases.
When there's many call sites, at least one of them likely is buggy,
failing to properly test it.
Better to let that exception bubble up the call stack.
This pattern repeats throughout the code base, bloating it up.
Better to raise
at those other locations, as well.
enforce a password policy
if operator.password == "":
I don't understand this. How did that row come to appear in the database? Delete such rows. Require that operators shall use credentials for access.
sensible credentials
Please delete that password
column.
If (when) an attacker compromises your server,
the release of plaintext passwords will cause
catastrophic harm to you, your users, and to other sites they use.
Use argon2id()
to turn each password into a (salted) hash, and store that
in a new credential
column.
blind query
Consider renaming get_operators
to get_all_operators
.
I was a little surprised by the absence of parameter(s) fed to a WHERE clause.
extra refresh
def update_operator...
...
session.refresh(operator)
What do we believe the refresh() did there? Looks like "nothing", to me.
rename column
class Operator(Base):
...
def check_password(self, password):
return check_password_hash(self._password, password)
nit: Please use conventional spelling of "library".
Back in DBOperatorLibary your identifiers were telling me that we're storing cleartext passwords. But here we are passing around a hash. Please use identifiers that tell the truth.
Using pbkdf2 is reasonably good, though argon2id would be better (slower for attackers to brute force it).
Your @password.setter
uses especially confusing
identifiers, given that same word is used to
represent two radically different concepts.
EDIT
truthful identifiers
Two classic issues in software engineering are
- "comments lie!", and
- "identifiers lie!"
No one wants to read a two-sentence # comment
which
used to speak the truth about an algorithm,
back before requirements changed we altered the code
to have brand new behavior.
No one wants to read code like this all-too-typical example:
def mean(nums: list[int]) -> float:
avg = 0 # accummulator, a running sum
for num in nums:
avg += num
avg /= len(nums)
return avg
That variable is clearly in no way an "average" until the very end. So we can't reason about loop variants in terms of a mathematical average, we must go through mental gymnastics to ignore what our eyes are showing us and read that as "accummulator" or "running sum" within the loop.
Much better to phrase it like this:
def mean(nums: list[int]) -> float:
acc = 0
for num in nums:
acc += num
return acc / len(nums)
(Obviously in python we'd prefer sum(nums) / len(nums)
,
this is just for illustration.)
Interlude: Here is an example password hash. It is 64-bit prefix of SHA-2.
$ echo -n secret123 | shasum -a 224 | cut -c1-16
8a73901c428f41be
In the OP code, we have several instances where ostensibly
we're passing a password
around, so it might have a
value like "secret123"
. Sometimes those identifiers
speak the truth. Often they are really holding a hash,
and we might expect pw_hash
to have a value
like "8a73901c428f41be"
.
Recommend you rename some of those identifiers. Future maintenance engineers will thank you.
Crypto algorithms often want to see at least 128 bits of true entropy in key material, in order to make solid security guarantees. If you look at passwords memorized by real users, it appears that often they have trouble remembering more than about 40 bits of entropy. (A password like "banana" consumes 48 bits of storage but has significantly less entropy than that.)
password stretchers
Why do we use pbkdf2, or the better more-modern argon2id? To thwart certain attack scenarios.
Imagine that Mallory obtains a credential being sent across the wire or being stored on-disk on the webserver. She wants to be able to present valid credentials to your server and to other servers frequented by your users, which possibly use the same memorized password.
If passwords are stored in a file as plaintext, a single webserver breach is catastrophic for you and for other servers.
If credentials are stored as e.g. SHA-2 or SHA-3 hashes, Mallory can play dictionary words ("apple") and generated words ("secret1234") against her own GPU that computes the hash function, hoping for a match. Those are designed to be quick hash functions, so in an hour she can explore quite a large key space. We call this an "off-line" attack, since Mallory can conduct the search without contacting your on-line server.
If credentials are stored as argon2id hashes, she can still launch the same attack, but it was designed to be an intentionally slow hash function so she'll be able to explore a much much smaller key space in the same amount of time. We have also forced her to purchase significantly more RAM for her GPU, again due to intentional aspects of the hash function's design. The design goals for the (earlier) pbkdf2 function were similar. Experiences with that nice bit of engineering informed development of the new and improved argon2 suite of hash functions.
tl;dr: Use argon2id when {storing, checking} a user's memorized password.
-
\$\begingroup\$ Finally got around to applying your suggestions. Most of what you suggested I understand. I need to look into what you are talking about regarding passwords and credentials. Including what you mean by "Please use identifies that tell the truth". I will however looking into argon2id. thank you again \$\endgroup\$Michael– Michael2023年12月07日 17:10:03 +00:00Commented Dec 7, 2023 at 17:10
-
1\$\begingroup\$ Cool. Ok, I expanded on that topic -- look for "EDIT". \$\endgroup\$J_H– J_H2023年12月07日 18:08:16 +00:00Commented Dec 7, 2023 at 18:08