Is my implementation of _read_contents
(3 lines) fine, or does it create a potential problem (e.g. passing part of an object that has to be destroyed, or a connection that cannot be closed, etc.)? I don't need the review of the general design, only these 3 lines.
import paramiko
#################################################################################
class StorageSFTP:
def __init__(self, ssh_client_params):
self.ssh_client_params = ssh_client_params
###############################################################################
def __enter__(self):
self.ssh_client = paramiko.SSHClient()
# AutoAddPolicy explained in --> https://www.linode.com/docs/guides/use-paramiko-python-to-ssh-into-a-server/
self.ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
self.ssh_client.connect(**self.ssh_client_params)
self.sftp_client = self.ssh_client.open_sftp()
return self
def __exit__(self, type, value, traceback):
self.sftp_client.close()
self.ssh_client.close()
###############################################################################
def _read_contents(self, my_filename):
with self.sftp_client.open(my_filename) as sftp_file:
contents = sftp_file.read()
return contents
Usage (dummy example):
with StorageSFTP({...}) as my_stft:
c = my_stft._read_contents("my_filename")
print(c[:100])
1 Answer 1
There's a pair of high-level isssues with this design.
- declaring object attributes
- object (session) lifetimes
You offered example usage; thank you. I'm going to pretend it was this slightly more complex example:
def retrieve_files(ssh_client_params,
filename1="my_filename",
filename2="another_filename"):
with StorageSFTP(ssh_client_params) as my_stftp:
c = my_stftp._read_contents(filename1)
d = my_stftp._read_contents(filename2)
return c, d
That is, I'm going to assume a design where we authenticate once then retrieve several files. Creating a connection and authenticating costs roughly half a second across a WAN. We wish to amortize that cost over the retrieval of several short files.
If that's not a requirement for your use case, then there's other ways to design the Public API if you get to re-authenticate once per file.
declare attributes in constructor
def __init__(self, ...):
self.ssh_client_params = ...
##### Uggh! Please elide this ASCII art. #######
def _open(self):
self.ssh_client = ...
self.sftp_client = ...
There are three object attributes that a
maintenance engineer has to keep in mind.
One of them is declared up-front in the __init__
ctor
where it should be, thank you.
The other two are created later in the object's lifetime.
Please don't do that.
You don't have to,
but it is polite to tell the Gentle Reader about them
by initializing each one to None
if you're not quite ready for them yet.
Some linters will flag this issue.
I recommend that you store an open self.sftp_client
attribute
in __init__
, and let the other (local!) variables go out of scope,
as you won't need them again.
object lifetimes
You have exactly the right idea that using a with context manager helps app developers prevent resource leaks. But the OP code does not appear to work correctly, since it doesn't implement methods required by the protocol. I tried this under cPython 3.11:
>>> class StorageSFTP:
... def _open(self):
... print('open')
... def _close(self):
... print('close')
... def get(self):
... print('get')
...
>>> with StorageSFTP() as s:
... s.get()
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'StorageSFTP' object does not support the context manager protocol
Unsurprisingly it fails.
A context manager must implement __enter__
and __exit__
.
Usually it is more convenient to use a @contextmanager decorator.
The leading _
underscore to denote a _private()
method
can be helpful. But here we have ignored the mandatory
method names, whose spelling cannot change from the protocol spec,
and we have a private _read_contents()
method
which should actually be part of the Public API.
Just rename it to read_contents()
.
You may find from contextlib import closing
convenient.
If you're looking for an easy-to-apply pattern,
just stick to using the @closing
decorator.
App developers will always use with
to obtain an instance,
and your def close
will always execute at the end,
even if a fatal exception happened somewhere in the app code.
When you write that close
method,
feel free to log "Closing..."
,
so you can convince yourself that it's happening
when you anticipated.
This code does not achieve its design goals.
I would not be willing to delegate or accept maintenance tasks on it.
EDIT:
It appears the OP has altered the code.
The def _open
method I reviewed has been renamed to __enter__
.
Similarly def _close
became __exit__
.
-
\$\begingroup\$ Thanks for such a detailed review. I have extracted bits from a much bigger class, so _read_contents() is actually private, I am using system variables for authentication, etc. \$\endgroup\$Yulia V– Yulia V2023年11月26日 16:54:48 +00:00Commented Nov 26, 2023 at 16:54
-
\$\begingroup\$ Your snippet with
c
andd
is correct, i.e. no leaks, resources left unopened etc., right? That's all I was asking about :) \$\endgroup\$Yulia V– Yulia V2023年11月26日 16:56:24 +00:00Commented Nov 26, 2023 at 16:56 -
\$\begingroup\$ The project is about backing up Medium and Wordpress websites, and copy-pasting the files between the cloud servers (GitHub, SFTP, PCloud and local storage) in general. The repository is here - github.com/yu51a5/wordpress_backup - it does what's needed, but not very polished. If something in this repository looks interesting enough to review, I will gladly ask the questions so that you can add your comments as reviews on this website. \$\endgroup\$Yulia V– Yulia V2023年11月26日 17:12:39 +00:00Commented Nov 26, 2023 at 17:12
-
2\$\begingroup\$ In StoragePCloud, these identifiers seem a bit suspect:
__post __post_fileid __post_folderid __get_id __get_content_inner
. That is, it's my opinion that name mangling is seldom what's wanted and that it isn't something you want in this particular code. Recommend you use single leading_
underscore to denote_private()
, not two, which has another meaning. Also use black on your source code, or at least use four spaces. Don't nest functions. \$\endgroup\$J_H– J_H2023年11月26日 17:46:57 +00:00Commented Nov 26, 2023 at 17:46 -
\$\begingroup\$ These are private methods that are local to this class. Naming is not my strongest skill, admittedly - can change them if you think the private names are a priority. But I am more worried about the overall design at the moment. \$\endgroup\$Yulia V– Yulia V2023年11月26日 17:54:09 +00:00Commented Nov 26, 2023 at 17:54