
Uploading a file takes longer than expected #126

Unanswered
christiankf asked this question in Q&A

Hi there

First, thanks for this awesome plugin. Works like a charm and is (almost) exactly what I need.

So maybe I'm misunderstanding something or need to configure something, but I have the following scenario:

  1. I have a simple upload form where the user can upload a big file (let's say up to a couple of hundred MBs)
  2. The upload is done via django-s3file and the progress loader works like a charm
  3. Then the POST method is called at the end of the JS direct upload to S3
  4. My code receives the temporary file and, on save, transforms the key, which is then stored in the DB

The part of my code that handles the upload looks like this:

form = UploadForm(request.POST, request.FILES)
if form.is_valid():
    upload = request.FILES['upload']
    obj = Obj()
    obj.save()
    # upload_to needs the object id, so the file is assigned and
    # saved in a second save() call
    obj.upload = upload
    obj.save()

So this works. But the last step 4) apparently takes more time at obj.upload = upload the bigger the file is (a couple of hundred MBs already took something like 20-30 s). So when users click "upload", they see the upload progress, but then still have to wait quite some time until the view actually loads. I assume this is because the file is "moved" on S3. But it is probably not a move but a copy, which would explain why it takes longer the bigger the file is.

So somehow this can't be the expected behaviour: the file is uploaded directly to S3, but the advantage is only partial, as the user now has to wait for the upload almost twice (the upload itself plus the time it takes to copy the file).

Is this intended? Are there any good workarounds, or can I configure something in django-s3file or django-storages so that saving doesn't take that much time?

Thanks!


Replies: 1 comment 5 replies


Hi @christiankf,

Thanks for reaching out. This is an interesting case. I believe the delay you are experiencing is due to the app server loading the file from S3 and maybe even doing something with it. What causes the file to be pulled into your application's memory depends on your model. If you'd share that with me, I might be able to help. If this happens to be an ImageField and you use dimension fields, that could be one cause, but other behavior could also trigger this.

I hope that helps you a bit.
Best,
Joe


BTW, I hope your app server is in the same AWS region as your S3 bucket. If those are in different geographic locations, the IO can become painfully slow.


Hi Joe

Thanks for the response! (btw: I'm the same person as the post above; just used the wrong login).

I played around further and found the main issue (I believe). The save method of AWSS3Storage actually copies the file through the server making the call (I haven't checked the network, but it took about the same amount of time as the upload itself). I fixed it for my case by using boto's copy command instead of the upload_fileobj command (which also includes a seek operation).

So in general the save method does what it's supposed to do, and it wouldn't make a difference if the file weren't there already. But since your library uploads it directly, I kind of expected a move command somewhere. S3 doesn't even have such a command, but I hoped a copy within the bucket could be done in a way that doesn't require streaming the file content through my server, and it seems to work like that.
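To see why the size-dependence disappears, here is a toy model (a plain dict stands in for the S3 bucket, and returned byte counts stand in for traffic through the app server; the real operations are the S3 calls upload_fileobj and Object.copy):

```python
# Toy model: a dict stands in for the S3 bucket. Illustrative only.
bucket = {'tmp/upload': b'x' * 1024}  # the file django-s3file already uploaded

def save_by_streaming(src_key, dst_key):
    """Mimics upload_fileobj: every byte is downloaded to the app server
    and re-uploaded, so the second save scales with file size."""
    body = bucket[src_key]          # download through the app server
    bucket[dst_key] = bytes(body)   # re-upload through the app server
    return len(body)                # bytes that crossed the app server

def save_by_copy(src_key, dst_key):
    """Mimics Object.copy: S3 duplicates the object internally; only a
    small CopyObject request leaves the app server."""
    bucket[dst_key] = bucket[src_key]
    return 0                        # no object bytes cross the app server

assert save_by_streaming('tmp/upload', 'final/a') == 1024
assert save_by_copy('tmp/upload', 'final/b') == 0
```

Both strategies leave an identical object at the destination key; only the path the bytes take differs.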

So all in all I now use my own subclass for the storage, like this, and it works as expected:

from storages.backends.s3boto3 import S3Boto3Storage


class AWSS3Storage(S3Boto3Storage):  # pylint: disable=abstract-method
    default_acl = 'private'
    file_overwrite = True
    custom_domain = False

    def _save(self, name, content):
        # Basically copy the implementation of _save of S3Boto3Storage
        # and replace obj.upload_fileobj with a copy operation
        cleaned_name = self._clean_name(name)
        name = self._normalize_name(cleaned_name)
        params = self._get_write_parameters(name, content)
        if (self.gzip and  # pylint: disable=no-member
                params['ContentType'] in self.gzip_content_types and  # pylint: disable=no-member
                'ContentEncoding' not in params):
            content = self._compress_content(content)
            params['ContentEncoding'] = 'gzip'
        obj = self.bucket.Object(name)
        # content.seek(0, os.SEEK_SET)  # Disable unnecessary seek operation
        # obj.upload_fileobj(content, ExtraArgs=params)  # Disable upload function
        # Copy the file instead of uploading it
        obj.copy({'Bucket': self.bucket.name, 'Key': content.obj.key}, ExtraArgs=params)
        return cleaned_name

Of course there's still the possibility that I missed some configuration in your library, but with my own storage implementation it now works for me, and it makes sense that the commented-out parts led to the issue I had, even though I hadn't done anything else with the file that could cause a download of it (the code above really was all I had done with it up to that point).
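To wire such a subclass in, Django has to be pointed at it. Assuming the class lives in a module like myproject/storages.py (the dotted path is an assumption), that is a one-line setting:

```python
# settings.py -- the dotted path is made up; adjust it to where the
# AWSS3Storage subclass actually lives in your project.
DEFAULT_FILE_STORAGE = 'myproject.storages.AWSS3Storage'
```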

Regarding your BTW: I can't fully control it, but the same region should be doable. Regardless, it should only be an issue when initially uploading the file; later, basically only URL signing happens, which should be quick. There should be no "user-facing IO" happening.

Thanks again for your answer, help and of course the neat library!

Best,
Christian


Hi @drakon, interesting. That's an excellent find. Maybe we should include your solution in this library; what do you think? Would you be interested in providing a patch? Best, Joe


Hi Joe, yeah, sure, happy to give it a try. Honestly, I'm not a very experienced open-source contributor, but I'm happy to try. Just give me some time (I'm on holiday right now) and patience when I put up the PR. :)


Sure, take your time. Consider this a great start to get into OSS. And, I've been doing this for a while, patience is what you need the most ;)


This discussion was converted from issue #125 on March 06, 2021 11:41.
