Fix uploading large amounts of data #85
Following AWS S3's increased single-object size limit, allow the upload of data > 5 GB
This fixes the following error encountered when uploading large files: ERROR: ErrorMalformedXML Message: The XML you provided was not well-formed or did not validate against our published schema. The error appears after a previous patch increased the size limit for a single object, which can cause more than one growbuffer to be allocated to hold the XML.
Increasing the chunk size avoids an issue where too many parts are uploaded, which triggers an error when calling the S3 CompleteMultipartUpload API command. The current limit is 10,000 parts per upload according to the S3 documentation (https://docs.aws.amazon.com/AmazonS3/latest/dev/qfacts.html).
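As a rough illustration of the arithmetic (this is not code from the patch; the constant and function names here are hypothetical), a part size can be chosen so that even a 5 TB object stays within the 10,000-part limit:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical constants for illustration only; the names and exact values
 * used by s3.c may differ. */
#define MAX_OBJECT_SIZE_BYTES (5ULL * 1024 * 1024 * 1024 * 1024)  /* 5 TB object limit */
#define MAX_UPLOAD_PARTS      10000ULL                            /* S3 parts-per-upload limit */
#define MIN_PART_SIZE_BYTES   (5ULL * 1024 * 1024)                /* S3 minimum part size */

/* Smallest part (chunk) size that keeps contentLength within
 * MAX_UPLOAD_PARTS parts. */
static uint64_t choose_part_size(uint64_t contentLength)
{
    /* Ceiling division: bytes per part needed to stay at or below the limit. */
    uint64_t needed = (contentLength + MAX_UPLOAD_PARTS - 1) / MAX_UPLOAD_PARTS;
    return (needed > MIN_PART_SIZE_BYTES) ? needed : MIN_PART_SIZE_BYTES;
}

int main(void)
{
    /* For a 5 TB upload this prints 549755814, i.e. each part must be
     * roughly 525 MiB or larger. */
    printf("%llu\n", (unsigned long long) choose_part_size(MAX_OBJECT_SIZE_BYTES));
    return 0;
}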
Rejecting due to no response to my question.

@bji There is no question to respond to, unless it is visible only to your account.

OK sorry, not sure why the question I asked is not visible. I will try to make it so.
static int growbuffer_append(growbuffer **gb, const char *data, int dataLen)
{
    int toCopy = 0;
    int origDataLen = dataLen;
I don't quite understand what this fixes. Unless the function comment is wrong, this function only needs to return nonzero if it completed the copy successfully; it doesn't need to return the number of bytes actually copied.
Before your change, each iteration of the while loop ran while dataLen was larger than 0, which meant toCopy was always assigned a value greater than 0 on line 451, so when the loop terminated the function always returned the nonzero value that toCopy most recently held.
Your change doesn't materially change this fact; it just changes what the actual value is (instead of being the size of the most recent copy, it's the original size of the data to be copied). But according to the function comments, it shouldn't matter which exact value this function returns.
So what is the point of this change?
The return value for this function is used by s3.c::put_object() when completing a multipart upload. Perhaps the function comment should be updated.
The value that was being returned before was too small, causing the XML string to be truncated when it was sent to AWS. This fixes that issue.
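For context, here is a minimal sketch of the idea behind the fix, assuming a simplified growbuffer; the real struct and bookkeeping in s3.c differ, so this is an illustration rather than the patched code:

#include <stdlib.h>
#include <string.h>

/* Simplified growbuffer assumed for this sketch; the real struct in s3.c
 * uses different fields and maintains the chain differently. */
typedef struct growbuffer
{
    char data[64 * 1024];
    int size;                       /* bytes used in data[] */
    struct growbuffer *next;
} growbuffer;

static int growbuffer_append(growbuffer **gb, const char *data, int dataLen)
{
    int origDataLen = dataLen;

    while (dataLen > 0) {
        /* Walk to the last buffer in the chain (a linear walk keeps the
         * sketch simple). */
        growbuffer *buf = *gb;
        while (buf && buf->next) {
            buf = buf->next;
        }
        if (!buf || buf->size == (int) sizeof(buf->data)) {
            growbuffer *fresh = (growbuffer *) calloc(1, sizeof(growbuffer));
            if (!fresh) {
                return 0;           /* allocation failure */
            }
            if (buf) {
                buf->next = fresh;
            }
            else {
                *gb = fresh;
            }
            buf = fresh;
        }
        int toCopy = (int) sizeof(buf->data) - buf->size;
        if (toCopy > dataLen) {
            toCopy = dataLen;
        }
        memcpy(buf->data + buf->size, data, toCopy);
        buf->size += toCopy;
        data += toCopy;
        dataLen -= toCopy;
    }

    /* Returning origDataLen reports the total number of bytes appended;
     * returning only the last toCopy would report just the final partial
     * copy. */
    return origDataLen;
}

Either way the return value is nonzero on success, but only the total length lets a caller such as put_object() know how much XML it has actually accumulated.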
I clicked the 'begin review' button, which I guess I neglected to click before. Can you see my question now?

Yes, thank you. I have replied to your question above.

Yes, multipart upload was added by another contributor some time after that function was written, and the comment is out of date. Thanks for your change.

Oops, wrong account.
When uploading large amounts of data (> 5 GB), libs3 refuses the upload with an error. This pull request increases the single-object limit to 5 TB, which brings it in line with S3's object limit.
After that limit was increased, a new bug was discovered in growbuffer_append(): the XML produced by put_object() was truncated whenever more than one growbuffer was allocated.
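As a usage note, the caller pattern the return value supports looks roughly like this (hypothetical code reusing the growbuffer_append() sketch above, not the actual put_object() implementation):

/* Hypothetical caller: track the full length of the XML being built. */
growbuffer *xml = 0;
int xmlLen = 0;

const char *tag = "<CompleteMultipartUpload>";
int appended = growbuffer_append(&xml, tag, (int) strlen(tag));
if (!appended) {
    /* handle allocation failure */
}
xmlLen += appended;   /* with the fix, this is the total appended length */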
If there are any questions about this pull request, please let me know.