Advanced File Upload System Development

Hi guys,

I am developing a rather complex file upload system in Laravel, and I have some demanding requirements to meet. I was wondering which solution is best for maintainability and scalability.

The requirements include support for partial uploads, no file size limit, and the ability to track each uploaded file in the database later on.

  1. Via a custom API, obtain an upload hash for a file, then post the file parts to an endpoint using that hash as a reference (rough sketch after the cons below).

Pros:

  • Supports partial uploads.
  • Bypasses the server's post_max_size limit, since each part stays small and the parts can be consolidated into one big file afterwards if needed.
  • User can switch networks during the upload without losing progress.
  • Fast upload speeds.

Cons:

  • Difficult to track the byte ranges during the upload process.
  • Two database queries per part upload: one to fetch the file by its reference hash, and a second to fetch the already-uploaded byte ranges from another table.
  • Heavy resource usage on the server.
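
For reference, here's a rough Laravel sketch of what Method 1 could look like. The route names, the uploads/upload_parts tables, and the chunks directory are my own assumptions, not part of the spec:

```php
<?php
// routes/api.php — rough sketch of Method 1.
// Route names, the "uploads"/"upload_parts" tables and the chunks/ directory
// are assumptions for illustration only.

use Illuminate\Http\Request;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Route;
use Illuminate\Support\Str;

// 1. The client asks for an upload hash before sending any bytes.
Route::post('/uploads', function (Request $request) {
    $hash = (string) Str::uuid();

    DB::table('uploads')->insert([
        'hash'          => $hash,
        'original_name' => $request->input('name'),
        'total_size'    => (int) $request->input('size'),
        'created_at'    => now(),
    ]);

    return response()->json(['hash' => $hash]);
});

// 2. The client posts each part, referencing the hash and the part's offset.
Route::post('/uploads/{hash}/parts', function (Request $request, string $hash) {
    $upload = DB::table('uploads')->where('hash', $hash)->first();
    abort_unless($upload, 404);

    $offset = (int) $request->input('offset');
    $chunk  = $request->file('chunk');

    // Store each part as its own zero-padded file; consolidation (if any) can happen later.
    $chunk->storeAs("chunks/{$hash}", sprintf('%012d', $offset));

    // Second query: record which byte range arrived, in a separate table.
    DB::table('upload_parts')->insert([
        'upload_hash' => $hash,
        'offset'      => $offset,
        'size'        => $chunk->getSize(),
        'created_at'  => now(),
    ]);

    return response()->json(['received' => $chunk->getSize()]);
});
```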
  2. Directly upload parts belonging to the same file, one file at a time (a short sketch follows the cons below).

Pros:

  • Supports partial uploads.
  • Bypasses the server's post_max_size limit, since parts stay small and are appended into one big file as they arrive.
  • The hash of the file currently being uploaded can be stored in the session instead of the database.
  • Less programming effort spent on tracking byte ranges.

Cons:

  • Slower uploads, since multiple files cannot be uploaded at once.
  • The user cannot switch networks during the upload.
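
A rough sketch of how Method 2 might look, with the current upload hash kept in the session. The session key, routes, and storage path are placeholders, not a fixed design:

```php
<?php
// routes/web.php — rough sketch of Method 2.
// The session key, routes and storage path are assumptions.

use Illuminate\Http\Request;
use Illuminate\Support\Facades\Route;
use Illuminate\Support\Str;

// Start an upload: remember in the session which file is currently being received.
Route::post('/upload/start', function (Request $request) {
    $hash = (string) Str::uuid();
    $request->session()->put('current_upload', $hash);

    return response()->json(['hash' => $hash]);
});

// Receive the next part of the current upload and append it to a single growing file.
Route::post('/upload/part', function (Request $request) {
    $hash = $request->session()->get('current_upload');
    abort_unless($hash, 409); // no upload in progress for this session

    $target = storage_path("app/uploads/{$hash}");

    if (! is_dir(dirname($target))) {
        mkdir(dirname($target), 0755, true);
    }

    // Appending sequentially avoids byte-range bookkeeping,
    // but it also forces one part at a time, in order.
    file_put_contents(
        $target,
        file_get_contents($request->file('chunk')->getRealPath()),
        FILE_APPEND
    );

    return response()->json(['ok' => true]);
});
```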
  3. Just post the file along with the form submission, the traditional way (a minimal example follows the cons below).

Pros:

  • Super easy to code.
  • Much less resource usage on the server, since the file arrives in a single request with the form post.
  • No need to keep track of byte ranges.

Cons:

  • Subject to absolute post_max_size limits.
  • Does not support partial uploads.
  • Slow form submissions for large files.
  • Does not support AJAX uploads.
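
And Method 3 would just be the stock Laravel upload flow, something along these lines (the route, field name, and "documents" directory are placeholders):

```php
<?php
// Method 3: the traditional single-request upload.
// The route, field name and "documents" directory are placeholders.

use Illuminate\Http\Request;
use Illuminate\Support\Facades\Route;

Route::post('/documents', function (Request $request) {
    $request->validate([
        'document' => ['required', 'file'], // bounded by post_max_size / upload_max_filesize
    ]);

    // One call stores the file and returns the relative path to keep in the database.
    $path = $request->file('document')->store('documents');

    return back()->with('path', $path);
});
```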

The project is hosted on premium hosting, so free-hosting limits do not apply. Which one should I go with? Thanks.

  • Method 1
  • Method 2
  • Method 3

Feel free to suggest more methods if I didn’t cover them.

Cheers!


This reminds me of how object stores like Amazon S3 and OpenStack Swift handle large files. They don’t tell you exactly how it works internally, but the API is specific enough that you can deduce how the backend probably works.

https://docs.openstack.org/swift/latest/overview_large_objects.html

What’s similar in both is that the entire file is never assembled on the server. Instead, if you want to download the full file, you simply look up the list of parts and stream them in order. For object storage, that’s the whole point. For your use case, you can decide whether it’s more efficient to always serve the file from its parts or to assemble the file once the upload is completed. It depends on the size of the parts and of the final files, I suppose.

There is little to no bookkeeping on the individual parts: you just have the client upload parts with an ordering key and stream them back in that order (numerical or lexicographical). Obtaining the list of parts could be as simple as listing the files in a directory.

In any case, you don’t need to care about byte ranges; the parts can simply be streamed in order, regardless of the size of each individual part.
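
To make the idea concrete, here’s a rough sketch of that download side in Laravel, assuming the parts are stored one file per part with zero-padded names (the paths and route are just my assumptions):

```php
<?php
// Rough sketch of serving the full file by streaming its parts in order.
// Assumes parts were stored under chunks/{hash}/ with zero-padded names.

use Illuminate\Support\Facades\Route;
use Illuminate\Support\Facades\Storage;

Route::get('/files/{hash}', function (string $hash) {
    $disk = Storage::disk('local');

    // Zero-padded part names mean a plain sort gives the original order.
    $parts = collect($disk->files("chunks/{$hash}"))->sort()->values();

    abort_if($parts->isEmpty(), 404);

    return response()->stream(function () use ($disk, $parts) {
        foreach ($parts as $part) {
            // readStream keeps memory usage flat regardless of part size.
            $stream = $disk->readStream($part);
            fpassthru($stream);
            fclose($stream);
        }
    }, 200, [
        'Content-Type'        => 'application/octet-stream',
        'Content-Disposition' => 'attachment; filename="' . $hash . '"',
    ]);
});
```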


I suppose that in this particular case, I would go for an API that works mostly like Amazon’s, with a clear “begin upload” and “finalize upload” step, so you can keep track of which file you’re uploading and know when it’s available for download. I would store the in-progress uploads in the database, so it’s easier to track and clean up incomplete uploads if the user abandons one before it’s finalized.
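
Something like that begin/finalize shape could be as small as this; the table, columns, routes, and the 24-hour cleanup threshold are all made up for illustration:

```php
<?php
// Rough sketch of the "finalize" step plus cleanup of abandoned uploads.
// Table and column names, routes and the 24-hour threshold are assumptions.

use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Route;
use Illuminate\Support\Facades\Schedule; // Laravel 11+; on older versions register in the console kernel
use Illuminate\Support\Facades\Storage;

// Mark the upload as complete so it becomes eligible for download.
Route::post('/uploads/{hash}/finalize', function (string $hash) {
    $updated = DB::table('uploads')
        ->where('hash', $hash)
        ->whereNull('completed_at')
        ->update(['completed_at' => now()]);

    abort_unless($updated, 404);

    return response()->json(['status' => 'complete']);
});

// Periodically drop uploads that were started but never finalized.
Schedule::call(function () {
    $stale = DB::table('uploads')
        ->whereNull('completed_at')
        ->where('created_at', '<', now()->subDay())
        ->get();

    foreach ($stale as $upload) {
        Storage::disk('local')->deleteDirectory("chunks/{$upload->hash}");
        DB::table('uploads')->where('hash', $upload->hash)->delete();
    }
})->daily();
```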

