I have a GCP bucket that I need to upload files to. The problem is scale: I have 10 million files to upload (each about 50 KB) and a time limit of 8 hours or less. I am currently using a Java program based on Google's reference sample code. Tested on 1,000 images without multi-threading, it uploads each file in about 300 milliseconds. With multi-threading I have reduced the average time per file to 40 milliseconds (using 20 threads), and with up to 60 threads I can bring it down further to 15-20 milliseconds. Even so, I face three problems:
- 20 milliseconds per file isn’t fast enough. I need about 3 milliseconds per file or less (10 million files in 8 hours works out to roughly 2.9 ms per file).
- It throws a “com.google.cloud.storage.StorageException: Connect timed out” exception when I exceed 25 threads.
- Beyond 60 threads, the program doesn’t seem to get any faster (I am guessing this is a hardware constraint).
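
To make the target concrete, here is a minimal sketch of the arithmetic behind these numbers. It is only a back-of-the-envelope calculation, not part of my upload code. The class and method names (`UploadBudget`, `msPerFileBudget`, and so on) are made up for illustration. It assumes 1 KB = 1000 bytes, and the concurrency estimate assumes each upload still takes about 300 ms when many run in parallel, which is probably optimistic.

```java
// Back-of-the-envelope budget for the upload job described above.
// All names here are hypothetical; nothing comes from the GCS client library.
final class UploadBudget {

    // Average wall-clock time available per file, in milliseconds.
    public static double msPerFileBudget(long fileCount, double hours) {
        return hours * 3600.0 * 1000.0 / fileCount;
    }

    // Sustained number of files per second needed to finish in time.
    public static double requiredFilesPerSecond(long fileCount, double hours) {
        return fileCount / (hours * 3600.0);
    }

    // Sustained bandwidth in MB/s (assumes 1 KB = 1000 bytes, 1 MB = 1,000,000 bytes).
    public static double requiredMegabytesPerSecond(long fileCount, double fileSizeKb, double hours) {
        double totalBytes = fileCount * fileSizeKb * 1000.0;
        return totalBytes / (hours * 3600.0) / 1_000_000.0;
    }

    // Little's law: uploads in flight = throughput * latency per upload.
    // Assumes the per-upload latency does not grow under load.
    public static double requiredConcurrency(long fileCount, double hours, double latencyMs) {
        return requiredFilesPerSecond(fileCount, hours) * latencyMs / 1000.0;
    }

    public static void main(String[] args) {
        long files = 10_000_000L;   // from the question
        double hours = 8.0;         // from the question
        double sizeKb = 50.0;       // from the question
        double latencyMs = 300.0;   // measured single-upload time from the question

        System.out.println("ms per file budget:   " + msPerFileBudget(files, hours));
        System.out.println("files per second:     " + requiredFilesPerSecond(files, hours));
        System.out.println("MB per second:        " + requiredMegabytesPerSecond(files, sizeKb, hours));
        System.out.println("uploads in flight:    " + requiredConcurrency(files, hours, latencyMs));
    }
}
```

With my numbers this comes to about 2.88 ms per file, about 347 files per second, roughly 17.4 MB/s of sustained bandwidth, and on the order of 100 uploads in flight at once if each one really takes 300 ms.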