What would be a recommended concurrency for hashing multiple files asynchronously? #35
papb posted on GitHub
It would be nice if the readme suggested what concurrency to use when hashing multiple files concurrently. I don't know enough about this to even guess.
Thank you!
That really depends on the machine (disk type/speed and number of cores), Node.js version, and file size. But maybe there's a safe number: something that always saturates all CPU cores but doesn't overload the system. This will need some manual experimentation. I don't have an answer for this. I would guess something like the number of CPU cores times 2-4.
> I would guess something like the number of CPU cores times 2-4.

`const concurrency = require("os").cpus().length * 3`
This will depend on machine, file size, and algorithm, as @sindresorhus has suggested. However, you're unlikely to see significant benefits beyond `require('os').cpus().length`, since these algorithms are CPU-intensive. Note that my test machine has 6 physical cores and that the `os` module returns a logical processor count instead (hyperthreading), so it reports 12. Below you will find a benchmark implementation and results in support of `require('os').cpus().length`:
// node_modules zipped
BENCHMARK: 50 FILES @ 10 ITERATIONS FILE SIZE: 27877036
Concurrency: 1 Total: 56791 ms Average: 5679.1 ms Cores: 12
Concurrency: 6 Total: 24740 ms Average: 2474 ms Cores: 12
Concurrency: 12 Total: 24488 ms Average: 2448.8 ms Cores: 12
Concurrency: 24 Total: 21875 ms Average: 2187.5 ms Cores: 12
Concurrency: 36 Total: 21975 ms Average: 2197.5 ms Cores: 12
Concurrency: 48 Total: 21945 ms Average: 2194.5 ms Cores: 12
// random jpg
BENCHMARK: 50 FILES @ 10 ITERATIONS FILE SIZE: 549590
Concurrency: 1 Total: 1292 ms Average: 129.1 ms Cores: 12
Concurrency: 6 Total: 506 ms Average: 50.6 ms Cores: 12
Concurrency: 12 Total: 490 ms Average: 49 ms Cores: 12
Concurrency: 24 Total: 482 ms Average: 48.2 ms Cores: 12
Concurrency: 36 Total: 471 ms Average: 47.1 ms Cores: 12
Concurrency: 48 Total: 473 ms Average: 47.3 ms Cores: 12
const logicalProcessors = require('os').cpus().length;
const fs = require('fs');
const {default: PQueue} = require('p-queue');
const hasha = require('.');

// Update w/ your files
const files = [];
for (let i = 1; i <= 50; i++) {
  files.push(`./fixtures/${i}.zip`);
}

const stats = fs.statSync(files[0]);
const count = 10;

const benchmark = async concurrency => {
  const queue = new PQueue({concurrency});
  let total = 0;
  const benchmarkStart = Date.now();

  for (let i = 0; i < count; i++) {
    const iterationStart = Date.now();
    await queue.addAll(files.map(f => () => hasha.fromFile(f)));
    const iterationEnd = Date.now();
    total += (iterationEnd - iterationStart);
  }

  const benchmarkEnd = Date.now();
  console.log(`Concurrency: ${queue.concurrency}\t\tTotal: ${benchmarkEnd - benchmarkStart} ms\tAverage: ${total / count} ms\tCores: ${logicalProcessors}`);
};

const run = async () => {
  console.log(`BENCHMARK: ${files.length} FILES @ ${count} ITERATIONS\tFILE SIZE: ${stats.size}`);
  await benchmark(1);
  await benchmark(logicalProcessors / 2);
  await benchmark(logicalProcessors);
  await benchmark(logicalProcessors * 2);
  await benchmark(logicalProcessors * 3);
  await benchmark(logicalProcessors * 4);
};

run();
@brandon93s Cool, thank you!