sindresorhus/hasha

What would be a recommended concurrency for hashing multiple files asynchronously? #35

papb posted on GitHub

It would be nice to have in the readme a suggestion of what concurrency to use when hashing multiple files concurrently. I don't know enough about this to even guess.

Thank you!


That really depends on the machine (disk type/speed and number of cores), Node.js version, and file size. But maybe there's a safe default: something that always saturates all CPU cores without overloading the system. This will need some manual experimentation; I don't have an answer for this. I would guess something like the number of CPU cores times 2-4.

posted by sindresorhus over 4 years ago

I would guess something like the number of CPU cores times 2-4.

const concurrency = require('os').cpus().length * 3;
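
Something like p-limit could apply that cap when actually hashing the files (an untested sketch; the hashAll helper is just illustrative):

const os = require('os');
const pLimit = require('p-limit');
const hasha = require('hasha');

// Run at most cpus * 3 hashes at any one time
const limit = pLimit(os.cpus().length * 3);

const hashAll = files => Promise.all(
    files.map(file => limit(() => hasha.fromFile(file)))
);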
posted by Richienb over 4 years ago

This will depend on the machine, file size, and algorithm, as @sindresorhus suggested. However, you're unlikely to see significant benefit beyond require('os').cpus().length, since these hash algorithms are CPU-bound. Note that my test machine has 6 physical cores and that the os module returns the logical processor count (hyperthreading), so cpus().length reports 12. Below you will find a benchmark implementation and results in support of require('os').cpus().length:

// node_modules zipped
BENCHMARK: 50 FILES @ 10 ITERATIONS    FILE SIZE: 27877036
Concurrency: 1      Total: 56791 ms    Average: 5679.1 ms    Cores: 12
Concurrency: 6      Total: 24740 ms    Average: 2474 ms      Cores: 12
Concurrency: 12     Total: 24488 ms    Average: 2448.8 ms    Cores: 12
Concurrency: 24     Total: 21875 ms    Average: 2187.5 ms    Cores: 12
Concurrency: 36     Total: 21975 ms    Average: 2197.5 ms    Cores: 12
Concurrency: 48     Total: 21945 ms    Average: 2194.5 ms    Cores: 12

// random jpg
BENCHMARK: 50 FILES @ 10 ITERATIONS    FILE SIZE: 549590
Concurrency: 1      Total: 1292 ms     Average: 129.1 ms     Cores: 12
Concurrency: 6      Total: 506 ms      Average: 50.6 ms      Cores: 12
Concurrency: 12     Total: 490 ms      Average: 49 ms        Cores: 12
Concurrency: 24     Total: 482 ms      Average: 48.2 ms      Cores: 12
Concurrency: 36     Total: 471 ms      Average: 47.1 ms      Cores: 12
Concurrency: 48     Total: 473 ms      Average: 47.3 ms      Cores: 12

const logicalProcessors = require('os').cpus().length;
const fs = require('fs');
const {default: PQueue} = require('p-queue');
const hasha = require('.');

// Update w/ your files
const files = [];
for (let i = 1; i <= 50; i++) {
    files.push(`./fixtures/${i}.zip`);
}

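// All fixture files are the same size, so stat the first one for the report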
const stats = fs.statSync(files[0]);

const count = 10;

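// Hash every file at the given concurrency, `count` times over, and report the average iteration time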
const benchmark = async concurrency => {
    const queue = new PQueue({concurrency});
    let total = 0;
    const benchmarkStart = Date.now();
    for (let i = 0; i < count; i++) {
        const iterationStart = Date.now();
        await queue.addAll(files.map(f => () => hasha.fromFile(f)));
        const iterationEnd = Date.now();
        total += (iterationEnd - iterationStart);
    }

    const benchmarkEnd = Date.now();
    console.log(`Concurrency: ${queue.concurrency}\t\tTotal: ${benchmarkEnd - benchmarkStart} ms\tAverage: ${total / count} ms\tCores: ${logicalProcessors}`);
};

const run = async () => {
    console.log(`BENCHMARK:  ${files.length} FILES @ ${count} ITERATIONS\tFILE SIZE: ${stats.size}`);
    await benchmark(1);
    await benchmark(logicalProcessors / 2);
    await benchmark(logicalProcessors);
    await benchmark(logicalProcessors * 2);
    await benchmark(logicalProcessors * 3);
    await benchmark(logicalProcessors * 4);
};

run();
posted by brandon93s over 4 years ago

@brandon93s Cool, thank you!

posted by papb over 4 years ago
