sindresorhus/trash

Improve performance on Linux #79

anishmittal2020 posted on GitHub

$ sed -e '/^+.*/p' a > b
$ time rm b

real    0m0.361s
user    0m0.161s
sys     0m0.027s
$ sed -e '/^+.*/p' a > b
$ time \rm b

real    0m0.002s
user    0m0.000s
sys     0m0.002s
$ type rm
rm is aliased to trash

$ ls -a -l
-rw-rw-r--  1 nikhil 179K Jan 25 11:27 b

What operating system (+ version) and what Node.js version?

posted by sindresorhus over 6 years ago

$ node --version
v10.15.0

Ubuntu 18.04

$ trash --version
1.4.0

posted by anishmittal2020 over 6 years ago

Ok, so there are multiple reasons why it's slower:

  1. It's not doing the same thing as rm. rm just deletes files, whereas trash has to actually move the files to a different location on your hard drive. That's much slower.
  2. rm is written in C and trash is written in Node.js.
  3. While we have tried to make the Linux implementation as fast as possible, I don't use Linux, so it's not a priority.
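The difference in point 1 can be sketched in Node.js. This is a minimal illustration, not the code trash uses: `moveToTrashDir` and `deleteFile` are hypothetical names, and the trash directory is a plain folder rather than a spec-compliant trash location. It also shows that a move is only a cheap rename when source and target are on the same filesystem; otherwise the data has to be copied.

```javascript
const fs = require('fs');
const path = require('path');

// Roughly what `rm` does for a single file: remove the directory entry.
function deleteFile(file) {
  fs.unlinkSync(file);
}

// Hypothetical stand-in for trashing: move the file into `trashDir`.
// A real trash implementation also records metadata, which is omitted here.
function moveToTrashDir(file, trashDir) {
  fs.mkdirSync(trashDir, {recursive: true});
  const target = path.join(trashDir, path.basename(file));
  try {
    // Same filesystem: a metadata-only rename.
    fs.renameSync(file, target);
  } catch (error) {
    if (error.code !== 'EXDEV') {
      throw error;
    }
    // Different filesystem: copy the data, then remove the original.
    fs.copyFileSync(file, target);
    fs.unlinkSync(file);
  }
  return target;
}
```

Timing the two functions on the same file would show how much of the gap comes from the file operation itself rather than from process startup.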

I'm gonna keep this issue open as "help wanted" to improve performance on Linux.

posted by sindresorhus over 6 years ago

Maybe we could send the process to the background.

posted by anishmittal2020 over 6 years ago

You can easily do that yourself: alias trash="trash &".
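With a plain alias, the shell substitutes the text before the arguments, so `trash file` would expand to `trash & file` and the `&` would land ahead of the file name. One way around that is a small wrapper. The following is a sketch in Node.js, assuming a `trash` executable is on `PATH`; `runInBackground` is a hypothetical helper, not part of the trash CLI.

```javascript
const {spawn} = require('child_process');

// Start `command` detached from the current process and return immediately.
function runInBackground(command, args) {
  const child = spawn(command, args, {
    detached: true,  // let the child outlive the parent
    stdio: 'ignore', // do not share the parent's stdio
  });
  child.unref(); // do not keep the parent's event loop alive for the child
  return child;
}

// Hypothetical usage (assumes `trash` is installed):
// runInBackground('trash', ['some-file']);
```

The trade-off is that errors from the background process are silently dropped, because its output is ignored.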

posted by sindresorhus over 6 years ago

@issuehunt has funded $80.00 to this issue.

posted by IssueHuntBot almost 6 years ago

From my tests:

  • baseline for deleting one file: ~170ms
  • applying some simple optimizations brings it down by ~20ms
  • removing meow brings it down by ~35ms
  • removing update-notifier brings it down by ~25ms

So, all in all, it should be easy to make it about 2 times faster. Also, none of these optimizations are related to Linux directly; they affect all platforms.

Another thing to consider is that most of the time is spent in long chains of requires, because there are a lot of really small modules involved instead of a few bigger ones.
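A rough way to see what a single dependency costs at startup is to time its first `require()`. This is a minimal sketch; `timeRequire` is a hypothetical helper, and the numbers it gives are only indicative because Node caches modules after the first load.

```javascript
// Measure how long one require() call takes, in milliseconds.
// Only the first require of a module is meaningful; later calls hit the cache.
function timeRequire(id) {
  const start = process.hrtime.bigint();
  require(id);
  const end = process.hrtime.bigint();
  return Number(end - start) / 1e6;
}

// Hypothetical usage in a CLI entry point. The module names come from the
// thread; whether they are installed where this runs is an assumption.
// console.log('meow:', timeRequire('meow'), 'ms');
// console.log('update-notifier:', timeRequire('update-notifier'), 'ms');
```

Because each measurement includes the module's whole dependency chain, it also reflects the cost of the many small modules loaded underneath it.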

posted by stroncium almost 6 years ago

> removing meow brings it down ~35ms

I don't want to do this, but I'm already planning to optimize meow: https://github.com/sindresorhus/meow/issues/67 https://github.com/sindresorhus/meow/issues/104

> removing update-notifier brings it down ~25ms

👍

However, this issue was meant to investigate whether there's a way to optimize the actual Linux deletion code.

posted by sindresorhus almost 6 years ago

@sindresorhus I don't remember the numbers (plus there is some trickery in profiling short-lived processes), but in the simple case (trashing one file) most of the time is spent initializing Node and requiring modules, to the point where, if you use the Chrome profiler, you won't even see where the actual work starts unless you zoom in quite a lot.

As for more complex scenarios with multiple files and directories, I don't expect there is much to do. The one exception is a place at the beginning where I've seen sync code instead of async, but with so much work to do further down the line, that shouldn't have much impact.

posted by stroncium almost 6 years ago

Did you try profiling on a directory with many subdirectories and thousands of files? That's a better real-world benchmark and what we should optimize for.
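A benchmark of that shape needs a throwaway directory tree. The following is a minimal sketch of a fixture generator; `createFixtureTree`, the `trash-bench-` prefix, and the file naming are all hypothetical, and it only builds one level of subdirectories.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create `dirCount` subdirectories under a fresh temp directory, each holding
// `filesPerDir` small files. Returns the root path and the total file count.
function createFixtureTree(dirCount, filesPerDir) {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), 'trash-bench-'));
  let fileCount = 0;
  for (let d = 0; d < dirCount; d++) {
    const dir = path.join(root, `dir-${d}`);
    fs.mkdirSync(dir);
    for (let f = 0; f < filesPerDir; f++) {
      fs.writeFileSync(path.join(dir, `file-${f}.txt`), 'x');
      fileCount++;
    }
  }
  return {root, fileCount};
}

// Hypothetical usage: 100 directories with 50 files each, then time
// trashing `root` with whatever tool is being measured.
// const {root} = createFixtureTree(100, 50);
```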

posted by sindresorhus almost 6 years ago

My debug output using console.time on 4.14.52-1-MANJARO, with a file of ~801 MB. [image]

UPDATE: Full log
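For reference, this kind of `console.time` instrumentation can be sketched as a small wrapper. `timed` is a hypothetical helper and the step label is made up; the actual steps measured in the log above are not reproduced here.

```javascript
// Run `fn` and print its wall-clock duration under `label`.
function timed(label, fn) {
  console.time(label);
  try {
    return fn();
  } finally {
    // Runs even if `fn` throws, so the timer is always reported.
    console.timeEnd(label);
  }
}

// Hypothetical usage around one synchronous step:
// const stats = timed('stat', () => require('fs').statSync('some-file'));
```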

posted by TiagoDanin almost 6 years ago

I actually have some results, and am preparing a PR.

@sindresorhus Long story short, the fix involves using procfs. After writing my 10th implementation of that stuff and optimizing it yet again, I looked around for libraries to access procfs, but none of them seem to do such things; they only parse some selected areas. So, naturally, I decided to stop writing the same code again and again, and at the moment I'm close to releasing a well-optimized library which is able to work with a huge part of procfs, including the stuff needed for trash. The library is also optimized for require times, so it shouldn't be too heavy to include. Would you mind me using it here (after I release it and you review it, if you want)?
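The comment does not say which procfs data trash needs. As an illustration only, one plausible candidate is the mount table, which on Linux can be read from `/proc/self/mountinfo` without spawning a process. The sketch below is not the library mentioned above; `parseMountinfo`, `unescapeMountPath`, and `readMounts` are hypothetical names.

```javascript
const fs = require('fs');

// Decode the octal escapes (for example \040 for a space) used in mountinfo paths.
function unescapeMountPath(text) {
  return text.replace(/\\([0-7]{3})/g, (_, octal) =>
    String.fromCharCode(parseInt(octal, 8)));
}

// Parse text in the /proc/[pid]/mountinfo format into plain objects.
function parseMountinfo(text) {
  return text
    .split('\n')
    .filter(line => line.trim() !== '')
    .map(line => {
      const fields = line.split(' ');
      // Fields 0-5 are fixed; a lone "-" ends the optional fields that follow.
      const separator = fields.indexOf('-', 6);
      return {
        mountPoint: unescapeMountPath(fields[4]),
        fsType: fields[separator + 1],
        source: fields[separator + 2],
      };
    });
}

// Linux only: read the current process's mount table.
function readMounts() {
  return parseMountinfo(fs.readFileSync('/proc/self/mountinfo', 'utf8'));
}
```

Keeping the parser separate from the file read makes it usable on sample text, so it can be checked on platforms without procfs.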

posted by stroncium almost 6 years ago

> The library is also optimized for require times, so shouldn't be too heavy to include. Would you mind me using it here (after I release it and you review it if you want)?

Great! More than happy to use it here.

posted by sindresorhus almost 6 years ago

@sindresorhus has rewarded $72.00 to @stroncium. See it on IssueHunt

  • :moneybag: Total deposit: $80.00
  • :tada: Repository reward(0%): $0.00
  • :wrench: Service fee(10%): $8.00
posted by issuehunt-app[bot] over 5 years ago
