Sounds very 🇸🇪 😉
It very much is (as I even acknowledge at the end of the github README). 😀
Greetings from Västerbotten 👋😁
Is it faster than running pacman -Qkk packagename?
So far I have only implemented checking all packages at once (as that is what I need later on). It would be possible to add support for checking a single package.
Thank you for reminding me of pacman -Qkk though, I had forgotten it existed.
I just did a test of pacman -Qk and pacman -Qkk (with no package, so checking all of them) and paketkoll is much faster. Based on the man page:
- pacman -Qk only checks that the files exist. I don't have that option: I always check file properties at least, but have the option to skip checking the file hash if the mtime and size match (paketkoll --trust-mtime). Even though I check more in this scenario I'm still about 4x faster.
- pacman -Qkk checks the checksum as well (similar to plain paketkoll). It is unclear to me if pacman will check the checksum if the mtime and size match.
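To make the --trust-mtime idea above concrete, here is a minimal Python sketch of that kind of check. It is not paketkoll's actual code: the `Expected` record, `sha256_of`, and `file_ok` are hypothetical names invented for illustration, and a real tool would compare more properties (owner, mode, and so on).

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class Expected:
    """Hypothetical record of what the package database says about a file."""
    size: int
    mtime: int  # whole seconds since the epoch
    sha256: str


def sha256_of(path: str) -> str:
    """Hash a file in 1 MiB chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_ok(path: str, expected: Expected, trust_mtime: bool = False) -> bool:
    """Return True if the file on disk matches the expected record."""
    st = os.stat(path)
    if st.st_size != expected.size:
        return False  # a different size can never have the same contents
    if trust_mtime and int(st.st_mtime) == expected.mtime:
        return True  # size and mtime match: assume unchanged, skip the hash
    return sha256_of(path) == expected.sha256
```

The trade-off is the usual one: with `trust_mtime=True` a file whose contents changed while its size and mtime stayed the same goes unnoticed, in exchange for not having to read the file at all.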
I can report that paketkoll handily beats pacman in both scenarios (pacman -Qk is slower than paketkoll --trust-mtime, and pacman -Qkk is much slower than plain paketkoll). Below is the output of the hyperfine benchmarking tool:
$ hyperfine -i -N --warmup=1 "paketkoll --trust-mtime" "paketkoll" "pacman -Qk" "pacman -Qkk"
Benchmark 1: paketkoll --trust-mtime
Time (mean ± σ): 246.4 ms ± 7.5 ms [User: 1223.3 ms, System: 1247.7 ms]
Range (min … max): 238.2 ms … 261.7 ms 11 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: paketkoll
Time (mean ± σ): 5.312 s ± 0.387 s [User: 17.321 s, System: 13.461 s]
Range (min … max): 4.907 s … 6.058 s 10 runs
Warning: Ignoring non-zero exit code.
Benchmark 3: pacman -Qk
Time (mean ± σ): 976.7 ms ± 5.0 ms [User: 101.9 ms, System: 873.5 ms]
Range (min … max): 970.3 ms … 984.6 ms 10 runs
Benchmark 4: pacman -Qkk
Time (mean ± σ): 86.467 s ± 0.160 s [User: 53.327 s, System: 16.404 s]
Range (min … max): 86.315 s … 86.819 s 10 runs
Warning: Ignoring non-zero exit code.
It appears that pacman -Qkk is much slower even than paccheck --file-properties --sha256sum. I don't know how that is possible!
The above benchmarks were executed on an AMD Ryzen 5600X with 32 GB RAM and a Gen3 NVMe SSD. pacman -Syu was most recently run yesterday. The disk cache was hot between runs for all the tools; with a cold cache the first run would be a bit slower for all of them (not by much on an SSD, though I can imagine it would dominate on a mechanical HDD).
In conclusion:
- When checking just file properties, paketkoll is 3.96 times faster than pacman checking just whether the files exist.
- When checking checksums, paketkoll is 16.3 times faster than pacman checking file properties (pacman -Qkk). This is impressive on a 6 core/12 thread CPU. pacman must be doing something exceedingly stupid here (might be worth looking into; perhaps it is checking both sha256sum and md5sum, which is totally unneeded). Compared to paccheck I see a 7x speedup in that scenario, which is more in line with what I would expect.
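For anyone who wants to double-check the two ratios, they follow directly from the hyperfine means quoted above. A small Python sketch of the arithmetic (the dictionary and function names are just for illustration):

```python
# Mean wall-clock times in seconds, copied from the hyperfine output above.
means_s = {
    "paketkoll --trust-mtime": 0.2464,
    "paketkoll": 5.312,
    "pacman -Qk": 0.9767,
    "pacman -Qkk": 86.467,
}


def speedup(slow: str, fast: str) -> float:
    """How many times faster `fast` is than `slow`, based on mean run time."""
    return means_s[slow] / means_s[fast]


print(f"{speedup('pacman -Qk', 'paketkoll --trust-mtime'):.2f}")  # 3.96
print(f"{speedup('pacman -Qkk', 'paketkoll'):.2f}")               # 16.28
```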
Damn, that's impressive.
I went ahead and implemented support for filtering packages (just made a new release: v0.1.3).
I am of course still faster. Here are two examples that show a small package (where it doesn't really matter that much) and a huge package (where it makes a massive difference). Excuse the strange paths, this is straight from the development tree.
Let's check pacman itself, and let's include config files too (not sure if pacman even has that option?). Including config files or not doesn't make a measurable difference though:
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll --config-files=include pacman" "pacman -Qkk pacman"
Benchmark 1: ./target/release/paketkoll --config-files=include pacman
Time (mean ± σ): 14.0 ms ± 0.2 ms [User: 21.1 ms, System: 19.0 ms]
Range (min … max): 13.4 ms … 14.5 ms 216 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: pacman -Qkk pacman
Time (mean ± σ): 20.2 ms ± 0.2 ms [User: 11.2 ms, System: 8.8 ms]
Range (min … max): 19.9 ms … 21.1 ms 147 runs
Summary
./target/release/paketkoll --config-files=include pacman ran
1.44 ± 0.02 times faster than pacman -Qkk pacman
Let's check davinci-resolve as well, which is massive (5.89 GB):
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll --config-files=include pacman davinci-resolve" "pacman -Qkk pacman davinci-resolve"
Benchmark 1: ./target/release/paketkoll --config-files=include pacman davinci-resolve
Time (mean ± σ): 770.8 ms ± 4.3 ms [User: 2891.2 ms, System: 641.5 ms]
Range (min … max): 765.8 ms … 778.7 ms 10 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: pacman -Qkk pacman davinci-resolve
Time (mean ± σ): 10.589 s ± 0.018 s [User: 9.371 s, System: 1.207 s]
Range (min … max): 10.550 s … 10.620 s 10 runs
Warning: Ignoring non-zero exit code.
Summary
./target/release/paketkoll --config-files=include pacman davinci-resolve ran
13.74 ± 0.08 times faster than pacman -Qkk pacman davinci-resolve
What about some mid-sized packages (vtk 359 MB, linux 131 MB)?
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll vtk" "pacman -Qkk vtk"
Benchmark 1: ./target/release/paketkoll vtk
Time (mean ± σ): 46.4 ms ± 0.6 ms [User: 204.9 ms, System: 93.4 ms]
Range (min … max): 45.7 ms … 48.8 ms 65 runs
Benchmark 2: pacman -Qkk vtk
Time (mean ± σ): 702.7 ms ± 4.4 ms [User: 590.0 ms, System: 109.9 ms]
Range (min … max): 698.6 ms … 710.6 ms 10 runs
Summary
./target/release/paketkoll vtk ran
15.15 ± 0.23 times faster than pacman -Qkk vtk
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll linux" "pacman -Qkk linux"
Benchmark 1: ./target/release/paketkoll linux
Time (mean ± σ): 34.9 ms ± 0.3 ms [User: 95.0 ms, System: 78.2 ms]
Range (min … max): 34.2 ms … 36.4 ms 84 runs
Benchmark 2: pacman -Qkk linux
Time (mean ± σ): 313.9 ms ± 0.4 ms [User: 233.6 ms, System: 79.8 ms]
Range (min … max): 313.4 ms … 314.5 ms 10 runs
Summary
./target/release/paketkoll linux ran
9.00 ± 0.09 times faster than pacman -Qkk linux
For small sizes, where neither tool performs much work, the majority of the time is spent on fixed overheads that both tools have (loading the binary, setting up glibc internals, parsing the command line arguments, etc). For medium sizes paketkoll pulls ahead quite rapidly. And for large sizes pacman is painfully slow.
Just for laughs I decided to check an empty meta-package (base, 0 bytes). Here pacman actually beats paketkoll, slightly. Not a useful scenario, but for full transparency I should include it:
$ hyperfine -i -N --warmup 1 "./target/release/paketkoll base" "pacman -Qkk base"
Benchmark 1: ./target/release/paketkoll base
Time (mean ± σ): 13.3 ms ± 0.2 ms [User: 15.3 ms, System: 18.8 ms]
Range (min … max): 12.8 ms … 14.1 ms 218 runs
Benchmark 2: pacman -Qkk base
Time (mean ± σ): 8.8 ms ± 0.2 ms [User: 2.8 ms, System: 5.8 ms]
Range (min … max): 8.4 ms … 10.0 ms 327 runs
Summary
pacman -Qkk base ran
1.52 ± 0.05 times faster than ./target/release/paketkoll base
I always start a thread pool regardless of whether I have work to do (and changing that would slow down the case I actually care about). That is the most likely cause of this slightly larger fixed overhead.
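The pattern described here can be sketched roughly as follows in Python. This is only an illustration of an eagerly started worker pool, not how paketkoll is implemented; `hash_all`, `sha256_of`, and the worker count are hypothetical.

```python
import hashlib
import queue
import threading
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_all(paths, workers: int = 4) -> dict:
    """Hash files on a pool of worker threads that is always started."""
    tasks: queue.Queue = queue.Queue()
    results: dict = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            path = tasks.get()
            if path is None:  # sentinel: no more work for this worker
                return
            value = sha256_of(path)
            with lock:
                results[path] = value

    # The threads are started before we know whether there is any work.
    # This is the fixed cost that is paid even for an empty package.
    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for path in paths:
        tasks.put(path)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Calling `hash_all([])` still starts and joins every worker thread, which is the kind of fixed cost that shows up when the package has no files. With many large files the same pool lets several hashes run at once, which is where the pay-off comes from.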