<feed xmlns='http://www.w3.org/2005/Atom'>
<title>apt/apt-pkg/pkgcachegen.h, branch 2.7.12</title>
<subtitle>Debian's command-line package manager</subtitle>
<id>https://git.kalnischkies.de/apt/atom?h=2.7.12</id>
<link rel='self' href='https://git.kalnischkies.de/apt/atom?h=2.7.12'/>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/'/>
<updated>2023-08-02T10:04:32Z</updated>
<entry>
<title>Compare SHA256 to check if versions are really the same</title>
<updated>2023-08-02T10:04:32Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>julian.klode@canonical.com</email>
</author>
<published>2023-08-01T11:59:09Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=5576e7f76da73f3f5217f90d816cc19b6c0a5a77'/>
<id>urn:sha1:5576e7f76da73f3f5217f90d816cc19b6c0a5a77</id>
<content type='text'>
If we know both SHA256 hashes and they differ, the packages differ
too. This approach stores the SHA256 only at runtime, avoiding the
overhead of storing it on disk, because when we update repositories we
update all of them anyhow.
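
The comparison rule can be sketched as follows. This is only an
illustration in Python, not the code from this commit: the function
name is hypothetical, and representing an unknown hash as None or an
empty string is an assumption made for the sketch.

```python
def versions_may_match(sha256_a, sha256_b):
    """Return False only when both hashes are known and differ."""
    # Assumption for this sketch: None or "" means "hash not known".
    if sha256_a and sha256_b and sha256_a != sha256_b:
        # Both hashes are known and they differ: different packages.
        return False
    # A missing hash, or two equal hashes, gives no reason to split them.
    return True


print(versions_may_match("aa11", "bb22"))  # False
print(versions_may_match("aa11", None))    # True
```

A missing hash never causes a mismatch here; only two known, differing
hashes do.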

Note that pkgCacheGenerator is hidden, so we can just modify its
ABI, hooray.

Closes: #931175
LP: #2029268
</content>
</entry>
<entry>
<title>Avoid undefined pointer arithmetic while growing mmap</title>
<updated>2021-02-04T10:00:00Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2020-06-08T15:07:43Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=f7e6eaf84bebac565f462e2ce48f30808cc771eb'/>
<id>urn:sha1:f7e6eaf84bebac565f462e2ce48f30808cc771eb</id>
<content type='text'>
The undefined behaviour sanitizer complains with:
runtime error: addition of unsigned offset to 0x… overflowed to 0x…

Compilers and runtime do the right thing in any case and it is a
codepath that can (and ideally should) be avoided for speed reasons
alone, but fixing it can't hurt (too much).
</content>
</entry>
<entry>
<title>Do not require libxxhash-dev for including pkgcachegen.h</title>
<updated>2020-12-17T15:20:48Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>julian.klode@canonical.com</email>
</author>
<published>2020-12-17T15:17:12Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=ece7f5bb0afee0994a4fb4380e756ce725fe67a9'/>
<id>urn:sha1:ece7f5bb0afee0994a4fb4380e756ce725fe67a9</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Use XXH3 for cache, hash table hashing</title>
<updated>2020-12-15T12:47:22Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>julian.klode@canonical.com</email>
</author>
<published>2020-12-13T20:07:03Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=1460eebf2abe913df964e031eff081a57f043697'/>
<id>urn:sha1:1460eebf2abe913df964e031eff081a57f043697</id>
<content type='text'>
XXH3 is faster than both our CRC32c implementation and the DJB
hash for hash table hashing, so meh, let's switch to it.
</content>
</entry>
<entry>
<title>Make map_pointer&lt;T&gt; typesafe</title>
<updated>2020-02-24T17:29:07Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>julian.klode@canonical.com</email>
</author>
<published>2020-02-24T16:46:10Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=4fad7262291a8af1415fb9a3693678bd9610f0d6'/>
<id>urn:sha1:4fad7262291a8af1415fb9a3693678bd9610f0d6</id>
<content type='text'>
Instead of just using uint32_t, which would allow you to
assign e.g. a map_pointer&lt;Version&gt; to a map_pointer&lt;Package&gt;,
use our own smarter struct that has strict type checking.

We allow creating a map_pointer from a nullptr, and we allow
comparing map_pointer to nullptr, which also deals with comparisons
against 0 which are often used, as 0 will be implicitly converted
to nullptr.
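
A rough runtime analogue of the idea, sketched in Python. The real
change is a C++ struct that is checked at compile time; the class
below, its assign() method, and the use of None in place of nullptr
are all assumptions made for illustration.

```python
class MapPointer:
    """An offset tagged with the kind of object it refers to."""

    def __init__(self, kind, offset=None):
        self.kind = kind
        # None plays the role of nullptr and is stored as offset 0.
        self.offset = 0 if offset is None else offset

    def assign(self, other):
        # Assigning None resets the pointer, like assigning nullptr.
        if other is None:
            self.offset = 0
            return
        # Reject pointers to a different kind of object.
        if other.kind is not self.kind:
            raise TypeError("cannot assign a pointer of a different kind")
        self.offset = other.offset

    def __eq__(self, other):
        # Comparing against None asks "is this a null pointer?".
        if other is None:
            return self.offset == 0
        return (isinstance(other, MapPointer)
                and other.kind is self.kind
                and other.offset == self.offset)


class Package:
    pass


class Version:
    pass


pkg = MapPointer(Package, 16)
ver = MapPointer(Version, 32)
print(pkg == None)                  # False
print(MapPointer(Package) == None)  # True
```

Calling pkg.assign(ver) raises TypeError in this sketch, whereas the
C++ struct rejects the equivalent assignment at compile time.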
</content>
</entry>
<entry>
<title>Wrap AllocateInMap with a templated version</title>
<updated>2020-02-24T17:09:49Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>julian.klode@canonical.com</email>
</author>
<published>2020-02-24T17:09:49Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=1f4e2ab7462f5e05e452fb8505185895d91651c2'/>
<id>urn:sha1:1f4e2ab7462f5e05e452fb8505185895d91651c2</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Replace map_pointer_t with map_pointer&lt;T&gt;</title>
<updated>2020-02-24T16:08:34Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>julian.klode@canonical.com</email>
</author>
<published>2020-02-24T16:08:34Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=c3587c0d9de852eca11d9bbc004095d54115eda4'/>
<id>urn:sha1:c3587c0d9de852eca11d9bbc004095d54115eda4</id>
<content type='text'>
This is a first step to a type safe cache, adding typing
information everywhere. Next, we'll replace the map_pointer&lt;T&gt;
implementation with a type safe one.
</content>
</entry>
<entry>
<title>Use a 32-bit djb VersionHash instead of CRC-16</title>
<updated>2020-02-18T12:39:26Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>julian.klode@canonical.com</email>
</author>
<published>2020-01-17T13:34:45Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=8c10048dce06ee0f160c86a6df07f0e6d2c34242'/>
<id>urn:sha1:8c10048dce06ee0f160c86a6df07f0e6d2c34242</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Remove includes of (md5|sha1|sha2).h headers</title>
<updated>2020-01-14T12:10:36Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>julian.klode@canonical.com</email>
</author>
<published>2020-01-07T20:21:35Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=8c1a37e12790a23f3b132899485e011f9134b483'/>
<id>urn:sha1:8c1a37e12790a23f3b132899485e011f9134b483</id>
<content type='text'>
Remove them everywhere, except where they are still needed.
</content>
</entry>
<entry>
<title>Avoid extra out-of-cache hash table deduplication for package names</title>
<updated>2020-01-08T10:13:27Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>julian.klode@canonical.com</email>
</author>
<published>2020-01-08T10:03:28Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=6902792898a9fcc3bdff605e2097e6a5cd2d6bbc'/>
<id>urn:sha1:6902792898a9fcc3bdff605e2097e6a5cd2d6bbc</id>
<content type='text'>
We were deduplicating package name strings in StoreString, but most
of them were already deduplicated by being in groups, so we had extra
hash table lookups that could be avoided in NewGroup().

To continue deduplicating names across binary packages and source
packages, insert groups for source packages as well. This is also
a good first step in allowing efficient lookup of packages by source
package - we can extend Group later by a list of SourceVersion objects,
or alternatively, simply add a by-source chain into pkgCache::Version.

This change improves performance by about 10% (913 to 814 ms), while
having no significant overhead on the cache size:

--- before
+++ after
@@ -1,7 +1,7 @@
-Total package names: 109536 (2.191 k)
-Total package structures: 118689 (4.748 k)
+Total package names: 119642 (2.393 k)
+Total package structures: 118687 (4.747 k)
   Normal packages: 83309
-  Pure virtual packages: 3365
+  Pure virtual packages: 3363
   Single virtual packages: 17811
   Mixed virtual packages: 1973
   Missing: 12231
@@ -10,21 +10,21 @@ Total distinct descriptions: 149291 (3.583 k)
 Total dependencies: 484135/156650 (12,2 M)
 Total ver/file relations: 57421 (1.378 k)
 Total Desc/File relations: 18219 (437 k)
-Total Provides mappings: 29963 (719 k)
+Total Provides mappings: 29959 (719 k)
 Total globbed strings: 226993 (5.332 k)
 Total slack space: 26,8 k
-Total space accounted for: 38,1 M
+Total space accounted for: 38,3 M
 Total buckets in PkgHashTable: 50503
-  Unused: 5727
-  Used: 44776
-  Utilization: 88.6601%
-  Average entries: 2.65073
+  Unused: 5728
+  Used: 44775
+  Utilization: 88.6581%
+  Average entries: 2.65074
   Longest: 60
   Shortest: 1
 Total buckets in GrpHashTable: 50503
-  Unused: 5727
-  Used: 44776
-  Utilization: 88.6601%
-  Average entries: 2.44631
-  Longest: 10
+  Unused: 4649
+  Used: 45854
+  Utilization: 90.7946%
+  Average entries: 2.60919
+  Longest: 11
   Shortest: 1
</content>
</entry>
</feed>
