summaryrefslogtreecommitdiff
path: root/apt-pkg/deb/deblistparser.cc
Commit message (Collapse)AuthorAgeFilesLines
* Allow no spaces for the last dependency in ParseDepends, tooDavid Kalnischkies2024-04-171-1/+1
| | | | | | | | | | | | | | | | | | | | All other entries in a dependency line get substantial leeway about the amount of spaces surrounding the entry itself and its individual parts, but the very last entry was required to have a version constraint be at least 4 chars long (excluding opening bracket and spaces following it), so if the version is short and a single-char relation used a space had to make up for it. This is a bit unfair in comparison to the other entries who do not have such unreasonable demands, so we reduce our demand to 3 chars or longer, which is satisfied by "=1)". If it is a good idea to hate spaces that much remains unanswered by this commit, but in practice most tools (re)writing the files we parse will include spaces, so its only in files (or on the satisfy command line) directly edited by users that we can encounter such a situation, which is a relatively new development given this line came unchanged from the introduction of this method in 1998. LP: #2061834
* Parse unsupported != relation in dependenciesDavid Kalnischkies2024-03-071-1/+11
| | | | | | | | | | | | | | | | | libapt has a NotEquals relation for version constraints in dependencies, which is used internally e.g. in the MultiArch implementation, but this relation is not supported by Debian policy and as such can not be used in packages. Our parser here is extremely accepting, even unknown relations are parsed as Equals relation – but the version that must match will be a rather strange one… For our own testcases and e.g. on the command line with 'satisfy' it can make sense to have != available… and what strange things apt does parsing unsupported relations is not really much of a concern. Real packages will not have such relations anyhow as we are (mostly) just a consumer, not a producer of packages and index files.
* Modernize standard library includesJulian Andres Klode2024-02-201-3/+3
| | | | | | This was automated with sed and git-clang-format, and then I had to fix up the top of policy.cc by hand as git-clang-format accidentally indented it by two spaces.
* Stop calculating Description-md5 if missingJulian Andres Klode2023-10-041-19/+1
| | | | | | | | This avoids the rabbit hole of md5 on FIPS systems, and repositories have moved to including the value as well. Also stop validating the field, this can be an arbitrary string as far as we are concerned.
* Use pkgTagSection::Key in more places in src:aptDavid Kalnischkies2022-04-011-11/+10
| | | | | | | | | | The speed critical paths were converted earlier, but the remaining could benefit a tiny bit from this as well especially as we have the facility now available and can therefore brush up the code in various places in the process as well. Also takes the time to add the hidden Exists method advertised in the headers, but previously not implemented.
* Drop support for long obsoleted Suggests alias: OptionalDavid Kalnischkies2022-04-011-4/+1
| | | | | dpkg-dev stopped recognizing it in 2007 (1.14.7) while building packages. The rename itself happened in 1995 (0.93.72).
* Barbarian M-A:allowed don't satisfy :any deps of other archsDavid Kalnischkies2021-09-041-4/+13
| | | | | | | | | | | | | | | | | | | | | What does a M-A:allowed package from non-native/non-foreign architecture provide? If we look at M-A:foreign, such a package satisfies dependencies within its own architecture, but not in other architectures, so the same should apply to :any dependencies on M-A:allowed packages, but we have a problem: While unqualified package names are architecture-specific, the virtual package name qualified with :any is not (see 3addaba1ff). We could of course make it architecture-specific now, but that would introduce many virtual packages for this relatively minor usecase and would reintroduce a need for special display handling. So, we pull a trick here: Barbarian M-A:allowed packages do not provide the architecture-independent :any package anymore, but only a specific one and every :any dependency from a barbarian package is rewritten to an or-group of the specific and the independent :any package. References: 3addaba1ff
* Store size from volatile sources for already installed versionsDavid Kalnischkies2021-06-101-1/+6
| | | | | | | | Volatile sources are parsed after the status file, so if we have a version already installed the size information is not stored, so that a reinstall of said version is refused claiming a broken repository. References: 1412cf51403286e9c040f9f86fd4d8306e62aff2
* Include all translations when building the cacheJulian Andres Klode2021-01-271-1/+1
| | | | | | | | | We do download all translations we ever downloaded, but we don't add all of those to the cache, meaning that if we run update with LANG=C, it might still download your de_DE translation, but it won't insert it into the cache, causing your de_DE user to not get translated messages. LP: #1907850
* Add support for Phased-Update-PercentageJulian Andres Klode2021-01-081-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for Phased-Update-Percentage by pinning upgrades that are not to be installed down to 1. The output of policy has been changed to add the level of phasing, and documentation has been improved to document how phased updates work. The patch detects if it is running in a chroot, and if so, always includes phased updates, restoring classic apt behavior to avoid behavioral changes on buildd chroots. Various options are added to control this all: * APT::Get::{Always,Never}-Include-Phased-Updates and their legacy update-manager equivalents to always or never include phased updates * APT::Machine-ID can be set to a UUID string to have all machines in a fleet phase the same * Dir::Etc::Machine-ID is weird in that it's default is sort of like ../machine-id, but not really, as ../machine-id would look up $PWD/../machine-id and not relative to Dir::Etc; but it allows you to override the path to machine-id (as opposed to the value) * Dir::Bin::ischroot is the path to the ischroot(1) binary which is used to detect whether we are running in a chroot.
* Determine autoremovable kernels at run-timeJulian Andres Klode2021-01-041-1/+7
| | | | | | | | | | | | | | | | | | | | | | | Our kernel autoremoval helper script protects the currently booted kernel, but it only runs whenever we install or remove a kernel, causing it to protect the kernel that was booted at that point in time, which is not necessarily the same kernel as the one that is running right now. Reimplement the logic in C++ such that we can calculate it at run-time: Provide a function to produce a regular expression that matches all kernels that need protecting, and by changing the default root set function in the DepCache to make use of that expression. Note that the code groups the kernels by versions as before, and then marks all kernel packages with the same version. This optimized version inserts a virtual package $kernel into the cache when building it to avoid having to iterate over all packages in the cache to find the installed ones, significantly improving performance at a minor cost when building the cache. LP: #1615381
* Add basic support for the Protected fieldJulian Andres Klode2020-06-291-0/+2
| | | | This will be mapped to Important for the time being.
* Do not sent our filename-provides trick to EDSP solversDavid Kalnischkies2020-06-141-5/+4
| | | | | | | | | | | | If package is installed via an explicitly given deb file we store the filename as a provides, so that the frontend can request the filename and get the usual "Selected foo instead of foo.deb" message. We do not need to trouble the EDSP solvers with that though as these provides are not valid in various ways and we have already solved the link between commandline and package (and version) for them. Closes: #962741
* Make map_pointer<T> typesafeJulian Andres Klode2020-02-241-2/+2
| | | | | | | | | | | Instead of just using uint32_t, which would allow you to assign e.g. a map_pointer<Version> to a map_pointer<Package>, use our own smarter struct that has strict type checking. We allow creating a map_pointer from a nullptr, and we allow comparing map_pointer to nullptr, which also deals with comparisons against 0 which are often used, as 0 will be implictly converted to nullptr.
* Remove CRC-16 implementationJulian Andres Klode2020-02-181-1/+0
|
* Use a 32-bit djb VersionHash instead of CRC-16Julian Andres Klode2020-02-181-4/+4
|
* Allow querying all binaries built by a source packageJulian Andres Klode2020-01-171-22/+12
| | | | | | This adds a simple way to lookup binaries by a source package, but this adds all binaries into one list, even with different source versions. Be careful.
* Convert users of {MD5,SHA1,SHA256,SHA512}Summation to use HashesJulian Andres Klode2020-01-141-3/+3
| | | | | | | This makes use of the a function GetHashString() that returns the specific hash string. We also need to implement another overload of Add() for signed chars with sizes, so the existing users do not require reinterpret_cast everywhere.
* Avoid extra out-of-cache hash table deduplication for package namesJulian Andres Klode2020-01-081-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We were de-duplicating package name strings in StoreString, but also deduplicating most of them by them being in groups, so we had extra hash table lookups that could be avoided in NewGroup(). To continue deduplicating names across binary packages and source packages, insert groups for source packages as well. This is also a good first step in allowing efficient lookup of packages by source package - we can extend Group later by a list of SourceVersion objects, or alternatively, simply add a by-source chain into pkgCache::Version. This change improves performance by about 10% (913 to 814 ms), while having no significant overhead on the cache size: --- before +++ after @@ -1,7 +1,7 @@ -Total package names: 109536 (2.191 k) -Total package structures: 118689 (4.748 k) +Total package names: 119642 (2.393 k) +Total package structures: 118687 (4.747 k) Normal packages: 83309 - Pure virtual packages: 3365 + Pure virtual packages: 3363 Single virtual packages: 17811 Mixed virtual packages: 1973 Missing: 12231 @@ -10,21 +10,21 @@ Total distinct descriptions: 149291 (3.583 k) Total dependencies: 484135/156650 (12,2 M) Total ver/file relations: 57421 (1.378 k) Total Desc/File relations: 18219 (437 k) -Total Provides mappings: 29963 (719 k) +Total Provides mappings: 29959 (719 k) Total globbed strings: 226993 (5.332 k) Total slack space: 26,8 k -Total space accounted for: 38,1 M +Total space accounted for: 38,3 M Total buckets in PkgHashTable: 50503 - Unused: 5727 - Used: 44776 - Utilization: 88.6601% - Average entries: 2.65073 + Unused: 5728 + Used: 44775 + Utilization: 88.6581% + Average entries: 2.65074 Longest: 60 Shortest: 1 Total buckets in GrpHashTable: 50503 - Unused: 5727 - Used: 44776 - Utilization: 88.6601% - Average entries: 2.44631 - Longest: 10 + Unused: 4649 + Used: 45854 + Utilization: 90.7946% + Average entries: 2.60919 + Longest: 11 Shortest: 1
* Show details about the package with bad ProvidesDavid Kalnischkies2019-07-081-3/+4
| | | | | | | | The error messages say only which package it was trying to provide, but not which package & version tried it which can be misleading as to which package (version) is the offender. References: #930256
* Merge the ParseDepends functionsJulian Andres Klode2019-06-111-35/+5
|
* debListParser: Avoid native arch lookup in ParseDependsJulian Andres Klode2018-12-261-3/+3
| | | | | | | | | | | | | We called low-level ParseDepends without an architecture each time, which means each call looked up the native architecture. Store the native architecture in the class and use that when calling low-level ParseDepends from the high-level ParseDepends(). This improves performance for a cache build from 2.7 to 2.5 seconds for me. Also avoid a call when stripping multiarch, as the native architecture is passed in.
* Remove obsolete RCS keywordsGuillem Jover2018-05-071-1/+0
| | | | Prompted-by: Jakub Wilk <jwilk@debian.org>
* convert various c-style casts to C++-styleDavid Kalnischkies2017-12-131-3/+3
| | | | | | | | | | gcc was warning about ignored type qualifiers for all of them due to the last 'const', so dropping that and converting to static_cast in the process removes the here harmless warning to avoid hidden real issues in them later on. Reported-By: gcc Gbp-Dch: Ignore
* Reformat and sort all includes with clang-formatJulian Andres Klode2017-07-121-9/+9
| | | | | | | | | | | | | This makes it easier to see which headers includes what. The changes were done by running git grep -l '#\s*include' \ | grep -E '.(cc|h)$' \ | xargs sed -i -E 's/(^\s*)#(\s*)include/\1#\2 include/' To modify all include lines by adding a space, and then running ./git-clang-format.sh.
* Drop cacheiterators.h includeJulian Andres Klode2017-07-121-1/+0
| | | | | Including cacheiterators.h before pkgcache.h fails because pkgcache.h depends on cacheiterators.h.
* Strip 0: epochs from the version hashJulian Andres Klode2017-06-281-0/+5
| | | | | | | This should fix some issues with dpkg normalizing such values. Suprisingly enough apt treats the Version: field the same, even with epoch vs without, but not when searching, and does not strip the 0: from the output.
* Do not package names representing .dsc/.deb/... filesJulian Andres Klode2017-02-101-2/+13
| | | | | | | | | | | | | | | | | | | | | In the case of build-dep and other commands where a file can be passed we must make sure not to normalize the path name as that can have odd side effects, or well, cause the operation to do nothing. Test for build-dep-file is adjusted to perform the vcard check once as "vcard" and once as "VCard", thus testing that this solves the reported bug. We inline the std::transform() and optimize it a bit to not write anything in the common case (package names are defined to be lowercase, the whole transformation is just for names that should not exist...) to counter the performance hit of the added find() call (it's about 0.15% more instructions than with the existing transform, but we save about 0.67% in writes...). Closes: #854794
* ParseDepends: Support passing the desired architectureNiels Thykier2017-01-021-3/+25
| | | | | | | | | | | This is useful for e.g. Britney, where the Build-Depends would have to be parsed for multiple architectures. With this change, the call can choose the architecture without having to mess with the config. Signed-off-by: Niels Thykier <niels@thykier.net> Closes: #845969 (jak@d.o: made the code compile)
* Do not use MD5SumValue for Description_md5()Julian Andres Klode2016-11-221-11/+7
| | | | | | | | | | | Our profile says we spend about 5% of the time transforming the hex digits into the binary format used by HashsumValue, all for comparing them against the other strings. That makes no sense at all. According to callgrind, this reduces the overall instruction count from 5,3 billion to 5 billion in my example, which roughly matches the 5%.
* debListParser: Micro-optimize AvailableDescriptionLanguages()Julian Andres Klode2016-11-221-8/+7
| | | | | | | | | | | | | Generating a string for each version we see is somewhat inefficient. The problem here is that the Description tag names are longer than 15 byte, and thus require an allocation on the heap, which we should avoid. It seems reasonable that 20 characters works for all languages codes used for archive descriptions, but if not, there's a warning, so we'll catch that. This should improve performance by about 2%.
* Optimize VersionHash() to not need temporary copy of inputJulian Andres Klode2016-11-221-14/+6
| | | | | | | Stop copying stuff, and just parse the bytes one by-one to the newly created AddCRC16Byte. This improves the instruction count for an update run from 720,850,121 to 455,801,749 according to callgrind.
* Introduce tolower_ascii_unsafe() and use it for hashingJulian Andres Klode2016-11-221-1/+1
| | | | | | | This one has some obvious collisions for non-alphabetical characters, like some control characters also hashing to numbers, but we don't really have those, and these are hash functions which are not collision free to begin with.
* debListParser: Convert to use pkgTagSection::Key-based lookupJulian Andres Klode2016-11-221-39/+41
| | | | | | This basically gets rid of 40-50% of the hash table lookups, making things a bit faster that way, and the profiles look far cleaner.
* add hidden config to set packages as Essential/ImportantDavid Kalnischkies2016-11-111-2/+14
| | | | | | | | | | | | | | | | You can pretty much achieve the same with a local dummy package if you want to, but libapt has an inbuilt setting for essential: "apt" which can be overridden with this option as well – it could be helpful in quick tests and what not so adding this alternative shouldn't really hurt much. We aren't going to document them much through as care must be taken in regards to the binary caches as they aren't invalidated by config options alone, so the effects of old settings could still be in them, similar to the other already existing pkgCacheGen option(s). Closes: 767891 Thanks: Anthony Towns for initial patch
* VersionHash: Do not skip too long dependency linesJulian Andres Klode2016-09-181-2/+2
| | | | | | | | | | | | | | | If the dependency line does not contain spaces in the repository but does in the dpkg status file (because dpkg normalized the dependency list), the dpkg line might be longer than the line in the repository. If it now happens to be longer than 1024 characters, it would be skipped, causing the hashes to be out of date. Note that we have to bump the minor cache version again as this changes the format slightly, and we might get mismatches with an older src cache otherwise. Fixes Debian/apt#23
* reinstalling local deb file is no downgradeDavid Kalnischkies2016-07-011-1/+1
| | | | | | | | | | | | If we have a (e.g. locally built) deb file installed and do try to install it again apt complained about this being a downgrade, but it wasn't as it is the very same version… it was just confused into not merging the versions together which looks like a downgrade then. The same size assumption is usually good, but given that volatile files are parsed last (even after the status file) the base assumption no longer holds, but is easy to adept without actually changing anything in practice.
* Fix buffer overflow in debListParser::VersionHash()Julian Andres Klode2016-06-281-2/+6
| | | | | | | | | | | If a package file is formatted in a way that that no space follows a deprecated "<", we would reformat it to "<=" and increase the length of the output by 1, which can break. Under normal circumstances with "<=" this should not be an issue. Closes: #828812
* get group again after potential remap in Source: parseDavid Kalnischkies2016-03-061-1/+3
| | | | | | | | | | | | | | | | Mysteriously segfaults only on i386 for me, but at least one reporter had the same behavior and it makes sense that this is the problem as the parsing of Source: was fixed in 1.2.2 – before the not remapped group was not used. We don't use our usual Dynamic<> trick here as we don't have it in the parser. Its a bit of a layer violation to do this parsing here, but its how it is always was… Until next time with this lovely kind of problem. Closes: 812251 Thanks: Francesco Poli and Marc Haber for testdata.
* convert Version() and Architecture() to APT::StringViewDavid Kalnischkies2016-01-261-14/+14
| | | | | | Part of hidden classes, so conversion is abi-free. Git-Dch: Ignore
* remove unused Description methods in listparsersDavid Kalnischkies2016-01-261-19/+1
| | | | | | | These virtual methods are implemented in hidden classes, so we can drop them without breaking the ABI. Git-Dch: Ignore
* parse version correctly from binary Source fieldDavid Kalnischkies2016-01-261-1/+1
| | | | | | | | | | | In commit a221efc331693f8905da870141756c892911c433 I promoted the source package name and version to the binary cache for faster access by e.g. EDSP, but due to changing the interpretation length to soon we always ignored the version part of the Source field, so that packages ended up having the binary version as source version – which while usually just fine it is wrong for binary rebuilds. Closes: 812492
* treat an empty dependency field just like it doesn't existDavid Kalnischkies2016-01-251-1/+1
| | | | Git-Dch: Ignore
* use APT::StringView for GrabWordDavid Kalnischkies2016-01-151-10/+10
| | | | Git-Dch: Ignore
* fix M-A:foreign provides creation for unknown archsDavid Kalnischkies2016-01-141-0/+4
| | | | | | | | Architectures for packages which do not belong to the native nor a foreign architecture (dubbed barbarian for now) which are marked M-A:foreign still provide in their own architecture even if not for others. Also, other M-A:foreign (and allowed) packages provide in these barbarian architectures.
* debListParser: Convert another ParseDepends to StringViewJulian Andres Klode2016-01-081-2/+2
| | | | | | I overlooked this Gbp-Dch: ignore
* AvailableDescriptionLanguages: Use one string for all iterationsJulian Andres Klode2016-01-081-2/+9
| | | | | | | | | | | Do not create strings within the loop, that creates one string per language and does more work than needed. Instead, reserve enough space at the beginning and assign the prefix, and then resize and append inside the loop. Also call exists with the string itself instead of the c_str(), this means that the lookup uses the size information in the string now and does not have to call strlen() on it.
* Replace compare() == 0 checks with this == other checksJulian Andres Klode2016-01-081-4/+4
| | | | | | | This improves performance, as we now can ignore unequal strings based on their length already. Gbp-Dch: ignore
* Switch performance critical code to use APT::StringViewJulian Andres Klode2016-01-071-44/+72
| | | | | | This improves performance of the cache generation on my ARM platform (4x Cortex A15) by about 10% to 20% from 2.35-2.50 to 2.1 seconds.
* ParseDepends: Mark branches for build-dep parsing as unlikelyJulian Andres Klode2015-12-271-2/+2
| | | | | | We do not see those branches at all during normal mode of operation (that is, during cache generation), so tell the compiler about it.