path: root/apt-pkg/tagfile.cc
Commit message (Author, Age; Files, Lines)
* Parse records including empty tag names correctly (David Kalnischkies, 2020-02-26; 1 file, -3/+2)
| No sensible file should include these, but even insensible files do not gain unfair advantages with it, as this parser does not deal with security-critical files before they have passed other checks like signatures or hashsums.
|
| The problem is that the parser accepts and parses empty tag names correctly, but does not store the data parsed, which will affect later passes over the data, resulting e.g. in the following tag containing the name and value of the previous (empty) tag, its own tagname and its own value, or in a crash due to an attempt to access invalid memory, depending on who passes over the data and what is done with it.
|
| This commit fixes both the incident of the crash reported by Anatoly Trosinenko, who reproduced it via apt-sortpkgs:
| | $ cat /tmp/Packages-null
| | 0:
| | PACKAGE:0
| |
| | :
| | PACKAGE:
| |
| | PACKAGE::
| | $ apt-sortpkgs /tmp/Packages-null
| and the deeper parsing issue shown by the included testcase.
|
| Reported-By: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
| References: 8710a36a01c0cb1648926792c2ad05185535558e
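To illustrate the failure mode described in this commit message, here is a minimal sketch of a field splitter that records every field it parses, including those with an empty name. It is not the code from tagfile.cc: `Field` and `split_fields` are hypothetical names, and continuation lines and comments are deliberately ignored.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types and names for illustration only.
using Field = std::pair<std::string, std::string>;

// Splits a stanza into (name, value) pairs, one per "name:value" line.
// Simplification: continuation lines and comments are not handled.
std::vector<Field> split_fields(const std::string &stanza)
{
   std::vector<Field> fields;
   std::size_t pos = 0;
   while (pos < stanza.size())
   {
      std::size_t eol = stanza.find('\n', pos);
      if (eol == std::string::npos)
         eol = stanza.size();
      const std::string line = stanza.substr(pos, eol - pos);
      pos = eol + 1;
      const std::size_t colon = line.find(':');
      if (colon == std::string::npos)
         continue; // not a field line; a real parser would report this
      // Store the field even if its name is empty, so that a later pass
      // never attributes these bytes to the following field.
      fields.emplace_back(line.substr(0, colon), line.substr(colon + 1));
   }
   return fields;
}
```

With this sketch, the stanza ":" followed by "PACKAGE:" yields two separate entries, the first with an empty name, instead of one entry that mixes both lines.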
* tagfile: Check out-of-bounds access to Tags vector (Julian Andres Klode, 2020-02-20; 1 file, -0/+8)
| Check that the index we're going to use is within the size of the array.
* tagfile: Check if memchr() returned null before using (Julian Andres Klode, 2020-02-20; 1 file, -1/+6)
| This fixes a segmentation fault trying to read from nullptr+1, aka address 1.
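The kind of guard this commit describes can be sketched as follows. This is an assumption-laden illustration, not the actual patch: `after_colon` is a hypothetical helper that looks for a ':' and only advances past it if `memchr()` found one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not taken from tagfile.cc: returns a pointer to the
// byte after the first ':' in [begin, begin + len), or nullptr if the range
// contains no ':'.
const char *after_colon(const char *begin, std::size_t len)
{
   assert(begin != nullptr);
   const char *colon = static_cast<const char *>(std::memchr(begin, ':', len));
   if (colon == nullptr)
      return nullptr; // without this check we would hand out nullptr + 1
   return colon + 1;
}
```

Callers then have to handle the `nullptr` result explicitly, which is the point: the missing delimiter becomes a visible error path rather than a read from address 1.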
* Follow gcc-9 -Wnoexcept suggestion for FileChunk constructor (David Kalnischkies, 2019-04-16; 1 file, -1/+1)
| warning: but ‘pkgTagFilePrivate::FileChunk::FileChunk(bool, size_t)’ does not throw; perhaps it should be declared ‘noexcept’ [-Wnoexcept]
|
| Reported-By: gcc-9
| Gbp-Dch: Ignore
* tagfile: Remove deprecated pkgUserTagSection and TFRewrite (Julian Andres Klode, 2019-02-26; 1 file, -146/+2)
|
* Step over empty sections in TagFiles with comments (David Kalnischkies, 2019-02-01; 1 file, -2/+6)
| Implementing a parser with recursion isn't the best idea, but in practice we should get away with it for the time being to avoid needless code churn.
|
| Closes: #920317 #921037
* Remove obsolete RCS keywords (Guillem Jover, 2018-05-07; 1 file, -1/+0)
| Prompted-by: Jakub Wilk <jwilk@debian.org>
* Reformat and sort all includes with clang-format (Julian Andres Klode, 2017-07-12; 1 file, -5/+5)
| This makes it easier to see which header includes what.
|
| The changes were done by running
|
|    git grep -l '#\s*include' \
|      | grep -E '.(cc|h)$' \
|      | xargs sed -i -E 's/(^\s*)#(\s*)include/\1#\2 include/'
|
| to modify all include lines by adding a space, and then running ./git-clang-format.sh.
* fix various typos reported by spellintian (David Kalnischkies, 2017-01-19; 1 file, -1/+1)
| Most of them in (old) code comments. For the two instances of user-visible string changes, the po files of the manpages are fixed up as well.
|
| Gbp-Dch: Ignore
| Reported-By: spellintian
* TagSection: Introduce functions for looking up by key ids (Julian Andres Klode, 2016-11-22; 1 file, -10/+82)
| Introduce a new enum class and add functions that can do a lookup with that enum class. This uses triehash.
* TagSection: Extract Find() methods taking Pos instead of Key (Julian Andres Klode, 2016-11-22; 1 file, -20/+54)
| This allows us to add a perfect hash function to the tag file without having to reimplement the methods a second time.
* TagSection: Split AlphaIndexes into AlphaIndexes and BetaIndexes (Julian Andres Klode, 2016-11-22; 1 file, -10/+12)
| Move the use of the AlphaHash to a new second hash table in preparation for the arrival of the new perfect hash function. With the new perfect hash function hashing most of the keys for us, having 128 slots for a fallback hash function seems enough and prevents us from wasting space.
* TagFile: Fix off-by-one errors in comment stripping (Julian Andres Klode, 2016-08-31; 1 file, -2/+2)
| Adding 1 to the value of d->End - current makes restLength one byte too long: passing it to memchr(current, ..., restLength) thus has undefined behavior.
|
| Also, reading the value of current has undefined behavior if current >= d->End, not only for current > d->End: Consider a string of length 1, that is d->End = d->Current + 1. We can only read at d->Current + 0, but d->Current + 1 is beyond the end of the string.
|
| This probably caused several inexplicable build failures on hurd-i386 in the past, and just now caused a build failure on Ubuntu's amd64 builder.
|
| Reported-By: valgrind
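The two bounds described in this commit message can be sketched as below. The helper is hypothetical and only borrows the names `current`, `end` and `restLength` from the message; it is not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: `current` and `end` delimit the unread part of the
// buffer, with `end` pointing one past the last readable byte.
const char *find_newline(const char *current, const char *end)
{
   assert(current <= end);
   if (current >= end) // `current == end` is already unreadable
      return nullptr;
   // The number of readable bytes is end - current, not end - current + 1.
   const std::size_t restLength = static_cast<std::size_t>(end - current);
   return static_cast<const char *>(std::memchr(current, '\n', restLength));
}
```

For a one-byte string, `end - current` is 1, so `memchr()` inspects exactly the byte at `current` and never touches `current + 1`.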
* Switch performance critical code to use APT::StringView (Julian Andres Klode, 2016-01-07; 1 file, -21/+25)
| This improves performance of the cache generation on my ARM platform (4x Cortex A15) by about 10% to 20% from 2.35-2.50 to 2.1 seconds.
* add optional support for comments in pkgTagFile (David Kalnischkies, 2016-01-02; 1 file, -41/+208)
| APT usually deals with perfectly formatted files generated automatically by other programs – and as it has to parse multiple MBs of such files it tries to be fast rather than forgiving. This was always a problem if we reused this parser for files with a deb822 syntax which are mostly written by hand however, like apt_preferences or the deb822-style sources, as these can include stray newlines and more importantly comments all over the place.
|
| As a stopgap we had pkgUserTagSection which deals at least with comments before and after a given stanza, but comments in between weren't really supported, and now that we support parsing debian/control for e.g. build-dep we face the full comment problem, e.g. with comments in between multi-line fields (like Build-Depends).
|
| We can't easily deal with this on the pkgTagSection level as the interface gives access to 'raw' char-pointers for performance reasons, so we would need to optionally add a buffer here on which we could remove comments to hand out pointers into this buffer instead. The interface is quite large already and supports writing stanzas as well, which does not support comments at all either.
|
| So while in future it might make sense to have a parser setup which deals with and keeps comments, in this commit we opt for the simpler solution for now: We officially declare that pkgTagSection does not support comments and instead expect the caller to deal with them, which in our case is pkgTagFile: pkgTagFile is extended with an additional mode which can deal with comments by dropping them from the buffer which will later form the input of pkgTagSection.
|
| The actual implementation is slightly more complex than this sentence suggests at first, on one hand to have good performance and on the other to allow jumping directly to stanzas with offsets collected in a previous run (like our cache generation does it for example).
* pkgTagSection::Scan: Fix read of uninitialized value (Julian Andres Klode, 2015-12-29; 1 file, -1/+1)
| We ignored the boundary of the buffer we were reading in while scanning for spaces.
* deal with empty values properly in deb822 parser (David Kalnischkies, 2015-12-27; 1 file, -1/+3)
| Regression introduced in 8710a36a01c0cb1648926792c2ad05185535558e, but such fields are unlikely in practice as it is just as simple to not have a field at all with the same result of not having a value.
|
| Closes: 808102
* Convert most callers of isspace() to isspace_ascii() (Julian Andres Klode, 2015-12-27; 1 file, -9/+9)
| This converts all callers that read machine-generated data; callers that might work with user input are not converted.
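For context, the standard `isspace()` consults the current locale and has undefined behavior for negative `char` values, which is why a fixed ASCII test is attractive for machine-generated data. A locale-independent check could look like the sketch below; `is_space_ascii` is a hypothetical name, and APT's own `isspace_ascii()` may be written differently.

```cpp
// Sketch of a locale-independent whitespace test covering the six
// characters that isspace() accepts in the "C" locale.
constexpr bool is_space_ascii(char c)
{
   return c == ' ' || c == '\t' || c == '\n' ||
          c == '\v' || c == '\f' || c == '\r';
}

static_assert(is_space_ascii(' '), "space is whitespace");
static_assert(!is_space_ascii('a'), "letters are not whitespace");
```

Because the function compares against literal characters only, its result cannot change with `setlocale()`, and bytes outside the ASCII range are simply reported as non-whitespace.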
* tagfile: Hardcode error message for out of range integer values (Julian Andres Klode, 2015-12-14; 1 file, -4/+3)
| This makes the test suite work on platforms where long is 32 bits wide.
|
| Gbp-Dch: ignore
* policy: Be more strict about parsing pin files, and document prio 0 (Julian Andres Klode, 2015-08-12; 1 file, -1/+8)
| Treat invalid pin priorities and overflows as an error.
|
| Closes: #429912
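Rejecting invalid numbers and overflows instead of silently accepting them can be sketched with `strtol()` as follows. This is a generic illustration under stated assumptions: `parse_int_strict` is a hypothetical name, and the target type `int` is chosen for the example rather than taken from the APT sources.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical strict parser: accepts a decimal integer (optionally preceded
// by whitespace, as strtol allows) that fits into `int`. Returns false for
// empty input, trailing garbage and out-of-range values.
bool parse_int_strict(const char *text, int &result)
{
   assert(text != nullptr);
   errno = 0;
   char *end = nullptr;
   const long value = std::strtol(text, &end, 10);
   if (end == text || *end != '\0')
      return false; // no digits at all, or junk after the number
   if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
      return false; // overflow of long or of the target type
   result = static_cast<int>(value);
   return true;
}
```

The explicit `errno` and range checks matter because `strtol()` clamps to `LONG_MIN` or `LONG_MAX` on overflow, which would otherwise look like a valid, if surprising, value.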
* use a smaller type for flags storage in the cache (David Kalnischkies, 2015-08-10; 1 file, -0/+28)
| We store very few flags in the cache, so keeping storage space for 8 is enough for all of them and still leaves a few unused bits remaining for future extensions without wasting bytes for nothing.
|
| Git-Dch: Ignore
* remove the compatibility markers for 4.13 abi (David Kalnischkies, 2015-08-10; 1 file, -28/+0)
| We aren't and we will not be really compatible again with the previous stable abi, so let's drop these markers (which never made it into a released version) for good as they have outlived their intent already.
|
| Git-Dch: Ignore
* bring back deb822 sources.list entries as .sources (David Kalnischkies, 2015-08-10; 1 file, -0/+8)
| Having two different formats in the same file is very dirty and causes external tools to fail hard trying to parse them. It is probably not a good idea for them to parse them in the first place, but they do and we shouldn't break them if there is a better way.
|
| So we solve this issue for now by giving our deb822 format a new filename extension ".sources" which unsupporting applications are likely to ignore, and we can begin gradually moving forward rather than waiting for the unknown applications to catch up.
|
| Currently and for the foreseeable future apt is going to support both with the same feature set as documented in the manpage, with the long-term plan of adopting the 'new' format as default, but that is a long way to go and might get going more from having an easier time setting options than from us pushing it explicitly.
* fix memory leaks reported by -fsanitize (David Kalnischkies, 2015-08-10; 1 file, -3/+10)
| Various small leaks here and there. Nothing particularly big, but still good to fix. Found by the sanitizers while running our testcases.
|
| Reported-By: gcc -fsanitize
| Git-Dch: Ignore
* make all d-pointer * const pointers (David Kalnischkies, 2015-08-10; 1 file, -22/+24)
| Doing this disables the implicit copy assignment operator (among others) which would cause havoc if used on the classes as it would just copy the pointer, not the data the d-pointer points to. For most of the classes we don't need a copy assignment operator anyway and in many classes it was broken before as many contain a pointer of some sort.
|
| Only for our Cacheset Container interfaces we define an explicit copy assignment operator which could later be implemented to copy the data from one d-pointer to the other if we need it.
|
| Git-Dch: Ignore
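The effect described here follows from a C++ language rule: a class with a const non-static data member gets its implicit copy assignment operator defined as deleted. A minimal sketch with hypothetical `Widget` and `WidgetPrivate` classes (not APT types) shows this. Note that the implicit copy constructor is not affected by the const member, so the sketch deletes it explicitly.

```cpp
#include <cassert>
#include <type_traits>

struct WidgetPrivate { int value = 0; };

// Hypothetical class using the d-pointer idiom with a const pointer member.
class Widget
{
   WidgetPrivate *const d; // the pointer itself can never be reseated
public:
   Widget() : d(new WidgetPrivate()) {}
   ~Widget() { delete d; }
   Widget(const Widget &) = delete; // copying would share and double-free d
   int value() const { return d->value; }
   void set(int v) { d->value = v; } // the pointee stays modifiable
};

// The const member makes the implicit copy assignment operator deleted.
static_assert(!std::is_copy_assignable<Widget>::value,
              "Widget must not be copy-assignable");

int demo()
{
   Widget w;
   w.set(42);
   assert(w.value() == 42);
   return w.value();
}
```

A class that does need assignment, like the Cacheset containers mentioned above, has to spell out its own operator and decide what copying the private data means.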
* apply various style suggestions by cppcheck (David Kalnischkies, 2015-08-10; 1 file, -1/+1)
| Some of them modify the ABI, but given that we prepare a big one already, these few hardly count for much.
|
| Git-Dch: Ignore
* implement a more c++-style TFRewrite alternative (David Kalnischkies, 2015-05-11; 1 file, -2/+146)
| TFRewrite is okay, but it has obscure limitations (256 Tags), even more obscure bugs (order for renames is defined by the old name) and the interface is very c-style encouraging bad usage like we do it in apt-ftparchive passing massive amounts of c_str() from std::string in.
|
| The old-style is marked as deprecated accordingly. The next commit will fix all places in the apt code to not use the old-style anymore.
* sync TFRewrite*Order arrays with dpkg and dak (David Kalnischkies, 2015-05-11; 1 file, -60/+3)
| dpkg and dak know various field names and order them in their output, while we have yet another order and have to play catch up with them as we are sitting between chairs here and neither order is ideal for us either.
|
| A little testcase is from now on supposed to help ensuring that we do not deviate too far away from which fields dpkg knows and orders.
* properly implement pkgRecord::Parser for *.deb files (David Kalnischkies, 2015-03-16; 1 file, -3/+3)
| Implementing FileName() works for most cases for us, but other frontends might need more and even for us it's not very stable as the normal Jump() implementation is pretty bad on a deb file and produces errors on its own at times. So, replacing this makeshift with a complete implementation by mostly just shuffling code around.
* restore ABI of pkgTagSection (David Kalnischkies, 2014-11-08; 1 file, -30/+75)
| We have a d-pointer available here, so go ahead and use it which also helps in hiding some dirty details here. The "hard" part is keeping the abi for the inlined methods so that they don't break – at least not more than before as much of the point beside a speedup is support for more than 256 fields in a single section.
* explicit overload methods instead of adding parameters (David Kalnischkies, 2014-11-08; 1 file, -0/+6)
| Adding a new parameter (with a default) is an ABI break, but you can overload a method, which is "just" an API break for everyone doing references to this method (aka: nobody).
|
| Git-Dch: Ignore
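The trade-off named here can be sketched with a hypothetical `Scanner` class (not an APT type). On common C++ ABIs the parameter list is part of the mangled symbol name, so turning `Count(const char *)` into `Count(const char *, char = '\n')` would remove the symbol that already compiled callers link against, whereas adding an overload keeps it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class; the names are for illustration only.
class Scanner
{
public:
   // Existing signature: its symbol stays in the library unchanged.
   std::size_t Count(const char *text);
   // New overload, added next to the old one instead of replacing it
   // with `Count(const char *text, char sep = '\n')`.
   std::size_t Count(const char *text, char sep);
};

// Counts how often `sep` occurs in the NUL-terminated string `text`.
std::size_t Scanner::Count(const char *text, char sep)
{
   assert(text != nullptr);
   std::size_t n = 0;
   for (const char *p = text; *p != '\0'; ++p)
      if (*p == sep)
         ++n;
   return n;
}

// The old entry point simply forwards to the new overload.
std::size_t Scanner::Count(const char *text)
{
   return Count(text, '\n');
}
```

The API break mentioned in the message shows up only for code that names the method without calling it, for example `&Scanner::Count`, which becomes ambiguous once a second overload exists.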
* guard const-ification API changes (David Kalnischkies, 2014-11-08; 1 file, -0/+4)
| Git-Dch: Ignore
* do not inline virtual destructors with d-pointers (David Kalnischkies, 2014-10-13; 1 file, -0/+2)
| Reimplementing an inline method is opening a can of worms we don't want to open if we ever want to use a d-pointer in those classes, so we do the only thing which can save us from hell: move the destructors into the cc sources and we are good.
|
| Technically not an ABI break as the methods inline or not do the same (nothing), so a program compiled against the old version still works with the new version (beside that this version is still in experimental, so nothing really has been built against this library anyway).
|
| Git-Dch: Ignore
* Merge branch 'debian/sid' into debian/experimental (Michael Vogt, 2014-09-23; 1 file, -1/+1)
|\
| | Conflicts:
| |    apt-pkg/acquire-item.cc
| |    apt-pkg/acquire-item.h
| |    apt-pkg/cachefilter.h
| |    configure.ac
| |    debian/changelog
| * Ensure that iTFRewritePackageOrder is "MD5sum" to match apt-ftparchive (Michael Vogt, 2014-09-21; 1 file, -1/+1)
| | The iTFRewritePackageOrder is used in indexcopy to copy and normalize cdrom Packages files. This change will ensure that there is no "normalization" that changes MD5sum -> MD5Sum which alters the hash of the Packages file on disk (oh the irony).
* | Add APT::Acquire::$(host)::By-Hash=1 knob, add Acquire-By-Hash to Release file (Michael Vogt, 2014-05-22; 1 file, -0/+11)
| | The by-hash can be configured on a per-hostname basis and a Release file can indicate that it has by-hash support via a new flag. The location of the hash now matches the AptByHash spec.
* | improve pkgTagSection scanning and parsing (David Kalnischkies, 2014-05-10; 1 file, -78/+131)
| | Removes the 256 fields limit, deals consistently with spaces littered all over the place and is even a tiny bit faster than before. Even comes with a bunch of new tests to validate these claims.
* | add support for apt-get build-dep foo.dsc (Michael Vogt, 2014-04-22; 1 file, -0/+11)
|/
* follow method attribute suggestions by gcc (David Kalnischkies, 2014-03-13; 1 file, -1/+1)
| Git-Dch: Ignore
| Reported-By: gcc -Wsuggest-attribute={pure,const,noreturn}
* cleanup headers and especially #includes everywhere (David Kalnischkies, 2014-03-13; 1 file, -0/+2)
| Beside being a bit cleaner it hopefully also resolves oddball problems I have with high levels of parallel jobs.
|
| Git-Dch: Ignore
| Reported-By: iwyu (include-what-you-use)
* warning: type qualifiers ignored on function return type [-Wignored-qualifiers] (David Kalnischkies, 2014-03-13; 1 file, -1/+1)
| Reported-By: gcc -Wignored-qualifiers
| Git-Dch: Ignore
* pkgTagFile: if we have seen the end, do not try to see more (David Kalnischkies, 2014-01-30; 1 file, -1/+5)
| Asking for more via Step() will notice that we are done with the file already and will result in a fail, which means we can't find the last sections anymore (which is especially painful if we haven't moved at all, as in the testcase we haven't even looked at one of the sources, leading to a strange behaviour).
|
| Reported-By: Niall Walsh <niallwalsh@users.berlios.de>
* "apt show" show user friendly size info (Michael Vogt, 2014-01-22; 1 file, -41/+43)
| The size/installed-size is displayed via SizeToStr() and Size is rewritten to "Download-Size" to make clear what size is referred to here.
* make /etc/apt/preferences parser deal with comment only sections (Michael Vogt, 2013-12-21; 1 file, -2/+9)
|
* do not trust FileFd::Eof() in pkgTagFile::Fill() (David Kalnischkies, 2013-09-20; 1 file, -1/+1)
| The Eof check was added (by me of course) in 0aae6d14390193e25ab6d0fd49295bd7b131954f as part of a fix up ~a month ago (at DebConf). The idea was not that bad, but doesn't make that much sense either as this bit is set by the FileFd based on Actual as well, so this is basically doing the same check again – with the difference that the HitEof bit can still linger from a previous Read we did at the end of the file, but have seek'd away from it now.
|
| Combined with the length of entries, entry order and other not that easily controllable conditions you can be 'lucky' enough to hit this problem in a way which is even visible (truncating of other fields might not be visible easily, like 'Tags' and others).
|
| Closes: 723705
| Thanks: Cyril Brulebois
* do chdir("/") after chroot() (Michael Vogt, 2013-08-22; 1 file, -1/+1)
|
* Merge remote-tracking branch 'mvo/bugfix/coverity' into debian/sid (Michael Vogt, 2013-08-22; 1 file, -0/+10)
|\
| | Conflicts:
| |    apt-pkg/tagfile.h
| * memset() pkgTagSections data to make coverity happy (Michael Vogt, 2013-08-06; 1 file, -0/+10)
| |
* | use malloc instead of new[] in pkgTagFile (David Kalnischkies, 2013-08-15; 1 file, -15/+15)
| | We don't need initialized memory for pkgTagFile, but more to the point we can use realloc this way which hides the bloody details of increasing the size of the buffer used.
| |
| | Git-Dch: Ignore
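Growing a buffer with `realloc()` as this commit describes can be sketched as below. `ByteBuffer` and `grow_buffer` are hypothetical names and do not appear in tagfile.cc. The one detail worth showing is that the result of `realloc()` is kept in a temporary first, so the old allocation is not lost if the call fails.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical growable byte buffer for illustration only.
struct ByteBuffer
{
   char *data = nullptr;
   std::size_t size = 0;
};

// Resizes the buffer to newSize bytes, keeping the old contents. On failure
// the old allocation stays valid and untouched.
bool grow_buffer(ByteBuffer &buf, std::size_t newSize)
{
   assert(newSize > 0);
   void *grown = std::realloc(buf.data, newSize); // acts like malloc for nullptr
   if (grown == nullptr)
      return false;
   buf.data = static_cast<char *>(grown);
   buf.size = newSize;
   return true;
}
```

Memory obtained this way has to be released with `free()`, not `delete[]`, and the newly added bytes are uninitialized, which matches the commit's remark that initialized memory is not needed.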
* | ensure that pkgTagFile isn't writing past Buffer length (David Kalnischkies, 2013-08-15; 1 file, -9/+24)
|/
| In 91c4cc14d3654636edf997d23852f05ad3de4853 I removed the +256 from the pkgTagFile call parsing Release files as I couldn't find a mention of a reason why and it was marked as XXX which suggested that at least someone else was suspicious.
|
| It turns out that it is indeed "documented", I just didn't find it at first, but the changelog of apt 0.6.6 (29. Dec 2003) mentions:
|
|    * Restore the ugly hack I removed from indexRecords::Load which set the pkgTagFile buffer size to (file size)+256. This is concealing a bug, but I can't fix it right now. This should fix the segfaults that folks are seeing with 0.6.[45].
|
| The bug it is "hiding" is that if pkgTagFile works with a file which doesn't end in a double newline it will be adding it without checking if the Buffer is big enough to store them. It's also not a good idea to let the End pointer be past the end of our space, even if we don't access the data.
|
| Closes: 719629
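The missing check described above can be sketched as follows. This is a hypothetical helper, not the code from the fix: it appends the newlines needed to end the data in a double newline, but only if the buffer has room for them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: makes sure the first `used` bytes of `buf` end in
// "\n\n", appending at most two bytes. Returns false and leaves everything
// untouched if `capacity` is too small, so the caller can grow the buffer.
bool terminate_stanza(char *buf, std::size_t &used, std::size_t capacity)
{
   assert(buf != nullptr && used <= capacity);
   std::size_t missing = 2;
   if (used >= 1 && buf[used - 1] == '\n')
      missing = (used >= 2 && buf[used - 2] == '\n') ? 0 : 1;
   if (missing > capacity - used)
      return false; // writing here would run past the end of the buffer
   for (std::size_t i = 0; i < missing; ++i)
      buf[used++] = '\n';
   return true;
}
```

With a check like this, a buffer sized exactly to the file no longer needs the extra 256 bytes of slack mentioned in the old changelog entry; the caller sees the failure and can enlarge the buffer deliberately.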