path: root/methods/rred.cc
Commit message (Collapse)AuthorAgeFilesLines
* Allow merging with empty pdiff patchesDavid Kalnischkies2021-03-061-1/+5
| | | | | | | | There isn't a lot of sense in working on empty patches as they change nothing (quite literally), but they can be the result of merging multiple patches and so to not require our users to specifically detect and remove them, we can be nice and just ignore them instead of erroring out.
* Use error reporting instead of assert in rred patchingDavid Kalnischkies2021-02-041-68/+88
| | | | | | | | | | | The rest of our code uses return-code errors and it isn't that nice to crash the rred method on bad patches anyway, so we move to properly reporting errors in our usual way, which incidentally also helps with writing a fuzzer for it. This is not really security relevant though, as the patches passed hash verification while they were downloaded, so an attacker has to take over a trusted repository first, which offers far better options for an attack.
* Implement encoded URI handling in all methodsDavid Kalnischkies2020-12-181-2/+2
| | | | | | | | Every method opts in to getting the encoded URI passed along while keeping compat in case we are operated by an older acquire system. Effectively this is just a change for the http-based methods as the others just decode the URI as they work with files directly.
* Support compressed output from rred similar to apt-helper cat-filefeature/rredDavid Kalnischkies2020-11-071-2/+34
|
* Support reading compressed patches in rred direct call modesDavid Kalnischkies2020-11-071-1/+1
| | | | | | The acquire system mode has done this for a long time already, and as it is easy to implement and handy for manual testing as well, we can support it in the other modes, too.
* Prepare rred binary for external usageDavid Kalnischkies2020-11-071-45/+86
| | | | | | | | | | | Merging patches is a bit of non-trivial code we have for client-side work, but as we also support server-side merging we can export this functionality so that server software can reuse it. Note that this just cleans up and makes rred behave a bit more like all our other binaries by supporting setting configuration at runtime and supporting --help and --version. If you can make do without this, the now advertised functionality is provided already in earlier versions.
* apt-pkg: URI: Add 'explicit' to single argument constructorJulian Andres Klode2019-04-301-1/+1
| | | | | This needs a fair amount of changes elsewhere in the code, hence this is separate from the previous commits.
* Sandbox methods with seccomp-BPF; except cdrom, gpgv, rshJulian Andres Klode2017-10-221-1/+4
| | | | | | | | | | | | This reduces the number of syscalls to about 140 from about 350 or so, significantly reducing security risks. Also change prepare-release to ignore the architecture lists in the build dependencies when generating the build-depends package for travis. We might want to clean up things a bit more and/or move it somewhere else.
* Reformat and sort all includes with clang-formatJulian Andres Klode2017-07-121-7/+7
| | | | | | | | | | | | | This makes it easier to see which headers include what. The changes were done by running git grep -l '#\s*include' \ | grep -E '.(cc|h)$' \ | xargs sed -i -E 's/(^\s*)#(\s*)include/\1#\2 include/' to modify all include lines by adding a space, and then running ./git-clang-format.sh.
* stop rred from leaking debug messages on recovered errorsDavid Kalnischkies2017-01-191-3/+6
| | | | | | | | rred can fail for a plethora of reasons, but its failure is usually recoverable (Ign lines) so it shouldn't leak unrequested debug messages to an observing user. Closes: #850759
* try not to call memcpy with length 0 in hash calculationsDavid Kalnischkies2016-09-011-4/+5
| | | | | | | | | | memcpy is marked as nonnull for its input, but ignores the input anyhow if the declared length is zero. Our SHA2 implementations do this as well, it was "just" MD5 and SHA1 missing, so we add the length check here as well as along the callstack as it is really pointless to do all these method calls for "nothing". Reported-By: gcc -fsanitize=undefined
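The guard described above can be sketched as follows. This is a minimal illustration, not the apt code: `HashBuffer` and its `Add` method are hypothetical stand-ins for a hash context that copies its input with memcpy.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a hash context that buffers its input.
struct HashBuffer {
   std::vector<unsigned char> data;
   std::size_t copies = 0; // how often bytes were actually copied

   // Returns true on success; a zero-length add is a successful no-op.
   bool Add(const unsigned char *input, std::size_t size) {
      if (size == 0)
         return true; // never hand a possibly-null pointer to memcpy
      std::size_t const old = data.size();
      data.resize(old + size);
      std::memcpy(data.data() + old, input, size);
      ++copies;
      return true;
   }
};
```

Returning early also skips the bookkeeping around the copy, which matches the commit's remark that doing all these calls for "nothing" is pointless.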
* implement generic config fallback for methodsDavid Kalnischkies2016-08-101-1/+1
| | | | | | | | | | The https method has for a long while implemented a hardcoded fallback to the same options in http, which, while it works, is rather inflexible if we want to allow the methods to use another name to change their behavior slightly, like apt-transport-tor does to https – most of the diff being s#https#tor#g, which then fails to do the full circle fallthrough tor -> https -> http for https sources. With this config infrastructure this could be implemented now.
* rred: truncate result file before writing to itDavid Kalnischkies2016-07-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | If another file in the transaction fails and hence dooms the transaction, we can end up in a situation in which a -patched file (= rred writes the result of the patching to it) remains in the partial/ directory. The next apt call will perform the rred patching again and write its result to the -patched file again, but instead of starting with an empty file as intended it will overwrite the content previously in the file. This gives the same result if the new content happens to be at least as long as the old content, but if it is shorter, parts of the old content remain in the file. The file will still pass verification, as the new content written to it matches the hashes, and if the entire transaction passes the file will be moved to the lists/ directory, where it might or might not trigger errors depending on whether the old content which remained forms a valid file together with the new content. This has no real security implications as no untrusted data is involved: The old content consists of a base file which passed verification and a bunch of patches which all passed multiple verifications as well, so the old content isn't controllable by an attacker and the new one isn't either (as the new content alone passes verification). So the best an attacker can do is letting the user run into the same issue as in the report. Closes: #831762
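The effect of a missing truncation can be reproduced with standard C++ streams. The sketch below is an illustration under assumptions: `WriteResult` and `ReadAll` are hypothetical helpers, and apt itself uses its own FileFd class rather than std::fstream.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Write `content` to `path`. With truncate == false the existing file is
// opened for in-place update, so bytes beyond the new content survive.
bool WriteResult(const std::string &path, const std::string &content, bool truncate) {
   std::ios::openmode mode = std::ios::out | std::ios::binary;
   mode |= truncate ? std::ios::trunc : std::ios::in;
   std::fstream out(path, mode);
   if (!out)
      return false;
   out << content;
   out.flush();
   return static_cast<bool>(out);
}

// Read the whole file back for inspection.
std::string ReadAll(const std::string &path) {
   std::ifstream in(path, std::ios::binary);
   return std::string(std::istreambuf_iterator<char>(in),
                      std::istreambuf_iterator<char>());
}
```

Writing a short result over a longer leftover without truncation leaves the tail of the old file in place, which is the stale-content situation the commit message describes.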
* verify hash of input file in rredDavid Kalnischkies2016-07-261-16/+41
| | | | | | | | | | | | We read the entire input file we want to patch anyhow, so we can also calculate the hash for that file and compare it with what we had expected it to be. Note that this isn't really a security improvement as a) the file we patch is trusted & b) if the input is incorrect, the result will hardly be matching, so this is just for failing slightly earlier with a more relevant error message (although, in terms of rred, it is ignored and a complete download is attempted instead).
* use std::locale::global instead of setlocaleDavid Kalnischkies2016-05-281-3/+1
| | | | | | We use a wild mixture of C and C++ ways of generating output, so having a consistent world-view in both styles sounds like a good idea and should help in preventing regressions.
* rred: If there were I/O errors, failJulian Andres Klode2016-02-041-0/+5
| | | | | We basically ignored errors from writing and flushing, let's not do that.
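A minimal sketch of what checking write and flush results can look like with C stdio. `WriteChecked` is a hypothetical helper for illustration; the real method reports failures through apt's own error machinery.

```cpp
#include <cstdio>
#include <string>

// Returns false if opening, writing, flushing or closing fails.
bool WriteChecked(const std::string &path, const std::string &content) {
   std::FILE *out = std::fopen(path.c_str(), "wb");
   if (out == nullptr)
      return false;
   bool ok = std::fwrite(content.data(), 1, content.size(), out) == content.size();
   ok = (std::fflush(out) == 0) && ok;
   ok = (std::fclose(out) == 0) && ok; // always close, even after an error
   return ok;
}
```

Checking the flush and close results matters because buffered data may only hit the disk (and fail) at that point.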
* act on various suggestions from cppcheckDavid Kalnischkies2016-01-261-0/+4
| | | | | Reported-By: cppcheck Git-Dch: Ignore
* allow pdiff bootstrap from all supported compressorsDavid Kalnischkies2016-01-081-2/+2
| | | | | | | There is no reason to enforce that the file we start the bootstrap with is compressed with a compressor which is available online. This allows us to change the on-disk format as well as deals with repositories adding/removing support for a specific compressor.
* rred: Run in parallelJulian Andres Klode2016-01-071-1/+1
| | | | | Remove the SingleInstance flag so we can use the new randomized queue feature to run parallel.
* rred: Use buffered writesJulian Andres Klode2015-12-271-3/+7
| | | | | Buffered writes improve performance a lot, given that we spent about 78% of the time in _write.
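Why buffering helps can be shown with a toy model that counts how often the underlying sink is touched. `BufferedSink` is hypothetical and only simulates the file; it does not reflect how rred actually buffers.

```cpp
#include <cstddef>
#include <string>

// Collects small writes and passes them on in larger batches.
class BufferedSink {
   std::string pending_;
   std::size_t const capacity_;

public:
   std::string output;      // stands in for the file on disk
   std::size_t flushes = 0; // number of simulated low-level writes

   explicit BufferedSink(std::size_t capacity) : capacity_(capacity) {}

   void Write(const std::string &chunk) {
      pending_ += chunk;
      if (pending_.size() >= capacity_)
         Flush();
   }

   void Flush() {
      if (pending_.empty())
         return;
      output += pending_;
      pending_.clear();
      ++flushes;
   }
};
```

With a capacity of 8 bytes, ten 2-byte writes reach the sink in three batches instead of ten separate calls.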
* rred: Only call pkgInitConfig() in test modeJulian Andres Klode2015-12-271-2/+2
| | | | | | | This accidentally slipped in in a previous commit, but it should be used only for testing mode. Reported-By: David Kalnischkies <david@kalnischkies.de>
* Convert most callers of isspace() to isspace_ascii()Julian Andres Klode2015-12-271-0/+3
| | | | | This converts all callers that read machine-generated data, callers that might work with user input are not converted.
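A locale-independent whitespace check of this kind might look like the sketch below. `IsSpaceAscii` is a hypothetical name and body chosen for illustration; the actual isspace_ascii() in apt may be written differently.

```cpp
// True only for the six characters std::isspace accepts in the "C" locale,
// regardless of which locale the process has set.
inline bool IsSpaceAscii(int c) {
   return c == ' ' || c == '\t' || c == '\n' ||
          c == '\v' || c == '\f' || c == '\r';
}
```

For machine-generated data this keeps parsing stable even if the user's locale classifies additional bytes as whitespace.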
* rred: Allow passing files as arguments for compressor testingJulian Andres Klode2015-12-261-2/+14
| | | | | | | | This introduces a -t mode in which the first argument is input, the second is output and the remaining are diffs. This allows us to test patching compressed files, which are detected using their file extension.
* apply various suggestions made by cppcheckDavid Kalnischkies2015-11-051-2/+2
| | | | | Reported-By: cppcheck Git-Dch: Ignore
* allow acquire method specific options via Binary scopeDavid Kalnischkies2015-11-051-12/+3
| | | | | | | | Allows users who know what they are getting themselves into with this trick to e.g. disable privilege dropping for e.g. file:// until they can fix up the permissions on those repositories. It helps also the test framework and people with a similar setup (= me) to run in less modified environments.
* avoid using global PendingError to avoid failing too often too soonDavid Kalnischkies2015-09-141-1/+1
| | | | | | | | | | | | | | | | | | | Our error reporting has historically grown into some kind of mess. A while ago I implemented stacking for the global error which is used in this commit now to wrap calls to functions which do not report (all) errors via return, so that only failures in those calls cause a failure to propagate down the chain rather than failing if anything (potentially totally unrelated) has failed at some point in the past. This way we can avoid stopping the entire acquire process just because a single source produced an error for example. It also means that after the acquire process the cache is generated – even if the acquire process had failures – as we still have the old good data around we can and should generate a cache for (again). There are probably more instances of this hiding, but all these looked like the easiest to work with and fix with reasonable (aka net-positive) effects.
* implement PDiff patching for compressed filesDavid Kalnischkies2015-08-281-37/+47
| | | | | | | | | | | | | | | | | | Some additional files like 'Contents' are very big and should therefore be kept compressed on the disk, which apt-file did in the past. It also implemented pdiff patching of these files by un- and recompressing these files on-the-fly, with this commit we can do the same – but we can do this in both pdiff patching styles (client and server merging) and secured by hashes. Hashes are in so far slightly complicated as we can't compare the hashes of the compressed files as we might compress them differently than the server would (different compressor versions, options, …), so we must compare the hashes of the uncompressed content. While this commit has changes in public headers, the classes it changes are marked as hidden, so nobody can use them directly, which means the ABI break is internal only.
* add c++11 override marker to overridden methodsDavid Kalnischkies2015-08-101-2/+2
| | | | | | | | | C++11 adds the 'override' specifier to mark that a method is overriding a base class method and error out if not. We hide it in the APT_OVERRIDE macro to ensure that we keep compiling in pre-c++11 standards. Reported-By: clang-modernize -add-override -override-macros Git-Dch: Ignore
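The macro technique can be sketched like this. `DEMO_OVERRIDE`, `Method` and `Patcher` are hypothetical names used for illustration; the exact definition of APT_OVERRIDE is not shown in this log, so the condition below is an assumption.

```cpp
// Assumed definition: expand to `override` only when C++11 is available.
#if __cplusplus >= 201103L
#define DEMO_OVERRIDE override
#else
#define DEMO_OVERRIDE
#endif

struct Method {
   virtual ~Method() {}
   virtual int Run() { return 0; }
};

struct Patcher : Method {
   // With C++11 the compiler rejects this declaration if no base class
   // method with a matching signature exists.
   virtual int Run() DEMO_OVERRIDE { return 1; }
};
```

On older standards the macro expands to nothing, so the code still compiles, just without the extra check.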
* replace ULONG_MAX with c++ style std::numeric_limitsDavid Kalnischkies2015-06-091-2/+2
| | | | | | | For some reason travis seems to be unhappy about it, claiming it is not defined. Well, let's not think too deeply about it… Git-Dch: Ignore
* support hashes for compressed pdiff filesDavid Kalnischkies2015-06-091-1/+1
| | | | | | | | At the moment we only have hashes for the uncompressed pdiff files, but via the new '$HASH-Download' field in the .diff/Index hashes can be provided for the .gz compressed pdiff file, which apt will pick up now and use to verify the download. Now, we "just" need a buy in from the creators of repositories…
* add more parsing error checking for rredDavid Kalnischkies2015-06-091-21/+49
| | | | | | | The rred parser is very accepting regarding 'invalid' files. Given that we can't trust the input it might be a bit too relaxed. In any case, checking for more errors can't hurt given that we support only a very specific subset of ed commands.
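To illustrate what strict parsing of such a subset can look like, here is a minimal sketch. It assumes the subset consists of the `a`, `c` and `d` commands that `diff --ed` emits; `EdCommand` and `ParseEdCommand` are hypothetical names, and the checks are examples rather than the ones rred performs.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

struct EdCommand {
   unsigned long first;
   unsigned long last;
   char type; // 'a' (append), 'c' (change) or 'd' (delete)
};

// Accepts only "N<cmd>" or "N,M<cmd>"; everything else is rejected.
bool ParseEdCommand(const std::string &line, EdCommand &cmd) {
   const char *p = line.c_str();
   if (*p < '0' || *p > '9')
      return false; // strtoul alone would accept signs and whitespace
   char *end = nullptr;
   errno = 0;
   cmd.first = std::strtoul(p, &end, 10);
   if (errno != 0)
      return false; // number out of range
   cmd.last = cmd.first;
   if (*end == ',') {
      p = end + 1;
      if (*p < '0' || *p > '9')
         return false;
      errno = 0;
      cmd.last = std::strtoul(p, &end, 10);
      if (errno != 0)
         return false;
   }
   if (cmd.last < cmd.first)
      return false; // reversed range
   cmd.type = *end;
   if (cmd.type != 'a' && cmd.type != 'c' && cmd.type != 'd')
      return false; // unsupported or missing command letter
   if (end[1] != '\0')
      return false; // trailing garbage after the command
   if (cmd.type != 'a' && cmd.first == 0)
      return false; // only append may address "line 0"
   return true;
}
```

Rejecting anything outside the expected shape is cheap and keeps malformed input from being silently misinterpreted as a different edit.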
* check patch hashes in rred worker instead of in the handlerDavid Kalnischkies2015-06-091-10/+52
| | | | | | | | | | | | | | | | rred is responsible for unpacking and reading the patch files in one go, but we currently only have hashes for the uncompressed patch files, so the handler read the entire patch file before dispatching it to the worker which would read it again – both with an implicit uncompress. Worse, while the workers operate in parallel the handler is the central orchestration unit, so having it busy with work means the workers do (potentially) nothing. This means rred is working with 'untrusted' data, which is bad. Yet, having the unpack in the handler meant that the untrusted uncompress was done as root which isn't better either. Now, we have it at least contained in a binary which we can harden a bit better. In the long run, we want hashes for the compressed patch files though, to be safe.
* calculate only expected hashes in methodsDavid Kalnischkies2015-04-191-1/+1
| | | | | | | | | | | | | | Methods get told which hashes are expected by the acquire system, which means we can use this list to restrict what we calculate in the methods as any extra we are calculating is wasted effort as we can't compare it with anything anyway. Adding support for a new hash algorithm is therefore 'free' now and if an algorithm is no longer provided in a repository for a file, we automatically stop calculating it. In practice this results in a speed-up in Debian as we don't have SHA512 here (so far), so we practically stop calculating it.
* Assert statement calls a function which may have desired side effects: ↵David Kalnischkies2014-11-081-2/+2
| | | | | | | | | | 'pos_is_okay' It does not have any desired side effect, so we just mark it as const to properly advertise this fact to developer, compiler and linter alike. Reported-By: cppcheck Git-Dch: Ignore
* cleanup headers and especially #includes everywhereDavid Kalnischkies2014-03-131-3/+2
| | | | | | | | Beside being a bit cleaner it hopefully also resolves oddball problems I have with high levels of parallel jobs. Git-Dch: Ignore Reported-By: iwyu (include-what-you-use)
* fix -Wformat= warnings about size_t != %lu on e.g. armelDavid Kalnischkies2014-03-131-6/+6
| | | | | Git-Dch: Ignore Reported-By: gcc
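One common way to avoid this class of warning is the `%zu` conversion, which matches size_t on every architecture. Whether the commit used `%zu` or a cast is not stated in this log; `DescribeSize` is a hypothetical helper for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a size_t portably; "%lu" would only be correct where size_t
// happens to be unsigned long.
std::string DescribeSize(std::size_t bytes) {
   char buf[64];
   std::snprintf(buf, sizeof(buf), "%zu bytes", bytes);
   return buf;
}
```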
* use utimes instead of utimensat/futimensDavid Kalnischkies2014-02-111-4/+5
| | | | | | | | | | | cppcheck complains about the obsolete utime as it was removed in POSIX.1-2008 and recommends usage of utimensat/futimens instead as those are in POSIX and so commit 9ce3cfc9 switched to them. It is just that they aren't as portable as the standard suggests: At least our kFreeBSD and Hurd ports stumble over it at runtime. So to make both the ports and cppcheck happy, we use utimes instead. Closes: 738567
* fix various style/performance warnings in rredDavid Kalnischkies2014-01-301-43/+24
| | | | | Reported-By: cppcheck Git-Dch: Ignore
* methods/rred: minor robustness improvementsAnthony Towns2014-01-211-19/+20
| | | | | Use retry_fwrite to better handle partial fwrite successes, and to keep the Hashes in sync with what's actually written.
* integrate Anthony's rred with POC for client-side mergeDavid Kalnischkies2014-01-151-68/+26
| | | | | | | | | | | | | Providing the benefits of both without the downsides :) (ABI breaks or external dependencies) For this Anthony's rred is equipped with: - magic-filename-pickup of patches rather than explicit messages - use of FileFd instead of FILE* to get on-the-fly uncompress of the gzip compressed pdiff patches The acquire code in turn stops checking for apt-file's helper as our own rred is now clever enough for our needs.
* reimplement rred to allow applying all the diffs in a single passAnthony Towns2014-01-151-600/+684
| | | | | | | | | | | | | | | Based on the idea presented in: https://lists.debian.org/deity/2009/08/msg00169.html and https://lists.debian.org/debian-devel/2014/01/msg00081.html It reads all patches one by one and merges them in-memory before applying the merged changes to the index. Beware: This commit by David Kalnischkies rips out the rred binary rewrite unchanged (except minor format issue corrections) from the proposed changes, so this commit alone BREAKS pdiff completely. The integration into the acquire system as it was prepared in the previous POC will be done in the next commit to have proper 'blame'.
* implement POC client-side merging of pdiffs via apt-fileDavid Kalnischkies2013-12-131-32/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea of pdiffs is to avoid downloading the whole file by patching the existing index. This works very well, but becomes slow if a lot of patches need to be applied to reconstruct an up-to-date index and in recent years more and more dinstall (or similar) runs are executed creating more and more pdiffs in the same amount of time, so pdiffs became less useful. The solution is simple: Reduce the amount of patches (which are very small) which need to be applied on top of the index we have available (which is usually pretty big). This can be done in two ways: Either merge the patches on the server-side so that the client has to download only one patch or the patches are all downloaded and merged on the client-side. The first needs a client who is doing one step at a time who can also skip patches if it needs (APT supports this for a long time now). The latter is implemented by this commit, but depends on the server NOT merging the patches and the patches being in a strict order in which no patch is skipped. This is traditionally the case for dak, but other repository creators support merging – e.g. reprepro (which helpfully adds a flag indicating that the patches are merged). To support both or even mixes a client needs more information which isn't available for now. This POC uses the external diffindex-rred included in apt-file to do the heavy lifting of merging & applying all patches in one pass, hence to test this feature apt-file needs to be installed.
* we don't need zlib (anymore) in rred so don't include itDavid Kalnischkies2012-05-101-1/+0
|
* make these retry_write methods static so that they don't end up as symbolsDavid Kalnischkies2012-03-221-1/+1
|
* * methods/rred.cc:David Kalnischkies2012-03-201-5/+20
| | | | | | | | | | | - check return of writev() as gcc recommends * methods/mirror.cc: - check return of chdir() as gcc recommends * apt-pkg/deb/dpkgpm.cc: - check return of write() as gcc recommends * apt-inst/deb/debfile.cc: - check return of chdir() as gcc recommends * apt-inst/deb/dpkgdb.cc: - check return of chdir() as gcc recommends
* fix a few esoteric cppcheck errors/warnings/infosDavid Kalnischkies2012-01-201-1/+6
|
* as Size() can be quite expensive for compressed files let's store the resultDavid Kalnischkies2012-01-101-3/+5
|
* implement the fallback method of rred by using the FileFd and the includedDavid Kalnischkies2011-12-181-37/+12
| | | | ReadLine instead of accessing the files directly with fgets()
* - add a ReadLine methodDavid Kalnischkies2011-12-111-28/+14
| | | - drop the explicit export of gz-compression handling
* enable FileFd to guess the compressor based on the filename if requested orDavid Kalnischkies2011-12-101-1/+1
| | | | | to search for compressed siblings of the given filename and use this guessing instead of hardcoding Gzip compression