apt/test/integration/test-pdiff-usage, branch 1.5

apt/test/integration/test-pdiff-usage, branch 1.5_rc4
Debians commandline package manager
https://git.kalnischkies.de/apt/atom?h=1.5_rc4
2017-07-26T17:07:56Z

fail early in http if server answer is too small as well
2017-07-26T17:07:56Z David Kalnischkies david@kalnischkies.de 2017-07-26T16:35:42Z urn:sha1:f2f8e89f08cdf01c83a0b8ab053c65329d85ca90
Failing on too much data is good, but we can do better by checking for exact file sizes: thanks to the hashsums we know how large a file should be, so if we get a file whose size we do not expect we can drop it directly, regardless of whether the file is larger or smaller than expected. This should catch most cases which would otherwise end up as hashsum errors later, now a lot sooner.

don't move failed pdiff indexes out of partial
2017-07-26T17:07:55Z David Kalnischkies david@kalnischkies.de 2017-07-23T23:15:55Z urn:sha1:8df85a4fb91bed6c79a3cb9c2000881cc5b42ea7
The comment says this is intended, but looking at the history reveals that the comment comes from a different era. Nowadays we don't really need it anymore (and even back then it was disputable) as we haven't used that file for our update in the end and nothing really needs this file after the update.
This is triggered by 188f297a2af4c15cb1d502360d1e478644b5b810, which moves various error conditions forward, including this code expecting the file to exist – but it doesn't need to, as the download could have failed. We could fix that by simply checking if the file exists and only staging it if it does, but instead we don't stage it at all and even rename it out of the way with our conventional FAILED name (if it exists).
That restores support for partial mirrors (= in this case mirrors which don't ship pdiff files). Note that apt heals itself even if only such a mirror is used, as the update is successful even if that error is shown.
Closes: 869425

show .diff/Index properly as ignored if we fallback
2017-06-26T21:31:15Z David Kalnischkies david@kalnischkies.de 2017-05-28T20:26:17Z urn:sha1:188f297a2af4c15cb1d502360d1e478644b5b810
Moving the code responsible for parsing the Index file from ::Done into the slightly earlier ::VerifyDone allows us to still "fail" the download if we can't make use of the Index for whatever reason, so that the progress log correctly displays "Ign" instead of "Get" for the file. This also turns quite a few debug messages into proper error messages (but those are still hidden by default for Ign lines).

fix various typos reported by spellintian
2017-01-19T14:59:38Z David Kalnischkies david@kalnischkies.de 2017-01-19T14:14:19Z urn:sha1:93cff633a830e222693fc0f3d78e6e534d1126ee
Most of them in (old) code comments. For the two instances of user-visible string changes, the po files of the manpages are fixed up as well.
Gbp-Dch: Ignore
Reported-By: spellintian

support compression and by-hash for .diff/Index files
2016-08-17T05:55:46Z David Kalnischkies david@kalnischkies.de 2016-08-16T05:47:44Z urn:sha1:77e274f5ad23d79294f28ecc9868fc6f534214a4
In af81ab9030229b4ce6cbe28f0f0831d4896fda01 by-hash got implemented as a special compression type for our usual index files like Packages. Missing in this scheme was the special .diff/Index index file containing the info about individual patches for this index file. By deriving from the index file class directly we inherit the compression handling infrastructure and in this way also get by-hash nearly for free.
Closes: #824926

implement generic config fallback for methods
2016-08-10T21:19:44Z David Kalnischkies david@kalnischkies.de 2016-07-31T16:05:56Z urn:sha1:30060442025824c491f58887ca7369f3c572fa57
The https method has for a long while now implemented a hardcoded fallback to the same options in http, which, while it works, is rather inflexible if we want to allow the methods to use another name to change their behavior slightly, like apt-transport-tor does to https – most of the diff being s#https#tor#g, which then fails to do the full circle fallthrough tor -> https -> http for https sources. With this config infrastructure this could be implemented now.

rred: truncate result file before writing to it
2016-07-27T13:52:22Z David Kalnischkies david@kalnischkies.de 2016-07-27T13:52:22Z urn:sha1:0e071dfe205ad21d8b929b4bb8164b008dc7c474
If another file in the transaction fails and hence dooms the transaction, we can end up in a situation in which a -patched file (= rred writes the result of the patching to it) remains in the partial/ directory. The next apt call will perform the rred patching again and write its result again to the -patched file, but instead of starting with an empty file as intended it will overwrite the content previously in the file. This has the same result if the new content happens to be longer than the old content, but if it isn't, parts of the old content remain in the file. The file will still pass verification, as the new content written to it matches the hashes, and if the entire transaction passes the file will be moved to the lists/ directory, where it might or might not trigger errors depending on whether the old content which remained forms a valid file together with the new content.
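The stale-tail effect described above is easy to reproduce outside of apt. The following is only a minimal Python sketch of the general file-handling pitfall, not rred's actual code; the file name `Packages-patched` and all helper names are hypothetical.

```python
import os
import tempfile


def write_without_truncate(path, data):
    # Opening an existing file with "r+b" keeps its old bytes; only the
    # first len(data) bytes are overwritten (this mirrors the bug).
    mode = "r+b" if os.path.exists(path) else "wb"
    with open(path, mode) as f:
        f.write(data)


def write_with_truncate(path, data):
    # "wb" truncates the file to zero length before writing (the fix).
    with open(path, "wb") as f:
        f.write(data)


def read_all(path):
    with open(path, "rb") as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        patched = os.path.join(tmp, "Packages-patched")  # hypothetical name
        # Leftover result from an earlier, doomed transaction.
        write_without_truncate(patched, b"old content, quite long\n")
        # Second run writes a shorter result on top of the leftover.
        write_without_truncate(patched, b"new\n")
        stale = read_all(patched)
        # With truncation only the new content survives.
        write_with_truncate(patched, b"new\n")
        clean = read_all(patched)
    return stale, clean
```

Calling `demo()` returns the file content for both variants: without truncation the tail of the old content is still present after the shorter write, with truncation it is gone.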
This has no real security implications as no untrusted data is involved: The old content consists of a base file which passed verification and a bunch of patches which all passed multiple verifications as well, so the old content isn't controllable by an attacker and the new one isn't either (as the new content alone passes verification). So the best an attacker can do is to let the user run into the same issue as in the report.
Closes: #831762

don't ask server if we have entire file in partial/
2016-04-25T13:35:52Z David Kalnischkies david@kalnischkies.de 2016-04-07T15:48:17Z urn:sha1:742f67eaede80d2f9b3631d8697ebd63b8f95427
We have this situation in cases where parts of the transaction are refused (e.g. in a hashsum mismatch) and we rerun the update (e.g. in the hope that we get a mirror which is synced this time). Previously we would ask the server with an If-Range and in the best case receive a 416 in response (less featureful servers might end up giving us the entire file again, or we get the wrong file this time, giving us a hashsum mismatch…), which is a waste of time if we already know by checking the hashsums that we got the complete and correct file.

make random acquire queues work less random
2016-04-25T13:35:52Z David Kalnischkies david@kalnischkies.de 2016-04-06T10:50:26Z urn:sha1:4aa6ebf6d78131416ef173b1ce472f014da25136
Queues feeding workers like rred are created in a random pattern to get a few of them to run in parallel – but if we already have an idling queue we can give it the work instead of assigning the work to a (potentially new) random queue, as that saves us the (arguably small) overhead of starting up a new queue, avoids adding jobs to an already busy queue while others idle and as a bonus reduces the size of debug logs a bit.
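The idle-first selection can be illustrated roughly as follows. This is a minimal Python sketch under assumptions, not apt's implementation: the model of queues as a mapping from a name to its number of pending jobs, the function name `pick_queue` and the queue names are all hypothetical.

```python
import random


def pick_queue(queues, rng=random):
    """Pick a queue name for a new job.

    `queues` maps a queue name to its number of pending jobs
    (hypothetical model, not apt's data structure).
    """
    # Prefer a queue that has nothing to do right now.
    idle = [name for name, pending in queues.items() if pending == 0]
    if idle:
        return idle[0]
    # Otherwise fall back to the previous behavior: a random queue.
    return rng.choice(sorted(queues))
```

For example, `pick_queue({"rred-0": 2, "rred-1": 0, "rred-2": 5})` selects the idle queue `rred-1`, while a mapping with only busy queues yields one of them at random.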
We also keep starting new queues now until we reach our limit before we assign work at random to them, which should give us a more effective utilisation overall compared to potentially adding work to busy queues while we haven't reached our queue limit yet.

stop handling items in doomed transactions
2016-04-07T11:48:31Z David Kalnischkies david@kalnischkies.de 2016-04-05T23:08:57Z urn:sha1:38f8704e419ed93f433129e20df5611df6652620
With the previous commit we track the state of transactions, so we can now use our knowledge to avoid processing data for a transaction which was already closed (via an abort in this case). This is needed as multiple independent processes are interacting in the process, so there isn't a simple immediate full-engine stop, and it would also be bad to teach each and every item how to check if its manager has failed subordinate and what to do in that case.
In the pdiff case, which deals (potentially) with many items during its lifetime, a hashsum mismatch in another file can, for example, abort the transaction to which the file we try to patch via pdiff belongs. This causes some of the items (which are already done) to be aborted with it, but items still in the process of acquisition continue processing and will later try to use all the items together, failing in strange ways as cleanup has already happened.
The chosen solution is to dry up the communication channels instead by ignoring new requests for data acquisition, canceling requests which are not assigned to a queue and not calling Done/Failed on items anymore. This means that e.g. already started or pending (e.g. pipelined) downloads aren't stopped and continue as normal for now, but they remain in partial/ and aren't processed further, so the next update command will pick them up and put them to good use while the current process fails updating (for this transaction group) in an orderly fashion.
Closes: 817240
Thanks: Barr Detwix & Vincent Lefevre for log files
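The "dry up the communication channels" approach from the last commit message can be sketched in a few lines. This is an illustrative Python model only, assuming a simplified design; `Transaction`, `Item`, `Dispatcher` and their methods are hypothetical names and do not correspond to apt's C++ classes.

```python
from enum import Enum


class State(Enum):
    OPEN = "open"
    ABORTED = "aborted"


class Transaction:
    def __init__(self):
        self.state = State.OPEN

    def abort(self):
        self.state = State.ABORTED


class Item:
    def __init__(self, name, transaction):
        self.name = name
        self.transaction = transaction
        self.events = []  # records which callbacks were invoked

    def done(self):
        self.events.append("done")

    def failed(self):
        self.events.append("failed")


class Dispatcher:
    def __init__(self):
        self.unassigned = []  # requested, but not yet handed to a queue

    def enqueue(self, item):
        # Ignore new acquisition requests for doomed transactions.
        if item.transaction.state is not State.OPEN:
            return False
        self.unassigned.append(item)
        return True

    def cancel_doomed(self):
        # Cancel requests which are not assigned to a queue yet.
        self.unassigned = [
            i for i in self.unassigned if i.transaction.state is State.OPEN
        ]

    def report(self, item, success):
        # A running download may still finish, but for a doomed
        # transaction neither done() nor failed() is called anymore.
        if item.transaction.state is not State.OPEN:
            return False
        (item.done if success else item.failed)()
        return True
```

In this model an item whose transaction was aborted is simply never notified, which matches the description of downloads finishing and staying in partial/ without further processing.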