apt/test/integration/test-pdiff-usage, branch 1.4.3

Debians commandline package manager
https://git.kalnischkies.de/apt/atom?h=1.4.3
2017-01-19T14:59:38Z

fix various typos reported by spellintian
2017-01-19T14:59:38Z David Kalnischkies david@kalnischkies.de 2017-01-19T14:14:19Z urn:sha1:93cff633a830e222693fc0f3d78e6e534d1126ee

Most of them are in (old) code comments. For the two instances of user-visible string changes, the po files of the manpages are fixed up as well.

Gbp-Dch: Ignore
Reported-By: spellintian

support compression and by-hash for .diff/Index files
2016-08-17T05:55:46Z David Kalnischkies david@kalnischkies.de 2016-08-16T05:47:44Z urn:sha1:77e274f5ad23d79294f28ecc9868fc6f534214a4

In af81ab9030229b4ce6cbe28f0f0831d4896fda01 by-hash got implemented as a special compression type for our usual index files like Packages. Missing in this scheme was the special .diff/Index index file containing the info about individual patches for this index file. By deriving from the index file class directly, we inherit the compression handling infrastructure and, in this way, also by-hash nearly for free.

Closes: #824926

implement generic config fallback for methods
2016-08-10T21:19:44Z David Kalnischkies david@kalnischkies.de 2016-07-31T16:05:56Z urn:sha1:30060442025824c491f58887ca7369f3c572fa57

For a long while now, the https method has implemented a hardcoded fallback to the same options in http. While this works, it is rather inflexible if we want to allow methods to use another name to change their behavior slightly, as apt-transport-tor does to https: most of its diff is s#https#tor#g, which then fails to do the full-circle fallthrough tor -> https -> http for https sources. With this config infrastructure, this could be implemented now.
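The tor -> https -> http fallthrough described in the commit above could look roughly like the following. This is only a sketch in Python, not apt's C++ implementation: the `FALLBACKS` table, the `find_option` helper and the dictionary-based configuration are assumptions made for illustration.

```python
# Hypothetical fallback table: each method names the method whose
# options it falls back to. Not taken from apt's source.
FALLBACKS = {"tor": "https", "https": "http"}


def find_option(config, method, option, default=None):
    """Return the first value found for `option` along the fallback chain."""
    seen = set()
    while method is not None and method not in seen:
        seen.add(method)  # guard against accidental cycles in FALLBACKS
        key = f"Acquire::{method}::{option}"
        if key in config:
            return config[key]
        method = FALLBACKS.get(method)
    return default


# Illustrative configuration; only http and https set anything.
config = {
    "Acquire::http::Timeout": "30",
    "Acquire::https::Verify-Peer": "true",
}

tor_timeout = find_option(config, "tor", "Timeout")
tor_verify = find_option(config, "tor", "Verify-Peer")
http_verify = find_option(config, "http", "Verify-Peer")
```

With a generic chain like this, a renamed method only needs one entry in the table instead of its own hardcoded copy of the fallback logic.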

rred: truncate result file before writing to it
2016-07-27T13:52:22Z David Kalnischkies david@kalnischkies.de 2016-07-27T13:52:22Z urn:sha1:0e071dfe205ad21d8b929b4bb8164b008dc7c474

If another file in the transaction fails and hence dooms the transaction, we can end up in a situation in which a -patched file (the file rred writes the result of the patching to) remains in the partial/ directory. The next apt call will perform the rred patching again and write its result to the -patched file again, but instead of starting with an empty file as intended, it will overwrite the content previously in the file. This has the same result if the new content happens to be as long as or longer than the old content, but if it isn't, parts of the old content remain in the file. The file will still pass verification, as the new content written to it matches the hashes, and if the entire transaction passes, the file will be moved to the lists/ directory, where it might or might not trigger errors depending on whether the old content which remained forms a valid file together with the new content.

This has no real security implications as no untrusted data is involved: The old content consists of a base file which passed verification and a bunch of patches which all passed multiple verifications as well, so the old content isn't controllable by an attacker and the new one isn't either (as the new content alone passes verification). So the best an attacker can do is to let the user run into the same issue as in the report.

Closes: #831762

don't ask server if we have entire file in partial/
2016-04-25T13:35:52Z David Kalnischkies david@kalnischkies.de 2016-04-07T15:48:17Z urn:sha1:742f67eaede80d2f9b3631d8697ebd63b8f95427

We have this situation in cases where parts of the transaction are refused (e.g. in a hashsum mismatch) and we rerun the update (e.g. in the hope that we get a mirror which is synced this time).
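As an aside on the rred commit above, the effect of writing to a leftover file without truncating it first can be reproduced with a short Python sketch. The file name `Packages-patched` and the `write_result` helper are made up for illustration; rred itself is C++ and does not use this code.

```python
import os
import tempfile


def write_result(path, data, truncate):
    """Write data at the start of path, optionally emptying the file first."""
    # "wb" truncates an existing file, "r+b" keeps the old content in place.
    mode = "wb" if truncate or not os.path.exists(path) else "r+b"
    with open(path, mode) as f:
        f.write(data)


def read_all(path):
    with open(path, "rb") as f:
        return f.read()


OLD = b"old and rather long content\n"
NEW = b"new content\n"

with tempfile.TemporaryDirectory() as tmp:
    patched = os.path.join(tmp, "Packages-patched")  # illustrative name

    # Leftover file from a doomed transaction, then a second run
    # that does not truncate before writing the shorter result.
    write_result(patched, OLD, truncate=True)
    write_result(patched, NEW, truncate=False)
    stale = read_all(patched)

    # Same sequence, but truncating first as the fix does.
    write_result(patched, OLD, truncate=True)
    write_result(patched, NEW, truncate=True)
    clean = read_all(patched)
```

Here `stale` begins with the new content but still carries the tail of the old content, while `clean` holds the new content only.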
Previously we would ask the server with an if-range and in the best case receive a 416 in response (a less featureful server might end up giving us the entire file again, or we get the wrong file this time, giving us a hashsum mismatch…), which is a waste of time if we already know by checking the hashsums that we got the complete and correct file.

make random acquire queues work less random
2016-04-25T13:35:52Z David Kalnischkies david@kalnischkies.de 2016-04-06T10:50:26Z urn:sha1:4aa6ebf6d78131416ef173b1ce472f014da25136

Queues feeding workers like rred are created in a random pattern to get a few of them to run in parallel, but if we already have an idling queue, we don't need to assign the work to a (potentially new) random queue. Using the idle queue saves us the (arguably small) overhead of starting up a new queue, avoids adding jobs to an already busy queue while others idle and, as a bonus, reduces the size of debug logs a bit. We also keep starting new queues now until we reach our limit before we assign work at random to them, which should give us a more effective utilisation overall compared to potentially adding work to busy queues while we haven't reached our queue limit yet.

stop handling items in doomed transactions
2016-04-07T11:48:31Z David Kalnischkies david@kalnischkies.de 2016-04-05T23:08:57Z urn:sha1:38f8704e419ed93f433129e20df5611df6652620

With the previous commit we track the state of transactions, so we can now use our knowledge to avoid processing data for a transaction which was already closed (via an abort in this case). This is needed as multiple independent processes are interacting in the process, so there isn't a simple immediate full-engine stop and it would also be bad to teach each and every item how to check if its manager has failed subordinate and what to do in that case. In the pdiff case, which deals (potentially) with many items during its lifetime, e.g. a hashsum mismatch in another file can abort the transaction the file we try to patch via pdiff belongs to. This causes some of the items (which are already done) to be aborted with it, but items still in the process of acquisition continue in the processing and will later try to use all the items together, failing in strange ways as cleanup already happened.

The chosen solution is to dry up the communication channels instead by ignoring new requests for data acquisition, canceling requests which are not assigned to a queue and not calling Done/Failed on items anymore. This means that e.g. already started or pending (e.g. pipelined) downloads aren't stopped and continue as normal for now, but they remain in partial/ and aren't processed further, so the next update command will pick them up and put them to good use while the current process fails updating (for this transaction group) in an orderly fashion.

Closes: 817240
Thanks: Barr Detwix & Vincent Lefevre for log files

don't use Desc.URI to calculate .diff/Index filenames
2016-03-14T10:47:19Z David Kalnischkies david@kalnischkies.de 2016-03-13T00:02:30Z urn:sha1:b7a1076f18022cbeb7baf4d82ab8bae0f725a573

The URI describing an item can change via mirrors/redirectors, which causes the .diff/Index files to get the wrong names in storage.

Git-Dch: Ignore

require $(HASH)-Download field in .diff/Index files
2016-03-14T10:47:19Z David Kalnischkies david@kalnischkies.de 2016-03-14T00:09:32Z urn:sha1:4a808deaac462e7714a345dac676c6da294a2ee0

Now that we ignore SHA1-only files, it makes sense to also require the provision of hashes for the compressed patches, as this was introduced in dak in the same patchset as support for non-SHA1 hashes in the file itself, and adding support in other archive creators (if they support pdiffs at all) will likely be in the same batch.
The reason for the change itself is simple: If you are 'scared' enough about the security of SHA1, you shouldn't uncompress a file you haven't verified at all. After all, it could be exploiting a bug or be a zip bomb.

test: remove SHA1 support testing as unsupported
2016-03-14T10:47:18Z David Kalnischkies david@kalnischkies.de 2016-03-13T20:49:37Z urn:sha1:8d0d92558c00d1825e413ce67be51a46a5c18aea

Given that we refuse to use SHA1-only .diff/Indexes, there is no point in shipping and running code which pretends to check support for it, which, given that all these tests are run 3 times, eats a noticeable amount of time.

Git-Dch: Ignore
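
The requirement from the "require $(HASH)-Download field" commit above can be pictured with a simplified check. This Python sketch is not apt's parser: the helper names are hypothetical, the sample index uses placeholder values instead of real hashes, and the exact acceptance rules in apt may differ.

```python
import re

# Matches field names such as "SHA256-Patches:" at the start of a line.
FIELD = re.compile(r"^([A-Za-z0-9]+)-(Current|History|Patches|Download):")


def fields_by_hash(index_text):
    """Map each hash name to the set of field kinds present for it."""
    found = {}
    for line in index_text.splitlines():
        match = FIELD.match(line)
        if match:
            found.setdefault(match.group(1), set()).add(match.group(2))
    return found


def is_acceptable(index_text):
    """Accept only if a non-SHA1 hash covers patches and compressed downloads."""
    for hash_name, fields in fields_by_hash(index_text).items():
        if hash_name == "SHA1":
            continue  # SHA1-only information is ignored
        if {"Patches", "Download"} <= fields:
            return True
    return False


# Placeholder values ("aaaa", sizes, patch name) stand in for real entries.
with_download = """SHA256-Current: aaaa 100
SHA256-History:
 bbbb 90 2016-03-13-0000.00
SHA256-Patches:
 cccc 20 2016-03-13-0000.00
SHA256-Download:
 dddd 15 2016-03-13-0000.00.gz
"""

without_download = with_download.split("SHA256-Download:")[0]
sha1_only = with_download.replace("SHA256", "SHA1")
```

In this sketch `is_acceptable` returns True only for the first sample, since the other two either lack hashes for the compressed patches or provide SHA1 alone.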