<feed xmlns='http://www.w3.org/2005/Atom'>
<title>apt/test/integration/test-pdiff-usage, branch 1.5_rc4</title>
<subtitle>Debians commandline package manager</subtitle>
<id>https://git.kalnischkies.de/apt/atom?h=1.5_rc4</id>
<link rel='self' href='https://git.kalnischkies.de/apt/atom?h=1.5_rc4'/>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/'/>
<updated>2017-07-26T17:07:56Z</updated>
<entry>
<title>fail early in http if server answer is too small as well</title>
<updated>2017-07-26T17:07:56Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2017-07-26T16:35:42Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=f2f8e89f08cdf01c83a0b8ab053c65329d85ca90'/>
<id>urn:sha1:f2f8e89f08cdf01c83a0b8ab053c65329d85ca90</id>
<content type='text'>
Failing on too much data is good, but we can do better by checking for
exact filesizes as we know with hashsums how large a file should be, so
if we get a file which has a size we do not expect we can drop it
directly, regardless of if the file is larger or smaller than what we
expect which should catch most cases which would end up as hashsum
errors later now a lot sooner.
</content>
</entry>
<entry>
<title>don't move failed pdiff indexes out of partial</title>
<updated>2017-07-26T17:07:55Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2017-07-23T23:15:55Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=8df85a4fb91bed6c79a3cb9c2000881cc5b42ea7'/>
<id>urn:sha1:8df85a4fb91bed6c79a3cb9c2000881cc5b42ea7</id>
<content type='text'>
The comment says this is intended, but looking at the history reveals
that the comment comes from a different era. Nowadays we don't really
need it anymore (and even back then it was disputeable) as we haven't
used that file for our update in the end and nothing really needs this
file after the update.

Triggered is this by 188f297a2af4c15cb1d502360d1e478644b5b810 which
moves various error conditions forward including this code expecting the
file to exist – but it doesn't need to as download could have failed.
We could fix that by simple checking if the file exists and only stage
it if it does, but instead we don't stage it and instead even rename it
out of the way with our conventional FAILED name (if it exists).

That restores support for partial mirrors (= in this case mirrors which
don't ship pdiff files). Note that apt heals itself even if only such a
mirror is used as the update is successful even if that error is shown.

Closes: 869425
</content>
</entry>
<entry>
<title>show .diff/Index properly as ignored if we fallback</title>
<updated>2017-06-26T21:31:15Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2017-05-28T20:26:17Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=188f297a2af4c15cb1d502360d1e478644b5b810'/>
<id>urn:sha1:188f297a2af4c15cb1d502360d1e478644b5b810</id>
<content type='text'>
Moving the code responsible for parsing the Index file from ::Done into
the slightly earlier ::VerifyDone allows us to still "fail" the download
if we can't make use of the Index for whatever reason, so that the
progress log correctly displays "Ign" instead of "Get" for the file.

This also makes quiet a few debug messages proper error messages (but
those are still hidden by default for Ign lines).
</content>
</entry>
<entry>
<title>fix various typos reported by spellintian</title>
<updated>2017-01-19T14:59:38Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2017-01-19T14:14:19Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=93cff633a830e222693fc0f3d78e6e534d1126ee'/>
<id>urn:sha1:93cff633a830e222693fc0f3d78e6e534d1126ee</id>
<content type='text'>
Most of them in (old) code comments. The two instances of user visible
string changes the po files of the manpages are fixed up as well.

Gbp-Dch: Ignore
Reported-By: spellintian
</content>
</entry>
<entry>
<title>support compression and by-hash for .diff/Index files</title>
<updated>2016-08-17T05:55:46Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-08-16T05:47:44Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=77e274f5ad23d79294f28ecc9868fc6f534214a4'/>
<id>urn:sha1:77e274f5ad23d79294f28ecc9868fc6f534214a4</id>
<content type='text'>
In af81ab9030229b4ce6cbe28f0f0831d4896fda01 by-hash got implemented as a
special compression type for our usual index files like Packages.
Missing in this scheme was the special .diff/Index index file containing
the info about individual patches for this index file. Deriving from the
index file class directly we inherent the compression handling
infrastructure and in this way also by-hash nearly for free.

Closes: #824926
</content>
</entry>
<entry>
<title>implement generic config fallback for methods</title>
<updated>2016-08-10T21:19:44Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-07-31T16:05:56Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=30060442025824c491f58887ca7369f3c572fa57'/>
<id>urn:sha1:30060442025824c491f58887ca7369f3c572fa57</id>
<content type='text'>
The https method implemented for a long while now a hardcoded fallback
to the same options in http, which, while it works, is rather inflexible
if we want to allow the methods to use another name to change their
behavior slightly, like apt-transport-tor does to https – most of the
diff being s#https#tor#g which then fails to do the full circle
fallthrough tor -&gt; https -&gt; http for https sources. With this config
infrastructure this could be implemented now.
</content>
</entry>
<entry>
<title>rred: truncate result file before writing to it</title>
<updated>2016-07-27T13:52:22Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-07-27T13:52:22Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=0e071dfe205ad21d8b929b4bb8164b008dc7c474'/>
<id>urn:sha1:0e071dfe205ad21d8b929b4bb8164b008dc7c474</id>
<content type='text'>
If another file in the transaction fails and hence dooms the transaction
we can end in a situation in which a -patched file (= rred writes the
result of the patching to it) remains in the partial/ directory.

The next apt call will perform the rred patching again and write its
result again to the -patched file, but instead of starting with an empty
file as intended it will override the content previously in the file
which has the same result if the new content happens to be longer than
the old content, but if it isn't parts of the old content remain in the
file which will pass verification as the new content written to it
matches the hashes and if the entire transaction passes the file will be
moved the lists/ directory where it might or might not trigger errors
depending on if the old content which remained forms a valid file
together with the new content.

This has no real security implications as no untrusted data is involved:
The old content consists of a base file which passed verification and a
bunch of patches which all passed multiple verifications as well, so the
old content isn't controllable by an attacker and the new one isn't
either (as the new content alone passes verification). So the best an
attacker can do is letting the user run into the same issue as in the
report.

Closes: #831762
</content>
</entry>
<entry>
<title>don't ask server if we have entire file in partial/</title>
<updated>2016-04-25T13:35:52Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-04-07T15:48:17Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=742f67eaede80d2f9b3631d8697ebd63b8f95427'/>
<id>urn:sha1:742f67eaede80d2f9b3631d8697ebd63b8f95427</id>
<content type='text'>
We have this situation in cases were parts of the transaction are
refused (e.g. in a hashsum mismatch) and rerun the update (e.g. in the
hope that we get a mirror which is synced this time).

Previously we would ask the server with an if-range and in the best case
recieve a 416 in response (less featureful server might end up giving us
the entire file again or we get the wrong file this time giving us a
hashsum mismatch…), which is a waste of time if we know already by
checking the hashsums that we got the complete and correct file.
</content>
</entry>
<entry>
<title>make random acquire queues work less random</title>
<updated>2016-04-25T13:35:52Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-04-06T10:50:26Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=4aa6ebf6d78131416ef173b1ce472f014da25136'/>
<id>urn:sha1:4aa6ebf6d78131416ef173b1ce472f014da25136</id>
<content type='text'>
Queues feeding workers like rred are created in a random pattern to get
a few of them to run in parallel – but if we already have an idling queue
we don't need to assign it to a (potentially new) random queue as that
saves us the (agruably small) overhead of starting up a new queue,
avoids adding jobs to an already busy queue while others idle and as
a bonus reduces the size of debug logs a bit.

We also keep starting new queues now until we reach our limit before
we assign work at random to them, which should give us a more effective
utilisation overall compared to potentially adding work to busy queues
while we haven't reached our queue limit yet.
</content>
</entry>
<entry>
<title>stop handling items in doomed transactions</title>
<updated>2016-04-07T11:48:31Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-04-05T23:08:57Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=38f8704e419ed93f433129e20df5611df6652620'/>
<id>urn:sha1:38f8704e419ed93f433129e20df5611df6652620</id>
<content type='text'>
With the previous commit we track the state of transactions, so we can
now use our knowledge to avoid processing data for a transaction which
was already closed (via an abort in this case).

This is needed as multiple independent processes are interacting in the
process, so there isn't a simple immediate full-engine stop and it would
also be bad to teach each and every item how to check if its manager
has failed subordinate and what to do in that case.

In the pdiff case, which deals (potentially) with many items during its
lifetime e.g. a hashsum mismatch in another file can abort the
transaction the file we try to patch via pdiff belongs to. This causes
some of the items (which are already done) to be aborted with it, but
items still in the process of acquisition continue in the processing and
will later try to use all the items together failing in strange ways as
cleanup already happened.

The chosen solution is to dry up the communication channels instead by
ignoring new requests for data acquisition, canceling requests which are
not assigned to a queue and not calling Done/Failed on items anymore.
This means that e.g. already started or pending (e.g. pipelined)
downloads aren't stopped and continue as normal for now, but they remain
in partial/ and aren't processed further so the next update command will
pick them up and put them to good use while the current process fails
updating (for this transaction group) in an orderly fashion.

Closes: 817240
Thanks: Barr Detwix &amp; Vincent Lefevre for log files
</content>
</entry>
</feed>
