<feed xmlns='http://www.w3.org/2005/Atom'>
<title>apt/test/integration/test-pdiff-usage, branch 1.3_exp3</title>
<subtitle>Debian's commandline package manager</subtitle>
<id>https://git.kalnischkies.de/apt/atom?h=1.3_exp3</id>
<link rel='self' href='https://git.kalnischkies.de/apt/atom?h=1.3_exp3'/>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/'/>
<updated>2016-04-25T13:35:52Z</updated>
<entry>
<title>don't ask server if we have entire file in partial/</title>
<updated>2016-04-25T13:35:52Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-04-07T15:48:17Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=742f67eaede80d2f9b3631d8697ebd63b8f95427'/>
<id>urn:sha1:742f67eaede80d2f9b3631d8697ebd63b8f95427</id>
<content type='text'>
We have this situation in cases where parts of the transaction are
refused (e.g. in a hashsum mismatch) and rerun the update (e.g. in the
hope that we get a mirror which is synced this time).

Previously we would ask the server with an if-range and in the best case
receive a 416 in response (a less featureful server might end up giving us
the entire file again, or we get the wrong file this time, giving us a
hashsum mismatch…), which is a waste of time if we already know by
checking the hashsums that we got the complete and correct file.
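
A minimal sketch of the idea in Python (APT itself is C++; the function
name, its parameters and the choice of SHA256 are illustrative assumptions,
not APT's API): check the file in partial/ against the expected size and
hash first, and only contact the server if that check fails.

```python
import hashlib
import os


def partial_is_complete(path, expected_size, expected_sha256):
    """Return True if the file at path already is the complete, correct file.

    Hypothetical helper: a cheap size comparison comes first, the hash is
    only computed if the size matches.
    """
    if not os.path.isfile(path) or os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large index files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

A caller would skip the request entirely when this returns True and fall
back to the usual if-range request otherwise.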
</content>
</entry>
<entry>
<title>make random acquire queues work less random</title>
<updated>2016-04-25T13:35:52Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-04-06T10:50:26Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=4aa6ebf6d78131416ef173b1ce472f014da25136'/>
<id>urn:sha1:4aa6ebf6d78131416ef173b1ce472f014da25136</id>
<content type='text'>
Queues feeding workers like rred are created in a random pattern to get
a few of them to run in parallel – but if we already have an idling queue
we don't need to assign the work to a (potentially new) random queue:
reusing the idle one saves us the (arguably small) overhead of starting up
a new queue, avoids adding jobs to an already busy queue while others idle
and as a bonus reduces the size of debug logs a bit.

We also keep starting new queues now until we reach our limit before
we assign work at random to them, which should give us a more effective
utilisation overall compared to potentially adding work to busy queues
while we haven't reached our queue limit yet.
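
The selection order described above can be sketched in Python as follows
(a simplified model, not APT's code: the dictionary of pending-job counts,
the queue names and pick_queue itself are assumptions made for
illustration, and the limit is assumed to be at least 1).

```python
import random


def pick_queue(queues, limit, rng=random):
    """Return the name of the queue a new job should be assigned to.

    queues maps a (hypothetical) queue name to its number of pending jobs.
    """
    # 1. Prefer a queue which is currently idle.
    for name, pending in queues.items():
        if pending == 0:
            return name
    # 2. Otherwise start a new queue as long as the limit is not reached.
    if limit > len(queues):
        name = "queue-%d" % len(queues)
        queues[name] = 0
        return name
    # 3. Only now fall back to picking an existing queue at random.
    return rng.choice(sorted(queues))
```

The caller is expected to bump the pending count of the returned queue
after enqueueing the job.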
</content>
</entry>
<entry>
<title>stop handling items in doomed transactions</title>
<updated>2016-04-07T11:48:31Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-04-05T23:08:57Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=38f8704e419ed93f433129e20df5611df6652620'/>
<id>urn:sha1:38f8704e419ed93f433129e20df5611df6652620</id>
<content type='text'>
With the previous commit we track the state of transactions, so we can
now use our knowledge to avoid processing data for a transaction which
was already closed (via an abort in this case).

This is needed as multiple independent processes are interacting in the
process, so there isn't a simple immediate full-engine stop and it would
also be bad to teach each and every item how to check if its manager
has a failed subordinate and what to do in that case.

In the pdiff case, which deals (potentially) with many items during its
lifetime, e.g. a hashsum mismatch in another file can abort the
transaction the file we try to patch via pdiff belongs to. This causes
some of the items (which are already done) to be aborted with it, but
items still in the process of acquisition continue in the processing and
will later try to use all the items together failing in strange ways as
cleanup already happened.

The chosen solution is to dry up the communication channels instead by
ignoring new requests for data acquisition, canceling requests which are
not assigned to a queue and not calling Done/Failed on items anymore.
This means that e.g. already started or pending (e.g. pipelined)
downloads aren't stopped and continue as normal for now, but they remain
in partial/ and aren't processed further so the next update command will
pick them up and put them to good use while the current process fails
updating (for this transaction group) in an orderly fashion.

Closes: 817240
Thanks: Barr Detwix &amp; Vincent Lefevre for log files
</content>
</entry>
<entry>
<title>don't use Desc.URI to calculate .diff/Index filenames</title>
<updated>2016-03-14T10:47:19Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-03-13T00:02:30Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=b7a1076f18022cbeb7baf4d82ab8bae0f725a573'/>
<id>urn:sha1:b7a1076f18022cbeb7baf4d82ab8bae0f725a573</id>
<content type='text'>
The URI describing an item can change via mirrors/redirectors which
causes the .diff/Index files to get the wrong names in storage.

Git-Dch: Ignore
</content>
</entry>
<entry>
<title>require $(HASH)-Download field in .diff/Index files</title>
<updated>2016-03-14T10:47:19Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-03-14T00:09:32Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=4a808deaac462e7714a345dac676c6da294a2ee0'/>
<id>urn:sha1:4a808deaac462e7714a345dac676c6da294a2ee0</id>
<content type='text'>
Now that we ignore SHA1-only files it makes sense to require also the
provision of hashes for the compressed patches as this was introduced in
the same patchset as support for non-SHA1 hashes in the file itself in
dak and adding support in other archive creators (if they support pdiffs
at all) will likely be in the same batch.

The reason for the change itself is simple: If you are 'scared' enough
about the security of SHA1, you shouldn't uncompress a file you haven't
verified at all – after all, it could be exploiting a bug or a zip bomb.
</content>
</entry>
<entry>
<title>test: remove SHA1 support testing as unsupported</title>
<updated>2016-03-14T10:47:18Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-03-13T20:49:37Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=8d0d92558c00d1825e413ce67be51a46a5c18aea'/>
<id>urn:sha1:8d0d92558c00d1825e413ce67be51a46a5c18aea</id>
<content type='text'>
Given that we refuse to use SHA1-only .diff/Indexes, there is no point in
shipping and running code which pretends to check support for it, which,
given that all these tests are run 3 times, eats a noticeable amount of time.

Git-Dch: Ignore
</content>
</entry>
<entry>
<title>Test that SHA1-only .diff/Index files are not used</title>
<updated>2016-03-13T12:05:30Z</updated>
<author>
<name>Julian Andres Klode</name>
<email>jak@debian.org</email>
</author>
<published>2016-03-13T12:05:30Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=f345d0571d055c2cd5da3a9e423753f1ac21a9aa'/>
<id>urn:sha1:f345d0571d055c2cd5da3a9e423753f1ac21a9aa</id>
<content type='text'>
Ensure that .diff/Index files that only contain SHA1 values and no
SHA2 values are not used.
</content>
</entry>
<entry>
<title>do not move not-failed pdiff-patches into CWD on failure</title>
<updated>2016-03-06T11:57:38Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-03-06T11:03:34Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=dfcf7f356b790338f0a3e9df3c5d6f159814fe53'/>
<id>urn:sha1:dfcf7f356b790338f0a3e9df3c5d6f159814fe53</id>
<content type='text'>
If a single pdiff fails, we have to fail the entire patching endeavour
and fall back to getting the complete file instead. That is easy in
serverside merged pdiffs as we get them one by one. For clientside we
get them all at once though, which means that a failure in one has to
stop the entire pipeline, which works as expected (as proven by the
bug reporters as they don't even notice it happening). The problem is
just that the first failing pdiff will do the cleanup, so another pdiff
which happens to be successfully acquired after we processed the failure
doesn't find the file it is supposed to use as a basename anymore, so
the patch is renamed to what should be the unique extension and moved
into the current working directory. Processing is then stopped as the
patch realizes that it isn't the last one which completed downloading.

On the plus side this means this is neither us using a bad temporary
location nor a security problem. It "just" unconditionally overwrites
files in your current working directory (if you happen to have them
named like a pdiff patch – a bit unlikely perhaps) and so drops files
there which are never used again.

I guess this was introduced in 4e3c5633b1e74b4f58b95f339cfbbf4cbf21ab3e
for real as I made the need for the existence of the base file rather
explicit, but the potential lingers in the code for far longer.

Closes: #816837
</content>
</entry>
<entry>
<title>remove uncompressed leftover partial file before pdiff bootstrap</title>
<updated>2016-01-08T16:51:23Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-01-08T16:51:23Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=ef3c549e00b2a0487ddee0aeb70e3a29f76c2fbb'/>
<id>urn:sha1:ef3c549e00b2a0487ddee0aeb70e3a29f76c2fbb</id>
<content type='text'>
The code already deals with compressed leftovers, but forgot the
uncompressed files. The opportunity is taken to reorder this code and
add debug messages about the actions taken as well as produce such a
leftover file in the associated testcase.
</content>
</entry>
<entry>
<title>use filesize of compressed pdiffs for the limit if possible</title>
<updated>2016-01-08T14:40:01Z</updated>
<author>
<name>David Kalnischkies</name>
<email>david@kalnischkies.de</email>
</author>
<published>2016-01-08T14:30:05Z</published>
<link rel='alternate' type='text/html' href='https://git.kalnischkies.de/apt/commit/?id=4e6219da0dd1e68fad7db972f7ddd76598645228'/>
<id>urn:sha1:4e6219da0dd1e68fad7db972f7ddd76598645228</id>
<content type='text'>
With the addition of the $HASH-Download field in the .diff/Index we got
the size of the compressed patches for 'free', so if that information is
available we can use it for a more fitting calculation of the size
requirements of the patches vs. the complete file.

Note that this predicts too small a size in the transition case in which
the information isn't available for all patches, but figuring this out
would be a lot of code for practically nothing as only one update can
ever be in such a transition phase.
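
As a rough illustration of the size comparison in Python (not APT's
implementation: the dictionary keys, limit_percent and the per-patch
fallback to the uncompressed size are assumptions of this sketch, and the
fallback in particular is not necessarily what APT does in the transition
case):

```python
def patches_within_limit(patches, full_file_size, limit_percent):
    """Decide whether fetching the patches is cheaper than the complete file.

    patches is a list of dicts with an uncompressed 'size' and, if the
    index provided it, a compressed 'download_size' (hypothetical keys).
    """
    # Prefer the compressed size where it is known.
    total = sum(p.get("download_size", p["size"]) for p in patches)
    # Integer arithmetic avoids rounding issues with percentages.
    return full_file_size * limit_percent >= total * 100
```

With compressed sizes available the patch total shrinks, so patching
stays below the limit in cases where the uncompressed sizes would have
triggered a download of the complete file.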
</content>
</entry>
</feed>
