summaryrefslogtreecommitdiff
path: root/methods/http.cc
Commit message (Collapse)AuthorAgeFilesLines
* rename ServerMethod to BaseHttpMethodDavid Kalnischkies2016-12-311-4/+4
| | | | | | | This 'method' is the abstract base for http and https and should as such be called out like this rather using an easily confused name. Gbp-Dch: Ignore
* separating state variables regarding server/requestDavid Kalnischkies2016-12-311-53/+46
| | | | | | | | | | | | | | Having a Reset(bool) method to partially reset certain variables like the download size always were strange, so this commit splits the ServerState into an additional RequestState living on the stack for as long as we deal with this request causing an automatic "reset". There is much to do still to make this code look better, but this is a good first step which compiles cleanly and passes all tests, so keeping it as history might be beneficial and due to avoiding explicit memory allocations it ends up fixing a small memory leak in https, too. Closes: #440057
* http: skip connection cleanup if we close it anyhowDavid Kalnischkies2016-11-111-0/+3
| | | | Suggested in #529794
* improve SOCKS error messages for http slightlyDavid Kalnischkies2016-11-101-14/+46
| | | | | | | | | The 0.0.0.0:0 tor reports is pretty useless by itself, but even if an IP would be reported it is better to show the user the hostname we wanted the proxy to connect to in the same error message. We improve upon it further by looking for this bind address in particular and remap error messages slightly to give users a better chance of figuring out what went wrong. Upstream Tor can't do that as it is technically wrong.
* don't sent Range requests if we know its not acceptedDavid Kalnischkies2016-08-161-1/+1
| | | | | | If the server told us in a previous request that it isn't supporting Ranges with bytes via an Accept-Ranges header missing bytes, we don't try to formulate requests using Ranges.
* reorganize server-states resetting in http/httpsDavid Kalnischkies2016-08-161-0/+7
| | | | | | | | | | | | | We keep various information bits about the server around, some only effecting the currently handled file (like sizes) while others should be persistent (like pipeline detections). http used to reset all file-related manually, which is a bit silly if we already have a Reset() method – which does reset all through –, so extending it with a parameter for reuse and calling it from https too (as this was previously resetting by just creating a new state struct – it uses no value of the persistent state-keeping yet as it supports no pipelining). Gbp-Dch: Ignore
* http: auto-configure for local Tor proxy if called as 'tor'David Kalnischkies2016-08-111-0/+3
| | | | | | | | | | | | | With apts http transport supporting socks5h proxies and all the work in terms of configuration of methods based on the name it is called with it becomes surprisingly easy to implement Tor support equally (and perhaps even a bit exceeding) what is available currently in apt-transport-tor. How this will turn out to be handled packaging wise we will see in https://lists.debian.org/deity/2016/08/msg00012.html , but until this is resolved we can add the needed support without actively enabling it for now, so that this can be tested better.
* implement socks5h proxy support for http methodDavid Kalnischkies2016-08-101-22/+169
| | | | | | | | | | | | | | | | | Socks support is a requested feature in sofar that the internet is actually believing Acquire::socks::Proxy would exist. It doesn't and this commit isn't adding it as that isn't how our configuration works, but it allows Acquire::http::Proxy="socks5h://…". The HTTPS method was changed already to support socks proxies (all versions) via curl. This commit implements only SOCKS5 (RFC1928) with no auth or pass&user auth (RFC1929), but not GSSAPI which is required by the RFC. The 'h' in the protocol name further indicates that DNS resolution is delegated to the socks proxy rather than performed locally. The implementation works and was tested with Tor as socks proxy for which implementing socks5h only can actually be considered a feature. Closes: 744934
* implement generic config fallback for methodsDavid Kalnischkies2016-08-101-29/+30
| | | | | | | | | | The https method implemented for a long while now a hardcoded fallback to the same options in http, which, while it works, is rather inflexible if we want to allow the methods to use another name to change their behavior slightly, like apt-transport-tor does to https – most of the diff being s#https#tor#g which then fails to do the full circle fallthrough tor -> https -> http for https sources. With this config infrastructure this could be implemented now.
* use the same redirection handling for http and httpsDavid Kalnischkies2016-08-101-0/+29
| | | | | | | | cURL which backs our https implementation can handle redirects on its own, but by dealing with them on our own we gain finer control over which redirections will be performed (we don't like https → http) and by whom so that redirections to other hosts correctly spawn a new https method dealing with these instead of letting the current one deal with it.
* detect redirection loops in acquire instead of workersDavid Kalnischkies2016-08-101-0/+6
| | | | | | | Having the detection handled in specific (http) workers means that a redirection loop over different hostnames isn't detected. Its also not a good idea have this implement in each method independently even if it would work
* fail on unsupported http/https proxy settingsDavid Kalnischkies2016-08-101-0/+2
| | | | Closes: #623443
* prevent C++ locale number formatting in text APIs (try 2)David Kalnischkies2016-07-301-1/+1
| | | | | | | Followup of b58e2c7c56b1416a343e81f9f80cb1f02c128e25. Still a regression of sorts of 8b79c94af7f7cf2e5e5342294bc6e5a908cacabf. Closes: 832044
* avoid 416 response teardown binding to null pointerDavid Kalnischkies2016-07-051-7/+9
| | | | | | | | | | methods/http.cc:640:13: runtime error: reference binding to null pointer of type 'struct FileFd' This reference is never used in the cases it has a nullptr, so the practical difference is non-existent, but its a bug still. Reported-By: gcc -fsanitize=undefined
* use +0000 instead of UTC by default as timezone in outputDavid Kalnischkies2016-07-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | All apt versions support numeric as well as 3-character timezones just fine and its actually hard to write code which doesn't "accidently" accepts it. So why change? Documenting the Date/Valid-Until fields in the Release file is easy to do in terms of referencing the datetime format used e.g. in the Debian changelogs (policy §4.4). This format specifies only the numeric timezones through, not the nowadays obsolete 3-character ones, so in the interest of least surprise we should use the same format even through it carries a small risk of regression in other clients (which encounter repositories created with apt-ftparchive). In case it is really regressing in practice, the hidden option -o APT::FTPArchive::Release::NumericTimezone=0 can be used to go back to good old UTC as timezone. The EDSP and EIPP protocols use this 'new' format, the text interface used to communicate with the acquire methods does not for compatibility reasons even if none of our methods would be effected and I doubt any other would (in these instances the timezone is 'GMT' as that is what HTTP/1.1 requires). Note that this is only true for apt talking to methods, (libapt-based) methods talking to apt will respond with the 'new' format. It is therefore strongly adviced to support both also in method input.
* http: don't hang on redirect with length + connection closeDavid Kalnischkies2016-06-151-4/+4
| | | | | | | | Most servers who close the connection do not send a content-length as this is redundant information usually, but some might and while testing with our server and with 'aptwebserver::response-header::Connection' set to 'close' I noticed that http hangs after a redirect in such cases, so if we have the information, just use it instead of discarding it.
* use std::locale::global instead of setlocaleDavid Kalnischkies2016-05-281-1/+0
| | | | | | We use a wild mixture of C and C++ ways of generating output, so having a consistent world-view in both styles sounds like a good idea and should help in preventing regressions.
* prevent C++ locale number formatting in text APIsDavid Kalnischkies2016-05-271-2/+2
| | | | | | | | | | | Setting the C++ locale via std::locale::global(std::locale("")); which would otherwise default to the default C locale (aka: unaffected by setlocale) effects the formatting of numeric types in IO streams, which for output for humans is perfectly sensible, but breaks our many text interfaces used and parsed by us and others without expecting the numbers to be formatted. Closes: #825396
* fix two memory leaks reported by gccDavid Kalnischkies2015-09-141-2/+2
| | | | | Reported-By: gcc -fsanitize=address -fno-sanitize=vptr Git-Dch: Ignore
* Merge branch 'debian/sid' into debian/experimentalMichael Vogt2015-05-221-1/+1
|\ | | | | | | | | | | | | | | | | Conflicts: apt-pkg/pkgcache.h debian/changelog methods/https.cc methods/server.cc test/integration/test-apt-download-progress
| * Fix endless loop in apt-get update that can cause disk fillupMichael Vogt2015-05-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The apt http code parses Content-Length and Content-Range. For both requests the variable "Size" is used and the semantic for this Size is the total file size. However Content-Length is not the entire file size for partital file requests. For servers that send the Content-Range header first and then the Content-Length header this can lead to globbing of Size so that its less than the real file size. This may lead to a subsequent passing of a negative number into the CircleBuf which leads to a endless loop that writes data. Thanks to Anton Blanchard for the analysis and initial patch. LP: #1445239
| * dispose http(s) 416 error page as non-contentDavid Kalnischkies2014-12-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Real webservers (like apache) actually send an error page with a 416 response, but our client didn't expect it leaving the page on the socket to be parsed as response for the next request (http) or as file content (https), which isn't what we want at all… Symptom is a "Bad header line" as html usually doesn't parse that well to an http-header. This manifests itself e.g. if we have a complete file (or larger) in partial/ which isn't discarded by If-Range as the server doesn't support it (or it is just newer, think: mirror rotation). It is a sort-of regression of 78c72d0ce22e00b194251445aae306df357d5c1a, which removed the filesize - 1 trick, but this had its own problems… To properly test this our webserver gains the ability to reply with transfer-encoding: chunked as most real webservers will use it to send the dynamically generated error pages. (The tests and their binary helpers had to be slightly modified to apply, but the patch to fix the issue itself is unchanged.) Closes: 768797
* | calculate hashes while downloading in httpsDavid Kalnischkies2015-04-191-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | We do this in HTTP already to give the CPU some exercise while the disk is heavily spinning (or flashing?) to store the data avoiding the need to reread the entire file again later on to calculate the hashes – which happens outside of the eyes of progress reporting, so you might ended up with a bunch of https workers 'stuck' at 100% while they were busy calculating hashes. This is a bummer for everyone using apt as a connection speedtest as the https method works slower now (not really, it just isn't reporting done too early anymore).
* | calculate only expected hashes in methodsDavid Kalnischkies2015-04-191-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Methods get told which hashes are expected by the acquire system, which means we can use this list to restrict what we calculate in the methods as any extra we are calculating is wasted effort as we can't compare it with anything anyway. Adding support for a new hash algorithm is therefore 'free' now and if a algorithm is no longer provided in a repository for a file, we automatically stop calculating it. In practice this results in a speed-up in Debian as we don't have SHA512 here (so far), so we practically stop calculating it.
* | handle servers closing encoded connections correctlyDavid Kalnischkies2015-04-191-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Servers who advertise that they close the connection get the 'Closes' encoding flag, but this conflicts with servers who response with a transfer-encoding (e.g. encoding) as it is saved in the same flag. We have a better flag for the keep-alive (or not) of the connection anyway, so we check this instead of the encoding. This is in practice not much of a problem as real servers we talk to are HTTP1.1 servers (with keep-alive) and there isn't much point in doing chunked encoding if you are going to close anyway, but our simple testserver stumbles over this if pressed and its a bit cleaner, too. Git-Dch: Ignore
* | derive more of https from http methodDavid Kalnischkies2015-03-161-2/+0
| | | | | | | | | | | | | | | | | | Bug #778375 uncovered that https wasn't properly integrated in the class family tree of http as it was supposed to be leading to a NULL pointer dereference. Fixing this 'properly' was deemed to much diff for practically no gain that late in the release, so commit 0c2dc43d4fe1d026650b5e2920a021557f9534a6 just fixed the synptom, while this commit here is fixing the cause plus adding a test.
* | dispose http(s) 416 error page as non-contentDavid Kalnischkies2014-12-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Real webservers (like apache) actually send an error page with a 416 response, but our client didn't expect it leaving the page on the socket to be parsed as response for the next request (http) or as file content (https), which isn't what we want at all… Symptom is a "Bad header line" as html usually doesn't parse that well to an http-header. This manifests itself e.g. if we have a complete file (or larger) in partial/ which isn't discarded by If-Range as the server doesn't support it (or it is just newer, think: mirror rotation). It is a sort-of regression of 78c72d0ce22e00b194251445aae306df357d5c1a, which removed the filesize - 1 trick, but this had its own problems… To properly test this our webserver gains the ability to reply with transfer-encoding: chunked as most real webservers will use it to send the dynamically generated error pages. Closes: 768797
* | Fix backward compatiblity of the new pkgAcquireMethod::DropPrivsOrDie()Michael Vogt2014-10-131-0/+2
| | | | | | | | | | | | | | | | Do not drop privileges in the methods when using a older version of libapt that does not support the chown magic in partial/ yet. To do this DropPrivileges() now will ignore a empty Apt::Sandbox::User. Cleanup all hardcoded _apt along the way.
* | Send "Fail-Reason: MaximumSizeExceeded" from the methodMichael Vogt2014-10-071-0/+1
| | | | | | | | | | Communicate the fail reason from the methods to the parent and Rename() failed files.
* | make expected-size a maximum-size check as this is what we want at this pointMichael Vogt2014-10-071-2/+2
| |
* | make http size check workMichael Vogt2014-10-061-63/+7
|\|
| * Make Proxy-Auto-Detect check for each hostMichael Vogt2014-09-021-60/+2
| | | | | | | | | | | | | | | | | | When doing Acquire::http{,s}::Proxy-Auto-Detect, run the auto-detect command for each host instead of only once. This should make using "proxy" from libproxy-tools feasible which can then be used for PAC style or other proxy configurations. Closes: #759264
| * methods/http.cc: use Req.str() in debug outputMichael Vogt2014-06-241-1/+1
| |
* | Pass ExpectedSize to tthe backend methodMichael Vogt2014-08-261-1/+9
|/ | | | | This ensures that we can stop downloading if the server send too much data by accident (or by a malicious attempt)
* build http request in a stringstreamDavid Kalnischkies2014-04-261-57/+31
| | | | | beside reducing code a bit, it avoids oddball problems while building the string and doesn't trigger static analyse warnings.
* follow method attribute suggestions by gccDavid Kalnischkies2014-03-131-2/+2
| | | | | Git-Dch: Ignore Reported-By: gcc -Wsuggest-attribute={pure,const,noreturn}
* cleanup headers and especially #includes everywhereDavid Kalnischkies2014-03-131-8/+5
| | | | | | | | Beside being a bit cleaner it hopefully also resolves oddball problems I have with high levels of parallel jobs. Git-Dch: Ignore Reported-By: iwyu (include-what-you-use)
* StartPos is always positive for http/httpsDavid Kalnischkies2014-03-131-7/+2
| | | | | | | | | server.cc: In member function ‘bool ServerState::HeaderLine(std::string)’: server.cc:198:72: warning: format ‘%llu’ expects argument of type ‘long long unsigned int*’, but argument 3 has type ‘long long int*’ [-Wformat=] else if (sscanf(Val.c_str(),"bytes %llu-%*u/%llu",&StartPos,&Size) != 2) Git-Dch: Ignore Reported-By: gcc -Wpedantic
* warning: extra ‘;’ [-Wpedantic]David Kalnischkies2014-03-131-3/+3
| | | | | Git-Dch: Ignore Reported-By: gcc -Wpedantic
* Fix typos in documentation (codespell)Michael Vogt2014-02-221-2/+2
|
* fix various style/performance warnings in rredDavid Kalnischkies2014-01-301-1/+0
| | | | | Reported-By: cppcheck Git-Dch: Ignore
* correct some style/performance/warnings from cppcheckDavid Kalnischkies2014-01-161-3/+1
| | | | | | | | The most "visible" change is from utime to utimensat/futimens as the first one isn't part of POSIX anymore. Reported-By: cppcheck Git-Dch: Ignore
* add Acquire::http::Proxy-Auto-Detect to the apt.conf.5 manpage (closes: 726597)Michael Vogt2013-10-221-1/+5
|
* handle complete responses to https range requestsDavid Kalnischkies2013-10-011-0/+5
| | | | | | | | | | | | | Servers might respond with a complete file either because they don't support Ranges at all or the If-Range condition isn't statisfied, so we have to parse the headers curl gets ourself to seek or truncate the file we have so far. This also finially adds the testcase testing a bunch of partial situations for both, http and https - which is now all green. Closes: 617643, 667699 LP: 1157943
* refactor http client implementationDavid Kalnischkies2013-10-011-790/+201
| | | | | | | | | | | | | No effective behavior change, just shuffling big junks of code between methods and classes to split them into those strongly related to our client implementation and those implementing HTTP. The idea is to get HTTPS to a point in which most of the implementation can be shared even though the client implementations itself is completely different. This isn't anywhere near yet though, but it should beenough to reuse at least a few lines from http in https now. Git-Dch: Ignore
* replace "filesize - 1" trick in http with proper 416 handlingDavid Kalnischkies2013-10-011-6/+21
| | | | | | | | Our http client requests the "filesize - 1" for the small edgecase of handling a file which was completely downloaded, but not yet moved to the correct place as we get 416 errors in that case, but as we can handle 416 returns now we just special-case the situation of requesting the exact filesize and handle it as a 200 without content instead.
* retry without partial data after a 416 responseDavid Kalnischkies2013-10-011-2/+17
| | | | | | | | | | | | | | | If we get a 416 from the server it means the Range we asked for is above the real filesize of the file on the server. Mostly this happens if the server isn't supporting If-Range, but regardless of how we end up with the partial data, the data is invalid so we discard it and retry with a fresh plate and hope for the best. Old behavior was to consider 416 an error and retry with a different compression until we ran out of compression and requested the uncompressed file (which doesn't exist on most mirrors) with an accept line which server answered with "406 Not Acceptable". Closes: 710924
* Merge remote-tracking branch 'mvo/bugfix/coverity' into debian/sidMichael Vogt2013-07-281-1/+1
|\
| * fix off-by-one error in HttpMethod::​AutoDetectProxy()Michael Vogt2013-07-251-1/+1
| |
* | request absolute URIs from proxies again (0.9.9.3 regession)David Kalnischkies2013-07-261-4/+16
| | | | | | | | | | | | | | | | Commit 2b9c9b7f28b18f6ae3e422020e8934872b06c9f3 not only removes keep-alive, but also changes the request URI send to proxies which are required to be absolute URIs rather than the usual absolute paths. Closes: 717891