diff options
Diffstat (limited to 'doc/dpkg-tech.sgml')
-rw-r--r-- | doc/dpkg-tech.sgml | 509 |
1 files changed, 509 insertions, 0 deletions
diff --git a/doc/dpkg-tech.sgml b/doc/dpkg-tech.sgml new file mode 100644 index 000000000..3c01ee8f0 --- /dev/null +++ b/doc/dpkg-tech.sgml @@ -0,0 +1,509 @@ +<!doctype debiandoc system> +<book> +<title>dpkg technical manual</title> + +<author>Tom Lees <email>tom@lpsg.demon.co.uk</email></author> +<version>$Id: dpkg-tech.sgml,v 1.1 1998/07/02 02:58:12 jgg Exp $</version> + +<abstract> +This document describes the minimum necessary workings for the APT dselect +replacement. It gives an overall specification of what its external interface +must look like for compatibility, and also gives details of some internal +quirks. +</abstract> + +<copyright> +Copyright © Tom Lees, 1997. +<p> +APT and this document are free software; you can redistribute them and/or +modify them under the terms of the GNU General Public License as published +by the Free Software Foundation; either version 2 of the License, or (at your +option) any later version. + +<p> +For more details, on Debian GNU/Linux systems, see the file +/usr/doc/copyright/GPL for the full license. +</copyright> + +<toc sect> + +<chapt>Quick summary of dpkg's external interface +<sect id="control">Control files + +<p> +The basic dpkg package control file supports the following major features:- + +<list> +<item>5 types of dependencies:- + <list> + <item>Pre-Depends, which must be satisfied before a package may be + unpacked + <item>Depends, which must be satisfied before a package may be + configured + <item>Recommends, to specify a package which if not installed may + severely limit the usefulness of the package + <item>Suggests, to specify a package which may increase the + productivity of the package + <item>Conflicts, to specify a package which must NOT be installed + in order for the package to be configured + </list> +Each of these dependencies can specify a version and a depedency on that +version, for example "<= 0.5-1", "== 2.7.2-1", etc. The comparators available +are:- + <list> + <item>"<<" - less than + <item>"<=" - less than or equal to + <item>">>" - greater than + <item>">=" - greater than or equal to + <item>"==" - equal to + </list> +<item>The concept of "virtual packages", which many other packages may provide, +using the Provides mechanism. An example of this is the "httpd" virtual package, +which all web servers should provide. Virtual package names may be used in +dependency headers. However, current policy is that virtual packages do not +support version numbers, so dependencies on virtual packages with versions +will always fail. +<item>Several other control fields, such as Package, Version, Description, +Section, Priority, etc., which are mainly for classification purposes. The +package name must consist entirely of lowercase characters, plus the characters +'+', '-', and '.'. Fields can extend across multiple lines - on the second +and subsequent lines, there is a space at the beginning instead of a field +name and a ':'. Empty lines must consist of the text " .", which will be +ignored, as will the initial space for other continuation lines. This feature +is usually only used in the Description field. +</list> + +<sect>The dpkg status area + +<p> +The "dpkg status area" is the term used to refer to the directory where dpkg +keeps its various status files (GNU would have you call it the dpkg shared +state directory). This is always, on Debian systems, /var/lib/dpkg. However, +the default directory name should not be hard-coded, but #define'd, so that +alteration is possible (it is available via configure in dpkg 1.4.0.9 and +above). Of course, in a library, code should be allowed to override the +default directory, but the default should be part of the library (so that +the user may change the dpkg admin dir simply by replacing the library). + +<p> +Dpkg keeps a variety of files in its status area. These are discussed later +on in this document, but a quick summary of the files is here:- + +<list> +<item>available - this file contains a concatenation of control information +from all the packages which dpkg knows about. This is updated using the dpkg +commands "--update-avail <file>", "--merge-avail <file>", and +"--clear-avail". +<item>status - this file contains information on the following things for +every package:- + <list> + <item>Whether it is installed, not installed, unpacked, removed, + failed configuration, or half-installed (deconfigured in + favour of another package). + <item>Whether it is selected as install, hold, remove, or purge. + <item>If it is "ok" (no installation problems), or "not-ok". + <item>It usually also contains the section and priority (so that + dselect may classify packages not in available) + <item>For packages which did not initially appear in the "available" + file when they were installed, the other control information + for them. + </list> + <p> + The exact format for the "Status:" field is: + <example> + Status: Want Flag Status + </example> + Where <var>Want</> may be one of <em>unknown</>, <em>install</>, + <em>hold</>, <em>deinstall</>, <em>purge</>. <var>Flag</> + may be one of <em>ok</>, <em>reinstreq</>, <em>hold</>, + <em>hold-reinstreq</>. + <var>Status</> may be one of <em>not-installed</>, <em>unpacked</>, + <em>half-configured</>, <em>installed</>, <em>half-installed</> + <em>config-files</>, <em>post-inst-failed</>, <em>removal-failed</>. + The states are as follows:- + <taglist> + <tag>not-installed + <item>No files are installed from the package, it has no config files + left, it uninstalled cleanly if it ever was installed. + <tag>unpacked + <item>The basic files have been unpacked (and are listed in + /var/lib/dpkg/info/[package].list. There are config files present, + but the postinst script has _NOT_ been run. + <tag>half-configured + <item>The package was installed and unpacked, but the postinst script + failed in some way. + <tag>installed + <item>All files for the package are installed, and the configuration + was also successful. + <tag>half-installed + <item>An attempt was made to remove the packagem but there was a failure + in the prerm script. + <tag>config-files + <item>The package was "removed", not "purged". The config files are left, + but nothing else. + <tag>post-inst-failed + <item>Old name for half-configured. Do not use. + <tag>removal-failed + <item>Old name for half-installed. Do not use. + </taglist> + The two last items are only left in dpkg for compatibility - they are + understood by it, but never written out in this form. + + <p> + Please see the dpkg source code, <tt>lib/parshelp.c</tt>, + <em>statusinfos</>, <em>eflaginfos</> and <em>wantinfos</> for more + details. + +<item>info - this directory contains files from the control archive of every +package currently installed. They are installed with a prefix of "<packagename>.". +In addition to this, it also contains a file called <package>.list for every +package, which contains a list of files. Note also that the control file is +not copied into here; it is instead found as part of status or available. +<item>methods - this directory is reserved for "method"-specific files - each +"method" has a subdirectory underneath this directory (or at least, it can +have). In addition, there is another subdirectory "mnt", where misc. +filesystems (floppies, CDROMs, etc.) are mounted. +<item>alternatives - directory used by the "update-alternatives" program. It +contains one file for each "alternatives" interface, which contains information +about all the needed symlinked files for each alternative. +<item>diversions - file used by the "dpkg-divert" program. Each diversion takes +three lines. The first is the package name (or ":" for user diversion), the +second the original filename, and the third the diverted filename. +<item>updates - directory used internally by dpkg. This is discussed later, +in the section <ref id="updates">. +<item>parts - temporary directory used by dpkg-split +</list> + +<sect>The dpkg library files + +<p> +These files are installed under /usr/lib/dpkg (usually), but +/usr/local/lib/dpkg is also a possibility (as Debian policy dictates). Under +this directory, there is a "methods" subdirectory. The methods subdirectory +in turn contains any number of subdirectories for each general method +processor (note that one set of method scripts can, and is, used for more than +one of the methods listed under dselect). + +<p> +The following files may be found in each of these subdirectories:- + +<list> +<item>names - One line per method, two-digit priority to appear on menu +at beginning, followed by a space, the name, and then another space and the +short description. +<item>desc.<name> - Contains the long description displayed by dselect +when the cursor is put over the <name> method. +<item>setup - Script or program which sets up the initial values to be used +by this method. Called with first argument as the status area directory +(/var/lib/dpkg), second argument as the name of the method (as in the directory +name), and the third argument as the option (as in the names file). +<item>install - Script/program called when the "install" option of dselect is +run with this method. Same arguments as for setup. +<item>update - Script/program called when the "update" option of dselect is +run. Same arguments as for setup/install. +</list> + +<sect>The "dpkg" command-line utility + +<sect1>"Documented" command-line interfaces + +<p> +As yet unwritten. You can refer to the other manuals for now. See +<manref name="dpkg" section="8">. + +<sect1>Environment variables which dpkg responds to + +<p> +<list> +<item>DPKG_NO_TSTP - if set to a non-null value, this variable causes dpkg to +run a child shell process instead of sending itself a SIGTSTP, when the user +selects to background the dpkg process when it asks about conffiles. +<item>SHELL - used to determine which shell to run in the case when +DPKG_NO_TSTP is set. +<item>CC - used as the C compiler to call to determine the target architecture. +The default is "gcc". +<item>PATH - dpkg checks that it can find at least the following files in the +path when it wants to run package installation scripts, and gives an error if +it cannot find all of them:- + <list> + <item>ldconfig + <item>start-stop-daemon + <item>install-info + <item>update-rc.d + </list> +</list> + +<sect1>Assertions + +<p> +The dpkg utility itself is required for quite a number of packages, even if +they have been installed with a tool totally separate from dpkg. The reason for +this is that some packages, in their pre-installation scripts, check that your +version of dpkg supports certain features. This was broken from the start, and +it should have actually been a control file header "Dpkg-requires", or similar. +What happens is that the configuration scripts will abort or continue according +to the exit code of a call to dpkg, which will stop them from being wrongly +configured. + +<p> +These special command-line options, which simply return as true or false are +all prefixed with "--assert-". Here is a list of them (without the prefix):- + +<list> +<item>support-predepends - Returns success or failure according to whether +a version of dpkg which supports predepends properly (1.1.0 or above) is +installed, according to the database. +<item>working-epoch - Return success or failure according to whether a version +of dpkg which supports epochs in version properly (1.4.0.7 or above) is +installed, according to the database. +</list> + +<p> +Both these options check the status database to see what version of the "dpkg" +package is installed, and check it against a known working version. + +<sect1>--predep-package + +<p> +This strange option is described as follows in the source code: + +<example> +/* Print a single package which: + * (a) is the target of one or more relevant predependencies. + * (b) has itself no unsatisfied pre-dependencies. + * If such a package is present output is the Packages file entry, + * which can be massaged as appropriate. + * Exit status: + * 0 = a package printed, OK + * 1 = no suitable package available + * 2 = error + */ +</example> + +<p> +On further inspection of the source code, it appears that what is does is +this:- + +<list> +<item>Looks at the packages in the database which are selected as "install", +and are installed. +<item>It then looks at the Pre-Depends information for each of these packages +from the available file. When it find a package for which any of the +pre-dependencies are not satisfied, it breaks from the loop through the packages. +<item>It then looks through the unsatisfied pre-dependencies, and looks for +packages which would satisfy this pre-dependency, stopping on the first it +finds. If it finds none, it bombs out with an error. +<item>It then continues this for every dependency of the initial package. +</list> + +Eventually, it writes out the record of all the packages to satisfy the +pre-dependencies. This is used by the disk method to make sure that its +dependency ordering is correct. What happens is that all pre-depending +packages are first installed, then it runs dpkg -iGROEB on the directory, +which installs in the order package files are found. Since pre-dependencies +mean that a package may not even be unpacked unless they are satisfied, it is +necessary to do this (usually, since all the package files are unpacked in one +phase, the configured in another, this is not needed). + +<chapt>dpkg-deb and .deb file internals + +<p> +This chapter describes the internals to the "dpkg-deb" tool, which is used +by "dpkg" as a back-end. dpkg-deb has its own tar extraction functions, which +is the source of many problems, as it does not support long filenames, using +extension blocks. + +<sect>The .deb archive format + +<p> +The main principal of the new-format Debian archive (I won't describe the old +format - for that have a look at deb-old.5), is that the archive really is +an archive - as used by "ar" and friends. However, dpkg-deb uses this format +internally, rather than calling "ar". Inside this archive, there are usually +the folowing members:- + +<list> +<item>debian-binary +<item>control.tar.gz +<item>data.tar.gz +</list> + +<p> +The debian-binary member consists simply of the string "2.0", indicating the +format version. control.tar.gz contains the control files (and scripts), and +the data.tar.gz contains the actual files to populate the filesystem with. +Both tarfiles extract straight into the current directory. Information on the +tar formats can be found in the GNU tar info page. Since dpkg-deb calls +"tar -cf" to build packages, the Debian packages use the GNU extensions. + +<sect>The dpkg-deb command-line + +<p> +dpkg-deb documents itself thoroughly with its '--help' command-line option. +However, I am including a reference to these for completeness. dpkg-deb +supports the following options:- + +<list> +<item>--build (-b) <dir> - builds a .deb archive, takes a directory which +contains all the files as an argument. Note that the directory +<dir>/DEBIAN will be packed separately into the control archive. +<item>--contents (-c) <debfile> - Lists the contents of ther "data.tar.gz" +member. +<item>--control (-e) <debfile> - Extracts the control archive into a +directory called DEBIAN. Alternatively, with another argument, it will extract +it into a different directory. +<item>--info (-I) <debfile> - Prints the contents of the "control" file +in the control archive to stdout. Alternatively, giving it other arguments will +cause it to print the contents of those files instead. +<item>--field (-f) <debfile> <field> ... - Prints any number of +fields from the "control" file. Giving it extra arguments limits the fields it +prints to only those specified. With no command-line arguments other than a +filename, it is equivalent to -I and just the .deb filename. +<item>--extract (-x) <debfile> <dir> - Extracts the data archive +of a debian package under the directory <dir>. +<item>--vextract (-X) <debfile> <dir> - Same as --extract, except +it is equivalent of giving tar the '-v' option - it prints the filenames as +it extracts them. +<item>--fsys-tarfile <debfile> - This option outputs a gunzip'd version +of data.tar.gz to stdout. +<item>--new - sets the archive format to be used to the new Debian format +<item>--old - sets the archive format to be used to the old Debian format +<item>--debug - Tells dpkg-deb to produce debugging output +<item>--nocheck - Tells dpkg-deb not to check the sanity of the control file +<item>--help (-h) - Gives a help message +<item>--version - Shows the version number +<item>--licence/--license (UK/US spellings) - Shows a brief outline of the GPL +</list> + +<sect1>Internal checks used by dpkg-deb when building packages + +<p> +Here is a list of the internal checks used by dpkg-deb when building packages. +It is in the order they are done. + +<list> +<item>First, the output Debian archive argument, if it is given, is checked +using stat. If it is a directory, an internal flag is set. This check is only +made if the archive name is specified explicitly on the command-line. If the +argument was not given, the default is the directory name, with ".deb" +appended. +<item>Next, the control file is checked, unless the --nocheck flag was +specified on the command-line. dpkg-deb will bomb out if the second argument +to --build was a directory, and --nocheck was specified. Note that dpkg-deb +will not be able to determine the name of the package in this case. In the +control file, the following things are checked:- + <list> + <item>The package name is checked to see if it contains any invalid + characters (see <ref id="control"> for this). + <item>The priority field is checked to see if it uses standard values, + and user-defined values are warned against. However, note that this + check is now redundant, since the control file no longer contains + the priority - the changes file now does this. + <item>The control file fields are then checked against the standard + list of fields which appear in control files, and any "user-defined" + fields are reported as warnings. + <item>dpkg-deb then checks that the control file contains a valid + version number. + </list> +<item>After this, in the case where a directory was specified to build the +.deb file in, the filename is created as "directory/pkg_ver.deb" or +"directory/pkg_ver_arch.deb", depending on whether the control file contains +an architecture field. +<item>Next, dpkg-deb checks for the <dir>/DEBIAN directory. It complains +if it doesn't exist, or if it has permissions < 0755, or > 0775. +<item>It then checks that all the files in this subdir are either symlinks +or plain files, and have permissions between 0555 and 0775. +<item>The conffiles file is then checked to see if the filenames are too +long. Warnings are produced for each that is. After this, it checks that +the package provides initial copies of each of these conffiles, and that +they are all plain files. +</list> + +<chapt>dpkg internals + +<p> +This chapter describes the internals of dpkg itself. Although the low-level +formats are quite simple, what dpkg does in certain cases often does not +make sense. + +<sect id="updates">Updates + +<p> +This describes the /var/lib/dpkg/updates directory. The function of this +directory is somewhat strange, and seems only to be used internally. A function +called cleanupdates is called whenever the database is scanned. This function +in turn uses <manref name="scandir" section="3">, to sort the files in this +directory. Files who names do not consist entirely of digits are discarded. +dpkg also causes a fatal error if any of the filenames are different lengths. + +<p> +After having scanned the directory, dpkg in turn parses each file the same way +it parses the status file (they are sorted by the scandir to be in numerical +order). After having done this, it then writes the status information back +to the "status" file, and removes all the "updates" files. + +<p> +These files are created internally by dpkg's "checkpoint" function, and are +cleaned up when dpkg exits cleanly. + +<p> +Juding by the use of the updates directory I would call it a Journal. Inorder +to effeciently ensure the complete integrity of the status file dpkg will +"checkpoint" or journal all of it's activities in the updates directory. By +merging the contents of the updates directory (in order!!) against the +original status file it can get the precise current state of the system, +even in the event of a system failure while dpkg is running. + +<p> +The other option would be to sync-rewrite the status file after each +operation, which would kill performance. + +<p> +It is very important that any program that uses the status file abort if +the updates directory is not empty! The user should be informed to run dpkg +manually (what options though??) to correct the situation. + +<sect>What happens when dpkg reads the database + +<p> +First, the status file is read. This gives dpkg an initial idea of the packages +that are there. Next, the updates files are read in, overriding the status +file, and if necessary, the status file is re-written, and updates files are +removed. Finally, the available file is read. The available file is read +with flags which preclude dpkg from updating any status information from it, +though - installed version, etc., and is also told to record that the packages +it reads this time are available, not installed. + +<p> +More information on updates is given above. + +<sect>How dpkg compares version numbers + +<p> +Version numbers consist of three parts: the epoch, the upstream version, and +the Debian revision. Dpkg compares these parts in that order. If the epochs +are different, it returns immediately, and so on. + +<p> +However, the important part is how it compares the versions which are +essentially stored as just strings. These are compared in two distinct parts: +those consisting of numerical characters (which are evaluated, and then +compared), and those consisting of other characters. When comparing +non-numerical parts, they are compared as the character values (ASCII), but +non-alphabetical characters are considered "greater than" alphabetical ones. +Also note that longer strings (after excluding differences where numerical +values are equal) are considered "greater than" shorter ones. + +<p> +Here are a few examples of how these rules apply:- + +<example> +15 > 10 +0010 == 10 + +d.r > dsr +32.d.r == 0032.d.r +d.rnr < d.rnrn +</example> + +</book> |