diff options
Diffstat (limited to 'doc')
-rw-r--r-- | doc/cache.sgml | 824 |
1 files changed, 0 insertions, 824 deletions
diff --git a/doc/cache.sgml b/doc/cache.sgml deleted file mode 100644 index aea5a45c3..000000000 --- a/doc/cache.sgml +++ /dev/null @@ -1,824 +0,0 @@ -<!-- -*- mode: sgml; mode: fold -*- --> -<!doctype debiandoc PUBLIC "-//DebianDoc//DTD DebianDoc//EN"> -<book> -<title>APT Cache File Format</title> - -<author>Jason Gunthorpe <email>jgg@debian.org</email></author> -<version>$Id: cache.sgml,v 1.11 2003/02/12 15:05:44 doogie Exp $</version> - -<abstract> -This document describes the complete implementation and format of the APT -Cache file. The APT Cache file is a way for APT to parse and store a -large number of package files for display in the UI. It's primary design -goal is to make display of a single package in the tree very fast by -pre-linking important things like dependencies and provides. - -The specification doubles as documentation for one of the in-memory -structures used by the package library and the APT GUI. - -</abstract> - -<copyright> -Copyright © Jason Gunthorpe, 1997-1998. -<p> -APT and this document are free software; you can redistribute them and/or -modify them under the terms of the GNU General Public License as published -by the Free Software Foundation; either version 2 of the License, or (at your -option) any later version. - -<p> -For more details, on Debian GNU/Linux systems, see the file -/usr/share/common-licenses/GPL for the full license. -</copyright> - -<toc sect> - -<chapt>Introduction -<!-- Purpose {{{ --> -<!-- ===================================================================== --> -<sect>Purpose - -<p> -This document describes the implementation of an architecture -dependent binary cache file. The goal of this cache file is two fold, -firstly to speed loading and processing of the package file array and -secondly to reduce memory consumption of the package file array. - -<p> -The implementation is aimed at an environment with many primary package -files, for instance someone that has a Package file for their CD-ROM, a -Package file for the latest version of the distribution on the CD-ROM and a -package file for the development version. Always present is the information -contained in the status file which might be considered a separate package -file. - -<p> -Please understand, this is designed as a -CACHE FILE- it is not meant to be -used on any system other than the one it was created for. It is not meant to -be authoritative either, i.e. if a system crash or software failure occurs it -must be perfectly acceptable for the cache file to be in an inconsistent -state. Furthermore at any time the cache file may be erased without losing -any information. - -<p> -Also the structures and storage layout is optimized for use by the APT -GUI and may not be suitable for all purposes. However it should be possible -to extend it with associate cache files that contain other information. - -<p> -To keep memory use down the cache file only contains often used fields and -fields that are inexpensive to store, the Package file has a full list of -fields. Also the client may assume that all items are perfectly valid and -need not perform checks against their correctness. Removal of information -from the cache is possible, but blanks will be left in the file, and -unused strings will also be present. The recommended implementation is to -simply rebuild the cache each time any of the data files change. It is -possible to add a new package file to the cache without any negative side -effects. - -<sect1>Note on Pointer access -<p> -Every item in every structure is stored as the index to that structure. -What this means is that once the files is mmaped every data access has to -go through a fixup stage to get a real memory pointer. This is done -by taking the index, multiplying it by the type size and then adding -it to the start address of the memory block. This sounds complex, but -in C it is a single array dereference. Because all items are aligned to -their size and indexes are stored as multiples of the size of the structure -the format is immediately portable to all possible architectures - BUT the -generated files are -NOT-. - -<p> -This scheme allows code like this to be written: -<example> - void *Map = mmap(...); - Package *PkgList = (Package *)Map; - Header *Head = (Header *)Map; - char *Strings = (char *)Map; - cout << (Strings + PkgList[Head->HashTable[0]]->Name) << endl; -</example> -<p> -Notice the lack of casting or multiplication. The net result is to return -the name of the first package in the first hash bucket, without error -checks. - -<p> -The generator uses allocation pools to group similarly sized structures in -large blocks to eliminate any alignment overhead. The generator also -assures that no structures overlap and all indexes are unique. Although -at first glance it may seem like there is the potential for two structures -to exist at the same point the generator never allows this to happen. -(See the discussion of free space pools) - <!-- }}} --> - -<chapt>Structures -<!-- Header {{{ --> -<!-- ===================================================================== --> -<sect>Header -<p> -This is the first item in the file. -<example> - struct Header - { - // Signature information - unsigned long Signature; - short MajorVersion; - short MinorVersion; - bool Dirty; - - // Size of structure values - unsigned short HeaderSz; - unsigned short PackageSz; - unsigned short PackageFileSz; - unsigned short VersionSz; - unsigned short DependencySz; - unsigned short ProvidesSz; - unsigned short VerFileSz; - - // Structure counts - unsigned long PackageCount; - unsigned long VersionCount; - unsigned long DependsCount; - unsigned long PackageFileCount; - - // Offsets - unsigned long FileList; // PackageFile - unsigned long StringList; // StringItem - unsigned long VerSysName; // StringTable - unsigned long Architecture; // StringTable - unsigned long MaxVerFileSize; - - // Allocation pools - struct - { - unsigned long ItemSize; - unsigned long Start; - unsigned long Count; - } Pools[7]; - - // Package name lookup - unsigned long HashTable[2*1024]; // Package - }; -</example> -<taglist> -<tag>Signature<item> -This must contain the hex value 0x98FE76DC which is designed to verify -that the system loading the image has the same byte order and byte size as -the system saving the image - -<tag>MajorVersion -<tag>MinorVersion<item> -These contain the version of the cache file, currently 0.2. - -<tag>Dirty<item> -Dirty is true if the cache file was opened for reading, the client expects -to have written things to it and have not fully synced it. The file should -be erased and rebuilt if it is true. - -<tag>HeaderSz -<tag>PackageSz -<tag>PackageFileSz -<tag>VersionSz -<tag>DependencySz -<tag>VerFileSz -<tag>ProvidesSz<item> -*Sz contains the sizeof() that particular structure. It is used as an -extra consistency check on the structure of the file. - -If any of the size values do not exactly match what the client expects then -the client should refuse the load the file. - -<tag>PackageCount -<tag>VersionCount -<tag>DependsCount -<tag>PackageFileCount<item> -These indicate the number of each structure contained in the cache. -PackageCount is especially useful for generating user state structures. -See Package::Id for more info. - -<tag>VerSysName<item> -String representing the version system used for this cache - -<tag>Architecture<item> -Architecture the cache was built against. - -<tag>MaxVerFileSize<item> -The maximum size of a raw entry from the original Package file -(i.e. VerFile::Size) is stored here. - -<tag>FileList<item> -This contains the index of the first PackageFile structure. The PackageFile -structures are singly linked lists that represent all package files that -have been merged into the cache. - -<tag>StringList<item> -This contains a list of all the unique strings (string item type strings) in -the cache. The parser reads this list into memory so it can match strings -against it. - -<tag>Pools<item> -The Pool structures manage the allocation pools that the generator uses. -Start indicates the first byte of the pool, Count is the number of objects -remaining in the pool and ItemSize is the structure size (alignment factor) -of the pool. An ItemSize of 0 indicates the pool is empty. There should be -the same number of pools as there are structure types. The generator -stores this information so future additions can make use of any unused pool -blocks. - -<tag>HashTable<item> -HashTable is a hash table that provides indexing for all of the packages. -Each package name is inserted into the hash table using the following has -function: -<example> - unsigned long Hash(string Str) - { - unsigned long Hash = 0; - for (const char *I = Str.begin(); I != Str.end(); I++) - Hash += *I * ((Str.end() - I + 1)); - return Hash % _count(Head.HashTable); - } -</example> -<p> -By iterating over each entry in the hash table it is possible to iterate over -the entire list of packages. Hash Collisions are handled with a singly linked -list of packages based at the hash item. The linked list contains only -packages that match the hashing function. - -</taglist> - <!-- }}} --> -<!-- Package {{{ --> -<!-- ===================================================================== --> -<sect>Package -<p> -This contains information for a single unique package. There can be any -number of versions of a given package. Package exists in a singly -linked list of package records starting at the hash index of the name in -the Header->HashTable. -<example> - struct Pacakge - { - // Pointers - unsigned long Name; // Stringtable - unsigned long VersionList; // Version - unsigned long CurrentVer; // Version - unsigned long Section; // StringTable (StringItem) - - // Linked lists - unsigned long NextPackage; // Package - unsigned long RevDepends; // Dependency - unsigned long ProvidesList; // Provides - - // Install/Remove/Purge etc - unsigned char SelectedState; // What - unsigned char InstState; // Flags - unsigned char CurrentState; // State - - // Unique ID for this pkg - unsigned short ID; - unsigned long Flags; - }; -</example> - -<taglist> -<tag>Name<item> -Name of the package. - -<tag>VersionList<item> -Base of a singly linked list of version structures. Each structure -represents a unique version of the package. The version structures -contain links into PackageFile and the original text file as well as -detailed information about the size and dependencies of the specific -package. In this way multiple versions of a package can be cleanly handled -by the system. Furthermore, this linked list is guaranteed to be sorted -from Highest version to lowest version with no duplicate entries. - -<tag>CurrentVer<item> -CurrentVer is an index to the installed version, either can be -0. - -<tag>Section<item> -This indicates the deduced section. It should be "Unknown" or the section -of the last parsed item. - -<tag>NextPackage<item> -Next link in this hash item. This linked list is based at Header.HashTable -and contains only packages with the same hash value. - -<tag>RevDepends<item> -Reverse Depends is a linked list of all dependencies linked to this package. - -<tag>ProvidesList<item> -This is a linked list of all provides for this package name. - -<tag>SelectedState -<tag>InstState -<tag>CurrentState<item> -These correspond to the 3 items in the Status field found in the status -file. See the section on defines for the possible values. -<p> -SelectedState is the state that the user wishes the package to be -in. -<p> -InstState is the installation state of the package. This normally -should be OK, but if the installation had an accident it may be otherwise. -<p> -CurrentState indicates if the package is installed, partially installed or -not installed. - -<tag>ID<item> -ID is a value from 0 to Header->PackageCount. It is a unique value assigned -by the generator. This allows clients to create an array of size PackageCount -and use it to store state information for the package map. For instance the -status file emitter uses this to track which packages have been emitted -already. - -<tag>Flags<item> -Flags are some useful indicators of the package's state. - -</taglist> - - <!-- }}} --> -<!-- PackageFile {{{ --> -<!-- ===================================================================== --> -<sect>PackageFile -<p> -This contains information for a single package file. Package files are -referenced by Version structures. This is a singly linked list based from -Header.FileList -<example> - struct PackageFile - { - // Names - unsigned long FileName; // Stringtable - unsigned long Archive; // Stringtable - unsigned long Component; // Stringtable - unsigned long Version; // Stringtable - unsigned long Origin; // Stringtable - unsigned long Label; // Stringtable - unsigned long Architecture; // Stringtable - unsigned long Site; // Stringtable - unsigned long IndexType; // Stringtable - unsigned long Size; - - // Linked list - unsigned long NextFile; // PackageFile - unsigned short ID; - unsigned long Flags; - time_t mtime; // Modification time - }; -</example> -<taglist> - -<tag>FileName<item> -Refers the the physical disk file that this PacakgeFile represents. - -<tag>Archive -<tag>Component -<tag>Version -<tag>Origin -<tag>Label -<tag>Architecture -<tag>NotAutomatic<item> -This is the release information. Please see the files document for a -description of what the release information means. - -<tag>Site<item> -The site the index file was fetched from. - -<tag>IndexType<item> -A string indicating what sort of index file this is. - -<tag>Size<item> -Size is provided as a simple check to ensure that the package file has not -been altered. - -<tag>ID<item> -See Package::ID. - -<tag>Flags<item> -Provides some flags for the PackageFile, see the section on defines. - -<tag>mtime<item> -Modification time for the file at time of cache generation. - -</taglist> - - <!-- }}} --> -<!-- Version {{{ --> -<!-- ===================================================================== --> -<sect>Version -<p> -This contains the information for a single version of a package. This is a -single linked list based from Package.Versionlist. - -<p> -The version list is always sorted from highest version to lowest version by -the generator. Also there may not be any duplicate entries in the list (same -VerStr). - -<example> - struct Version - { - unsigned long VerStr; // Stringtable - unsigned long Section; // StringTable (StringItem) - unsigned long Arch; // StringTable - - // Lists - unsigned long FileList; // VerFile - unsigned long NextVer; // Version - unsigned long DependsList; // Dependency - unsigned long ParentPkg; // Package - unsigned long ProvidesList; // Provides - - unsigned long Size; - unsigned long InstalledSize; - unsigned long Hash; - unsigned short ID; - unsigned char Priority; - }; -</example> -<taglist> - -<tag>VerStr<item> -This is the complete version string. - -<tag>FileList<item> -References the all the PackageFile's that this version came out of. FileList -can be used to determine what distribution(s) the Version applies to. If -FileList is 0 then this is a blank version. The structure should also have -a 0 in all other fields excluding VerStr and Possibly NextVer. - -<tag>Section<item> -This string indicates which section it is part of. The string should be -contained in the StringItem list. - -<tag>Arch<item> -Architecture the package was compiled for. - -<tag>NextVer<item> -Next step in the linked list. - -<tag>DependsList<item> -This is the base of the dependency list. - -<tag>ParentPkg<item> -This links the version to the owning package, allowing reverse dependencies -to determine the package. - -<tag>ProvidesList<item> -Head of the linked list of Provides::NextPkgProv, forward provides. - -<tag>Size -<tag>InstalledSize<item> -The archive size for this version. For Debian this is the size of the .deb -file. Installed size is the uncompressed size for this version - -<tag>Hash<item> -This is a characteristic value representing this package. No two packages -in existence should have the same VerStr and Hash with different contents. - -<tag>ID<item> -See Package::ID. - -<tag>Priority<item> -This is the parsed priority value of the package. -</taglist> - - <!-- }}} --> -<!-- Dependency {{{ --> -<!-- ===================================================================== --> -<sect>Dependency -<p> -Dependency contains the information for a single dependency record. The records -are split up like this to ease processing by the client. The base of list -linked list is Version.DependsList. All forms of dependencies are recorded -here including Conflicts, Breaks, Suggests and Recommends. - -<p> -Multiple depends on the same package must be grouped together in -the Dependency lists. Clients should assume this is always true. - -<example> - struct Dependency - { - unsigned long Version; // Stringtable - unsigned long Package; // Package - unsigned long NextDepends; // Dependency - unsigned long NextRevDepends; // Reverse dependency linking - unsigned long ParentVer; // Upwards parent version link - - // Specific types of depends - unsigned char Type; - unsigned char CompareOp; - unsigned short ID; - }; -</example> -<taglist> -<tag>Version<item> -The string form of the version that the dependency is applied against. - -<tag>Package<item> -The index of the package file this depends applies to. If the package file -does not already exist when the dependency is inserted a blank one (no -version records) should be created. - -<tag>NextDepends<item> -Linked list based off a Version structure of all the dependencies in that -version. - -<tag>NextRevDepends<item> -Reverse dependency linking, based off a Package structure. This linked list -is a list of all packages that have a depends line for a given package. - -<tag>ParentVer<item> -Parent version linking, allows the reverse dependency list to link -back to the version and package that the dependency are for. - -<tag>Type<item> -Describes weather it is depends, predepends, recommends, suggests, etc. - -<tag>CompareOp<item> -Describes the comparison operator specified on the depends line. If the high -bit is set then it is a logical or with the previous record. - -<tag>ID<item> -See Package::ID. - -</taglist> - - <!-- }}} --> -<!-- Provides {{{ --> -<!-- ===================================================================== --> -<sect>Provides -<p> -Provides handles virtual packages. When a Provides: line is encountered -a new provides record is added associating the package with a virtual -package name. The provides structures are linked off the package structures. -This simplifies the analysis of dependencies and other aspects A provides -refers to a specific version of a specific package, not all versions need to -provide that provides. - -<p> -There is a linked list of provided package names started from each -version that provides packages. This is the forwards provides mechanism. -<example> - struct Provides - { - unsigned long ParentPkg; // Package - unsigned long Version; // Version - unsigned long ProvideVersion; // Stringtable - unsigned long NextProvides; // Provides - unsigned long NextPkgProv; // Provides - }; -</example> -<taglist> -<tag>ParentPkg<item> -The index of the package that head of this linked list is in. ParentPkg->Name -is the name of the provides. - -<tag>Version<item> -The index of the version this provide line applies to. - -<tag>ProvideVersion<item> -Each provides can specify a version in the provides line. This version allows -dependencies to depend on specific versions of a Provides, as well as allowing -Provides to override existing packages. This is experimental. - -<tag>NextProvides<item> -Next link in the singly linked list of provides (based off package) - -<tag>NextPkgProv<item> -Next link in the singly linked list of provides for 'Version'. - -</taglist> - - <!-- }}} --> -<!-- VerFile {{{ --> -<!-- ===================================================================== --> -<sect>VerFile -<p> -VerFile associates a version with a PackageFile, this allows a full -description of all Versions in all files (and hence all sources) under -consideration. - -<example> - struct pkgCache::VerFile - { - unsigned long File; // PackageFile - unsigned long NextFile; // PkgVerFile - unsigned long Offset; - unsigned short Size; - } -</example> -<taglist> -<tag>File<item> -The index of the package file that this version was found in. - -<tag>NextFile<item> -The next step in the linked list. - -<tag>Offset -<tag>Size<item> -These describe the exact position in the package file for the section from -this version. -</taglist> - - <!-- }}} --> -<!-- StringItem {{{ --> -<!-- ===================================================================== --> -<sect>StringItem -<p> -StringItem is used for generating single instances of strings. Some things -like Section Name are are useful to have as unique tags. It is part of -a linked list based at Header::StringList. -<example> - struct StringItem - { - unsigned long String; // Stringtable - unsigned long NextItem; // StringItem - }; -</example> -<taglist> -<tag>String<item> -The string this refers to. - -<tag>NextItem<item> -Next link in the chain. -</taglist> - <!-- }}} --> -<!-- StringTable {{{ --> -<!-- ===================================================================== --> -<sect>StringTable -<p> -All strings are simply inlined any place in the file that is natural for the -writer. The client should make no assumptions about the positioning of -strings. All stringtable values point to a byte offset from the start of the -file that a null terminated string will begin. - <!-- }}} --> -<!-- Defines {{{ --> -<!-- ===================================================================== --> -<sect>Defines -<p> -Several structures use variables to indicate things. Here is a list of all -of them. - -<sect1>Definitions for Dependency::Type -<p> -<example> -#define pkgDEP_Depends 1 -#define pkgDEP_PreDepends 2 -#define pkgDEP_Suggests 3 -#define pkgDEP_Recommends 4 -#define pkgDEP_Conflicts 5 -#define pkgDEP_Replaces 6 -#define pkgDEP_Breaks 8 -</example> -</sect1> - -<sect1>Definitions for Dependency::CompareOp -<p> -<example> -#define pkgOP_OR 0x10 -#define pkgOP_LESSEQ 0x1 -#define pkgOP_GREATEREQ 0x2 -#define pkgOP_LESS 0x3 -#define pkgOP_GREATER 0x4 -#define pkgOP_EQUALS 0x5 -</example> -The lower 4 bits are used to indicate what operator is being specified and -the upper 4 bits are flags. pkgOP_OR indicates that the next package is -or'd with the current package. -</sect1> - -<sect1>Definitions for Package::SelectedState -<p> -<example> -#define pkgSTATE_Unkown 0 -#define pkgSTATE_Install 1 -#define pkgSTATE_Hold 2 -#define pkgSTATE_DeInstall 3 -#define pkgSTATE_Purge 4 -</example> -</sect1> - -<sect1>Definitions for Package::InstState -<p> -<example> -#define pkgSTATE_Ok 0 -#define pkgSTATE_ReInstReq 1 -#define pkgSTATE_Hold 2 -#define pkgSTATE_HoldReInstReq 3 -</example> -</sect1> - -<sect1>Definitions for Package::CurrentState -<p> -<example> -#define pkgSTATE_NotInstalled 0 -#define pkgSTATE_UnPacked 1 -#define pkgSTATE_HalfConfigured 2 -#define pkgSTATE_UnInstalled 3 -#define pkgSTATE_HalfInstalled 4 -#define pkgSTATE_ConfigFiles 5 -#define pkgSTATE_Installed 6 -#define pkgSTATE_TriggersAwaited 7 -#define pkgSTATE_TriggersPending 8 -</example> -</sect1> - -<sect1>Definitions for Package::Flags -<p> -<example> -#define pkgFLAG_Auto (1 << 0) -#define pkgFLAG_New (1 << 1) -#define pkgFLAG_Obsolete (1 << 2) -#define pkgFLAG_Essential (1 << 3) -#define pkgFLAG_ImmediateConf (1 << 4) -</example> -</sect1> - -<sect1>Definitions for Version::Priority -<p> -Zero is used for unparsable or absent Priority fields. -<example> -#define pkgPRIO_Important 1 -#define pkgPRIO_Required 2 -#define pkgPRIO_Standard 3 -#define pkgPRIO_Optional 4 -#define pkgPRIO_Extra 5 -</example> -</sect1> - -<sect1>Definitions for PackageFile::Flags -<p> -<example> -#define pkgFLAG_NotSource (1 << 0) -#define pkgFLAG_NotAutomatic (1 << 1) -</example> -</sect1> - - <!-- }}} --> - -<chapt>Notes on the Generator -<!-- Notes on the Generator {{{ --> -<!-- ===================================================================== --> -<p> -The pkgCache::MergePackageFile function is currently the only generator of -the cache file. It implements a conversion from the normal textual package -file into the cache file. - -<p> -The generator assumes any package declaration with a -Status: line is a 'Status of the package' type of package declaration. -A Package with a Target-Version field should also really have a status field. -The processing of a Target-Version field can create a place-holder Version -structure that is empty to refer to the specified version (See Version -for info on what a empty Version looks like). The Target-Version syntax -allows the specification of a specific version and a target distribution. - -<p> -Different section names on different versions is supported, but I -do not expect to use it. To simplify the GUI it will merely use the section -in the Package structure. This should be okay as I hope sections do not change -much. - -<p> -The generator goes through a number of post processing steps after producing -a disk file. It sorts all of the version lists to be in descending order -and then generates the reverse dependency lists for all of the packages. -ID numbers and count values are also generated in the post processing step. - -<p> -It is possible to extend many of the structures in the cache with extra data. -This is done by using the ID member. ID will be a unique number from 0 to -Header->??Count. For example -<example> -struct MyPkgData; -MyPkgData *Data = new MyPkgData[Header->PackageCount]; -Data[Package->ID]->Item = 0; -</example> -This provides a one way reference between package structures and user data. To -get a two way reference would require a member inside the MyPkgData structure. - -<p> -The generators use of free space pools tend to make the package file quite -large, and quite full of blank space. This could be fixed with sparse files. - - <!-- }}} --> - -<chapt>Future Directions -<!-- Future Directions {{{ --> -<!-- ===================================================================== --> -<p> -Some good directions to take the cache file is into a cache directory that -contains many associated caches that cache other important bits of -information. (/var/cache/apt, FHS2) - -<p> -Caching of the info/*.list is an excellent place to start, by generating all -the list files into a tree structure and reverse linking them to the package -structures in the main cache file major speed gains in dpkg might be achieved. - - <!-- }}} --> - -</book> |