$OpenBSD: OpenBSD::Ustar.pod,v 1.13 2009/12/17 11:41:30 espie Exp $

=head1 NAME

OpenBSD::Ustar - simple access to Ustar C<tar(1)> archives

=head1 SYNOPSIS

    use OpenBSD::Ustar;

    # for reading
    open(my $in, "<", $arcnameforreading) or die;
    $rdarc = OpenBSD::Ustar->new($in, $destdir);
    while (my $o = $rdarc->next) {
        # decide whether we want to extract it, change object attributes
        $o->create;
    }
    $rdarc->close;

    # for writing
    open(my $out, ">", $arcnameforwriting) or die;
    $wrarc = OpenBSD::Ustar->new($out, $destdir);
    # loop
        my $o = $wrarc->prepare($filename);
        # tweak some entry parameters
        $o->write;
    $wrarc->close;

    # for copying
    open(my $in, "<", $arcnameforreading) or die;
    $rdarc = OpenBSD::Ustar->new($in, $destdir);
    open(my $out, ">", $arcnameforwriting) or die;
    $wrarc = OpenBSD::Ustar->new($out, $destdir);
    while (my $o = $rdarc->next) {
        $o->copy($wrarc);
    }
    $rdarc->close;
    $wrarc->close;

=head1 DESCRIPTION

C<OpenBSD::Ustar> provides an API to read, write and copy archives compatible
with C<tar(1)>.
For the time being, it can only handle the USTAR archive format.

A filehandle C<$fh> is associated with an C<OpenBSD::Ustar> object through
C<new>.
For archive reading, the filehandle should support C<read>.
C<OpenBSD::Ustar> does not rely on C<seek> or C<rewind>, so that it remains
usable on pipe outputs.
For archive writing, the filehandle should support C<print>.

Note that read and write support are mutually exclusive, though there is
no need to specify the mode used at creation time; it is implicitly
provided by the underlying filehandle.

Read access to an archive object C<$rdarc> occurs through a loop that
repeatedly calls C<$o = $rdarc-E<gt>next> to obtain the next archive entry.
It returns an archive entry object C<$o> that can be queried to decide
whether to extract this entry or not.

Write access to an archive object C<$wrarc> occurs through a user-directed
loop: obtain an archive entry through C<$o = $wrarc-E<gt>prepare($filename)>,
which can be tweaked manually and then written to the archive.

Most client software will specialize C<OpenBSD::Ustar> to its own needs.
Note however that C<OpenBSD::Ustar> is not designed for inheritance.
Composition (putting an C<OpenBSD::Ustar> object inside your class) and
forwarding methods (writing C<create> or C<next> methods that call the
corresponding C<OpenBSD::Ustar> method) are the correct way to use this API.

Note that C<OpenBSD::Ustar> does not do any caching.
The client code is responsible for retrieving and storing archives if it
needs to scan through them multiple times in a row.

Actual extraction is performed through C<$o-E<gt>create> and is not
mandatory.
Thus, client code can control whether it wants to extract archive
elements or not.

The C<create> method can take an optional C<$callback> argument, which will
be called regularly while extracting large objects, as
C<&$callback($donesize)>, with C<$donesize> the number of bytes already
extracted.

Small files can also be directly extracted to a scalar using
C<$v = $o-E<gt>contents>.

Actual writing is performed through C<$o-E<gt>write> and is not mandatory
either.

Archives should be closed using C<$wrarc-E<gt>close>, which will pad the
archive as needed and close the underlying file handle.
In particular, this is mandatory for write access, since valid archives
require blank-filled blocks.
This is equivalent to calling C<$wrarc-E<gt>pad>, which will complete the
archive with blank-filled blocks, then closing the associated file handle
manually.

Client code may decide to abort archive extraction early, or to run it
through until C<$arc-E<gt>next> returns false.
The C<OpenBSD::Ustar> object doesn't hold any hidden resources and doesn't
need any specific clean-up.
Client code is only responsible for closing the underlying filehandle and
terminating any associated pipe process.

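
The composition and forwarding approach recommended above could be sketched
as follows.
This is only an illustration under stated assumptions: C<My::Archive> and its
C<seen> counter are hypothetical names that are not part of
C<OpenBSD::Ustar>, and the constructor is assumed to receive an archive object
that has already been built (for instance by C<OpenBSD::Ustar-E<gt>new>), so
that only the C<next> and C<close> methods documented here are relied upon.

```perl
package My::Archive;    # hypothetical wrapper, not part of OpenBSD::Ustar
use strict;
use warnings;

# Hold an existing archive object (composition instead of inheritance).
sub new {
    my ($class, $archive) = @_;
    return bless { archive => $archive, seen => 0 }, $class;
}

# Forward next() to the wrapped archive, counting the entries returned.
sub next {
    my $self = shift;
    my $o = $self->{archive}->next;
    $self->{seen}++ if $o;
    return $o;
}

# Forward close() to the wrapped archive.
sub close {
    my $self = shift;
    return $self->{archive}->close;
}

# Number of entries obtained through this wrapper so far.
sub seen {
    my $self = shift;
    return $self->{seen};
}

1;
```

With a real archive, such a wrapper might be built as
C<My::Archive-E<gt>new(OpenBSD::Ustar-E<gt>new($in, $destdir))> and then used
in the same read loop as the archive object itself.
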

An object C<$o> returned through C<next> or through C<prepare> holds all
the characteristics of the archive header:

=over 20

=item C<$o-E<gt>IsDir>

true if archive entry is a directory

=item C<$o-E<gt>IsFile>

true if archive entry is a file

=item C<$o-E<gt>IsLink>

true if archive entry is any kind of link

=item C<$o-E<gt>IsSymLink>

true if archive entry is a symbolic link

=item C<$o-E<gt>IsHardLink>

true if archive entry is a hard link

=item C<$o-E<gt>{name}>

filename

=item C<$o-E<gt>{mode}>

C<chmod(2)> mode

=item C<$o-E<gt>{mtime}>

C<utime(2)> modification time

=item C<$o-E<gt>{uid}>

owner user ID

=item C<$o-E<gt>{gid}>

owner group ID

=item C<$o-E<gt>{uname}>

owner user name

=item C<$o-E<gt>{gname}>

owner group name

=item C<$o-E<gt>{linkname}>

name of the source link, if applicable

=back

The fields C<name>, C<mode>, C<mtime>, C<uid>, C<gid> and C<linkname> can be
altered before calling C<$o-E<gt>create> or C<$o-E<gt>write>, and will
properly influence the resulting file.
The relationship between C<uid> and C<uname>, and between C<gid> and
C<gname>, conforms to the usual behavior of the USTAR format.

In addition, client code may define C<$o-E<gt>{cwd}> in a way similar to
C<tar(1)>'s C<-C> option to affect the creation of hard links.

All creation commands happen relative to the current destdir of the C<$arc>
C<OpenBSD::Ustar> object.
This is set at creation, and can later be changed through
C<$arc-E<gt>destdir($value)>.

During writing, hard link status is determined according to already written
archive entries: a name that references a file which has already been
written will be granted hard link status.

Hard links cannot be copied from one archive to another unless the original
file has also been copied.
Calling C<$o-E<gt>alias($arc, $name)> will trick the destination archive
C<$arc> into believing C<$o> has been copied under the given C<$name>, so
that further hard links will be copied over.

Archives can be copied by creating separate archives for reading and
writing.
Calling C<$o = $rdarc-E<gt>next> and C<$o-E<gt>copy($wrarc)> will copy an
entry obtained from C<$rdarc> to C<$wrarc>.
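
The copy loop described above can be combined with the entry predicates to
copy only part of an archive.
The following is a minimal sketch, not part of the module: C<copy_matching>
and its C<$want> predicate are hypothetical names introduced for
illustration, and only the C<next> and C<copy> methods documented here are
assumed.

```perl
use strict;
use warnings;

# Hypothetical helper: copy entries from $rdarc to $wrarc, keeping only
# those for which $want->($o) returns true.
# Returns the number of entries copied; closing both archives is left
# to the caller.
sub copy_matching {
    my ($rdarc, $wrarc, $want) = @_;
    my $copied = 0;
    while (my $o = $rdarc->next) {
        next unless $want->($o);    # skip entries the caller rejects
        $o->copy($wrarc);
        $copied++;
    }
    return $copied;
}
```

For instance, C<copy_matching($rdarc, $wrarc, sub { !$_[0]-E<gt>IsDir })>
would skip directory entries.
A predicate that rejects a file but keeps a hard link to it runs into the
restriction on hard links noted above.
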