prune empties in checkout
#! /usr/bin/perl -w
$vsn = '$Revision: 1.22 $'; #'
$vsn =~ s/.+:\s*([\d.]+).+/$1/;
use constant TPUB => 'PUBLIC';
use constant TPRV => 'PRIVATE';
use constant TUNK => 'UNKNOWN';
# $Id: cvspublish.pl,v 1.22 2003/02/03 02:24:07 mca1001 Exp $
## Introduction ############################################################
=head1 NAME
cvspublish.pl - render a (piece of a) CVS repository for a web server
=head1 SYNOPSIS
cvspublish.pl [-d <target directory>] [--help|-h]
[-m websitemodule] [--tagonly|-T]
[-P htmlmodule] [--quiet|-q]
[-R REL_TAG] [--dryrun|-n]
[-p [<dir>]] [-c [<dir>]] [-s <cvssubdir>]
=head1 README
The program exports a CVS module (possibly the whole repository) and
adds enough HTML to allow innocent bystanders to pluck out individual
files. It is designed for publishing a personal script collection.
blibbly test chunk
should be removed
later
=head1 DESCRIPTION
If you are like me, you probably have a directory in CVS which
contains your website, plus some others that contain your code
projects ... and you want to publish some of your code projects in a
subdirectory of your website.
This script will do selective publication, including a few sanity
checks, along with neat descriptions and links to full documentation
and revision history.
It can call F<cvs2html> and F<pod2html> to get the hard work done.
=cut
## Set options #############################################################
{
my $opts = <<"=cut"; # aiee! it's horrible!
=head1 OPTIONS
As defined in the code, they are
help|h --help or -h request the syntax help message (which you
may find clearer than the manpage)
module|m=s Specify name of top level module to export
subdir|s=s Subdirectory to index, relative to module
c2hdir|c:s Place for cvs2html to scribble, relative to module
With no parameter, cvs2html is not used.
p2hdir|p:s Place to put HTML converted from POD files.
With no parameter, pod2html is not used.
tagonly|T Just tag the repository and exit
release|R=s Existing tag to export with
Prevents any 'cvs rtag' commands being run
destdir|d=s Directory to export into
autopub|P:s Tag the specified module to ensure new items are public.
dryrun|n Print the commands but do not change anything
quiet|q Most of the noise is in dryrun mode...
C<foo|f=s> means you can specify C<--foo I<string>> or C<-f
I<string>>. No C<=s> means the thing is just a switch. See
L<Getopt::Long> for more details.
I suspect the fact that this piece of documentation is passed to my
perl script is a bug, but it is a very handy bug!
The default values are
module => website
autopub => website_plain
subdir => cvspublish
c2hdir => cvshistory
p2hdir => cvspods
The default release tag is of the form I<REL_yyyy_dd_mm>.
Output goes to F<index.html> in the C<subdir> directory, overwriting
an existing file if necessary. The destination should probably be
configurable...
=cut
# ^^^^^^^^^^^^^^^^^^ change them right there in that there heredoc/pod
my $last = '';
foreach my $opt (split /[\r\n]+/, $opts) {
if ($opt =~ m@^ (\w+)\s+=>\s+(.+?)\s*$@) { $opts{$1}=$2 }
elsif ($opt =~ m@^ (\S+)\s+(.+?)\s*$@) { $optdescr{$last=$1}=$2 }
elsif ($opt =~ m@^\s{10,}(.+?)\s*$@ && $last) {
$optdescr{$last} .= "\n$1";
} # else ignore the other drivel
}
}
## Business ################################################################
use strict;
use vars qw($vsn %opts %optdescr %FILES %DIRS @CVSDIRS %LICENCES);
use Getopt::Long;
use HTML::Entities;
use POSIX qw(tmpnam);
# $DIRS{$dir}->{$leafname} == $FILES{$dir.$leafname}
# Value is a hash with "source:type" => info-value
# This lot gets rendered in the end...
# CVSDIRS = list of directories to delete (under subdir)
$| = 1;
# perl -ne 'print "$1;\n" if /^\s*(sub\s+\w+\s*\(\W+\))[^;]*(\}.*)?$/' ~/bin/cvspublish.pl
sub check_env ();
sub set_rtags ();
sub run_cmds (@);
sub validate_tag ($$);
sub set_tempdir ();
sub safe_chdir ($);
sub checkout_n_release ();
sub cvs2html ();
sub safe_mkdir ($$);
sub trundle ($);
sub debug_hash ($;$);
sub get_file_data ($$);
sub read_cvslog ($);
sub read_file ($);
sub bytesize ($);
sub check_c2h ($);
sub make_pod ($$);
sub rubber_stamp ();
sub slurp_file ($);
sub render ();
die "argh, the options are gone"
unless scalar keys %opts && (3 < scalar keys %optdescr);
Getopt::Long::Configure('no_ignore_case');
if (Getopt::Long::GetOptions(\%opts, keys %optdescr)) {
$opts{'help'} = 'You requested help' if $opts{'help'};
} else {
$opts{'help'} = 'Invalid syntax';
}
$opts{'help'} = "You haven't specified a module to export"
unless $opts{'help'} || $opts{'module'};
die "syntax message here ($opts{'help'})" if $opts{'help'};
check_env();
foreach (@opts{qw(destdir module subdir p2hdir c2hdir)})
{ s@[\\/]+$@@ if defined }
set_rtags() unless $opts{'release'};
if ($opts{'tagonly'}) {
print "Tags done, exiting as requested\n"
unless $opts{'quiet'};
exit;
}
set_tempdir(); # chdir into destdir
checkout_n_release();
cvs2html() if $opts{'c2hdir'};
safe_mkdir("$opts{'destdir'}/$opts{'module'}/$opts{'p2hdir'}", 0777)
if $opts{'p2hdir'};
safe_chdir("$opts{'destdir'}/$opts{'module'}/$opts{'subdir'}");
trundle("./");
# debug_hash(%DIRS);
# print "=" x 78, "\n";
# debug_hash(%FILES);
rubber_stamp();
render(); # %DIRS / %FILES --> html, somewhere
safe_chdir("$opts{'destdir'}/$opts{'module'}/$opts{'subdir'}");
# TODO: this doesn't cover two-level directories in website_plain
run_cmds([ "rm", "-rf", @CVSDIRS, "../CVS", "../*/CVS" ]);
exit;
#-----------------------------------------------------------------------------
sub check_env ()
#| Check we have necessary binaries on the PATH
#| plus a CVSROOT
{
die('Obviously I forgot to tell you to set the CVSROOT '.
'environment variable') unless $ENV{'CVSROOT'};
die("I won't get far without a PATH variable") unless $ENV{'PATH'};
my @paths = split /:/, $ENV{'PATH'};
my @bins = qw(rm echo cvs);
# Win32 peeps should try the Cygwin stuff. It even has bash!
# Folks without these should do eg. --p2hdir=""
push @bins, 'pod2html' if $opts{'p2hdir'};
push @bins, 'cvs2html' if $opts{'c2hdir'};
BIN:
foreach my $bin (@bins) {
foreach my $path (@paths) {
$path .= '/' unless $path =~ m@[\\/]$@;
next BIN if -f $path.$bin && -x _;
}
die "Can't find '$bin' on the path ($ENV{'PATH'})";
}
}
sub set_rtags ()
#| Put file tags into a consistent state & make the release tag.
# Anything PRIVATE is not PUBLIC, anything neither PUBLIC nor PRIVATE
# is UNKNOWN. PUBLIC files are left with the release tag on .. the
# HEAD revision?
{
# whip up a release tag
unless ($opts{'release'}) {
my @tm = localtime();
# I suppose a 2-digit date format would mean at most 36,000
# tags, if you run every day for a century...
$opts{'release'} = sprintf('REL_%04d_%02d_%02d',
$tm[5]+1900, $tm[4]+1, $tm[3]);
}
validate_tag($opts{'release'}, "release");
# rtag games to tag our stuff
my ($pb, $pv, $un, $rel, $mod) = (TPUB, TPRV, TUNK, $opts{'release'}, '.');
my @cmds = (
"rtag -r $pv -d $pb $mod", # not public if private
"rtag $un $mod", # (generally unknown)
"rtag -r $pb -d $un $mod", # not unknown if public
"rtag -r $pv -d $un $mod", # or private
"rtag -F $rel $mod", # (generally, released)
"rtag -r $pv -d $rel $mod", # not release if private
"rtag -r $un -d $rel $mod", # or unknown
);
# possibly prefix those with public for the HTML
if ($opts{'autopub'}) {
validate_tag($opts{'autopub'}, "autopub");
# ("rtag -d $pv $opts{'autopub'}")
unshift @cmds, ("rtag $pb $opts{'autopub'}");
}
# change CVS options here
run_cmds(map { ["cvs", "-q", (split /\s+/, $_) ] } @cmds); # enercute!
}
sub run_cmds (@)
#| Run each argument in system()
#| Commands may be string or ref of list of args
{
foreach my $cmd (@_) {
my @cmd = (ref $cmd ? @{$cmd} : $cmd);
if ($opts{'dryrun'}) {
print "Would run '@cmd'\n"
unless $opts{'quiet'};
} else {
print "Running '@cmd'\n"
unless $opts{'quiet'};
system(@cmd);
die "Command '@cmd' failed: $! (error code $?)" if $?;
}
}
}
sub validate_tag ($$)
#| die unless the tag is OK for CVS
{
my ($tag, $opt) = (shift, shift);
die "CVS tag '$tag' (for the $opt option) is not valid"
unless $tag =~ /^[a-z][-a-z0-9_]*$/i;
}
sub set_tempdir ()
#| Create (if necessary) and 'cd' to the destdir
{
unless ($opts{'destdir'}) {
do { $opts{'destdir'} = tmpnam() }
until not -e $opts{'destdir'};
print "Random --destdir=\"$opts{'destdir'}\"\n";
}
safe_mkdir($opts{'destdir'}, 0777);
safe_chdir($opts{'destdir'});
}
sub safe_chdir ($)
#| Go there or die in a freak accident
{
my $dir = shift;
chdir($dir) or
die "Failed to change working directory to '$dir': $!";
print "Now in '$dir'\n" if $opts{'dryrun'} && !$opts{'quiet'};
}
sub checkout_n_release ()
#| cvs co -P -r RELTAG MODULE
#| cvs rel MODULE
{
my $dir = "$opts{'destdir'}/$opts{'module'}";
warn("Slightly surprised to see the $dir ".
"directory already, but will continue anyway\n")
if -d $dir;
run_cmds([ qq(cvs checkout -P -r $opts{'release'} $opts{'module'}) ]);
warn("I was expecting to see a $dir directory, but it's not ".
"there. Will continue anyway (I assume $opts{'destdir'} ".
"is absolute?!)\n")
unless -d $dir;
run_cmds("echo y | cvs release $opts{'module'} ; echo",
# "must be standing immediately above it" ... but this cancels the
# checkout when I check with 'cvs history'
"echo y | cvs release $opts{'module'}/$opts{'subdir'} ; echo");
}
sub cvs2html ()
#| Make a c2hdir and run cvs2html at it
{
my $site = "$opts{'destdir'}/$opts{'module'}/";
my @c2hopts = ();
safe_mkdir($site.$opts{'c2hdir'}, 0777);
safe_chdir($site.$opts{'subdir'});
# # last 30 changes, chronologically
push @c2hopts, qw(-N 30 -C changelogs.html);
# diffs with nobbled keywords
push @c2hopts, qw(-a -k);
run_cmds([ "cvs2html",
"-o", "../$opts{'c2hdir'}/c2h",
"-l", "../$opts{'subdir'}",
"-w", "200",
@c2hopts,
]);
}
sub safe_mkdir ($$)
#| Make it after doing the sensible checks
{
my ($d, $m) = (shift, shift);
if (-d $d) {
# do nothing
} elsif (-e $d) {
die "Destination $d already exists but is not a directory";
} elsif ($opts{'dryrun'}) {
print "Would create directory $d\n" unless -d $d || $opts{'quiet'};
} else {
# So check the cwd if it fails on a relative name?
mkdir($d, $m) or die "Failed to make directory '$d': $!";
}
}
sub trundle ($)
#| Scan specified directory (relative to CWD) and do jobs on files
{
my $dir = shift;
unless (opendir DIR, $dir) {
# maybe this should be fatal... it's hardly a good thing
warn "Can't scan directory '$dir': $!, skipping";
return;
}
my @cont = sort readdir DIR;
closedir DIR;
my @dirs = ();
my $f;
while (defined ($f = shift @cont)) {
next if $f eq '.' or $f eq '..';
my $path = $dir.$f;
if (-d $path) {
if ($f eq 'CVS') {
push @CVSDIRS, $path; # delete later
} else {
push @dirs, $f; # recurse later
}
} elsif (-f $path) {
next if $path eq './index.html'; # Hi, Mum! <wave>
my $lastANAME = $path;
$lastANAME =~ s:/[^/]+$::; # remove filename, get parent dir
$lastANAME =~ s:^\./::;
$lastANAME =~ s:[^\w]:.:g; # massage, copied from $pdirname 8-(
$lastANAME =~ s/^\.|\.$//g;
my $updirs = $path;
$updirs =~ s:(^|/)/*[^/]+(?=/):$1..:g; # eww
$updirs =~ s:/[^/]+$::;
$DIRS{$dir} = {} unless defined $DIRS{$dir};
$FILES{$path} = $DIRS{$dir}->{$f} =
get_file_data($path, "$updirs/cvspublish/#$lastANAME");
# TODO: Argh! I can't figure out the final directory name from here!
} else {
warn("$path is neither file nor directory. ".
"The webserver will love that!\n");
}
}
foreach $f (@dirs) {
trundle($dir.$f."/");
}
}
sub debug_hash ($;$)
#| Dump hash contents
{
my %h = %{ shift() };
my $spc = (@_ ? shift : '') . ' ';
foreach my $k (sort keys %h) {
my $v = $h{$k};
print $spc, $k, " => ", $v, "\n";
debug_hash($v, $spc) if ref($v) eq 'HASH';
}
}
sub get_file_data ($$)
#| Pull CVS logs and tags from file
#| Run pod2html maybe
#| Return the data for stashing
{
my ($file, $backlink) = @_;
my %data;
my $add = (sub ($$) { my ($k, $v) = @_;
warn "$file: Overwriting '$k' info '$data{$k}' with '$v'"
if defined $data{$k};
$data{$k} = $v;
});
my $h = read_cvslog($file);
foreach (keys %{$h}) { $add->("cvslog:$_", $h->{$_}) }
#TODO: $data{'type'} = guess_type($file); # there's a perl interface to file(1) already isn't there?
$h = read_file($file);
foreach (keys %{$h}) { $add->($_, $h->{$_}) }
if ($opts{'c2hdir'}) {
$h = check_c2h($file);
foreach (keys %{$h}) { $add->("c2h:$_", $h->{$_}) }
}
$add->('p2h:href', make_pod($file, $backlink))
if $opts{'p2hdir'} && $data{'pod:haspod'};
return \%data;
}
sub read_cvslog ($)
#| Do `cvs log file` and parse the output, return a hashref of titbits
{
my $file = shift;
my ($cmd, %d, $cl) = "cvs log 2>&1 \Q$file";
# Nasty little helper functions. They should go somewhere.
my $grump = (sub { print "debug dump of extracted data:\n";
debug_hash(%d);
die("\nParse of '$cmd' failed on '$_[0]' step\n".
"Text left = >>$cl<<\n") });
my $grab = (sub { $grump->("$_[0] chunk")
unless $cl =~ s/\A(.*?)\n?-{28}\n//sm;
return $1; });
if ($opts{'dryrun'}) {
print "Running '$cmd' (it's read only)\n"
unless $opts{'quiet'};
}
$cl = `$cmd`;
return( { 'broken' => "$cl ($!)" } )
if $?;
# sorry, these just fit what I've got here...
$grump->('rcsfile') unless $cl =~ s/^\s*RCS file:\s+[^\n]+\n//s;
$grump->('workfile') unless $cl =~ s/^Working file:\s+.+\n//;
$grump->('head') unless $cl =~ s/^head:\s*([\d.]*)\n//;
$d{'head'} = $1;
$grump->('branch') unless $cl =~ s/^branch:\s*([\d.]*)\n//;
$d{'branch'} = $1;
$grump->('locks') unless $cl =~ s/^locks:.*\n//;
$grump->('acl') unless $cl =~ s/^access list:.*\n//;
$grump->('symname head') unless $cl =~ s/^symbolic names:\s*\n//;
$d{'symnames'} = "";
while ($cl =~ s/^(?:\t\s*|\s{4,})([-_a-z0-9A-Z]+):\s*([\d.]*)\n//) {
$d{'symnames'} .= ($d{'symnames'} ? " " : "") . "$1:$2";
}
$grump->('keysubst') unless $cl =~ s/^keyword substitution:\s*(\S*)\n//;
$d{'keysubst'} = $1;
$grump->('totrev') unless $cl =~ s/^total revisions:\s+\d+;\s+//;
$grump->('selrev') unless $cl =~ s/^selected revisions:\s+(\d+)\n//;
my $revcounter = $1;
$cl =~ s/={77}\Z/'-' x 28/e;
$grump->('descr') unless $cl =~ s/^description:\s*\n//;
$d{'description'} = $grab->('description');
while ($revcounter--) {
$grump->('rev') unless $cl =~ s/^revision\s+([\d.]+)\n//;
my $rev = $1;
$grump->('rhdr') unless $cl =~ s/^(date:\s+.+)\n//;
$_ = $1; s/;\s+/\n/g; s/;\s+$//;
$d{"revhdr $rev"} = $_;
$d{"revlog $rev"} = $grab->("rev $rev");
}
return \%d;
}
sub read_file ($) # args = filename
#| Read the file and return some data
{
my $file = shift;
my %d;
my $add = ( sub ($$) { my ($k, $v) = (lc(shift()), shift);
$d{$k} = ($d{$k} ? $d{$k}."\n" : '') . $v; } );
die "Can't see file $file" unless -f $file && -r _;
$add->('file:size', bytesize(-s _));
if (not open FILE, "< $file\0") {
warn "Failed to read '$file': $!";
return undef;
}
# TODO: this reads the whole file, which is probably a waste of time
# OK, now it'll bin out on null bytes.
# Anyway, the loop needs rewriting, and the pod reader needs work (or
# delegation) too
my @unpeeked = (); # lines to not have read yet
while(defined( $_ = (@unpeeked ? shift @unpeeked : <FILE>) )) {
last if /\x00/;
# {{{ Foldable
if (/\s\{{3}\s+Licence\b/) {
$d{'folding:haslicence'} = 1;
}
# }}}
# primitive pod reading (formatted for indent elsewhere)
elsif (my ($cmd, $arg) =
/^=(head[12]|item|over|back|cut|pod|for|begin|end)(\s+.+)?$/) {
next unless $arg; # not interested in =back or whatever
$arg =~ s/^\s+|\s+$//g;
if ($cmd =~ /^head/ && $arg eq 'README') {
my $txt = '';
while (<FILE>) {
last if /=cut\b/;
if ($txt =~ /\n\n$/ && /^=\w+/) {
unshift @unpeeked, $_;
last;
}
$txt .= $_;
}
$add->('pod:description', $txt);
} elsif ($cmd =~ /^head/) {
$d{'pod:haspod'} = 1; # don't p2h just for a README section
} elsif ($cmd eq 'for' && $arg eq 'cvspublish') {
while (<FILE>) {
if (m@(\w+)\s*:\s*(.+?)\s*$@) {
$add->("tagged:$1", $2)
}
}
}
}
# quick stab at single-line <meta> tags in HTML
# they have to be formatted the obvious way!
elsif ( m@<meta\s+name\s*=\s*"(description|author|copyright)"\s+content\s*=\s*"([^\"]+)"@ ) {
$add->("html:$1", $2);
}
# special (grim) data tags should work in any commentable text format
elsif (/^\s*(.*?)\s*CVSPUBLISH:{3}\s*(.*?)\s*$/) {
# This is a block of info for us - see pod for an example
# You can have more than one of them...
my @comment = ($1, $2);
my $closed = 0;
my $lastkey = '';
while (<FILE>) {
s/^\s+|\s+$//g;
my ($orig, $l, $r) = ($_, map {$_ eq ''} @comment);
if (!$l) {
substr($_, 0, length($comment[0])) = ''
if $l = substr($_, 0, length($comment[0]))
eq $comment[0];
}
if (!$r) {
substr($_, -length($comment[1]) ) = ''
if $r = substr($_, -length($comment[1]) )
eq $comment[1];
}
s/^\s+|\s+$//g;
if ($l && $r && /:{3}CVSPUBLISH/) {
$closed = 1;
last; # graceful end
} elsif ($l && $r && /^:\s*(.+)/ && $lastkey) {
$add->("tagged:$lastkey", $1);
} elsif ($l && $r && /^(\w+)\s*:\s*(.+)/) {
my $val = $2;
$lastkey = $1;
$add->("tagged:$lastkey", $val);
if (lc($lastkey) eq 'rubberstamp') {
$d{'tagged:rubberstamp'} = $val;
die "Can't stamp '$file' with '$val', file not found or not readable"
unless -f $val && -r $val;
@d{qw(tagged:rubberstamp_left tagged:rubberstamp_right)} = @comment;
$lastkey = '';
}
} else {
# TODO: better error message at the very least!
warn("Invalid line '$orig' --> '$_' in cvspublish ".
"data block - skipping remainder");
$add->('tagged:broken', "$orig (line $.)");
$closed = 1;
last;
}
}
$add->('tagged:broken', "cvspublish data block not closed")
unless $closed;
}
}
close FILE;
return \%d;
}
sub bytesize ($)
#| Return number of bytes in a convenient form
{
my ($num, $mod, $val) = (shift, " bytes");
$val = $num;
if ($num > 4095 ) { $val = int($num / 1024); $mod = "k" }
if ($num > 4095*1024 ) { $val = int($num / 1024 / 1024); $mod = "M" }
if ($num > 4095*1024*1024) { $val = int($num / 1024 / 1024 / 1024); $mod = "G" }
return $val.$mod;
}
sub check_c2h ($)
#| Make links to cvs2html generated files
#| Keys 'file' and maybe 'frameset' are hrefs from 'subdir'
{
my $file = shift;
my %d;
# for the given file, guess where cvs2html would put the pieces
$file =~ m@^\.(/.+)?/([^/]+)$@
or die "Can't match directory/file for '$file'";
my ($dir, $leaf) = ($1, $2);
$dir = '' unless defined $dir;
$dir =~ s@/@_@g;
# TODO: relate this to the command!
my $prefix = "../$opts{'c2hdir'}/c2h" . $dir;
my $t;
if (-f ($t = $prefix."~".$leaf.".html")) { # cvs2html -O -f
$d{'file'} = $t;
} elsif (-f ($t = $prefix."~".$leaf."__rf.html")) { # cvs2html -o -f
$d{'file'} = $t;
$d{'frameset'} = $prefix.".html";
} elsif (-f ($t = $prefix."__rf.html")) { # cvs2html -o
$d{'file'} = $t."#".$leaf;
$d{'frameset'} = $prefix.".html";
} elsif (-f ($t = $prefix.".html")) { # cvs2html -O
$d{'file'} = $t."#".$leaf;
} elsif ($opts{'c2hdir'}) {
warn "Couldn't find any cvs2html files using '$prefix'"
unless $opts{'quiet'};
} else {
# well we're not called if it wasn't requested...
warn "check_c2h: No info for $file\n";
}
return \%d;
}
sub make_pod ($$)
#| Make pod2html file
#| Return href relative to cvspublish directory, or undef
{
my ($infile, $backlink) = @_;
my $outfile = $infile . '_.html'; # .pl.html is Polish!
my $htmlroot = "../$opts{'p2hdir'}";
$outfile =~ s@^./@@;
$outfile = "$htmlroot/$outfile";
my $title = "pod2html: $infile";
$title =~ s/: \.\//: /;
my $outdir = $outfile; $outdir =~ s@[\\/][^\\/]+$@@;
safe_mkdir($outdir, 0777);
run_cmds(['pod2html',
'--htmlroot', $htmlroot,
'--noindex',
'--nonetscape',
'--podroot', '.',
'--title', $title,
# why recurse? it's just a file?
# '--recurse',
'--infile', $infile,
'--outfile', $outfile,
]);
#TODO: return undef if error, instead of just dying
# Post-process the p2h output to add a back-link at the top.
# Read:
open DOCFILE, "< $outfile\0"
or die "Failed to read back pod2html output $outfile: $!";
my $html = join "", <DOCFILE>;
close DOCFILE;
# Insert link:
$html =~ s{(<body\b[^>]*>)}{$1\n<p><a href="$backlink">Back to cvspublish index</a></p>}i
or die "Failed to insert backlink in $outfile";
# Replace:
open DOCFILE, "> $outfile\0"
or die "Failed to clobber pod2html output $outfile: $!\n";
print DOCFILE $html;
close DOCFILE;
# TODO: this is icky, should be generalised to do something more useful
return $outfile;
}
#-----------------------------------------------------------------------------
sub rubber_stamp ()
#| Manufacture a world in which licences aren't needed
#| Failing that, rubber stamp specified files with licence conditions
{
# die "Undefined subroutine World::Politics::adjust called";
foreach my $file (sort keys %FILES) {
my $d = $FILES{$file};
next unless defined $d->{'tagged:rubberstamp'};
if ($d->{'folding:haslicence'}) {
print "$file already has a licence stamped\n"
unless $opts{'quiet'};
next;
}
die "Oh, I need the author tag to rubberstamp. Oops"
unless $d->{'tagged:author'} || $d->{'html:author'};
my @comment = @{$d} {('tagged:rubberstamp_left',
'tagged:rubberstamp_right')};
die("File '$file' has rubberstamp defined but the ".
"comment fields are broken?")
unless defined $comment[0] && defined $comment[1];
my $l = $d->{'tagged:rubberstamp'};
if (!defined $LICENCES{$l}) {
$LICENCES{$l} = slurp_file($l);
chomp @{ $LICENCES{$l} };
}
my @f = @{ slurp_file($file) };
splice(@f, 3, 0,
map { "$comment[0] $_ $comment[1]\n" }
(('{' x 3) . " Licence",
"(rubberstamped from $l",
"by $0 v$vsn ".(scalar localtime).")",
"",
"Copyright (C) 2001 ".($d->{'tagged:author'} ||
$d->{'html:author'}),
"",
@{ $LICENCES{$l} },
('}' x 3),
));
if ($opts{'dryrun'}) {
print "Would clobber '$file' with new stamped copy\n"
unless $opts{'quiet'};
next;
}
open PLS, "> $file\0" || die "Failed to clobber '$file': $!";
print PLS @f;
close PLS;
}
}
sub slurp_file ($) {
my $fn = shift;
open SLP, "< $fn\0" || die "Failed to read '$fn': $!";
my @lines = <SLP>;
close SLP;
return \@lines;
}
sub render ()
#| Take %DIRS and/or %FILES and chug out some HTML
{
my $out = "$opts{'destdir'}/$opts{'module'}/$opts{'subdir'}/index.html";
my $oldfh = undef;
if ($opts{'dryrun'}) {
print "Would output to '$out', and clobber it if it exists\n"
unless $opts{'quiet'};
} else {
open OUT, ">$out"
or die "Failed to create '$out': $!";
$oldfh = select OUT;
}
my $changelog = ($opts{'c2hdir'} ?
"Some 'recent' changes are ".
"<a href=\"../$opts{'c2hdir'}/changelogs.html\">".
"listed in order</a><p>"
: '' );
print <<"ENDHEAD";
<html><head><title>Published works in '$opts{'module'}' module</title>
<meta name="generator" content="cvspublish $vsn">
</head>
<body bgcolor="#ffffff">
<h1>CVS module '$opts{'module'}'</h1>
$changelog
<table border="2" cellpadding="2" cellspacing="1" bgcolor="#f0f0ff">
<tr bgcolor="#d0d0ff"><th>Directory</th>
<th>File</th>
<th>Size</th>
<th>Version</th>
<th>Info</th></tr>
ENDHEAD
foreach my $dir (sort keys %DIRS) {
my %files = %{$DIRS{$dir}};
my $pdir = $dir;
$pdir =~ s@^\./@@;
$pdir = '[top]' unless $pdir;
# Directory .. rowspan across relevant files
print "\n<tr><td rowspan=\"".(scalar keys %files)."\" valign=\"top\">";
my $pdirname = $pdir;
$pdirname =~ s/[^\w]/./g;
$pdirname =~ s/^\.|\.$//g;
print "<a name=\"$pdirname\">";
print $pdir;
# TODO: scan c2h:frameset for all files and get link(s?) to top level
# directories. Maybe also detect index or README files?
print "</a></td>\n";
my $inrow = 1;
foreach my $leaf (sort keys %files) {
my $d = $files{$leaf};
print "<tr>" unless $inrow;
# File
print "<td><strong>$leaf</strong><br> ";
print "<small><a href=\"$dir$leaf\">(download)</a></small> ";
print('<small><a href="', $d->{'c2h:file'},
'">(revision history)</a></small> ')
if $d->{'c2h:file'};
print('<small><a href="', $d->{'p2h:href'},
'">(documentation)</a></small> ')
if $d->{'p2h:href'};
print "</td>\n";
# Size
print("<td>", (defined $d->{'file:size'} ?
$d->{'file:size'} : '[unknown]'),
"</td>\n");
# TODO: about here we need something to pull all the bits of data
# together. This is just a grim hack.
# Version
my $dispvsn;
if ($d->{'cvslog:symnames'}) {
$dispvsn = $d->{'cvslog:symnames'};
if (" $dispvsn " =~ m@\s$opts{'release'}:([\d.]+)\s@) {
$dispvsn = $1;
} else {
$dispvsn = "(raw symnames) $dispvsn";
}
} else {
$dispvsn = "[unknown]";
}
print "<td>$dispvsn</td>\n"; #ick (why ick?)
# Description
print('<td cellpadding="1" bgcolor="#8080ff">',
'<table border="1" '.
'cellpadding="0" cellspacing="0" width="100%">',
"\n");
foreach my $k (sort keys %{$d}) {
next if $k eq 'p2h:href'; # TODO: format local URL nicely
next if $k eq 'c2h:file' || $k eq 'c2h:frameset'; # TODO: (already) use them!
next if $k =~ /^(file:size
|cvslog:keysubst
|pod:haspod
|cvslog:rev(log|hdr).*
|tagged:rubberstamp.*)$/x;
my $val = defined $d->{$k} ? $d->{$k} : '[unknown]';
next if ($k eq 'cvslog:branch' ||
$k eq 'cvslog:description') && $val eq '';
next if $k eq 'cvslog:head' && $val eq $dispvsn;
my $key = $k;
$key = "\u$key" if $key =~ s/^tagged://;
HTML::Entities::encode($key);
HTML::Entities::encode($val);
if ($k =~ /^pod:/) {
my @lines = split /\n/, $val;
my $prevblank = 1;
my $inpre = 0;
foreach (@lines) {
if (/^\s*$/) {
$prevblank = 1;
undef;
} elsif (/^\s+/) {
$_ = "<pre>$_" unless $inpre;
$inpre = 1;
$prevblank = 0;
} else {
if ($inpre) {
$_ = "</pre>$_";
} else {
$_ = "<p>$_" if $prevblank;
}
$inpre = 0;
$prevblank = 0;
}
}
push @lines, "</pre>" if $inpre;
$val = join "\n", @lines;
} else {
$val =~ s/\n/<br>/g;
}
# insert obvious hrefs
$val =~ s@(http://\S+)@<a href="$1">$1</a>@g;
print("<tr><td bgcolor=\"#ffd0d0\">$key</td>",
"<td bgcolor=\"#fff0f0\">$val</td>\n");
}
print "</table></td></tr>\n";
$inrow = 0;
}
}
my $now = scalar localtime();
print <<"ENDFOOT";
</table>
Generated on $now by <a href="http://www.fruitcake.demon.co.uk/cvspublish">
cvspublish</a> v$vsn, possibly with assistance from pod2html and
<a href="http://www.sslug.dk/cvs2html">cvs2html</a>.
</body>
</html>
ENDFOOT
if (defined $oldfh) {
close OUT;
select $oldfh;
}
}
__END__
## Plain old docs from here ################################################
=head2 Quickstart [Dry run mode is friendly]
Naturally you are a wondering what this program is going to your
files. The easiest way to get started would be to paste in the lines
in the L<CVSROOT/modules|"The CVSROOT/modules file"> section and run
cvspublish -n -d /tmp/webtest
It will print the commands it wanted to run so you can run them
yourself. When you run the script it will cope with re-doing most
tasks. The only things to watch out for are
=over 4
=item *
It doesn't delete the HTML files in the cvs2html directory, so it may
link the wrong files if you've changed the cvs2html command line.
=item *
Multiple checkout/release cycles will add spam to your
F<CVSROOT/history> file, if you still have one.
=item *
Once the final stage of deleting all the CVS directories has run,
C<cvs log> will not work. You will need to do the checkout step again.
=back
=head1 PREREQUISITES
You will need C<cvs>, and life will be easier in a C<unix-style
environment>.
=head1 COREQUISITES
The program can use C<cvs2html> (to be found at
http://www.sslug.dk/cvs2html) and the standard C<pod2html>.
=head1 DETAILED OPERATION
=head2 Jobs the program performs
=over 4
=item *
(Optionally) tag a module to ensure new unmarked items are published.
Generally this is handy for making sure none of your website goes
missing. You can still tag files as private (eg. if they need more
work before going live).
=item *
Remove I<PUBLIC> tags from I<PRIVATE> tagged files
=item *
Tag as I<UNKNOWN> everything which is neither I<PUBLIC> nor I<PRIVATE>
(so you can find out what is not marked for publishing with C<cvs co
-r UNKNOWN .>)
=item *
Re-tag I<PUBLIC> files with a new release tag. The I<PUBLIC> tag is
not moved.
=item *
In a temporary directory, C<cvs checkout -r new_release_tag
chosen_module>, then do a C<cvs release> to balance the module
history.
=item *
Make a directory and run F<cvs2html> from the CVS working
subdirectory, targetting HTML to the new directory -- only if
C<c2hdir> is a non-empty value.
=item *
Trundle the subdirectories to be indexed, and for each file
=over 4
=item *
pull CVS logs to get description, revision history etc.
=item *
read (the top of?) the file looking for tagged information and POD
formatted documentation (unless the cvs log says the file is binary)
=item *
build tables (files?) with summary info in them. This bit is not
decided, should presumably be configurable
=item *
link to the relevant cvs2html files, if present
=item *
generate and link pod2html docs, if the C<p2hdir> is a non-empty
value. Also adds a link back to the cvspublish index file, from the
top of the doc.
=back
=item *
Remove all F<CVS> admin directories, so what we have left is
equivalent to C<cvs export>
=back
=head2 Inline data tags
Since the description of a CVS item cannot be changed very easily, it
makes sense to pull extra tagged data from the file itself, unless it
is binary.
In the absence of an obvious standard to pick on, I made one up.
Changing it is no big deal, though, provided the data is tagged as
key/value. Values can be multiline:
# CVSPUBLISH:::
# Licence: GPL (GNU General Public Licence v2 or above)
# Rubberstamp: /usr/share/common-licenses/GPL
# Description: Render a (piece of a) CVS repository for a web server
# Author: Matthew Astley <matthew@fruitcake.demon.co.uk>
# :::CVSPUBLISH
<!-- CVSPUBLISH::: -->
<!-- Missing: Run 'cvs log' less times -->
<!-- : Links to directory-view for cvs2html -->
<!-- : Rationalise the descriptions that get pulled in 8-) -->
<!-- :::CVSPUBLISH -->
Which tags are allowed or used is defined at the HTML rendering stage.
Currently it just dumps all the tags out (except the cvslog ones,
which it mostly ignores), although file tags override tags from other
places and some tags are expected to be present. Run it and see...
The tags are not case sensitive because that would be too confusing.
=head2 Standard data tags
It should get descriptions from HTML...
=head1 CONFIGURING CVS
You will need to add a couple of things to your repository, but there
is nothing tricky.
=head2 CVS tags used
The program assumes you have carved your repository up by tagging
files as I<PUBLIC> or I<PRIVATE> at some point. When it runs, it will
leave an I<UNKNOWN> tag on any file that has neither I<PUBLIC> nor
I<PRIVATE> tags.
To mark the whole repository as public, try C<cvs rtag PUBLIC .> (nb.
trailing dot to indicate everything. This works on CVS v1.10.8, but I
have not seen it documented.)
Release files are tagged in the form I<REL_yyyy_mm_dd> unless you
specify a tag. TODO: Issues with multiple releases in one day,
collection of old release labels to save disk space/visual bandwidth.
Only I<PUBLIC> files in your website directory will be published. (See
also L<the C<-P> option|/OPTIONS>).
The actual text of the tags used is set in a constant at the top of
the file, in case you wish to translate or abbreviate them.
=head2 The CVSROOT/modules file
You can either paste some trickery like this into your F<modules>
admin file to make the default options work, or figure something out
for yourself.
# The front door, eg. http://www.fruitcake.demon.co.uk/index.html is
# held in fruitcake.dcu/index.html
# The HTML points at a subdirectory called cvspublish and relies on
# us to join the ends. This module must not contain files or
# directories which clash with p2hdir, c2hdir or subdir.
website_plain -d website fruitcake.dcu
# This module includes the whole repository, but in the
# website/cvspublish subdirectory at export time
website_projects -d website/cvspublish .
# website consists of website_plain plus website_projects, minus the
# website_plain part the second time round.
website -a !website/cvspublish/fruitcake.dcu website_plain website_projects
(trim the leading spaces or the comments will cause confusion)
=head2 Finding files whose PUBLIC/PRIVATE state is unspecified
If you've set things up the way I have, run C<cvspublish -T> to get
the tags set, then C<cvs export -r UNKNOWN -d pubpriv website_projects>
to export all the UNKOWN files into a directory called F<pubpriv>.
If you use C<checkout> instead of C<export> you can run C<cvs tag> on
the files in the pubpriv directory, otherwise you'll have to go back
to a working directory, but if you have history switched on then you
might want to do a C<cvs release> afterwards.
You can run C<cvspublish> again on the same day to reuse the current
release tag - files that have been updated will have their tags moved
to the new (?) HEAD.
=head1 BUGS & WISHLIST
Yes, there are sure to be some bugs. These items are in a vague
priority order.
=over 4
=item *
Something odd happens if your --destdir is not absolute, maybe when it
contains [.~] characters. Did I say it was supposed to be an absolute
path?
=item *
I think there might be a problem with deleted files (in the Attic)
being published .. haven't investigated yet.
=item *
Add hooks to pre-pod2html and post-run scripts, for rubber-stamping
the licences and for updating files from external sources (GPG keys
and outline2html of the TODO list).
=item *
Remove CVS directories from the website_plain module's subdirectories
too. Perhaps this a symptom of the cvs release not working properly.
=item *
Something to rubber-stamp a full copy of the *insert favourite licence
here* onto files which are tagged to request it. (What is the legal
state of this? I assume that what I publish offers the contract...)
Implemented crudely ... this should probably be a separate program. It
probably needs to do different things for different types of file
anyway, and as for how it will add "give the -V option to see the
licence"...
=item *
Maybe something clever with the 'state' of the file (as set with C<cvs
admin -s>) would be more appropriate?
=item *
Using the L<file(1)> command during file-trundling is not yet
supported.
=item *
Maybe a thingie to delete author tags that match the global default
and say "author: foo smith unless stated otherwise".
=item *
Half done ... it pulls C<E<lt>metaE<gt>> tags from any old file. Also
makes a stab at reading POD.
=item *
Further C<rtag> games are possible to get stable/unstable versions.
=item *
Cleverness with vendor branches etc. is not necessarily supported. I
expect C<cvs rtag -r PUBLIC RELEASETAG> will go for the head branch --
I shall trust it to be sane until I find out otherwise.
=item *
If I bothered to unit test my stuff, I could put the test results up
automatically too. See L<Test::Unit> or
http://sourceforge.net/projects/perlunit/ .
=item *
Roll all the C<cvs log> calls into one so the ssh overhead isn't
crippling. (C<fsh> seems to work quite well now, so this goes to the
bottom of the pile)
=back
=cut
##############################################################################
=for Matthew
Notes for my future reference:
STABLE = like it says. if there's no tag, assume the HEAD, I suppose.
This will need something like
cvs rtag -r STABLE -f -F prePUBLIC .
cvs rtag -r PRIVATE -d prePUBLIC .
cvs rtag -r UNKNOWN -d prePUBLIC .
tag the STABLE or HEAD revision of PUBLIC files
cvs rtag unSTABLE .
cvs rtag -r STABLE -d unSTABLE .
cvs rtag postPUBLIC .
cvs rtag -r unSTABLE -d postPUBLIC .
cvs rtag -d unSTABLE .
=cut
|
Repository owner Powered by ViewCVS 1.0-dev |
ViewCVS and CVS Help |