Webtogether

DataSource

The DataSource is simply a script that reads the filesystem structure and converts the listing to XML. In addition, the DataSource evaluates the metadata. If a metadata file is only readable, its content is included in the XML output. If the metadata file is executable, it is run and its output is included instead.

The DataSource script could just as well use a database, or anything else you like, as its source of raw data. For my endeavor the filesystem is simply a convenient way to provide data.

I have implemented the DataSource in PHP, but nothing prevents you from writing it in shell or Perl, or even from using a database extension that produces XML directly from an SQL database.

In fact, nothing stops you from implementing it as an XML-RPC service, or perhaps as a web service described by WSDL.

Input

The DataSource accepts several parameters that control its behavior. Some of them are merely handed back by the DataSource and are then evaluated by the XSLT transformations.

Modifiers

These input parameters influence the size of the output.

  • d specifies the directory we would like to obtain data about, or a file. If a file is specified, it is sent from the server as a download. If a directory is specified, the listing and possibly some metadata are sent back. The default value is the root directory.
  • e (offset) specifies the position from which the file listing is handed out. This way you can implement pagination (for example the number of thumbnails per page, the number of posts per page, etc.). The default value is 0.
  • a (amount) specifies the number of items handed out. It is used in conjunction with the e parameter. The default value is all items in the directory.
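
The interplay of e and a can be sketched as a slice over the directory listing. This is only an illustration, not the original implementation: the function name paginateListing is hypothetical, and treating a missing or non-numeric value as the documented default is an assumption.

```php
<?php
/**
 * Return the part of $items selected by the e (offset) and a (amount)
 * request parameters. Hypothetical helper, not part of the original script.
 */
function paginateListing(array $items, array $query)
{
    // e defaults to 0; missing or non-numeric values fall back to 0
    $offset = isset($query['e']) && ctype_digit((string) $query['e'])
        ? (int) $query['e']
        : 0;

    // a defaults to "all items"; null tells array_slice to run to the end
    $amount = isset($query['a']) && ctype_digit((string) $query['a'])
        ? (int) $query['a']
        : null;

    return array_slice($items, $offset, $amount);
}

// made-up listing: skip two entries, then hand out two
$files = array('a.jpg', 'b.jpg', 'c.jpg', 'd.jpg', 'e.jpg');
print_r(paginateListing($files, array('e' => '2', 'a' => '2')));
```

With e=2 and a=2 the call hands out c.jpg and d.jpg, which would be the second page if two thumbnails are shown per page.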

Parameters

Next is the list of parameters that do not directly influence the XML output. They are used by the XSLT transformation, which needs them as persistent data. In fact, if there were a way to store persistent data on the client side, they would not be needed at all (I use only XSLT 1.0).

  • l specifies the language to be used. It is only used by the XSLT, so if your XSLT doesn't implement internationalisation it is ignored. The default is "en-US".
  • s keeps the name of the column by which the XSLT sheet sorts. The default is n (name).
  • o keeps the sort order between sessions. It can be ascending or descending. The default is a (ascending).
  • x specifies the stylesheet used to transform the XML data.
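
Since these values are only echoed back into the XML, reading them comes down to applying the defaults listed above. The sketch below is an assumption about how that could look, not the original code: the name readPassThroughParams, the character whitelist, the fallback stylesheet name "default" (no default for x is documented) and the value d for descending are all hypothetical.

```php
<?php
/**
 * Collect the pass-through parameters l, s, o and x with the documented
 * defaults. Hypothetical helper, not part of the original script.
 */
function readPassThroughParams(array $query)
{
    $defaults = array(
        'l' => 'en-US',   // language
        's' => 'n',       // sort column (n = name)
        'o' => 'a',       // sort order (a = ascending)
        'x' => 'default', // stylesheet name (assumed, no default is documented)
    );

    $params = array();
    foreach ($defaults as $key => $default) {
        $value = isset($query[$key]) && is_string($query[$key]) ? $query[$key] : $default;
        // accept only a conservative character set, because the values end
        // up in XML attributes and in the stylesheet href
        $params[$key] = preg_match('/^[A-Za-z0-9_-]+$/D', $value) ? $value : $default;
    }
    return $params;
}

// "d" for descending is an assumed value; "../evil" is rejected by the whitelist
print_r(readPassThroughParams(array('l' => 'de-DE', 'o' => 'd', 'x' => '../evil')));
```

Restricting the characters is a design choice of this sketch: it keeps a crafted x value from pointing the stylesheet reference somewhere unexpected.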

Output

The output is written in XML. An XML template is filled in with data. It would be possible to use a different XML template depending on the purpose, but the common format with the metadata tag is simply what emerged. The template is shown in the listing below.

Template

<!-- Template HEAD -->
<?xml version="1.0" encoding="ISO-8859-1" ?>
<?xml-stylesheet type="text/xsl" href="{XSLSHEET}.xsl" ?>
<list d="{DIRECTORY}"
      s="{SORT}"
      o="{ORDER}"
      t="{TYPE}"
      l="{LANGUAGE}"
      e="{OFFSET}"
      a="{AMMOUNT}">
<!-- Template ENTRYHEAD -->
  <e>
    <y>{DIR}</y>
    <p>{PRIVILEGES}</p>
    <i>{SUBDIRS}</i>
    <u>{OWNER}</u>
    <g>{GROUP}</g>
    <s>{SIZE}</s>
    <t>{YEAR}{MONTH}{DAY}{TIME}</t>
    <n>{NAME}</n>
    <l>{LINK}</l>
    <m>
<!-- Template ENTRYTAIL -->
    </m>
  </e>
<!-- Template TAIL -->
</list>

I'm sure there are plenty of templating engines for PHP out there, but writing this very simple one was easy and fun. Two functions manipulate the template. The first, loadTemplate, loads the template and splits it into blocks in an associative array:

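<?php
// Load a template file and split it into named blocks.
// A line of the form <!-- Template NAME --> starts the block NAME.
function loadTemplate ( $file )
{
    $name = "";
    $output = array ();
    $handle = fopen ( $file, 'r' );
    if ( $handle === false ) {
        // the template could not be opened: return no blocks
        return $output;
    }
    while ( ( $line = fgets($handle) ) !== false )
    {
        if ( preg_match ( '/<!--\s+Template\s+([a-zA-Z0-9_]+)\s+-->/', $line, $matches ) ) {
            $name = $matches[1];
            $output[$name] = "";
        } elseif ( strlen($name) > 0 ) {
            $output[$name] .= $line;
        }
    }
    fclose($handle);
    return $output;
}
?>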
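
To make the shape of the returned array concrete, here is a string-based sketch of the same splitting idea. It is for illustration only: splitTemplateBlocks is a hypothetical name, and the shortened template is made up for the example.

```php
<?php
/**
 * String-based variant of the block splitting, for illustration only.
 * Hypothetical helper; the original reads the template from a file.
 */
function splitTemplateBlocks($text)
{
    $name = '';
    $blocks = array();
    foreach (explode("\n", rtrim($text, "\n")) as $line) {
        if (preg_match('/<!--\s+Template\s+([a-zA-Z0-9_]+)\s+-->/', $line, $matches)) {
            // a marker line starts a new block and is not part of the output
            $name = $matches[1];
            $blocks[$name] = '';
        } elseif ($name !== '') {
            $blocks[$name] .= $line . "\n";
        }
    }
    return $blocks;
}

// shortened, made-up template with the same four block names
$template = "<!-- Template HEAD -->\n<list d=\"{DIRECTORY}\">\n"
          . "<!-- Template ENTRYHEAD -->\n  <e><n>{NAME}</n>\n"
          . "<!-- Template ENTRYTAIL -->\n  </e>\n"
          . "<!-- Template TAIL -->\n</list>\n";

$blocks = splitTemplateBlocks($template);
print_r(array_keys($blocks));   // the block names, in template order
echo $blocks['ENTRYHEAD'];      // the text between two marker lines
```

Each marker line becomes a key and everything up to the next marker becomes its value, so a caller can emit HEAD once, ENTRYHEAD and ENTRYTAIL once per entry, and TAIL once at the end.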

The second function, fillTemplate, fills a given template block with values. An actual example of how to use both functions is given further below. Here is the implementation of fillTemplate:

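<?php
// Replace every pattern in $arr (pattern => value) within the block $tpl.
function fillTemplate ( $tpl, $arr )
{
    $output = $tpl;
    foreach ($arr as $key => $value) {
        // A callback inserts the value literally; a plain replacement string
        // would interpret "$1" or "\1" inside a value as a backreference.
        $output = preg_replace_callback ( $key, function ( $m ) use ( $value ) {
            return (string) $value;
        }, $output );
    }
    return $output;
}
?>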

These two functions are in fact the only ones the DataSource uses. For the sake of completeness, here is a small demonstration of how to fill in the template with values. The matches array is populated by preg_match (a regular expression).

<?php
...
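// Capture groups: 1 type, 2 permissions, 3 link count, 4 owner, 5 group,
// 6 size, 7 month, 8 day, 9 time or 10 year, 11 name.
// The pattern has no group 12, so $matches[12] is only set if the
// elided code fills it in.
preg_match ( '/^(.)(.{9})\s+(\d+)\s+(\w+)\s+(\w+)\s+(\d+)\s+(\w+)\s+(\d+)\s+(?:(\d{2}:\d{2})|(\d{4}))\s+(.*)$/', $line, $matches );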
...
echo fillTemplate ( $template["ENTRYHEAD"], array(
    "/{DIR}/"        => $matches[1],
    "/{PRIVILEGES}/" => $matches[2],
    "/{SUBDIRS}/"    => $matches[3],
    "/{OWNER}/"      => $matches[4],
    "/{GROUP}/"      => $matches[5],
    "/{SIZE}/"       => $matches[6],
    "/{MONTH}/"      => strlen($matches[7]) == 1 ? '0'.$matches[7] : $matches[7],
    "/{DAY}/"        => strlen($matches[8]) == 1 ? '0'.$matches[8] : $matches[8],
    "/{TIME}/"       => $matches[9],
    "/{YEAR}/"       => $matches[10],
    "/{NAME}/"       => $matches[11],
    "/{LINK}/"       => $matches[12]
) );
...
?>
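
For readers who want to try the parsing step in isolation, the following sketch runs the same regular expression against a single made-up line and fills a shortened entry block. It is not the original code: fillEscaped is a hypothetical variant that uses plain string replacement and escapes the values for XML, which the listing above does not do.

```php
<?php
/**
 * Fill {PLACEHOLDER} tokens with XML-escaped values.
 * Hypothetical variant of the fill step, using plain string replacement.
 */
function fillEscaped($tpl, array $values)
{
    $map = array();
    foreach ($values as $key => $value) {
        // escape &, <, > and quotes so file names cannot break the XML
        $map['{' . $key . '}'] = htmlspecialchars((string) $value, ENT_QUOTES, 'ISO-8859-1');
    }
    return strtr($tpl, $map);
}

// made-up sample in the layout "ls -l" typically prints for a recent file
$line = '-rw-r--r-- 1 alice users 1024 Jan 5 12:30 notes & ideas.txt';

$pattern = '/^(.)(.{9})\s+(\d+)\s+(\w+)\s+(\w+)\s+(\d+)\s+(\w+)\s+(\d+)\s+'
         . '(?:(\d{2}:\d{2})|(\d{4}))\s+(.*)$/';

$entry = '';
if (preg_match($pattern, $line, $m)) {
    $entry = fillEscaped("<e><y>{DIR}</y><u>{OWNER}</u><s>{SIZE}</s><n>{NAME}</n></e>\n", array(
        'DIR'   => $m[1],
        'OWNER' => $m[4],
        'SIZE'  => $m[6],
        'NAME'  => $m[11],
    ));
}
echo $entry;
```

The ampersand in the file name comes out as an entity in the generated entry, so the result stays well-formed XML.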

Data

The data itself is provided as a listing of the given directory. This means we run "ls -l" and transform its output into XML. That is enough for this proof-of-concept software. In real life you might read the data from a database, as mentioned earlier, but you could also supply data from a database with the help of the metadata. As the source below shows, getting the filesystem data is not a big deal. You can also see that if a file is specified instead of a directory, its content is supplied as a download.

<?php
...
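$directory=isset($_GET['d']) ? $_GET['d'] : '';
// normalise the path: wrap it in slashes and collapse repeated slashes
$directory='/'.$directory.'/';
$directory=preg_replace ( '@/+@','/', $directory );
// remove relative path components; repeat until nothing changes,
// because a single pass reduces "/../../" only to "/../"
do {
    $previous=$directory;
    $directory=preg_replace ( '@/\.\.?/@','/', $directory );
} while ( $directory !== $previous );
// check if it is a directory or a file
$filename=preg_replace ( '@/+$@','', $directory );
if ( $filename === '' ) {
    $filename='/'; // the default: the root directory
}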
if ( !is_dir ($filename) ) {
    if ( !file_exists($filename) ) {
        Header("HTTP/1.0 404 Not Found");
    } else {
        Header("Content-Length: ".filesize($filename));
        Header("Content-Type: application/x-download");
        Header("Content-Disposition: attachment; filename=\"".rawurlencode(basename($filename))."\"");
        readfile($filename);
    }
    exit;
}
// prepare the safe directory for the ls command
$safedir=escapeshellarg( $directory );
$output = array();
$yearnow = date('Y');
$monthnow = date('n');
exec("ls -l $safedir", $output, $retval);
...
?>
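
Stripping "." and ".." with regular expressions is one way to keep requests inside the tree. A different approach, sketched below, lets realpath() resolve the path and then checks that the result still lies below a chosen root. This is not what the original script does: resolveInsideRoot and the idea of a configurable root directory are assumptions made for the example.

```php
<?php
/**
 * Resolve $requested below $root and return the canonical path, or null if
 * the result does not exist or escapes $root. Hypothetical helper; the
 * original script does not define a root directory like this.
 */
function resolveInsideRoot($root, $requested)
{
    $root = realpath($root);
    if ($root === false) {
        return null;
    }
    // realpath() resolves "..", "." and symbolic links in one step
    $path = realpath($root . '/' . $requested);
    if ($path === false) {
        return null;
    }
    // accept the root itself or anything strictly below it
    $prefix = rtrim($root, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if ($path === $root || strpos($path, $prefix) === 0) {
        return $path;
    }
    return null;
}

// demonstration in a throw-away directory
$root = sys_get_temp_dir() . '/datasource-demo-' . uniqid();
mkdir($root . '/photos', 0700, true);

var_dump(resolveInsideRoot($root, 'photos') !== null);        // inside the root
var_dump(resolveInsideRoot($root, 'photos/../../') === null); // escapes the root
var_dump(resolveInsideRoot($root, 'missing') === null);       // does not exist
```

Because realpath() also follows symbolic links, a link that points outside the root is rejected as well, which pattern-based stripping cannot detect.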

Metadata

Here you can see how the metadata is added. In principle every directory has a .metadata subdirectory. For every file in the directory there can be a counterpart with the same name in the metadata directory. If this counterpart doesn't exist, no special action is taken. If the file exists in the metadata directory and is readable, its content is dumped into the XML output. If the file is executable, it is executed and its output is dumped into the XML output. The code snippet below does this.

<?php
...
$metadata = 0;
if ( is_dir ( $directory.".metadata" ) ) {
  $metadata = 1;
}
...
$metafile = $directory.".metadata/".$matches[11];
if ( $metadata == 1 && is_file ( $metafile ) ) {
    if ( is_executable ( $metafile ) ) {
        // escapeshellarg() quotes the path safely, even if the name contains a quote
        passthru( escapeshellarg( $metafile ) );
    } elseif ( is_readable( $metafile ) ) {
        readfile ( $metafile );
    }
}
...
?>
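
The same decision can be written as a function that returns the metadata as a string, which is easier to try out on its own. This is a sketch under assumptions, not the original code: readMetadata is a hypothetical name, and the sample metadata content is made up.

```php
<?php
/**
 * Return the metadata for $name inside $directory, following the rules
 * described above. Hypothetical helper that returns a string instead of
 * writing to the output directly.
 */
function readMetadata($directory, $name)
{
    // basename() keeps the lookup inside the .metadata directory
    $file = rtrim($directory, '/') . '/.metadata/' . basename($name);

    if (!is_file($file)) {
        return '';                      // no counterpart: nothing to add
    }
    if (is_executable($file)) {
        // run the script and capture what it prints
        $output = shell_exec(escapeshellarg($file));
        return is_string($output) ? $output : '';
    }
    if (is_readable($file)) {
        $content = file_get_contents($file);
        return $content === false ? '' : $content;
    }
    return '';
}

// demonstration in a throw-away directory with a plain (non-executable) file
$dir = sys_get_temp_dir() . '/metadata-demo-' . uniqid();
mkdir($dir . '/.metadata', 0700, true);
file_put_contents($dir . '/.metadata/photo.jpg', "<title>Sunset</title>\n");

echo readMetadata($dir, 'photo.jpg');   // content of the counterpart file
echo readMetadata($dir, 'other.jpg');   // no counterpart, prints nothing
```

Marking the counterpart file as executable would switch the same call from reading the file to running it, which is what makes the metadata a hook for database lookups or other generated content.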

Download

Below I have included the full DataSource source file in PHP. It's really just a small PHP script with very little logic in it. The required XML template is included as well, along with a basic XSL transformation so you can run it and see some HTML results.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-Share Alike 2.5 License.