I was speaking with a friend the other day and he was telling me about this crazy thing that was being done in his department. There was a task that needed to be done, usually once a day, and it was important that it was not only performed but also documented.
Considering the types of follow-up questions they were constantly dealing with, the entire solution would have been quite appropriate as a small database application.
My friend is not an IT person, nor is there anyone in his group with those types of skills. The task is too small for IT to allocate a project to it, but too much work to actually do in spare time. Thus the users were using a spreadsheet to track this information. Each day they added a new tab, and on each tab was information about the task or tasks that were done.
The whole thing seemed rather foreign to me. It seems all of my tasks usually end up in the shape of a program at some point.
Just this morning one of my users wanted some test data to be uploaded into the application. Almost like my friend's situation, it was too unimportant to do right and received almost no attention, until today.
I decided to write up a small shell script to automatically create a few PDF files and configuration files to go with them. All I need to do is add the internal entity ids when running the script. The script creates a random number and picks some random static data. The overall effect is quite similar to what production looks like.
%% Subject=Test for entity id PRO7:1019999:IR-LOAN-SHORT:PRO7
%% FaxNumber=john.johnson@bigcompany.com;max.mustermann@bigcompany.com
%% Message=special message goes here
%% Attachments=special.pdf
%% Entity=PRO7
The configuration contains enough information for the processing system to send out files.
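A generated configuration file can be spot-checked from the shell. The following is only a sketch: `get_header` is a hypothetical helper of mine, and it assumes every header sits on its own line in the `%% Key=Value` form shown above. How the real processing system reads these lines is not something I am describing here.

```shell
# Hypothetical helper: print the value of one "%% Key=Value" header line.
# Usage: get_header KEY FILE
get_header() {
    # -n suppresses normal output, p prints only lines where the prefix was removed
    sed -n "s/^%% $1=//p" "$2" | head -n 1
}

# Sample file with the same content as the configuration shown above
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
%% Subject=Test for entity id PRO7:1019999:IR-LOAN-SHORT:PRO7
%% FaxNumber=john.johnson@bigcompany.com;max.mustermann@bigcompany.com
%% Message=special message goes here
%% Attachments=special.pdf
%% Entity=PRO7
EOF

get_header Entity "$SAMPLE"        # prints PRO7
get_header Attachments "$SAMPLE"   # prints special.pdf
```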
The script just loops through the internal ids provided on the command line. Special kudos to the ability to have arrays and generate random numbers in a shell script.
#!/usr/bin/ksh
# Create an xml data file, a configuration file and a pdf
# for every entity id given on the command line.
clear
if [ $# -eq 0 ]
then
    echo "add the entity ids to command line"
    exit 1
fi

# static data that is the same in every configuration file
EMAILTO="john.johnson@bigcompany.com;max.mustermann@bigcompany.com"
MSG="special message goes here"
ATTACH="special.pdf"
XSL=form.xsl

# instrument types to pick from at random
INSTRUMENTS[0]="FX-SPOT"
INSTRUMENTS[1]="FIXING"
INSTRUMENTS[2]="FX-FWD"
INSTRUMENTS[3]="IR-LOAN-LONG"
INSTRUMENTS[4]="IR-LOAN-SHORT"
INSTRUMENTS[5]="COMMODITY"
INSTRUMENTS[6]="IR-DEPOSIT-SHORT"
INSTRUMENTS[7]="IR-BOND"

while [ $# -gt 0 ]
do
    ENTITY=$1
    TXN=$RANDOM
    IDX=$((RANDOM % 8))
    INSTRUMENT=${INSTRUMENTS[$IDX]}
    BASENAME=$ENTITY-$TXN
    SEND=true
    TEX=${BASENAME}.tex
    PDF=${BASENAME}.pdf
    XML=${BASENAME}.xml

    # xml data file that is merged with the form
    echo "<test><desc>Test for entity id '$ENTITY' </desc></test>" > "$XML"

    # configuration file for the processing system
    echo "%% Subject=Test for entity id $ENTITY:10$TXN:$INSTRUMENT:$ENTITY" > "$TEX"
    echo "%% FaxNumber=$EMAILTO" >> "$TEX"
    echo "%% Message=$MSG" >> "$TEX"
    echo "%% Attachments=$ATTACH" >> "$TEX"
    echo "%% Entity=$ENTITY" >> "$TEX"
    cat "$TEX"

    # merge the data file with the form to create the pdf
    fop -xml "$XML" -xsl "$XSL" -pdf "$PDF"
    shift
    echo " "
    echo " "
    echo " "
done
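The random pick in the loop can also be written so that it does not depend on the hard-coded 8. This is just a sketch with a shortened list, and it assumes a shell with arrays and `$RANDOM`, such as ksh or bash. `COUNT` is a name I made up for the example.

```shell
# Shortened list; the names are taken from the script above
INSTRUMENTS[0]="FX-SPOT"
INSTRUMENTS[1]="FIXING"
INSTRUMENTS[2]="FX-FWD"

COUNT=${#INSTRUMENTS[@]}         # number of elements in the array
IDX=$((RANDOM % COUNT))          # random index between 0 and COUNT-1
INSTRUMENT=${INSTRUMENTS[$IDX]}
echo "picked index $IDX of $COUNT: $INSTRUMENT"
```

Adding or removing an instrument then needs no other change. `$RANDOM` returns a value from 0 to 32767, so the pick is only perfectly even when the count divides 32768, as 8 does. For test data that hardly matters.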
The magic is not in the shell script, which is pretty much just the glue. The magic is in the ability to create PDF files.
I must admit that in the past I did not do an extensive study on what different solutions are available for creating PDF files.
Apache™ FOP (Formatting Objects Processor) is a print formatter driven by XSL formatting objects (XSL-FO) and an output independent formatter. It is a Java application that reads a formatting object (FO) tree and renders the resulting pages to a specified output. Output formats currently supported include PDF, PS, PCL, AFP, XML (area tree representation), Print, AWT and PNG, and to a lesser extent, RTF and TXT. The primary output target is PDF.
Somehow I gravitated towards using Apache FOP to create the files. I find this technology so useful because it is easy enough to create a well-formatted form that can be rendered into many different types of output, of which PDF is only one.
You decide what the format of the XML data should be and create a form to parse that data. Once all of this has been decided, it is easy enough to write a program (if necessary) that creates XML files in that format.
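Such a program can be very small. As a sketch, here are two shell functions that write a data file in the `<test>`/`<desc>` format used for the example in this post. Both `xml_escape` and `make_test_xml` are hypothetical names of mine. The escaping is there because characters such as `&` and `<` would otherwise make the file invalid XML.

```shell
# Hypothetical helper: escape the characters that are special in XML text.
# The ampersand has to be replaced first.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Hypothetical helper: print a data file with one description.
make_test_xml() {
    printf '<test>\n<desc>%s</desc>\n</test>\n' "$(xml_escape "$1")"
}

make_test_xml "Test for entity id 'R&D'"
```

The call at the end prints a `desc` element containing `Test for entity id 'R&amp;D'`.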
The form is simply some XSLT code. The most important part of the form is the description of the page layout: the page dimensions and margins of the output.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format" exclude-result-prefixes="fo">
<xsl:template match="test">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master master-name="simpleA4" page-height="29.7cm" page-width="21cm" margin-top="2cm" margin-bottom="2cm" margin-left="2cm" margin-right="2cm">
<fo:region-body/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="simpleA4">
<fo:flow flow-name="xsl-region-body">
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">Test: <xsl:value-of select="desc"/> </fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
</xsl:template>
</xsl:stylesheet>
The form is then merged with the xml data file to create the output.
This particular example is actually very poor, as there is almost no real connection between the XSLT form and the data file. The only things they share are the root element, test, which the template matches, and its child desc, whose text is printed.
<test>
<desc>Test for entity id 'ZDF' </desc>
</test>
Unless other templates are added, the text of any XML tags that are encountered is simply output without any real additional formatting into the output file.
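To try the merge by hand, something along these lines should do. It is a sketch that assumes fop is on the PATH and that the form above has been saved as form.xsl in the current directory. The file names zdf.xml and zdf.pdf are made up for the example.

```shell
# Write the sample data file shown above
cat > zdf.xml <<'EOF'
<test>
<desc>Test for entity id 'ZDF' </desc>
</test>
EOF

# Only call fop when both it and the form are available
if command -v fop >/dev/null 2>&1 && [ -f form.xsl ]
then
    fop -xml zdf.xml -xsl form.xsl -pdf zdf.pdf
else
    echo "fop or form.xsl not found, skipping PDF creation"
fi
```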
I don’t have time for a full set of examples, so I simply refer you to the Apache examples. I will, however, be revisiting this topic in the future with more examples and a fuller explanation.