Using XML to generate research tools for Wittgenstein scholars by collaborative groupwork
This document evaluates the project as a whole, but also makes specific reference to elements pertaining to the requirements of the UK AHRB as a funding agency.
Both Biggs (Hertfordshire) and Seekircher (Innsbruck) attended training in XML at WAB in Bergen early in the project (December 2002 and January 2003 respectively). WAB is part of the University of Bergen Research Group for Text Technology at Aksis. Biggs visited Innsbruck (February 2003) to collate materials from the Brenner Archive, including letters and postcards by Wittgenstein, for the target period of September 1914. Pichler (Bergen) undertook the conversion and development of the XML file, and facilitated the project by coordinating the support of the technology partner. In order to meet the objective of integrating a wide variety of media and formats, materials were compiled or linked as HTML websites, JPG images, MOV video, MP3 audio, PDF, DOC and TXT documents, etc. Because the work was conducted on a website, the final project and its history emerged as folders within the whole site, each representing an iteration. This "history" addresses the issue of data preservation in digital archives (cf. AHRB application "research question" and "research methods").
The basic data was converted into XML from MECS, a markup system similar to SGML that was developed at WAB for transcription of the Wittgenstein Nachlass. The XML was then further modified according to TEI recommendations, which enhanced its portability. XSL stylesheets were developed to generate both XHTML and PDF outputs, reflecting the group's desire to output via the web or in print. XML and XHTML files were developed to meet international standards. The XML file contained international character sets, logical notation, and external file references, and therefore provided an example with transnational relevance for text encoding.
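The single-source idea described above can be sketched in a few lines. The project itself used XSL stylesheets; the Python below is only a stand-in that shows the same mapping from source markup to XHTML. The element names (`text`, `div`, `head`, `formula`, `ref`), the sample sentences, and the file path are all assumptions made for illustration and are not taken from the project's XML file.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-like fragment: non-ASCII characters, a logical formula,
# and a reference to an external file. Not the project's actual data.
SOURCE = """<text>
  <div type="letter">
    <head>Sample letter</head>
    <p>Umlaut test: ä ö ü ß. Formula: <formula>p ⊃ q</formula></p>
    <p>See the facsimile <ref target="facsimile/page1.jpg">page 1</ref>.</p>
  </div>
</text>"""


def convert_paragraph(p):
    """Copy one source paragraph into an XHTML <p>, mapping inline elements."""
    out = ET.Element("p")
    out.text = p.text
    for child in p:
        if child.tag == "formula":
            el = ET.SubElement(out, "span", {"class": "formula"})
        elif child.tag == "ref":
            el = ET.SubElement(out, "a", {"href": child.get("target", "")})
        else:
            el = ET.SubElement(out, "span")
        el.text = child.text
        el.tail = child.tail  # keep the text that follows the inline element
    return out


def to_xhtml(source):
    """Map the sample markup to a small XHTML fragment (returned as str)."""
    root = ET.fromstring(source)
    body = ET.Element("div", {"class": "text"})
    for div in root.findall("div"):
        section = ET.SubElement(body, "div", {"class": div.get("type", "div")})
        for child in div:
            if child.tag == "head":
                ET.SubElement(section, "h2").text = child.text
            elif child.tag == "p":
                section.append(convert_paragraph(child))
    return ET.tostring(body, encoding="unicode")


if __name__ == "__main__":
    print(to_xhtml(SOURCE))
```

Keeping the mapping rules in one place, separate from the data, is what lets the same source feed more than one output format.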
During the project it became apparent that the richness and density of the main data set (XML file) would present important challenges for presentation. Since the research question was how to provide tools for scholars at a variety of levels of expertise, a key problem became the exploitation of the stylesheet (XSL file) to select user-determined content from the main data. Although the team's editorial and scholarly experience allowed the data to be structured in anticipation of user needs (cf. outcomes "project 17" and "project 18"), a key outcome has been the experimentation with the delivery of user-determined content (cf. "project 16"). This was achieved by allowing the user to select the value of certain variables in the stylesheet, thus enabling the dynamic generation of the output with, for example, annotation to logical symbolism in tool tips. The group felt that dynamic generation should be achieved without the user having to engage with any XSL encoding, and so the user interface is a simple HTML form. At the same time, the user's engagement with XML technology is not excluded.
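A minimal sketch of user-determined output follows, assuming that the HTML form submits a single field named `annotate`. The field name, the glosses, and the sample sentence are hypothetical. Python again stands in for the stylesheet: the `annotate` value plays the role of a stylesheet variable, and the tool tip is produced with the XHTML `title` attribute.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

# Hypothetical glosses for logical symbols; not the project's annotations.
GLOSSES = {"⊃": "material implication", "∨": "disjunction", "~": "negation"}

# Hypothetical source fragment.
SOURCE = "<p>Example: <formula>p ⊃ q</formula></p>"


def fill_with_tooltips(span, text):
    """Wrap each glossed symbol in a <span title="..."> inside `span`."""
    last = None
    for ch in text:
        if ch in GLOSSES:
            last = ET.SubElement(span, "span", {"title": GLOSSES[ch]})
            last.text = ch
        elif last is None:
            span.text = (span.text or "") + ch   # text before the first symbol
        else:
            last.tail = (last.tail or "") + ch   # text after a wrapped symbol


def render(source, query):
    """Render XHTML; `query` is the form submission, e.g. 'annotate=yes'."""
    params = parse_qs(query)
    annotate = params.get("annotate", ["no"])[0] == "yes"
    p = ET.fromstring(source)
    out = ET.Element("p")
    out.text = p.text
    for formula in p.findall("formula"):
        span = ET.SubElement(out, "span", {"class": "formula"})
        span.tail = formula.tail
        if annotate:
            fill_with_tooltips(span, formula.text or "")
        else:
            span.text = formula.text
    return ET.tostring(out, encoding="unicode")


if __name__ == "__main__":
    print(render(SOURCE, "annotate=yes"))
    print(render(SOURCE, "annotate=no"))
```

The user only chooses a value in the form; the decision about what that value does to the output stays inside the rendering code, which mirrors the group's wish that users need not touch any XSL encoding.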
The aim, to work collaboratively on the problem of constructing a data set that would give maximum flexibility for the production of a variety of user-centred multi-media formats in XHTML, has been achieved (cf. AHRB application "aims and objectives"). There were two objectives: (a) to design and implement a DTD and associated documents that can accommodate a wide variety of material pertinent to the project, and (b) to demonstrate the utility of the documents by generating a sample that can be used by scholars on the internet. Objective (a) has been achieved and exceeded, because the project investigated print-based PDF output in addition to web-based XHTML output. This involved using two different stylesheet languages: XSLT and XSL-FO. Objective (b) has also been achieved and exceeded, because a comprehensive range of samples has been produced, including the facility to generate user-determined content.
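The relationship between the web and print routes can be illustrated with a short sketch that derives both an XHTML fragment and a minimal XSL-FO document from one source. It assumes a hypothetical source with plain `p` elements and a page master arbitrarily named "page". Turning the FO document into a PDF would still require a separate FO processor, which is not shown, and the project's own stylesheets are not reproduced here.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"  # the XSL-FO namespace
ET.register_namespace("fo", FO_NS)

# Hypothetical source; element names are illustrative only.
SOURCE = "<text><p>First sample paragraph.</p><p>Second sample paragraph.</p></text>"


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return "{%s}%s" % (FO_NS, tag)


def to_xhtml(source):
    """Web route: one XHTML <p> per source paragraph."""
    root = ET.fromstring(source)
    body = ET.Element("body")
    for p in root.findall("p"):
        ET.SubElement(body, "p").text = p.text
    return ET.tostring(body, encoding="unicode")


def to_xsl_fo(source):
    """Print route: one fo:block per source paragraph, on a single page master."""
    root = ET.fromstring(source)
    doc = ET.Element(fo("root"))
    masters = ET.SubElement(doc, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {"master-name": "page"})
    ET.SubElement(page, fo("region-body"))
    seq = ET.SubElement(doc, fo("page-sequence"), {"master-reference": "page"})
    flow = ET.SubElement(seq, fo("flow"), {"flow-name": "xsl-region-body"})
    for p in root.findall("p"):
        ET.SubElement(flow, fo("block")).text = p.text
    return ET.tostring(doc, encoding="unicode")


if __name__ == "__main__":
    print(to_xhtml(SOURCE))
    print(to_xsl_fo(SOURCE))
```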
Project outcomes are disseminated via a website hosted by the Wittgenstein Archives at the University of Bergen (WAB). The site will be available from 31 October 2003 and technical snagging will be undertaken during November and December 2003. The WAB site contains the major reference site and portal for international Wittgenstein studies and is a European Research Infrastructure. The Research Group for Text Technology co-hosts the Text Encoding Initiative (http://www.tei-c.org). This is therefore an effective location for the outcomes. Dissemination has also been undertaken via a cross-disciplinary seminar by Biggs in Bergen on 21 May 2003; a conference paper by Biggs (http://www.glos.ac.uk/humanitiesimages/documents/P/85_1.doc) at "Digital Resources for the Humanities", University of Gloucestershire, in September 2003; and presentations by Alois Pichler at the Institute of History and Philosophy of Sciences and Technology (IHPST, http://www-ihpst.univ-paris1.fr/home.htm) in Paris (8 March 2003), at the annual conference of the Nordic Network for Editorial Philology (http://www.nnedit.org) in Sandbjerg (12 October 2003) and at the Philosophy Department at the University of Bologna (23 October 2003). Finally, Biggs is now working on a monograph on the characteristics of graphics and text. The relevance of these issues to the text-encoding community is shown by the proposal of TEI to form a Special Interest Group for "graphics-intensive texts". Biggs and his research group (CREAC, Hertfordshire) joined TEI in July. These outcomes meet or exceed the plans set out in §17 of the AHRB application.
A number of problems are as yet unresolved, including the control of fonts for PDF output. This problem appears to be a limitation of the software that the group has used (Oxygen). Minor problems, including this one, will be addressed during the snagging period. This task will be complete when technical functionality has been achieved in projects 16-18, and character representation is accurate in all outputs. Projects 0-15 will not be corrected but are retained as a record of the developmental process. Once complete, the commercial partner (Intelex) will be asked to evaluate the outcomes (cf. AHRB application "research question").