
The bulk of the primary material was so substantial that harvesting the secondary materials manually would have been too onerous a task. Automated methods were clearly desirable, and they would allow for ongoing harvesting of new materials as they became available. Ideally, these methods would be general enough to be applied to other types of literature, requiring minimal modification for reuse in other fields. This emphasis on transportability and scalability would ensure that the form and structure of the knowledgebase could be used in other fields of scholarly research.

Initially, the strategy was to assemble a sample database of secondary materials in partnership with the University of Victoria Libraries, gathering materials harvested automatically from electronic academic publication amalgamator services (such as EBSCOhost). An automated process was developed to retrieve relevant documents and store them in a purpose-built database. This process would query remote databases with numerous search strings, weed out erroneous and duplicate entries, separate metadata from text, and store both in a relational database. The utility of our harvesting methods would then be demonstrated to the amalgamators and other publishers with the intent of fostering partnerships with them.
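The filtering and storage steps of the harvesting process described above can be illustrated with a minimal sketch. This is not the project's own code: it assumes the remote queries have already returned records as dictionaries with hypothetical `title`, `author`, and `body` keys, uses SQLite as a stand-in for the relational database, and treats a record as a duplicate when its whitespace- and case-normalized text matches a stored one. The function and table names are illustrative only.

```python
import hashlib
import sqlite3


def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def harvest(records, conn):
    """Filter raw records, separate metadata from text, and store both.

    `records` is an iterable of dicts with the assumed keys 'title',
    'author', and 'body'. Returns the number of documents stored.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata ("
        "id INTEGER PRIMARY KEY, title TEXT, author TEXT, digest TEXT UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS doc_text ("
        "doc_id INTEGER REFERENCES metadata(id), body TEXT)"
    )
    stored = 0
    for rec in records:
        title = (rec.get("title") or "").strip()
        body = (rec.get("body") or "").strip()
        if not title or not body:
            continue  # erroneous entry: no usable title or text
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        seen = conn.execute(
            "SELECT 1 FROM metadata WHERE digest = ?", (digest,)
        ).fetchone()
        if seen:
            continue  # duplicate of a document already stored
        cur = conn.execute(
            "INSERT INTO metadata (title, author, digest) VALUES (?, ?, ?)",
            (title, rec.get("author"), digest),
        )
        # The text is kept in its own table, linked to its metadata row.
        conn.execute(
            "INSERT INTO doc_text (doc_id, body) VALUES (?, ?)",
            (cur.lastrowid, body),
        )
        stored += 1
    conn.commit()
    return stored
```

Hashing normalized text catches only near-verbatim copies; the same article returned by two services with different formatting would likely need a fuzzier comparison, for example on title and author.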

3.4. Building a professional reading environment

At this stage REKn consisted of some 12,830 primary text documents and an ongoing collection of secondary texts in excess of 80,000 documents. Text data in the knowledgebase was roughly 80 gigabytes; text and image data combined was estimated to be in the 2 to 3 terabyte range. Given this scale, development of a document viewer with analytical and communicative functionality to interact with REKn was a pressing issue. The inability of existing tools to accurately search, navigate, and read large collections of data in many formats, later coupled with the findings of our research into professional reading, led to the development of a Professional Reading Environment (PReE) to meet these needs.

Initially designed as a desktop GUI to the PostgreSQL database containing REKn, the PReE proof of concept was developed as a .NET Windows Forms application. Very little consideration was given to further use of the code at this stage; the focus was solely on testing whether it all could work. Using the .NET Framework was justified on the grounds that it is the standard development platform for Microsoft Windows machines, presumably used by a large portion of our potential users. Developing the proof of concept in the .NET Framework meant that the application could use the resources of the client's machine to a greater extent than if the application were housed in a browser. Local processing would be necessary if, for example, users were to use image-processing tools on scanned manuscript pages.

As demonstrated in the movie below (Movie 1), the proof of concept built in .NET sported a number of useful features. Individual users were able to log in, open as many separate document-centered instances of the GUI as they desired simultaneously, and perform search, reading, analytical, and composition and communication functions. These functions, in turn, drew on our modeling of professional reading and other activities associated with conducting and disseminating humanities research. Searches could be conducted on document metadata and citations (by author, title, and keyword) for both primary and secondary materials (Figure 1). A selected word or phrase could also spawn a search of documents within the knowledgebase, as well as a search of other Internet resources (such as the Oxford English Dictionary Online and Lexicons of Early Modern English) from within PReE. Similarly, the user could use TAPoR Tools to perform analyses on the current text or selected words and phrases in PReE (Figure 2).
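A metadata search by author, title, and keyword of the kind described above might be sketched as follows. This is an illustration under stated assumptions, not PReE's implementation: the `documents` table, its column names, and the `search_metadata` function are hypothetical, and SQLite stands in for the PostgreSQL database that held REKn.

```python
import sqlite3


def search_metadata(conn, author=None, title=None, keyword=None):
    """Return (id, title) rows matching every criterion that was supplied.

    Assumes a hypothetical `documents` table with `author`, `title`,
    and `keywords` text columns. Criteria left as None are ignored.
    """
    clauses, params = [], []
    # Column names are fixed here, never taken from user input;
    # only the search values are passed as bound parameters.
    for column, value in (
        ("author", author),
        ("title", title),
        ("keywords", keyword),
    ):
        if value:
            clauses.append(f"{column} LIKE ?")
            params.append(f"%{value}%")
    where = " AND ".join(clauses) if clauses else "1"
    query = f"SELECT id, title FROM documents WHERE {where} ORDER BY id"
    return conn.execute(query, params).fetchall()
```

Substring matching with `LIKE` is the simplest possible choice; a collection on the scale described here would more plausibly rely on a full-text index.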





Source:  OpenStax, Online humanities scholarship: the shape of things to come. OpenStax CNX. May 08, 2010 Download for free at http://cnx.org/content/col11199/1.1