OCHRE System Design
OCHRE is designed to support all stages of the research data life cycle and to make it easy to move from one stage to the next—from initial data acquisition to integration, analysis, publication, and final archiving. The OCHRE system has four tiers, as shown in the diagram below.
The Client Tier consists of two kinds of end-user software applications. The first is a stand-alone Java application (not a browser applet) that runs on desktop and laptop computers under Windows, Mac OS X, or Linux. It has a feature-rich graphical user interface (GUI) that enables the members of a project team to build and manage their project’s data via password-protected user accounts, and to display and analyze the data in many different ways. The Java GUI communicates via the Internet with the OCHRE data warehouse running on a Tamino XML Server in the University of Chicago Library. The Java GUI is normally used online but also has an offline mode for use when a fast Internet connection is not available (e.g., on an archaeological dig).
The Middle Tier is the layer of software that exposes published data from the data warehouse to browser and mobile apps via a RESTful Web API (or Web service). This API is currently under development; details and documentation will be coming soon. Apps can use the API in two ways. They can fetch published data (exposed as “flattened” XML-based data-exchange documents; see above) by means of persistent URLs, or they can trigger Java routines and R functions, passing arguments to them, to execute pre-written queries and analytical workflows that a project has named and saved in the data warehouse for others to use. The Middle Tier is built on the HTTP Web service capability of Tamino XML Server.
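Because the API is not yet documented, the following Python sketch only illustrates the two access patterns described above: fetching a published item by persistent URL, and invoking a named, saved query with arguments. The base URL, the path layout (`/items/...`, `/queries/...`), the query and argument names, and the shape of the sample XML document are all assumptions made for illustration, not the actual OCHRE API.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Hypothetical base URL; the real API endpoint is not given in this document.
BASE_URL = "https://example.org/ochre-api"


def item_url(item_id):
    """Build a persistent URL for one published item (assumed path layout)."""
    return f"{BASE_URL}/items/{quote(item_id, safe='')}"


def saved_query_url(query_name, **args):
    """Build a URL that names a saved query and passes arguments to it."""
    url = f"{BASE_URL}/queries/{quote(query_name, safe='')}"
    if not args:
        return url
    # Sort the arguments so the same call always yields the same URL.
    return f"{url}?{urlencode(sorted(args.items()))}"


def parse_flattened(xml_text):
    """Map each child element of a flat XML document to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def fetch(url):
    """Fetch a URL and parse the XML response (needs a live server)."""
    with urlopen(url) as response:
        return parse_flattened(response.read().decode("utf-8"))


# Hypothetical "flattened" document; the real exchange format may differ.
SAMPLE = """<item id="abc-123">
  <name>Locus 42</name>
  <category>spatial</category>
</item>"""

record = parse_flattened(SAMPLE)
```

An app would call `fetch(item_url("abc-123"))` against a live server; here `parse_flattened(SAMPLE)` stands in for the response so that the sketch runs without network access.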
The Core Data and Analysis Tier consists of both the OCHRE data warehouse and a separate server for executing R functions to do statistical analysis and visualization. The data warehouse is highly scalable and extensively indexed, permitting fast queries. It is implemented using a high-performance database management system called Tamino XML Server from Software AG, which is maintained and backed up by professional system administrators in the Digital Library Development Center of the University of Chicago Library. The data warehouse is structured in accordance with an innovative non-relational graph data model that is optimized for semistructured data represented as recursive hierarchies of spatial, temporal, linguistic, and taxonomic items. The data warehouse and its underlying data model are described further on the Database page of this website.
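To make the idea of a recursive hierarchy of items concrete, here is a minimal Python sketch in which each item may contain further items to any depth. The `Item` class, its fields, and the sample names are hypothetical and are not taken from the OCHRE schema; the sketch also shows only a simple tree, whereas the actual model is described as a graph.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Item:
    """One node in a hierarchy (hypothetical stand-in for an OCHRE item)."""
    name: str
    kind: str  # e.g. "spatial", "temporal", "linguistic", "taxonomic"
    children: List["Item"] = field(default_factory=list)

    def add(self, child: "Item") -> "Item":
        """Attach a child item and return it, so calls can be nested."""
        self.children.append(child)
        return child


def walk(item: Item, depth: int = 0) -> Iterator[Tuple[int, Item]]:
    """Yield every item with its depth, parents before children."""
    yield depth, item
    for child in item.children:
        yield from walk(child, depth + 1)


# A small spatial hierarchy: a site contains an area, which contains loci.
site = Item("Site", "spatial")
area = site.add(Item("Area A", "spatial"))
area.add(Item("Locus 1", "spatial"))
area.add(Item("Locus 2", "spatial"))

paths = [(depth, item.name) for depth, item in walk(site)]
```

Because the same `Item` type appears at every level, the hierarchy can nest as deeply as the data requires without a fixed table per level, which is the property that suits semistructured data.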