E-ARK use cases and processes

Within the E-ARK project, which ran from 2014 to 2017 and preceded the eArchiving Building Block, the E-ARK team has

  • identified the E-ARK use cases and detailed the related OAIS processes, 

  • developed a set of format specifications (including a detailed structure for all three types of OAIS information packages), 

  • and developed or modified a set of tools to process the information packages. 

 

Use cases identified by the E-ARK project

  • Pre-Ingest and Ingest use cases 

      • Export and ingest relational database(s) based on SIARD 

      • Export and ingest electronic records based on MoReq2010 

      • Package and ingest simple files from a file system 

      • Package and ingest geodata related to other digital content in the package 

  • Access use cases 

      • Access relational database(s) based on SIARD 

      • Access relational database(s) via SOLR (not SQL) 

      • Access single electronic records/files (ingested from an ERMS or from a file system) 

      • Access data via OLAP (data cube) technology 

      • Access geodata related to other digital content in the package 

 

Understanding eArchiving specifications and tools

The following tables show the digital archiving components resulting from the E-ARK project, arranged according to the OAIS processes. The columns for the (source and intermediate) formats are left white, while the columns for the tools, which perform the transition from one format to the next, are shaded amber.

Pre-Ingest and Ingest

Steps specific to each data source:

| Data Source | Export tool | Content type format |
|---|---|---|
| Database | DBVTK | SIARD 2.0 |
| ERMS | ERMS export module | ERMS content type |
| Files | | |
| Geodata | QGIS* | Geodata content type |

Steps shared by all data sources:

| SIP creation tool | Submission Information Package | Ingest tool | Archival Information Package | Archival Repository |
|---|---|---|---|---|
| RODA-In, ESS ETP, SIP Creator (E-ARK Web) | E-ARK SIP | RODA, ESS ETA, SIP2AIP Converter (E-ARK Web) | E-ARK AIP | RODA Repository, ESS Preservation Platform, HDFS Storage, SOLR Index (E-ARK Web) |

*QGIS is not an E-ARK product. Some freely available, (almost) industry-standard tools were integrated into and tested together with the E-ARK toolset in some of the E-ARK pilot scenarios.

Access

Steps shared by all output formats:

| Archival Repository | Archival Information Package | Search and Order tools | DIP creation tool | Dissemination Information Package |
|---|---|---|---|---|
| RODA Repository, ESS Preservation Platform, HDFS Storage, SOLR Index (E-ARK Web) | E-ARK AIP | Search & Display, Order Management Tool, Lily Ingest*, E-ARK Web Search | RODA, ESS EPP, AIP2DIP Converter (E-ARK Web) | E-ARK DIP |

Viewer for each output format:

| Viewer | Output Format |
|---|---|
| DBVTK | Relational Database |
| SOLR | SOLR Database |
| CMIS Portal Viewer | ERMS record |
| IP Viewer | Simple files |
| OLAP* Viewer | OLAP Data |
| QGIS*, Peripleo* | Geodata |

*QGIS, Peripleo, Lily Ingest and Oracle OLAP are not E-ARK products. Some freely available, (almost) industry-standard tools were integrated into and tested together with the E-ARK toolset in some of the E-ARK pilot scenarios.

As the tables above show, the format specifications mark the connection points between consecutive processing steps. Once the formats are standardized, the output of one step can be read by the next, so consecutive steps become compatible regardless of which tool performs them. This is why detailed format specifications were needed: the OAIS model does not specify the internal structure of the information packages. Providing the archival community with such specifications was one of the main goals of the E-ARK project.
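The role of formats as connection points can be sketched as a small data model. This is an illustration only, not part of any E-ARK tool: the Step class, the two helper functions and the label "Content type format" used as the first input are assumptions made for the example. The tool and package names are taken from the tables above, and treating the E-ARK AIP as the direct input of DIP creation is a simplification that skips the search and order step.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    """One processing step: tools that turn an input format into an output format."""
    name: str
    input_format: str
    output_format: str
    tools: Tuple[str, ...]


# Tool and package names come from the tables above; the structure is hypothetical.
INGEST_STEPS = [
    Step("SIP creation", "Content type format", "E-ARK SIP",
         ("RODA-In", "ESS ETP", "SIP Creator (E-ARK Web)")),
    Step("Ingest", "E-ARK SIP", "E-ARK AIP",
         ("RODA", "ESS ETA", "SIP2AIP Converter (E-ARK Web)")),
]
ACCESS_STEPS = [
    Step("DIP creation", "E-ARK AIP", "E-ARK DIP",
         ("RODA", "ESS EPP", "AIP2DIP Converter (E-ARK Web)")),
]


def is_chainable(steps: List[Step]) -> bool:
    """Return True if every step's output format is the next step's input format."""
    return all(a.output_format == b.input_format for a, b in zip(steps, steps[1:]))


def formats_along(steps: List[Step]) -> List[str]:
    """List the formats a package passes through, in processing order."""
    if not steps:
        return []
    return [steps[0].input_format] + [s.output_format for s in steps]


if __name__ == "__main__":
    pipeline = INGEST_STEPS + ACCESS_STEPS
    print(is_chainable(pipeline))
    print(" -> ".join(formats_along(pipeline)))
```

Because each step is described only by the formats it reads and writes, any tool listed for a step could in principle be swapped for another without affecting the neighbouring steps, which is the compatibility argument made above.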

The E-ARK project has defined the following format specifications:

For OAIS information packages

  • Common Specification for Information Packages 

  • SIP Specification 

  • AIP Specification 

  • DIP Specification 

For content types (to store data of specific types within the information package)

  • SIARD 2.0 format for databases 

  • ERMS format for electronic records from records management systems 

  • Geodata format to store geographic information along with other data or content types 

Every tool developed or modified in the scope of the E-ARK project is compatible with all the above format specifications.

The E-ARK Web solution was developed as a reference implementation. Although it is not a mature tool set, all of its components were tested together with the specifications and other tools in some of the more than twenty real-world E-ARK pilot scenarios.

A basic description of every component, with links to more detailed information, is available on the Library page of the General Model (http://kc.dlmforum.eu/gm3).

The General Model provides information about all E-ARK components from different aspects. The cross-reference view shows the connected elements of a selected component. The components are divided into four groups: format specifications, use cases and processes, tools and pilot scenarios.

The E-ARK product portfolio described above is considered an initial release of the eArchiving Building Block services. (Please note that the General Model is being redesigned according to the service-oriented approach of the eArchiving Building Block.)