netarchivesuite/netarchivesuite-docker-compose

Running NetarchiveSuite Quickstart

Very Quick Start

docker-compose -p nas build

docker container rm nas_ftp_1 & docker-compose -p nas up

These two commands build the images and start a complete dockerised NetarchiveSuite, with the GUI available at http://localhost:8078.

In addition, a Java debugger can be attached to the Heritrix processes on port 8500 (Focused) or 8501 (Snapshot), and to the Focused HarvestController application on port 8600. Debuggers can be added to other applications in the build using the same mechanism in the docker-compose.yml file.
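As a rough illustration of attaching to one of these ports, the sketch below wraps the JDK's command-line debugger, jdb, which accepts `-attach host:port`. The function names and the target labels (`heritrix-focused` and so on) are hypothetical and not part of this repository; only the port numbers come from the text above. It assumes a JDK is installed on the host.

```shell
# Map a (hypothetical) target label to the debug port listed in this README.
debug_port() {
  case "$1" in
    heritrix-focused)          echo 8500 ;;
    heritrix-snapshot)         echo 8501 ;;
    harvestcontroller-focused) echo 8600 ;;
    *) echo "unknown debug target: $1" >&2; return 1 ;;
  esac
}

# Attach jdb to the chosen target on localhost.
# Assumes jdb (shipped with the JDK) is on the PATH.
attach_debugger() {
  port=$(debug_port "$1") || return 1
  jdb -attach "localhost:$port"
}
```

For example, `attach_debugger heritrix-focused` would try to attach to localhost:8500. An IDE's remote-debug configuration pointed at the same host and port should work equally well.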

The NetarchiveSuite database is exposed on port 6543, and the Heritrix GUI for Focused harvests is available on localhost port 8090.
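To see at a glance which of the ports mentioned in this quick start are reachable, something like the following could be used. It is a minimal sketch: the function names are hypothetical, and it assumes a netcat (`nc`) that supports the `-z` and `-w` options is installed. If `nc` is missing, every port is simply reported as closed.

```shell
# Ports and their roles, as listed in this README.
nas_ports() {
  cat <<'EOF'
8078 NetarchiveSuite GUI
8090 Heritrix GUI (Focused)
6543 NetarchiveSuite database
8500 Heritrix debugger (Focused)
8501 Heritrix debugger (Snapshot)
8600 HarvestController debugger (Focused)
EOF
}

# Succeeds if a TCP connection to host $1, port $2 can be opened within 1 second.
port_open() {
  nc -z -w 1 "$1" "$2" >/dev/null 2>&1
}

# Print one line per port: number, open/closed, role.
check_nas_ports() {
  host="${1:-localhost}"
  nas_ports | while read -r port label; do
    if port_open "$host" "$port"; then state=open; else state=closed; fi
    printf '%-6s %-7s %s\n' "$port" "$state" "$label"
  done
}
```

Running `check_nas_ports` once the containers are up should show which services are accepting connections; an open port only means that something is listening, not that the application behind it is healthy.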

For more information on using NetarchiveSuite, see the Quickstart Manual.

More About This Docker-Compose Assembly

The assembly starts a complete NetarchiveSuite installation consisting of 14 containers. Three of these containers are services used by NetarchiveSuite (jms-broker, ftp-server, postgres database) and the other 11 are a network of NetarchiveSuite applications. In a production environment, these 11 applications would run on multiple machines, possibly widely geographically distributed.
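As a quick sanity check after startup, the number of containers that Compose lists for the project can be compared with the 14 described above. The helper below is a sketch and not part of this repository; `expect_containers` is a hypothetical name. It relies only on `docker-compose ps -q`, which prints one container ID per line.

```shell
# Read container IDs on stdin (one per line) and compare their count
# with the expected number given as the first argument.
expect_containers() {
  expected="$1"
  actual=$(grep -c . || true)   # count non-empty input lines
  if [ "$actual" -eq "$expected" ]; then
    echo "ok: $actual containers"
  else
    echo "expected $expected containers, found $actual" >&2
    return 1
  fi
}

# Intended use, with the project name from the quick start:
#   docker-compose -p nas ps -q | expect_containers 14
```

Depending on the Compose version, `ps -q` may or may not include stopped containers, so a matching count is a hint rather than proof that everything is running.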

Each NetarchiveSuite application instance is based on the Dockerfile defined in the "nasapp" directory. Each application uses the same distribution of the NetarchiveSuite software, whose location is defined inside the Dockerfile. The distribution can either be fetched from our Nexus installation or provided by the user, for example in order to test self-developed code.

The individual applications are defined by customising three Jinja2 template files: start.sh.j2, settings.xml.j2, and logback.xml.j2. It is no coincidence that we use the same templating engine as Ansible, as there is a long-standing ambition to develop an ansible-playbook deployment of NetarchiveSuite.

The docker-compose.yml file defines the environment variables used to render the templates for each NetarchiveSuite application container. The actual call to the Jinja2 command-line tool (j2) is made by the docker-entrypoint.sh script.
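The rendering step can be pictured roughly as follows. This is a sketch only, not the contents of the repository's docker-entrypoint.sh: the function name, the directory arguments, and the choice of output file names (the template name without the .j2 suffix) are all assumptions. It relies on the behaviour of the j2 command-line tool, which renders a template from the process environment when no data file is given.

```shell
# Render the three application templates from environment variables.
# $1: directory containing the .j2 templates (assumed layout)
# $2: directory to write the rendered files to (assumed layout)
render_templates() {
  src_dir="$1"
  out_dir="$2"
  for name in start.sh settings.xml logback.xml; do
    # With no data file, j2 takes its template variables from the environment,
    # i.e. from what docker-compose.yml sets for this container.
    j2 "$src_dir/$name.j2" > "$out_dir/$name" || return 1
  done
  chmod +x "$out_dir/start.sh"
}
```

Because every container is built from the same image and templates, only the environment variables differ between applications, which keeps the per-application configuration in one place.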

The installation emulates a NetarchiveSuite setup distributed over two geographic locations ("S" and "K"). In particular, there is an instance of BitarchiveApplication at each of the two locations. Such a setup enables a geographically distributed bitpreservation environment.

The environment also defines two harvesters, each with its own harvesting channel ("FOCUSED" and "SNAPSHOT"). The database configuration loaded in the nasdb container maps the "SNAPSHOT" channel to broad crawls, that is, crawls of all known domains. In a realistic setup there could be many harvesters on many machines in multiple geographic locations.
