4. Configuring the LOCKSS System

After installing the LOCKSS system, configure it with the configure-lockss script by running this command as the lockss user in the lockss user's lockss-installer directory [1]:

scripts/configure-lockss

If you have experience with classic LOCKSS daemon version 1.x, this is the equivalent of hostconfig.

Some notes about using configure-lockss:

  • The first time the script is run, some of its questions will have a suggested or default value, displayed in square brackets; hit Enter to accept the suggested value, or type the correct value and hit Enter.

  • On subsequent runs, the previous values are offered as the defaults; review each one and hit Enter to leave it unchanged.

  • Password prompts will not display the previous value but can still be left unchanged with Enter.
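The prompt behavior described in these notes can be illustrated with a minimal Python sketch. It is not the code of configure-lockss; the helper name ask, its parameters, and the exact prompt layout are assumptions made for illustration.

```python
def ask(prompt, default=None, read=input):
    """Ask a question, showing any default value in square brackets.

    Hypothetical helper for illustration; not part of configure-lockss.
    """
    suffix = f" [{default}]" if default is not None else ""
    answer = read(f"{prompt}:{suffix} ").strip()
    # An empty answer (just Enter) keeps the suggested or previous value.
    return answer if answer else default
```

On a later run, the value stored by the previous run would be passed as default, so hitting Enter leaves it unchanged.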

4.1. Network Settings

4.1.1. Hostname

Prompt: Fully qualified hostname (FQDN) of this machine

Enter the machine's fully-qualified hostname (meaning with its domain name), for example locksstest.myuniversity.edu.

4.1.2. IP Address

Prompt: IP address of this machine

If the machine is publicly routable, meaning it has an IP address that can be used to identify it over the Internet, enter the publicly routable IP address. Otherwise, if the machine is accessible via network address translation (NAT), meaning it has an IP address that is valid only on your local network but it can be reached from the Internet via a NAT router, enter the internal IP address.

4.1.3. Network Address Translation

  1. Prompt: Is this machine behind NAT?

    If the machine is publicly routable, enter N; otherwise, if the machine is not publicly routable but will be accessible via network address translation (NAT), enter Y.

  2. If you answered Y, you will be asked an additional configuration question:

    External IP address for NAT

    Enter the publicly routable IP address of the NAT router.
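If you are unsure whether an address is publicly routable, the following minimal Python sketch may help you decide how to answer the NAT question. It is not part of configure-lockss; the function name suggest_nat_answer is hypothetical, and the check relies only on the is_global property of the standard ipaddress module, which may not match every network setup.

```python
import ipaddress


def suggest_nat_answer(ip):
    """Suggest an answer to "Is this machine behind NAT?" for an address.

    Hypothetical helper: returns "N" for a globally routable address and
    "Y" for anything else (private, loopback, link-local, and so on).
    """
    address = ipaddress.ip_address(ip)
    return "N" if address.is_global else "Y"
```

For example, suggest_nat_answer("192.168.1.10") returns "Y", because 192.168.0.0/16 is reserved for private networks.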

4.1.4. Initial UI Subnet

Prompt: Initial subnet(s) for admin UI access

Enter a semicolon-separated list of subnets in CIDR or mask notation that should initially have access to the Web user interfaces (UI) of the system. The access list can be modified later via the UI.
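Before typing a subnet list, it can be useful to check it for mistakes. The sketch below is an illustration only, with a hypothetical function name parse_subnets; it validates entries written in CIDR notation with the standard ipaddress module and does not attempt to handle mask notation, whose accepted form is not specified in this section.

```python
import ipaddress


def parse_subnets(value):
    """Split a semicolon-separated list and validate each CIDR entry.

    Raises ValueError if an entry is not a valid network, for example
    when host bits are set (10.0.0.1/24 instead of 10.0.0.0/24).
    """
    subnets = []
    for item in value.split(";"):
        item = item.strip()
        if item:  # ignore empty entries such as a trailing semicolon
            subnets.append(ipaddress.ip_network(item))
    return subnets
```

For example, parse_subnets("10.0.0.0/8;192.168.1.0/24") returns two network objects, while parse_subnets("10.0.0.1/24") raises ValueError.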

4.1.5. Container Subnet

  1. If configure-lockss detects a discrepancy between a previously used subnet for inter-container communication in the system and the subnet it would choose now, you may either see the warning:

    Container subnet has changed from <former_subnet> to <new_subnet>

    or be asked the question:

    Container subnet was <former_subnet>, we think it should now be <new_subnet>. Do you want to change it?

    in which case you should enter Y (recommended) or N.

  2. Prompt: LOCKSS subnet for inter-service access control

    Enter the subnet used for inter-container communication. We recommend accepting the proposed value by hitting Enter.

4.1.6. LCAP Port

Prompt: LCAP V3 protocol port

Enter the port on the publicly routable IP address that will be used to receive LCAP (LOCKSS polling and repair) traffic. Historically, most LOCKSS nodes have used port 9729.

4.2. Mail Settings

4.2.1. Mail Relay

Prompt: Mail relay for this machine

Enter the hostname of this machine's outgoing mail server, for example smtp.myuniversity.edu.

4.2.2. Mail Relay Credentials

  1. Prompt: Does the mail relay <mailhost> need a username and password?

    If the outgoing mail server does not require password authentication, enter N; otherwise, enter Y.

  2. If you answered Y, you will be asked additional configuration questions:

    1. Prompt: User for <mailhost>

      Enter a username for the mail server.

    2. Prompt: Password for <mailuser>@<mailhost>

      Enter the password for the username on the mail server.

    3. Prompt: Password for <mailuser>@<mailhost> (again)

      Re-enter the password for the username on the mail server. If the two passwords do not match, you will be asked for the password again.
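The enter-and-confirm pattern used by this and the other password prompts can be sketched as follows. This is an illustrative Python sketch rather than the script's own logic; the name ask_password and the wording of the mismatch message are assumptions.

```python
import getpass


def ask_password(prompt, read=getpass.getpass):
    """Ask for a password twice and repeat until both entries match.

    Hypothetical helper; `read` is injectable so the loop can be tested
    without a terminal.
    """
    while True:
        first = read(f"{prompt}: ")
        second = read(f"{prompt} (again): ")
        if first == second:
            return first
        print("Passwords do not match; please try again.")
```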

4.2.3. Administrator Email

Prompt: E-mail address for administrator

Enter the e-mail address of the person or team who will administer the LOCKSS system on this machine.

4.3. Preservation Network Settings

4.3.1. Configuration URL

  1. Prompt: Configuration URL

    Accept the default (http://props.lockss.org:8001/demo/lockss.xml) if you are not running your own LOCKSS network; otherwise, enter the URL of the LOCKSS network configuration file provided by your LOCKSS network administrator.

  2. If the configuration URL begins with https:, you will be asked additional configuration questions:

    1. Prompt: Verify configuration server authenticity?

      Enter Y if you would like to check the authenticity of the configuration server using a custom keystore; otherwise enter N.

    2. If you answered Y, you will be asked an additional configuration question:

      Server certificate keystore

      Enter the path of a Java keystore used to verify the authenticity of the configuration server.
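Whether the additional questions appear depends only on the scheme of the configuration URL. A minimal Python sketch of that check, with the hypothetical function name needs_tls_questions, could look like this:

```python
from urllib.parse import urlsplit


def needs_tls_questions(url):
    """Return True if the configuration URL uses the https scheme."""
    return urlsplit(url).scheme.lower() == "https"
```

With the default URL shown above, which begins with http:, the function returns False and no keystore question would be expected.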

4.3.2. Configuration Proxy

Prompt: Configuration proxy (host:port)

If the configuration URL can be reached directly, leave this blank; otherwise, if a proxy server is required to reach the configuration URL, enter its host and port in host:port format (for example proxy.myuniversity.edu:8080).
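The host:port format can be checked before it is entered. The following Python sketch is an illustration under stated assumptions: the function name parse_proxy is hypothetical, a blank value is treated as "no proxy", and the port is required to be between 1 and 65535.

```python
def parse_proxy(value):
    """Parse a host:port string; return None for a blank value.

    Raises ValueError if the value is not in host:port format.
    """
    value = value.strip()
    if not value:
        return None  # blank means the configuration URL is reached directly
    host, separator, port = value.rpartition(":")
    if not separator or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {value!r}")
    port_number = int(port)
    if not 1 <= port_number <= 65535:
        raise ValueError(f"port out of range: {port_number}")
    return host, port_number
```

For example, parse_proxy("proxy.myuniversity.edu:8080") returns ("proxy.myuniversity.edu", 8080).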

4.3.3. Preservation Groups

Prompt: Preservation group(s)

Accept the default (demo) if you are not running your own LOCKSS network; otherwise, enter a semicolon-separated list of LOCKSS network identifiers as provided by your LOCKSS network administrator, for example ournetwork or prod;usdocspln.

4.4. Storage Areas

The LOCKSS system needs storage areas to store data:

  • One or more content data storage areas to store preserved content, as well as several databases.

  • A log data storage area to store log files.

  • A temporary data storage area to store temporary files.

Depending on your host system's layout, these storage areas may all be the same, or all be different mount points or paths.

Subdirectories will be created in each storage area to fit the needs of a system component; for example lockss-stack-cfg-data is the LOCKSS configuration service's content data directory in the content data storage areas, and lockss-stack-repo-logs is the LOCKSS repository service's log data directory in the log data storage area.

4.4.1. Content Data Storage Areas

  1. Prompt: Root path for primary content data storage

    Enter the full path of a directory to use as the root of the main storage area of the LOCKSS system, where preserved content will be stored along with several databases. It is the analog of /cache0 in the classic LOCKSS system.

  2. Prompt: Use additional directories for content data storage?

    If you want to use more than one filesystem to store preserved content, enter Y; otherwise, enter N.

  3. If you answered Y, you will be asked an additional configuration question:

    Root path for additional content data storage <count> (q to quit)

    On each line, enter the full path of a directory to use as the root of an additional storage area, and enter q when done.

4.4.2. Log Data Storage Area

Prompt: Root path for log data storage

This directory is used as the root of the storage area for log files in the LOCKSS system. Accept the default (same directory as the content data storage directory root) by hitting Enter, or enter a custom path.

4.4.3. Temporary Data Storage Area

Prompt: Root path for temporary data storage (local storage preferred)

This directory is used as the root of the storage area for temporary files in the LOCKSS system. Accept the default (same directory as the content data storage directory root) by hitting Enter, or enter a custom path.

Tip

The LOCKSS software makes heavy use of temporary storage, and we recommend that temporary directories be placed on a filesystem with relatively low latency. If the content data storage directories are on network storage (for example NFS), system performance may be improved by supplying a local directory for temporary data storage.

Caution

Depending on the characteristics of the preservation activities undertaken by the system, in some circumstances content processing may require a substantial amount of temporary space, up to tens of gigabytes. Do not use a RAM-based tmpfs volume, or a directory in a space-constrained partition.
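To check a candidate temporary directory against the tip and caution above, a small Python sketch such as the following can report the filesystem type and the free space. It is an illustration only: the function names are hypothetical, the mount lookup reads /proc/mounts and therefore assumes a Linux host, and mount points containing spaces are not handled.

```python
import os
import shutil


def filesystem_type(path, mounts_file="/proc/mounts"):
    """Return the filesystem type of the mount containing path (Linux)."""
    target = os.path.realpath(path) + "/"
    best_mount, best_type = "", None
    with open(mounts_file) as mounts:
        for line in mounts:
            fields = line.split()
            if len(fields) < 3:
                continue
            mount_point, fs_type = fields[1], fields[2]
            prefix = mount_point.rstrip("/") + "/"
            # Keep the longest matching mount point; on ties, the later
            # entry wins because it was mounted on top of the earlier one.
            if target.startswith(prefix) and len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def free_gib(path):
    """Return the free space, in GiB, on the filesystem containing path."""
    return shutil.disk_usage(path).free / 2**30
```

A directory for which filesystem_type returns "tmpfs", or for which free_gib reports only a few gigabytes, is probably a poor choice for temporary data storage.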

4.5. Web User Interface Settings

  1. Prompt: User name for web UI administration

    Enter a username for the primary administrative user in the LOCKSS system's Web user interfaces.

  2. Prompt: Password for web UI administration user <uiuser>

    Enter a password for the primary administrative user.

  3. Prompt: Password for web UI administration user <uiuser> (again)

    Re-enter the password for the primary administrative user. If the two passwords do not match, you will be asked for the password again.

4.6. Database Settings

4.6.1. PostgreSQL

Prompt: Use embedded LOCKSS PostgreSQL DB Service?

Select one of the following two options:

  1. Enter Y to use the embedded PostgreSQL database. This is recommended in most cases; a PostgreSQL database will be run and managed by the LOCKSS system internally. If you choose this option, see Embedded PostgreSQL Database.

  2. Enter N to use an external PostgreSQL database. Select this option if you wish to use an existing PostgreSQL database at your institution or one that you run and manage yourself. If you choose this option, see External PostgreSQL Database.

4.6.1.1. Embedded PostgreSQL Database

If you select this option, you will be asked additional configuration questions:

  1. Prompt: Password for PostgreSQL database

    Enter the password for the embedded PostgreSQL database.

    Warning

    This prompt is used to record the PostgreSQL database password in the LOCKSS system's configuration. If you change the value of the PostgreSQL database password here without actually changing the PostgreSQL database password, the LOCKSS system components will no longer be able to connect to the PostgreSQL database. See Working with PostgreSQL for details.

  2. Prompt: Password for PostgreSQL database (again)

    Re-enter the password for the embedded PostgreSQL database. If the two passwords do not match, you will be asked for the password again.

  3. Complete the Solr section next.

4.6.1.2. External PostgreSQL Database

If you select this option, you will be asked additional configuration questions:

  1. Prompt: Fully qualified hostname (FQDN) of PostgreSQL host

    Enter the hostname of the external PostgreSQL database, for example postgres.myuniversity.edu.

  2. Prompt: Port used by PostgreSQL host

    Enter the port where the external PostgreSQL database can be reached, for example 5432.

  3. Prompt: Schema for PostgreSQL service

    Enter the schema name to be used by the LOCKSS system. The schema name used in the embedded PostgreSQL database is LOCKSS, but your database administrator may assign a different schema name to you.

  4. Prompt: Database name prefix for PostgreSQL service

    Enter the prefix to use for any LOCKSS-related database names in the schema. The database name prefix in the embedded PostgreSQL database is Lockss (note the uppercase/lowercase), but your database administrator may assign a different database name prefix.

  5. Prompt: Login name for PostgreSQL service

    Enter the username for the external PostgreSQL database. The username in the embedded PostgreSQL database is LOCKSS, but your database administrator may assign a different username to you.

  6. Prompt: Password for PostgreSQL database

    Enter the password for the username in the external PostgreSQL database.

    Warning

    This prompt is used to record the PostgreSQL database password in the LOCKSS system's configuration. If you change the value of the PostgreSQL database password here without actually changing the PostgreSQL database password, the LOCKSS system components will no longer be able to connect to the PostgreSQL database. Contact your PostgreSQL database administrator for details.

  7. Prompt: Password for PostgreSQL database (again)

    Re-enter the password for the username in the external PostgreSQL database. If the two passwords do not match, you will be asked for the password again.

  8. Complete the Solr section next.
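Before entering the host and port of an external database, you may want to confirm that the server is reachable from this machine. The following Python sketch, with the hypothetical function name can_connect, checks only that a TCP connection can be opened; it does not verify the schema, username, or password.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False
```

For example, can_connect("postgres.myuniversity.edu", 5432) would test the example values used above; the same check applies to an external Solr host.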

4.6.2. Solr

Prompt: Use embedded LOCKSS Solr Service?

Select one of the following two options:

  1. Enter Y to use the embedded Solr database. This is recommended in most cases; a Solr database will be run and managed by the LOCKSS system internally. If you choose this option, see Embedded Solr Database.

  2. Enter N to use an external Solr database. Select this option if you wish to use an existing Solr database at your institution or one that you run and manage yourself. If you choose this option, see External Solr Database.

4.6.2.1. Embedded Solr Database

If you select this option, you will be asked additional configuration questions:

  1. Prompt: User name for LOCKSS Solr access

    Enter the username for the embedded Solr database.

  2. Prompt: Password for LOCKSS Solr access

    Enter the password for the username in the embedded Solr database.

  3. Prompt: Password for LOCKSS Solr access (again)

    Re-enter the password for the username in the embedded Solr database. If the two passwords do not match, you will be asked for the password again.

  4. Complete the Metadata Query Service section next.

4.6.2.2. External Solr Database

If you select this option, you will be asked additional configuration questions:

  1. Prompt: Fully qualified hostname (FQDN) of Solr host

    Enter the hostname of the external Solr database server, for example solr.myuniversity.edu.

  2. Prompt: Port used by Solr host:

    Enter the port used by the database server on the Solr host, for example 8983.

  3. Prompt: Solr core repo name:

    Enter the name of the Solr core for the LOCKSS repository. The Solr core name used in the embedded Solr database is lockss-repo, but your database administrator may assign a different Solr core name.

  4. Prompt: User name for LOCKSS Solr access

    Enter the username for the external Solr database.

  5. Prompt: Password for LOCKSS Solr access

    Enter the password for the username in the external Solr database.

  6. Prompt: Password for LOCKSS Solr access (again)

    Re-enter the password for the username in the external Solr database. If the two passwords do not match, you will be asked for the password again.

  7. Complete the Metadata Query Service section next.

4.7. LOCKSS Services

4.7.1. Metadata Query Service

Prompt: Use LOCKSS Metadata Query Service?

Enter Y if you want the metadata query service to be run; otherwise, enter N.

4.7.2. Metadata Extraction Service

Prompt: Use LOCKSS Metadata Extraction Service?

Enter Y if you want the metadata extraction service to be run; otherwise, enter N.

4.8. Web Replay Settings

4.8.1. Pywb

Prompt: Use LOCKSS Pywb Service?

Enter Y to run an embedded Pywb engine for Web replay; otherwise, enter N.

4.8.2. OpenWayback

  1. Prompt: Use LOCKSS OpenWayback Service?

    Enter Y to use an embedded OpenWayback engine for Web replay; otherwise, enter N.

  2. If you answered Y, you will be asked an additional configuration question:

    Okay to turn off authentication for read-only requests for LOCKSS Repository Service?

    OpenWayback currently does not supply user credentials when reading content from the LOCKSS repository, so the repository must be configured to respond to unauthenticated read requests. Enter Y to accept this; otherwise, you will see the warning:

    Not enabling OpenWayback Service

    and OpenWayback will not be run.

4.9. Final Steps

  1. Prompt: OK to store this configuration?

    Enter Y if the configuration values are to your liking; otherwise, enter N to make edits.

  2. If you answer Y, configure-lockss will perform the final configuration steps. You may be asked to confirm before directories are created for the first time:

    <directory> does not exist; shall I create it?

    or before directory permissions are changed:

    <directory> is not writable; shall I chown it?

    In each case, enter Y for "yes" or N for "no".
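To anticipate these confirmation questions, you can inspect the storage directories beforehand. The sketch below is an illustration, not the script's own check; the function name directory_status and its return values are hypothetical, and it should be run as the lockss user so that the write test reflects that user's permissions.

```python
import os


def directory_status(path):
    """Classify a directory as "missing", "not writable", or "ok"."""
    if not os.path.isdir(path):
        return "missing"  # would lead to the "shall I create it?" question
    if not os.access(path, os.W_OK):
        return "not writable"  # would lead to the "shall I chown it?" question
    return "ok"
```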


Footnotes

[1] See Running Commands as the lockss User.