4. Configuring the LOCKSS System

After installing the LOCKSS system, you will configure it with the configure-lockss script. If you have experience with classic LOCKSS daemon version 1.x, this is the equivalent of hostconfig.

4.1. Before Invoking configure-lockss

You will need to gather information to answer configuration questions asked by configure-lockss, including:

  • The name (FQDN) of the host, the IP address of the host, and if behind NAT, the external IP address for NAT.

  • The mail relay host, and optionally mail credentials, for sending e-mail from the host, and the e-mail address for the administrator of the system.

  • The configuration URL and preservation group or groups corresponding to the LOCKSS network your system is joining.

  • The paths for the primary content storage area, any additional content storage areas, the state data storage area, the temporary storage area, and the log storage area. See the Storage and Network-Attached Storage sections for important information about performance requirements for these storage areas.

    Caution

    Each of these paths needs to be writeable by the lockss user. If this is not the case, set them up as root before running configure-lockss.

  • Username and password for the Web user interfaces.

  • A password for the PostgreSQL database.

    • Alternatively, if using an existing PostgreSQL database, the host name, port, schema, username and password for the external PostgreSQL database, as well as a prefix for database names.

  • A username and password for the Solr database.

    • Alternatively, if using an existing Solr database, the host name, port, username and password for the external Solr database, as well as the core name for the LOCKSS repository.

  • Whether you wish to use the LOCKSS Crawler Service, LOCKSS Metadata Query Service, LOCKSS Metadata Extraction Service, LOCKSS SOAP Compatibility Service, Pywb Web replay engine, and OpenWayback Web replay engine.

Some notes about using configure-lockss:

  • When run the first time, some of the questions asked by the script will have a suggested or default value, displayed in square brackets; hit Enter to accept the suggested value, or type the correct value and hit Enter.

  • Any subsequent runs will use the previous values as the default value; review and hit Enter to leave unchanged.

  • Password prompts will not display the previous value but can still be left unchanged with Enter.

4.2. Invoking configure-lockss

To invoke configure-lockss, run this command in the lockss user's lockss-installer directory as the lockss user [1]:

scripts/configure-lockss

The script will begin with the first series of configuration questions, about Kubernetes Settings.

4.3. Kubernetes Settings

Prompt: Command to use to execute kubectl commands

Enter the command to invoke kubectl in your environment. If you are using the K3s Kubernetes environment that ships with the LOCKSS system, the proposed value is already correct.

4.4. Network Settings

4.4.1. Hostname

Prompt: Fully qualified hostname (FQDN) of this machine

Enter the machine's fully-qualified hostname (meaning with its domain name), for example locksstest.myuniversity.edu.
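
If you want to sanity-check a hostname before entering it, the following is a minimal sketch of a syntactic check. The function name looks_like_fqdn is hypothetical and is not part of configure-lockss; the check only verifies that the name has a host part and a domain part made of valid DNS labels, not that the name resolves.

```python
import re

# One DNS label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def looks_like_fqdn(name: str) -> bool:
    """Return True if name has at least two valid DNS labels (host plus domain)."""
    name = name.rstrip(".")  # tolerate a trailing root dot
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    return len(labels) >= 2 and all(_LABEL.match(label) for label in labels)


print(looks_like_fqdn("locksstest.myuniversity.edu"))  # True
print(looks_like_fqdn("locksstest"))                   # False
```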

4.4.2. IP Address

Prompt: IP address of this machine

If the machine is publicly routable, meaning it has an IP address that can be used to identify it over the Internet, enter the publicly routable IP address. Otherwise, if the machine is accessible via network address translation (NAT), meaning it has an IP address that is valid only on your local network but it can be reached from the Internet via a NAT router, enter the internal IP address.
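
If you are unsure whether the machine's address is publicly routable, Python's standard ipaddress module can give a hint. This is a sketch only: the function name probably_behind_nat is hypothetical, and an address outside the globally routable ranges merely suggests NAT; your network administrator has the definitive answer.

```python
import ipaddress


def probably_behind_nat(address: str) -> bool:
    """Return True if address is not globally routable (for example RFC 1918 space)."""
    return not ipaddress.ip_address(address).is_global


print(probably_behind_nat("192.168.1.10"))  # True
print(probably_behind_nat("8.8.8.8"))       # False
```

If the result is True, you will most likely answer Y to the NAT question and supply the NAT router's external IP address.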

4.4.3. Network Address Translation

  1. Prompt: Is this machine behind NAT?

    If the machine is publicly routable, enter N; otherwise, if the machine is not publicly routable but will be accessible via network address translation (NAT), enter Y.

  2. If you answered Y, you will be asked an additional configuration question:

    Prompt: External IP address for NAT

    Enter the publicly routable IP address of the NAT router.

4.4.4. Initial UI Subnet

Prompt: Initial subnet(s) for admin UI access

Enter a semicolon-separated list of subnets in CIDR or mask notation that should initially have access to the Web user interfaces (UI) of the system. The access list can be modified later via the UI.
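
To check a subnet list for typos before entering it, a sketch along these lines may help. It assumes that "mask notation" means the address/netmask form (such as 192.168.1.0/255.255.255.0), which is what Python's ipaddress module accepts; configure-lockss may accept other forms. The function name parse_subnets is hypothetical.

```python
import ipaddress


def parse_subnets(answer: str):
    """Split a semicolon-separated answer and parse each subnet.

    Raises ValueError if an entry is not a valid CIDR or address/netmask subnet.
    """
    return [
        ipaddress.ip_network(item.strip(), strict=False)
        for item in answer.split(";")
        if item.strip()
    ]


for net in parse_subnets("10.0.0.0/8;192.168.1.0/255.255.255.0"):
    print(net)
# 10.0.0.0/8
# 192.168.1.0/24
```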

4.4.5. Container Subnet

  1. If configure-lockss detects a discrepancy between a previously used subnet for inter-container communication in the system and the subnet it would choose now, you may either see the warning:

    Container subnet has changed from <former_subnet> to <new_subnet>

    or be asked the question:

    Container subnet was <former_subnet>, we think it should now be <new_subnet>. Do you want to change it?

    in which case you should enter Y (recommended) or N.

  2. Prompt: LOCKSS subnet for inter-service access control

    Enter the subnet used for inter-container communication. We recommend accepting the proposed value by hitting Enter.

4.4.6. LCAP Port

Prompt: LCAP V3 protocol port

Enter the port on the publicly routable IP address that will be used to receive LCAP (LOCKSS polling and repair) traffic. Historically, most LOCKSS nodes have used port 9729.

4.5. Mail Settings

4.5.1. Mail Relay

Prompt: Mail relay for this machine

Enter the hostname of this machine's outgoing mail server, for example smtp.myuniversity.edu.

4.5.2. Mail Relay Credentials

  1. Prompt: Does the mail relay <mailhost> need a username and password?

    If the outgoing mail server does not require password authentication, enter N; otherwise, enter Y.

  2. If you answered Y, you will be asked additional configuration questions:

    1. Prompt: User for <mailhost>

      Enter a username for the mail server.

    2. Prompt: Password for <mailuser>@<mailhost>

      Enter the password for the username on the mail server.

    3. Prompt: Password for <mailuser>@<mailhost> (again)

      Re-enter the password for the username on the mail server. If the two passwords do not match, you will be asked for the password again.

4.5.3. Administrator Email

Prompt: E-mail address for administrator

Enter the e-mail address of the person or team who will administer the LOCKSS system on this machine.

4.6. Preservation Network Settings

4.6.1. Configuration URL

  1. Prompt: Configuration URL

    Enter the URL of your LOCKSS network's configuration file. Select a scenario below for more details:

    If you are trying out LOCKSS 2.x, enter http://props.lockss.org:8001/demo/lockss.xml (or simply hit Enter, as this is the default).

    If you are participating in the Global LOCKSS Network and trying out LOCKSS 2.x, enter http://props.lockss.org:8001/demo/lockss.xml (or simply hit Enter, as this is the default).

    If you are configuring your LOCKSS node to participate in a given LOCKSS network, enter the configuration URL provided for that LOCKSS network by your administrators (for example https://admin.mynetwork.org/config/lockss.xml).

  2. If the configuration URL begins with https:, you will be asked additional configuration questions:

    1. Prompt: Verify configuration server authenticity?

      Enter Y if you would like to check the authenticity of the configuration server using a custom keystore; otherwise enter N.

    2. If you answered Y, you will be asked an additional configuration question:

      Prompt: Server certificate keystore

      Enter the path of a Java keystore used to verify the authenticity of the configuration server.
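
The rule above (the authenticity questions appear only for https: URLs) can be sketched as a small check. The function name expects_keystore_questions is hypothetical and this is not how configure-lockss itself is implemented; it only illustrates which configuration URLs lead to the extra prompts.

```python
from urllib.parse import urlparse


def expects_keystore_questions(config_url: str) -> bool:
    """Return True if the configuration URL uses the https scheme."""
    return urlparse(config_url.strip()).scheme == "https"


print(expects_keystore_questions("http://props.lockss.org:8001/demo/lockss.xml"))      # False
print(expects_keystore_questions("https://admin.mynetwork.org/config/lockss.xml"))    # True
```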

4.6.2. Configuration Proxy

Prompt: Configuration proxy (host:port)

If the configuration URL can be reached directly, leave this blank; otherwise, if a proxy server is required to reach the configuration URL, enter its host and port in host:port format (for example proxy.myuniversity.edu:8080).
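
As an illustration of the host:port format, the following sketch splits and validates a proxy answer. The function name parse_proxy is hypothetical, and the validation rules (a non-empty host and a port between 1 and 65535) are assumptions rather than the script's documented behavior.

```python
def parse_proxy(answer: str):
    """Return (host, port) for a host:port answer, or None for a blank answer."""
    answer = answer.strip()
    if not answer:
        return None  # blank means the configuration URL is reached directly
    host, sep, port = answer.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {answer!r}")
    port_number = int(port)
    if not 1 <= port_number <= 65535:
        raise ValueError(f"port out of range: {port_number}")
    return host, port_number


print(parse_proxy("proxy.myuniversity.edu:8080"))  # ('proxy.myuniversity.edu', 8080)
print(parse_proxy(""))                             # None
```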

4.6.3. Preservation Groups

Prompt: Preservation group(s)

Enter a preservation group identifier or semicolon-separated list of preservation group identifiers. Select a scenario below for more details:

If you are trying out LOCKSS 2.x, enter demo (or simply hit Enter, as this is the default).

If you are participating in the Global LOCKSS Network and trying out LOCKSS 2.x, enter demoprod.

If you are configuring your LOCKSS node to participate in a given LOCKSS network, enter the preservation group(s) provided for that LOCKSS network by your administrators (for example mynetwork, or mynetwork;mygroup1;mygroup2).

4.7. Storage Areas

The LOCKSS system needs several kinds of storage areas, as described in the Storage section. See also the Network-Attached Storage section for important information about performance requirements for these storage areas.

Depending on your host system's layout, these storage areas may all be the same, or all be different mount points or paths. Each path must be writeable by the lockss user.

Subdirectories will be created in each storage area to fit the needs of each system component; for example lockss-stack-cfg-data is the LOCKSS configuration service's state data directory in the state data storage area, and lockss-stack-repo-logs is the LOCKSS repository service's log directory in the log storage area.
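
Before running configure-lockss, you may want to confirm that each planned storage root exists and is writeable. The sketch below is one way to do so when run as the lockss user; the function name unwritable_paths and the example paths are hypothetical, so substitute the paths you plan to enter.

```python
import os
import tempfile


def unwritable_paths(paths):
    """Return the paths that are missing, not directories, or not writeable."""
    problems = []
    for path in paths:
        # W_OK | X_OK: the current user must be able to create entries in the directory.
        if not os.path.isdir(path) or not os.access(path, os.W_OK | os.X_OK):
            problems.append(path)
    return problems


# Placeholder paths for illustration only: the system temporary directory
# stands in for a real storage root, and the second path should not exist.
candidates = [tempfile.gettempdir(), "/nonexistent/lockss-example"]
print(unwritable_paths(candidates))
```

Any path reported by this check would need to be created or have its ownership corrected (as root) before it can be used as a storage area.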

4.7.1. State Data Storage Area

Prompt: Root path for state data storage

This directory is used as the root of the storage area for databases and other state data. Enter the desired path, or if reconfiguring, hit Enter to accept a previously-entered value.

4.7.2. Content Storage Areas

  1. Prompt: Root path(s) for content storage

    Enter a semicolon-separated list of full paths of directories to be used to store preserved content.

  2. If your answer differs from the one given in a previous configuration run, you will see the warning:

    If you have removed or reordered content storage directories, you must run scripts/reindex-artifacts

    If you have done anything other than add new content storage areas to the end of the previously-entered list, you must run scripts/reindex-artifacts after completion of configure-lockss, before starting the system.
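
The rule about when reindexing is required can be sketched as a prefix comparison: appending new areas keeps the old list as an exact prefix of the new one, while removing or reordering entries does not. The function name needs_reindex and the example paths are hypothetical; this only illustrates the rule stated above and is not the script's own logic.

```python
def needs_reindex(previous: str, current: str) -> bool:
    """Return True unless current only appends entries to the end of previous."""
    old = [p.strip() for p in previous.split(";") if p.strip()]
    new = [p.strip() for p in current.split(";") if p.strip()]
    # Appending to the end keeps the old list as an exact prefix of the new list.
    return new[:len(old)] != old


print(needs_reindex("/lockss/c1;/lockss/c2", "/lockss/c1;/lockss/c2;/lockss/c3"))  # False
print(needs_reindex("/lockss/c1;/lockss/c2", "/lockss/c2;/lockss/c1"))             # True
print(needs_reindex("/lockss/c1;/lockss/c2", "/lockss/c1"))                        # True
```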

4.7.3. Log Storage Area

Prompt: Root path for log storage

This directory is used as the root of the storage area for log files in the LOCKSS system. Accept the default (same directory as the content data storage directory root) by hitting Enter, or enter a custom path.

4.7.4. Temporary Storage Area

Prompt: Root path for temporary storage (local storage preferred)

This directory is used as the root of the storage area for temporary files in the LOCKSS system. Accept the default (same directory as the content data storage directory root) by hitting Enter, or enter a custom path.

4.8. Web User Interface Settings

  1. Prompt: User name for web UI administration

    Enter a username for the primary administrative user in the LOCKSS system's Web user interfaces.

  2. Prompt: Password for web UI administration user <uiuser>

    Enter a password for the primary administrative user.

  3. Prompt: Password for web UI administration user <uiuser> (again)

    Re-enter the password for the primary administrative user. If the two passwords do not match, you will be asked for the password again.

4.9. Database Settings

4.9.1. PostgreSQL

Prompt: Use embedded LOCKSS PostgreSQL DB Service?

Select one of the following two options:

  1. Enter Y to use the embedded PostgreSQL database. This is recommended in most cases; a PostgreSQL database will be run and managed by the LOCKSS system internally. If you choose this option, see Embedded PostgreSQL Database.

  2. Enter N to use an external PostgreSQL database. Select this option if you wish to use an existing PostgreSQL database at your institution or one that you run and manage yourself. If you choose this option, see External PostgreSQL Database.

4.9.1.1. Embedded PostgreSQL Database

If you select this option, you will be asked additional configuration questions:

  1. Prompt: Password for PostgreSQL database

    Enter the password for the embedded PostgreSQL database.

    Warning

    This prompt is used to record the PostgreSQL database password in the LOCKSS system's configuration. If you change the value of the PostgreSQL database password here without actually changing the PostgreSQL database password, the LOCKSS system components will no longer be able to connect to the PostgreSQL database. See Working with PostgreSQL for details.

  2. Prompt: Password for PostgreSQL database (again)

    Re-enter the password for the embedded PostgreSQL database. If the two passwords do not match, you will be asked for the password again.

  3. Complete the Solr section next.

4.9.1.2. External PostgreSQL Database

If you select this option, you will be asked additional configuration questions:

  1. Prompt: Fully qualified hostname (FQDN) of PostgreSQL host

    Enter the hostname of the external PostgreSQL database, for example postgres.myuniversity.edu.

  2. Prompt: Port used by PostgreSQL host

    Enter the port where the external PostgreSQL database can be reached, for example 5432.

  3. Prompt: Schema for PostgreSQL service

    Enter the schema name to be used by the LOCKSS system. The schema name used in the embedded PostgreSQL database is LOCKSS, but your database administrator may assign a different schema name to you.

  4. Prompt: Database name prefix for PostgreSQL service

    Enter the prefix to use for any LOCKSS-related database names in the schema. The database name prefix in the embedded PostgreSQL database is Lockss (note the uppercase/lowercase), but your database administrator may assign a different database name prefix.

  5. Prompt: Login name for PostgreSQL service

    Enter the username for the external PostgreSQL database. The username in the embedded PostgreSQL database is LOCKSS, but your database administrator may assign a different username to you.

  6. Prompt: Password for PostgreSQL database

    Enter the password for the username in the external PostgreSQL database.

    Warning

    This prompt is used to record the PostgreSQL database password in the LOCKSS system's configuration. If you change the value of the PostgreSQL database password here without actually changing the PostgreSQL database password, the LOCKSS system components will no longer be able to connect to the PostgreSQL database. Contact your PostgreSQL database administrator for details.

  7. Prompt: Password for PostgreSQL database (again)

    Re-enter the password for the username in the external PostgreSQL database. If the two passwords do not match, you will be asked for the password again.

  8. Complete the Solr section next.
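
Before entering the details of an external database, it can be useful to confirm that its host and port are reachable from this machine. The following is a minimal sketch using a plain TCP connection; the function name can_connect is hypothetical, and a successful connection shows only that something is listening on the port, not that the credentials, schema, or database names are correct.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_connect("postgres.myuniversity.edu", 5432) would test the example host and port used above. The same check applies to an external Solr host and port.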

4.9.2. Solr

Prompt: Use embedded LOCKSS Solr Service?

Select one of the following two options:

  1. Enter Y to use the embedded Solr database. This is recommended in most cases; a Solr database will be run and managed by the LOCKSS system internally. If you choose this option, see Embedded Solr Database.

  2. Enter N to use an external Solr database. Select this option if you wish to use an existing Solr database at your institution or one that you run and manage yourself. If you choose this option, see External Solr Database.

4.9.2.1. Embedded Solr Database

If you select this option, you will be asked additional configuration questions:

  1. Prompt: User name for LOCKSS Solr access

    Enter the username for the embedded Solr database.

  2. Prompt: Password for LOCKSS Solr access

    Enter the password for the username in the embedded Solr database.

  3. Prompt: Password for LOCKSS Solr access (again)

    Re-enter the password for the username in the embedded Solr database. If the two passwords do not match, you will be asked for the password again.

  4. Complete the LOCKSS Services section next.

4.9.2.2. External Solr Database

If you select this option, you will be asked additional configuration questions:

  1. Prompt: Fully qualified hostname (FQDN) of Solr host

    Enter the hostname of the external Solr database server, for example solr.myuniversity.edu.

  2. Prompt: Port used by Solr host:

    Enter the port used by the database server on the Solr host, for example 8983.

  3. Prompt: Solr core repo name:

    Enter the name of the Solr core for the LOCKSS repository. The Solr core name used in the embedded Solr database is lockss-repo, but your database administrator may assign a different Solr core name.

  4. Prompt: User name for LOCKSS Solr access

    Enter the username for the external Solr database.

  5. Prompt: Password for LOCKSS Solr access

    Enter the password for the username in the external Solr database.

  6. Prompt: Password for LOCKSS Solr access (again)

    Re-enter the password for the username in the external Solr database. If the two passwords do not match, you will be asked for the password again.

  7. Complete the LOCKSS Services section next.

4.10. LOCKSS Services

4.10.1. Crawler Service

  1. Prompt: Use LOCKSS Crawler Service?

    Enter Y if you want the crawler service to be run, otherwise N. (The only situation where a crawler service is not needed is a LOCKSS network that exclusively uses direct deposit to store content; most LOCKSS networks need the crawler service.)

  2. If you answered Y to the previous question, you will be asked additional configuration questions:

    1. Prompt: Enable classic LOCKSS crawler?

      Enter Y if you want to run the classic LOCKSS crawler, otherwise N. (Most LOCKSS networks using the crawler service use the classic LOCKSS crawler.)

    2. Prompt: Enable Wget crawler?

      Enter Y if you want to enable the usage of the external Wget crawler, otherwise N.

4.10.2. Metadata Query Service

Prompt: Use LOCKSS Metadata Query Service?

Enter Y if you want the metadata query service to be run, otherwise N.

4.10.3. Metadata Extraction Service

Prompt: Use LOCKSS Metadata Extraction Service?

Enter Y if you want the metadata extraction service to be run, otherwise N.

4.10.4. SOAP Compatibility Service

Prompt: Use LOCKSS SOAP Compatibility Service?

Enter Y if you want the SOAP compatibility service to be run, otherwise N. (This is only needed if you have external tools that use the legacy LOCKSS SOAP Web Services.)

4.11. Web Replay Settings

4.11.1. Pywb

Prompt: Use LOCKSS Pywb Service?

Enter Y to run an embedded Pywb engine for Web replay; otherwise, enter N.

4.11.2. OpenWayback

  1. Prompt: Use LOCKSS OpenWayback Service?

    Enter Y to use an embedded OpenWayback engine for Web replay; otherwise, enter N.

  2. If you answered Y, you will be asked an additional configuration question:

    Prompt: Okay to turn off authentication for read-only requests for LOCKSS Repository Service?

    OpenWayback currently does not supply user credentials when reading content from the LOCKSS repository, so the repository must be configured to respond to unauthenticated read requests. Enter Y to accept this. If you enter N, you will see the warning:

    Not enabling OpenWayback Service

    and OpenWayback will not be run.

4.12. Final Steps

  1. Prompt: OK to store this configuration?

    Enter Y if the configuration values are to your liking; otherwise, enter N to make edits.

  2. If you answer Y, configure-lockss will perform the final configuration steps. You may be asked to confirm before directories are created for the first time:

    <directory> does not exist; shall I create it?

    or before directory permissions are changed:

    <directory> is not writable; shall I chown it?

    In each case, enter Y for "yes" and N for "no".


Footnotes

[1] See Running Commands as the lockss User.