CKAN: Installation and configuration

CKAN Ansible Deployment

Clone this repository to your local machine, then edit the variables in playbook/host_vars/production_01.yml:

git clone https://github.com/mjanez/ckan-ansible.git && cd ckan-ansible

Playbook config

Change the CKAN configuration variables before running the Ansible playbook. In particular, set the host user and password information (ansible_user, ansible_password, etc.) and the CKAN sysadmin settings: ckan_sysadmin_name, ckan_sysadmin_password and ckan_sysadmin_email. Also set proxy_server_name and nginx_port so that the deployment is served correctly.

Edit the host variables in the inventory folder and add the IP addresses or hostnames of the target deployment servers for the specific environment.

Customize the deployment configurations in host_vars/* to match your requirements. Modify any necessary variables such as database credentials, CKAN versions, and other specific settings.
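
Before running the playbook, it can help to confirm that none of the variables named above were left out. The following is a minimal sketch, not part of ckan-ansible: the check_host_vars function is hypothetical, and it assumes each variable appears as a top-level "name:" key at the start of a line in the host_vars file.

```shell
# Sketch: report which expected variables are missing from a host_vars file.
# Assumption: every variable is a top-level YAML key written as "name: value".

# Variable names mentioned in this guide.
REQUIRED_VARS="ansible_user ckan_sysadmin_name ckan_sysadmin_password ckan_sysadmin_email proxy_server_name nginx_port"

# check_host_vars FILE
# Prints one "MISSING: <name>" line per absent variable.
# Returns 0 when every variable is present, 1 otherwise.
check_host_vars() {
    chv_file="$1"
    chv_missing=0
    for chv_name in $REQUIRED_VARS; do
        if ! grep -Eq "^${chv_name}[[:space:]]*:" "$chv_file"; then
            echo "MISSING: ${chv_name}"
            chv_missing=1
        fi
    done
    return "$chv_missing"
}

# Example usage (path taken from this guide):
#   check_host_vars playbook/host_vars/production_01.yml
```

The check only looks for the presence of each key, not for a sensible value, so placeholder passwords still need to be reviewed by hand.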

SSH authentication

If you use SSH authentication for private repositories, create an SSH key pair and copy both keys to ./playbook/roles/common/files/keys. The filenames of the key pair must begin with id_ (e.g. id_rsa and id_rsa.pub).
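
One way to create such a key pair is sketched below. This is an illustration under assumptions: the helper names generate_deploy_key and list_bad_key_names are hypothetical, and an RSA key with an empty passphrase is assumed to be acceptable for your environment.

```shell
# Sketch: create a key pair whose filenames start with "id_", and check a
# keys directory for files that do not follow that naming rule.

KEYS_DIR="./playbook/roles/common/files/keys"   # path from this guide

# generate_deploy_key DIR -> writes DIR/id_rsa and DIR/id_rsa.pub
# Assumption: a 4096-bit RSA key without a passphrase suits your setup.
generate_deploy_key() {
    mkdir -p "$1"
    ssh-keygen -t rsa -b 4096 -N "" -C "ckan-ansible deploy key" -f "$1/id_rsa"
}

# list_bad_key_names DIR -> prints files in DIR whose name does not begin with "id_"
list_bad_key_names() {
    for lbk_path in "$1"/*; do
        [ -e "$lbk_path" ] || continue
        lbk_name=$(basename "$lbk_path")
        case "$lbk_name" in
            id_*) ;;                    # conforms to the naming rule
            *) echo "$lbk_name" ;;      # would not be picked up as a key
        esac
    done
}

# Example usage:
#   generate_deploy_key "$KEYS_DIR"
#   list_bad_key_names "$KEYS_DIR"
```

The public key (id_rsa.pub) still has to be registered with the Git hosting service that holds the private repositories.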

Sample deployment

  1. Select the environment you want to deploy, e.g. rhel.

  2. Edit playbook/host_vars/production_01.yml. If you are not using password authentication (ansible_ssh_pass), set ansible_ssh_private_key_file to the path of the SSH private key.

  3. Run the Ansible playbook to deploy CKAN on the target server using the playbook configuration. Add the -vvv flag for verbose output:

    # Point ANSIBLE_CONFIG at the ansible.cfg file. Use ONE of the two exports below.

    # Option 1: run from the root of the cloned repository
    export ANSIBLE_CONFIG=$(pwd)/playbook/ansible.cfg

    # Option 2: ckan-ansible is cloned in the home directory
    export ANSIBLE_CONFIG=$HOME/ckan-ansible/playbook/ansible.cfg

    # Run the Ansible playbook (append -vvv for verbose output)
    ansible-playbook $HOME/ckan-ansible/playbook/playbook.yml

    The ANSIBLE_CONFIG environment variable specifies the location of the ansible.cfg file. This is useful when you maintain several Ansible configurations and want to choose which one to use, e.g. rhel-9, ubuntu-20.04, etc.
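
Before a real run, the playbook can be validated with Ansible's standard --syntax-check and --check options. The sketch below is an illustration: the playbook_dir_ok and preflight helpers are hypothetical names, and the clone location is assumed to be $HOME/ckan-ansible as in the commands above.

```shell
# Sketch: validate the playbook before deploying.

# playbook_dir_ok DIR -> succeeds only if DIR holds both ansible.cfg and playbook.yml
playbook_dir_ok() {
    [ -f "$1/ansible.cfg" ] && [ -f "$1/playbook.yml" ]
}

# preflight DIR -> syntax check, then a dry run (check mode) with a diff of
# the changes that would be made; nothing is modified on the target server.
preflight() {
    playbook_dir_ok "$1" || {
        echo "ansible.cfg or playbook.yml not found in $1" >&2
        return 1
    }
    export ANSIBLE_CONFIG="$1/ansible.cfg"
    ansible-playbook "$1/playbook.yml" --syntax-check &&
    ansible-playbook "$1/playbook.yml" --check --diff
}

# Example usage (assumed clone location):
#   preflight "$HOME/ckan-ansible/playbook"
```

Check mode is only an approximation: tasks that depend on changes made by earlier tasks may report failures during a dry run even though a real run would succeed. Running ansible --version also prints the config file in use, which confirms that ANSIBLE_CONFIG was picked up.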

host_vars

The */host_vars/*.yml files contain customizable configuration variables for the deployment, including database credentials, CKAN version, and web server configuration. Review and modify these before running the Ansible playbook.

Check the services are running

After the deployment, you can check the status of the services using the supervisorctl command. This command provides a command-line interface to the Supervisor process control system, which allows you to control and monitor your services.

The services generated by ckan-ansible include:

  • ckan, an open-source DMS (data management system) for powering data hubs and data portals, which is served via uWSGI and NGINX.
  • ckan-pycsw, a full-featured web service for cataloging geospatial data, designed for interoperability with CKAN-based open data portals. To do this, ckan2pycsw reads data from a CKAN instance through the CKAN API, generates INSPIRE ISO-19115/ISO-19139 [^3] metadata using pygeometa (or another custom schema), and populates a pycsw instance that exposes the metadata using CSW and OAI-PMH.
  • ckan-xloader, a worker for quickly loading data into the DataStore. A replacement for DataPusher.
  • Workers used for remote harvesting. The ckan_harvester_gather worker identifies the remote resources that need to be harvested, and the ckan_harvester_fetch worker retrieves the data from the remote servers and prepares it for the ckan_harvester_run worker, which runs the harvesters periodically. Finally, the ckan_harvester_clean_log worker periodically cleans the harvester logs.

To check the status of these services, you can use the supervisorctl status command. Here's an example:

$ supervisorctl status
ckan_fetch_consumer           RUNNING   pid 2684195, uptime 0:00:50
ckan_gather_consumer          RUNNING   pid 2684193, uptime 0:00:50
ckan_harvester_clean_log      STOPPED   Not started
ckan_harvester_run            EXITED    May 07 01:12 PM
ckan_pycsw                    RUNNING   pid 2684197, uptime 0:00:50
ckan_uwsgi:ckan_uwsgi-00      RUNNING   pid 2684194, uptime 0:00:50
ckan_xloader:ckan_xloader-00  RUNNING   pid 2684198, uptime 0:00:50
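
To spot services that need attention without reading the whole listing, the status output can be filtered. This is a minimal sketch; the not_running helper is a hypothetical name, and it assumes the first two columns of the output are the program name and its state, as in the example above.

```shell
# Sketch: list every Supervisor program whose state is not RUNNING.

# not_running: reads `supervisorctl status` output on stdin and prints
# "<name> <state>" for each program in any other state.
not_running() {
    awk 'NF >= 2 && $2 != "RUNNING" { print $1, $2 }'
}

# Example usage (sudo may be required, depending on the Supervisor setup):
#   supervisorctl status | not_running
```

Applied to the example output, this prints ckan_harvester_clean_log STOPPED and ckan_harvester_run EXITED. A STOPPED or EXITED state is not necessarily an error; it may be normal for jobs that only run periodically.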