Linux: Workplace AI Install

This guide walks you through installing Workplace AI on a Linux server. It should take around 2 hours to complete.

Install Apache

Note: If the server is hosting the apps and APIs, Apache Web Server must be deployed.

Ubuntu

To deploy Apache Web Server to host the apps:

  1. If needed, update the system repo index:

sudo apt update

  2. Install Apache2:

sudo apt install apache2

When prompted, enter Y, then press Enter to confirm the installation.

  3. Verify the installation of Apache2:

apache2 -v

  4. Configure the firewall. List the application profiles for Apache:

sudo ufw app list

Use the Apache profile to enable network connectivity on port 80:

sudo ufw allow 'Apache'

Check the status:

sudo ufw status

  5. Check the Apache service is operational:

sudo systemctl status apache2

Red Hat

To install:

  1. Use the dnf command to install the httpd package:

dnf install httpd

  2. Remove the welcome site by commenting out the contents of:

/etc/httpd/conf.d/welcome.conf

Remove the default VirtualHost entry by commenting out everything between and including the <VirtualHost> and </VirtualHost> tags in:

/etc/httpd/conf.d/ssl.conf

Note: Use vi or nano to add a # at the start of every line that does not already begin with one.
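
If you would rather script the ssl.conf edit than make it by hand, the following is a minimal sketch. It assumes GNU sed and that the default entry is wrapped in <VirtualHost ...> and </VirtualHost> tags; comment_out_virtualhost is a hypothetical helper name, not part of the product.

```shell
# Sketch only: comment_out_virtualhost is a hypothetical helper name.
# Assumes GNU sed (for -i) and that the default entry is wrapped in
# <VirtualHost ...> and </VirtualHost> tags.
comment_out_virtualhost() {
    conf_file="$1"

    # Keep a backup so the original file can be restored.
    cp -p "$conf_file" "$conf_file.bak" || return 1

    # Within the VirtualHost block, prefix every line that does not
    # already start with # with a #. Empty lines are left untouched.
    sed -i '/^[[:space:]]*<VirtualHost/,/^[[:space:]]*<\/VirtualHost>/ s/^[^#]/#&/' "$conf_file"
}
```

From a root shell this could be called as comment_out_virtualhost /etc/httpd/conf.d/ssl.conf. Review the result before restarting httpd.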

  3. As of Red Hat 8, the environment file is no longer automatically referenced in the httpd service file, so you will see errors on the lines that use environment variables. Until the approach changes, add the following line back in beneath the Environment line in /etc/systemd/system/multi-user.target.wants/httpd.service:

EnvironmentFile=/etc/sysconfig/httpd

Then reload and restart:

systemctl daemon-reload
systemctl restart httpd
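
The EnvironmentFile edit can also be scripted so that it is safe to repeat. This is a sketch under assumptions: add_environment_file is a hypothetical helper name, and the unit file is assumed to contain at least one line starting with Environment=. If no such line exists, the file is left unchanged.

```shell
# Sketch only: add_environment_file is a hypothetical helper name.
# It inserts the EnvironmentFile line beneath the first Environment= line
# of a unit file, and does nothing if the line is already present.
add_environment_file() {
    unit_file="$1"
    env_line="EnvironmentFile=/etc/sysconfig/httpd"

    # Skip the edit when the exact line already exists.
    if grep -qxF "$env_line" "$unit_file"; then
        return 0
    fi

    tmp_file="$(mktemp)" || return 1
    awk -v line="$env_line" '
        { print }
        /^Environment=/ && !added { print line; added = 1 }
    ' "$unit_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }

    # Write through the existing path so a symlinked unit file stays a symlink.
    cat "$tmp_file" > "$unit_file"
    rm -f "$tmp_file"
}
```

After calling the helper with the path to the httpd service file, run the daemon-reload and restart commands as usual.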
  4. Enable the Apache web server to start after reboot:

systemctl enable httpd

  5. Start the Apache web service:

systemctl start httpd

  6. View the installed Apache version details:

httpd -v

  7. If the default Red Hat firewall is enabled and you did not configure these ports earlier (or are on a separate server to Elasticsearch), these commands should allow inbound HTTP and HTTPS traffic:

firewall-cmd --zone=public --add-port=80/tcp --permanent
firewall-cmd --zone=public --add-port=443/tcp --permanent
firewall-cmd --reload

Note: The Workplace AI httpd configuration will redirect all HTTP traffic to HTTPS.

  8. Check that mod_ssl.so is present in /etc/httpd/modules. If it is not, run:

dnf install httpd mod_ssl -y

Install .NET 8

Ubuntu

  1. Install the runtime:

sudo apt update && \
sudo apt install -y aspnetcore-runtime-8.0

Note: If you wish to do any development on the server, you may want to install the entire SDK instead (which includes the runtime), but this isn’t necessary just to run AIE components:

sudo apt update && \
sudo apt install -y dotnet-sdk-8.0

  2. Confirm the .NET versions installed:

dotnet --list-sdks
dotnet --list-runtimes
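
If you want to check the result in a script rather than by eye, a minimal sketch follows. It assumes the runtime list uses the usual "Microsoft.AspNetCore.App <version> [path]" line format; has_aspnetcore_8 is a hypothetical helper name.

```shell
# Sketch only: has_aspnetcore_8 is a hypothetical helper name.
# It reads `dotnet --list-runtimes` output on stdin and succeeds when an
# ASP.NET Core 8.x runtime is listed.
has_aspnetcore_8() {
    grep -q '^Microsoft\.AspNetCore\.App 8\.'
}

# Only query dotnet when it is actually on the PATH.
if command -v dotnet >/dev/null 2>&1; then
    if dotnet --list-runtimes | has_aspnetcore_8; then
        echo "ASP.NET Core 8 runtime found"
    else
        echo "ASP.NET Core 8 runtime not found"
    fi
fi
```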

Red Hat

  1. Install .NET 8.0 and all its dependencies:

Note: Depending on your Red Hat version, you may need a licensed version of Red Hat to install .NET 8.0.

  2. If you wish to do any development on the server, install the full SDK, which also includes the runtime:

  3. Confirm the .NET versions installed:

For .NET 10 this will look like this:

Important: Updating the dotnet package while AIE is running can cause issues loading system DLLs.

Always restart all AIE components after dotnet is updated.
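
The Red Hat commands are not shown in the steps above. As an assumption, the sketch below uses the same package names as the Ubuntu steps (aspnetcore-runtime-8.0 and dotnet-sdk-8.0), which may or may not be available from the repositories enabled on your system. install_dotnet8_rhel is a hypothetical helper name and expects a root shell.

```shell
# Sketch only: install_dotnet8_rhel is a hypothetical helper name.
# Assumes a root shell and that the aspnetcore-runtime-8.0 / dotnet-sdk-8.0
# packages are available from the enabled Red Hat repositories.
install_dotnet8_rhel() {
    case "${1:-runtime}" in
        runtime) dnf install -y aspnetcore-runtime-8.0 ;;
        sdk)     dnf install -y dotnet-sdk-8.0 ;;
        *)       echo "usage: install_dotnet8_rhel [runtime|sdk]" >&2; return 2 ;;
    esac
}
```

Calling install_dotnet8_rhel with no argument installs the runtime only, and install_dotnet8_rhel sdk installs the full SDK. Either way, dotnet --list-sdks and dotnet --list-runtimes can then be used to confirm what was installed.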


Install Java

If you are installing on the same server as Elasticsearch, the Java configuration from earlier (using the bundled JDK) should allow Tika to run, though complications around Tika forking processes might make the service fail. There may be a better solution, but this can be resolved by adding the following to /etc/insightmaker/insightmaker.rc once that file is in place.

If you are installing on a different server to Elasticsearch (or just want to use a separate JDK that receives updates), the above is not needed. A regular install from your package manager will work without this tweak.

Note: The ‘headless’ packages can be used to save space, since GUI functionality is not required.

Ubuntu

You should only require the runtime:

Alternatively, if you require the full JDK (for development purposes, etc.):

Red Hat

Deploying Workplace AI files

Note: If you prefer to use a GUI, you can use WinSCP to transfer files onto the Linux machine. However, the regular Windows terminal has SSH and SCP capabilities built in. On a client install, you may require support from their infrastructure team if you can’t connect directly.

If not already installed, install the unzip utility from the appropriate package manager.

The steps below assume the following:

  1. You are using /data as a temporary working area.

  2. You have downloaded the Aiimi Insight Engine zip files to /data/download.

Once the installation is complete the files and folders in /data can be removed.

All installation commands must be executed from a root shell. You can log in as root or run the following command as a non-root user:

You will need the root password and appropriate permissions.

Ubuntu

Install the unzip utility:

Red Hat

Unzip the files:

This will create the folder /data/insightmaker with the following top-level folders:

  • Apps

  • bin

  • ContentAgent

  • EnrichmentAgent

  • JobAgent

  • MigrationAgent

  • OcrAgent

  • Plugins

  • Plugins-Optional

  • scripts

  • SecurityAgent

  • SourceAgent

  • Tika

  • Utils

There are the following folders in /data/insightmaker/Apps:

  • Admin

  • AIStudio

  • DataScience

  • OData

  • Search

  • SharePoint


Running the install script

  1. Run the following commands:

The file /etc/insightmaker/insightmaker.rc contains the installation and runtime settings for Aiimi Insight Engine. This should be edited to match the environment. Each setting has a description in the file.

For more information on the insightmaker.rc file, see the InsightMaker.RC section.

Important: Local copies of the certificates generated during the Elastic installation (likely elastic-certificates.p12 and elastic-stack-ca.p12) are also needed. By default these are expected in /etc/insightmaker, but this can be configured in the insightmaker.rc file.

If the server is hosting the web apps, a public certificate and associated private key are needed. Details can be found in insightmaker.rc. The defaults are im-public-certificate.cer and im-public-certificate.key, expected in /etc/insightmaker unless configured differently. How you provision these will depend on your environment: you may be able to use a third-party service to generate valid signed certificates, or use locally generated self-signed ones, which will cause browsers to warn about an insecure site. In a client environment, you will likely have an internal CA and a process to follow for generating new certificates. There may be additional considerations if a load balancer is in use.

The file /data/insightmaker/scripts/im-install.sh performs the installation. Help is displayed as follows:

  2. To perform a basic install, run without any options:

This will apply basic system configuration and create the necessary accounts, copy the installation files and adjust the permissions.

Note: If passwords have not been set in /etc/insightmaker/insightmaker.rc, you will be prompted for them during install.

  3. If this is the first server in the environment, the next step is to initialise the Aiimi Insight Engine indices in Elastic:

Note: This is only needed once per environment.

  4. If this server is hosting agents, then run the following:

  5. If this server is hosting apps, then run the following:

This will also configure Apache HTTPD.

Note: The above commands (steps 2-5) can be combined and run at once:

This does not initialise the system, so on the first server include --initialise:

Start all installed components with:

Where $IM_ROOT is the installation location specified in /etc/insightmaker/insightmaker.rc.

Either source the rc file or export it manually.
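
As a sketch of the "source the rc file" option, and assuming insightmaker.rc is a shell-style file that assigns IM_ROOT, something like the following could be used. load_im_root is a hypothetical helper name.

```shell
# Sketch only: load_im_root is a hypothetical helper name.
# Assumes insightmaker.rc is a shell-style file that assigns IM_ROOT.
load_im_root() {
    rc_file="${1:-/etc/insightmaker/insightmaker.rc}"

    if [ ! -r "$rc_file" ]; then
        echo "cannot read $rc_file" >&2
        return 1
    fi

    # Source the settings into the current shell, then export IM_ROOT
    # so that child processes (such as scripts) can see it.
    . "$rc_file"
    export IM_ROOT
}
```

Sourcing runs the file as shell code in the current session, so only do this with an rc file you trust. The manual alternative is to export IM_ROOT yourself with the value from the rc file.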


Installing Apryse (PDFTron)

Apryse (PDFTron) WebViewer on Linux runs in a Docker container. The standard Linux deployment uses two images, pdftron and wv-loadbalancer. Some client systems will not have a direct internet connection. In this scenario we provide the images in zip files that can be loaded offline using the /opt/insightmaker/pdftron-zip-to-image.sh script.

Ubuntu

The pdftron service that is installed references pdftron-compose-aiimi.yml in the ‘scripts’ folder. You will need to populate the PDFTron licence in the appropriate section.

If you encounter networking or performance issues with the default setup, you might find improvements by using extra_hosts and host-gateway in the compose file. This is detailed in the troubleshooting section later in the document.

If you are seeing errors when trying to preview, try the following command to see if more information is available:

Note: See the troubleshooting section for common errors encountered here.

Red Hat

The default Red Hat container tool is podman, which can be installed with the following commands:

This may require the EPEL repo to be enabled on your Red Hat system:

The following is required if you wish to use docker commands for familiarity, or if you are using the helper shell scripts to load images offline:


MSTT Core Fonts

Without these fonts, text in previews may look unattractive and features like PDF generation (e.g. export of data records) may fail.

This involves building and installing the rpm package on the server, and requires the packages below:

Then run the following commands:

Managing Workplace AI

Reconfiguring

The runtime settings in /etc/insightmaker/insightmaker.rc can be updated after install and applied to the system by running:

This will prompt you to restart the services. If this is not done, the settings will not take immediate effect.

Process Management

The individual components can be managed via systemctl. The services are:

Component            Systemd service name
Admin API            im-admin-api.service
Content Agent        im-content-agent.service
Data Science API     im-ds-api.service
Enrichment Agent     im-enrichment-agent.service
Job Agent            im-job-agent.service
Migration Agent      im-migration-agent.service
OCR Agent            im-ocr-agent.service
PDFTron API          im-pdftron-api.service
Search API           im-search-api.service
Security Agent       im-security-agent.service
Source Agent         im-source-agent.service
Tika                 im-tika-agent.service
Apache Web Server    httpd or apache2

The im-manage.sh script provides a way to bulk manage the services:

This will loop over the installed services and execute the equivalent systemd command.
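
To illustrate the idea of looping over the services, here is a minimal sketch. It is not the im-manage.sh script itself: IM_SERVICES and im_status_all are hypothetical names, and the unit names are taken from the table above, written in lowercase.

```shell
# Sketch only: this is not the im-manage.sh script. IM_SERVICES and
# im_status_all are hypothetical names; the unit names come from the table above.
IM_SERVICES="im-admin-api im-content-agent im-ds-api im-enrichment-agent \
im-job-agent im-migration-agent im-ocr-agent im-pdftron-api im-search-api \
im-security-agent im-source-agent im-tika-agent"

# Print one line per service with the state reported by systemd.
im_status_all() {
    for svc in $IM_SERVICES; do
        # is-active exits non-zero for stopped units, so ignore the exit code.
        state="$(systemctl is-active "$svc.service" 2>/dev/null)" || true
        echo "$svc: ${state:-unknown}"
    done
}
```

Running im_status_all prints one "name: state" line per service. Services that are not installed on a given server will typically be reported as inactive.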

Checking the Search App is working

Navigate to the host to check the Workplace AI Search app is loading. A login screen is displayed requesting user credentials.

Checking the Control Hub is working

Navigate to the host to check the Workplace AI Control Hub is loading. A login screen is displayed requesting user credentials.

Log in and confirm you are able to access Control Hub.

Note: Log in as the Elastic user and use the password specified during the Elastic setup.


File System Crawls

If you are planning to run a file system crawl from Linux, you’ll need to install the CIFS utilities.
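
The install command is not shown here. As an assumption, the package is called cifs-utils on both Ubuntu and Red Hat, so a sketch covering both could look like this. install_cifs_utils is a hypothetical helper name, and the Red Hat branch expects a root shell, in line with the other dnf commands in this guide.

```shell
# Sketch only: install_cifs_utils is a hypothetical helper name.
# Assumes the package is called cifs-utils on both distributions.
install_cifs_utils() {
    case "$1" in
        ubuntu) sudo apt install -y cifs-utils ;;
        redhat) dnf install -y cifs-utils ;;   # run from a root shell
        *)      echo "usage: install_cifs_utils ubuntu|redhat" >&2; return 2 ;;
    esac
}
```

Call it with the distribution name, for example install_cifs_utils ubuntu.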
