The source agent provides two principal services:
To crawl and discover content in data and document systems, which we often call ‘source systems’.
To fetch data and documents from those systems for the enrichment service and end-user functions.
The source agent is built around the concept of a source plugin. A source plugin provides the implementation for a back-end data or document store such as SharePoint, network file shares or SAP. Aiimi provides a range of source plugins and is constantly adding new ones. Aiimi, our customers and our partners can write source plugins using the Microsoft .NET framework.
Source plugins can bulk load information from back-end systems, perform deltas and store the current crawl state. Because the state is stored, a crawl can be suspended and restarted later, picking up where it left off.
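The suspend-and-resume behaviour can be illustrated with a minimal sketch. Real source plugins are written on .NET; the Python below is only an illustration of the idea, and every name in it (CrawlState, crawl, last_key, limit) is hypothetical rather than part of the Aiimi plugin API. It assumes items can be visited in a stable order and that a changed modified timestamp marks an item as part of the delta.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CrawlState:
    """Hypothetical checkpoint: the last key visited and each item's known modified time."""
    last_key: Optional[str] = None
    seen: Dict[str, float] = field(default_factory=dict)


def crawl(items: Dict[str, float], state: CrawlState,
          limit: Optional[int] = None) -> List[str]:
    """Visit items in key order, resuming after state.last_key.

    items maps an item key (for example a path) to its modified timestamp.
    Only new or changed items are returned (the delta).
    limit simulates a suspension after that many items have been visited.
    """
    emitted: List[str] = []
    visited = 0
    for key in sorted(items):
        if state.last_key is not None and key <= state.last_key:
            continue  # already handled before the crawl was suspended
        if limit is not None and visited >= limit:
            break  # suspended: state.last_key keeps our position
        visited += 1
        if state.seen.get(key) != items[key]:
            emitted.append(key)  # new or modified since the last crawl
            state.seen[key] = items[key]
        state.last_key = key
    else:
        # The loop ran to completion, so the next crawl starts from the top.
        state.last_key = None
    return emitted
```

Calling crawl with a limit and then again without one, passing the same CrawlState both times, processes the remaining items only. A later call over the same source returns just the items whose timestamps have changed.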
Source plugins extract a set of metadata about each item they discover, such as its name, location, and created and modified dates. They also extract the item's access control list. Aiimi Insight Engine uses this list to control (authorise) who has access to the data and documents.
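As a rough illustration of how extracted metadata and an access control list might drive authorisation, consider the sketch below. It is not the Aiimi data model: ItemMetadata, allowed_principals and can_read are hypothetical names, and the ACL is simplified to a flat set of users and groups with read access (real ACLs often carry deny entries and nested groups).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet


@dataclass(frozen=True)
class ItemMetadata:
    """Hypothetical record of what a plugin extracts for one item."""
    name: str
    location: str
    created: datetime
    modified: datetime
    allowed_principals: FrozenSet[str]  # simplified ACL: users or groups with read access


def can_read(item: ItemMetadata, user: str, groups: FrozenSet[str]) -> bool:
    """Return True if the user, or any group they belong to, is on the item's ACL."""
    return (user in item.allowed_principals
            or not groups.isdisjoint(item.allowed_principals))
```

A user who is named on the list, or who belongs to a listed group, passes the check; everyone else is refused.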
Source configurations consist of source-system-specific settings and a series of common settings. Each source runs on a schedule that is configurable through the Control Hub app. By its nature, the source agent is not CPU or memory intensive and does not require any disk space (other than for logging). Bandwidth and network latency are likely to affect how long crawling takes. Ideally, source agents are located close to the data sources in network terms, so that latency is low.
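A back-of-envelope model shows why network placement matters. The function below is an assumption made for illustration, not a formula from the product: it treats requests as strictly sequential, charges one round trip per request, and ignores parallelism, throttling and source-side processing time.

```python
def estimate_crawl_seconds(item_count: int,
                           total_bytes: int,
                           round_trip_s: float,
                           bandwidth_bytes_per_s: float,
                           requests_per_item: int = 1) -> float:
    """Rough crawl duration: per-request latency cost plus bulk transfer time."""
    latency_cost = item_count * requests_per_item * round_trip_s
    transfer_cost = total_bytes / bandwidth_bytes_per_s
    return latency_cost + transfer_cost
```

Under these assumptions, 100,000 items totalling 1 GB over a 100 MB/s link take roughly 110 seconds at a 1 ms round trip but roughly 5,010 seconds at 50 ms. With many small items, latency dominates over bandwidth.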
It is common for source agents to share a server with enrichment agents and the other supporting agents. Source and enrichment agents work in pairs: the source agent finds content and the enrichment agent then performs its tasks on that content. Specific configuration options are discussed in the scaling section.