REST Enrichment Service Install and Configuration

This page describes the REST Enrichment Service and the Python enrichment steps included with Aiimi Insight Engine.

The following enrichment steps are included:

  • Classification – This uses Aiimi’s clustering and classification framework to classify documents using a pre-trained model. Additional documentation for model training can be found in the InsightMaker.Python\DocumentClassification\docs folder.

  • Bert Chinese NER - Provides named entity recognition for Chinese text. It supports person, location and organisation classes.

  • Entity Mapper – Maps values found in one entity to synonyms of those values and stores the results in another entity.

  • Generative AI Prompt - Allows you to run large language models during enrichment. You can define prompts, which are then run over the text content of a file in Aiimi Insight Engine.

    • This works with the Model Server that hosts both private (Llama 2) and cloud-based (Azure OpenAI) LLMs.

  • HF Sentence Transformers - Uses the Sentence Transformers framework to generate sentence embeddings, which can be stored as dense vectors within Aiimi Insight Engine. These provide users with a semantic search experience.

  • HF Sparse Vector - Uses models running with the transformers framework to generate sparse vectors for files within Aiimi Insight Engine. These are stored as Rank Features within Aiimi Insight Engine and enable a search experience that can handle vocabulary mismatch.

  • Huggingface Named Entity Recognition – Extracts named entities from text or documents using statistical methods. This step is more accurate, but slower, than spaCy.

  • Language Detection – Can detect 54 different languages from text.

  • Phrase and Topic Detection – Extracts repeating phrases from a document or text that are said to be ‘left right complete’. When writing about a topic, people generally repeat the core concepts and topics several times. This step extracts these repeats from the text and creates a list of the core concepts, themes, and topics.

  • Sentiment - Assigns a sentiment label and score to an object stored within Aiimi Insight Engine.

  • Document Summaries – Creates a short multi-sentence summary of a document so users can quickly understand what the document is about. There are several algorithms provided, each with different merits.
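
As an illustration of the Entity Mapper idea in the list above, the following is a minimal sketch and not the product's implementation. It assumes entities are held as a plain dictionary of lists and that synonyms come from a lookup table keyed by lower-cased value; the function name map_entity, the entity names, and the synonym data are all hypothetical.

```python
from typing import Dict, List


def map_entity(
    entities: Dict[str, List[str]],
    source: str,
    target: str,
    synonyms: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Return a copy of `entities` with synonyms of the `source` values stored under `target`."""
    mapped: List[str] = []
    for value in entities.get(source, []):
        # Case-insensitive lookup; values with no known synonyms are skipped.
        for synonym in synonyms.get(value.lower(), []):
            if synonym not in mapped:
                mapped.append(synonym)
    enriched = dict(entities)  # leave the caller's dictionary untouched
    enriched[target] = mapped
    return enriched


# Hypothetical data: one source entity and a small synonym table.
entities = {"organisation": ["IBM", "Acme"]}
synonyms = {"ibm": ["International Business Machines", "Big Blue"]}

result = map_entity(entities, "organisation", "organisation_synonyms", synonyms)
print(result)
```

The source entity is left unchanged and the synonyms are written to a separate entity, which mirrors the "stores them in another entity" behaviour described in the list.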

Other services within the endpoints folder are alpha enrichment steps and are unsupported.
