AI Model Server API Wrapper
The AI Model Server Wrapper provides a convenient way, in Python, to use generative AI, extractive AI, sentence transformers, and document splitting hosted by the Aiimi Insight Engine AI Model Service.
Initialisation
When a ModelServer object is initialised, the details required to connect to the Aiimi Insight Engine Model Server are provided.
host
string
Host URI. If hosting the Model Server locally, you can typically use “http://localhost:15008/”
https
boolean
Optional flag indicating whether HTTPS is used (default False)
verify
boolean
Flag to verify HTTPS requests, defaults to True
system_secret
string
Aiimi Insight Engine system secret for verification. Defaults to “insightmaker”
ssl_context
SSLContext
Optionally, provide a Python SSLContext object for SSL-enabled model servers
encoding
string
String encoding used to decode the generative stream. Defaults to “utf-8” and is unlikely to require changing
Example
from aiimi_insight_engine.model_server import ModelServer
model_server = ModelServer("http://localhost:15008/")
SSL Example
from aiimi_insight_engine.model_server import ModelServer
from aiimi_insight_engine.core.certs import load_ca_from_p12, create_ssl_context
ssl_context = create_ssl_context(load_ca_from_p12("elastic-stack-ca.p12", "password123"))
model_server = ModelServer("https://myserver:15008/", ssl_context=ssl_context)
Generative AI
To use a generative AI model with the Model Server Wrapper, you need to produce a GenerativeModel object. The easiest way is via the get_generative_model method.
ModelServer.get_generative_model(model_type_id, model_id, settings=None, model_parameters=None)
model_type_id
string
Model service provider, e.g. “AzureOpenAIGenerative”
model_id
string
Model ID parameter for provider, e.g. “GPT 3.5 Turbo”
settings
dict
Optional, settings dictionary to pass to model server
model_parameters
dict
Optional, model parameters dictionary to pass to the model server. There is no need to specify “modelId” here, as it is added automatically from the model_id parameter.
Example
from aiimi_insight_engine.model_server import ModelServer
model_server = ModelServer("http://localhost:15008/")
generative = model_server.get_generative_model("AzureOpenAIGenerative", "GPT 4.1 Nano")
prompt = "What bird does Jack (from the document) think is the best?"
context = "Jack thinks chickens are the best bird."
print(generative.ask(prompt, context))
Generative Chat
For convenience, a chat object can be created from a generative model for longer conversations.
# For generative models, you can also create chat objects which remember a conversation
chat = generative.new_chat()
print(chat.ask("Hello!"))
print(chat.ask("What was the last thing you said to me?"))
# Chats can have context too
chat2 = generative.new_chat(context)
print(chat2.ask("What other types of bird might Jack like?"))
Extractive AI
Extractive AI models work in a similar way to generative AI models. They use an ExtractiveModel object which is generated using the get_extractive_model method.
ModelServer.get_extractive_model(model_type_id, model_id)
model_type_id
string
Model service provider, e.g. “HuggingfaceExtractive”
model_id
string
Model ID parameter for provider, e.g. “Tiny Roberta Squad2”
Example
from aiimi_insight_engine.model_server import ModelServer
model_server = ModelServer("http://localhost:15008/")
extractive = model_server.get_extractive_model("HuggingfaceExtractive", "Tiny Roberta Squad2")
prompt = "What bird does Jack (from the document) think is the best?"
context = "Jack thinks chickens are the best bird."
print(extractive.ask(prompt, context))
Outputs from extractive AI are dictionaries with all details of extracted answers contained. Example:
{
"extractiveAnswers":[
{
"answer":"chickens",
"chunkNumber":1,
"end":20,
"executionTime":0.169106,
"score":99.52,
"start":12,
"surroundingContext":"Jack thinks chickens are the best bird.",
"surroundingContextWithHighlights":"Jack thinks <b class=""highlight"">chickens</b> are the best bird."
}
],
"warnings":[]
}
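A common follow-up is to pull out just the highest-scoring answer from this response. The helper below is a minimal sketch that assumes the response shape shown above (it is not part of the wrapper itself):

```python
# Select the highest-scoring extracted answer from a Model Server response.
# Assumes the response dictionary shape shown in the example above.
def best_answer(response):
    answers = response.get("extractiveAnswers", [])
    if not answers:
        return None
    return max(answers, key=lambda a: a["score"])["answer"]

response = {
    "extractiveAnswers": [
        {"answer": "chickens", "score": 99.52, "start": 12, "end": 20},
        {"answer": "bird", "score": 12.30, "start": 34, "end": 38},
    ],
    "warnings": [],
}
print(best_answer(response))  # chickens
```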
Sentence Transformers
The model server wrapper allows for the execution of sentence transformer models to vectorise strings. This can be useful for data science work, as the transformers available are the same ones Aiimi Insight Engine uses for document vectorisation, semantic search, and so on.
Example
st_model = model_server.get_sentence_transformer("HuggingfaceSentenceTransformers", "All-mpnet-base-v2 - 768 Dimensions")
print(st_model.transform("Hello World!"))
Example Output
{"executionTime": 0.423893, "vector": [ ... ]}
Document Splitting
The Aiimi Insight Engine model server's document splitting capability is also exposed by the wrapper. The main use case for this endpoint is testing custom document splitting endpoints.
Example
splitter = model_server.get_document_splitter("MaxCharactersSplitter", {"maxChars": 40})
print(splitter.split("This is a very long bit of text.\n There are separate sentences.\n So it will split smart.\n I hope."))
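The actual splitting is performed server-side by the Model Server. As a rough local illustration of the maxChars idea, the sketch below accumulates lines into chunks that stay within a character limit; it is not the real MaxCharactersSplitter implementation:

```python
# Rough local illustration of maxChars-style splitting: accumulate lines
# into chunks that stay within the character limit. This is NOT the actual
# MaxCharactersSplitter implementation, just a sketch of the idea.
def split_by_max_chars(text, max_chars):
    chunks, current = [], ""
    for line in text.split("\n"):
        line = line.strip()
        if current and len(current) + 1 + len(line) > max_chars:
            chunks.append(current)
            current = line
        else:
            current = f"{current} {line}".strip() if current else line
    if current:
        chunks.append(current)
    return chunks

text = ("This is a very long bit of text.\n There are separate sentences.\n"
        " So it will split smart.\n I hope.")
for chunk in split_by_max_chars(text, 40):
    print(chunk)
```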