Advanced Genie Flow concepts
With the tools described so far, one is already able to create sensible dialogues. But there are some nifty tricks to pull even more from the framework.
With the user_input and ai_extraction events, dialogues that go back and forth between the user and an LLM can be implemented. The sequence always looks something like this (a minimal code sketch follows the list):
1. Genie Flow sends an initial text to the user
2. The user sends their input as part of a user_input event
3. The LLM compiles a response and sends it as part of an ai_extraction event
4. Genie Flow sends that response to the user
5. Repeat from step 2, unless a final state has been reached
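A minimal sketch of this loop, expressed in the same GenieStateMachine style used throughout this document (the state names, state values and template paths are illustrative assumptions, not taken from an existing Genie):

from statemachine import State
from genie_flow.genie import GenieStateMachine


class SimpleDialogueMachine(GenieStateMachine):
    # STATES (hypothetical names and values)
    user_enters_query = State(value=100, initial=True)  # assuming the dialogue starts here
    ai_creates_response = State(value=200)

    # EVENTS AND TRANSITIONS
    # step 2: the user's input moves the dialogue to the LLM
    user_input = user_enters_query.to(ai_creates_response)
    # step 3: the LLM's response moves the dialogue back to the user
    ai_extraction = ai_creates_response.to(user_enters_query)

    # TEMPLATES (hypothetical template paths)
    templates = dict(
        user_enters_query="simple/user_input.jinja2",
        ai_creates_response="simple/ai_response.jinja2",
    )

In this sketch, every user_input event triggers an LLM call via the template attached to ai_creates_response, and the resulting ai_extraction event hands the response back to the user, closing the loop.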
Running LLM queries can be time-consuming. Also, the true power of LLMs comes into play when a prompt is split into multiple parts. For instance, for Step Back Prompting you want a first prompt like "take a step back and tell me about the general rules that apply to this problem", and then add the response to that step-back prompt as context to the original query. There are many more cases where you would want to string together a number of consecutive prompts.
Genie Flow has a number of advanced features that enable the programmer to do exactly that.
the advance event
When the intermediate results of a string of prompts need to be fed back to the user, the programmer can introduce transitions on an advance event. These events can be sent to the state machine to make it advance to the next state without receiving any new user input.
For example, look at the following summary from the Claims Genie code, which implements the full claims Genie Flow. Some interesting parts from that code are:
from statemachine import State

from genie_flow.genie import GenieStateMachine


class ClaimsMachine(GenieStateMachine):
    ...

    # STATES
    ai_extracts_information = State(value=200)
    user_views_start_of_generation = State(value=300)
    ai_extracts_categories = State(value=310)

    # EVENTS AND TRANSITIONS
    ai_extraction = ai_extracts_information.to(
        user_views_start_of_generation, cond="have_all_info"
    )
    advance = user_views_start_of_generation.to(ai_extracts_categories)

    ...
The dialogue at some stage enters the state ai_extracts_information, meaning that some information is extracted from the dialogue. When all the information is gathered (cond="have_all_info"), the state machine moves to user_views_start_of_generation and the user is shown a summary of that information.
Here we have defined a transition from the state user_views_start_of_generation towards the state ai_extracts_categories. The idea is that when that first state is reached, the user is sent some intermediate results (in this case, a summary of the information gathered so far), after which the front-end has the option to advance the state machine by sending it an advance event. The state machine then advances to the state ai_extracts_categories, where further processing is done.
This means that the output of the LLM, in this case from the prompt attached to state ai_extracts_information, is sent to the user, who can view it. The front-end should then send an advance event back to Genie Flow to make it advance to the next state, ai_extracts_categories.
This way, the user can stay abreast of what is happening in the background, see some intermediate results, and spend less of the waiting time staring at a screen where nothing happens.
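Stripped of the Genie Flow request cycle, the advance mechanism is just a regular python-statemachine event: triggering it moves the machine to the next state without any new input. A minimal, self-contained sketch using plain python-statemachine (state names are illustrative, and a real Genie would of course attach templates and a model):

from statemachine import State, StateMachine


class AdvanceDemo(StateMachine):
    user_views_summary = State(initial=True)
    ai_processes_further = State(final=True)

    # no new user input is needed; the client simply sends "advance"
    advance = user_views_summary.to(ai_processes_further)


sm = AdvanceDemo()
sm.send("advance")  # what the front-end effectively triggers
assert sm.current_state.id == "ai_processes_further"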
chaining and branching
Although the advance event is a great way to chain the output of one prompt into the input of the next, it takes a round trip across the front-end to progress the dialogue to the next stage. Chaining templates in the backend means that the front-end is not updated with any intermediate results and no such round trips are needed between the back- and front-end.
The way to chain multiple subsequent calls to an LLM, where the output of one is added to the input of the next, is to put the consecutive templates in a list. For instance, q_and_a_trans.py has the following template definitions:
from genie_flow.genie import GenieStateMachine


class QandATransMachine(GenieStateMachine):
    ...

    # TEMPLATES
    templates = dict(
        intro="q_and_a/intro.jinja2",
        user_enters_query="q_and_a/user_input.jinja2",
        ai_creates_response=[
            "q_and_a/ai_response.jinja2",
            "q_and_a/ai_response_summary",
        ],
    )
Here, the state ai_creates_response is assigned a list of templates. The first one is the original template that creates the prompt for the LLM. The second one is something like:
Summarize the following text into one paragraph that is not longer than 50 words.
Be strictly to the point, use short sentences and leave out all fluff words.
Also do not use words with more than two syllables.
---
{{ previous_result }}
This template takes the output of the previous prompt and directs the LLM to summarise it into a paragraph of not more than 50 words. The result of the previous prompt is available to the template as the property previous_result. Any other model properties are also available, as in any normal template rendering.
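This construct also covers the Step Back Prompting example mentioned at the start of this section: the first template in the list asks for the general rules, and the second one feeds those rules back in as context. A sketch of what that second template could look like (the original_query property is an illustrative assumption about what the model exposes):

The following general rules apply to this problem:

{{ previous_result }}

---
With these rules in mind, now answer the original question:

{{ original_query }}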
This construct makes it easy to string together prompts that follow from one another, which is very useful when the next prompt depends on the output of a previous one. If that is not the case, we can branch off into separate prompts that are executed in parallel. This branching is done by assigning a dictionary of prompts, as in the following extract from the Claims Genie example:
from genie_flow.genie import GenieStateMachine


class ClaimsMachine(GenieStateMachine):
    ...

    templates = dict(
        ai_extracts_categories=dict(
            user_role="claims/prompt_extract_categories_user_role.jinja2",
            product_description="claims/prompt_extract_categories_product_description.jinja2",
            target_persona="claims/prompt_extract_categories_target_persona.jinja2",
            further_info="claims/prompt_extract_categories_further_information.jinja2",
        )
    )
Here the template assignment for state ai_extracts_categories is a dictionary of different templates. The Genie Flow framework will create separate LLM calls for each of the keys in that dictionary, which are then run in parallel. The result that is returned is a dictionary with these same keys and the outputs of the LLM for each of the rendered templates.
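For the extract above, the result handed to the next step in the chain (and exposed there as previous_result) would therefore be shaped something like the following, with the actual LLM outputs abbreviated:

{
    "user_role": "<LLM output for the user_role prompt>",
    "product_description": "<LLM output for the product_description prompt>",
    "target_persona": "<LLM output for the target_persona prompt>",
    "further_info": "<LLM output for the further_information prompt>",
}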
Of course, the chaining and branching of templates can be combined. So you can chain together different branching templates, one after another, followed by a simple template, as expressed in the following snippet:
some_state=[
    dict(
        foo="foo-template.jinja2",
        bar="bar-template.jinja2",
    ),
    dict(
        foo_foo="foo-foo-template.jinja2",
        bar_bar="bar-bar-template.jinja2",
    ),
    "finalize.jinja2",
]
This would first run the foo and bar templates in parallel, then feed the output of that (a dictionary with the outputs of each individual prompt) into the foo_foo and bar_bar templates, which are also run in parallel. The finalize template is then executed with a dictionary with keys foo_foo and bar_bar, each containing the output generated by sending the respective rendered template as a prompt to the LLM.
Remember that the result of a previous LLM call in the chain will be available in the property previous_result. If the previous step in the chain was a branching template (a dictionary of templates), that property will contain the value of that dictionary.
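In the snippet above, finalize.jinja2 could therefore address the individual branch outputs by key. A sketch, assuming previous_result behaves like a regular mapping inside the template:

Combine the following two analyses into a single conclusion.

First analysis:
{{ previous_result.foo_foo }}

Second analysis:
{{ previous_result.bar_bar }}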
Mapping and running in parallel
The special template type MapTaskTemplate enables the user to map a template against a list of values in the GenieModel, and receive a list of values in return. The mapping is conducted at run time, meaning that the values that exist in the model at the time of invocation will all be mapped against the given template. All these template invocations are run in parallel, so maximum processing speed is achieved; the actual speed depends on the number of Celery workers available at the time.
With the current release, only singular templates can be used to map against. It is foreseeable that more complex constructs, such as lists and dicts, will be supported in the future.
The way to express a map task template is as follows:
from genie_flow.model.template import MapTaskTemplate
from genie_flow.genie import GenieStateMachine
from statemachine import State

...


class SomeGenieMachine(GenieStateMachine):
    # STATES
    mapping_a_template = State(value=500)

    # TEMPLATES
    templates = dict(
        mapping_a_template=MapTaskTemplate(
            "embed/chunk.jinja2",
            "embedded_doc.chunks[*].content",
        )
    )
Here, a MapTaskTemplate is assigned to the state mapping_a_template. As can be seen, this would map the template embed/chunk.jinja2 to all the values that come out of applying the JMES Path expression embedded_doc.chunks[*].content to the model, at run time.
The template could be something like:
{{ map_value }}
And with a meta.yaml such as:
invoker:
  type: genie_flow_invoker.invoker.docproc.embed.EmbedInvoker
  text2vec_url: http://localhost:8080
  pooling_strategy: masked_mean
This would, for each and every value, call the EmbedInvoker to create an embedding for that value. The result would be a list of embeddings.
Here you can find more information on JMES Path expressions, or follow the Tutorial. The expression in the above example gives a list of all the content values of all the elements of the list of chunks in the embedded_doc. If the JMES Path expression does not render a list, a warning is created and the value is placed in a one-element list.
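The effect of such an expression can be tried out directly with the jmespath package. A small sketch with made-up chunk data:

import jmespath

# a made-up model fragment shaped like the embedded_doc used above
model_data = {
    "embedded_doc": {
        "chunks": [
            {"content": "First chunk of the document."},
            {"content": "Second chunk of the document."},
        ]
    }
}

# the same expression used in the MapTaskTemplate above
values = jmespath.search("embedded_doc.chunks[*].content", model_data)
print(values)  # ['First chunk of the document.', 'Second chunk of the document.']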
The following properties can be set on a MapTaskTemplate (a sketch using them as keyword arguments follows this list):
- template_name - the qualified name of the template to map all values against. NB: currently only singular templates are supported, no lists, dicts or other types.
- list_attribute - the JMES Path expression that will be applied to create the list of values to map
- map_index_field - (default map_index) the name of the field that will contain the index of each of the mappings
- map_value_field - (default map_value) the name of the field that will contain the value of each of the mappings
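Assuming these properties can also be passed as keyword arguments when constructing the template (an assumption based on the property names above, not a confirmed signature), a fully spelled-out version of the earlier example could look like:

from genie_flow.model.template import MapTaskTemplate

templates = dict(
    mapping_a_template=MapTaskTemplate(
        template_name="embed/chunk.jinja2",
        list_attribute="embedded_doc.chunks[*].content",
        map_index_field="map_index",  # default value
        map_value_field="map_value",  # default value
    )
)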
Your own Celery Task
Rather than specifying a reference to a template, or a list or dictionary of some form, the template can also be a Celery Task reference. That Celery task will then be called with, as its argument, a dictionary containing all properties of the data model attached to the state machine.
The return value of that task will be used like any other output of an LLM call. That means that Celery Tasks can be used as part of a chain or branch, and the same rules will apply. This gives the programmer the ability to execute arbitrary code.
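A sketch of what such a task could look like, assuming a standard Celery shared_task that receives the model properties as a dict and returns a string (the task itself and the claim_description property are hypothetical):

from celery import shared_task


@shared_task
def count_claim_words(model_properties: dict) -> str:
    """Hypothetical task: derive a value from the model instead of calling an LLM."""
    # model_properties contains all properties of the data model attached
    # to the state machine; "claim_description" is an assumed property name
    description = model_properties.get("claim_description", "")
    return f"The claim description contains {len(description.split())} words."

The returned string then flows through the chain or branch exactly as an LLM response would.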
Background Tasks
Retrieving a response from an LLM can take some time. A string of prompts, one feeding off another, may take up to minutes to complete. One does not want any client who interacts with Genie Flow to wait for a response: it blocks the flow of the client logic, which could be doing more sensible work than waiting for the result to come back.
To overcome this, Genie Flow always responds immediately, either with a result or with the promise of a result. It is up to the client to poll at its leisure to see whether a background process has concluded and a new result can be obtained.
It is our goal to move away from polling and implement a channel approach where a client can subscribe to messages about the finalisation of a background process.
Background processes are implemented using Celery, a Python framework for distributed queueing and parallel processing.
If a background process is started (typically by a user_input event or an advance event), the Genie Flow framework will inform the client that one of the possible next actions is to send a poll event. The response to that event either carries the output of the longer-running background process, if it has concluded, or, if the background process is still running, no other information than the fact that the next possible action is to send another poll event.
Via this mechanism, the client is free to conduct other work and is able to check the status of any longer-running process by sending a poll event.
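From the client's side, this boils down to a simple polling loop. A hypothetical sketch: the poll callable stands in for however the client actually sends a poll event to the API, and the next_actions and result fields are illustrative assumptions rather than the real response schema:

import time
from typing import Callable


def wait_for_result(poll: Callable[[], dict], interval: float = 1.0):
    """Keep sending poll events until the background process has concluded."""
    while True:
        response = poll()  # hypothetical: sends a poll event and returns the response
        if "poll" not in response.get("next_actions", []):
            return response.get("result")
        time.sleep(interval)  # the client could do other useful work here instead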
Running
As a consequence of running background tasks using Celery, any Genie Flow application requires two different processes to run:
- The API that any client can talk to
- At least one Celery Worker that can pick up background tasks
Besides these two processes, you need to run a message broker and a result backend, as required by Celery. Excellent documentation on how to operate a Celery-based application can be found on the Celery website.