Draft: Resolve "Llama3.2 Konfigurationstest"
@@ -115,12 +115,104 @@ OLLAMA_MODELS=<model-path> ; <install-path>/./bin/ollama serve &
5. Ollama is now ready to receive requests for `<model>` from CHESS. We can also start a chat session with `ollama run <model>` to check that everything works. To stop the web service, bring the job to the foreground with `fg`, then stop it with `Ctrl` + `C`. To restart the web service, simply run (only) step 3 again. To remove/uninstall, run `rm <download-path>/ollama-linux-amd64.tgz`, `rm -r <install-path>/*` and `rm -r <model-path>/*`. The lifecycle commands are collected in the sketch after this list.
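For convenience, here are the service lifecycle commands from the steps above in one place. The angle-bracket paths are the placeholders from the installation steps; the env-prefix form of setting `OLLAMA_MODELS` is used so the variable reaches the `ollama` process.

```sh
# Start the web service in the background, storing models in <model-path>
OLLAMA_MODELS=<model-path> <install-path>/bin/ollama serve &

# Sanity check: open an interactive chat session with the model
<install-path>/bin/ollama run <model>

# Stop: bring the background job to the foreground, then press Ctrl+C
fg

# Uninstall: remove the downloaded archive, the installation, and the models
rm <download-path>/ollama-linux-amd64.tgz
rm -r <install-path>/*
rm -r <model-path>/*
```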
The CHESS framework uses the [langchain python package](https://python.langchain.com/docs/introduction/) to connect to an LLM via the API of a web service. The Ollama integration for langchain was added to the `requirements.txt` file as `langchain_ollama==0.1.3` (version 0.1.3 because of its compatibility with the existing requirements).
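The change to `requirements.txt` is a single added line; re-running the usual dependency installation picks it up:

```sh
# Append the pinned Ollama integration and reinstall the requirements
echo 'langchain_ollama==0.1.3' >> requirements.txt
pip install -r requirements.txt
```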
The preprocessing step calls the LLM to embed the database and column descriptions, so the file [CHESS/src/database_utils/db_catalog/preprocessing.py](CHESS/src/database_utils/db_catalog/preprocessing.py) was edited: the import `from langchain_ollama import OllamaEmbeddings` was added, and the existing `EMBEDDING_FUNCTION` was commented out and replaced with `EMBEDDING_FUNCTION = OllamaEmbeddings(model="llama3.2")`.
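A minimal sketch of that edit, showing only the changed lines (the commented-out original is assumed to be the OpenAI embedding function, analogous to the one in `retrieve_entity.py` below):

```python
from langchain_ollama import OllamaEmbeddings

# EMBEDDING_FUNCTION = OpenAIEmbeddings(model="text-embedding-3-small")  # original, commented out
EMBEDDING_FUNCTION = OllamaEmbeddings(model="llama3.2")
```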
- The `num_ctx` parameter sets the context size used by the model. Ollama defaults to a context size of 2048 tokens. We observed context sizes of about 15,000 tokens in the warnings from Ollama; therefore, we set a context of about twice that. Note that Llama3.2 allows a context size of up to 128,000 tokens, whereas Llama3-70B only allows a context size of 8192 tokens. Check the limit of the model you would like to run with Ollama.
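One way to set this, assuming the chat model is constructed through langchain's Ollama integration (`num_ctx` is the parameter name on `langchain_ollama`'s `ChatOllama`; the concrete value 32768 is our reading of "about twice" the observed size):

```python
from langchain_ollama import ChatOllama

# Raise the context window from Ollama's 2048-token default to ~32k tokens,
# roughly twice the ~15,000-token prompts seen in Ollama's warnings.
llm = ChatOllama(model="llama3.2", num_ctx=32768)
```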
To configure the agents, a `.yaml` configuration file and a shell script are needed. For testing purposes, the authors of the replication copied the [CHESS/run/configs/CHESS_IR_CG_UT.yaml](CHESS/run/configs/CHESS_IR_CG_UT.yaml) config file to [CHESS/run/configs/CHESS_IR_CG_UT_LLAMA3-2.yaml](CHESS/run/configs/CHESS_IR_CG_UT_LLAMA3-2.yaml) and the [CHESS/run/configs/CHESS_IR_SS_CG.yaml](CHESS/run/configs/CHESS_IR_SS_CG.yaml) config file to [CHESS/run/configs/CHESS_IR_SS_CG_LLAMA3-2.yaml](CHESS/run/configs/CHESS_IR_SS_CG_LLAMA3-2.yaml), then replaced every `engine` and `engine_name` entry with the `meta-llama/llama3-2` model configured above.
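Schematically, the replacement looks like this; the agent names and nesting below are illustrative rather than the exact keys of the copied files, the point being that every `engine`/`engine_name` value becomes `meta-llama/llama3-2`:

```yaml
# CHESS_IR_SS_CG_LLAMA3-2.yaml (illustrative excerpt)
information_retriever:
  engine: 'meta-llama/llama3-2'
candidate_generator:
  engine_name: 'meta-llama/llama3-2'
```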
Similarly, the authors of the replication copied the shell scripts that run the agents for testing purposes: [CHESS/run/run_main_ir_cg_ut.sh](CHESS/run/run_main_ir_cg_ut.sh) to [CHESS/run/run_main_ig_cg_ut_llama3.2.sh](CHESS/run/run_main_ig_cg_ut_llama3.2.sh) and [CHESS/run/run_main_ir_ss_cg.sh](CHESS/run/run_main_ir_ss_cg.sh) to [CHESS/run/run_main_ir_ss_cg_llama3.2.sh](CHESS/run/run_main_ir_ss_cg_llama3.2.sh) in `CHESS/run`. The `config` variable was changed to the appropriate path of the agent configuration file:
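For example, in the copied script (the exact relative path is an assumption and depends on where the script is invoked from):

```sh
# run_main_ir_ss_cg_llama3.2.sh (excerpt)
config="./run/configs/CHESS_IR_SS_CG_LLAMA3-2.yaml"
```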
In the information retriever agent (IR) there is another call to the `embed` API that is not covered by the configuration from the previous steps. In the `retrieve_entity` tool, in the file [CHESS/src/workflow/agents/information_retriever/tool_kit/retrieve_entity.py](CHESS/src/workflow/agents/information_retriever/tool_kit/retrieve_entity.py), the replication authors added the import `from langchain_ollama import OllamaEmbeddings` and adapted the `embedding_function` property of the `RetrieveEntity` class (line 34, `self.embedding_function = OpenAIEmbeddings(model="text-embedding-3-small")`) to use `OllamaEmbeddings`: `self.embedding_function = OllamaEmbeddings(model="llama3.2")`
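Sketched in context (the constructor signature and base class are assumptions; only the two marked lines correspond to the described edit):

```python
from langchain_ollama import OllamaEmbeddings  # added import

class RetrieveEntity(Tool):  # base class assumed from CHESS's tool kit
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # was: self.embedding_function = OpenAIEmbeddings(model="text-embedding-3-small")
        self.embedding_function = OllamaEmbeddings(model="llama3.2")  # adapted line 34
```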