Fraunhofer-FIT-DSAI/CyberGuard

This repository contains the code used for the experiments described in the paper "From Legacy to Standard: LLM-Assisted Transformation of Cybersecurity Playbooks into CACAO Format," conducted as part of the CyberGuard and CyberGuard++ projects. The paper was accepted at SecAI 2025. The authors' preprint is available on arXiv: https://arxiv.org/pdf/2508.03342.

This study was funded by Fraunhofer FIT under the project CyberGuard and by the Fraunhofer Cluster of Excellence Cognitive Internet Technologies under the project CyberGuard++.

Setup:

  1. Clone the repository
  2. Run cd main/
  3. Install the Poetry package manager
  4. Run poetry install
  5. Set up the ENV in main/app
  6. Create the necessary directories to store cached results
  7. Set up Ollama
  8. Run poetry run langchain serve

Create necessary directories to store cached results:

  1. Run cd main/
  2. Run mkdir storage sql
  3. Run mkdir sql/variables sql/workflow
  4. Run cd sql/variables
  5. Run mkdir constant description external names type value
  6. Run cd ../workflow
  7. Run mkdir step_descriptions step_if_condition step_if_on_false step_if_on_false_retry step_if_on_true step_if_on_true_retry step_names step_on_completion step_on_completion_retry step_on_failure step_on_failure_retry step_on_success step_on_success_retry step_types
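The mkdir/cd steps above can also be collapsed into one idempotent script, run from the main/ directory. This is a convenience sketch with the paths transcribed from the steps; it is not part of the repository:

```python
from pathlib import Path

# Subdirectories for cached variable extractions (step 5 above).
VARIABLE_DIRS = ["constant", "description", "external", "names", "type", "value"]

# Subdirectories for cached workflow extractions (step 7 above).
WORKFLOW_DIRS = [
    "step_descriptions", "step_if_condition", "step_if_on_false",
    "step_if_on_false_retry", "step_if_on_true", "step_if_on_true_retry",
    "step_names", "step_on_completion", "step_on_completion_retry",
    "step_on_failure", "step_on_failure_retry", "step_on_success",
    "step_on_success_retry", "step_types",
]

def create_cache_dirs(base: Path = Path(".")) -> None:
    """Create the storage/ and sql/ cache layout; safe to re-run."""
    (base / "storage").mkdir(parents=True, exist_ok=True)
    for d in VARIABLE_DIRS:
        (base / "sql" / "variables" / d).mkdir(parents=True, exist_ok=True)
    for d in WORKFLOW_DIRS:
        (base / "sql" / "workflow" / d).mkdir(parents=True, exist_ok=True)

# From main/: create_cache_dirs()
```

Because mkdir is called with exist_ok=True, running the script twice does no harm.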

Setup Ollama:

To run the Llama 3.1 8B model locally, you need to download Ollama. After installation, download the llama3.1 model with ollama pull llama3.1 (the same works if you want to test mistral or the other supported models).

Portkey (Optional):

The motivation to use Portkey was to get a user-friendly interface for monitoring the different requests that we make; it also provides various analytics tools. Portkey works as an intermediary between our application and the LLM provider we want to use (OpenAI in our case): every query we make to an LLM is first forwarded to Portkey so that logging can occur, and then it is forwarded to the provider. The provider's answer goes to Portkey and then back to our application.

Env Example:

OPENAI_API_KEY=...

PORTKEY_API_KEY=...
PORTKEY_TRACE_ID=...

ENVIRONMENT=prod

The OPENAI_API_KEY is required to execute the pipeline.

The PORTKEY_API_KEY (Optional) can be generated by:

  1. Register on Portkey
  2. Go to API Keys and generate one
  3. Add it to your ENV

The PORTKEY_TRACE_ID (Optional) can be set to an arbitrary string; see the Portkey documentation for details.

The ENVIRONMENT variable is only used for the graph.py script to display the workflow graphs.

Routes:

Main - Most relevant route for testing / development

You can access it at http://localhost:8000/main/playground/.

There are four main flows that you can run: metadata, workflow, variables, and syntactic_refinement. The main route is a wrapper over the actual methods from the /extraction directory.

There are many dependencies that you can change:

If you want to test the approach on a new playbook, add it to the playbooks/unstructured directory and to the AVAILABLE_UNSTRUCTURED_PLAYBOOKS enum; the playbook_file_name will then be read from the playbooks/unstructured directory.

When you choose to include few-shot prompting, the examples will be constructed, depending on the flow, either from the playbooks with the prefix Few_Shot_Metadata in the playbooks/translated directory or from the other playbooks in that directory.

If you want to change which playbooks are used for few-shot examples, edit main/app/utils/prompts.py.

Translation Script

You can access it at http://localhost:8000/translation_script/playground/.

Choose a model, a case, and the table where you want to export your translation results.

It runs the following methods, injected with the mapping from the cases.json configuration:

  • extract_metadata
  • extract_workflow
  • extract_variables

When executed a second time, it runs the syntactic_refinement pipeline, since that pipeline needs the generated translations first.

Suggestion: Use the defaults for the tables.

Evaluation Script - Requires Translation Script

You can access it at http://localhost:8000/evaluation_script/playground/.

Choose a model, a flow, and the table where you want to export your evaluation results.

If executed with the syntactic flow, it will run the syntax checker on the translations generated by the translation script.

If executed with the semantic flow:

  • it runs the semantic evaluation for
    • metadata
    • workflow
    • variables
  • it computes the graph edit distance
  • all of the above metrics are computed again for the results from the syntactic refinement part

Analyze - Requires Evaluation Script

You can access it at http://localhost:8000/analyze/playground/.

The script extracts and compresses the relevant data from the evaluation script that is necessary to generate the figures.

Playbooks Dataset:

The dataset can be found at /playbooks, with the JSON config located at playbooks/evaluation_dataset/playbooks.json.

The translated and unstructured directories are used in the main pipeline for experimenting. The evaluation_dataset and semantic_evaluation_dataset are used in the translation_script, evaluation_script and analysis routes.

Cases:

The cases configuration can be found at main/app/evaluation/cases.json. This represents a mapping from the defined case name to the dependencies that are necessary for the pipeline.

Evaluation Results Storage:

The results of the evaluation can be found in:

  • main/gpt-4o-2024-08-06.json
  • main/gpt-4o-mini-2024-07-18.json
  • main/llama3.1.json

Each file contains 6 tables:

  • translation: results of the translation_script
  • syntactic_evaluation: results of the evaluation_script executed with the syntactic flow
  • syntactic_refinement: results of the evaluation_script executed with the syntactic flow
  • semantic_evaluation: results of the evaluation_script executed with the semantic flow
  • semantic_evaluation_syntactic_refinement: results of the evaluation_script executed with the semantic flow
  • results: results of the analyze script

Since there are 7 cases and 40 playbooks that we evaluated syntactically, a total of 280 translations should be in the tables:

  • translation
  • syntactic_evaluation
  • syntactic_refinement

Since there are 7 cases and 10 playbooks that we evaluated semantically, a total of 70 translations should be in the tables:

  • semantic_evaluation
  • semantic_evaluation_syntactic_refinement
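The expected table sizes follow directly from the case and playbook counts above; a quick sanity check (the constants below are transcribed from this section, not read from the repository):

```python
N_CASES = 7
N_SYNTACTIC_PLAYBOOKS = 40  # playbooks evaluated syntactically
N_SEMANTIC_PLAYBOOKS = 10   # playbooks evaluated semantically

# Expected number of translations per table: cases x playbooks.
EXPECTED_ROWS = {
    "translation": N_CASES * N_SYNTACTIC_PLAYBOOKS,
    "syntactic_evaluation": N_CASES * N_SYNTACTIC_PLAYBOOKS,
    "syntactic_refinement": N_CASES * N_SYNTACTIC_PLAYBOOKS,
    "semantic_evaluation": N_CASES * N_SEMANTIC_PLAYBOOKS,
    "semantic_evaluation_syntactic_refinement": N_CASES * N_SEMANTIC_PLAYBOOKS,
}
```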
