Bug Description
insert_kwargs and update_kwargs passed to refresh_ref_docs() are silently dropped after the first matching document, because the method calls .pop() on the shared update_kwargs dict inside the document loop. The first inserted or updated document receives the expected kwargs; every subsequent document in the same batch receives {} and no error is raised.
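The mechanism can be illustrated without llama_index at all. The following is a minimal stand-alone sketch, not the library's source: refresh_with_pop and refresh_with_get are hypothetical names, and the loop only records which kwargs each document would receive. It contrasts the .pop() pattern described above with a non-mutating .get() lookup, which is one possible fix.

```python
from typing import Any, Dict, List


def refresh_with_pop(docs: List[str], **update_kwargs: Any) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for the reported pattern: pop() inside the loop."""
    received = []
    for _ in docs:
        # pop() removes "insert_kwargs" on the first pass, so every later
        # pass falls back to the default {}.
        received.append(update_kwargs.pop("insert_kwargs", {}))
    return received


def refresh_with_get(docs: List[str], **update_kwargs: Any) -> List[Dict[str, Any]]:
    """Possible fix (an assumption, not the merged patch): read without mutating."""
    received = []
    for _ in docs:
        # get() leaves update_kwargs intact, so each document sees the same kwargs.
        received.append(update_kwargs.get("insert_kwargs", {}))
    return received


docs = ["doc 0", "doc 1", "doc 2"]
print(refresh_with_pop(docs, insert_kwargs={"my_flag": True}))
# [{'my_flag': True}, {}, {}]
print(refresh_with_get(docs, insert_kwargs={"my_flag": True}))
# [{'my_flag': True}, {'my_flag': True}, {'my_flag': True}]
```

Reading the nested dicts once before the loop would work equally well; the point is only that the lookup must not mutate state shared across iterations.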
Version
0.14.21
Steps to Reproduce
from typing import Any, List

from llama_index.core import Document, VectorStoreIndex
from llama_index.core.schema import BaseNode, TransformComponent


class RecordKwargs(TransformComponent):
    def __call__(self, nodes: List[BaseNode], **kwargs: Any) -> List[BaseNode]:
        print(f"transform received: {kwargs}")
        return nodes


docs = [Document(text=f"doc {i}") for i in range(3)]
index = VectorStoreIndex([], transformations=[RecordKwargs()])

print("refresh_ref_docs with insert_kwargs={'my_flag': True}:")
index.refresh_ref_docs(docs, insert_kwargs={"my_flag": True})
Relevant Logs/Tracebacks
refresh_ref_docs with insert_kwargs={'my_flag': True}:
transform received: {'my_flag': True}
transform received: {}
transform received: {}
[True, True, True]