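[langgraph] How to handle errors when calling tools

This is a translation of the official LangGraph documentation. I added explanations where needed and changed some of the examples to make them easier to follow. I will take this post down if that causes any problem.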

https://langchain-ai.github.io/langgraph/how-tos/tool-calling-errors/

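LLMs are not perfect at calling tools. A model may call a tool that does not exist, or return arguments that do not match the requested schema. Strategies such as keeping schemas simple, reducing the number of tools passed at once, and giving tools good names and descriptions can reduce this risk, but they are not foolproof.

This guide covers how to build error handling into a graph to mitigate these failure modes.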
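The two failure modes described above (an unknown tool name, or arguments that do not match the schema) can be illustrated without any LLM. The following is a minimal sketch in plain Python; the `TOOLS` registry and the `check_tool_call` helper are hypothetical names introduced for illustration and are not part of LangGraph.

```python
from typing import Optional


def get_weather(location: str) -> str:
    """Stand-in tool used only for this illustration."""
    return f"Weather for {location}"


# Hypothetical registry: tool name -> (callable, expected argument types).
TOOLS = {"get_weather": (get_weather, {"location": str})}


def check_tool_call(name: str, args: dict) -> Optional[str]:
    """Return a description of the problem, or None if the call looks valid."""
    if name not in TOOLS:
        return f"unknown tool: {name}"
    _, expected = TOOLS[name]
    missing = [key for key in expected if key not in args]
    if missing:
        return f"missing arguments: {missing}"
    wrong = [key for key, typ in expected.items() if not isinstance(args[key], typ)]
    if wrong:
        return f"wrong argument types: {wrong}"
    return None


print(check_tool_call("get_forecast", {}))
print(check_tool_call("get_weather", {"location": 42}))
print(check_tool_call("get_weather", {"location": "San Francisco"}))
```

A check like this only detects the problem. The rest of the guide is about what the graph should do once a problem is detected.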

Setup

First, let's install the required packages.

pip install langgraph langchain_openai langchain_anthropic python-dotenv

Using ToolNode

First, define a mock weather tool that places restrictions on its input query. The purpose here is to simulate a real-world case in which the model fails to call a tool correctly.

from langchain_core.tools import tool


@tool
def get_weather(location: str):
    """Call to get the current weather."""
    if location == "san francisco":
        raise ValueError("Input queries must be proper nouns")
    elif location == "San Francisco":
        return "It's 60 degrees and foggy."
    else:
        raise ValueError("Invalid input.")

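Next, set up a graph implementation of a ReAct agent. Given a query, the agent repeatedly calls tools until it has enough information to resolve the query. We use the prebuilt ToolNode to execute the called tools, and a model provided by OpenAI.

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.prebuilt import ToolNode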

tool_node = ToolNode([get_weather])

model_with_tools = ChatOpenAI(model="gpt-4o-mini", temperature=0).bind_tools(
    [get_weather]
)


def should_continue(state: MessagesState):
    messages = state["messages"]
    last_message = messages[-1]
    if last_message.tool_calls:
        return "tools"
    return END


def call_model(state: MessagesState):
    messages = state["messages"]
    response = model_with_tools.invoke(messages)
    return {"messages": [response]}


workflow = StateGraph(MessagesState)

workflow.add_node("agent", call_model)
workflow.add_node("tools", tool_node)

workflow.add_edge(START, "agent")
workflow.add_conditional_edges("agent", should_continue, ["tools", END])
workflow.add_edge("tools", "agent")

app = workflow.compile()

from IPython.display import Image, display

try:
    display(
        Image(
            app.get_graph().draw_mermaid_png(
                output_file_path="how-to-handle-tool-calling-errors.png"
            )
        )
    )
except Exception:
    pass

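When we invoke the graph, the model calls the tool with invalid input and the tool raises an error. The prebuilt ToolNode that executes the tool has built-in error handling: it catches the error and passes it back to the model so that the model can try again.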

response = app.invoke(
    {"messages": [("human", "what is the weather in san francisco?")]},
)

for message in response["messages"]:
    string_representation = f"{message.type.upper()}: {message.content}\n"
    print(string_representation)

HUMAN: what is the weather in san francisco?

AI: [{'id': 'toolu_01K5tXKVRbETcs7Q8U9PHy96', 'input': {'location': 'san francisco'}, 'name': 'get_weather', 'type': 'tool_use'}]

TOOL: Error: ValueError('Input queries must be proper nouns')
 Please fix your mistakes.

AI: [{'text': 'Apologies, it looks like there was an issue with the weather lookup. Let me try that again with the proper format:', 'type': 'text'}, {'id': 'toolu_01KSCsme3Du2NBazSJQ1af4b', 'input': {'location': 'San Francisco'}, 'name': 'get_weather', 'type': 'tool_use'}]

TOOL: It's 60 degrees and foggy.

AI: The current weather in San Francisco is 60 degrees and foggy.
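The behavior seen in the output above, where a tool error is turned into a message instead of crashing the graph, can be approximated in plain Python. This is only a sketch of the idea and not ToolNode's actual implementation. `run_tool_safely` is a hypothetical helper, and the error text format is an assumption copied from the TOOL message shown above.

```python
def get_weather(location: str) -> str:
    """Stand-in for the weather tool above, without the @tool decorator."""
    if location == "san francisco":
        raise ValueError("Input queries must be proper nouns")
    elif location == "San Francisco":
        return "It's 60 degrees and foggy."
    raise ValueError("Invalid input.")


def run_tool_safely(tool_fn, args: dict) -> str:
    """Run a tool and turn any exception into text the model can read."""
    try:
        return tool_fn(**args)
    except Exception as e:
        # Assumed format, modeled on the TOOL message in the output above.
        return f"Error: {e!r}\n Please fix your mistakes."


# First attempt fails and yields an error message instead of raising.
print(run_tool_safely(get_weather, {"location": "san francisco"}))
# A corrected attempt succeeds.
print(run_tool_safely(get_weather, {"location": "San Francisco"}))
```

Because the error comes back as ordinary message content, the model sees it on the next turn and can correct its own call, which is what happens between the first and second AI messages above.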

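Custom strategies

This default is fine in most cases, but sometimes a custom fallback works better.

For example, the tool below requires a list of a specific length as input, which is tricky for a small model. We also intentionally leave topic singular to trick the model into thinking it should pass a string.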

from dotenv import load_dotenv
from langchain_core.output_parsers import StrOutputParser
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.constants import START, END
from langgraph.graph import MessagesState, StateGraph
from langgraph.prebuilt import ToolNode
from pydantic import BaseModel, Field

load_dotenv()


class HaikuRequest(BaseModel):
    topic: list[str] = Field(
        max_length=3,
        min_length=3,
    )


@tool
def master_haiku_generator(request: HaikuRequest):
    """Generates a haiku based on the provided topics."""
    model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    chain = model | StrOutputParser()
    topics = ", ".join(request.topic)
    haiku = chain.invoke(f"{topics}에 대한 haiku를 작성해 줘")
    return haiku


tool_node = ToolNode([master_haiku_generator])

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
model_with_tools = model.bind_tools([master_haiku_generator])


def should_continue(state: MessagesState):
    messages = state["messages"]
    last_message = messages[-1]
    if last_message.tool_calls:
        return "tools"
    return END


def call_model(state: MessagesState):
    messages = state["messages"]
    response = model_with_tools.invoke(messages)
    return {"messages": [response]}


workflow = StateGraph(MessagesState)

workflow.add_node("agent", call_model)
workflow.add_node("tools", tool_node)

workflow.add_edge(START, "agent")
workflow.add_conditional_edges("agent", should_continue, ["tools", END])
workflow.add_edge("tools", "agent")

app = workflow.compile()

response = app.invoke(
    {"messages": [("human", "물에 대한 멋진 haiku를 작성해 줘.")]},
    {"recursion_limit": 10},
)

for message in response["messages"]:
    string_representation = f"{message.type.upper()}: {message.content}\n"
    print(string_representation)

HUMAN: 물에 대한 멋진 haiku를 작성해 줘.

AI: 

TOOL: 맑은 물소리  
자연의 숨결 속에  
고요한 아침

AI: 여기 물에 대한 멋진 하이쿠가 있습니다:

맑은 물소리  
자연의 숨결 속에  
고요한 아침

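With a small model, getting the input right can take two attempts.

A better strategy is to trim the failed attempt to reduce distraction, and then fall back to a more advanced model. Here is an example. We also use a custom node to call the tool instead of the prebuilt ToolNode.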
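To see why this schema is easy to get wrong, consider the inputs a model might produce for a field named topic. The sketch below uses a hand-written check as a stand-in for the validation that HaikuRequest performs. `validate_haiku_request` is a hypothetical helper, and its error messages are illustrative, not the messages pydantic would produce.

```python
def validate_haiku_request(request: dict) -> list:
    """Return the topics if valid, else raise ValueError (hypothetical helper)."""
    topic = request.get("topic")
    # The field must be a list of strings, not a single string.
    if not isinstance(topic, list) or not all(isinstance(t, str) for t in topic):
        raise ValueError("topic must be a list of strings")
    # The list must contain exactly three items (min_length = max_length = 3).
    if len(topic) != 3:
        raise ValueError(f"topic must contain exactly 3 items, got {len(topic)}")
    return topic


candidates = [
    {"topic": "water"},                        # singular name suggests a string
    {"topic": ["water"]},                      # a list, but the wrong length
    {"topic": ["water", "nature", "silence"]}  # valid
]
for candidate in candidates:
    try:
        print("ok:", validate_haiku_request(candidate))
    except ValueError as err:
        print("rejected:", err)
```

The first two candidates are the kinds of calls a smaller model may attempt before it arrives at the third.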

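import json
from typing import Literal

from langchain_anthropic import ChatAnthropic  # requires the langchain_anthropic package
from langchain_core.messages import AIMessage, ToolMessage
from langchain_core.messages.modifier import RemoveMessage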


@tool
def master_haiku_generator(request: HaikuRequest):
    """Generates a haiku based on the provided topics."""
    model = ChatAnthropic(model="claude-3-haiku-20240307", temperature=0)
    chain = model | StrOutputParser()
    topics = ", ".join(request.topic)
    haiku = chain.invoke(f"Write a haiku about {topics}")
    return haiku


def call_tool(state: MessagesState):
    tools_by_name = {master_haiku_generator.name: master_haiku_generator}
    messages = state["messages"]
    last_message = messages[-1]
    output_messages = []
    for tool_call in last_message.tool_calls:
        try:
            tool_result = tools_by_name[tool_call["name"]].invoke(tool_call["args"])
            output_messages.append(
                ToolMessage(
                    content=json.dumps(tool_result),
                    name=tool_call["name"],
                    tool_call_id=tool_call["id"],
                )
            )
        except Exception as e:
            # Return the error if the tool call fails
            output_messages.append(
                ToolMessage(
                    content="",
                    name=tool_call["name"],
                    tool_call_id=tool_call["id"],
                    additional_kwargs={"error": e},
                )
            )
    return {"messages": output_messages}


model = ChatAnthropic(model="claude-3-haiku-20240307", temperature=0)
model_with_tools = model.bind_tools([master_haiku_generator])

better_model = ChatAnthropic(model="claude-3-5-sonnet-20240620", temperature=0)
better_model_with_tools = better_model.bind_tools([master_haiku_generator])


def should_continue(state: MessagesState):
    messages = state["messages"]
    last_message = messages[-1]
    if last_message.tool_calls:
        return "tools"
    return END


def should_fallback(
    state: MessagesState,
) -> Literal["agent", "remove_failed_tool_call_attempt"]:
    messages = state["messages"]
    failed_tool_messages = [
        msg
        for msg in messages
        if isinstance(msg, ToolMessage)
        and msg.additional_kwargs.get("error") is not None
    ]
    if failed_tool_messages:
        return "remove_failed_tool_call_attempt"
    return "agent"


def call_model(state: MessagesState):
    messages = state["messages"]
    response = model_with_tools.invoke(messages)
    return {"messages": [response]}


def remove_failed_tool_call_attempt(state: MessagesState):
    messages = state["messages"]
    # Remove all messages from the most recent
    # instance of AIMessage onwards.
    last_ai_message_index = next(
        i
        for i, msg in reversed(list(enumerate(messages)))
        if isinstance(msg, AIMessage)
    )
    messages_to_remove = messages[last_ai_message_index:]
    return {"messages": [RemoveMessage(id=m.id) for m in messages_to_remove]}


# Fallback to a better model if a tool call fails
def call_fallback_model(state: MessagesState):
    messages = state["messages"]
    response = better_model_with_tools.invoke(messages)
    return {"messages": [response]}


workflow = StateGraph(MessagesState)

workflow.add_node("agent", call_model)
workflow.add_node("tools", call_tool)
workflow.add_node("remove_failed_tool_call_attempt", remove_failed_tool_call_attempt)
workflow.add_node("fallback_agent", call_fallback_model)

workflow.add_edge(START, "agent")
workflow.add_conditional_edges("agent", should_continue, ["tools", END])
workflow.add_conditional_edges("tools", should_fallback)
workflow.add_edge("remove_failed_tool_call_attempt", "fallback_agent")
workflow.add_edge("fallback_agent", "tools")

app = workflow.compile()

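The tool node now returns ToolMessages with an error field in additional_kwargs when a tool call fails. When that happens, the graph moves to another node that removes the failed tool messages, and a better model retries generating the tool call.

The diagram below shows this visually.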

try:
    display(
        Image(
            app.get_graph().draw_mermaid_png(
                output_file_path="how-to-handle-tool-calling-errors-haiku-fallback.png"
            )
        )
    )
except Exception:
    pass
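The routing and trimming logic of this graph can also be followed on a toy message list, without running any model. The sketch below mirrors what should_fallback and remove_failed_tool_call_attempt do, using a hypothetical `Msg` dataclass in place of the real message classes, so it is an illustration of the idea and not a drop-in replacement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    """Hypothetical stand-in for a chat message."""
    id: str
    role: str  # "human", "ai", or "tool"
    error: Optional[str] = None


def has_failed_tool_call(messages) -> bool:
    """True if any tool message carries an error (the fallback condition)."""
    return any(m.role == "tool" and m.error is not None for m in messages)


def ids_to_remove(messages) -> list:
    """IDs of every message from the most recent AI message onwards."""
    last_ai = max(i for i, m in enumerate(messages) if m.role == "ai")
    return [m.id for m in messages[last_ai:]]


history = [
    Msg("1", "human"),
    Msg("2", "ai"),                               # the bad tool call
    Msg("3", "tool", error="validation failed"),  # the failed result
]

if has_failed_tool_call(history):
    removed = set(ids_to_remove(history))
    history = [m for m in history if m.id not in removed]

# Only the human message is left for the fallback model to answer.
print([m.id for m in history])
```

After trimming, the fallback model sees only the original request, so the failed attempt cannot distract it.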

Let's try it out. To highlight the removal step, we stream the responses so that we can see each node as it executes.

stream = app.stream(
    {"messages": [("human", "Write me an incredible haiku about water.")]},
    {"recursion_limit": 10},
)

for chunk in stream:
    print(chunk)

{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_80dE5u1tindIXTO59Jbcdzrv', 'function': {'arguments': '{"request":{"topic":["물","자연","고요함"]}}', 'name': 'master_haiku_generator'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 82, 'total_tokens': 109, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_0705bf87c0', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-8bccec0e-398e-45c5-98f3-72956fab732e-0', tool_calls=[{'name': 'master_haiku_generator', 'args': {'request': {'topic': ['물', '자연', '고요함']}}, 'id': 'call_80dE5u1tindIXTO59Jbcdzrv', 'type': 'tool_call'}], usage_metadata={'input_tokens': 82, 'output_tokens': 27, 'total_tokens': 109, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]}}
{'tools': {'messages': [ToolMessage(content='"\\ub9d1\\uc740 \\ubb3c\\uc18c\\ub9ac  \\n\\uc790\\uc5f0\\uc758 \\uc228\\uacb0 \\uc18d\\uc5d0  \\n\\uace0\\uc694\\ud55c \\uc544\\uce68"', name='master_haiku_generator', id='2c99e339-5a55-4f61-819d-42c0e99323ee', tool_call_id='call_80dE5u1tindIXTO59Jbcdzrv')]}}
{'agent': {'messages': [AIMessage(content='여기 물에 대한 멋진 하이쿠입니다:\n\n흐르는 물소리  \n자연의 조화 속에  \n고요한 아침', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 34, 'prompt_tokens': 183, 'total_tokens': 217, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_0705bf87c0', 'finish_reason': 'stop', 'logprobs': None}, id='run-ef221353-9261-401f-8f54-f428d246f067-0', usage_metadata={'input_tokens': 183, 'output_tokens': 34, 'total_tokens': 217, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]}}

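The response comes out cleaner. The more powerful model gets it right on the first try, and the smaller model's failure is removed from the graph state. The shorter message history also keeps the graph state from filling up with failed attempts. You can also inspect the LangSmith trace, which shows the initial failed call to the smaller model.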
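The raw chunks printed above are hard to scan. A small helper can reduce each chunk to the node that ran and the number of messages it emitted. This sketch assumes each chunk is a dict that maps a node name to an update containing a "messages" list, as in the streamed output above. `summarize_chunk` is a hypothetical helper, and `fake_stream` stands in for the real stream.

```python
def summarize_chunk(chunk: dict) -> list:
    """Return (node_name, message_count) pairs for one streamed update."""
    return [
        (node, len(update.get("messages", [])))
        for node, update in chunk.items()
    ]


# Placeholder chunks shaped like the streamed output above.
fake_stream = [
    {"agent": {"messages": ["<AIMessage with tool call>"]}},
    {"tools": {"messages": ["<ToolMessage>"]}},
    {"agent": {"messages": ["<AIMessage with final answer>"]}},
]

for chunk in fake_stream:
    print(summarize_chunk(chunk))
```

With the real graph, the same loop would be run over the stream returned by app.stream, which makes it easy to spot when the removal and fallback nodes execute.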