[langchain] Various ways to output responses as JSON

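By default, the result of an LLM call comes back as a string. To get structured data I looked into ways of receiving the response as JSON, and it turns out there are several approaches, listed below. This post explains the pros and cons of each and looks at how each one works.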

Raw prompting
JsonOutputParser
JSON mode
Tool calling
with_structured_output()

1. Raw Prompting

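Description

Raw prompting is the most straightforward way to ask a model for output in a particular format. Along with the user's question you provide instructions describing the output format, which shapes the model's message or string output into the form you want. It works without any model-specific (class) feature, and any model with enough reasoning ability can understand the schema it is given.

Advantages

Flexibility: raw prompting requires no particular model feature and can request any format you like, not only JSON but also XML, YAML, and so on.
Model independence: it is not tied to a particular model's training data, so the prompt can be tuned for a variety of data formats.

Disadvantages

Non-deterministic output: LLMs (large language models) are non-deterministic, so it can be hard to get output in exactly the right format consistently.
Difficulty of prompt tuning: each model may prefer certain formats depending on the data it was trained on, so finding the best prompt can be tricky.

Example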

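from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# The user query comes first, followed by the output-format instructions.
template = """
{query}

Please provide the following information in JSON format:
format instructions: {format_instructions}
"""

# Describes the desired shape of the reply (inserted into the prompt as plain text).
format_instructions = """
{
    "answer": "The answer to the user query",
}
"""
query = "한국의 수도는?"  # "What is the capital of Korea?"

prompt = PromptTemplate(
    template=template,
    input_variables=["query"],
    partial_variables={"format_instructions": format_instructions},
)

model = ChatOpenAI()  # expects OPENAI_API_KEY in the environment
chain = prompt | model

result = chain.invoke({"query": query})  # returns an AIMessage

print(result)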

LLM request log

[llm/start] [chain:RunnableSequence > llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "Human: \n    Answer the user query.\n    Please provide the following information in JSON format:\n    format instructions: \n    {\n        \"answer\": \"The answer to the user query\",\n    }\n    \n    \n    한국의 수도는?"
  ]
}
[llm/end] [chain:RunnableSequence > llm:ChatOpenAI] [608ms] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "\n{\n    \"answer\": \"서울\"\n}",
        "generation_info": {
          "finish_reason": "stop",
          "logprobs": null
        },
        "type": "ChatGeneration",
        "message": {
          "lc": 1,
          "type": "constructor",
          "id": [
            "langchain",
            "schema",
            "messages",
            "AIMessage"
          ],
          "kwargs": {
            "content": "\n{\n    \"answer\": \"서울\"\n}",
            "response_metadata": {
              ...
              },
              "model_name": "gpt-3.5-turbo-0125",
              "system_fingerprint": null,
              "finish_reason": "stop",
              "logprobs": null
            },
            "type": "ai",
            "id": "run-a560c1ae-961e-44fc-bfed-555ce4e05462-0",
            "usage_metadata": {
              ...
            "tool_calls": [],
            "invalid_tool_calls": []
          }
        }
      }
    ]
  ]

Execution result

{
    content='{\n    "answer": "서울"\n}' 
    response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 56, 'total_tokens': 67}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} 
    id='run-f63469cf-63f3-4b7a-ac04-eaf032777ead-0'
}

The response's content comes back as a string.

Whichever of the methods below you choose, raw prompting still plays an important role in steering the result.
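Because content is only a string, the caller still has to parse it. The following is a minimal sketch using just the standard library; the helper name parse_json_content is hypothetical, and the handling of a Markdown code fence is an assumption about how some models wrap their replies, not something shown in the log above.

```python
import json

FENCE = "`" * 3  # a Markdown code fence marker


def parse_json_content(content: str) -> dict:
    """Parse a model reply that should contain a single JSON object."""
    text = content.strip()
    # Assumption: some models wrap the JSON in a Markdown code fence.
    if text.startswith(FENCE):
        lines = text.splitlines()
        # Drop the opening fence line and, if present, the closing fence line.
        lines = lines[1:-1] if lines[-1].strip() == FENCE else lines[1:]
        text = "\n".join(lines)
    return json.loads(text)


# The content string from the execution result above.
content = '{\n    "answer": "서울"\n}'
parsed = parse_json_content(content)
print(parsed["answer"])
```

If the reply is not valid JSON, json.loads raises json.JSONDecodeError, which is the point where a retry or a fallback could be added.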

2. JsonOutputParser

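Description

JsonOutputParser is one of the classes LangChain provides; it parses the model's output directly as JSON. It is useful when you expect the model output to be JSON, and it automates the parsing step so the JSON data is easy to work with.

Advantages

Automated parsing: the model output is parsed into JSON automatically, which keeps the code concise.
Flexibility: it is designed to handle JSON data of varying structure.

Disadvantages

Model dependence: if the model's output is not valid JSON, the parsing step can fail with an error.

Example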
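How little it takes for output to stop being valid JSON can be seen with the standard library alone. The sketch below uses json.loads as a stand-in for a strict parser; it is not JsonOutputParser itself, whose tolerance may differ. It feeds in the format_instructions string from this post's examples, which contains a trailing comma, so a model that echoed that shape literally would produce text a strict parser rejects.

```python
import json

# The format_instructions string used in this post's examples.
format_instructions = """
{
    "answer": "The answer to the user query",
}
"""

try:
    json.loads(format_instructions)
    is_valid_json = True
except json.JSONDecodeError as exc:
    # The trailing comma after the last member is not allowed in JSON.
    is_valid_json = False
    print(f"Not valid JSON: {exc.msg}")

# Without the trailing comma the same text parses fine.
fixed = format_instructions.replace('",\n}', '"\n}')
print(json.loads(fixed))
```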

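from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

template = """
Answer the user query.
format instructions: {format_instructions}

{query}
"""

format_instructions = """
{
    "answer": "The answer to the user query",
}
"""
query = "한국의 수도는?"  # "What is the capital of Korea?"

prompt = PromptTemplate(
    template=template,
    input_variables=["query"],
    partial_variables={"format_instructions": format_instructions},
)

model = ChatOpenAI()
# The parser turns the model's text output into a Python dict.
chain = prompt | model | JsonOutputParser()

result = chain.invoke({"query": query})

print("result", result)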

LLM request log

[llm/start] [chain:RunnableSequence > llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "Human: \n    Answer the user query.\n    format instructions: \n    {\n        \"answer\": \"The answer to the user query\",\n    }\n    \n    \n    한국의 수도는?"
  ]
}
[llm/end] [chain:RunnableSequence > llm:ChatOpenAI] [556ms] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "{\n    \"answer\": \"서울\"\n}",
        "generation_info": {
          "finish_reason": "stop",
          "logprobs": null
        },
        "type": "ChatGeneration",
        "message": {
          "lc": 1,
          "type": "constructor",
          "id": [
            "langchain",
            "schema",
            "messages",
            "AIMessage"
          ],
          "kwargs": {
            "content": "{\n    \"answer\": \"서울\"\n}",
            "response_metadata": {
              ...
              },
              "model_name": "gpt-3.5-turbo-0125",
              "system_fingerprint": null,
              "finish_reason": "stop",
              "logprobs": null
            },
            "type": "ai",
            "id": "run-c93a4a72-6da5-4544-8fc6-28739c4ec7e6-0",
            "usage_metadata": {
              ...
            },
            "tool_calls": [],
            "invalid_tool_calls": []
          }
        }
      }
    ]
  ],
  "llm_output": {
    "token_usage": {
      ...
    },
    "model_name": "gpt-3.5-turbo-0125",
    "system_fingerprint": null
  },
  "run": null
}

Execution result

{'answer': '서울'}

This time the response comes back as parsed JSON (a Python dict) rather than a string.

3. JSON Mode

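Description

Some models, for example those from Mistral, OpenAI, Together AI, and Ollama, support a feature called JSON mode. When it is enabled, the model's output is constrained to always be valid JSON. It is usually turned on through a setting and still needs a brief prompt instruction describing the JSON you want.

Advantages

Simplicity: it can be used directly and is simpler to use than tool calling.
Structured output: the output is always JSON, so it is easy to parse.

Disadvantages

Limited flexibility: output is restricted to JSON only, which can be less flexible in some situations.

Example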

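from langchain_core.output_parsers.json import SimpleJsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# The prompt itself still has to ask for JSON and name the expected key.
template = """
Answer the user query.
You must always output a JSON object with an "answer" key.

{query}
"""

query = "한국의 수도는?"  # "What is the capital of Korea?"

prompt = PromptTemplate(
    template=template,
    input_variables=["query"],
)

# Enable JSON mode through the response_format request parameter.
model = ChatOpenAI(
    model_kwargs={"response_format": {"type": "json_object"}},
)
chain = prompt | model | SimpleJsonOutputParser()

result = chain.invoke({"query": query})

print("result", result)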

LLM request log

[llm/start] [chain:RunnableSequence > llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "Human: \n    Answer the user query.\n    You must always output a JSON object with an \"answer\" key.\n    \n    한국의 수도는?"
  ]
}
{
  "generations": [
    [
      {
        "text": "{\n    \"answer\": \"서울\"\n}",
        "generation_info": {
          "finish_reason": "stop",
          "logprobs": null
        },
        "type": "ChatGeneration",
        "message": {
          "lc": 1,
          "type": "constructor",
          "id": [
            "langchain",
            "schema",
            "messages",
            "AIMessage"
          ],
          "kwargs": {
            "content": "{\n    \"answer\": \"서울\"\n}",
            "response_metadata": {
              ...
            },
            "type": "ai",
            "id": "run-82339852-d63f-497b-a4d4-f85613cf8206-0",
            "usage_metadata": {
              ...
            },
            "tool_calls": [],
            "invalid_tool_calls": []
          }
        }
      }
    ]
  ],
  "llm_output": {
    "token_usage": {
      ...
    },
    "model_name": "gpt-3.5-turbo-0125",
    "system_fingerprint": null
  },
  "run": null
}

Execution result

{'answer': '서울'}
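JSON mode, at least in OpenAI's implementation, makes the output parse as JSON but does not by itself make it contain any particular keys; that part still depends on the prompt. A minimal sketch of a check one might add after parsing, assuming the "answer" key requested in the prompt above (the helper name require_keys is hypothetical):

```python
import json


def require_keys(raw: str, required: set[str]) -> dict:
    """Parse a JSON-mode reply and check that the expected keys are present."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = required - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


checked = require_keys('{"answer": "서울"}', {"answer"})
print(checked)
```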

4. Tool Calling

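Description

Tool calling is a very convenient approach when you want structured output. Instead of passing the schema in the prompt, it uses LangChain's Tool and Chat Model features to generate output that fits the schema automatically.

Advantages

Accuracy: the model can be set up to follow the schema closely, so the output is more reliable.
Convenience: structured data can be produced without a complicated prompt.

Disadvantages

Implementation complexity: setting up the tool and binding the schema can be somewhat involved.

Example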

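from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


# The schema is given to the model as a tool instead of being written into the prompt.
class ResponseFormatter(BaseModel):
    """Always use this tool to structure your response to the user."""

    answer: str = Field(description="The answer to the user's question")
    followup_question: str = Field(
        description="A followup question the user could ask"
    )

template = """
Answer the user query.

{query}
"""

query = "한국의 수도는?"  # "What is the capital of Korea?"

prompt = PromptTemplate(
    template=template,
    input_variables=["query"],
)

model = ChatOpenAI()
# Bind the schema so the model can answer with a ResponseFormatter tool call.
model_with_tools = model.bind_tools([ResponseFormatter])
chain = prompt | model_with_tools
result = chain.invoke({"query": query})
print("result", result)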

LLM call log

[llm/start] [chain:RunnableSequence > llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "Human: \n    Answer the user query.\n    \n    한국의 수도는?"
  ]
}
[llm/end] [chain:RunnableSequence > llm:ChatOpenAI] [752ms] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "",
        "generation_info": {
          "finish_reason": "tool_calls",
          "logprobs": null
        },
        "type": "ChatGeneration",
        "message": {
          "lc": 1,
          "type": "constructor",
          "id": [
            "langchain",
            "schema",
            "messages",
            "AIMessage"
          ],
          "kwargs": {
            "content": "",
            "additional_kwargs": {
              "tool_calls": [
                {
                  "id": "call_N1Xtr3eU1VNuMrNHiN8GYpz7",
                  "function": {
                    "arguments": "{\"answer\":\"한국의 수도는 서울입니다.\"}",
                    "name": "ResponseFormatter"
                  },
                  "type": "function"
                }
              ]
            },
            "response_metadata": {
              "token_usage": {
               ...
              },
              "model_name": "gpt-3.5-turbo-0125",
              "system_fingerprint": null,
              "finish_reason": "tool_calls",
              "logprobs": null
            },
            "type": "ai",
            "id": "run-fc6d01ad-7b90-44dd-90cc-71a657f1658a-0",
            "tool_calls": [
              {
                "name": "ResponseFormatter",
                "args": {
                  "answer": "한국의 수도는 서울입니다."
                },
                "id": "call_N1Xtr3eU1VNuMrNHiN8GYpz7",
                "type": "tool_call"
              }
            ],
            "usage_metadata": {
              ...
            },
            "invalid_tool_calls": []
          }
        }
      }
    ]
  ],
  "llm_output": {
    "token_usage": {
      ...
      },
      "prompt_tokens_details": {
        "audio_tokens": null,
        "cached_tokens": 0
      }
    },
    "model_name": "gpt-3.5-turbo-0125",
    "system_fingerprint": null
  },
  "run": null
}

Execution result

{
content='' 
additional_kwargs={'tool_calls': [{'id': 'call_MQqxCPUvMn1wN3LqsCVHzoJB', 'function': {'arguments': '{"answer":"한국의 수도는 서울입니다."}', 'name': 'ResponseFormatter'}, 'type': 'function'}]} 
response_metadata={'token_usage': {'completion_tokens': 25, 'prompt_tokens': 74, 'total_tokens': 99}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None} id='run-ce964566-cc64-4a49-a932-1f3a33fdb719-0' tool_calls=[{'name': 'ResponseFormatter', 'args': {'answer': '한국의 수도는 서울입니다.'}, 'id': 'call_MQqxCPUvMn1wN3LqsCVHzoJB'}]
}

The result value can be extracted as follows.

print("result", result.additional_kwargs["tool_calls"][0]["function"]["arguments"])

Tool calling can be a powerful tool when you need a precise data structure.
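The execution result above carries the same arguments in two places: as a JSON string under additional_kwargs and as an already-parsed dict under tool_calls. The sketch below illustrates the difference with a plain dict standing in for the AIMessage (an assumption made so that it runs without calling the API); the values are copied from the result above.

```python
import json

# A plain-dict stand-in for the AIMessage fields shown in the execution result.
message = {
    "content": "",
    "additional_kwargs": {
        "tool_calls": [
            {
                "id": "call_MQqxCPUvMn1wN3LqsCVHzoJB",
                "function": {
                    "arguments": '{"answer":"한국의 수도는 서울입니다."}',
                    "name": "ResponseFormatter",
                },
                "type": "function",
            }
        ]
    },
    "tool_calls": [
        {
            "name": "ResponseFormatter",
            "args": {"answer": "한국의 수도는 서울입니다."},
            "id": "call_MQqxCPUvMn1wN3LqsCVHzoJB",
        }
    ],
}

# Provider-level payload: the arguments are still a JSON string.
raw_arguments = message["additional_kwargs"]["tool_calls"][0]["function"]["arguments"]
from_raw = json.loads(raw_arguments)

# LangChain-level field: the arguments are already a dict.
from_parsed = message["tool_calls"][0]["args"]

print(from_raw == from_parsed)
```

On the real message object the second form would be result.tool_calls[0]["args"]. Note also that in the logged run the model returned only answer, even though ResponseFormatter declares followup_question as well, so validating the arguments is still worthwhile.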

5. with_structured_output()

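Description

Some LangChain chat models support the with_structured_output() method. It takes a schema as input and returns the model's output as a dictionary or a Pydantic object. It generally uses one of the more advanced approaches described above under the hood, and it automatically takes care of selecting the appropriate output parser and formatting the schema for the model.

Advantages

Convenience: schema-based structured output is available without complicated setup.
Integration: LangChain's various advanced features can be used easily.

Disadvantages

Limited support: not every model supports this method; it applies only to certain models.

That said, at the time of writing it seems to be supported by practically every LLM integration. The supported models can be checked at the link below.

https://python.langchain.com/v0.2/docs/integrations/chat/#advanced-features

Example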

from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


class ResponseModel(BaseModel):
    """Response to answer user."""

    question: str = Field(description="User question")
    result: str = Field(description="The answer to the user query")


template = """
Answer the user query.

{query}
"""

prompt = PromptTemplate(
    template=template,
    input_variables=["query"],
)

model = ChatOpenAI()
# Wrap the model so its output is parsed into a ResponseModel instance.
structured_llm = model.with_structured_output(ResponseModel)
chain = prompt | structured_llm

query = "한국의 수도는?"  # "What is the capital of Korea?"
result = chain.invoke({"query": query})
print(result)

LLM call log

[llm/start] [chain:RunnableSequence > llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "Human: \nAnswer the user query.\n\n한국의 수도는?"
  ]
}
{
  "generations": [
    [
      {
        "text": "",
        "generation_info": {
          "finish_reason": "stop",
          "logprobs": null
        },
        "type": "ChatGeneration",
        "message": {
          "lc": 1,
          "type": "constructor",
          "id": [
            "langchain",
            "schema",
            "messages",
            "AIMessage"
          ],
          "kwargs": {
            "content": "",
            "additional_kwargs": {
              "tool_calls": [
                {
                  "id": "call_FQ7XtkWB10syFSQ3PT8UnSzE",
                  "function": {
                    "arguments": "{\"question\":\"한국의 수도는?\",\"result\":\"서울입니다.\"}",
                    "name": "ResponseModel"
                  },
                  "type": "function"
                }
              ]
            },
            "response_metadata": {
             ...
              },
              "model_name": "gpt-3.5-turbo-0125",
              "system_fingerprint": null,
              "finish_reason": "stop",
              "logprobs": null
            },
            "type": "ai",
            "id": "run-c0c04a17-df6f-41ed-abdc-2a122d8735e1-0",
            "tool_calls": [
              {
                "name": "ResponseModel",
                "args": {
                  "question": "한국의 수도는?",
                  "result": "서울입니다."
                },
                "id": "call_FQ7XtkWB10syFSQ3PT8UnSzE",
                "type": "tool_call"
              }
            ],
            "usage_metadata": {
              ...
            },
            "invalid_tool_calls": []
          }
        }
      }
    ]
  ],
  "llm_output": {
    "token_usage": {
      ...
    },
    "model_name": "gpt-3.5-turbo-0125",
    "system_fingerprint": null
  },
  "run": null
}

with_structured_output() is a very convenient way to work with JSON, and it brings several LangChain features together to get the best result.

PydanticToolsParser

If you debug invoke, you can see that after prompt > llm, PydanticToolsParser runs as a step. PydanticToolsParser parses the tool content out of the OpenAI response. In other words, passing PydanticToolsParser(tools=[the result class]) maps the output to that class type.
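The mapping step described here can be pictured roughly as follows. This is a stand-in that uses a standard-library dataclass in place of the Pydantic model, and the function tool_call_to_object is hypothetical; it is not how PydanticToolsParser is actually implemented, only an illustration of turning a tool call's args into a typed object. The tool call values are taken from the log above.

```python
from dataclasses import dataclass, fields


@dataclass
class ResponseModel:
    """Stand-in for the Pydantic ResponseModel defined above."""

    question: str
    result: str


def tool_call_to_object(tool_call: dict, schema: type):
    """Map a tool call's args onto `schema`, checking the tool name first."""
    if tool_call["name"] != schema.__name__:
        raise ValueError(f"unexpected tool: {tool_call['name']}")
    allowed = {f.name for f in fields(schema)}
    unknown = set(tool_call["args"]) - allowed
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # A missing required field raises TypeError here.
    return schema(**tool_call["args"])


tool_call = {
    "name": "ResponseModel",
    "args": {"question": "한국의 수도는?", "result": "서울입니다."},
    "id": "call_FQ7XtkWB10syFSQ3PT8UnSzE",
    "type": "tool_call",
}
obj = tool_call_to_object(tool_call, ResponseModel)
print(obj.result)
```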

6. Summary

The data that comes back differs depending on which JSON approach is used. With raw prompting, JsonOutputParser, and JSON mode, the response arrives inside the text; with tool calling and with_structured_output(), it arrives inside tool_calls.

	Raw prompting / JsonOutputParser / JSON mode	Tool calling	with_structured_output()
Text	{"answer": "서울"}	-	-
tool_calls	[]	{"name": "ResponseFormatter", "args": {"answer": "서울입니다."}, "id": "xxx", "type": "tool_call"}	{"name": "ResponseModel", "args": {"question": "한국의 수도는?", "result": "서울입니다."}, "id": "xxx", "type": "tool_call"}
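Given that split, code that needs to handle either style can look in tool_calls first and fall back to the text. A minimal sketch, using plain dicts as stand-ins for the message objects (the function name extract_structured and the dict shapes are assumptions for illustration):

```python
import json


def extract_structured(message: dict) -> dict:
    """Return the structured payload from either tool_calls or text content."""
    tool_calls = message.get("tool_calls") or []
    if tool_calls:
        # Tool calling / with_structured_output(): payload is in tool_calls.
        return tool_calls[0]["args"]
    # Raw prompting / JsonOutputParser / JSON mode: payload is JSON text.
    return json.loads(message["content"])


text_message = {"content": '{"answer": "서울"}', "tool_calls": []}
tool_message = {
    "content": "",
    "tool_calls": [{"name": "ResponseFormatter", "args": {"answer": "서울입니다."}}],
}

print(extract_structured(text_message))
print(extract_structured(tool_message))
```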

References

https://python.langchain.com/v0.2/docs/concepts/
