[LangChain] 메모리(Memory) 관리

ChatGPT and LangChain: The Complete Developer's Masterclass 강좌의 일부를 요약한 내용이다.

LangChain의 메모리 관리를 위해 만들어볼 프로그램은 아래와 같다.

사용자가 1+1을 물어보면 2라고 대답할 것이다. 그리고 And 3 more?라고 물어보면 뭐라고 대답할까? 만일 1+1이 2라는 사실을 알고 있다면 5라고 대답하겠지만 그렇지 않으면 정5라고 대답하기 어려울 것이다.

ChatGPT prompt에서는 기본적으로 이전 context정보를 알지 못한다. 그래서 And 3 more?라고 물어보면 무엇을 묻는 건지 알수 없을 것이다. 그래서 이를 위해 이전 prompt정보를 저장해두는 메모리가 필요할 것이다. 이를 사용하는 방법을 알아보자.

사용자 입력 받기

우선 사용자 입력을 받는 것을 해보자. python에서는 input을 사용하면 된다.

while True:
    content = input(">> ")

    print(f"You entered: {content}")

이렇게 하고 실행해보면 아래와 같이 출력이 된다.

>> Hi there
You entered: Hi there

Chat vs Completion 모델

사용할 수 있는 아주 많은 LLM이 있다. (GPT 3.0, PaLM, BLOOM, Llama, GLM, Alpace, OPT, StableLM, Camel)

LLM을 사용할 때 일반적으로 두 가지의 텍스트 생성 스타일이 있다. 대부분 LLM은 completion 스타일을 따른다.

Completion 스타일

Completion 스타일은 아래와 같이 특정 문장을 입력하면 완성을 해주는 기능을 가지고 있다.

"나는 세금과 관련된 농담을 하는 코메디언이다. 세금이" 라고 입력하고 전송하면 뒤의 글자를 만들어서 붙여준다.

이렇듯 문장을 자동완성 해주는 형태인 것이다.

Chat 스타일

completion 스타일 외에 대화형(Conversational) 스타일도 있다. 대부분 잘 알려진 ChatGPT, Gemini, Claude 등이 대화형을 지원하는 모델이다. 물론 Completion도 지원한다.

Completion 스타일의 경우 Input의 문장을 완성해주는 아주 간단한 형태이지만, Chat 스타일인 경우 좀 더 복잡하다.

역할

System Message: 챗봇이 어떻게 행동할 것인지 정의하는 메시지
User Message: 사용자가 생성한 메시지. 즉 내가 보낸 메시지를 의미
Assistant Message: 챗 모델에서 생성된 메시지

ChatGPT는 이전 대화 이력을 기억 못한다

https://chatgpt.com/에서 대화를 할 때는 이전 대화 내용을 기억하고 대화를 이어간다. 하지만 api를 사용할 때는 기본적으로 ChatGPT는 대화의 이력을 기억하지 못한다. 그래서 메시지를 보낼 때 이전 대화이력을 모두 같이 보내야한다. 그렇지 않으면 정확한 대답을 하지 못한다.

예를 들면 아래와 같은 대화가 있다고 하자.

[User] Tell me a joke

[Assistant] Why did the chicken cross the road?

[user] Why?

이 경우 이전 메시지 내용을 모르고 단순히 Why?라고 물어보면 ChatGPT는 무슨 대답을 해야 할지 모른다.

그래서 단순히 Why?라고 물어보면 안되고 아래와 같이 prompt를 작성해야 한다.

User says: Tell me a joke
Assistant says: Why did the chicken cross the road?

Why?

이렇게 해야 이전에 어떤 대화내용이 있는지 확인하고 Why가 어떤 의미인지 파악을 할 수 있게 된다.

Open AI 용어 vs LangChain 용어

Open AI와 LangChain에서 사용하는 용어가 조금 다르다.

Open AI에서는 System, User, Assistant라고 나타내고 LangChain에서는 System, Human, AI로 나타낸다. LangChain에서는 해당 역할에 맞는 클래스명을 제공한다.

ChatPromptTemplate

이전의 PromptTemplate과 달리 ChatPromptTemplate은 그 안에 두 가지의 PromptTemplate을 가지고 있다.

SystemMessagePromptTemplate
HumanMessagePromptTemplate

내부 동작방식은 PromptTemplate과 동일하다. System 메시지와 Human 메시지의 변수를 선언하고 input_variables로 변환되는 형태를 가지고 있다.

코드 작성

ChatPromptTemplate을 사용하여 코드를 작성해보자.

from langchain_openai.chat_models import ChatOpenAI
from langchain.chains import LLMChain
from langchain.prompts import HumanMessagePromptTemplate, ChatPromptTemplate
from dotenv import load_dotenv

load_dotenv()

chat = ChatOpenAI()

prompt = ChatPromptTemplate(
    input_variables=["content"],
    messages=[
        HumanMessagePromptTemplate.from_template("{content}"),
    ],
)

chain = LLMChain(
    llm=chat,
    prompt=prompt
)

while True:
    content = input(">> ")
    result = chain.invoke({"content": content})
    print(result["text"])

그리고 다음과 같이 메시지를 입력해보자.

>> What is 1+1?
1+1 equals 2.
>> And 3 more?
1. What is your favorite hobby?
2. Do you have any pets?
3. What is your favorite type of cuisine?

1+1은 2라고 하지만, 3을 더하면 엉뚱한 응답이 넘어온다. 이는 이전에 1+1은 2라는 사실을 몰라서 그런 것이다.

메모리

Memory는 LangChain에서 제공해주는 클래스이다. Chain 내에서 데이터를 저장하는 데 사용이 된다.

Memory는 Chain을 호출할 때마다 두 번 사용이 된다. 첫 번째는 ChatPromptTemplate을 호출하기 전에 사용(input_variables)되고 두 번째는 LLM을 호출한 응답결과(output)에서 사용된다.

LangChain에는 여러가지 종류의 Memory를 가지고 있지만 대화형에서는 ConversationBufferMemory를 사용한다. ConversationBufferMemory의 기본 기능은 메시지 input, output의 이력을 저장하고 있다가 prompt를 전송할 때마다 이력을 같이 전송한다. 그럼 메시지 이력을 어디에다 저장하는 것일까? MessagePlaceholder에 저장한다.

ChatPromptTemplate은 SystemMessagePromptTemplate과 HumanMessagePromptTemplate으로 구성이 되어 있다. 이전 대화를 저장하기 위해서 MessagePlaceholder를 추가한다. 변수명은 messages이다.

실제 호출을 하는 순간 아래와 같은 형식으로 변경이 될 것이다.

ConversationBufferMemory를 통해 이전 대화내용을 전송할 수 있지만 서버가 재시작되면 메모리 내용이 사라진다. 그래서 어딘가 저장을 해 놓아야 한다.

이것을 구현하기 위해서 FileChatMessageHistory를 사용한다. 하지만 여기에 문제가 있다.

이 방식은 모든 대화이력을 지속적으로 prompt에 같이 전달하기 때문에 prompt의 크기가 엄청 커질 수 있다.

예를 들어 What is 1+1? 이런 식의 대화를 여러번 할 수 있다. 대화를 100번 이상하면 어떻게 될까? 대화이력을 계속해서 저장하다 보면 나중에는 엄청나게 많은 메시지가 쌓여질 것이다. 이 대화이력을 LLM에 매번 전송을 하면 데이터 크기 제한에 걸리게 될 것이다. 또한 API 사용 비용도 많이 나온다.

완벽한 방법은 아니지만 이를 해결하기 위해서 ConversationSummaryMemory를 사용한다.

ConversationSummaryMemory

ConversationSummaryMemory의 동작방식은 다음과 같다.
1+1을 한 결과를 LLM의 응답을 받으면 ConversationSummaryMemory가 응답결과에 관여한다. 내부적으로 체인을 가지고 있다. 자체적으로 Language Model에 요청하는 PromptTemplate이 있다. PromptTemplate이 LLM에 요청하고 응답결과를 요약한다. 그리고 다음 질문 때 이전 요약정보를 System Message로 전달한다.

코드로 작성해보자.

from langchain_openai.chat_models import ChatOpenAI
from langchain.chains import LLMChain
from langchain.prompts import (
    MessagesPlaceholder,
    HumanMessagePromptTemplate,
    ChatPromptTemplate,
)
from langchain.memory import (
    ConversationSummaryMemory
)

from dotenv import load_dotenv

load_dotenv()

chat = ChatOpenAI(verbose=True)
memory = ConversationSummaryMemory(
    memory_key="messages",
    return_messages=True,
    llm=chat,
)
prompt = ChatPromptTemplate(
    input_variables=["content", "messages"],
    messages=[
        MessagesPlaceholder(variable_name="messages"),
        HumanMessagePromptTemplate.from_template("{content}"),
    ],
)

chain = LLMChain(llm=chat, prompt=prompt, memory=memory, verbose=True)

while True:
    content = input(">> ")

    result = chain.invoke({"content": content})

    print(result["text"])

실행해보면 아래와 같다. 디버깅을 위해 verbose=True를 추가하였다.

>> What is 1+1?


> Entering new LLMChain chain...
Prompt after formatting:
System: 
Human: What is 1+1?

> Finished chain.
1 + 1 equals 2.
>> And 3 more?


> Entering new LLMChain chain...
Prompt after formatting:
System: The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential. When asked what 1+1 is, the AI responds that 1 + 1 equals 2.
Human: And 3 more?

> Finished chain.
Adding 3 to 2 equals 5.
>> And 5 more?


> Entering new LLMChain chain...
Prompt after formatting:
System: The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential. When asked what 1+1 is, the AI responds that 1 + 1 equals 2. Adding 3 more to 2 equals 5, according to the AI.
Human: And 5 more?

> Finished chain.
Adding 5 more to 5 equals 10, according to the AI.

prompt에서 User 메시지에 이전대화를 요약해서 넣어주는 것을 확인할 수 있다.