Building a Practical AI-Powered Essay Writer Using LangChain, GPT-3.5 and Tavily
In this post, we’ll show you how to create a powerful AI-driven essay writer using technologies like LangChain, OpenAI’s GPT-3.5 model, and Tavily. We’ll break down each part of the code and demonstrate how to use it in a real-world scenario to generate an essay on any topic, step by step, with AI assistance.
Introduction
With advancements in AI and natural language processing (NLP), building automated content generation systems has become more accessible. The project we’re discussing in this blog post is an AI-powered essay-writing assistant that:
- Plans the structure of an essay.
- Generates an essay based on the user’s query.
- Critiques the generated content and revises it based on feedback.
- Uses multiple rounds of revision to improve the overall quality.
This setup allows users to provide a topic to the system, and it will automatically generate an essay with multiple revisions.
Technologies Used
1. LangChain
LangChain provides the infrastructure to build applications that use language models like GPT-3.5. It simplifies working with chains of prompts and agents, and allows for easy integration with external systems.
2. OpenAI’s GPT-3.5
We are using the GPT-3.5 model from OpenAI to handle the core task of generating and improving the content.
3. Tavily API
Tavily is a search API that augments the essay with external research content, based on the queries generated by the model.
4. SQLite for State Saving
We use SQLite to manage the state between different steps of the essay writing process, allowing us to checkpoint the progress and handle multiple revisions.
Project Breakdown: How the AI-Powered Essay Writer Works
We will now walk through the core components of the code, explaining the role each part plays within the system. Each node in the diagram represents a specific role: planning, researching, drafting, critiquing, or revising the essay. Each node calls the LLM (large language model) in a specific capacity, using the provided content and instructions to fulfill its assigned role. This makes the project self-reflective and self-improving, as it critiques and revises its own content autonomously. The diagram below shows the flow of the essay-writing system.
Step-by-Step Code Explanation
Step 1: Setup and Imports
from dotenv import load_dotenv
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.sqlite import SqliteSaver
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_openai import ChatOpenAI
from tavily import TavilyClient
from typing import TypedDict, List
import os

# Load environment variables (OPENAI_API_KEY, TAVILY_API_KEY)
load_dotenv()

# Initialize SQLite for state saving (in-memory checkpointer)
state_saver = SqliteSaver.from_conn_string(":memory:")

# Initialize the OpenAI GPT-3.5 model
gpt_model = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Initialize Tavily API client for research
research_client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
Step 2: Define Essay State Structure
class EssayState(TypedDict):
    topic: str
    outline: str
    draft: str
    feedback: str
    research_content: List[str]
    current_revision: int
    max_revisions: int
This state contains:
- topic: The user-provided topic for the essay.
- outline: The high-level structure or outline of the essay.
- draft: The generated draft of the essay.
- feedback: Critique of the draft.
- research_content: Additional research material.
- current_revision: The current revision number.
- max_revisions: The maximum number of revisions allowed.
Step 3: Define the Prompts
# Prompts for each stage
OUTLINE_PROMPT = "You are an expert writer tasked with creating an outline for the essay topic."

# The {content} placeholder is filled with the gathered research before drafting
ESSAY_GENERATION_PROMPT = """Generate a high-quality essay using the outline and the research data below.

{content}"""

CRITIQUE_PROMPT = "Critique the essay and provide feedback for improvement."
RESEARCH_PROMPT = "Generate search queries to gather relevant research content."
REVISION_RESEARCH_PROMPT = "Generate search queries to gather relevant research for essay improvements."
These prompts are tailored for each stage of the writing process.
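The research nodes below call gpt_model.with_structured_output(Queries), but the Queries model itself isn't shown in the snippets. Here is a minimal sketch of what it could look like, assuming a simple Pydantic schema that holds a list of search query strings:

from pydantic import BaseModel
from typing import List

# Structured output schema for the research steps (assumed definition)
class Queries(BaseModel):
    queries: List[str]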
Step 4: Generating the Essay Outline
def generate_outline(state: EssayState):
    messages = [
        SystemMessage(content=OUTLINE_PROMPT),
        HumanMessage(content=state['topic'])
    ]
    response = gpt_model.invoke(messages)
    return {"outline": response.content}
- Purpose: Generate the outline or structure for the essay.
- Input: User-provided topic.
- Output: A structured outline for the essay.
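Because TypedDict doesn't enforce that every key is present, you can also exercise a node on its own with a partial state before wiring the full graph. The call below is a hypothetical quick check of the outline node; the topic is just an example:

# Standalone check of the outline node (hypothetical partial state)
sample_state = {"topic": "Why AI Still Thinks Cats Rule the Internet?"}
print(generate_outline(sample_state)["outline"])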
Step 5: Researching Content for the Essay
def research_content(state: EssayState):
    # Ask the model for structured search queries based on the topic
    queries = gpt_model.with_structured_output(Queries).invoke([
        SystemMessage(content=RESEARCH_PROMPT),
        HumanMessage(content=state['topic'])
    ])
    # Use .get() so the first run works even before research_content exists in the state
    research_data = state.get('research_content') or []
    for query in queries.queries:
        search_results = research_client.search(query=query, max_results=2)
        for result in search_results['results']:
            research_data.append(result['content'])
    return {"research_content": research_data}
- Purpose: Generate queries for research and fetch external content.
- Input: Topic of the essay.
- Output: External content related to the essay topic.
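If you want to sanity-check the Tavily client on its own, you can call it directly. The query string below is just an example, and the result shape mirrors what the node above consumes:

# Hypothetical standalone Tavily search, mirroring the result shape used above
results = research_client.search(query="why cat content dominates internet culture", max_results=2)
for result in results["results"]:
    print(result["content"][:120], "...")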
Step 6: Generating the Essay Draft
def generate_draft(state: EssayState):
    research_material = "\n\n".join(state.get('research_content') or [])
    user_message = HumanMessage(
        content=f"Topic: {state['topic']}\n\nOutline:\n\n{state['outline']}")
    messages = [
        SystemMessage(content=ESSAY_GENERATION_PROMPT.format(content=research_material)),
        user_message
    ]
    response = gpt_model.invoke(messages)
    # Each draft counts as one revision
    return {"draft": response.content, "current_revision": state.get("current_revision", 1) + 1}
- Purpose: Generate the first draft of the essay.
- Input: Essay topic, outline, and research content.
- Output: Initial draft of the essay.
Step 7: Critiquing the Draft
def critique_draft(state: EssayState):
    messages = [
        SystemMessage(content=CRITIQUE_PROMPT),
        HumanMessage(content=state['draft'])
    ]
    response = gpt_model.invoke(messages)
    return {"feedback": response.content}
- Purpose: Provide feedback and critique on the draft.
- Input: The essay draft.
- Output: Feedback for improvement.
Step 8: Researching for Revisions
def research_for_revisions(state: EssayState):
    # Generate new search queries from the critique rather than the original topic
    queries = gpt_model.with_structured_output(Queries).invoke([
        SystemMessage(content=REVISION_RESEARCH_PROMPT),
        HumanMessage(content=state['feedback'])
    ])
    revision_research_content = state.get('research_content') or []
    for query in queries.queries:
        search_results = research_client.search(query=query, max_results=2)
        for result in search_results['results']:
            revision_research_content.append(result['content'])
    return {"research_content": revision_research_content}
- Purpose: Perform additional research to improve the essay based on feedback.
- Input: Feedback on the draft.
- Output: New research content to revise the essay.
Step 9: Conditional Logic for Revisions
def should_continue_revisions(state: EssayState):
    if state["current_revision"] > state["max_revisions"]:
        return END
    return "critique"
- Purpose: Determine whether to continue with another round of revisions.
- Input: The current revision count and the maximum allowed.
- Output: Decision to continue or end the process.
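The snippets above define the individual nodes but not the graph itself, even though the execution example below streams from a compiled graph. Here's a minimal sketch of how the pieces could be wired together with LangGraph, assuming node names that match the functions and the revision loop described above:

# Assemble the workflow (assumed wiring based on the steps above)
builder = StateGraph(EssayState)

builder.add_node("outline", generate_outline)
builder.add_node("research", research_content)
builder.add_node("generate", generate_draft)
builder.add_node("critique", critique_draft)
builder.add_node("research_revisions", research_for_revisions)

builder.set_entry_point("outline")
builder.add_edge("outline", "research")
builder.add_edge("research", "generate")

# After each draft, either stop or loop back through critique -> revision research -> draft
builder.add_conditional_edges("generate", should_continue_revisions, {END: END, "critique": "critique"})
builder.add_edge("critique", "research_revisions")
builder.add_edge("research_revisions", "generate")

# Compile with the SQLite checkpointer so progress is saved between steps
graph = builder.compile(checkpointer=state_saver)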
Final Execution Example
Let's write an essay on “Why AI Still Thinks Cats Rule the Internet”:
thread = {"configurable": {"thread_id": "1"}}

for state in graph.stream({
    "topic": "Why AI Still Thinks Cats Rule the Internet?",
    "max_revisions": 4,
    "current_revision": 1,
}, thread):
    print(state)
This execution will generate an essay, provide critiques, and improve the essay with multiple rounds of revisions until the final version is reached.
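Because every step is checkpointed in SQLite, you can also read the final state back once the stream completes. A small sketch using LangGraph's state API, assuming the graph was compiled with the checkpointer as above:

# Read the final checkpointed state for this thread and print the last draft
final_state = graph.get_state(thread)
print(final_state.values["draft"])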
Step-by-Step Execution Output Explanation
Here’s how the execution proceeds, with each component’s role explained:
Step 1: Outline Generation
- Component: generate_outline
- What It Did: The system generated a structured outline for the essay based on the provided topic: “Why AI Still Thinks Cats Rule the Internet.”
{
    "outline": "1. Introduction\n A. Overview of AI and cats on the internet.\n B. Importance of understanding why AI recognizes cats."
}
Step 2: Research Content Collection
- Component: research_content
- What It Did: The system generated relevant queries for the essay topic and used Tavily to gather research content to support the writing.
{
    "research_content": [
        "AI algorithms are trained to detect common images on the internet, including cats due to their overwhelming presence...",
        "Researchers have noted that cat-related content drives higher engagement on social media platforms..."
    ]
}
Step 3: First Essay Draft
- Component: generate_draft
- What It Did: The system generated the first draft of the essay using the outline and research content.
{
    "draft": "Introduction\n\nCats have become an undeniable presence on the internet, and AI's fascination with them is no coincidence..."
}
Step 4: Critique of the Draft
- Component: critique_draft
- What It Did: The system provided feedback on the initial draft, highlighting areas for improvement.
Step 5: Research for Revisions
- Component: research_for_revisions
- What It Did: Based on the feedback, the system conducted additional research to gather more relevant information for improving the essay.
Step 6: Revised Essay Draft
- Component: generate_draft
- What It Did: The system generated a revised version of the essay incorporating the new research content.
Conclusion
In this post, we’ve demonstrated how to build an AI-powered essay writer using LangChain, GPT-3.5, and Tavily. The process involves generating an essay outline, drafting the essay, critiquing it, and revising it multiple times until the desired quality is achieved. This setup allows for extensive customization and can be adapted for various use cases, including article writing, report generation, and more.
This simple workflow can be expanded further to handle more complex tasks, and tools like LangChain provide the flexibility to integrate additional features or external systems for enhanced results.