Overview

CareerLens is an AI-native career intelligence platform that helps professionals understand their career trajectory, skill gaps, and market positioning. It is not a chatbot that answers career questions — it is a structured agent system that reasons, reflects, retrieves, and produces auditable analysis.

This is the system I am actively building. The architecture section reflects current design decisions that are in production or actively being deployed.

Problem

Existing career tools fall into two categories:

Static assessments — personality tests, rigid skill matrices, rule-based recommendations
LLM chatbots — general-purpose models that hallucinate job market data and give superficially plausible but unreliable advice

Neither approach is reliable enough for someone making a significant career decision. The problem requires grounded retrieval (real job market data), structured reasoning (not just token prediction), and quality control (reflection and validation before output).

Constraints

Groundedness: Claims about job markets must be retrievable from real data, not generated
Structured output: Analysis must be parseable by downstream systems (not free text)
Latency: Multi-agent processing can be slow; progressive output UX required
Cost: GPT-4o is expensive; only use where reasoning quality justifies cost
Reliability: Agent loops can fail mid-way; must be resumable
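
The groundedness constraint can be checked mechanically if each claim in the analysis carries the IDs of the postings that support it. The sketch below is illustrative only: the `Claim` record, its `source_ids` field, and the posting ID format are assumptions, not part of the CareerLens code shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """Hypothetical claim record: a statement plus the posting IDs backing it."""
    text: str
    source_ids: list[str] = field(default_factory=list)


def find_ungrounded(claims: list[Claim], retrieved_ids: set[str]) -> list[Claim]:
    """Return claims that cite nothing, or cite a posting that was never retrieved."""
    ungrounded = []
    for claim in claims:
        cited = set(claim.source_ids)
        if not cited or not cited <= retrieved_ids:
            ungrounded.append(claim)
    return ungrounded


retrieved = {"job-101", "job-102"}
claims = [
    Claim("Kubernetes appears in most matching postings", ["job-101", "job-102"]),
    Claim("Salaries are rising 20% per year"),      # no citation at all
    Claim("Rust demand is growing", ["job-999"]),   # cites a posting we never retrieved
]
flagged = find_ungrounded(claims, retrieved)
print([claim.text for claim in flagged])
```

A check like this could run before the reflection step, so that uncited claims are rejected deterministically rather than left to the model's judgment.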

Architecture

Agent Graph (LangGraph)

User Input (resume + career goals)
    │
    ▼
┌─────────────────────────────────────────────┐
│              CareerLens Graph               │
│                                             │
│  Profile Parser ──→ Skill Extractor         │
│                             │               │
│                             ▼               │
│  Skill Gap Analyzer ←── Job Market Retriever│
│        │                                    │
│        ▼                                    │
│  Analysis Synthesizer                       │
│        │                                    │
│        ▼                                    │
│  Reflection Node ──→ [if low confidence]    │
│        │              └── Re-retrieve       │
│        │                                    │
│        ▼                                    │
│  Output Formatter (Structured JSON)         │
└─────────────────────────────────────────────┘
    │
    ▼
Structured Career Report

Node Responsibilities

Node	Model	Responsibility
Profile Parser	GPT-4o	Extract structured profile from resume (JSON)
Skill Extractor	GPT-4o + Structured Outputs	Taxonomy-mapped skill extraction
Job Market Retriever	OpenSearch (RAG)	Retrieve relevant job postings and market data
Skill Gap Analyzer	GPT-4o	Compare profile skills vs. market requirements
Analysis Synthesizer	GPT-4o	Produce initial career analysis
Reflection Node	GPT-4o	Evaluate analysis quality; request re-retrieval if weak
Output Formatter	GPT-4o mini	Format final output to JSON schema

Technology Decisions

Decision	Choice	Why
Agent framework	LangGraph	Explicit state graph with conditional edges; debuggable; supports cycles for reflection
Primary LLM	GPT-4o	Best reasoning quality for multi-step career analysis
Retrieval	OpenSearch + custom embeddings	Existing infra; good BM25 hybrid capability
Structured outputs	OpenAI Structured Outputs	Guarantees JSON schema compliance; eliminates parsing failures
State persistence	PostgreSQL	Agent state checkpointed per step; resumable on failure
Response streaming	SSE	Progressive output; user sees analysis building in real-time
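
To make the SSE choice concrete, here is a minimal sketch of how per-node progress could be serialized as Server-Sent Events frames. The event names (`node_complete`, `done`) and payload fields are hypothetical; only the wire format (an `event:` line, a `data:` line, then a blank line) comes from the SSE specification.

```python
import json


def format_sse(event: str, data: dict) -> str:
    """Serialize one SSE frame: event name, single-line JSON payload, blank line."""
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event}\ndata: {payload}\n\n"


def progress_frames(node_names: list[str]):
    """Yield one frame per completed node, then a terminal 'done' frame."""
    total = len(node_names)
    for index, name in enumerate(node_names, start=1):
        yield format_sse("node_complete", {"node": name, "step": index, "total": total})
    yield format_sse("done", {})


frames = list(progress_frames(["parse_profile", "extract_skills"]))
print(frames[0], end="")
```

Because the JSON payload is emitted on a single line, each frame needs only one `data:` field; multi-line payloads would need one `data:` line per line of content.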

Implementation

LangGraph State Schema

State is explicitly typed — no magic dictionaries:

from langgraph.graph import StateGraph, END
from pydantic import BaseModel
from typing import Optional

class CareerAnalysisState(BaseModel):
    # Inputs
    resume_text: str
    career_goals: str

    # Intermediate
    structured_profile: Optional[dict] = None
    extracted_skills: Optional[list[str]] = None
    retrieved_jobs: Optional[list[dict]] = None
    skill_gaps: Optional[list[dict]] = None

    # Analysis
    initial_analysis: Optional[str] = None
    reflection_score: Optional[float] = None  # 0-1 quality score
    reflection_feedback: Optional[str] = None

    # Output
    final_report: Optional[dict] = None

    # Control
    retry_count: int = 0
    max_retries: int = 3

Reflection Loop

The reflection node is where quality control happens. It evaluates the initial analysis and either approves it or triggers a re-retrieval cycle:

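from openai import AsyncOpenAI

openai_client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
# REFLECTION_SYSTEM_PROMPT is defined elsewhere in the codebase (not shown here).


class ReflectionResult(BaseModel):
    """Schema the model must fill in; enforced by Structured Outputs."""
    quality_score: float  # 0.0 to 1.0
    issues: list[str]
    needs_more_data: bool
    specific_gaps: list[str]


async def reflection_node(state: CareerAnalysisState) -> CareerAnalysisState:
    """
    Score the initial analysis and record the feedback on the state.
    The retry decision itself is made by the should_retry conditional edge.
    """
    # retrieved_jobs is Optional, so guard against None before calling len().
    job_count = len(state.retrieved_jobs or [])

    result = await openai_client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": REFLECTION_SYSTEM_PROMPT,
            },
            {
                "role": "user",
                "content": f"""
Analysis: {state.initial_analysis}
Retrieved data: {job_count} job postings
Claimed skill gaps: {state.skill_gaps}

Evaluate if this analysis is specific, grounded, and actionable.
""",
            },
        ],
        response_format=ReflectionResult,
    )

    reflection = result.choices[0].message.parsed
    if reflection is None:
        # parsed is None when the model refuses instead of returning the schema.
        raise ValueError("Reflection call returned no parsed result")

    return state.model_copy(update={
        "reflection_score": reflection.quality_score,
        "reflection_feedback": "\n".join(reflection.issues),
    })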


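def should_retry(state: CareerAnalysisState) -> str:
    """Conditional edge: retry retrieval or proceed to output."""
    # Treat a missing score as low quality instead of raising TypeError on None.
    score = state.reflection_score if state.reflection_score is not None else 0.0
    # retry_count is not updated by reflection_node; the cap only holds if the
    # retry path (for example retrieve_jobs_node) increments it on each re-entry.
    if score < 0.7 and state.retry_count < state.max_retries:
        return "retry_retrieval"
    return "format_output"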
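
The bound on the reflection loop depends on the retry counter actually advancing on every cycle. The following sketch replays scripted reflection scores through the same routing rule as should_retry, using plain values instead of the real state object. The helper names and the scripted scores are illustrative assumptions.

```python
QUALITY_THRESHOLD = 0.7  # same cutoff used by should_retry
MAX_RETRIES = 3


def route(score: float, retry_count: int) -> str:
    """Mirror of the conditional edge: retry only while under both limits."""
    if score < QUALITY_THRESHOLD and retry_count < MAX_RETRIES:
        return "retry_retrieval"
    return "format_output"


def run_reflection_loop(scores: list[float]) -> tuple[int, str]:
    """Feed a scripted sequence of reflection scores through the router.

    Returns (retries_used, final_route). The counter is bumped on every
    retry; without that increment the loop would never reach the cap.
    """
    retry_count = 0
    for score in scores:
        decision = route(score, retry_count)
        if decision == "format_output":
            return retry_count, decision
        retry_count += 1
    raise RuntimeError("ran out of scripted scores before the loop terminated")


improving = run_reflection_loop([0.4, 0.8])          # quality recovers after one retry
stuck = run_reflection_loop([0.2, 0.2, 0.2, 0.2])    # never improves; the cap ends it
print(improving, stuck)
```

In the second run the analysis is still below the threshold when the loop exits, so downstream code may want to surface the low score rather than present the report as fully validated.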

Graph Assembly

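from langgraph.checkpoint.base import BaseCheckpointSaver
from langgraph.graph.state import CompiledStateGraph


def build_career_graph(checkpointer: BaseCheckpointSaver) -> CompiledStateGraph:
    """
    Wire the nodes into a compiled graph. The caller supplies the checkpointer,
    e.g. AsyncPostgresSaver from langgraph.checkpoint.postgres.aio (the
    langgraph-checkpoint-postgres package), so connection setup stays outside.
    """
    graph = StateGraph(CareerAnalysisState)

    graph.add_node("parse_profile", parse_profile_node)
    graph.add_node("extract_skills", extract_skills_node)
    graph.add_node("retrieve_jobs", retrieve_jobs_node)
    graph.add_node("analyze_gaps", analyze_gaps_node)
    graph.add_node("synthesize", synthesize_node)
    graph.add_node("reflect", reflection_node)
    graph.add_node("format_output", format_output_node)

    # Linear path: parse -> extract -> retrieve -> analyze -> synthesize -> reflect
    graph.set_entry_point("parse_profile")
    graph.add_edge("parse_profile", "extract_skills")
    graph.add_edge("extract_skills", "retrieve_jobs")
    graph.add_edge("retrieve_jobs", "analyze_gaps")
    graph.add_edge("analyze_gaps", "synthesize")
    graph.add_edge("synthesize", "reflect")

    # After reflection, either loop back to retrieval or finish.
    graph.add_conditional_edges(
        "reflect",
        should_retry,
        {
            "retry_retrieval": "retrieve_jobs",  # Loop back
            "format_output": "format_output",
        },
    )

    graph.add_edge("format_output", END)

    # State is checkpointed after each node, so a failed run can resume.
    return graph.compile(checkpointer=checkpointer)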

Failures & Lessons

Failure 1: Free-text LLM output for structured data. Early versions generated the analysis as free text, then tried to parse it. JSON parsing failed ~15% of the time on complex nested structures. OpenAI's Structured Outputs API (with a Pydantic model as response_format) reduced this to 0%.
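
The failure mode is easy to reproduce without any model call. This sketch shows a strict parser rejecting two typical free-text outputs: a reply wrapped in conversational prose and a reply cut off mid-structure. The sample strings and the `REQUIRED_KEYS` field names are invented for illustration.

```python
import json
from typing import Optional

REQUIRED_KEYS = {"summary", "skill_gaps"}  # hypothetical report fields


def parse_report(raw: str) -> Optional[dict]:
    """Return the parsed report, or None if the text is not valid, complete JSON."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(report, dict) or not REQUIRED_KEYS.issubset(report):
        return None
    return report


clean = '{"summary": "Strong backend profile", "skill_gaps": ["Kubernetes"]}'
chatty = "Sure! Here is the analysis:\n" + clean                    # prose before the JSON
truncated = '{"summary": "Strong backend profile", "skill_gaps": ["Kub'  # cut off mid-list

for label, raw in [("clean", clean), ("chatty", chatty), ("truncated", truncated)]:
    print(label, "->", "ok" if parse_report(raw) else "rejected")
```

Schema-constrained generation removes the first class of failure at the source, so a check like this becomes a safety net rather than the primary defense.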

Failure 2: Unbounded reflection loops. Without a max_retries guard, a poor retrieval result could trigger infinite reflection-retry cycles. Added an explicit retry counter to the state, capped at 3 iterations.

Failure 3: No state persistence. Multi-agent processing takes 15–30 seconds. Without state checkpointing, any failure (network error, LLM timeout) required a complete restart. Added a Postgres-backed LangGraph checkpointer to persist state after each node completes.
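
The idea behind per-node checkpointing can be shown with a small in-memory stand-in. This is not LangGraph's checkpointer: the `run_with_checkpoints` helper, the dict-based store, and the node functions are all hypothetical, and they only illustrate how a failed run resumes from the last completed node.

```python
from typing import Callable

Node = Callable[[dict], dict]  # takes the current state, returns a partial update


def run_with_checkpoints(
    nodes: list[tuple[str, Node]],
    initial_state: dict,
    store: dict,
    thread_id: str,
) -> dict:
    """Run nodes in order, saving (next_index, state) to `store` after each one.

    If a checkpoint already exists for `thread_id`, execution resumes from it
    instead of starting over.
    """
    next_index, state = store.get(thread_id, (0, dict(initial_state)))
    for index in range(next_index, len(nodes)):
        _name, node = nodes[index]
        state = {**state, **node(state)}
        store[thread_id] = (index + 1, state)  # checkpoint after each node
    return state


calls = []
attempts = {"synthesize": 0}


def parse(state: dict) -> dict:
    calls.append("parse")
    return {"profile": "parsed"}


def synthesize(state: dict) -> dict:
    calls.append("synthesize")
    attempts["synthesize"] += 1
    if attempts["synthesize"] == 1:
        raise TimeoutError("simulated LLM timeout")
    return {"analysis": "done"}


pipeline = [("parse", parse), ("synthesize", synthesize)]
store: dict = {}

try:
    run_with_checkpoints(pipeline, {"resume_text": "..."}, store, "thread-1")
except TimeoutError:
    pass  # the first run dies mid-way; the checkpoint written after parse survives

result = run_with_checkpoints(pipeline, {"resume_text": "..."}, store, "thread-1")
print(calls, result)
```

On the second call the parse step is skipped, which is the property that matters when each node is a paid, multi-second LLM call.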

Failure 4: Retrieval context window overflow. Initially, each query retrieved 50 job postings. This filled the GPT-4o context window, and analysis quality degraded significantly. Implemented relevance-ranked truncation: top 15 postings, with key fields extracted to minimize tokens.
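
A minimal sketch of relevance-ranked truncation, assuming each retrieved posting is a dict with a numeric `score` from the retriever. The field names in `KEY_FIELDS` are hypothetical; the actual set of fields kept by CareerLens is not stated above.

```python
KEY_FIELDS = ("title", "company", "required_skills")  # hypothetical posting fields
TOP_K = 15


def compact_postings(postings: list[dict], top_k: int = TOP_K) -> list[dict]:
    """Keep the top_k postings by relevance score and drop all but the key fields."""
    ranked = sorted(postings, key=lambda posting: posting["score"], reverse=True)
    return [
        {name: posting[name] for name in KEY_FIELDS if name in posting}
        for posting in ranked[:top_k]
    ]


# 50 synthetic postings, each carrying a long description that would waste tokens.
postings = [
    {
        "title": f"Engineer {i}",
        "company": "Acme",
        "required_skills": ["python"],
        "description": "long text " * 200,
        "score": i / 50,
    }
    for i in range(50)
]
compact = compact_postings(postings)
print(len(compact), compact[0]["title"])
```

Dropping the free-text description is where most of the token savings come from in this example; the cut from 50 to 15 postings then bounds the total regardless of query.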

Future Improvements

Fine-tuned skill extraction — Domain-specific fine-tuned model for skill taxonomy classification; current GPT-4o approach has inconsistencies with specialized technical skills
Market trend grounding — Add real-time salary data and job posting velocity as retrieval sources
Personalized follow-up questions — Agentic clarification before processing when input is ambiguous
Streaming agent output — Real-time node progress updates via WebSocket (partially implemented)

Key Takeaways

LangGraph's explicit state graph is significantly more debuggable than implicit agent frameworks — you always know what state caused what behavior
Structured Outputs (Pydantic response_format) is not optional for multi-step agents that need to pass data between nodes
Reflection loops genuinely improve output quality, but must be bounded to prevent runaway processing
State persistence is a first-class requirement for any agent system where processing takes more than a few seconds