
Chatbot (1) – Building a RAG-Based Knowledge Base: Using AI Agentic Workflow

Updated: Oct 3


Background & Challenges


As customer expectations evolve, companies are no longer satisfied with basic chatbot functionality. Demand is growing for intelligent assistants that deliver a high-tech yet personable customer experience, a trend that has accelerated the adoption of AI avatars and next-gen virtual agents. From an IT consulting standpoint, the visual quality of avatars matters, but the top priority is ensuring the chatbot delivers accurate and trustworthy responses.


Key Challenge: Building a High-Quality Knowledge Base for RAG Systems


From a technical standpoint, implementing RAG (Retrieval-Augmented Generation) with LLMs is relatively straightforward. However, real-world deployment presents several major challenges:


  1. Preparing high-quality source knowledge data for RAG

  2. Conducting thorough testing of the developed chatbot

  3. Customizing the solution to meet client-specific requirements


These steps are labor-intensive and time-consuming.


Solution: AI Agentic Workflow for Automated Q&A Pair Generation


To overcome these challenges, TecAce introduced an innovative solution using its proprietary Q&A Generator tool, built on an AI agentic workflow. This tool automatically generates question-answer pairs and applies AI-based supervision to evaluate and improve response quality.




Knowledge Source Processing Workflow


  • Multi-format document ingestion: Supports text, PDF, Excel, images, and more

  • Semantic chunking: Uses structural analysis to break down documents into meaning-based chunks

  • User-goal oriented document parsing: Focuses on extracting Q&A pairs aligned with user needs
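As a rough illustration of structure-based chunking, the sketch below splits plain text into chunks at heading-like lines. The heading heuristic (a short line with no closing punctuation) and the names `Chunk`, `looks_like_heading`, and `chunk_by_headings` are assumptions made for this example. They are not taken from the Q&A Generator tool, whose internals are not described here.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    heading: str
    lines: list[str] = field(default_factory=list)

    @property
    def text(self) -> str:
        return " ".join(self.lines)


def looks_like_heading(line: str, max_words: int = 8) -> bool:
    """Heuristic (assumption): a short line with no closing punctuation."""
    stripped = line.strip()
    return (
        bool(stripped)
        and len(stripped.split()) <= max_words
        and stripped[-1] not in ".!?:;,"
    )


def chunk_by_headings(document: str) -> list[Chunk]:
    """Group body lines under the most recent heading-like line.

    Headings that have no body text before the next heading are dropped.
    """
    chunks: list[Chunk] = []
    current = Chunk(heading="(preamble)")
    for raw in document.splitlines():
        line = raw.strip()
        if not line:
            continue
        if looks_like_heading(line):
            if current.lines:
                chunks.append(current)
            current = Chunk(heading=line)
        else:
            current.lines.append(line)
    if current.lines:
        chunks.append(current)
    return chunks


sample = """Return Policy
Items can be returned within 30 days of delivery.
Refunds are issued to the original payment method.

Shipping
Standard shipping takes 3 to 5 business days."""

for chunk in chunk_by_headings(sample):
    print(f"[{chunk.heading}] {chunk.text}")
```

Each chunk keeps its heading, so a later Q&A generation step can tell which topic a passage belongs to. PDFs, spreadsheets, and images would need a format-specific extraction step before a splitter like this could run.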




AI Agentic Q&A Generation System


  • Automatically generates questions of varying complexity and depth

  • Covers simple factual queries to complex scenario-based and edge-case questions

  • Users can adjust the quantity, complexity, and type of questions
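One way to picture the adjustable settings is a small configuration object that is rendered into a generation prompt. The sketch below is hypothetical: `QuestionSpec`, `build_prompt`, the three complexity levels, and the prompt wording are all assumptions for illustration. No LLM is called.

```python
from dataclasses import dataclass

# Question types follow the bullets above; the identifiers are assumptions.
QUESTION_TYPES = ("factual", "scenario", "edge_case")
COMPLEXITY_LEVELS = ("simple", "moderate", "complex")  # assumed levels


@dataclass(frozen=True)
class QuestionSpec:
    count: int = 5
    complexity: str = "simple"
    question_type: str = "factual"

    def __post_init__(self) -> None:
        # Reject settings the prompt template cannot express.
        if self.count < 1:
            raise ValueError("count must be at least 1")
        if self.complexity not in COMPLEXITY_LEVELS:
            raise ValueError(f"unknown complexity: {self.complexity}")
        if self.question_type not in QUESTION_TYPES:
            raise ValueError(f"unknown question type: {self.question_type}")


def build_prompt(chunk_text: str, spec: QuestionSpec) -> str:
    """Render a generation prompt for one chunk; the wording is illustrative."""
    kind = spec.question_type.replace("_", "-")
    return (
        f"Write {spec.count} {spec.complexity} {kind} questions that a customer "
        f"might ask, answerable only from the text below.\n\n"
        f"Text:\n{chunk_text}"
    )


spec = QuestionSpec(count=3, complexity="complex", question_type="edge_case")
print(build_prompt("Items can be returned within 30 days of delivery.", spec))
```

Validating the settings up front means a typo in a complexity level or question type fails immediately instead of producing a vague prompt.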




Generate and optimize accurate answers


  • A precise and contextually correct answer is retrieved or generated

  • An evaluator agent scores the QA pair using quantitative metrics

  • QA pairs failing to meet thresholds are regenerated with feedback-driven improvements

  • Answers are tailored to match the desired tone and brand style
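The generate, evaluate, and regenerate steps above can be sketched as a simple loop. This is a minimal sketch under stated assumptions. `refine_answer`, the callable signatures, the 0.9 threshold, and the three-round limit are hypothetical, and the toy generator and evaluator stand in for the LLM-backed agents a real system would use.

```python
from typing import Callable, Optional

# Hypothetical signatures; a real system would back these with LLM calls.
Generator = Callable[[str, Optional[str]], str]      # (question, feedback) -> answer
Evaluator = Callable[[str, str], tuple[float, str]]  # (question, answer) -> (score, feedback)


def refine_answer(
    question: str,
    generate: Generator,
    evaluate: Evaluator,
    threshold: float = 0.9,
    max_rounds: int = 3,
) -> tuple[str, float]:
    """Regenerate until the score meets the threshold or the rounds run out.

    Returns the best-scoring answer seen, together with its score.
    """
    feedback: Optional[str] = None
    best_answer, best_score = "", float("-inf")
    for _ in range(max_rounds):
        answer = generate(question, feedback)
        score, feedback = evaluate(question, answer)
        if score > best_score:
            best_answer, best_score = answer, score
        if score >= threshold:
            break
    return best_answer, best_score


def toy_generate(question: str, feedback: Optional[str]) -> str:
    # Stand-in for an LLM: adds the missing detail once feedback arrives.
    if feedback is None:
        return "You can return items."
    return "You can return items within 30 days of delivery."


def toy_evaluate(question: str, answer: str) -> tuple[float, str]:
    # Stand-in for an evaluator agent: rewards answers that state the time limit.
    if "30 days" in answer:
        return 1.0, "ok"
    return 0.4, "State the return window."


answer, score = refine_answer("What is the return policy?", toy_generate, toy_evaluate)
print(score, answer)
```

Capping the number of rounds keeps a pair that never passes from looping forever. Such pairs could be routed to human review instead.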



AI Supervision: Quality Assurance System


Using TecAce’s AI Supervision Framework, each QA pair is assessed with strict quality metrics:

| Metric | Description | Threshold |
| --- | --- | --- |
| Accuracy | Factual correctness of the answer | ≥ 90% |
| Answer Relevance | Semantic alignment between Q & A | ≥ 85% |
| Hallucination Rate | Rate of generated content not found in source | ≤ 5% |
| Readability | Flesch Reading Ease score | ≥ 60 |

QA pairs that miss any of these thresholds are fed back into the workflow for regeneration with improvement guidance.
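To make the gating concrete, the sketch below checks a set of scores against the thresholds in the table and computes Flesch Reading Ease with the standard formula, 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words). Several things here are assumptions: percentages are expressed as fractions, the syllable counter is a rough vowel-group heuristic, the names `THRESHOLDS` and `failed_metrics` are invented for the example, and the sample scores are placeholders, not measured results. How the AI Supervision Framework actually computes accuracy, relevance, and hallucination rate is not described here.

```python
import re

# Thresholds copied from the table; "min" means higher is better,
# "max" means lower is better. Percentages are written as fractions.
THRESHOLDS = {
    "accuracy": ("min", 0.90),
    "answer_relevance": ("min", 0.85),
    "hallucination_rate": ("max", 0.05),
    "readability": ("min", 60.0),
}


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease using the heuristic syllable counter above."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))


def failed_metrics(scores: dict[str, float]) -> list[str]:
    """Return the names of the metrics that miss their threshold."""
    failures = []
    for name, (kind, limit) in THRESHOLDS.items():
        value = scores[name]
        passed = value >= limit if kind == "min" else value <= limit
        if not passed:
            failures.append(name)
    return failures


answer = "You can return items within 30 days. Refunds go to your card."
scores = {
    "accuracy": 0.95,            # placeholder value
    "answer_relevance": 0.80,    # placeholder value, below the 85% threshold
    "hallucination_rate": 0.02,  # placeholder value
    "readability": flesch_reading_ease(answer),
}
print(failed_metrics(scores))  # -> ['answer_relevance']
```

The list of failed metric names can double as the improvement guidance passed to the regeneration step.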



Implementation Results


Processing a ~50-page document, the system automatically generated around 2,000 high-quality QA pairs through this workflow, resulting in:


  1. Significantly Improved Accuracy: Over 50% improvement in answer correctness compared with feeding the document directly into RAG


  2. Reduced Human Burden: Initial manual effort for data extraction and QA generation decreased substantially


  3. Optimized Vector DB Creation: Structured QA pairs improved retrieval efficiency and reduced preprocessing time by ~70%


  4. Customizable Responses: Produced consistent, brand-aligned answers even without fine-tuning
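To show how structured QA pairs might be indexed for retrieval, here is a toy sketch that matches a user query against the stored questions and returns the paired answer. It is an illustration only. A bag-of-words cosine similarity stands in for the embedding model and vector database a real deployment would use, and `QAIndex`, `bag_of_words`, and the `min_score` cutoff are assumed names and values.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class QAIndex:
    """Indexes each QA pair by its question and returns the stored answer."""

    def __init__(self, pairs):
        self.entries = [(bag_of_words(q), q, a) for q, a in pairs]

    def search(self, query: str, min_score: float = 0.2):
        """Return (question, answer, score) for the best match, or None."""
        query_vec = bag_of_words(query)
        score, question, answer = max(
            ((cosine(query_vec, vec), q, a) for vec, q, a in self.entries),
            key=lambda item: item[0],
            default=(0.0, "", ""),
        )
        return (question, answer, score) if score >= min_score else None


index = QAIndex([
    ("How long do I have to return an item?",
     "Returns are accepted within 30 days of delivery."),
    ("How long does standard shipping take?",
     "Standard shipping takes 3 to 5 business days."),
])
print(index.search("how long does shipping take"))
```

Because user queries are themselves questions, matching them against stored questions compares like with like, and the answer comes back already curated. The `min_score` cutoff lets the caller fall back to another strategy when nothing matches well.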


Benefits and Limitations


Key Benefits


  • Consistent Responses: Delivers uniform quality in customer-facing interactions


  • Easy Knowledge Base Maintenance: Structured format enables faster updates and edits


  • Improved Contextual Understanding: Q&A format enhances retrieval compared to paragraph-level chunking


  • Better User Experience: Clear and concise responses increase user satisfaction


Limitations & Mitigation


| Issue | Description | Mitigation Strategy |
| --- | --- | --- |
| Knowledge Base Bias | Potential propagation of source bias | AI Supervision with bias testing and diversified questioning |
| Data Quality & Hallucination | Noisy or incomplete data can degrade QA quality | Multi-layered evaluation using Accuracy, Relevance, etc. |


Conclusion & Future Outlook


Building a Q&A-based RAG knowledge base with AI agentic workflows significantly boosts the accuracy, efficiency, and reliability of chatbot systems. This approach is particularly valuable for customer service domains where structured, high-quality responses are essential.


Moving forward, TecAce aims to expand the use of agentic workflows across other AI knowledge applications, paving the way for scalable, adaptive AI customer experiences.


Lead AI governance with AI Supervision! AI Supervision ensures transparency and ethical responsibility of AI systems, supporting businesses in establishing reliable AI governance. Create a safer and more trustworthy AI environment with AI Supervision, which offers real-time monitoring, performance evaluation, and compliance with ethical standards!
