diff --git a/2_open_source_models/recursive_multimodel/readme.md b/2_open_source_models/recursive_multimodel/readme.md
deleted file mode 100644
index a66c452..0000000
--- a/2_open_source_models/recursive_multimodel/readme.md
+++ /dev/null
@@ -1,282 +0,0 @@
-# Hybrid RAG with Critic–Refiner Workflow (Qwen2.5 + LaMini)
-
-## 1. 🎯 Goal
-
-This project implements a **Retrieval-Augmented Generation (RAG)** pipeline
-enhanced with a **dual-stage Critic–Refiner architecture**.
-
-The main objective is a **highly accurate, context-grounded, and reliable
-question-answering system**, combining:
-
-- **Qwen2.5-7B-Instruct** (cloud-based Critic)
-- **LaMini-Flan-T5** (local Refiner)
-- **LlamaIndex** (retrieval engine)
-
-The system evaluates each draft answer with the critic model, which detects
-factual errors and missing context, and then rewrites the draft with the
-local refiner model.
-This produces answers that are **trustworthy**, **grounded**, and **fully
-derived from the source documents**.
-
----
-
-## 2. 🤖 About the Models Used
-
-### 2.1 Qwen2.5-7B-Instruct (Critic Model)
-
-Qwen2.5-7B is a powerful instruction-tuned LLM developed by Alibaba Cloud.
-It was chosen as the **Critic** for these reasons:
-
-- **High factual reliability:** Qwen models consistently score highly on
-  truthfulness and instruction-following benchmarks.
-- **Ideal for evaluation:** Served through the Hugging Face Inference API,
-  it is fast, stable, and accurate.
-- **Strong reasoning:** Well suited to judging how closely a generated draft
-  aligns with the retrieved context.
-
-### 2.2 LaMini (Local Refiner Model)
-
-LaMini is a family of compact, efficient, open-source models from MBZUAI
-that work well for rewriting and stylistic refinement.
-LaMini-Flan-T5-248M was selected as the **Refiner** because:
-
-- **Small and fast:** At 248M parameters, it runs comfortably on consumer
-  hardware.
-- **Good at rewriting:** Well suited to polishing or correcting drafts based
-  on reviewer feedback.
-- **Local privacy:** No online requests; all refinement happens locally.
-- **Lightweight:** Fits the project's goal of low-cost, local execution.
-
-### 2.3 Why a Critic–Refiner System?
-
-This architecture ensures:
-
-- The **Critic** checks for correctness, consistency, and missing facts.
-- The **Refiner** applies only the requested corrections.
-- The workflow minimizes hallucinations and keeps answers grounded in the
-  source documents.
-
-This structure is inspired by **self-correcting LLM systems** and
-**human-in-the-loop editorial workflows**, here fully automated.
-
----
-
-## 3. 🛠️ Methodology: Retrieval-Augmented Generation (RAG)
-
-To answer questions about documents that were not part of the LLM's training
-data, RAG augments the model's knowledge with retrieval.
-
-The pipeline works as follows:
-
-1. **Retrieval:**
-   User question → convert to embedding → search vector index → retrieve the
-   most relevant text chunks.
-
-2. **Draft Generation:**
-   The retrieved context and the question are used to generate a
-   **draft answer**.
-
-3. **Critic Evaluation (Qwen2.5):**
-   The critic compares the draft answer against the retrieved context and
-   returns:
-   - `[OK]` — Draft is accurate
-   - `[REVISE]` — Draft contains errors/missing info
-   - plus a bulleted list of required corrections.
-
-4. **Refinement (LaMini):**
-   LaMini rewrites the draft based **only on the critic's feedback**,
-   producing the final polished answer.
-
-This ensures accuracy and consistency with the source documents.
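-
-As a minimal sketch, one pass through this loop looks like the following
-(assuming the `retriever`, the `call_critic_api` helper, the `llm_local`
-wrapper, and the prompt templates defined in the accompanying notebook):
-
-```python
-def answer_with_critique(query: str, draft: str, cycles: int = 2) -> str:
-    # 1. Retrieval: collect the source context for this question.
-    nodes = retriever.retrieve(query)
-    context = "\n---\n".join(n.get_content() for n in nodes)
-
-    for _ in range(cycles):
-        # 2. Critic (Qwen2.5 via the Inference API) judges the current draft.
-        critique = call_critic_api(
-            CRITIC_SYSTEM_PROMPT,
-            CRITIC_USER_TEMPLATE.format(context=context, query=query, draft=draft),
-        )
-        if critique.startswith("[OK]"):
-            break  # Draft is grounded in the context; stop early.
-
-        # 3. Refiner (local LaMini) rewrites using only the critic's feedback.
-        draft = llm_local.complete(
-            REFINER_PROMPT_TEMPLATE.format(draft=draft, feedback=critique)
-        ).text
-    return draft
-```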
-
-### Implementation Details
-
-- **Framework:** `LlamaIndex`
-- **Local Model Loader:** `transformers` (`AutoModelForSeq2SeqLM`)
-- **Embedding Model:** `BAAI/bge-small-en-v1.5` via `HuggingFaceEmbedding`
-- **Critic Model:** `Qwen/Qwen2.5-7B-Instruct` via the Hugging Face Inference API
-- **Refiner Model:** `MBZUAI/LaMini-Flan-T5-248M`
-- **Energy Tracking:** CodeCarbon (`OfflineEmissionsTracker`)
-
----
-
-## 4. 📑 Prompt Engineering: The Editorial Workflow
-
-### 4.1 Critic Prompt
-
-The Critic acts like a strict editor.
-
-It must:
-
-- Judge the draft answer
-- Compare it with the source context
-- Output `[OK]` or `[REVISE]`
-- Provide bullet-point feedback only when necessary
-
-Example behavior:
-
-> [REVISE]
-> - The draft added information not found in the source context.
-> - Missing key fact about X.
-
-### 4.2 Refiner Prompt (LaMini)
-
-The Refiner receives:
-
-- The draft answer
-- The Editor's (Critic's) feedback
-
-It rewrites the answer accordingly, following strict rules:
-
-- Only fix the issues the Critic highlighted
-- No new information allowed
-- Must produce a complete final answer
-
-This prevents new hallucinations from being introduced during refinement.
-
----
-
-## 5. 📊 Sample Workflow (Prompts & Responses)
-
-The transcript below shows one complete two-cycle run.
-
-### Example: [Summarization]
-
-- **Prompt:**
-  > In your own words, describe what happened when the computer alarms
-  appeared during the landing.
-
-### **Cycle 1:**
-
-- **Draft Answer:**
-  > As the descent began, Armstrong and Aldrin found themselves
-  passing landmarks on the surface two or three seconds early. The problem could
-  have been mascons, concentrations of high mass in a region or regions of the
-  Moon's crust that contains a gravitational anomaly, potentially altering Eagle's
-  trajectory. Five minutes into the descent, the burn and 6,000 feet (1,800 m)
-  above the surface of the moon, the LM guidance computer (LGC) distracted the
-  crew with the first of several unexpected 1201 and 1202 program alarms. Inside
-  Mission Control Center, computer engineer Jack Garman told Guidance Officer
-  Steve Bales it was safe to continue the descent. The program alarm indicated
-  "executive overflows," meaning the guidance computer could not complete all its
-  tasks in real-time and had to postpone some of them. Margaret Hamilton, the
-  Director of Apollo Flight Computer Programming at the MIT Charles Stark Draper
-  Laboratory later recalled: "To blame the computer for the Apollo 11 problems is
-  like blaming the person who spots a fire and calls the fire department.
-  Actually, the computer was programmed to do more than recognize error
-  conditions. A complete set of recovery programs was incorporated into the
-  software. The software's action, in this case, was to eliminate lower priority
-  tasks and re-establish the more important ones. The computer, rather than almost
-  forcing an abort, prevented an abourt. If the computer hadn't recognized this
-  problem and took recovery action, I doubt if Apollo 11 would have been the
-  successful Moon landing it was.
-
-- **Critic Response:**
-
-  >- The draft does not accurately describe the computer alarms that appeared
-  during the landing. The draft should include the specific computer alarms
-  (1201 and 1202) and the crew's reaction to them.
-  >- The draft does not mention the specific actions taken by the crew in
-  response to the computer alarms, such as Armstrong taking semi-automatic control.
- >- The draft does not include the details about Mission Control's role in - addressing the alarms and the reassurance given to the crew. - >- The draft does not mention the post-mission analysis showing the real fuel - remaining was probably closer to 50 seconds, which is critical information. - -### **Cycle 2:** - -- **Draft Answer:** - >The LM guidance computer (LGC) distracted the crew with the first - of several unexpected 1201 and 1202 program alarms during the Apollo 11 mission. - The program alarm indicated "executive overflows," meaning the guidance computer - could not complete all its tasks in real-time and had to postpone some of them. - The recovery programs were incorporated into the software, eliminating lower - priority tasks and re-establishing the more important ones. Margaret Hamilton, - the Director of Apollo Flight Computer Programming at the MIT Charles Stark - Draper Laboratory later recalled that to blame the computer for the Apollo 11, - problems is like blaming the person who spots a fire and calls the fire - department. Actually, the computer was programmed to do more than recognize - error conditions. The software's action, in this case, was to eliminate lower - priority jobs and regain the more significant ones. If the computer hadn't - recognized this problem and took recovery action, I doubt if Apollo 11 would - have been the successful Moon landing it was. - -- **Critic Response:** - - >- The draft does not mention the specific computer alarms (1201 and 1202) - that appeared during the landing. - >- The draft does not reference the exact timing of the alarms (five minutes - into the descent, 6,000 feet above the surface). - >- The draft does not include the specific actions taken by Mission Control - to address the alarms. - >- The draft does not mention the critical role of the guidance computer in - safely continuing the descent despite the alarms. - >- The draft does not include the perspective of the astronauts experiencing - the alarms. - >- The draft does not provide the exact quote from Jack Garman to Steve Bales. - >- The draft does not mention Margaret Hamilton's role or her specific - recollection of the event. - -- **Refined Answer (Final):** - - >The LM guidance computer (LGC) distracted the crew with the first - of several unexpected 1201 and 1202 program alarms during the Apollo 11 mission. - The program alarm indicated "executive overflows," meaning the guidance computer - could not complete all its tasks in real-time and had to postpone some of them. - The recovery programs were incorporated into the software, eliminating lower - priority tasks and re-establishing the more important ones. Margaret Hamilton, - the Director of Apollo Flight Computer Programming at the MIT Charles Stark - Draper Laboratory later recalled that to blame the computer for the Apollo 11, - problems is like blaming the person who spots a fire and calls the fire - department. Actually, the computer was programmed to do more than recognize - error conditions. The software's action, in this case, was to eliminate lower - priority jobs and regain the more significant ones. If the computer hadn't - recognized this problem and took recovery action, I doubt if Apollo 11 would - have been the successful Moon landing it was. - ---- - -## 6. 🌱 Environmental Tracking - -We used **CodeCarbon** to measure local compute emissions and energy usage. 
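-
-A minimal sketch of how the tracker brackets a refinement session (mirroring
-the notebook setup; offline mode needs an explicit ISO country code):
-
-```python
-from codecarbon import OfflineEmissionsTracker
-
-tracker = OfflineEmissionsTracker(country_iso_code="EGY")
-tracker.start()
-try:
-    tracker.start_task("Hybrid Refinement")
-    # ... run one critic-refine session here ...
-    tracker.stop_task()
-finally:
-    tracker.stop()  # writes emissions.csv with the estimated kg CO2eq
-```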
-
-This tracking enables:
-
-- Transparency regarding energy cost
-- Comparison with API-based approaches
-- Understanding the environmental impact on local hardware
-
----
-
-## 7. 📚 References (Reputable Sources)
-
-All documentation used:
-
-- Hugging Face Inference API
-- LlamaIndex Documentation
-- LaMini Models
-- Qwen2.5 Models
-- LlamaCPP / GGUF Models
-- CodeCarbon
-
----
-
-## 8. ✅ Summary
-
-This project demonstrates a hybrid RAG architecture that blends cloud-based
-reasoning with local refinement.
-The Critic–Refiner pipeline substantially improves accuracy, reduces
-hallucinations, and keeps answers faithful to the source documents.
-
-LaMini provides fast, private, offline rewriting, while Qwen2.5 provides
-high-quality factual evaluation.
-
-Together, they form a reliable, cost-efficient RAG system.
diff --git a/2_open_source_models/recursive_multimodel/recursive_agent.ipynb b/2_open_source_models/recursive_multimodel/recursive_agent.ipynb
deleted file mode 100644
index 0b467a1..0000000
--- a/2_open_source_models/recursive_multimodel/recursive_agent.ipynb
+++ /dev/null
@@ -1,421 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "dd933b7e",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "!pip install huggingface_hub\n",
-    "!pip install llama-index-core\n",
-    "!pip install llama-index-embeddings-huggingface\n",
-    "!pip install sentence-transformers\n",
-    "!pip install pypdf\n",
-    "!pip install codecarbon\n",
-    "!pip install tf-keras"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "2f230080",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "%pip install llama-index-llms-llama-cpp llama-index-embeddings-huggingface llama-index-core"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "1869cdf3",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "C:\\Users\\YNA\\AppData\\Roaming\\Python\\Python312\\site-packages\\tqdm\\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
-      "  from .autonotebook import tqdm as notebook_tqdm\n"
-     ]
-    }
-   ],
-   "source": [
-    "import os\n",
-    "import textwrap\n",
-    "from huggingface_hub import InferenceClient\n",
-    "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
-    "from llama_index.embeddings.huggingface import HuggingFaceEmbedding\n",
-    "from codecarbon import OfflineEmissionsTracker\n",
-    "from transformers import AutoTokenizer, AutoModelForSeq2SeqLM\n",
-    "import torch\n",
-    "\n",
-    "# --- 1. Configuration ---\n",
-    "# Read the token from the environment rather than hard-coding a secret.\n",
-    "HF_API_KEY = os.environ.get(\"HF_TOKEN\")\n",
-    "CRITIC_MODEL_ID = \"Qwen/Qwen2.5-7B-Instruct\"\n",
-    "\n",
-    "# Local Config (Refiner)\n",
-    "LAMINI_MODEL_PATH = r\"C:\\Users\\user\\.cache\\huggingface\\hub\\models--MBZUAI--LaMini-Flan-T5-248M\\snapshots\\4e871ba5f20216feaa3b845fc782229cd64eba47\"\n",
-    "DATA_PATH = (\n",
-    "    r\"C:\\Users\\user\\ELO2_GREEN_AI\\2_open_source_models\\quantized_models\\mistral7b\\data\"\n",
-    ")\n",
-    "YOUR_COUNTRY_ISO_CODE = \"EGY\"\n",
-    "\n",
-    "# --- 2. 
Define Robust Prompts ---\n",
-    "\n",
-    "# Critic: forced to start with a status tag so the loop can detect [OK]\n",
-    "CRITIC_SYSTEM_PROMPT = \"\"\"You are a strict Editor.\n",
-    "Compare the 'Draft' to the 'Source Context'.\n",
-    "\n",
-    "Output format:\n",
-    "- Start with \"[OK]\" if the draft is accurate and needs no changes.\n",
-    "- Start with \"[REVISE]\" if there are errors or missing key facts.\n",
-    "- Then provide a bulleted list of feedback.\n",
-    "\n",
-    "Rules:\n",
-    "1. If the Draft contradicts the Context, mark it [REVISE].\n",
-    "2. If the Draft is missing a CRITICAL fact, mark it [REVISE].\n",
-    "3. Do NOT nitpick small details.\"\"\"\n",
-    "\n",
-    "CRITIC_USER_TEMPLATE = \"\"\"--- Source Context ---\n",
-    "{context}\n",
-    "--- User Question ---\n",
-    "{query}\n",
-    "--- Draft Answer ---\n",
-    "{draft}\n",
-    "\n",
-    "Critique:\"\"\"\n",
-    "\n",
-    "REFINER_PROMPT_TEMPLATE = \"\"\"You are a professional Writer.\n",
-    "Rewrite the 'Draft Answer' to incorporate the 'Editor's Feedback'.\n",
-    "\n",
-    "Rules:\n",
-    "- Only fix what the Editor asked for.\n",
-    "- Do NOT cut off the answer; write the complete response.\n",
-    "- Do not add external info.\n",
-    "\n",
-    "--- Draft Answer ---\n",
-    "{draft}\n",
-    "\n",
-    "--- Editor's Feedback ---\n",
-    "{feedback}\n",
-    "\n",
-    "--- Rewritten Answer ---\n",
-    "\"\"\""
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "id": "159299b0",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "API Client (Critic) initialized.\n",
-      "Loading LaMini-Flan-T5 (Refiner)...\n",
-      "Local LaMini (Refiner) loaded.\n",
-      "Retriever ready.\n"
-     ]
-    }
-   ],
-   "source": [
-    "if not HF_API_KEY:\n",
-    "    raise ValueError(\"HF_TOKEN not set.\")\n",
-    "\n",
-    "# --- 1. Initialize API Client (Critic) ---\n",
-    "client = InferenceClient(token=HF_API_KEY)\n",
-    "print(\"API Client (Critic) initialized.\")\n",
-    "\n",
-    "# --- 2. Initialize Local LaMini (Refiner) ---\n",
-    "print(\"Loading LaMini-Flan-T5 (Refiner)...\")\n",
-    "lamini_tokenizer = AutoTokenizer.from_pretrained(LAMINI_MODEL_PATH)\n",
-    "lamini_model = AutoModelForSeq2SeqLM.from_pretrained(\n",
-    "    LAMINI_MODEL_PATH, dtype=torch.float16, device_map=\"auto\"\n",
-    ")\n",
-    "\n",
-    "\n",
-    "class LaMiniWrapper:\n",
-    "    \"\"\"Expose a .complete() interface so the T5 refiner can stand in for an LLM.\"\"\"\n",
-    "\n",
-    "    def __init__(self, model, tokenizer):\n",
-    "        self.model = model\n",
-    "        self.tokenizer = tokenizer\n",
-    "\n",
-    "    def complete(self, prompt):\n",
-    "        # T5 accepts at most 512 input tokens; longer prompts are truncated.\n",
-    "        inputs = self.tokenizer(\n",
-    "            prompt, return_tensors=\"pt\", max_length=512, truncation=True\n",
-    "        ).to(self.model.device)\n",
-    "\n",
-    "        # Near-greedy sampling keeps the rewrite close to the draft and feedback.\n",
-    "        outputs = self.model.generate(\n",
-    "            **inputs,\n",
-    "            max_new_tokens=1024,\n",
-    "            temperature=0.1,\n",
-    "            do_sample=True,\n",
-    "            top_p=0.95,\n",
-    "            repetition_penalty=1.1,\n",
-    "            no_repeat_ngram_size=3,\n",
-    "        )\n",
-    "\n",
-    "        text = self.tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
-    "\n",
-    "        class Response:\n",
-    "            def __init__(self, text):\n",
-    "                self.text = text\n",
-    "\n",
-    "        return Response(text)\n",
-    "\n",
-    "\n",
-    "llm_local = LaMiniWrapper(lamini_model, lamini_tokenizer)\n",
-    "print(\"Local LaMini (Refiner) loaded.\")\n",
-    "\n",
-    "# --- 3. 
Initialize Local Retriever ---\n", - "embed_model = HuggingFaceEmbedding(model_name=\"BAAI/bge-small-en-v1.5\")\n", - "documents = SimpleDirectoryReader(DATA_PATH).load_data()\n", - "index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)\n", - "retriever = index.as_retriever(similarity_top_k=3)\n", - "print(\"Retriever ready.\")" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "419abb79", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "--- Hybrid Studio (Robust Version) ---\n", - "Type 'exit' to quit.\n", - "\n", - "--- Cycle 1 ---\n", - "1. Cloud API (Critic) is evaluating...\n", - "\n", - "[Editor's Feedback]:\n", - "[REVISE]\n", - "\n", - "- The draft does not accurately describe the computer alarms that appeared during the landing. The draft should include the specific computer alarms (1201 and 1202) and the crew's reaction to them.\n", - "- The draft does not mention the specific actions taken by the crew in response to the computer alarms, such as Armstrong taking semi-automatic control.\n", - "- The draft does not include the details about Mission Control's role in addressing the alarms and the reassurance given to the crew.\n", - "- The draft does not mention the post-mission analysis showing the real fuel remaining was probably closer to 50 seconds, which is critical information.\n", - "\n", - "Feedback:\n", - "- The draft should include the specific computer alarms (1201 and 1202) and the crew's reaction to them.\n", - "- Include the details about Mission Control's role in addressing the alarms and the reassurance given to the crew.\n", - "- Mention the post-mission analysis showing the real fuel remaining was probably closer to 50 seconds.\n", - "\n", - "2. Local GPU (Refiner) is rewriting...\n", - "\n", - "[Refined Draft]:\n", - "Draft Answer: As the descent began, Armstrong and Aldrin found themselves\n", - "passing landmarks on the surface two or three seconds early. The problem could\n", - "have been mascons, concentrations of high mass in a region or regions of the\n", - "Moon's crust that contains a gravitational anomaly, potentially altering Eagle's\n", - "trajectory. Five minutes into the descent, the burn and 6,000 feet (1,800 m)\n", - "above the surface of the moon, the LM guidance computer (LGC) distracted the\n", - "crew with the first of several unexpected 1201 and 1202 program alarms. Inside\n", - "Mission Control Center, computer engineer Jack Garman told Guidance Officer\n", - "Steve Bales it was safe to continue the descent. The program alarm indicated\n", - "\"executive overflows,\" meaning the guidance computer could not complete all its\n", - "tasks in real-time and had to postpone some of them. Margaret Hamilton, the\n", - "Director of Apollo Flight Computer Programming at the MIT Charles Stark Draper\n", - "Laboratory later recalled: \"To blame the computer for the Apollo 11 problems is\n", - "like blaming the person who spots a fire and calls the fire department.\n", - "Actually, the computer was programmed to do more than recognize error\n", - "conditions. A complete set of recovery programs was incorporated into the\n", - "software. The software's action, in this case, was to eliminate lower priority\n", - "tasks and re-establish the more important ones. The computer, rather than almost\n", - "forcing an abort, prevented an abourt. 
If the computer hadn't recognized this\n", - "problem and took recovery action, I doubt if Apollo 11 would have been the\n", - "successful Moon landing it was.\n", - "\n", - "--- Cycle 2 ---\n", - "1. Cloud API (Critic) is evaluating...\n", - "\n", - "[Editor's Feedback]:\n", - "[REVISE]\n", - "\n", - "- The draft does not mention the specific computer alarms (1201 and 1202) that appeared during the landing.\n", - "- The draft does not reference the exact timing of the alarms (five minutes into the descent, 6,000 feet above the surface).\n", - "- The draft does not include the specific actions taken by Mission Control to address the alarms.\n", - "- The draft does not mention the critical role of the guidance computer in safely continuing the descent despite the alarms.\n", - "- The draft does not include the perspective of the astronauts experiencing the alarms.\n", - "- The draft does not provide the exact quote from Jack Garman to Steve Bales.\n", - "- The draft does not mention Margaret Hamilton's role or her specific recollection of the event.\n", - "\n", - "2. Local GPU (Refiner) is rewriting...\n", - "\n", - "[Refined Draft]:\n", - "Draft Answer: The LM guidance computer (LGC) distracted the crew with the first\n", - "of several unexpected 1201 and 1202 program alarms during the Apollo 11 mission.\n", - "The program alarm indicated \"executive overflows,\" meaning the guidance computer\n", - "could not complete all its tasks in real-time and had to postpone some of them.\n", - "The recovery programs were incorporated into the software, eliminating lower\n", - "priority tasks and re-establishing the more important ones. Margaret Hamilton,\n", - "the Director of Apollo Flight Computer Programming at the MIT Charles Stark\n", - "Draper Laboratory later recalled that to blame the computer for the Apollo 11,\n", - "problems is like blaming the person who spots a fire and calls the fire\n", - "department. Actually, the computer was programmed to do more than recognize\n", - "error conditions. The software's action, in this case, was to eliminate lower\n", - "priority jobs and regain the more significant ones. If the computer hadn't\n", - "recognized this problem and took recovery action, I doubt if Apollo 11 would\n", - "have been the successful Moon landing it was.\n", - "\n", - "==================================================\n", - "FINAL RESULT:\n", - "Draft Answer: The LM guidance computer (LGC) distracted the crew with the first\n", - "of several unexpected 1201 and 1202 program alarms during the Apollo 11 mission.\n", - "The program alarm indicated \"executive overflows,\" meaning the guidance computer\n", - "could not complete all its tasks in real-time and had to postpone some of them.\n", - "The recovery programs were incorporated into the software, eliminating lower\n", - "priority tasks and re-establishing the more important ones. Margaret Hamilton,\n", - "the Director of Apollo Flight Computer Programming at the MIT Charles Stark\n", - "Draper Laboratory later recalled that to blame the computer for the Apollo 11,\n", - "problems is like blaming the person who spots a fire and calls the fire\n", - "department. Actually, the computer was programmed to do more than recognize\n", - "error conditions. The software's action, in this case, was to eliminate lower\n", - "priority jobs and regain the more significant ones. 
If the computer hadn't\n", - "recognized this problem and took recovery action, I doubt if Apollo 11 would\n", - "have been the successful Moon landing it was.\n", - "==================================================\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "C:\\Users\\YNA\\AppData\\Roaming\\Python\\Python312\\site-packages\\codecarbon\\output_methods\\file.py:94: FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation.\n", - " df = pd.concat([df, new_df], ignore_index=True)\n" - ] - } - ], - "source": [ - "# --- Helper to call API (Increased Tokens) ---\n", - "def call_critic_api(system_prompt, user_prompt):\n", - " messages = [\n", - " {\"role\": \"system\", \"content\": system_prompt},\n", - " {\"role\": \"user\", \"content\": user_prompt},\n", - " ]\n", - " try:\n", - " response = client.chat_completion(\n", - " messages=messages,\n", - " model=CRITIC_MODEL_ID,\n", - " max_tokens=2048, # INCREASED from 512 to prevent cutting\n", - " temperature=0.1,\n", - " )\n", - " return response.choices[0].message.content.strip()\n", - " except Exception as e:\n", - " return f\"API Error: {e}\"\n", - "\n", - "\n", - "# --- Start Hybrid Loop ---\n", - "REFINEMENT_CYCLES = 2\n", - "tracker = OfflineEmissionsTracker(country_iso_code=YOUR_COUNTRY_ISO_CODE)\n", - "tracker.start()\n", - "\n", - "print(\"\\n--- Hybrid Studio (Robust Version) ---\")\n", - "print(\"Type 'exit' to quit.\")\n", - "\n", - "try:\n", - " while True:\n", - " # 1. Inputs\n", - " query = input(\"\\n(1/3) Enter User Query: \")\n", - " if query.lower() in [\"exit\", \"quit\"]:\n", - " break\n", - "\n", - " # Retrieve Context locally\n", - " retrieved_nodes = retriever.retrieve(query)\n", - " context_str = \"\\n---\\n\".join([node.get_content() for node in retrieved_nodes])\n", - "\n", - " draft_text = input(\"(2/3) Paste Draft Text: \")\n", - "\n", - " tracker.start_task(\"Hybrid Refinement\")\n", - " current_draft = draft_text\n", - "\n", - " for i in range(REFINEMENT_CYCLES):\n", - " print(f\"\\n--- Cycle {i + 1} ---\")\n", - "\n", - " # --- A. CRITIC STEP (API) ---\n", - " print(\"1. Cloud API (Critic) is evaluating...\")\n", - " critic_input = CRITIC_USER_TEMPLATE.format(\n", - " context=context_str, query=query, draft=current_draft\n", - " )\n", - " critique = call_critic_api(CRITIC_SYSTEM_PROMPT, critic_input)\n", - "\n", - " print(f\"\\n[Editor's Feedback]:\\n{critique}\\n\")\n", - "\n", - " # --- NEW ROBUST CHECK ---\n", - " # Only stop if it explicitly starts with [OK]\n", - " if critique.startswith(\"[OK]\"):\n", - " print(\">> Critic is satisfied. Stopping early.\")\n", - " break\n", - " elif \"[OK]\" in critique[:20]: # Fallback if it has a small prefix\n", - " print(\">> Critic is satisfied. Stopping early.\")\n", - " break\n", - "\n", - " # --- B. REFINER STEP (Local) ---\n", - " print(\"2. 
Local GPU (Refiner) is rewriting...\")\n", - " refiner_input = REFINER_PROMPT_TEMPLATE.format(\n", - " draft=current_draft, feedback=critique\n", - " )\n", - "\n", - " # Local generation with sufficient length\n", - " refined_response = llm_local.complete(refiner_input)\n", - " current_draft = refined_response.text\n", - "\n", - " print(\"\\n[Refined Draft]:\")\n", - " print(textwrap.fill(current_draft, width=80))\n", - "\n", - " tracker.stop_task()\n", - " print(\"\\n\" + \"=\" * 50)\n", - " print(\"FINAL RESULT:\")\n", - " print(textwrap.fill(current_draft, width=80))\n", - " print(\"=\" * 50)\n", - "\n", - "finally:\n", - " tracker.stop()" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.10" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -}