RAG Intelligence is a Retrieval-Augmented Generation (RAG) system that integrates document retrieval, large language models (LLMs), and conversational AI into a single workflow. This README gives an overview of the key scripts in the repository and instructions for running them.

Prerequisites:
- Python 3.7 or higher
- pip
- virtualenv (optional but recommended)
Installation:

Clone the repository:

git clone https://github.com/RSKMN/rag-intelligence.git
cd rag-intelligence

Create and activate a virtual environment (optional but recommended):

python -m venv env
source env/bin/activate  # On Windows, use 'env\Scripts\activate'

Install the required dependencies:

pip install -r requirements.txt
Description: `app.py` sets up a Flask API that handles several endpoints, including processing user queries, handling PDF uploads, and interacting with the LLM for responses.

Usage:

- `/ai`: Accepts POST requests with a JSON payload containing a `query` field.
- `/ask_pdf`: Accepts POST requests with a JSON payload containing a `query` field, retrieves relevant information from the PDF data, and returns an answer.
- `/pdf`: Accepts POST requests with a file upload, processes the PDF, and updates the vector store.
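For illustration, here is how a client might call these endpoints with the `requests` library. The base URL, port, and the `file` form-field name are assumptions, not values confirmed by `app.py`; adjust them to your configuration.

```python
# Hypothetical client calls against the Flask API from app.py.
# BASE is an assumed address; check app.py for the actual host and port.
import requests

BASE = "http://localhost:8080"

# Ask the LLM directly.
resp = requests.post(f"{BASE}/ai", json={"query": "What is retrieval-augmented generation?"})
print(resp.json())

# Upload a PDF so its contents are added to the vector store
# ("file" is an assumed form-field name).
with open("paper.pdf", "rb") as f:
    resp = requests.post(f"{BASE}/pdf", files={"file": f})
print(resp.json())

# Ask a question grounded in the uploaded PDF data.
resp = requests.post(f"{BASE}/ask_pdf", json={"query": "Summarize the main findings."})
print(resp.json())
```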
Description: `server.py` initializes and runs a FastAPI server, providing endpoints for health checks and prompt processing. It dynamically loads and initializes example classes that implement methods such as `ingest_docs`, `rag_chain`, and `llm_chain`.

Usage:

- `/health`: GET endpoint for health checks.
- `/prompt`: POST endpoint that accepts a JSON payload with a list of messages constituting the conversation so far.
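The sketch below shows one way a client could exercise these endpoints. The port (uvicorn's default 8000) and the shape of the message payload (a `messages` list of role/content pairs) are assumptions; check `server.py` for the exact schema.

```python
# Hypothetical client for the FastAPI server from server.py.
# The address and payload schema below are assumptions.
import requests

BASE = "http://localhost:8000"

# Health check.
print(requests.get(f"{BASE}/health").json())

# Send the conversation so far as a list of messages.
payload = {
    "messages": [
        {"role": "user", "content": "Which documents have been ingested?"},
    ]
}
print(requests.post(f"{BASE}/prompt", json=payload).json())
```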
Description: `llm_convo.py` facilitates conversations with the LLM, managing the context and flow of dialogue to maintain coherence and relevance.

Usage: Handles user input, maintains the conversation history, and interacts with the LLM to generate responses.
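The core pattern (append each turn to a running history and send the whole history to the model) looks roughly like the sketch below. This is not the repository's implementation; `call_llm` is a hypothetical stand-in for the project's actual LLM client.

```python
# Minimal sketch of a history-carrying conversation loop.
def call_llm(history: list[dict]) -> str:
    """Hypothetical stand-in; replace with the project's real LLM client."""
    return f"(model reply to: {history[-1]['content']!r})"

def converse() -> None:
    history: list[dict] = []
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in {"exit", "quit"}:
            break
        history.append({"role": "user", "content": user_input})
        reply = call_llm(history)  # the model sees the full history each turn
        history.append({"role": "assistant", "content": reply})
        print(f"LLM: {reply}")

if __name__ == "__main__":
    converse()
```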
Description: `script.py` fine-tunes the LLaMA model on question-answer pairs, improving the model's performance on specific tasks or datasets.

Usage: Loads the training data, fine-tunes the LLaMA model, and saves the updated model for deployment.
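As a rough illustration of what such a fine-tuning step involves, the sketch below uses Hugging Face `transformers` with a PEFT/LoRA adapter. It is not the repository's script: the checkpoint name, the `qa_pairs.jsonl` data file, its question/answer schema, and all hyperparameters are placeholders.

```python
# Illustrative LoRA fine-tuning on question-answer pairs (placeholder values).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = get_peft_model(AutoModelForCausalLM.from_pretrained(MODEL),
                       LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# Assumed JSONL schema: one {"question": ..., "answer": ...} object per line.
data = load_dataset("json", data_files="qa_pairs.jsonl", split="train")

def tokenize(example):
    text = f"Question: {example['question']}\nAnswer: {example['answer']}"
    return tokenizer(text, truncation=True, max_length=512)

data = data.map(tokenize, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-qa-finetuned",
                           per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("llama-qa-finetuned")  # saves the adapter weights
```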
Description: `pipeline.py` implements the full RAG pipeline, integrating document retrieval and LLM response generation to answer user queries effectively.

Usage: Combines retrievers and LLMs to process user queries, retrieve relevant documents, and generate informed responses.
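Conceptually, retrieve-then-generate reduces to the pattern below. This is a dependency-light sketch, not the project's code; the `embed` and `llm` callables are hypothetical stand-ins for whatever vector store and model the pipeline actually uses.

```python
# Sketch of the retrieve-then-generate core of a RAG pipeline.
from collections.abc import Callable

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def rag_answer(query: str,
               docs: list[str],
               embed: Callable[[str], list[float]],
               llm: Callable[[str], str],
               k: int = 3) -> str:
    # 1. Retrieve: rank documents by similarity to the query embedding.
    q_vec = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(embed(d), q_vec), reverse=True)
    context = "\n\n".join(ranked[:k])
    # 2. Generate: ground the model's answer in the retrieved context.
    prompt = f"Answer using only this context:\n\n{context}\n\nQuestion: {query}"
    return llm(prompt)
```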
To run the Flask API:
python app.py
The server will start, and you can interact with the endpoints as described above.
To run the FastAPI server:
python server.py
The server will start, providing health checks and prompt processing endpoints.
To engage in a conversation with the LLM:
python llm_convo.py
Follow the on-screen prompts to input your queries and receive responses from the LLM.
To fine-tune the LLaMA model:
python script.py
Ensure you have the necessary training data and configurations set up before running this script.
To execute the RAG pipeline:
python pipeline.py
This will process user queries through the retrieval and generation components to produce responses.
Contributions are welcome! Please fork the repository and submit a pull request with your changes. Ensure that your code adheres to the project's coding standards and includes appropriate tests.
This project is licensed under the MIT License. See the LICENSE.md file for more details.