During the webinar “Building a RAG Application: From Cool Demo to Production-Ready Solution,” you learned that developing RAG applications effectively requires the use of evaluation frameworks. These frameworks help you identify issues and measure the impact of optimization techniques that you apply to address those issues.
For your homework assignment, you will build a simple question-answering system using the RAG technique and evaluate its performance using RAGAS, a RAG evaluation framework.
Once you have completed the assignment, please share the link to your repository with your code and the evaluation results committed.
Please carefully follow the instructions below for setup and guidance on how to properly complete the assignment and report your results.
To complete the homework assignment you will need to:
- Have Python 3.10+ installed
- Have an OpenAI API key
- Fork this repository from GitHub
- Install the dependencies by running `pip install -r requirements.txt`
- Rename `.env-example` to `.env`
- Set your `OPENAI_API_KEY` in the `.env` file, e.g. `OPENAI_API_KEY=sk-...` (see the sketch below if you need to load it in your own code)
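The repository's own code presumably picks this key up from `.env` for you; if you also need it in a script of your own, here is a minimal sketch of loading it, assuming the `python-dotenv` package is available (a common choice, but check `requirements.txt`):

```python
import os

from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads variables from the .env file in the current directory
openai_api_key = os.environ["OPENAI_API_KEY"]
```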
A question-answering system is one of the most common types of LLM (Large Language Model) applications implemented using the RAG (Retrieval-Augmented Generation) technique.
In this assignment, you will implement a RAG pipeline that answers questions based on the transcript of Andrej Karpathy's famous talk, "[1hr Talk] Intro to Large Language Models."
The transcript is included in this repository at docs/intro-to-llms-karpathy.txt.
To implement a Question Answering system using a Naive RAG approach, you will need to perform the following steps:
- Data Ingestion:
- Split the document into chunks.
- Embed the chunks.
- Store the embeddings in a vector database.
- RAG:
- Embed the question.
- Retrieve the relevant context from the vector database.
- Prompt the LLM with the question and retrieved context to generate an answer.
While the above might sound complex, it can actually be implemented with just a few lines of LangChain code:
```python
# Imports assume the classic `langchain` package (pre-0.1 import paths).
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.indexes import VectorstoreIndexCreator

# Data ingestion: load the transcript, split it into chunks, embed them, and store them.
embeddings = OpenAIEmbeddings()
loader = TextLoader("./docs/intro-to-llms-karpathy.txt")
index = VectorstoreIndexCreator(embedding=embeddings).from_loaders([loader])

# RAG: retrieve the relevant chunks for a question and have the LLM answer from them.
llm = ChatOpenAI(temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=index.vectorstore.as_retriever(),
    return_source_documents=True,
)

question = "What is retrieval augmented generation and how does it enhance the capabilities of large language models?"
result = qa_chain({"query": question})
```
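Here, `VectorstoreIndexCreator` bundles the ingestion steps (splitting, embedding, storing) and `RetrievalQA` bundles retrieval and generation. If you want the individual steps from the list above spelled out, a rough equivalent might look like the sketch below; it assumes the same classic LangChain setup with Chroma (the `chromadb` package) as the vector store, and the prompt wording is only illustrative:

```python
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Data ingestion: load the transcript, split it into chunks, embed the chunks, store the embeddings.
documents = TextLoader("./docs/intro-to-llms-karpathy.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(documents)
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

# RAG: embed the question, retrieve the most similar chunks, and prompt the LLM with them.
question = "What is retrieval augmented generation?"
retrieved_docs = vectorstore.as_retriever().get_relevant_documents(question)
context = "\n\n".join(doc.page_content for doc in retrieved_docs)
prompt = f"Answer the question using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
answer = ChatOpenAI(temperature=0).predict(prompt)
```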
You can use any programming language or framework to implement the RAG pipeline.
There are numerous resources available online to help you implement a RAG system using your preferred stack.
Below is a short list of resources for some of the most popular languages and frameworks.
- Python
- TypeScript
- Java
- .NET
To ensure fair grading of all participants' submissions, we have prepared a list of 50 questions that your RAG pipeline must answer.
You can find these 50 questions in the repository at questions.json.
Please write a short script that retrieves contexts from your vector database for each of the questions and generates answers. The questions, retrieved contexts, and answers should be saved in JSON format, as shown in `output-example.json`.
This is necessary so that we can use the outputs of your system to automatically evaluate it using RAGAS metrics.
The JSON file containing your outputs must be pushed to this repository along with your code so that we can review it.
Please note that the questions cannot be altered, as this ensures that the results of all participants can be fairly compared.
Here's an example of a simple Python script that generates the answers to the provided questions and saves the results in the required format:
```python
import json

# Assumes questions.json contains a plain JSON list of question strings;
# adjust the loading code if the file is structured differently.
with open("./questions.json", encoding="utf-8") as f:
    test_questions = json.load(f)

json_results = []
for question in test_questions:
    response = qa_chain.invoke({"query": question})
    json_results.append({
        "question": question,
        "answer": response["result"],
        "contexts": [context.page_content for context in response["source_documents"]],
    })

with open("./my_rag_output.json", mode="w", encoding="utf-8") as f:
    f.write(json.dumps(json_results, indent=4))
```
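Based on this script, each entry in the output file has the following shape (the values here are placeholders; `output-example.json` in the repository is the authoritative reference):

```json
[
    {
        "question": "What is retrieval augmented generation and how does it enhance the capabilities of large language models?",
        "answer": "<the answer generated by your pipeline>",
        "contexts": [
            "<first retrieved chunk>",
            "<second retrieved chunk>"
        ]
    }
]
```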
This repository is fully set up to evaluate the outputs of a Question Answering RAG system based on the transcript of Andrej Karpathy's "[1hr Talk] Intro to Large Language Models."
So, when your pipeline is ready and you have generated the answers to our question list and saved them in the required format, you can simply use the following command to run the evaluation:
```
python eval.py path-to-your-output.json
```
The evaluation is based on the RAGAS framework and will output scores for each of the metrics.
The overall `ragas_score` is simply the mean value of all the other metrics.
```json
{
    "results": {
        "faithfulness": 0.9149,
        "answer_relevancy": 0.8295,
        "context_recall": 0.8867,
        "context_precision": 0.8478,
        "answer_correctness": 0.7237
    },
    "ragas_score": 0.8404951600375368
}
```
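For example, (0.9149 + 0.8295 + 0.8867 + 0.8478 + 0.7237) / 5 ≈ 0.8405, which matches the `ragas_score` above up to the rounding of the displayed metric values.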
After running the evaluation, the results are automatically saved in the `./results` directory:
- `results/ragas-eval-run-details.csv`
- `results/ragas-eval-scores.json`
You can use RAG optimization techniques to improve your RAGAS metric scores.
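For example, you could experiment with the chunking and retrieval settings. Below is a minimal sketch of two such knobs, assuming the classic LangChain setup used above (the values shown are arbitrary starting points, not recommendations). Re-run the evaluation after each change to see how the metrics respond.

```python
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.indexes import VectorstoreIndexCreator
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = TextLoader("./docs/intro-to-llms-karpathy.txt")

# Knob 1: the chunk size and overlap used when splitting the transcript.
index = VectorstoreIndexCreator(
    embedding=OpenAIEmbeddings(),
    text_splitter=RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50),
).from_loaders([loader])

# Knob 2: how many chunks are retrieved as context for each question.
retriever = index.vectorstore.as_retriever(search_kwargs={"k": 4})
```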
When you're satisfied with the results, please ensure that you commit and push the following to the repository:
- The code implementing your RAG pipeline
- The JSON file containing the outputs for our question list
- The files in the `results/` directory containing the evaluation results and details
The transcript and evaluation dataset in this project are based on content from the YouTube video "[1hr Talk] Intro to Large Language Models" by Andrej Karpathy, licensed under Creative Commons Attribution (CC-BY). Changes have been made to format the transcript.