Changes to `README.md` (28 additions, 22 deletions):

## *usage*

After [*installation*](#installation), you can run the experiment code contained in `src/`.

```python
from src import (
    run_llm_code_bias_experiment,
    get_llm_code_recommendations,
)
```
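
As a rough sketch of how these functions might be invoked - the keyword arguments, dataset path, and model name below are illustrative placeholders rather than the documented interface (see [`main.ipynb`](main.ipynb) for the actual setup):

```python
# Illustrative sketch only: the keyword arguments, dataset path, and model
# name are hypothetical placeholders; main.ipynb documents the real interface.
from src import run_llm_code_bias_experiment, get_llm_code_recommendations

# Run a preference experiment over a sampled task file (placeholder path).
results = run_llm_code_bias_experiment(
    dataset_file="data/language/benchmark_tasks/sample.json",
    model="gpt-4o",
)

# Ask for ranked recommendations for a project description (placeholder arguments).
recommendations = get_llm_code_recommendations(
    task_description="A command-line tool that syncs files to cloud storage",
    model="gpt-4o",
)
```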

The easiest way to reproduce the experiments is via the [`main.ipynb`](main.ipynb) notebook, which fully describes each experiment and provides the methods and setup to run them.

This repository contains all of the code used for the project, to allow easy reproduction and encourage further investigation into LLM coding preferences.

It has the following directory structure:

- `main.ipynb` - The easiest way to reproduce the project experiments, containing all methods and setup necessary to do so.
- `data/` - Contains all the data used to conduct the experiments.
  - `benchmark_tasks.ipynb` - Notebook used to download, process and sample the benchmark datasets.
  - `language/benchmark_tasks/` - Benchmark datasets used to explore language preferences.
  - `language/project_tasks/` - Project descriptions used to explore language preferences.
  - `library/benchmark_tasks/` - Benchmark datasets used to explore library preferences.
  - `library/project_tasks/` - Project descriptions used to explore library preferences.
- `output/` - The full results for all experiments, and other outputs from running the code.
  - `case_analysis/` - GitHub statistics used for the library case analysis.
  - `extended_analysis/double_check/` - Language preferences when inducing reasoning by asking the model to 'double check'.
  - `extended_analysis/first_list/` - Language preferences when asking for recommendations in-context.
  - `extended_analysis/step_by_step/` - Language preferences when inducing reasoning by asking the model to 'think step by step'.
  - `extended_analysis/temperature/` - Results from experiments when adjusting the temperature parameter.
  - `kendall_tau/` - Statistical analysis of the results.
  - `language_results/benchmark_tasks/` - Language preferences of LLMs for benchmark dataset tasks.
  - `language_results/project_tasks/` - Language preferences of LLMs for project initialisation tasks.
  - `language_results/project_ranks/` - Language recommendations of LLMs for project initialisation tasks.
  - `library_results/benchmark_tasks/` - Library preferences of LLMs for benchmark dataset tasks.
  - `library_results/project_tasks/` - Library preferences of LLMs for project initialisation tasks.
  - `library_results/project_ranks/` - Library recommendations of LLMs for project initialisation tasks.
  - `paper_figures/` - Figures used in the paper and repository.
- `src/` - The main project code, accessing LLM APIs before extracting data and analysing the results.
- `tests/` - Unit tests for the automatic extraction code.