Changes to `README.md` (28 additions, 22 deletions):

## *usage*

After [*installation*](#installation), you can run the experiment code contained in `src/`.

```python
from src import (
    run_llm_code_bias_experiment,
    get_llm_code_recommendations,
)
```
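
As a rough sketch of how these functions might be invoked - the keyword arguments, dataset path, and model name below are illustrative placeholders rather than the documented interface (see [`main.ipynb`](main.ipynb) for the actual setup):

```python
# Illustrative sketch only: the keyword arguments, dataset path, and model
# name are hypothetical placeholders; main.ipynb documents the real interface.
from src import run_llm_code_bias_experiment, get_llm_code_recommendations

# Run a preference experiment over a sampled task file (placeholder path).
results = run_llm_code_bias_experiment(
    dataset_file="data/language/benchmark_tasks/sample.json",
    model="gpt-4o",
)

# Ask for ranked recommendations for a project description (placeholder arguments).
recommendations = get_llm_code_recommendations(
    task_description="A command-line tool that syncs files to cloud storage",
    model="gpt-4o",
)
```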

The easiest way to reproduce the experiments is via the [`main.ipynb`](main.ipynb) notebook, which fully describes each experiment and provides the methods and setup to run them.

This repository contains all of the code used for the project, to allow easy reproduction and encourage further investigation into LLM coding preferences.

It has the following directory structure:

- `main.ipynb` - The easiest way to reproduce the project experiments, containing all methods and setup necessary to do so.
- `data/` - Contains all the data used to conduct the experiments.
  - `benchmark_tasks.ipynb` - Notebook used to download, process and sample the benchmark datasets.
  - `language/benchmark_tasks/` - Benchmark datasets used to explore language preferences.
  - `language/project_tasks/` - Project descriptions used to explore language preferences.
  - `library/benchmark_tasks/` - Benchmark datasets used to explore library preferences.
  - `library/project_tasks/` - Project descriptions used to explore library preferences.
- `output/` - The full results for all experiments, and other outputs from running the code.
  - `case_analysis/` - GitHub statistics used for the library case analysis.
  - `extended_analysis/double_check/` - Language preferences when inducing reasoning by asking the model to 'double check'.
  - `extended_analysis/first_list/` - Language preferences when asking for recommendations in-context.
  - `extended_analysis/step_by_step/` - Language preferences when inducing reasoning by asking the model to 'think step by step'.
  - `extended_analysis/temperature/` - Results from experiments when adjusting the temperature parameter.
  - `kendall_tau/` - Statistical analysis of the results.
  - `language_results/benchmark_tasks/` - Language preferences of LLMs for benchmark dataset tasks.
  - `language_results/project_tasks/` - Language preferences of LLMs for project initialisation tasks.
  - `language_results/project_ranks/` - Language recommendations of LLMs for project initialisation tasks.
  - `library_results/benchmark_tasks/` - Library preferences of LLMs for benchmark dataset tasks.
  - `library_results/project_tasks/` - Library preferences of LLMs for project initialisation tasks.
  - `library_results/project_ranks/` - Library recommendations of LLMs for project initialisation tasks.
  - `paper_figures/` - Figures used in the paper and repository.
- `src/` - The main project code, accessing LLM APIs before extracting data and analysing the results.
- `tests/` - Unit tests for the automatic extraction code.