Commit e35a977

Merge pull request #564 from aurelio-labs/james/hybrid-router-example
feat: improved hybrid router doc
2 parents 3f54712 + 68549cd commit e35a977

1 file changed: +26 -7 lines changed

docs/examples/hybrid-router.ipynb

Lines changed: 26 additions & 7 deletions
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Semantic Router: Hybrid Layer\n"
+"# Hybrid Router\n"
 ]
 },
 {
@@ -107,7 +107,15 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Now we initialize our embedding model:\n"
+"Now we initialize our embedding models, we use a dense encoder from [OpenAI](https://platform.openai.com/) and a sparse encoder from [Aurelio](https://platform.aurelio.ai/). The `AurelioSparseEncoder` we use here provides a remote sparse encoder that can significantly improve routing accuracy when combined with dense embeddings.\n",
+"\n",
+"Semantic Router supports other _local_ sparse encoders like `TfidfEncoder` or `BM25Encoder`. Compared to these, the `AurelioSparseEncoder`:\n",
+"\n",
+"1. Doesn't require local fitting (training) on your dataset\n",
+"2. Handles out-of-vocabulary words better\n",
+"3. Works better with asymmetric retrieval (different encoding for queries vs. documents)\n",
+"\n",
+"We initialize both like so:"
 ]
 },
 {
@@ -117,16 +125,24 @@
 "outputs": [],
 "source": [
 "import os\n",
-"from semantic_router.encoders import OpenAIEncoder, TfidfEncoder\n",
+"from semantic_router.encoders import OpenAIEncoder, AurelioSparseEncoder\n",
 "from getpass import getpass\n",
 "\n",
+"# get OpenAI API key from https://platform.openai.com/\n",
 "os.environ[\"OPENAI_API_KEY\"] = os.getenv(\"OPENAI_API_KEY\") or getpass(\n",
 " \"Enter OpenAI API Key: \"\n",
 ")\n",
 "\n",
-"dense_encoder = OpenAIEncoder()\n",
-"# sparse_encoder = BM25Encoder()\n",
-"sparse_encoder = TfidfEncoder()"
+"dense_encoder = OpenAIEncoder(name=\"text-embedding-3-small\", score_threshold=0.3)\n",
+"\n",
+"# get Aurelio API key from https://platform.aurelio.ai\n",
+"# use \"SRHYBRIDROUTER\" for free credits\n",
+"os.environ[\"AURELIO_API_KEY\"] = os.getenv(\"AURELIO_API_KEY\") or getpass(\n",
+" \"Enter Aurelio API Key: \"\n",
+")\n",
+"\n",
+"# Using Aurelio's BM25 sparse encoder\n",
+"sparse_encoder = AurelioSparseEncoder(name=\"bm25\")"
 ]
 },
 {
@@ -155,7 +171,10 @@
 "from semantic_router.routers import HybridRouter\n",
 "\n",
 "router = HybridRouter(\n",
-" encoder=dense_encoder, sparse_encoder=sparse_encoder, routes=routes\n",
+" encoder=dense_encoder, \n",
+" sparse_encoder=sparse_encoder, \n",
+" routes=routes,\n",
+" alpha=0.5 # Balance between dense (0) and sparse (1) embeddings\n",
 ")"
 ]
 },
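
The `alpha` argument added in the last hunk controls how the dense and sparse signals are weighted against each other. As a rough illustration only, the sketch below shows a convex combination of two similarity scores. The function `blend_scores` is hypothetical and is not part of `semantic_router`. It also assumes that `alpha` weights the dense score and `1 - alpha` weights the sparse score. The inline comment in the diff describes the opposite orientation, so confirm the direction against the library source before tuning.

```python
def blend_scores(dense_score: float, sparse_score: float, alpha: float = 0.5) -> float:
    """Convex combination of a dense and a sparse similarity score.

    Hypothetical helper for illustration; not the library's implementation.
    Assumption: alpha weights the dense score, (1 - alpha) the sparse score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in the range [0, 1]")
    return alpha * dense_score + (1.0 - alpha) * sparse_score


# With equal weighting, both signals contribute the same amount.
balanced = blend_scores(dense_score=0.8, sparse_score=0.2, alpha=0.5)

# At the extremes, only one of the two signals survives.
dense_only = blend_scores(dense_score=0.8, sparse_score=0.2, alpha=1.0)
sparse_only = blend_scores(dense_score=0.8, sparse_score=0.2, alpha=0.0)
```

Under this assumption, `alpha=0.5` gives both encoders equal influence, which matches the "balance" intent of the value chosen in the commit.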
