Commit 4b29c1b

Milvus-doc-bot authored and committed

Release new docs to master

1 parent be821d9 commit 4b29c1b

File tree

9 files changed (+356, -12 lines)

v2.6.x/assets/standalone-monitoring/docker-compose.yml

Lines changed: 1 addition & 1 deletion

@@ -20,7 +20,7 @@ services:
   minio:
     container_name: milvus-minio
-    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
+    image: minio/minio:RELEASE.2024-12-18T13-15-44Z
     environment:
       MINIO_ACCESS_KEY: minioadmin
       MINIO_SECRET_KEY: minioadmin

v2.6.x/site/en/adminGuide/deploy_pulsar.md

Lines changed: 55 additions & 0 deletions
@@ -65,6 +65,61 @@ extraConfigFiles:
helm install <your_release_name> milvus/milvus -f values.yaml
```

## Configure Woodpecker with Helm

For Milvus clusters on K8s, you can configure Woodpecker in the same command that starts Milvus. Alternatively, you can configure Woodpecker using the <code>values.yaml</code> file on the /charts/milvus path in the [milvus-helm](https://github.com/milvus-io/milvus-helm) repository before you start Milvus.

For details on how to configure Milvus using Helm, refer to [Configure Milvus with Helm Charts](configure-helm.md). For details on Woodpecker-related configuration items, refer to [woodpecker-related configurations](use-woodpecker.md).

### Using the YAML file

1. Configure the <code>extraConfigFiles</code> section in the <code>values.yaml</code> file.

   ```yaml
   extraConfigFiles:
     user.yaml: |+
       woodpecker:
         meta:
           type: etcd # The type of the metadata provider. Currently, only etcd is supported.
           prefix: woodpecker # The prefix of the metadata provider. Default is woodpecker.
         client:
           segmentAppend:
             queueSize: 10000 # The size of the queue for pending messages to be sent of each log.
             maxRetries: 3 # Maximum number of retries for segment append operations.
           segmentRollingPolicy:
             maxSize: 256M # Maximum size of a segment.
             maxInterval: 10m # Maximum interval between two segments. Default is 10 minutes.
             maxBlocks: 1000 # Maximum number of blocks in a segment.
           auditor:
             maxInterval: 10s # Maximum interval between two auditing operations. Default is 10 seconds.
         logstore:
           segmentSyncPolicy:
             maxInterval: 200ms # Maximum interval between two sync operations. Default is 200 milliseconds.
             maxIntervalForLocalStorage: 10ms # Maximum interval between two sync operations for the local storage backend. Default is 10 milliseconds.
             maxBytes: 256M # Maximum size of the write buffer in bytes.
             maxEntries: 10000 # Maximum number of entries in the write buffer.
             maxFlushRetries: 5 # Maximum number of retries for flush operations.
             retryInterval: 1000ms # Maximum interval between two retries. Default is 1000 milliseconds.
             maxFlushSize: 2M # Maximum size of a fragment in bytes to flush.
             maxFlushThreads: 32 # Maximum number of threads to flush data.
           segmentCompactionPolicy:
             maxSize: 2M # The maximum size of the merged files.
             maxParallelUploads: 4 # The maximum number of parallel upload threads for compaction.
             maxParallelReads: 8 # The maximum number of parallel read threads for compaction.
           segmentReadPolicy:
             maxBatchSize: 16M # Maximum size of a batch in bytes.
             maxFetchThreads: 32 # Maximum number of threads to fetch data.
         storage:
           type: minio # The type of the storage provider. Valid values: [minio, local]
           rootPath: /var/lib/milvus/woodpecker # The root path of the storage provider.
   ```

2. After configuring the preceding sections and saving the <code>values.yaml</code> file, run the following command to install Milvus with the Woodpecker configurations.

   ```shell
   helm install <your_release_name> milvus/milvus -f values.yaml
   ```

## Configure Kafka with Helm

For Milvus clusters on K8s, you can configure Kafka in the same command that starts Milvus. Alternatively, you can configure Kafka using the <code>values.yml</code> file on the /charts/milvus path in the [milvus-helm](https://github.com/milvus-io/milvus-helm) repository before you start Milvus.
Lines changed: 283 additions & 0 deletions

@@ -0,0 +1,283 @@
---
id: use-woodpecker.md
title: Use Woodpecker (Milvus v2.6.x)
related_key: Woodpecker
summary: Learn how to enable Woodpecker as the WAL in Milvus.
---

## Use Woodpecker (Milvus v2.6.x)

This guide explains how to enable and use Woodpecker as the Write-Ahead Log (WAL) in Milvus 2.6.x. Woodpecker is a cloud-native WAL designed for object storage, offering high throughput, low operational overhead, and seamless scalability. For architecture and benchmark details, see [Woodpecker](woodpecker_architecture.md).

### Overview

- Starting from Milvus 2.6, Woodpecker is an optional WAL that provides ordered writes and recovery as the logging service.
- As a message queue choice, it behaves similarly to Pulsar/Kafka and can be enabled via configuration.
- Two storage backends are supported: local file system (`local`) and object storage (`minio`/S3-compatible).

### Quick start

To enable Woodpecker, set the MQ type to Woodpecker:

```yaml
mq:
  type: woodpecker
```

Note: Switching `mq.type` for a running cluster is an upgrade operation. Follow the upgrade procedure carefully and validate on a fresh cluster before switching it in production.

### Configuration

Below is the complete Woodpecker configuration block (edit `milvus.yaml` or override it in `user.yaml`):

```yaml
# Related configuration of Woodpecker, which manages Milvus logs of recent mutation operations, outputs streaming logs, and provides embedded sequential log reads and writes.
woodpecker:
  meta:
    type: etcd # The type of the metadata provider. Currently, only etcd is supported.
    prefix: woodpecker # The prefix of the metadata provider. Default is woodpecker.
  client:
    segmentAppend:
      queueSize: 10000 # The size of the queue for pending messages to be sent of each log.
      maxRetries: 3 # Maximum number of retries for segment append operations.
    segmentRollingPolicy:
      maxSize: 256M # Maximum size of a segment.
      maxInterval: 10m # Maximum interval between two segments. Default is 10 minutes.
      maxBlocks: 1000 # Maximum number of blocks in a segment.
    auditor:
      maxInterval: 10s # Maximum interval between two auditing operations. Default is 10 seconds.
  logstore:
    segmentSyncPolicy:
      maxInterval: 200ms # Maximum interval between two sync operations. Default is 200 milliseconds.
      maxIntervalForLocalStorage: 10ms # Maximum interval between two sync operations for the local storage backend. Default is 10 milliseconds.
      maxBytes: 256M # Maximum size of the write buffer in bytes.
      maxEntries: 10000 # Maximum number of entries in the write buffer.
      maxFlushRetries: 5 # Maximum number of retries for flush operations.
      retryInterval: 1000ms # Maximum interval between two retries. Default is 1000 milliseconds.
      maxFlushSize: 2M # Maximum size of a fragment in bytes to flush.
      maxFlushThreads: 32 # Maximum number of threads to flush data.
    segmentCompactionPolicy:
      maxSize: 2M # The maximum size of the merged files.
      maxParallelUploads: 4 # The maximum number of parallel upload threads for compaction.
      maxParallelReads: 8 # The maximum number of parallel read threads for compaction.
    segmentReadPolicy:
      maxBatchSize: 16M # Maximum size of a batch in bytes.
      maxFetchThreads: 32 # Maximum number of threads to fetch data.
  storage:
    type: minio # The type of the storage provider. Valid values: [minio, local]
    rootPath: /var/lib/milvus/woodpecker # The root path of the storage provider.
```

Key notes:

- `woodpecker.meta`
  - **type**: Currently only `etcd` is supported. Reuse the same etcd as Milvus to store lightweight metadata.
  - **prefix**: The key prefix for metadata. Default: `woodpecker`.
- `woodpecker.client`
  - Controls segment append/rolling/auditing behavior on the client side to balance throughput and end-to-end latency.
- `woodpecker.logstore`
  - Controls sync/flush/compaction/read policies for log segments. These are the primary knobs for throughput/latency tuning.
- `woodpecker.storage`
  - **type**: `minio` for MinIO/S3-compatible object storage (MinIO/S3/GCS/OSS, etc.); `local` for local/shared file systems.
  - **rootPath**: Root path for the storage backend (effective for `local`; with `minio`, paths are dictated by bucket/prefix).

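As an illustration of the `logstore` tuning knobs, a hypothetical `user.yaml` override might raise the flush size and thread count for an object-storage backend. The values below are examples for illustration only, not recommendations:

```yaml
woodpecker:
  logstore:
    segmentSyncPolicy:
      maxFlushSize: 8M     # example: larger flushes to avoid tiny objects
      maxFlushThreads: 64  # example: more flush parallelism
```
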
### Deployment modes

Milvus supports both Standalone and Cluster modes. Woodpecker storage backend support matrix:

|                   | `storage.type=local`      | `storage.type=minio` |
| ----------------- | ------------------------- | -------------------- |
| Milvus Standalone | Supported                 | Supported            |
| Milvus Cluster    | Limited (needs shared FS) | Supported            |

Notes:

- With `minio`, Woodpecker shares the same object storage with Milvus (MinIO/S3/GCS/OSS, etc.).
- With `local`, a single-node local disk is only suitable for Standalone. If all pods can access a shared file system (e.g., NFS), Cluster mode can also use `local`.

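For the shared-file-system case mentioned above, a hypothetical override might look like the following. The mount path `/mnt/nfs/woodpecker` is an assumed example and must point to a file system visible to every pod:

```yaml
mq:
  type: woodpecker
woodpecker:
  storage:
    type: local
    rootPath: /mnt/nfs/woodpecker # hypothetical NFS mount shared by all pods
```
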
## Deployment guides

### Enable Woodpecker for a Milvus Cluster on Kubernetes (Milvus Operator, storage=minio)

After installing the [Milvus Operator](install_cluster-milvusoperator.md), start a Milvus cluster with Woodpecker enabled using the official sample:

```bash
kubectl apply -f https://raw.githubusercontent.com/zilliztech/milvus-operator/main/config/samples/milvus_cluster_woodpecker.yaml
```

This sample configures Woodpecker as the message queue and enables the Streaming Node. The first startup may take time to pull images; wait until all pods are ready:

```bash
kubectl get pods
kubectl get milvus my-release -o yaml | grep -A2 status
```

When ready, you should see pods similar to:

```
NAME                                               READY   STATUS    RESTARTS   AGE
my-release-etcd-0                                  1/1     Running   0          17m
my-release-etcd-1                                  1/1     Running   0          17m
my-release-etcd-2                                  1/1     Running   0          17m
my-release-milvus-datanode-7f8f88499d-kc66r        1/1     Running   0          16m
my-release-milvus-mixcoord-7cd7998d-x59kg          1/1     Running   0          16m
my-release-milvus-proxy-5b56cf8446-pbnjm           1/1     Running   0          16m
my-release-milvus-querynode-0-558d9cdd57-sgbfx     1/1     Running   0          16m
my-release-milvus-streamingnode-58fbfdfdd8-vtxfd   1/1     Running   0          16m
my-release-minio-0                                 1/1     Running   0          17m
my-release-minio-1                                 1/1     Running   0          17m
my-release-minio-2                                 1/1     Running   0          17m
my-release-minio-3                                 1/1     Running   0          17m
```

Run the following command to uninstall the Milvus cluster:

```bash
kubectl delete milvus my-release
```

If you need to adjust Woodpecker parameters, follow the settings described in [message storage config](deploy_pulsar.md).

### Enable Woodpecker for a Milvus Cluster on Kubernetes (Helm Chart, storage=minio)

First add and update the Milvus Helm chart as described in [Run Milvus in Kubernetes with Helm](install_cluster-helm.md).

Then deploy with one of the following examples:

- Cluster deployment (recommended settings with Woodpecker and Streaming Node enabled):

```bash
helm install my-release zilliztech/milvus \
  --set image.all.tag=v2.6.0 \
  --set pulsarv3.enabled=false \
  --set woodpecker.enabled=true \
  --set streaming.enabled=true \
  --set indexNode.enabled=false
```

- Standalone deployment (Woodpecker enabled):

```bash
helm install my-release zilliztech/milvus \
  --set image.all.tag=v2.6.0 \
  --set cluster.enabled=false \
  --set pulsarv3.enabled=false \
  --set standalone.messageQueue=woodpecker \
  --set woodpecker.enabled=true \
  --set streaming.enabled=true
```

After deployment, follow the docs to port-forward and connect. To adjust Woodpecker parameters, follow the settings described in [message storage config](deploy_pulsar.md).

### Enable Woodpecker for Milvus Standalone in Docker (storage=local)

Follow [Run Milvus in Docker](install_standalone-docker.md). Example:

```bash
mkdir milvus-wp && cd milvus-wp
curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh

# Create user.yaml to enable Woodpecker with the local file system
cat > user.yaml <<'EOF'
mq:
  type: woodpecker
woodpecker:
  storage:
    type: local
    rootPath: /var/lib/milvus/woodpecker
EOF

bash standalone_embed.sh start
```

To further change Woodpecker settings, update `user.yaml` and run `bash standalone_embed.sh restart`.

### Enable Woodpecker for Milvus Standalone with Docker Compose (storage=minio)

Follow [Run Milvus with Docker Compose](install_standalone-docker-compose.md). Example:

```bash
mkdir milvus-wp-compose && cd milvus-wp-compose
wget https://github.com/milvus-io/milvus/releases/download/v2.6.0/milvus-standalone-docker-compose.yml -O docker-compose.yml
# By default, the Docker Compose standalone uses Woodpecker
sudo docker compose up -d

# If you need to change Woodpecker parameters further, write an override:
docker exec -it milvus-standalone bash -lc 'cat > /milvus/configs/user.yaml <<EOF
mq:
  type: woodpecker
woodpecker:
  logstore:
    segmentSyncPolicy:
      maxFlushThreads: 16
  storage:
    type: minio
EOF'

# Restart the container to apply the changes
docker restart milvus-standalone
```

## Throughput tuning tips

Based on the benchmarks and backend limits in [Woodpecker](woodpecker_architecture.md), optimize end-to-end write throughput from the following aspects:

- Storage side
  - **Object storage (minio/S3-compatible)**: Increase concurrency and object size (avoid tiny objects). Watch network and bucket bandwidth limits. A single MinIO node on SSD often caps around 100 MB/s locally; a single EC2 instance writing to S3 can reach GB/s.
  - **Local/shared file systems (local)**: Prefer NVMe/fast disks. Ensure the file system handles small writes and fsync latency well.
- Woodpecker knobs
  - Increase `logstore.segmentSyncPolicy.maxFlushSize` and `maxFlushThreads` for larger flushes and higher parallelism.
  - Tune `maxInterval` according to media characteristics (trade latency for throughput with longer aggregation).
  - For object storage, consider increasing `segmentRollingPolicy.maxSize` to reduce segment switches.
- Client/application side
  - Use larger batch sizes and more concurrent writers/clients.
  - Control refresh/index build timing (batch up before triggering) to avoid frequent small writes.

Batch insert demo:

```python
import random
import time

from pymilvus import MilvusClient

# 1. Set up a Milvus client
client = MilvusClient(
    uri="http://<Proxy Pod IP>:27017",
)

# 2. Create a collection
res = client.create_collection(
    collection_name="test_milvus_wp",
    dimension=512,
    metric_type="IP",
    shards_num=2,
)
print(res)

# 3. Insert randomly generated vectors
colors = ["green", "blue", "yellow", "red", "black", "white", "purple", "pink", "orange", "brown", "grey"]
data = []

batch_size = 1000
batch_count = 2000
for j in range(batch_count):
    start_time = time.time()
    print(f"Inserting batch {j} (vectors from {j * batch_size}), startTime: {start_time}")
    for i in range(batch_size):
        current_color = random.choice(colors)
        data.append({
            "id": (j * batch_size + i),
            "vector": [random.uniform(-1, 1) for _ in range(512)],
            "color": current_color,
            "color_tag": f"{current_color}_{str(random.randint(1000, 9999))}"
        })
    res = client.insert(
        collection_name="test_milvus_wp",
        data=data
    )
    data = []
    print(f"Inserted batch {j}, endTime: {time.time()}, costTime: {time.time() - start_time}")
```

## Latency

Woodpecker is a cloud-native WAL designed for object storage, with trade-offs between throughput, cost, and latency. The currently supported lightweight embedded mode prioritizes cost and throughput optimization, as most scenarios only require data to be written within a certain time rather than demanding low latency for individual write requests. Therefore, Woodpecker employs batched writes, with default intervals of 10 ms for local file system storage backends and 200 ms for MinIO-like storage backends. During slow write operations, the maximum latency equals the interval time plus the flush time.

Note that batch insertion is triggered not only by time intervals but also by batch size, which defaults to 2 MB.
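The latency bound described above can be sketched as a back-of-the-envelope calculation. The flush times below are illustrative assumptions, not measured values:

```python
def worst_case_write_latency_ms(sync_interval_ms: float, flush_time_ms: float) -> float:
    """Worst case for a single append: the entry arrives just after a sync
    fires, waits out the full sync interval, then pays the flush itself."""
    return sync_interval_ms + flush_time_ms

# Default sync intervals: 200 ms for MinIO-like backends, 10 ms for local FS.
print(worst_case_write_latency_ms(200, 50))  # MinIO backend with an assumed 50 ms flush
print(worst_case_write_latency_ms(10, 2))    # local FS with an assumed 2 ms flush
```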

For details on architecture, deployment modes (MemoryBuffer / QuorumBuffer), and performance, see [Woodpecker Architecture](woodpecker_architecture.md).

For more parameter details, refer to the Woodpecker [GitHub repository](https://github.com/zilliztech/woodpecker).

v2.6.x/site/en/getstarted/run-milvus-docker/install_standalone-docker.md

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ $ bash standalone_embed.sh start

 **What's new in v2.6.0:**
 - **Streaming Node**: Enhanced data processing capabilities
-- **WoodPecker MQ**: Improved message queue with reduced maintenance overhead
+- **Woodpecker MQ**: Improved message queue with reduced maintenance overhead; see [Use Woodpecker](use-woodpecker.md) for details
 - **Optimized Architecture**: Consolidated components for better performance

Always download the latest script to ensure you get the most recent configurations and architecture improvements.

v2.6.x/site/en/getstarted/run-milvus-docker/prerequisite-docker.md

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ The following dependencies will be obtained and configured automatically when Mi

 | Software | Version                       | Note |
 | -------- | ----------------------------- | ---- |
 | etcd     | 3.5.0                         | See [additional disk requirements](#Additional-disk-requirements). |
-| MinIO    | RELEASE.2023-03-20T20-16-18Z  | |
+| MinIO    | RELEASE.2024-12-18T13-15-44Z  | |
 | Pulsar   | 2.8.2                         | |

### Additional disk requirements
