This repository was archived by the owner on Dec 13, 2024. It is now read-only.

Commit a7b9369

update docs
thisisaaronland committed
1 parent 3fdf0ba commit a7b9369

2 files changed: +25 -75 lines changed

README.md

Lines changed: 14 additions & 70 deletions
@@ -38,10 +38,12 @@ Usage of ./bin/wof-sqlite-index-features:
 Index the records related to a feature, specifically wof:belongsto, wof:depicts and wof:involves. Alt files for relations are not indexed at this time.
 -index-relations-reader-uri string
 A valid go-reader.Reader URI from which to read data for a relations candidate.
+-iterator-uri string
+A valid whosonfirst/go-whosonfirst-iterate/emitter URI. Supported emitter URI schemes are: directory://,featurecollection://,file://,filelist://,geojsonl://,repo:// (default "repo://")
 -live-hard-die-fast
 Enable various performance-related pragmas at the expense of possible (unlikely) database corruption (default true)
 -mode string
-The mode to use importing data. Valid modes are: directory://, featurecollection://, file://, filelist://, geojsonl://, metafile://, repo://, sqlite://. (default "repo://")
+A valid whosonfirst/go-whosonfirst-iterate/emitter URI. Supported emitter URI schemes are: directory://,featurecollection://,file://,filelist://,geojsonl://,repo://. THIS FLAG IS DEPRECATED, please use -iterator-uri instead. (default "repo://")
 -names
 Index the 'names' table
 -optimize
@@ -50,10 +52,6 @@ Usage of ./bin/wof-sqlite-index-features:
 The number of concurrent processes to index data with (default 8)
 -properties
 Index the 'properties' table
--query value
-One or more {PATH}={REGEXP} parameters for filtering records.
--query-mode string
-Specify how query filtering should be evaluated. Valid modes are: ALL, ANY (default "ALL")
 -rtree
 Index the 'rtree' table
 -search
@@ -62,6 +60,8 @@ Usage of ./bin/wof-sqlite-index-features:
 Index the 'spr' table
 -strict-alt-files
 Be strict when indexing alt geometries (default true)
+-supersedes
+Index the 'supersedes' table
 -timings
 Display timings during and after indexing
 ```
@@ -72,8 +72,8 @@ For example:
 $> ./bin/wof-sqlite-index-features \
 -dsn microhoods.db \
 -all \
--mode meta:// \
-/usr/local/data/whosonfirst-data/meta/wof-microhood-latest.csv
+-iterator-uri 'repo://?include=properties.wof:placetype=microhood' \
+/usr/local/data/whosonfirst-data-admin-us
 ```
 
 Or creating databases for all the Who's On First repos:
@@ -105,7 +105,7 @@ done
 
 #### Inline queries
 
-You can also specify inline queries by passing a `-query` parameter which is a string in the format of:
+You can also specify inline queries by appending one or more `include` or `exclude` parameters to a `emitter.Emitter` URI, where the value is a string in the format of:
 
 ```
 {PATH}={REGULAR EXPRESSION}
@@ -119,8 +119,7 @@ For example:
 $> ./bin/wof-sqlite-index-features \
 -all \
 -dsn ca-region.db \
--query 'properties.wof:placetype=region' \
--mode repo:// \
+-iterator-uri 'repo://?include=properties.wof:placetype=region' \
 /usr/local/data/whosonfirst-data-admin-ca
 
 $> sqlite3 ca-region.db
@@ -141,19 +140,15 @@ sqlite> SELECT id,name,placetype FROM spr;
 85682113|Saskatchewan|region
 136251273|Quebec|region
 85682105|Nunavut|region
-sqlite>
-
 ```
 
-You can pass multiple `-query` parameters. For example:
+You can pass multiple query parameters. For example:
 
 ```
 $> ./bin/wof-sqlite-index-features \
 -all \
 -dsn ca-region.db \
--query 'properties.wof:placetype=region' \
--query 'properties.wof:name=(?i)new.*'
--mode repo:// \
+-iterator-uri 'repo://?include=properties.wof:placetype=region&include=properties.wof:name=(?i)new.*' \
 /usr/local/data/whosonfirst-data-admin-ca
 
 $> sqlite3 ca-region-new.db
@@ -163,61 +158,16 @@ Enter ".help" for usage hints.
 sqlite> SELECT id,name,placetype FROM spr;
 85682065|New Brunswick|region
 85682123|Newfoundland and Labrador|region
-sqlite>
 ```
 
-The default query mode is to ensure that all queries match but you can also specify that only one or more queries need to match by passing the `-query-mode ANY` flag:
+The default query mode is to ensure that all queries match but you can also specify that only one or more queries need to match by appending a `include_mode` or `exclude_mode` parameter where the value is either "ANY" or "ALL".
 
 #### SQLite performace-related PRAGMA
 
 Note that the `-live-hard-die-fast` flag is enabled by default. That is to enable a number of performace-related PRAGMA commands (described [here](https://blog.devart.com/increasing-sqlite-performance.html) and [here](https://www.gaia-gis.it/gaia-sins/spatialite-cookbook/html/system.html)) without which database index can be prohibitive and time-consuming. These is a small but unlikely chance of database corruptions when this flag is enabled.
 
 Also note that the `-live-hard-die-fast` flag will cause the `PAGE_SIZE` and `CACHE_SIZE` PRAGMAs to be set to `4096` and `1000000` respectively so the eventual cache size will require 4GB of memory. This is probably fine on most systems where you'll be indexing data but I am open to the idea that we may need to revisit those numbers or at least make them configurable.
 
-### wof-sqlite-query-features
-
-Query a search-enabled SQLite database by name(s). Results are output as CSV encoded rows containing `id` and `(wof:)name` properties.
-
-_This assumes you have created the database using the `wof-sqlite-index-features` tool with the `-search` paramter._
-
-```
-$> ./bin/wof-sqlite-query-features -h
-Usage of ./bin/wof-sqlite-query-features:
--column string
-The 'names_*' column to query against. Valid columns are: names_all, names_preferred, names_variant, names_colloquial. (default "names_all")
--driver string
-(default "sqlite3")
--dsn string
-(default ":memory:")
--is-ceased string
-A comma-separated list of valid existential flags (-1,0,1) to filter results according to whether or not they have been marked as ceased. Multiple flags are evaluated as a nested 'OR' query.
--is-current string
-A comma-separated list of valid existential flags (-1,0,1) to filter results according to their 'mz:is_current' property. Multiple flags are evaluated as a nested 'OR' query.
--is-deprecated string
-A comma-separated list of valid existential flags (-1,0,1) to filter results according to whether or not they have been marked as deprecated. Multiple flags are evaluated as a nested 'OR' query.
--is-superseded string
-A comma-separated list of valid existential flags (-1,0,1) to filter results according to whether or not they have been marked as superseded. Multiple flags are evaluated as a nested 'OR' query.
--output string
-A valid path to write (CSV) results to. If empty results are written to STDOUT.
--table string
-The name of the SQLite table to query against. (default "search")
-```
-
-For example:
-
-```
-$> ./bin/wof-sqlite-query-features -dsn test2.db JFK
-102534365,John F Kennedy Int'l Airport
-
-$> ./bin/wof-sqlite-query-features -dsn test2.db -column names_colloquial Paris
-85922583,San Francisco
-102027181,Shanghai
-102030585,Kolkata
-101751929,Tromsø
-```
-
-Full-text search is supported using SQLite's FTS4 indexer. In order to index the `search` table you must explicitly pass the `-search` flag to the `wof-sqlite-index-features` command. It is _not_ included when you set the `-all` flag (which should probably be renamed to be `-common` but that's not the case today...) because it increases the overall indexing time by a non-trivial amount.
-
 ## Spatial indexes
 
 ### RTree
@@ -232,7 +182,6 @@ $> ./bin/wof-sqlite-index-features \
 -properties \
 -timings \
 -dsn /usr/local/ca-alt.db \
--mode repo:// \
 /usr/local/data/whosonfirst-data-admin-ca/
 ```
 
@@ -246,7 +195,6 @@ $> ./bin/wof-sqlite-index-features \
 -timings \
 -spr \
 -geometries \
--mode repo:// \
 -dsn test.db /usr/local/data/whosonfirst-data-constituency-ca/
 
 10:09:46.534281 [wof-sqlite-index-features] STATUS time to index geometries (87) : 21.251828704s
@@ -301,7 +249,6 @@ $> ./bin/wof-sqlite-index-features \
 -geometries \
 -dsn /usr/local/data/dist/sqlite/whosonfirst-data-latest.db \
 -timings \
--mode repo:// \
 /usr/local/data/whosonfirst-data
 
 ...time passes...
@@ -324,7 +271,6 @@ $> ./bin/wof-sqlite-index-features \
 -all \
 -dsn /usr/local/data/dist/sqlite/whosonfirst-data-latest-nospatial.db \
 -timings \
--mode repo:// \
 /usr/local/data/whosonfirst-data
 ...time passes...
 10:06:13.226187 [wof-sqlite-index-features] STATUS time to index names (951541) : 12m32.359733539s
@@ -340,11 +286,9 @@ $> ./bin/wof-sqlite-index-features \
 
 As of this writing individual tables are indexed atomically. There may be some improvements to be made indexing tables in separate Go routines but my hunch is this will make SQLite sad and cause a lot of table lock errors. I don't need to be right about that, though...
 
-## Dependencies and relationships
-
-These are documented in the [Dependencies and relationships section](https://github.com/whosonfirst/go-whosonfirst-sqlite#dependencies-and-relationships) of the `go-whosonfirst-sqlite` package.
-
 ## See also
 
 * https://github.com/whosonfirst/go-whosonfirst-sqlite
 * https://github.com/whosonfirst/go-whosonfirst-sqlite-features
+* https://github.com/whosonfirst/go-whosonfirst-sqlite-index
+* https://github.com/whosonfirst/go-whosonfirst-iterate
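
The README changes above move record filtering from the removed `-query` and `-query-mode` flags into `include`, `exclude`, `include_mode` and `exclude_mode` parameters on the iterator URI. As a rough illustration of how such a URI breaks down, the following Go sketch parses the `include` parameters with the standard library and applies ALL/ANY matching. It is not the `go-whosonfirst-iterate` implementation, which this diff does not show: `filter`, `parseIncludes` and `matches` are hypothetical names, and a flat map of path to value is assumed as a stand-in for a real GeoJSON record.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// filter is a hypothetical stand-in for one {PATH}={REGULAR EXPRESSION} pair.
type filter struct {
	path string
	re   *regexp.Regexp
}

// parseIncludes extracts every "include" parameter from an iterator URI,
// splits each value on its first "=" into a path and a regular expression,
// and returns the filters together with the include mode (default "ALL").
func parseIncludes(uri string) ([]filter, string, error) {
	u, err := url.Parse(uri)
	if err != nil {
		return nil, "", err
	}
	q := u.Query()

	mode := strings.ToUpper(q.Get("include_mode"))
	if mode == "" {
		mode = "ALL"
	}
	if mode != "ALL" && mode != "ANY" {
		return nil, "", fmt.Errorf("invalid include_mode %q", mode)
	}

	var filters []filter
	for _, v := range q["include"] {
		path, expr, ok := strings.Cut(v, "=")
		if !ok {
			return nil, "", fmt.Errorf("invalid include parameter %q", v)
		}
		re, err := regexp.Compile(expr)
		if err != nil {
			return nil, "", err
		}
		filters = append(filters, filter{path: path, re: re})
	}
	return filters, mode, nil
}

// matches reports whether a record passes the filters. The record is modelled
// as a flat map keyed by path, which is an assumption made for this sketch.
func matches(props map[string]string, filters []filter, mode string) bool {
	if len(filters) == 0 {
		return true // nothing to filter on
	}
	hits := 0
	for _, f := range filters {
		if f.re.MatchString(props[f.path]) {
			hits++
		}
	}
	if mode == "ANY" {
		return hits > 0
	}
	return hits == len(filters)
}

func main() {
	uri := "repo://?include=properties.wof:placetype=region&include=properties.wof:name=(?i)new.*"
	filters, mode, err := parseIncludes(uri)
	if err != nil {
		panic(err)
	}
	record := map[string]string{
		"properties.wof:placetype": "region",
		"properties.wof:name":      "New Brunswick",
	}
	fmt.Println(mode, len(filters), matches(record, filters, mode))
}
```

Splitting each value on its first `=` only is what lets a regular expression that itself contains `=` survive intact; whether the real package does the same is not something this diff confirms.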

cmd/wof-sqlite-index-features/main.go

Lines changed: 11 additions & 5 deletions
@@ -25,10 +25,12 @@ import (
 func main() {
 
 	valid_schemes := strings.Join(emitter.Schemes(), ",")
-	emitter_desc := fmt.Sprintf("A valid whosonfirst/go-whosonfirst-iterate/emitter URI. Supported emitter URI schemes are: %s", valid_schemes)
+	iterator_desc := fmt.Sprintf("A valid whosonfirst/go-whosonfirst-iterate/emitter URI. Supported emitter URI schemes are: %s", valid_schemes)
 
-	// mode := flag.String("mode", "repo://", emitter_modes)
-	emitter_uri := flag.String("emitter-uri", "repo://", emitter_desc)
+	iterator_uri := flag.String("iterator-uri", "repo://", iterator_desc)
+
+	mode_desc := fmt.Sprintf("%s. THIS FLAG IS DEPRECATED, please use -iterator-uri instead.", iterator_desc)
+	mode := flag.String("mode", "repo://", mode_desc)
 
 	dsn := flag.String("dsn", ":memory:", "")
 	driver := flag.String("driver", "sqlite3", "")
@@ -60,6 +62,10 @@ func main() {
 
 	flag.Parse()
 
+	if *iterator_uri == "" {
+		*iterator_uri = *mode
+	}
+
 	ctx := context.Background()
 
 	runtime.GOMAXPROCS(*procs)
@@ -324,10 +330,10 @@ func main() {
 	idx.Timings = *timings
 	idx.Logger = logger
 
-	err = idx.IndexPaths(ctx, *emitter_uri, flag.Args())
+	err = idx.IndexPaths(ctx, *iterator_uri, flag.Args())
 
 	if err != nil {
-		logger.Fatal("Failed to index paths in %s mode because: %s", *emitter_uri, err)
+		logger.Fatal("Failed to index paths in %s mode because: %s", *iterator_uri, err)
 	}
 
 	os.Exit(0)
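
In the second `main.go` hunk above, `-iterator-uri` defaults to `"repo://"`, so the `*iterator_uri == ""` check only fires when an empty value is passed explicitly. Judging only by the hunks shown here, a bare `-mode` value therefore appears to go unused. One possible way to honour the deprecated flag, offered as a sketch and not as the project's approach, is to ask the standard `flag` package which flags were actually set on the command line. `resolveIteratorURI` is a hypothetical helper, and the dedicated `FlagSet` is an assumption made so the example can run on its own.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// resolveIteratorURI (hypothetical) parses args and returns the URI to use:
// the deprecated -mode value is honoured only when -mode was set explicitly
// and -iterator-uri was not.
func resolveIteratorURI(args []string) (string, error) {
	fs := flag.NewFlagSet("wof-sqlite-index-features", flag.ContinueOnError)
	iteratorURI := fs.String("iterator-uri", "repo://", "A valid emitter URI.")
	mode := fs.String("mode", "repo://", "DEPRECATED, please use -iterator-uri instead.")

	if err := fs.Parse(args); err != nil {
		return "", err
	}

	// Visit only calls back for flags that were set on the command line,
	// which distinguishes "left at its default" from "explicitly passed".
	set := map[string]bool{}
	fs.Visit(func(f *flag.Flag) {
		set[f.Name] = true
	})

	if set["mode"] && !set["iterator-uri"] {
		return *mode, nil
	}
	return *iteratorURI, nil
}

func main() {
	uri, err := resolveIteratorURI(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(uri)
}
```

With this arrangement an explicit `-iterator-uri` always wins, and the default `repo://` is used when neither flag is given.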
