Commit bf5e5e3

hatamiarash7 authored and gitbook-bot committed
GITBOOK-2: No subject
1 parent 8e6eb6c commit bf5e5e3

15 files changed: +528 -0 lines changed

docs/README.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
---
icon: face-glasses
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# Introduction

The Netquack DuckDB extension simplifies working with domains, URIs, and web paths directly within your database queries. Whether you're extracting top-level domains (TLDs), parsing URI components, or analyzing web paths, Netquack provides a suite of functions for these tasks. It is built for data engineers, analysts, and developers.

With Netquack, you can analyze web-related datasets without external tools or complex workflows.

### What is DuckDB? <a href="#what-is-duckdb" id="what-is-duckdb"></a>

DuckDB is an in-process SQL OLAP database management system designed to handle analytical query workloads efficiently. It is lightweight, easy to integrate, and features an intuitive interface for querying and processing data directly within applications. DuckDB is gaining popularity for its performance and low overhead, making it an excellent choice for processing large datasets directly in various programming environments.

docs/SUMMARY.md

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# Table of contents

* [Introduction](README.md)
* [Why Netquack](why-netquack.md)

## Getting Started

* [Quickstart](getting-started/quickstart.md)
* [How to build](getting-started/publish-your-docs.md)

## Functions

* [Extract Domain](functions/editor.md)
* [Extract Subdomain](functions/extract-subdomain.md)
* [Extract Path](functions/extract-path.md)
* [Extract Host](functions/extract-host.md)
* [Extract Schema](functions/extract-schema.md)
* [Extract Query](functions/extract-query.md)
* [Extract TLD](functions/extract-tld.md)
* [Tranco](functions/tranco/README.md)
  * [Get Tranco Rank](functions/tranco/get-tranco-rank.md)
  * [Download / Update Tranco](functions/tranco/download-update-tranco.md)

docs/functions/editor.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
---
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# Extract Domain

This function extracts the main domain from a URL. To do so, the extension loads the public suffix list from [publicsuffix.org](https://publicsuffix.org/) and uses it to separate the main domain from its suffix.

The public suffix list is downloaded automatically the first time the function is called. It is then stored in the `public_suffix_list` table, so it is not downloaded again.

```sql
D SELECT extract_domain('a.example.com') as domain;
┌─────────────┐
│   domain    │
│   varchar   │
├─────────────┤
│ example.com │
└─────────────┘

D SELECT extract_domain('https://b.a.example.com/path') as domain;
┌─────────────┐
│   domain    │
│   varchar   │
├─────────────┤
│ example.com │
└─────────────┘
```
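As a rough illustration of suffix-based matching, the Python sketch below finds the longest known suffix of a host and keeps one label to its left. It is not the extension's implementation: `SUFFIXES` is a tiny hypothetical stand-in for the public suffix list, `main_domain` is an illustrative name, and the wildcard and exception rules of the real list are ignored.

```python
import re
from urllib.parse import urlsplit

# Hypothetical, tiny stand-in for the public suffix list (illustration only).
SUFFIXES = {"com", "ac", "com.ac"}


def main_domain(url: str) -> str:
    # urlsplit only recognises the host after "//", so add it when no scheme is given.
    if not re.match(r"[A-Za-z][A-Za-z0-9+.-]*://", url):
        url = "//" + url
    labels = (urlsplit(url).hostname or "").split(".")
    # Try the longest candidate suffix first, then progressively shorter ones.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in SUFFIXES:
            # Keep the matched suffix plus one label to its left.
            return ".".join(labels[max(i - 1, 0):])
    return ""


print(main_domain("https://b.a.example.com/path"))
```

Matching the longest suffix first matters for multi-label suffixes such as `com.ac`, where stopping at `ac` would give the wrong domain.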

docs/functions/extract-host.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
---
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# Extract Host

This function extracts the host from a URL.

```sql
D SELECT extract_host('https://b.a.example.com/path/path') as host;
┌─────────────────┐
│      host       │
│     varchar     │
├─────────────────┤
│ b.a.example.com │
└─────────────────┘

D SELECT extract_host('example.com:443/path/image.png') as host;
┌─────────────┐
│    host     │
│   varchar   │
├─────────────┤
│ example.com │
└─────────────┘
```
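For comparison outside DuckDB, a minimal Python sketch of the same idea using the standard library's `urllib.parse` is shown here. The function name `host` is illustrative, and this is only an approximation of the behaviour in the examples above, not the extension's code.

```python
import re
from urllib.parse import urlsplit


def host(url: str) -> str:
    # Without a scheme, urlsplit does not recognise the authority part,
    # so prefix "//" to mark where the host starts.
    if not re.match(r"[A-Za-z][A-Za-z0-9+.-]*://", url):
        url = "//" + url
    # .hostname drops any port and user info and lower-cases the host.
    return urlsplit(url).hostname or ""


print(host("example.com:443/path/image.png"))
```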

docs/functions/extract-path.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
---
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# Extract Path

This function extracts the path from a URL.

```sql
D SELECT extract_path('https://b.a.example.com/path/path') as path;
┌────────────┐
│    path    │
│  varchar   │
├────────────┤
│ /path/path │
└────────────┘

D SELECT extract_path('example.com/path/path/image.png') as path;
┌──────────────────────┐
│         path         │
│       varchar        │
├──────────────────────┤
│ /path/path/image.png │
└──────────────────────┘
```
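The path component can be sketched the same way in Python with `urllib.parse`. The name `path` is illustrative and the handling of scheme-less input is an assumption of this sketch, chosen to match the two examples above.

```python
import re
from urllib.parse import urlsplit


def path(url: str) -> str:
    # Mark the start of the host when no scheme is present, so that
    # "example.com" is not mistaken for the first path segment.
    if not re.match(r"[A-Za-z][A-Za-z0-9+.-]*://", url):
        url = "//" + url
    # The path excludes the query string and fragment.
    return urlsplit(url).path


print(path("example.com/path/path/image.png"))
```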

docs/functions/extract-query.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
---
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# Extract Query

This function extracts the query string from a URL.

```sql
D SELECT extract_query_string('example.com?key=value') as query;
┌───────────┐
│   query   │
│  varchar  │
├───────────┤
│ key=value │
└───────────┘

D SELECT extract_query_string('http://example.com.ac/path/?a=1&b=2&') as query;
┌──────────┐
│  query   │
│ varchar  │
├──────────┤
│ a=1&b=2& │
└──────────┘
```
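A minimal Python sketch of query-string extraction follows, again with `urllib.parse`. The name `query_string` is illustrative. Like the examples above, it returns the raw text after `?` without decoding or splitting it into key/value pairs.

```python
import re
from urllib.parse import urlsplit


def query_string(url: str) -> str:
    # Add "//" for scheme-less input so the host is parsed as a host.
    if not re.match(r"[A-Za-z][A-Za-z0-9+.-]*://", url):
        url = "//" + url
    # .query is the raw text between "?" and any "#fragment".
    return urlsplit(url).query


print(query_string("http://example.com.ac/path/?a=1&b=2&"))
```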

docs/functions/extract-schema.md

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
---
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# Extract Schema

This function extracts the schema (the URI scheme) from a URL. Currently supported schemas:

* `http` | `https`
* `ftp`
* `mailto`
* `tel` | `sms`

```sql
D SELECT extract_schema('https://b.a.example.com/path/path') as schema;
┌─────────┐
│ schema  │
│ varchar │
├─────────┤
│ https   │
└─────────┘

D SELECT extract_schema('mailto:[email protected]') as schema;
┌─────────┐
│ schema  │
│ varchar │
├─────────┤
│ mailto  │
└─────────┘

D SELECT extract_schema('tel:+123456789') as schema;
┌─────────┐
│ schema  │
│ varchar │
├─────────┤
│ tel     │
└─────────┘
```
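The list of supported schemas suggests a simple allow-list check. The Python sketch below shows one way to do that. The name `schema` is illustrative, and returning an empty string for anything outside the list is an assumption of this sketch, not documented behaviour of the extension.

```python
# Schemas named in the list above.
SUPPORTED = {"http", "https", "ftp", "mailto", "tel", "sms"}


def schema(url: str) -> str:
    # The schema is everything before the first ":".
    prefix, sep, _ = url.partition(":")
    prefix = prefix.lower()
    # Assumption: unsupported or missing schemas yield an empty string.
    return prefix if sep and prefix in SUPPORTED else ""


print(schema("tel:+123456789"))
```

Checking against the allow-list also keeps a host with a port, such as `example.com:443`, from being read as a schema.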

docs/functions/extract-subdomain.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
---
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# Extract Subdomain

This function extracts the subdomain from a URL. It uses the public suffix list to identify the TLD. See the [Extracting The Main Domain](https://github.com/hatamiarash7/duckdb-netquack#extracting-the-main-domain) section for more information about the public suffix list.

```sql
D SELECT extract_subdomain('http://a.b.example.com/path') as dns_record;
┌────────────┐
│ dns_record │
│  varchar   │
├────────────┤
│ a.b        │
└────────────┘

D SELECT extract_subdomain('test.example.com.ac') as dns_record;
┌────────────┐
│ dns_record │
│  varchar   │
├────────────┤
│ test       │
└────────────┘
```
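To make the role of the suffix list concrete, this Python sketch treats the subdomain as whatever remains to the left of the main domain. It assumes the host has already been isolated from the URL. `SUFFIXES` is a hypothetical three-entry stand-in for the public suffix list, and `subdomain` is an illustrative name rather than the extension's code.

```python
# Hypothetical, tiny stand-in for the public suffix list (illustration only).
SUFFIXES = {"com", "ac", "com.ac"}


def subdomain(host: str) -> str:
    labels = host.split(".")
    # Try the longest candidate suffix first.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in SUFFIXES:
            # labels[i - 1] is the main domain label; everything
            # before it forms the subdomain.
            return ".".join(labels[:max(i - 1, 0)])
    return ""


print(subdomain("test.example.com.ac"))
```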

docs/functions/extract-tld.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
---
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# Extract TLD

This function extracts the top-level domain (TLD) from a URL. It uses the public suffix list to identify the TLD. See the [Extracting The Main Domain](https://github.com/hatamiarash7/duckdb-netquack#extracting-the-main-domain) section for more information about the public suffix list.

```sql
D SELECT extract_tld('https://example.com.ac/path/path') as tld;
┌─────────┐
│   tld   │
│ varchar │
├─────────┤
│ com.ac  │
└─────────┘

D SELECT extract_tld('a.example.com') as tld;
┌─────────┐
│   tld   │
│ varchar │
├─────────┤
│ com     │
└─────────┘
```
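The `com.ac` result above shows that the TLD here is the longest matching public suffix, not just the last label. A minimal Python sketch of that lookup follows. It assumes the host is already isolated, uses a hypothetical three-entry `SUFFIXES` set in place of the real list, and ignores wildcard and exception rules.

```python
# Hypothetical, tiny stand-in for the public suffix list (illustration only).
SUFFIXES = {"com", "ac", "com.ac"}


def tld(host: str) -> str:
    labels = host.split(".")
    # Candidates run from the full host down to the last label,
    # so the first hit is the longest matching suffix.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in SUFFIXES:
            return candidate
    return ""


print(tld("example.com.ac"))
```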

docs/functions/tranco/README.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
---
layout:
  title:
    visible: true
  description:
    visible: false
  tableOfContents:
    visible: true
  outline:
    visible: true
  pagination:
    visible: true
---

# Tranco

Query the [Tranco](https://tranco-list.eu/) ranking list from within your DuckDB database.
