Commit 3490271

add: extended documentation.
1 parent 9f19747 commit 3490271

3 files changed: +35 -33 lines changed

README.rst

Lines changed: 33 additions & 32 deletions
@@ -130,40 +130,41 @@ the corresponding text representation.

 Command line parameters
 -----------------------
+
 The inscript.py command line client supports the following parameters::

-usage: inscript.py [-h] [-o OUTPUT] [-e ENCODING] [-i] [-d] [-l] [-a] [-r ANNOTATION_RULES] [-p POSTPROCESSOR] [--indentation INDENTATION]
-[--table-cell-separator TABLE_CELL_SEPARATOR] [-v]
-[input]
-
-Convert the given HTML document to text.
-
-positional arguments:
-input Html input either from a file or a URL (default:stdin).
-
-optional arguments:
--h, --help show this help message and exit
--o OUTPUT, --output OUTPUT
-Output file (default:stdout).
--e ENCODING, --encoding ENCODING
-Input encoding to use (default:utf-8 for files; detected server encoding for Web URLs).
--i, --display-image-captions
-Display image captions (default:false).
--d, --deduplicate-image-captions
-Deduplicate image captions (default:false).
--l, --display-link-targets
-Display link targets (default:false).
--a, --display-anchor-urls
-Display anchor URLs (default:false).
--r ANNOTATION_RULES, --annotation-rules ANNOTATION_RULES
-Path to an optional JSON file containing rules for annotating the retrieved text.
--p POSTPROCESSOR, --postprocessor POSTPROCESSOR
-Optional component for postprocessing the result (html, surface, xml).
---indentation INDENTATION
-How to handle indentation (extended or strict; default: extended).
---table-cell-separator TABLE_CELL_SEPARATOR
-Separator to use between table cells (default: three spaces).
--v, --version display version information
+usage: inscript.py [-h] [-o OUTPUT] [-e ENCODING] [-i] [-d] [-l] [-a] [-r ANNOTATION_RULES] [-p POSTPROCESSOR] [--indentation INDENTATION]
+[--table-cell-separator TABLE_CELL_SEPARATOR] [-v]
+[input]
+
+Convert the given HTML document to text.
+
+positional arguments:
+input Html input either from a file or a URL (default:stdin).
+
+optional arguments:
+-h, --help show this help message and exit
+-o OUTPUT, --output OUTPUT
+Output file (default:stdout).
+-e ENCODING, --encoding ENCODING
+Input encoding to use (default:utf-8 for files; detected server encoding for Web URLs).
+-i, --display-image-captions
+Display image captions (default:false).
+-d, --deduplicate-image-captions
+Deduplicate image captions (default:false).
+-l, --display-link-targets
+Display link targets (default:false).
+-a, --display-anchor-urls
+Display anchor URLs (default:false).
+-r ANNOTATION_RULES, --annotation-rules ANNOTATION_RULES
+Path to an optional JSON file containing rules for annotating the retrieved text.
+-p POSTPROCESSOR, --postprocessor POSTPROCESSOR
+Optional component for postprocessing the result (html, surface, xml).
+--indentation INDENTATION
+How to handle indentation (extended or strict; default: extended).
+--table-cell-separator TABLE_CELL_SEPARATOR
+Separator to use between table cells (default: three spaces).
+-v, --version display version information

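The help text in this hunk maps closely onto a standard `argparse` setup. The following is a minimal sketch of a parser that would accept the documented options. It is reconstructed from the help text alone and is not the project's actual `inscript.py` code: the function name `build_parser`, the use of `store_true` for the boolean flags, the `choices` restrictions, and the `None` defaults are all assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options (not the real inscript.py)."""
    parser = argparse.ArgumentParser(
        prog='inscript.py',
        description='Convert the given HTML document to text.')
    # Optional positional argument; None is assumed to mean "read from stdin".
    parser.add_argument('input', nargs='?', default=None)
    parser.add_argument('-o', '--output', default=None)
    parser.add_argument('-e', '--encoding', default=None)
    # Boolean switches that the help text documents with default:false.
    parser.add_argument('-i', '--display-image-captions', action='store_true')
    parser.add_argument('-d', '--deduplicate-image-captions', action='store_true')
    parser.add_argument('-l', '--display-link-targets', action='store_true')
    parser.add_argument('-a', '--display-anchor-urls', action='store_true')
    parser.add_argument('-r', '--annotation-rules', default=None)
    # Restricting the values via choices is an assumption based on the help text.
    parser.add_argument('-p', '--postprocessor', default=None,
                        choices=('html', 'surface', 'xml'))
    parser.add_argument('--indentation', default='extended',
                        choices=('extended', 'strict'))
    # The documented default separator is three spaces.
    parser.add_argument('--table-cell-separator', default='   ')
    parser.add_argument('-v', '--version', action='store_true')
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(['-l', '--indentation', 'strict', 'page.html'])
print(args.display_link_targets, args.indentation, args.input)
```

Note that `argparse` converts the dashes in long option names to underscores, so `--display-link-targets` is read back as `args.display_link_targets`.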

src/inscriptis/metadata.py

Lines changed: 1 addition & 1 deletion
@@ -4,4 +4,4 @@

 __copyright__ = '2016-2021 Albert Weichselbraun, Fabian Odoni'
 __license__ = 'Apache 2.0'
-__version__ = '2.1.1'
+__version__ = '2.2.0'

src/inscriptis/model/table.py

Lines changed: 1 addition & 0 deletions
@@ -151,6 +151,7 @@ class TableRow:
         columns: the table row's columns.
         cell_separator: string used for separating columns from each other.
     """
+
     __slots__ = ('columns', 'cell_separator')

     def __init__(self, cell_separator):
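
This hunk only inserts a blank line between the class docstring and `__slots__`, so behaviour is unchanged. For readers unfamiliar with the declaration it touches, here is a minimal sketch of what `__slots__` does. The class below is a simplified stand-in, not the real `TableRow`: the `__init__` body (an empty `columns` list) is an assumption, since the diff does not show it.

```python
class TableRow:
    """Simplified stand-in for the class in the diff.

    Attributes:
        columns: the table row's columns.
        cell_separator: string used for separating columns from each other.
    """

    __slots__ = ('columns', 'cell_separator')

    def __init__(self, cell_separator):
        # Assumed initialisation; the real method body is not part of the diff.
        self.columns = []
        self.cell_separator = cell_separator


row = TableRow(cell_separator='   ')

# Attributes named in __slots__ behave as usual.
row.columns.append('first cell')

# Any other attribute is rejected because instances have no per-instance __dict__.
try:
    row.width = 10
except AttributeError:
    print('width is not a declared slot')
```

Declaring slots trades flexibility for a smaller per-instance memory footprint, which can matter when a large table creates many row objects.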
