Commit 7ed792a

documents baseline comes with CRF
1 parent 3312f34 commit 7ed792a

23 files changed: 179 additions and 61 deletions

README.md

Lines changed: 4 additions & 4 deletions
@@ -8,10 +8,10 @@ and [benchmark results](/docs/transformers_benchmark.md) with fine-tuning BERT).
 
 | Model| Dataset | Precision | Recall | F1 |
 |-------| ------- | :---------: | :------: | :--: |
-|BERT-base-cased (this repo)| CONLL-2003 | 91.69 | 92.05 | 91.87 |
-|Roberta-base (this repo)| CoNLL-2003 | **91.88** | **93.01** |**92.44**|
-|BERT-base-cased (this repo)| OntoNotes 5 |89.57 | 89.45 | 89.51 |
-|Roberta-base (this repo)| OntoNotes 5 | **90.12** | **91.25** |**90.68**|
+|BERT-base-cased + CRF (this repo)| CONLL-2003 | 91.69 | 92.05 | 91.87 |
+|Roberta-base + CRF (this repo)| CoNLL-2003 | **91.88** | **93.01** |**92.44**|
+|BERT-base-cased + CRF (this repo)| OntoNotes 5 |89.57 | 89.45 | 89.51 |
+|Roberta-base + CRF (this repo)| OntoNotes 5 | **90.12** | **91.25** |**90.68**|
 
 More [details](/docs/transformers_benchmark.md)
 

config/reader.py

Lines changed: 0 additions & 50 deletions
This file was deleted.

docs/transformers_benchmark.md

Lines changed: 7 additions & 7 deletions
@@ -9,10 +9,10 @@ We strictly follow the optimizer configuration as in HuggingFace and use batch s
 |-------| ------- | :---------: | :------: | :--: |
 |HuggingFace Default (bert-base-cased)| Test Set | 90.71 | 92.04| 91.37|
 |HuggingFace Default (roberta-base)*| Test Set | 89.41 | 91.47|90.43|
-|BERT-base-cased (this repo)| Test set | 91.69 | 92.05 | 91.87 |
-|BERT-large-cased (this repo)| Test Set | 92.03 | 92.17 | 92.10 |
-|Roberta-base (this repo)| Test Set | 91.88 | 93.01 |92.44|
-|Roberta-large (this repo)| Test Set | **92.27** | **93.18** |**92.72**|
+|BERT-base-cased-CRF (this repo)| Test set | 91.69 | 92.05 | 91.87 |
+|BERT-large-cased-CRF (this repo)| Test Set | 92.03 | 92.17 | 92.10 |
+|Roberta-base-CRF (this repo)| Test Set | 91.88 | 93.01 |92.44|
+|Roberta-large-CRF (this repo)| Test Set | **92.27** | **93.18** |**92.72**|
 HuggingFace Default (roberta-base)* has an issue with tokenization (There is no leading space).
 
 We didn't achieve 92.4 F1 as reported in the BERT paper.
@@ -25,8 +25,8 @@ I think one of the main reasons is they are using the document-level dataset ins
 
 | Model| Dataset | Precision | Recall | F1 |
 |-------| ------- | :---------: | :------: | :--: |
-|BERT-base-cased (this repo)| Test Set |89.57 | 89.45 | 89.51 |
-|BERT-large-cased (this repo)*| Test Set | - | -|-|
-|Roberta-base (this repo)| Test Set | **90.12** | **91.25** |**90.68**|
+|BERT-base-cased-CRF (this repo)| Test Set |89.57 | 89.45 | 89.51 |
+|BERT-large-cased-CRF (this repo)*| Test Set | - | -|-|
+|Roberta-base-CRF (this repo)| Test Set | **90.12** | **91.25** |**90.68**|
 
 Roberta-base (this repo)* is still running. The others are not finished yet.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.