Conversation

@irinakhismatullina (Contributor) commented on Oct 29, 2019

Encoding works exactly the same way as in YouTokenToMe.

  • Input is space-separated words; no normalization is performed.
  • Each word is tokenized separately, and a special BOW (begin-of-word, or space) token is prepended to each word.
  • Tokenization proceeds left-first (aaa -> aa a, not a aa); see the sketch after this list.
  • Both the numerical encoding (token ids) and the string tokens are returned at once. This may be changed if needed.
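
For reference, here is a minimal Python sketch of the rules above. It is not the PR's actual code; `merges` (pair -> merge rank), `vocab` (token -> id), and the `BOW` marker are hypothetical stand-ins for a trained BPE model.

```python
BOW = "\u2581"  # assumed begin-of-word (space) marker, as in SentencePiece-style BPE

def encode_word(word, merges, vocab):
    """Tokenize a single word; the BOW token is prepended first."""
    symbols = [BOW] + list(word)
    while True:
        # Pick the adjacent pair with the lowest merge rank; the strict "<"
        # keeps the leftmost pair on ties, giving the left-first order
        # described above ("aaa" -> "aa a", not "a aa").
        best_i, best_rank = None, None
        for i in range(len(symbols) - 1):
            rank = merges.get((symbols[i], symbols[i + 1]))
            if rank is not None and (best_rank is None or rank < best_rank):
                best_i, best_rank = i, rank
        if best_i is None:
            break
        symbols[best_i:best_i + 2] = [symbols[best_i] + symbols[best_i + 1]]
    return [vocab[s] for s in symbols], symbols

def encode(text, merges, vocab):
    """Split on spaces (no normalization) and tokenize each word separately,
    returning both the numeric ids and the string tokens at once."""
    ids, tokens = [], []
    for word in text.split(" "):
        word_ids, word_tokens = encode_word(word, merges, vocab)
        ids += word_ids
        tokens += word_tokens
    return ids, tokens

# Hypothetical toy model: one merge rule ("a", "a") and a tiny vocab.
merges = {("a", "a"): 0}
vocab = {BOW: 0, "a": 1, "aa": 2}
print(encode("aaa", merges, vocab))
# -> ([0, 2, 1], ['▁', 'aa', 'a'])  # left-first: "aaa" -> "aa a"
```

The toy example shows the left-first rule in action: the leftmost "a a" pair is merged first, so "aaa" comes out as "aa a" rather than "a aa".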

Closes #4 (Implement encoding text with BPE model)

@vmarkovtsev (Collaborator) commented
Good job @irinakhismatullina! I like where we are heading.

@vmarkovtsev merged commit 2feb9cc into src-d:master on Oct 30, 2019