Hi @mspronesti, in general, using syntax-based models on social media text, such as Twitter, is not recommended due to the special nature of the text (emojis, hashtags, abbreviations, URLs). I would recommend using one of the linear readers in lambeq, such as cups_reader (with swaps removed), stairs_reader, or even spiders_reader. Regarding pre-processing, do not bother to "correct the spelling": this is a separate task in itself and can't be done reliably. My advice, if you use one of the linear readers, is to include emojis as separate tokens in your vocabulary. They have a special meaning in social media and are actually very useful in determining the "meaning" of a sentence. There are no strict rul…
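
To make the "emojis as separate tokens" advice concrete, here is a minimal sketch of a tweet tokeniser using only the Python standard library. The function name `tokenise_tweet` and the emoji ranges are my own assumptions for illustration (the ranges are rough and do not cover every Unicode emoji); this is not part of lambeq. Note that it leaves spelling untouched, so "gr8" and "tonite" stay as they are.

```python
import re

# Rough emoji ranges: an assumption for illustration only,
# not a complete definition of Unicode emoji.
EMOJI = "[\U0001F300-\U0001FAFF\u2600-\u27BF]"

TOKEN_RE = re.compile(
    r"https?://\S+"        # URLs are kept whole
    r"|[#@]\w+"            # hashtags and mentions are kept whole
    rf"|{EMOJI}"           # each emoji becomes its own token
    r"|\w+(?:'\w+)?"       # words, including simple contractions
    r"|[^\w\s]"            # any other single non-space symbol
)


def tokenise_tweet(text: str) -> list[str]:
    """Split a tweet into tokens without correcting spelling (hypothetical helper)."""
    return TOKEN_RE.findall(text)


tokens = tokenise_tweet("gr8 game tonite😂😂 #win https://t.co/abc")
print(tokens)
# ['gr8', 'game', 'tonite', '😂', '😂', '#win', 'https://t.co/abc']
```

The resulting list could then be handed to one of the linear readers. As far as I know, the readers' `sentence2diagram` accepts pre-tokenised input via `tokenised=True`, but please check this against the lambeq version you have installed.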

Answer selected by mspronesti