
A simple implementation and visualization of the Deterministic, Independent-of-Corpus Embeddings (DICE) proposed by Sundararaman et al., 2020 at EMNLP 2020


wjdghks950/Methods-for-Numeracy-Preserving-Word-Embeddings


Methods for Numeracy-Preserving Word Embeddings (DICE), Sundararaman et al., 2020

Deterministic, Independent-of-Corpus Embeddings (DICE) is a non-contextual embedding for numbers, constructed so that the cosine similarity between two embeddings reflects the numerical distance between the numbers they encode.

The authors report superior performance on non-contextual numerical tasks, such as finding the maximum of a list of numbers and performing basic arithmetic operations.

This repository provides a basic implementation of the DICE embedding. It does not include the evaluations.

I also provide 2-D and 3-D visualizations of the DICE embeddings for the ranges [0, 100] and [0, 9999]. They illustrate the characteristics of DICE embeddings and how they occupy the D-dimensional space spanned by the orthonormal columns of Q, which are obtained from a random matrix M by QR decomposition.
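The construction described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names `make_basis`, `dice_embed`, and `cosine` are hypothetical, the number is assumed to lie inside the bound [low, high], and the last coordinate uses the exponent D-1 so that the vector has unit norm (the exponent convention in the paper may differ).

```python
import numpy as np


def make_basis(dim, seed=0):
    """Orthonormal basis Q from the QR decomposition of a random matrix M."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((dim, dim))
    Q, _ = np.linalg.qr(M)  # columns of Q are orthonormal
    return Q


def dice_embed(x, low, high, Q):
    """Map a number x in [low, high] to a unit vector in R^D (sketch)."""
    dim = Q.shape[0]
    # Linear map from the bound [low, high] to an angle in [0, pi].
    theta = (x - low) / (high - low) * np.pi
    # Polar-style coordinates: sin^k(theta) * cos(theta) for k = 0 .. D-2,
    # then sin^(D-1)(theta) as the last coordinate (assumed for unit norm).
    powers = np.arange(dim - 1)
    v = np.empty(dim)
    v[:-1] = np.sin(theta) ** powers * np.cos(theta)
    v[-1] = np.sin(theta) ** (dim - 1)
    # Rotate into the random orthonormal basis; norms and angles are preserved.
    return Q @ v


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


if __name__ == "__main__":
    Q = make_basis(2)
    e0, e50, e100 = (dice_embed(x, 0, 100, Q) for x in (0, 50, 100))
    print(cosine(e0, e50), cosine(e0, e100))
```

In the 2-D case the cosine similarity equals the cosine of the angle difference, so the endpoints 0 and 100 of the bound come out with similarity close to -1, and 0 and 50 with similarity close to 0. Because Q is orthogonal, multiplying by it changes only the orientation of the embeddings, not their pairwise similarities.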

2-D Visualization (DICE-2) (range: [0, 100] / bound: s_n \in [0,100])

dice2d

3-D Visualization (DICE-3) (range: [0, 1000] / bound: s_m \in [0, 9999])

dice3d

3-D Visualization (DICE-D) (D=256) (range: [0, 10000] / bound: s_n \in [0,100])

dice256d
