Corpus × corpus distances

Have you considered the speed improvements that might come from computing distances between two whole corpora (queries and a document collection) at once? With cosine similarity, this is simply a matrix product of two normalized term-document matrices. With network flows, perhaps one large network could be constructed, in which the same word in different documents would be a distinct node? Running a for loop over `nearest_neighbors` is not very fast, even with the heuristics you implemented.
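To illustrate the cosine case, here is a minimal NumPy sketch of what I mean by corpus × corpus. The function name `cosine_similarity_matrix` and the layout (one row per document, one column per term) are my own assumptions for the example, not anything from this library.

```python
import numpy as np


def cosine_similarity_matrix(queries, documents):
    """Return all query-document cosine similarities in one matrix product.

    Assumed layout (hypothetical): one row per document, one column per term.
    """
    q = np.asarray(queries, dtype=float)
    d = np.asarray(documents, dtype=float)
    # L2-normalize each row; all-zero rows are left as zeros.
    q_norms = np.linalg.norm(q, axis=1, keepdims=True)
    d_norms = np.linalg.norm(d, axis=1, keepdims=True)
    q = q / np.where(q_norms == 0, 1.0, q_norms)
    d = d / np.where(d_norms == 0, 1.0, d_norms)
    # Entry (i, j) is the cosine similarity of query i and document j.
    return q @ d.T


# Toy bag-of-words counts over a three-term vocabulary.
queries = [[1, 0, 1], [0, 2, 0]]
documents = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
sims = cosine_similarity_matrix(queries, documents)
```

The whole queries × documents table comes out of a single matrix product instead of a Python-level loop over queries, which is the kind of speedup I am hoping has an analogue for the network-flow distance.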

Corpus × corpus distances #47

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Corpus × corpus distances #47

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions