How to create code embeddings from a Java codebase and store them in a vector database? #180

Open
@shankernamami

Hi there team code2vec,

I am working on a personal project. My aim is to store a Java codebase in a vector database so that I can run similarity searches and retrieve the code files relevant to a query. Queries can be of the following types:

  1. A method that creates a database connection pool.
  2. An entity class linked to the 'Subjects' table.

In other words, each query describes an activity performed by the codebase, and the search should return the package, the class name, and (if required) the method.

My plan is to vectorize these search queries using a vectorizer from your codebase, perform a similarity search against the stored code vectors, and return the results.
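To make the retrieval step concrete, here is a rough sketch of what I have in mind, using plain Python and an in-memory list in place of a real vector database. Everything in it is an assumption for illustration: the `CodeEntry` record, the `search` helper, the package and class names, and the three-dimensional vectors are all made up, and the actual code vectors would have to come from the pretrained model.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeEntry:
    """One indexed code unit (hypothetical record layout)."""
    package: str
    class_name: str
    method: str      # empty string when the entry is a whole class
    vector: tuple    # placeholder embedding, not real model output


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(index, query_vector, top_k=3):
    """Return the top_k (score, entry) pairs, highest similarity first."""
    scored = [(cosine_similarity(e.vector, query_vector), e) for e in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy index with made-up entries and made-up vectors.
index = [
    CodeEntry("com.example.db", "ConnectionPoolFactory", "createPool", (0.9, 0.1, 0.0)),
    CodeEntry("com.example.model", "Subject", "", (0.1, 0.9, 0.2)),
    CodeEntry("com.example.util", "StringUtils", "trim", (0.0, 0.2, 0.9)),
]

# Stand-in for the vectorized query "method creating database pool connection".
query_vector = (0.8, 0.2, 0.1)

for score, entry in search(index, query_vector, top_k=2):
    print(f"{score:.3f} {entry.package}.{entry.class_name} {entry.method}")
```

A real vector database would replace the linear scan in `search` with an index lookup, but the stored fields (package, class name, method) and the similarity measure would stay roughly the same. This only works if the query vector and the code vectors live in the same embedding space, which is what my second question below is about.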

My questions are:

  1. How can I generate vectors for Java code using your pretrained model?
  2. Is it a good idea to vectorize an English query for this kind of similarity search?
