feat: add embeddings service for codebase analysis and similarity search #26

Open
wants to merge 14 commits into
base: main

ChiragAgg5k
Member

What does this PR do?

Adds an embeddings service for codebase analysis and similarity search.

Test Plan

Related PRs and Issues

Have you read the Contributing Guidelines on issues?

Yes.

github-actions bot commented Jun 6, 2025

🚀 Previews available on pkg.vc!

📦 @appwrite.io/synapse

Install @appwrite.io/synapse with:

npm install https://pkg.vc/-/@appwrite/@appwrite.io/synapse@2bca058

Alternatively, you may specify a branch name or pull request number:

npm install https://pkg.vc/-/@appwrite/@appwrite.io/synapse~feat-embeddings
npm install https://pkg.vc/-/@appwrite/@appwrite.io/synapse!26

Comment on lines 110 to 118
"node_modules",
".git",
"dist",
"build",
"coverage",
".next",
"tmp",
".cache",
"huggingface_hub",
Member

Here we should merge the list of files we define with the entries from the .gitignore.

Let's also exclude all the open-runtime-specific scripts (build.sh and so on) in our own list, as we don't need them indexed.
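
The merge suggested above could look roughly like the sketch below. It is a minimal illustration, not code from this PR: `DEFAULT_IGNORES`, `parseGitignore`, and `mergeIgnoreList` are hypothetical names, and the parsing is deliberately simplified (comments, blank lines, and negation patterns are dropped; glob patterns are kept verbatim rather than expanded).

```typescript
// Hypothetical defaults, mirroring the entries shown in the diff above.
const DEFAULT_IGNORES: string[] = [
  "node_modules",
  ".git",
  "dist",
  "build",
  "coverage",
  ".next",
  "tmp",
  ".cache",
  "huggingface_hub",
];

// Simplified .gitignore parsing: skips blank lines, comments ("#"),
// and negations ("!"), then strips leading and trailing slashes.
// Assumption: full gitignore semantics are out of scope for this sketch.
function parseGitignore(content: string): string[] {
  return content
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter(
      (line) => line !== "" && !line.startsWith("#") && !line.startsWith("!"),
    )
    .map((line) => line.replace(/^\/+|\/+$/g, ""));
}

// Union of our own list and the .gitignore entries, with duplicates removed
// and the order of first appearance preserved.
function mergeIgnoreList(defaults: string[], gitignoreContent: string): string[] {
  return Array.from(new Set([...defaults, ...parseGitignore(gitignoreContent)]));
}
```

Keeping the function pure (it takes the .gitignore content as a string instead of reading the file) makes it easy to test; the caller would be responsible for reading the file from the work directory.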

Member Author

All of them live in the helpers folder, so I added that to the list as well. I'm just worried that this might be bad for actual codebases.

Also, embeddings will be generated in the artifact folder, where these files should not exist anyway.

constructor(
synapse: Synapse,
workDir: string,
modelName: string = "jinaai/jina-embeddings-v2-base-code",
Member

Accuracy is really important here, and so is the quality of the embeddings. We should create an adapter system that allows us to use OpenAI embeddings or our own custom embeddings.
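
One possible shape for such an adapter system is sketched below. Everything here is an assumption for illustration: `EmbeddingAdapter`, `FakeAdapter`, `AdapterBackedEmbeddings`, and `cosineSimilarity` are hypothetical names that do not come from this PR, and `FakeAdapter` produces toy vectors rather than calling a real model. A real adapter (OpenAI, a local model, or a custom one) would implement the same interface.

```typescript
// Hypothetical adapter contract: any embedding backend implements this.
interface EmbeddingAdapter {
  readonly name: string;
  // Returns one vector per input text, in the same order.
  embed(texts: string[]): Promise<number[][]>;
}

// Stand-in adapter with deterministic toy vectors (character-code buckets).
// It exists only to show the wiring; it has no semantic quality.
class FakeAdapter implements EmbeddingAdapter {
  readonly name = "fake";
  private readonly dimensions: number;

  constructor(dimensions: number = 8) {
    this.dimensions = dimensions;
  }

  async embed(texts: string[]): Promise<number[][]> {
    return texts.map((text) => {
      const vector = new Array<number>(this.dimensions).fill(0);
      for (let i = 0; i < text.length; i++) {
        vector[i % this.dimensions] += text.charCodeAt(i);
      }
      return vector;
    });
  }
}

// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// The service depends only on the interface, so backends can be swapped
// without touching the similarity-search logic.
class AdapterBackedEmbeddings {
  private readonly adapter: EmbeddingAdapter;

  constructor(adapter: EmbeddingAdapter) {
    this.adapter = adapter;
  }

  async similarity(a: string, b: string): Promise<number> {
    const [va, vb] = await this.adapter.embed([a, b]);
    return cosineSimilarity(va, vb);
  }
}
```

With this shape, comparing the current model against another backend would mean writing one more class that implements `EmbeddingAdapter` and passing it to the same service.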

Member

I'd also like to see some results with the OpenAI embeddings.

Member Author

Yeah, I agree. I will work on adding an adapter.

As for OpenAI embeddings, that's what I had done before, but it would require exposing the OpenAI API key in the container, which we should definitely not do, as it might get exposed to the user.
