Cross-sectional analysis of multiple repos

Mati · October 22, 2023, 9:36am

Hi

If I want to analyze across multiple repos, can I create a workspace containing multiple folders so that Codebase can analyze files from all of them?

If anyone has any good ideas, please let me know.

*However, since Codebase’s reference area is limited, I think I would need to use it in combination with LangChain or something similar.
*By the way, I am a non-programmer.

Jakob · October 22, 2023, 11:52am

If you use the “Add Folder to Workspace…” option to add a second folder to your workspace, the Codebase context features still only use the codebase of the first folder in your workspace. But with the “@” symbol, you can still reference files from both folders. This seems to be a bug because the advanced codebase context settings provide the option to use “all” or a specific folder. We’ll investigate.

However, if you put both folders into one folder and open that one parent folder in Cursor, the codebase context features can use both folders simultaneously. So I would recommend that you do that.
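If you would rather script the move than drag the folders by hand, here is a minimal Python sketch. It is only an illustration: the function name `group_under_parent` and the folder names used below are made up for this example, and it does nothing more than move the folders into one parent folder that you then open in Cursor.

```python
import shutil
from pathlib import Path


def group_under_parent(parent, folders):
    """Move each folder into `parent` and return the new locations."""
    parent = Path(parent)
    parent.mkdir(parents=True, exist_ok=True)
    moved = []
    for folder in map(Path, folders):
        target = parent / folder.name
        # Refuse to overwrite an existing folder with the same name.
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(folder), str(target))
        moved.append(target)
    return moved
```

For example, `group_under_parent("all-repos", ["repo-a", "repo-b"])` would leave you with `all-repos/repo-a` and `all-repos/repo-b`, and you would then open `all-repos` in Cursor.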

Mati · October 22, 2023, 1:43pm

・Thanks for your suggestion. It worked just as you described.
・On the other hand, what I’m wondering is whether GPT really has that much memory. It can take in large amounts of information, but can it actually process all of it?
・Does Cursor add its own storage layer (something like LangChain) on top of what, for example, the web version of GPT has?

Jakob · October 22, 2023, 1:57pm

The AI itself can only remember a limited amount of text at once. However, what codebase indexing does is make your codebase highly searchable in a smart manner. When you ask a question, it retrieves the most relevant snippets from your codebase related to your question and presents them within the limited context of the AI. This way, the AI only sees what it needs to answer your question, and its memory space doesn’t need to be excessively large.
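To make the idea concrete, here is a toy Python sketch of "index, then retrieve only what fits". It is not Cursor's actual implementation: the word-count similarity stands in for whatever search a real tool uses, and the function names, file paths, and the `budget_chars` limit are all invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase identifier-like tokens; a stand-in for a real search model."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(snippets):
    """Index step: precompute one vector per snippet (path -> text)."""
    return {path: Counter(tokenize(text)) for path, text in snippets.items()}


def retrieve(question, snippets, index, budget_chars=200):
    """Query step: rank snippets by similarity, keep those that fit the budget."""
    query = Counter(tokenize(question))
    scores = {path: cosine(query, vector) for path, vector in index.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    chosen, used = [], 0
    for path in ranked:
        size = len(snippets[path])
        # Skip unrelated snippets and anything that would overflow the context.
        if scores[path] > 0 and used + size <= budget_chars:
            chosen.append(path)
            used += size
    return chosen


# Hypothetical snippets from two repos that sit under one parent folder.
snippets = {
    "repo_a/auth.py": "def login(user, password): return check_password(user, password)",
    "repo_b/billing.py": "def charge(card, amount): return gateway.charge(card, amount)",
    "repo_b/report.py": "def monthly_report(rows): return sum(row.amount for row in rows)",
}
index = build_index(snippets)
print(retrieve("how does login check the password", snippets, index))
```

Only the snippet that shares words with the question is selected, so the unrelated files never take up space in the AI's limited context.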

Mati · October 22, 2023, 2:11pm

・Okay, so if I still want to use LangChain, can I combine Cursor and LangChain? Probably not…
・On a related note, the web version of GPT can’t read a large number of files the way Codebase can. Does Codebase improve GPT’s accuracy compared to regular GPT, or is it the same?

Jakob · October 22, 2023, 2:26pm

I’m not sure if I understand your question correctly. LangChain is a framework for developing applications powered by language models. It has helpful methods that you can use when developing your own AI application to make the AI context-aware, for example. Cursor is its own application that has its own logic to make the AI context-aware.
The ChatGPT website can’t read a large number of files because it’s limited by its own context size. It doesn’t have indexing logic built into it like Cursor, which makes it able to retrieve the most relevant snippets, as I described in my previous answer. Yes, it will greatly improve the accuracy of the AI answers when you ask something about your codebase. With ChatGPT, the relevant parts of your codebase would likely be excluded from the limited context, and it wouldn’t even know what code you wrote.

Mati · October 22, 2023, 2:40pm

・Thanks for the great answer. GitHub Copilot doesn’t have an indexing feature either, right?

Jakob · October 22, 2023, 2:42pm

Correct. As far as I know, Copilot only sees the code that’s actually visible in your Visual Studio Code window, and that can’t be a lot of code, so they don’t need indexing.

boqihua123 · October 24, 2023, 7:18am

So this means Cursor’s GPT-4 can’t understand content spread across multiple folders, and can’t process it either?

Jakob · October 24, 2023, 11:26am

@boqihua123 Please read my reply here. Cursor can handle multiple folders.
