Are there any plans to support fine-tuning a model on a project?
I know there is the vector index, but for large projects, or projects that use a lot of packages, it is too much context for one prompt. The model would need to reason about the project each time, yet it keeps forgetting older context. I also don’t think it is a good starting point if, for each request, the model inherently knows nothing about your project and the context window has to be stuffed with that information every time.
What I am thinking about is this: let the model create a description of each file in each folder, then write summary descriptions of those. Maybe also let it reason for some time about different parts of the code and write questions about them, for example pointing out what is unique or where the project does or does not follow conventions.
Then we fine-tune with the content of the files as input and the descriptions it generated as output.
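To make the idea a bit more concrete, here is a minimal sketch of how the training pairs could be assembled. Everything in it is my assumption, not an existing feature: the `describe` callback stands in for whatever call asks the model to describe a file, and the `prompt`/`completion` record layout is just one common shape for fine-tuning data, so a real provider may expect something different.

```python
import json
from typing import Callable


def build_training_records(
    files: dict[str, str],
    describe: Callable[[str, str], str],
) -> list[dict[str, str]]:
    """Pair each file's content (input) with its generated description (output).

    `files` maps a relative path to the file's text.
    `describe` is a hypothetical callback that returns the model-written
    description for a given (path, source) pair.
    """
    records = []
    # Sort by path so the dataset is reproducible between runs.
    for path, source in sorted(files.items()):
        records.append(
            {
                "prompt": f"File: {path}\n\n{source}",
                "completion": describe(path, source),
            }
        )
    return records


def to_jsonl(records: list[dict[str, str]]) -> str:
    """Serialize the records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)
```

Folder-level summaries and the generated questions could be added as extra records in the same way, with the folder path or the question as the prompt.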
If we do that, I think the context window is a lot freer and contains only information about the task itself.
One problem is that if we change files we would need to fine-tune again, but I think that is what the vector index could be used for: it would only need to cover the changes since the last fine-tuning.
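As a rough sketch of how "only the changes since the last fine-tuning" could be determined: store a hash of every file at fine-tuning time and later compare the current files against that snapshot. The function names and the snapshot format are made up for illustration, and deleted files are not handled here.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Return a stable SHA-256 hash of a file's text."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def take_snapshot(files: dict[str, str]) -> dict[str, str]:
    """Record path -> hash for every file at fine-tuning time."""
    return {path: fingerprint(content) for path, content in files.items()}


def changed_since_snapshot(
    current_files: dict[str, str],
    snapshot: dict[str, str],
) -> list[str]:
    """List paths that are new or modified compared to the snapshot.

    Only these files would need to go into the vector index; everything
    else is assumed to be covered by the fine-tuned weights.
    """
    return sorted(
        path
        for path, content in current_files.items()
        if snapshot.get(path) != fingerprint(content)
    )
```

After the next fine-tuning run, a fresh snapshot would replace the old one and the index of changes would start out empty again.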
Fine-tuning is expensive, but so is stuffing your files, AI rules, docs and so on into each request. With a fine-tuned model one would need fewer tokens.
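The trade-off can be written as a simple break-even calculation: a one-time fine-tuning cost versus the tokens saved on every request. This is only a back-of-the-envelope sketch; the parameter names are mine, it assumes a flat price per input token, and it ignores any price difference for serving a fine-tuned model.

```python
import math


def break_even_requests(
    fine_tune_cost_usd: float,
    tokens_saved_per_request: int,
    usd_per_million_tokens: float,
) -> int:
    """Number of requests after which fine-tuning has paid for itself.

    Assumes each request saves `tokens_saved_per_request` input tokens
    that would otherwise be spent on project context.
    """
    saved_usd_per_million_requests = tokens_saved_per_request * usd_per_million_tokens
    if saved_usd_per_million_requests <= 0:
        raise ValueError("fine-tuning never pays off if no tokens are saved")
    # cost / (saving per request), with the per-million price factored out
    return math.ceil(fine_tune_cost_usd * 1_000_000 / saved_usd_per_million_requests)
```

Calling it with your own numbers for the fine-tuning price, the typical amount of stuffed context, and the token price gives a feel for how many requests a project would need before the fine-tuning is worth it.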
This could make a huge difference for implementing complex features: the model would inherently know your project, just like it inherently knows JavaScript (for example). We don’t have to provide the whole JavaScript documentation because it is already baked into the model weights.
What do you think about that? I think sooner or later this functionality will come anyway, or there will be learning models.