DeepSeek fine-tuning on agent trajectories + context caching

They got SOTA on SWE-bench.

I wonder if you could use similar techniques (synthetic data generation via self-play and static analysis, plus agent/SWE trajectories) to fine-tune a DeepSeek model, which is cheap and already very good, and then get a custom provider to do context caching. I feel like you'd be able to beat Sonnet at a fraction of the cost. DeepSeek does offer caching on their API, but I wouldn't trust it.
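To put rough numbers on why caching matters for agent runs, here's a back-of-the-envelope sketch. It assumes every turn resends the full history and that the provider bills the previous turn's prompt at a cheaper "cached" rate. The function name, the token counts, and all prices are made-up placeholders, not anyone's real pricing.

```python
def trajectory_cost(turns, prefix_tokens, new_tokens, output_tokens,
                    price_in, price_cached, price_out, caching=True):
    """Estimate the cost of one agent trajectory (prices are per token).

    Assumption: each turn's prompt is the full history so far, and with
    caching on, the previous turn's prompt is billed at price_cached.
    """
    total = 0.0
    prompt = prefix_tokens + new_tokens   # first prompt: system/repo prefix + first observation
    cached = 0                            # nothing is cached before the first call
    for _ in range(turns):
        hit = cached if caching else 0
        total += hit * price_cached + (prompt - hit) * price_in
        total += output_tokens * price_out
        cached = prompt                           # this prompt is now a reusable prefix
        prompt += output_tokens + new_tokens      # history grows by the reply and the next observation
    return total


# Hypothetical numbers only: 30 turns, 20k-token prefix, placeholder per-token prices.
with_cache = trajectory_cost(30, 20_000, 1_000, 500, 1e-6, 1e-7, 2e-6)
without_cache = trajectory_cost(30, 20_000, 1_000, 500, 1e-6, 1e-7, 2e-6, caching=False)
print(f"cached / uncached cost ratio: {with_cache / without_cache:.2f}")
```

Since the history grows every turn, most input tokens on a long trajectory are repeats, so the cached rate ends up dominating the bill. That's why whether the provider's caching actually works reliably matters so much here.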
