The /summarize function is great for keeping the context size manageable and reducing the overall cost of requests to high-end models.
I was wondering:
- Does summarizing affect the quality of the responses?
- What happens if the summarization is done using a cheaper model (since summarizing itself has a cost)?
In other words, could we regularly summarize using a free or low-cost model to reduce context size and lower usage costs — or would that create a hidden cost by increasing the number of interactions needed to clarify or correct weaker answers?
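To make the tradeoff concrete, here is a rough back-of-envelope cost model. Every number in it is a made-up assumption (the per-token prices, the summarization interval, and the summary size are illustrative, not real pricing), and it ignores quality effects entirely; it only shows the mechanical cost side of the question:

```python
# Back-of-envelope cost model. All prices and sizes below are
# hypothetical assumptions for illustration, not real API pricing.
EXPENSIVE_IN = 3.00 / 1_000_000   # $/input token, main model (assumed)
CHEAP_IN = 0.10 / 1_000_000       # $/input token, summarizer (assumed)
CHEAP_OUT = 0.30 / 1_000_000      # $/output token, summarizer (assumed)

def cost_without_summary(turns: int, tokens_per_turn: int) -> float:
    """Context grows every turn; each request re-sends the full history."""
    total = 0.0
    context = 0
    for _ in range(turns):
        context += tokens_per_turn
        total += context * EXPENSIVE_IN
    return total

def cost_with_summary(turns: int, tokens_per_turn: int,
                      every: int = 10, summary_tokens: int = 500) -> float:
    """Every `every` turns, a cheap model compresses the whole history
    down to `summary_tokens` tokens (summarization itself costs money)."""
    total = 0.0
    context = 0
    for t in range(1, turns + 1):
        context += tokens_per_turn
        total += context * EXPENSIVE_IN
        if t % every == 0:
            # The summarizer reads the full context and writes a short summary.
            total += context * CHEAP_IN + summary_tokens * CHEAP_OUT
            context = summary_tokens
    return total

print(f"no summarization:   ${cost_without_summary(50, 800):.2f}")
print(f"with summarization: ${cost_with_summary(50, 800):.2f}")
```

Under these assumed numbers the summarized run is much cheaper, because the expensive model stops re-reading a linearly growing history. The hidden cost you mention would show up as extra `turns` needed to correct weaker answers; you can simulate that by bumping `turns` in `cost_with_summary` and checking where the two curves cross.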