We should be able to set different LLMs for Ask and Agent

A small model is enough for everyday questions, while a large model is better suited to complex problems. Choosing the model separately for Ask and Agent would let us save our fast requests for the cases that actually need them.