From our initial investigations, this doesn’t look like much of a breakthrough model. It performs very well on some benchmarks but poorly on others, and we don’t believe the context window is genuinely functional: if you gave it 10M tokens of context, it wouldn’t have a real understanding of all of them!