In line editor suggests code which exists on the next line

chris-NR7 · March 28, 2025, 5:12pm

This is a screenshot of an in-line suggestion. The suggested code already exists on the next two lines.

This problem started when I used the new Dart formatter, which introduces line breaks where they did not previously exist. I experimented with the OpenAI tokenizer, which produced different results when the same code was formatted differently. Further details, and a subsequent discussion with Claude, are in this GitHub issue.

github.com/dart-lang/dart_style

There is a significant risk that the new formatter will degrade the performance of AI copilots

opened 09:16AM - 15 Mar 25 UTC

closed 08:06PM - 26 Mar 25 UTC

chris-NR

Since the introduction of the new formatter I have noticed an issue with the inline predictions made by my AI coding assistant. I am using Cursor IDE (v0.46.11). Where a new line break has been inserted by the formatter (see issue #1668) the coding assistant predicts text that already exists on subsequent line(s). Based on the assumption that whitespace would be irrelevant to LLMs my initial thought was that this may be a bug in Cursor.

Here is an example in context (the grey text is the prediction).

<img width="612" alt="Image" src="https://github.com/user-attachments/assets/4c709387-f928-4911-8d0f-897567d9cfb3" />

I decided to verify my assumption about how an LLM might ignore whitespace when tokenizing differently formatted code. Using the [OpenAI tokenizer](https://platform.openai.com/tokenizer) I discovered that the code *is differently tokenized*.

<details><summary>Tokenization Differences</summary>

### Old Formatter

<img width="702" alt="Image" src="https://github.com/user-attachments/assets/ac1f4a84-47ff-475b-b6ec-618c4b0074f8" />

### New Formatter

<img width="707" alt="Image" src="https://github.com/user-attachments/assets/37486d6f-7b74-4946-81fc-626094081e32" />

</details>

This led me to do further research on the changes introduced with the new formatter #1253. A conversation with Claude Sonnet 3.7 resulted in the following conclusion (usual caveats regarding AI generated results apply). Link to full conversation with Claude at end of extract.

### Claude's Conclusion

"These changes directly explain your observation about prediction differences.

With the new formatter:

* Lines are being broken differently
* Trailing commas are being added/removed automatically
* The visual structure of code has shifted toward tall style

When you observed the model predicting "text which was already present in my code but on the next line," this is likely because the new formatter is introducing line breaks where the old formatter wouldn't have, changing how the model perceives the context window. Since coding assistants are sensitive to formatting patterns they've seen during training, ***these significant formatter changes would naturally lead to prediction misalignments until the models are updated with code formatted according to the new Dart 3.7 style guidelines.***"

[Link to research](https://claude.ai/share/5c8381cb-f96a-431f-bc07-8a29f44ac608)
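The tokenization point in the issue can be illustrated with a small, self-contained sketch. It does not call the OpenAI tokenizer: `PIECE` is a simplified regex stand-in for a GPT-style pre-tokenizer, and `pieces`, `normalize`, and the two Dart snippets are hypothetical names and samples written for illustration (they are not actual formatter output). The sketch only shows that two layouts of the same code, identical once whitespace and trailing commas are ignored, split into different piece sequences.

```python
import re

# Simplified stand-in for a GPT-style pre-tokenizer (an assumption made for
# illustration; real BPE tokenizers split and merge pieces differently).
PIECE = re.compile(r" ?[A-Za-z_]\w*| ?\d+| ?[^\s\w]+|\s+")


def pieces(code: str) -> list[str]:
    """Split code into word, number, punctuation, and whitespace pieces."""
    return PIECE.findall(code)


def normalize(code: str) -> str:
    """Drop all whitespace, then drop a trailing comma before ')'."""
    no_ws = re.sub(r"\s+", "", code)
    return re.sub(r",(?=\))", "", no_ws)


# Hypothetical Dart samples: one-line layout versus a taller layout.
old_style = "final widget = Padding(padding: insets, child: Text(label));\n"
new_style = (
    "final widget = Padding(\n"
    "  padding: insets,\n"
    "  child: Text(label),\n"
    ");\n"
)

if __name__ == "__main__":
    # Same code once layout is ignored...
    print("same code modulo layout:", normalize(old_style) == normalize(new_style))
    # ...but different piece sequences, because whitespace is part of the input.
    print("old pieces:", pieces(old_style))
    print("new pieces:", pieces(new_style))
    print("identical sequences:", pieces(old_style) == pieces(new_style))
```

Under these assumptions the line breaks and indentation appear as their own pieces in the taller layout, so a model sees a different sequence even though the program is unchanged. Whether this explains the duplicated suggestions in Cursor is not established by the sketch.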

system · April 27, 2025, 5:13pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
