How to develop accurately and efficiently in a huge codebase

I have recently been researching how to use AI assistants effectively in large legacy projects: how to get precise development results while avoiding the low-value output that comes from the AI misinterpreting the code. The legacy project I'm working on is about 4 GB in size and contains hundreds of thousands of lines of code.

So far I have been writing requirements documents, technical documents, validation rules, and other related documentation for this project. Over each development cycle, I keep adding new requirements and updating the related documentation.

My current approach is to mark version numbers within a single Requirements.md file, for example:

## Core Requirements (v1.0.0)
## Interaction Flow
...

## New Requirements (v2.0.0)
### Enhanced Features
- **FR-2.1**: [new feature requirement]
- **FR-2.2**: [another new feature requirement]

### Technical Enhancements
- **TR-2.1**: [technical requirements]
- **TR-2.2**: [performance requirements]
...

## Change History
- **v2.0.0** (2025-06-26): Added new message formatting requirements
- **v1.0.0** (2025-06-01): Initial requirements specification

All of these versions live in the same Requirements.md file. However, as the project grows and versions accumulate, the file keeps getting larger, which increases token consumption and can hurt the AI's accuracy when it processes or modifies the document.
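To put a number on that concern, it helps to measure how many tokens the file actually costs per prompt. Below is a minimal sketch using OpenAI's tiktoken library; the `cl100k_base` encoding and the `Requirements.md` path are assumptions, so substitute whatever matches your model and repository layout.

```python
# Minimal sketch: estimate the context-window cost of Requirements.md.
# Assumes `pip install tiktoken`; cl100k_base is only an approximation
# of your model's real tokenizer.
from pathlib import Path

import tiktoken

REQUIREMENTS_PATH = Path("Requirements.md")  # hypothetical path; adjust to your repo


def count_tokens(path: Path, encoding_name: str = "cl100k_base") -> int:
    """Return the approximate token count of the file at `path`."""
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(path.read_text(encoding="utf-8")))


if __name__ == "__main__":
    print(f"{REQUIREMENTS_PATH}: ~{count_tokens(REQUIREMENTS_PATH)} tokens")
```

Running this after each release makes it obvious when the file is approaching a size where the assistant can no longer hold all of it in context alongside the code it is editing.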

Based on this situation, here are the two key questions I'd like to discuss:

  1. For a large legacy project like this, how should I manage the growth of Requirements.md? Should its contents be written down to a very detailed level?
  2. As versions and requirements keep accumulating, is it good practice to store everything in a single Requirements.md file, or is there a better approach for managing growing requirements across versions? (One possible split is sketched below.)
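
To make question 2 concrete, here is a minimal sketch of what a per-version split could look like: a script that cuts the monolithic Requirements.md into one file per version, keyed on the `(vX.Y.Z)` tags in the level-2 headings shown above. The heading pattern and output directory are assumptions based on my example, not a recommendation.

```python
# Hypothetical sketch: split a monolithic Requirements.md into one file
# per version, based on level-2 headings that end in "(vX.Y.Z)".
import re
from pathlib import Path

SOURCE = Path("Requirements.md")  # assumed input path
OUT_DIR = Path("requirements")    # assumed output directory


def split_by_version(source: Path, out_dir: Path) -> None:
    out_dir.mkdir(exist_ok=True)
    text = source.read_text(encoding="utf-8")
    # Split immediately before each version-tagged "## ..." heading;
    # untagged sections such as "## Change History" stay attached to
    # whichever version precedes them.
    sections = re.split(r"(?m)^(?=## .+\(v\d+\.\d+\.\d+\)\s*$)", text)
    for section in sections:
        lines = section.splitlines()
        if not lines:
            continue
        match = re.search(r"\(v(\d+\.\d+\.\d+)\)", lines[0])
        if match:
            (out_dir / f"v{match.group(1)}.md").write_text(section, encoding="utf-8")


if __name__ == "__main__":
    split_by_version(SOURCE, OUT_DIR)
```

With a layout like this, only the relevant version file would need to be loaded into the AI's context, at the cost of keeping cross-version references consistent by hand.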