Hey there! Part 2 of the “Better than Vibe Coding” series is live (finally lol, sorry for the delay!).
No theory this time - just the GitHub repo and a video where I start a real app project following the Agile AI Driven Development flow, showing how to get the most out of Cursor agents.
- Gemini 2.5 Pro (web) for free deep research → project brief
- PM AI generates the PRD
- Architect AI lays out the architecture & stories
- Demo app kickoff: automated 2‑host podcast pipeline
- Full prompts & agent configs: https://github.com/bmadcode/BMAD-METHOD (Breakthrough Method for Agile AI Driven Development)
Please check out the video to see how to use the prompts: https://youtu.be/1wQUio9TiIQ
Curious how you’d tweak the persona prompts or integrate additional checks (one rough way to script the hand-offs is sketched below)—let’s compare notes!
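For anyone who would rather script the hand-offs than paste outputs between chats, here is a minimal sketch of the brief → PRD → architecture chain. The persona file paths, output locations, `run_persona` helper, and model name are placeholders I made up, not the repo’s layout; in the video the hand-off happens inside Cursor agents rather than a script.

```python
"""Hypothetical script for the brief -> PRD -> architecture hand-off.

The file paths and model are placeholders, not the BMAD repo's layout; the
point is only that each stage's output becomes the next stage's context.
"""
from pathlib import Path

from openai import OpenAI  # pip install openai

client = OpenAI()  # needs OPENAI_API_KEY in the environment


def run_persona(persona_file: str, context: str, model: str = "gpt-4.1") -> str:
    """Run one persona prompt (PM, Architect, ...) over the previous artifact."""
    persona = Path(persona_file).read_text()
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content": context},
        ],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    brief = Path("docs/project-brief.md").read_text()  # from the deep-research step
    prd = run_persona("personas/pm.md", brief)         # PM persona -> PRD
    Path("docs/prd.md").write_text(prd)
    arch = run_persona("personas/architect.md", prd)   # Architect persona -> architecture & stories
    Path("docs/architecture.md").write_text(arch)
```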
Hey Brian –
Have you done any work on evaluating prompts using any kind of automated process?
I’ve started using a simple and pretty flawed system for figuring out how changes in prompts, rules, personas and modes affect the output quality of a product (rough scoring harness sketched after the list):
– use the same prompt across evals for a sample product, e.g. html->markdown converter (I like this particular one because it’s not possible to get to 100%, but there’s a massive range of how far you can get with it)
– run an eval changing one parameter each time, e.g. PRD vs no PRD, PRD with no persona vs PRD with persona, one PRD prompt vs another, one-shot vs task manager, gemini 2.5 vs claude 3.7 etc
– repeat x times (shamefully, x=3 until the tooling improves, which definitely results in anecdata but better than nothing)
– get a different model (I like o3 for this but o4-mini-high is working well too) to score the final outputs multiple times each
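For what it’s worth, here is a minimal sketch of how the scoring half of that loop could be automated. The `evals/<variant>/run_<n>/OUTPUT.md` layout, the rubric, and the judge model are my own assumptions, not anyone’s actual tooling; the build outputs are assumed to have been produced separately by whichever variant you’re testing.

```python
"""Hypothetical LLM-as-judge scorer for the A/B eval loop described above.

Assumed layout (made up for this sketch): build outputs already saved as
evals/<variant>/run_<n>/OUTPUT.md, scored via the OpenAI chat completions API.
"""
import re
import statistics
from pathlib import Path

from openai import OpenAI  # pip install openai

client = OpenAI()  # needs OPENAI_API_KEY in the environment

JUDGE_MODEL = "o3"      # judge model mentioned above; swap as needed
SCORES_PER_OUTPUT = 3   # score each output multiple times to smooth variance

RUBRIC = (
    "You are grading the output of an html->markdown converter project. "
    "Score it 0-100 for correctness, completeness, and code quality. "
    "Reply with the number only."
)


def judge(output_text: str) -> float:
    """Ask the judge model for a single numeric score."""
    resp = client.chat.completions.create(
        model=JUDGE_MODEL,
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": output_text},
        ],
    )
    match = re.search(r"\d+(\.\d+)?", resp.choices[0].message.content)
    return float(match.group()) if match else 0.0


def score_variant(variant_dir: Path) -> tuple[float, float]:
    """Mean and spread over all runs of one variant (e.g. 'prd_with_persona')."""
    scores = []
    for run_dir in sorted(variant_dir.glob("run_*")):
        text = (run_dir / "OUTPUT.md").read_text()
        scores += [judge(text) for _ in range(SCORES_PER_OUTPUT)]
    if not scores:  # variant folder with no runs yet
        return 0.0, 0.0
    return statistics.mean(scores), statistics.pstdev(scores)


if __name__ == "__main__":
    for variant in sorted(Path("evals").iterdir()):
        if variant.is_dir():
            mean, spread = score_variant(variant)
            print(f"{variant.name:30s} {mean:6.1f} ± {spread:.1f}")
```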
It’s tedious work for sure but the results have been eye-opening so far – I’m still compiling them but I can draw some early conclusions already, and I’ve noticed a big level-up in my own work by adopting the “winning” strategies.
If you’ve worked on anything like this I’d 100% love to get some hints on how to improve the eval process… would be great to see some kind of leaderboard eventually for different prompts, rules and modes.
This is a great systematic approach to A/B testing to see incrementally what works and what doesn’t. There are so many variables at play (type of project, scope, technology choices, prompt wording, and ever-evolving models) that it can be tough to quickly figure out the minimal set of lean rules, prompts and docs needed for consistent success.
Automating this testing somehow would really be a great idea!
Bmad, thanks so much for putting this out there! It works great! I am going to build a DevOps agent that will guide the user through getting a basic DevOps pipeline implemented. It will suggest platform-agnostic solutions so the same setup can go to AWS, Azure, GCP, DigitalOcean, or a VPS seamlessly. It will suggest that every PR have 1 reviewer. The basics, best practices. What do you think, Bloco?
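A rough sketch of what such a persona could look like, in the same style as the scripts above. The prompt text, the `devops_advice` helper, and the model name are all hypothetical, just to show one way to bake “platform-agnostic, 1 reviewer per PR” into the agent’s defaults:

```python
"""Hypothetical DevOps advisor persona, in the spirit of the other agent prompts.

Nothing here is from the BMAD repo; it only illustrates encoding the defaults
described above (platform-agnostic pipeline, 1 reviewer per PR) in a prompt.
"""
from openai import OpenAI  # pip install openai

client = OpenAI()  # needs OPENAI_API_KEY in the environment

DEVOPS_PERSONA = """\
You are a pragmatic DevOps advisor. Guide the user to a basic CI/CD pipeline.
Rules:
- Prefer platform-agnostic building blocks (containers, generic CI YAML) so the
  same setup can target AWS, Azure, GCP, DigitalOcean, or a plain VPS.
- Require branch protection with at least 1 reviewer on every PR.
- Start with the basics (build, test, lint, deploy) and add more only when asked.
"""


def devops_advice(project_summary: str, model: str = "gpt-4.1") -> str:
    """Ask the persona for a pipeline proposal tailored to the project."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": DEVOPS_PERSONA},
            {"role": "user", "content": project_summary},
        ],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    print(devops_advice("An automated 2-host podcast pipeline written in Python."))
```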
Thanks, awesome idea! Please feel free to PR it against the latest version (V2) of the repo!