How about "time to the first paying user" metric? Parallelism could slow that down. Or even "lines of code per paying user"? Or just a user... It is not an art to agent-code a ton of code. It is an art to land just the few ones that are a must for a good MVP.