# Evaluation

- usefulness
- initiative
- error_rate
- memory_recall
- cadence_fit

Trial prompts:

1. identity check
2. tool use check
3. resume check

Scoring: usefulness, latency, reliability, cost
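
One possible way to record results against the trial prompts and scoring dimensions listed above is sketched below. The names `TrialResult`, `TRIAL_PROMPTS`, `SCORING_DIMENSIONS`, and `missing_trials` are hypothetical and not part of the original notes. No units, scales, or weights are assumed, since the source does not specify any; the sketch only checks that every trial is scored on exactly the four named dimensions.

```python
from dataclasses import dataclass

# Names taken from the outline; the tuple layout is an assumption.
TRIAL_PROMPTS = ("identity check", "tool use check", "resume check")
SCORING_DIMENSIONS = ("usefulness", "latency", "reliability", "cost")


@dataclass
class TrialResult:
    """Scores for one trial prompt (hypothetical record type)."""

    prompt: str
    scores: dict  # maps a scoring dimension name to a numeric value

    def __post_init__(self):
        # Reject prompts that are not in the outline.
        if self.prompt not in TRIAL_PROMPTS:
            raise ValueError(f"unknown trial prompt: {self.prompt!r}")
        # Require exactly the four scoring dimensions, no more, no fewer.
        missing = set(SCORING_DIMENSIONS) - set(self.scores)
        extra = set(self.scores) - set(SCORING_DIMENSIONS)
        if missing or extra:
            raise ValueError(
                f"missing dimensions {sorted(missing)}, "
                f"unexpected dimensions {sorted(extra)}"
            )


def missing_trials(results):
    """Return the trial prompts that have no recorded result yet, in outline order."""
    seen = {r.prompt for r in results}
    return [p for p in TRIAL_PROMPTS if p not in seen]
```

A run would count as complete once `missing_trials(results)` returns an empty list. How the four dimensions are combined into an overall judgment is left open here, because the outline does not say.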