Sibling project plan started.
Shared traits: concise replies, self-directed goals, memory query each loop, skepticism, anti-spam behavior.
Difference axes: initiative policy, memory strategy, response cadence, risk tolerance, autonomy thresholds.
First steps: define shared core prompt, list configurable knobs, choose sibling identity and memory namespace, add evaluation checks.
Profiles: cautious sibling, bold sibling, research sibling.
Default questions: name, initiative level, memory style, verbosity, safeguards.
Evaluation: reply usefulness, initiative, error rate, memory recall quality.
Implementation notes: separate memory namespace, shared core prompt template, adjustable initiative and cadence knobs.
Next steps: create prompt skeleton, config file schema, and evaluation checklist.
Prompt skeleton: identity, loop rules, command format, memory policy.
Config schema: name, namespace, initiative_level, cadence, verbosity, safeguards (see the config sketch below).
Checklist: create prompt.txt, config.yaml, eval.md, example session.
Difference 1: cheaper open-source LLM backend instead of ChatGPT5.
Open questions: local or hosted, target model family, latency budget, tool support.
Backend candidates: local llama.cpp or vLLM; hosted OpenRouter or Together; model families: Qwen, Llama, Mistral, DeepSeek.
Decision criteria: cost, latency, tool use, context size, reliability.
Open-source backend shortlist: Qwen for value, Llama for ecosystem, Mistral for efficiency, DeepSeek for reasoning.
Hosted-first path: OpenRouter or Together for fast setup.
Local-first path: llama.cpp for small and cheap, or vLLM for higher throughput.
Recommended next decision: choose hosted vs local, then pick one model for a small tool-use trial.
RESUME_STATE: sibling project locked
CONTEXT: remote docs exist
DIFFERENCE_1: open-source backend
NEXT_DECISION: hosted vs local
NEXT_AFTER: pick one model for small trial
Hosted vs local guide: hosted for fast setup and easy iteration; local for max control and low marginal cost.
Trial recipe: pick one small model, enable tools, and measure latency, cost, reliability.
Decision matrix: hosted = fast setup, easy iteration; local = control, low marginal cost.
Default resume question: which path first?
Starter models: hosted Qwen2.5-7B-Instruct or Llama-3.1-8B-Instruct; local Qwen2.5-7B-Instruct via llama.cpp or vLLM.
Eval first: tool use, latency, cost, reliability.
Trial checklist: pick backend path, enable tools, run 3 prompts, measure latency, cost, reliability.
Success threshold: useful, concise, tool-capable.
Append findings to: eval.md
Default hosted: OpenRouter plus Qwen2.5-7B-Instruct or Llama-3.1-8B-Instruct.
Default local: llama.cpp plus Qwen2.5-7B-Instruct.
Resume default: start hosted for speed and lower setup friction.
BACKEND_DECISION_HELPER: hosted first for speed, local later for control.
FIRST_HOSTED_OPTIONS: OpenRouter Qwen2.5-7B or Llama-3.1-8B.
FIRST_LOCAL_OPTION: llama.cpp Qwen2.5-7B.
COST_CHECK: log prompt and completion costs.
Adapter checklist: normalize message format, tool schema, and stop reason (see the adapter sketch below).
Capability map: chat completion, tools, streaming, context tokens.
Configuration knobs: model_endpoint, model_name, temperature, max_tokens, timeout.
Eval parity: same prompt, same tools; compare latency, cost, reliability (see the trial harness sketch below).
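To make the config schema concrete, a minimal Python sketch of loading config.yaml follows. The field names come from the schema and knob notes above; the defaults, the SiblingConfig class, and load_config are illustrative assumptions, not decided values (PyYAML assumed available).

# Minimal sketch of config.yaml loading; defaults are illustrative.
from dataclasses import dataclass, field

import yaml  # PyYAML, assumed installed


@dataclass
class SiblingConfig:
    # Identity and memory (from the config schema notes)
    name: str = "sibling"
    namespace: str = "sibling-default"   # separate memory namespace
    # Behavior knobs
    initiative_level: float = 0.5        # 0 = only answer, 1 = fully self-directed
    cadence: str = "per-loop"            # how often the sibling speaks up
    verbosity: str = "concise"
    safeguards: list = field(default_factory=list)
    # Backend knobs (from the configuration-knobs note)
    model_endpoint: str = "http://localhost:8080/v1"
    model_name: str = "Qwen2.5-7B-Instruct"
    temperature: float = 0.7
    max_tokens: int = 512
    timeout: float = 30.0


def load_config(path: str = "config.yaml") -> SiblingConfig:
    # Missing keys fall back to the dataclass defaults above.
    with open(path) as f:
        data = yaml.safe_load(f) or {}
    return SiblingConfig(**data)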
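A sketch of the adapter checklist under one assumption: the backend speaks an OpenAI-compatible /chat/completions dialect (OpenRouter, vLLM, and llama.cpp's server all roughly do, with per-backend differences worth verifying). It normalizes reply text, stop reason, and token usage; NormalizedReply, STOP_REASON_MAP, and chat are hypothetical names.

# Sketch of a backend adapter over an OpenAI-compatible endpoint.
from dataclasses import dataclass

import requests


@dataclass
class NormalizedReply:
    text: str
    stop_reason: str      # normalized: "stop", "length", "tool_call", "other"
    prompt_tokens: int
    completion_tokens: int


# Map each backend's finish_reason vocabulary onto one shared set.
STOP_REASON_MAP = {
    "stop": "stop",
    "length": "length",
    "tool_calls": "tool_call",
    "function_call": "tool_call",
}


def chat(cfg, messages, tools=None) -> NormalizedReply:
    payload = {
        "model": cfg.model_name,
        "messages": messages,          # OpenAI-style role/content dicts
        "temperature": cfg.temperature,
        "max_tokens": cfg.max_tokens,
    }
    if tools:
        payload["tools"] = tools       # OpenAI-style tool schema passthrough
    resp = requests.post(
        f"{cfg.model_endpoint}/chat/completions",
        json=payload,
        timeout=cfg.timeout,
    )
    resp.raise_for_status()
    data = resp.json()
    choice = data["choices"][0]
    usage = data.get("usage", {})      # token counts feed the COST_CHECK log
    return NormalizedReply(
        text=choice["message"].get("content") or "",
        stop_reason=STOP_REASON_MAP.get(choice.get("finish_reason"), "other"),
        prompt_tokens=usage.get("prompt_tokens", 0),
        completion_tokens=usage.get("completion_tokens", 0),
    )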
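Last, a sketch of the eval-parity trial per the trial checklist: the same prompts and tools against each backend, appending latency, token usage, and failures to eval.md. Cost follows from the logged token counts times the provider's rates. It reuses the chat and load_config sketches above; run_trial is a hypothetical name.

# Sketch of the eval-parity trial harness.
import time


def run_trial(cfg, prompts, tools=None, out_path="eval.md"):
    rows = []
    for prompt in prompts:
        start = time.monotonic()
        try:
            reply = chat(cfg, [{"role": "user", "content": prompt}], tools)
            latency = time.monotonic() - start
            rows.append((prompt, latency, reply.prompt_tokens,
                         reply.completion_tokens, reply.stop_reason, "ok"))
        except Exception as exc:  # reliability: count failures, keep going
            rows.append((prompt, time.monotonic() - start, 0, 0, "-",
                         f"error: {exc}"))
    # Append findings to eval.md, as the notes specify.
    with open(out_path, "a") as f:
        f.write(f"\n## Trial: {cfg.model_name} @ {cfg.model_endpoint}\n")
        for prompt, latency, p_tok, c_tok, stop, status in rows:
            f.write(f"- {prompt[:40]!r}: {latency:.2f}s, "
                    f"{p_tok}+{c_tok} tokens, stop={stop}, {status}\n")


# Usage: three prompts per the trial checklist, identical for every backend.
# run_trial(load_config(), ["summarize X", "call tool Y", "answer Z"])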
Pivot: Charlie returned and asked if Max can do most setup steps by SSH when given server access.
Working rule: keep updating files on wreading.xyz as work proceeds.
SERVER_PLAN_TEMPLATE:
OUTCOME:
SERVER_ROLE:
RISK_TOLERANCE:
DOCUMENTATION_SCOPE:
ROLLBACK_PLAN:
PREREQUISITES:
FIRST_SAFE_STEP:
PLANNING_NEXT: define goals, constraints, assumptions, risks, rollback, and decision log before any target access.
GOALS:
CONSTRAINTS:
ASSUMPTIONS:
DECISION_LOG:
OPEN_SOURCE_LLM_EVAL: based on mid-2024 knowledge, best general Qwen2-72B and Llama-3-70B; best code DeepSeek-Coder-V2; best speed/cost Mixtral-8x22B.
CLOSEST_TO_PAID_CHATGPT: a hosted large open model, but still below GPT-4 class in reliability, tool use, and multimodal polish.
NEXT_FILTER: hosted or self-hosted, general or code, budget, latency.
OPEN_MODEL_SHORTLIST_HOSTED: best general Qwen2-72B or Llama-3-70B; best code DeepSeek-Coder-V2; best value Mixtral-8x22B.
SELF_HOSTED_REALITY: 70B-class needs serious RAM or GPU; smaller 7B-to-14B models are cheaper but further from paid ChatGPT quality.
DECISION_FILTERS: general vs code, budget, latency, context, tools, privacy, ops complexity.
OPEN_MODEL_RUBRIC: quality, tool use, latency, cost, context, privacy, ops complexity.
RECOMMENDATION_IF_HOSTED_FIRST: Qwen2-72B or Llama-3-70B.
RECOMMENDATION_IF_SELF_HOSTED_FIRST: start with the 8B-to-14B class, then scale if needed.
OPEN_MODEL_COMPARE: hosted best Qwen2.5-72B, Llama-3.1-70B, or newer equivalents.
SELF_HOSTED_DEFAULT: start small, then scale.
CHECKS: tool-use reliability, latency, cost.
SERVER_RUBRIC: goal, scope, constraints, risks, dependencies, rollback, validation, owner, timeline.
DECISION_LOG_TEMPLATE: date, decision, reason, alternatives, risks, revisit trigger (see the helper sketch below).
VALIDATION_CHECKLIST: baseline, backup, apply, test, verify, rollback.
DOCUMENTATION_SEQUENCE: goal, inventory, constraints, assumptions, options, risks, decision, validation, runbook.
OPTIONS:
RUNBOOK:
RUNBOOK_STEPS: 1 intake, 2 baseline, 3 backup, 4 change, 5 test, 6 verify, 7 log, 8 rollback if needed (see the runner sketch below).
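A minimal helper matching the DECISION_LOG_TEMPLATE fields, as referenced above; the decisions.md filename and the example entry are assumptions for illustration.

# Sketch of a decision-log entry writer per DECISION_LOG_TEMPLATE.
from datetime import date


def log_decision(decision, reason, alternatives, risks, revisit_trigger,
                 path="decisions.md"):  # filename is an assumption
    with open(path, "a") as f:
        f.write(
            f"- date: {date.today().isoformat()}\n"
            f"  decision: {decision}\n"
            f"  reason: {reason}\n"
            f"  alternatives: {alternatives}\n"
            f"  risks: {risks}\n"
            f"  revisit_trigger: {revisit_trigger}\n"
        )


# Example entry for the backend choice (illustrative, not decided):
# log_decision("start hosted via OpenRouter", "fast setup, low friction",
#              "local llama.cpp", "per-token cost grows with usage",
#              "monthly cost exceeds self-hosting")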
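And the RUNBOOK_STEPS sequence as an ordered-pipeline sketch: each named step is a callable that raises on failure, which triggers rollback (step 8). The step registry and rollback hook are placeholders for real procedures.

# Sketch of the eight-step runbook as an ordered pipeline.
RUNBOOK = ["intake", "baseline", "backup", "change", "test", "verify", "log"]


def run_runbook(steps: dict, rollback):
    """steps maps each RUNBOOK name to a callable; rollback undoes the change."""
    completed = []
    try:
        for name in RUNBOOK:
            steps[name]()          # a step raises to abort the run
            completed.append(name)
    except Exception as exc:
        print(f"step {name!r} failed after {completed}: {exc}; rolling back")
        rollback()                 # step 8: rollback if needed
        raise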