The article discusses multi-day agentic coding workflows in scientific computing, using AI agents such as Claude. Researchers have traditionally managed AI in a conversational loop, but newer agents can work autonomously toward high-level objectives, which can significantly shorten project timelines. The post walks through using Claude to develop a differentiable cosmological Boltzmann solver, a complex task that benefits from sustained agent work. It outlines best practices for managing agents, keeping structured files for progress tracking, and communicating effectively with the agent. The author highlights the educational value of monitoring agent-driven development and shows how these workflows can compress months of work into days. Overall, the article emphasizes the transformative potential of AI in research and argues for making efficient use of available computational resources.
| name | description | change | 10-year | driving-force | relevancy |
|---|---|---|---|---|---|
| Autonomous Scientific Research Agents | Scientific tasks are increasingly being handled by autonomous AI agents rather than humans directly. | Shift from human-led science projects to autonomous agent-driven research workflows. | Research could become largely autonomous, reducing time spent on data processing and analysis by researchers. | Advancements in AI capabilities that allow for more complex and autonomous operations in scientific research. | 4 |
| Multi-Day Agentic Coding Workflows | Adoption of coding workflows that leverage AI for long-term projects and task management. | Change from iterative human-led coding to utilizing AI for sustained, multi-day tasks. | Coding practices in scientific computing may evolve to rely heavily on multi-day task automation by AI. | Need for efficiency and speed in handling complex scientific computing projects. | 4 |
| Differentiable Scientific Solvers | Emergence of differentiable versions of solvers that enable improved gradient-based inference. | Transition from traditional solvers to differentiable ones, enhancing parameter estimation accuracy. | Faster and more accurate model development in cosmology and beyond, enabling novel research capabilities. | Desire for enhanced accuracy in scientific modeling and the efficiency of using gradient-based methods. | 5 |
| Improved Collaboration with AI Agents | Research teams are collaborating with AI agents to speed up scientific discoveries and developments. | Shift towards collaborative work between humans and AI, leveraging both strengths for scientific advancement. | Scientific research could see hybrid models of human and AI collaboration becoming standard. | Necessity for faster results and the effective handling of complex multidisciplinary tasks. | 5 |
| Agentic Laziness in AI Models | AI agents sometimes stop before a task is fully complete, requiring user intervention to ensure thoroughness. | From the expectation of fully autonomous, efficient operation back to a model that requires occasional human checks. | Greater emphasis on refining AI models to minimize the need for user oversight and intervention. | The evolving understanding of AI limitations and the need for improved precision in autonomous systems. | 3 |
| Task-Oriented AI Agent Design | Design of AI agents that focus on specific, quantifiable goals, avoiding ambiguity in tasks. | From general-purpose models to specialized agents with clear objectives and metrics for success. | Scientific AI agents may become highly specialized for specific fields or tasks, increasing overall productivity. | Increased need for clear outcome delivery and efficiency in research and development processes. | 4 |
| Compressed Research Timelines | AI agents can reduce tasks that took months to days, accelerating the pace of research. | Shift from prolonged research cycles to rapid development and findings due to AI assistance. | The pace of scientific discovery may dramatically increase, leading to faster advancements in knowledge. | The pressure of competition and the demand for quick results in the scientific community. | 5 |
| Real-Time Progress Monitoring with AI | Using version control systems like Git to track AI agents’ progress takes on a new significance. | Transition from casual monitoring to a structured, rigorous approach to tracking AI project progress. | Project management tools may evolve to more robustly support and analyze AI-driven research workflows. | The need for accountability and traceability in increasingly autonomous scientific projects. | 4 |
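
The "Differentiable Scientific Solvers" row can be made concrete with a toy example. The sketch below is not the solver from the article: it assumes a stand-in problem (explicit Euler integration of dy/dt = -k·y) and a hypothetical `Dual` class that carries a value together with its derivative with respect to `k`. A real project would more likely rely on an automatic-differentiation framework; plain Python is used here only to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative w.r.t. one chosen parameter."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.val, -self.der)


def decay_solver(k, y0=1.0, t_end=1.0, steps=1000):
    """Explicit Euler for dy/dt = -k*y (toy problem, chosen for illustration).

    Works unchanged for plain floats and for Dual numbers.
    """
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y = y + (-k) * y * dt
    return y


# Seed der=1.0 so that `der` tracks d(output)/dk through every solver step.
out = decay_solver(Dual(2.0, 1.0))
print("y(t_end) =", out.val, " dy/dk =", out.der)
```

Because the derivative is propagated through the same arithmetic as the solution, the gradient comes out alongside the result rather than from repeated finite-difference runs, which is the property that gradient-based inference relies on.
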
| name | description |
|---|---|
| Agentic Laziness | AI models occasionally stop before completing tasks, potentially leading to incomplete results or missed objectives. |
| Quality Assurance in Automated Tasks | The risk of inadequate testing coverage or oversight errors in AI-driven projects could result in poorly developed solutions. |
| Dependency on Autonomy in Research | A reliance on autonomous agents could undermine human expertise and critical thinking in scientific discovery. |
| Potential Loss of Knowledge | Researchers may lose deep understanding of their field as they delegate tasks to AI, impacting future innovation. |
| Error Propagation in Coupled Systems | Small numerical errors in complex models (like a Boltzmann solver) can lead to significant downstream impacts, affecting research validity. |
| Ethical Concerns in AI Decision-Making | As AI takes on more tasks, ethical implications regarding decision-making and accountability grow, raising concerns about oversight. |
| Resource Management | Unmanaged compute resources may lead to inefficient use of time and financial resources when working with autonomous agents. |
| Over-reliance on AI Tools | There is a risk that scientists could become overly reliant on AI tools, diminishing their problem-solving skills and creativity. |
| Lack of Transparency in Automated Processes | Automated workflows can obfuscate decision-making processes, making it difficult for researchers to understand how outcomes are reached. |
| name | description |
|---|---|
| Autonomous Research Teams | Researchers can manage a team of AI agents that work autonomously on defined tasks, significantly speeding up project timelines. |
| Incremental Learning from Agent Progress | Researchers learn about complex topics indirectly by monitoring AI agents’ work and commit logs, building their understanding without direct involvement. |
| Enhanced Project Management via Agent Instructions | Creating structured instructions allows AI agents to maintain clarity on project goals and adapt dynamically as tasks evolve. |
| Progress Tracking and Memory Files | Using files like CHANGELOG.md helps AI agents remember previous attempts and avoid redundant work, enhancing efficiency. |
| Implementation of Orchestration Patterns | New patterns, like the Ralph loop, help ensure AI agents stay on task for long-running projects and maximize completion rates. |
| Compression of Research Time through AI | AI agents can reduce traditional research timelines from months to days by efficiently completing tasks with minimal input. |
| Integration of Test Suites in AI Development | Instructing AI agents to expand and utilize test suites ensures continuous verification of code accuracy during development cycles. |
| AI as a Productivity Driver | The emerging expectation that available AI agents left unused represent lost potential progress. |
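
The Ralph loop row describes re-running an agent until success criteria are met. A minimal sketch of that idea follows; `run_agent` and `is_done` are hypothetical callables standing in for whatever agent interface and completion check a project actually uses, and the iteration cap is an assumption added so the loop cannot run forever.

```python
from typing import Callable, Tuple


def ralph_loop(run_agent: Callable[[str], str],
               is_done: Callable[[str], bool],
               prompt: str,
               max_iterations: int = 10) -> Tuple[bool, int]:
    """Re-issue the same prompt until `is_done` accepts the output.

    Returns (finished, iterations_used). Both callables are placeholders
    for a real agent call and a real success check.
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent(prompt)
        if is_done(output):
            return True, iteration
    return False, max_iterations


# Stand-in agent for demonstration: it only "finishes" on its third call,
# mimicking an agent that stops early twice.
calls = {"n": 0}


def fake_agent(prompt: str) -> str:
    calls["n"] += 1
    return "DONE" if calls["n"] >= 3 else "stopped early"


finished, iterations = ralph_loop(
    fake_agent, lambda out: out == "DONE", "Reach the accuracy target."
)
print(finished, iterations)  # prints: True 3
```

In practice the completion check would be something measurable, such as a passing test suite or an accuracy threshold, rather than a string returned by the agent itself.
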
| name | description |
|---|---|
| AI-assisted scientific discovery | Utilizing AI agents to autonomously complete scientific projects by setting high-level objectives and managing workflows. |
| Multi-day agentic coding workflows | A structured approach for AI agents to handle complex coding tasks over extended periods, enhancing productivity and efficiency. |
| Differentiable programming | Implementing numerical solvers that enable gradient-based inference methods to speed up parameter estimation in scientific models. |
| Autonomous research teams | Managing a team of AI agents that work autonomously on scientific tasks, often with minimal human oversight. |
| Ralph loop orchestration pattern | A workflow method in AI to ensure continual task completion until success criteria are met, minimizing task abandonment. |
| HPC clusters and parallel computing | Utilizing high-performance computing environments to maximize the efficiency of AI-driven research and development tasks. |
| Progress tracking through CHANGELOG.md | Using structured progress files to document the iterative development process of AI projects, ensuring reproducibility and historical tracking. |
| Unit testing in AI workflows | Incorporating automated unit tests within AI processes to maintain code integrity and prevent regression errors throughout development. |
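
As an illustration of the unit-testing row, the sketch below checks a numerical routine against a trusted reference within a tolerance. Everything here is an assumption for demonstration: `solver_under_test` is a placeholder (a truncated series for exp(x)) rather than any code from the article, and the `1e-6` tolerance is an arbitrary example target.

```python
import math
import unittest


def relative_error(value: float, reference: float) -> float:
    """Relative deviation of `value` from a nonzero `reference`."""
    return abs(value - reference) / abs(reference)


def solver_under_test(x: float) -> float:
    """Placeholder for agent-written code: 12-term Taylor series for exp(x)."""
    return sum(x ** k / math.factorial(k) for k in range(12))


class AccuracyTest(unittest.TestCase):
    TOLERANCE = 1e-6  # assumed target, not taken from the article

    def test_matches_reference(self):
        # Compare against the standard library as the trusted reference.
        for x in (0.1, 0.5, 1.0):
            err = relative_error(solver_under_test(x), math.exp(x))
            self.assertLess(err, self.TOLERANCE)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AccuracyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all tests passed:", result.wasSuccessful())
```

An agent instructed to run such a suite after every change, and to add cases as it goes, gets an objective signal when a change breaks previously working behavior.
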
| name | description |
|---|---|
| Autonomous Scientific Computing | The rise of AI agents making autonomous decisions in scientific computing tasks, reducing the need for constant human oversight. |
| Efficiency in Research | AI’s ability to compress project timelines from months to days, shifting what constitutes idle time in research environments. |
| Multi-agent Collaboration | New workflows for managing multiple AI agents for complex scientific tasks, requiring effective orchestration and coordination. |
| Generalization of AI Capabilities | As AI models become more capable, they will require fewer domain-specific instructions, indicating a growing generalization of AI functionality. |
| Agentic Laziness in AI | The phenomenon where AI may stop working on a task prematurely, necessitating orchestration patterns to ensure completion. |
| AI in Non-Domain Expert Tasks | The capability of AI to assist in tasks outside a user’s core scientific domain, enabling broader participation in complex scientific inquiries. |
| Incremental Learning through Agent Interaction | Researchers’ ability to learn and gain expertise by observing an AI agent’s work process and outputs, leading to a greater understanding of their field. |
| Progress Monitoring with AI Agents | The need for effective tracking systems like CHANGELOG.md and Git to monitor AI agent development progress and prevent regression. |
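
A progress file such as CHANGELOG.md can be as simple as an append-only list that an agent reads before starting new work. The sketch below assumes a hypothetical one-line entry format (`- date [status] summary`); the helper names and the example entries are placeholders, not taken from the article.

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import List


def append_entry(path: Path, status: str, summary: str) -> None:
    """Append one dated entry; `status` is a free-form tag such as 'failed'."""
    line = f"- {date.today().isoformat()} [{status}] {summary}\n"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line)


def entries_with_status(path: Path, status: str) -> List[str]:
    """Return the summaries of all entries carrying the given status tag."""
    if not path.exists():
        return []
    tag = f"[{status}] "
    found = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if tag in line:
            found.append(line.split(tag, 1)[1])
    return found


# Demonstration in a temporary directory with made-up entries.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "CHANGELOG.md"
    append_entry(log, "failed", "approach A diverged")
    append_entry(log, "done", "approach B passes the test suite")
    dead_ends = entries_with_status(log, "failed")

print(dead_ends)  # prints: ['approach A diverged']
```

Listing the failed attempts at the start of each session gives the agent a record of dead ends, so it does not repeat them; committing the file with Git keeps that history alongside the code.
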