jbarber0/PatsPlants

PatsPlants

PatsPlants is an OpenBio-style framework for plant science, built to help Pat use agentic AI for real work without needing to already know the life-science tooling landscape.

Inspired by OpenMontage, PatsPlants follows the same core idea: give an AI coding assistant the right project context, workflows, and domain-specific skills so it can do real end-to-end work instead of only producing generic answers.

This repo does three things:

  1. gives Codex a plant-science operating context,
  2. installs and integrates the full SciAgent-Skills library locally for this project,
  3. provides runnable baseline workflows for ML and LLM-oriented plant-science projects.

Quick Start

If Pat only reads one section, it should be this one.

1. Set Up The Project

From the repo root:

./scripts/setup_local.sh

That script will:

  • create a local Python virtual environment at .venv/,
  • install the PatsPlants package and demo dependencies,
  • install a project-local copy of all 197 SciAgent-Skills into .sciagent-skills/,
  • run a basic environment check,
  • run the test suite.

2. Confirm It Works

.venv/bin/python -m patsplants doctor
.venv/bin/python -m patsplants list-pipelines

3. Run The Demo ML Workflow

.venv/bin/python -m patsplants run-demo-training --config configs/demo_multimodal_training.json --output-dir artifacts/demo_training

4. Scaffold The Demo Crop Knowledge Assistant

.venv/bin/python -m patsplants scaffold-crop-assistant --output-dir artifacts/crop_knowledge_assistant

5. Open This Repo In Your AI Coding Assistant

Pat can use any of these:

  • Codex
  • Claude Code
  • GitHub Copilot

Ask one of these to start:

  • "Use PatsPlants to help me design a transcriptomics workflow for drought stress in maize."
  • "Use the crop knowledge assistant pipeline to help me build a literature-grounded RAG assistant for Arabidopsis root development."
  • "Use the phenotyping pipeline to propose a computer-vision workflow for leaf lesion segmentation."

Pat's First 3 Real Use Cases

If Pat is not sure where to begin, start with one of these.

1. Transcriptomics For Plant Stress

Use this when Pat has RNA-seq counts or expression tables and wants help turning them into a real workflow.

Good first prompt:

Use PatsPlants to help me build a transcriptomics workflow for maize drought stress. I have count data, sample metadata, and I want differential expression, pathway interpretation, and a shortlist of candidate genes.

What the agent should anchor on:

  • pipeline: plant-omics-discovery
  • likely SciAgent-Skills: pydeseq2-differential-expression, gseapy-gene-enrichment, gget-genomic-databases, string-database-ppi
  • expected outputs:
    • analysis plan
    • cleaned input contract
    • differential expression table
    • enrichment summary
    • candidate gene shortlist
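
The "cleaned input contract" above can be made concrete with a small check that the count table and the sample metadata describe the same samples. The sketch below is only an illustration and is not part of PatsPlants: the function name check_input_contract, the CSV layout (one gene per row, one sample per column, gene ID in the first column), the sample_id metadata column, and the toy data are all assumptions.

```python
import csv
import io


def check_input_contract(counts_csv: str, metadata_csv: str, id_column: str = "sample_id") -> dict:
    """Compare sample IDs in a count table header with those in a metadata table.

    Assumed (hypothetical) layout: counts have one gene per row and one sample
    per column, with the gene ID in the first column; metadata has one row per
    sample and an ID column named `id_column`.
    """
    count_header = next(csv.reader(io.StringIO(counts_csv)))
    count_samples = count_header[1:]  # skip the gene ID column
    meta_rows = list(csv.DictReader(io.StringIO(metadata_csv)))
    meta_samples = [row[id_column] for row in meta_rows]
    return {
        "missing_from_metadata": sorted(set(count_samples) - set(meta_samples)),
        "missing_from_counts": sorted(set(meta_samples) - set(count_samples)),
        "ok": set(count_samples) == set(meta_samples),
    }


# Toy data, invented for this example only.
counts = "gene_id,s1,s2,s3\ngeneA,10,0,7\ngeneB,3,5,1\n"
metadata = "sample_id,condition\ns1,drought\ns2,control\n"
report = check_input_contract(counts, metadata)
print(report)
```

Running a check like this before differential expression catches the common case where a sample was sequenced but never described in the metadata, or the other way around.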

Helpful command:

.venv/bin/python -m patsplants show-pipeline plant-omics-discovery

Helpful commands for the concrete local example in data/maize_drought_example:

.venv/bin/python -m patsplants inspect-omics-example --dataset-dir data/maize_drought_example
.venv/bin/python -m patsplants scaffold-omics-project --dataset-dir data/maize_drought_example --output-dir artifacts/maize_drought_project

2. Plant Imaging And Phenotyping

Use this when Pat has greenhouse, microscopy, or field images and wants a baseline modeling or image-analysis workflow.

Good first prompt:

Use PatsPlants to propose a baseline phenotyping workflow for soybean leaf disease images. I want preprocessing, segmentation or feature extraction, a baseline classifier, and a plan for evaluation.

What the agent should anchor on:

  • pipeline: phenotyping-and-field-vision
  • likely SciAgent-Skills: opencv-bioimage-analysis, scikit-image-processing, cellpose-cell-segmentation, scikit-learn-machine-learning
  • expected outputs:
    • image preprocessing plan
    • annotation strategy
    • baseline model recommendation
    • evaluation checklist

Helpful command:

.venv/bin/python -m patsplants show-pipeline phenotyping-and-field-vision

3. Literature-Grounded Plant Knowledge Assistant

Use this when Pat wants an LLM or RAG assistant that answers questions from papers, protocols, breeding notes, or agronomy documents.

Good first prompt:

Use PatsPlants to help me scaffold a literature-grounded assistant for Arabidopsis root development. I want a source ingestion plan, a citation-first answer policy, and an evaluation checklist.

What the agent should anchor on:

  • pipeline: crop-knowledge-assistant
  • likely SciAgent-Skills: deep-research, pubmed-database, openalex-database, scientific-critical-thinking, transformers-bio-nlp
  • expected outputs:
    • source manifest
    • retrieval and grounding plan
    • system prompt
    • evaluation checklist
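
One way to picture a "citation-first answer policy" backed by a source manifest is a formatter that refuses to return an answer with no supporting source. This is a hedged sketch, not the scaffold's actual output: the Source class, the format_cited_answer function, the manifest shape, and the placeholder titles are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One entry in a hypothetical source manifest."""
    source_id: str
    title: str


def format_cited_answer(answer: str, cited_ids: list[str], manifest: dict[str, Source]) -> str:
    """Return the answer followed by a numbered source list, or decline if ungrounded."""
    if not cited_ids:
        # Citation-first: no sources means no answer.
        return "I could not find a supporting source for this question."
    unknown = [sid for sid in cited_ids if sid not in manifest]
    if unknown:
        raise KeyError(f"cited sources not in manifest: {unknown}")
    lines = [answer, "", "Sources:"]
    for number, sid in enumerate(cited_ids, start=1):
        lines.append(f"[{number}] {manifest[sid].title}")
    return "\n".join(lines)


# Placeholder manifest entries, not real references.
manifest = {
    "src-1": Source("src-1", "Placeholder protocol A"),
    "src-2": Source("src-2", "Placeholder review B"),
}
print(format_cited_answer("Example answer text.", ["src-2"], manifest))
```

The design choice worth keeping, whatever the real implementation looks like, is that the refusal path is explicit and tested rather than left to the model.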

Helpful commands:

.venv/bin/python -m patsplants show-pipeline crop-knowledge-assistant
.venv/bin/python -m patsplants scaffold-crop-assistant --output-dir artifacts/crop_knowledge_assistant

If Pat Wants More Prompt Examples

See PROMPT_GALLERY.md for copy-paste prompts organized by use case and assistant.

AI Assistant Setup

This repo includes agent-specific context files so Pat can get started quickly no matter which assistant he uses:

Codex

  1. Open Codex in this repo.
  2. Confirm .sciagent-skills/registry.yaml exists.
  3. Start with a direct instruction, for example:
Use PatsPlants to help me build a bulk RNA-seq analysis plan for maize drought stress.

Claude Code

  1. Open the repo in Claude Code.
  2. Make sure local setup has been run.
  3. Start with a workflow-oriented prompt, for example:
Use PatsPlants and the installed SciAgent-Skills to propose a phenotype-modeling workflow for soybean disease images.

GitHub Copilot

  1. Open the repo in VS Code with GitHub Copilot enabled.
  2. Let Copilot read the repository context files.
  3. Start from repo chat or inline chat with a prompt like:
Read the PatsPlants README and AGENTS instructions, then help me adapt the multimodal demo to a real plant dataset.

Who This Is For

This repo is written for a user who:

  • is a plant scientist,
  • has some Python experience,
  • wants to automate workflows with an AI coding agent,
  • does not already know which bioinformatics or scientific-computing tools to pick.

If that matches Pat, start with the setup script in Quick Start above and do not worry about the rest of the file yet.

What "Agentic AI" Means Here

In this repo, agentic AI means:

  • Codex can inspect the codebase,
  • read scientific skill files when needed,
  • write scripts and configs,
  • run commands,
  • and help assemble end-to-end workflows instead of only answering questions.

PatsPlants does not replace plant-science judgment. It gives the agent a better scientific toolbox and a safer starting point.

If Pat Is Using Codex

This repo includes AGENTS.md, which tells Codex how to work in this project and where to find the local SciAgent skill registry.

The important project-local integration is:

  • .sciagent-skills/registry.yaml
  • .sciagent-skills/skills/...
  • AGENTS.md

If setup completed successfully, open Codex in this repo and it should be able to use the project-local SciAgent knowledge.

What Gets Installed

Project-local Scientific Skill Library

The setup script installs the full SciAgent-Skills repository into:

.sciagent-skills/

That is a project-local copy of the 197-skill life-sciences library. It is not committed to git. It exists so Pat can use the project on his machine without depending on whatever happens to be installed globally.

Python Environment

The local virtual environment is created here:

.venv/

The setup script installs the following into it:

  • PatsPlants itself
  • numpy
  • scikit-learn

Those are enough to run the validated ML demo in this repo.

Repo Layout

PatsPlants/
├── AGENTS.md
├── configs/
├── data/
│   └── cards/
├── docs/
├── examples/
├── pipeline_defs/
├── scripts/
├── skills/
│   └── domain/
├── src/patsplants/
└── tests/

Included Pipelines

  1. plant-omics-discovery
     Focus: candidate gene discovery, transcriptome prioritization, pathway inference, and explainable predictive modeling.
  2. phenotyping-and-field-vision
     Focus: greenhouse, microscopy, and field imagery for segmentation, tracking, trait extraction, and stress classification.
  3. multimodal-plant-foundation-models
     Focus: representation learning across omics, phenotype images, environmental context, and scientific text.
  4. crop-knowledge-assistant
     Focus: plant-science LLM systems, domain RAG, and literature-grounded agronomy or breeding copilots.
  5. plant-design-loop
     Focus: design-build-test-learn workflows for proteins, regulatory elements, metabolic engineering, and trait optimization.

Validated Demo Workflows

1. Multimodal Plant ML Demo

This is a runnable baseline workflow, not just a placeholder.

Command:

.venv/bin/python -m patsplants run-demo-training --config configs/demo_multimodal_training.json --output-dir artifacts/demo_training

What it does:

  • generates a synthetic plant multimodal dataset,
  • trains a baseline model for drought-tolerance classification,
  • evaluates accuracy, F1, and ROC AUC,
  • saves a model card and feature-importance summary.
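
For readers new to the three metrics named above, here is a minimal pure-Python sketch of how accuracy, F1, and ROC AUC are computed for a binary label. It is not the code the demo runs; the function names and the toy labels and scores are assumptions made for illustration. In practice, scikit-learn (installed by the setup script) provides accuracy_score, f1_score, and roc_auc_score in sklearn.metrics.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = drought-tolerant, 0 = not. Values are invented.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(f"accuracy={accuracy(y_true, y_pred):.3f}")
print(f"f1={f1(y_true, y_pred):.3f}")
print(f"roc_auc={roc_auc(y_true, scores):.3f}")
```

Accuracy and F1 depend on the 0.5 threshold, while ROC AUC only depends on how the scores rank the samples, which is why the three can disagree on the same model.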

Use this when Pat wants to understand the project shape before swapping in real data.

Files involved:

2. Crop Knowledge Assistant Scaffold

Command:

.venv/bin/python -m patsplants scaffold-crop-assistant --output-dir artifacts/crop_knowledge_assistant

What it does:

  • creates a starter directory for a plant-science knowledge assistant,
  • writes a runbook based on the bundled pipeline,
  • writes a starter system prompt,
  • writes an evaluation checklist,
  • writes a source manifest template.

Use this when Pat wants to build an LLM or RAG assistant for papers, protocols, breeding notes, or agronomy references.

Beginner Workflow For Pat

If Pat is unsure what to do first, use this order:

  1. Run setup:
./scripts/setup_local.sh
  2. Verify the environment:
.venv/bin/python -m patsplants doctor
  3. Look at the available pipelines:
.venv/bin/python -m patsplants list-pipelines
  4. Run the demo ML workflow:
.venv/bin/python -m patsplants run-demo-training --help

or:

.venv/bin/python -m patsplants run-demo-training --config configs/demo_multimodal_training.json --output-dir artifacts/demo_training
  5. Ask Codex for a real next task, for example:
  • "Use PatsPlants to help me design a transcriptomics workflow for drought stress in maize."
  • "Use the crop knowledge assistant pipeline to help me build a literature-grounded RAG assistant for Arabidopsis root development."
  • "Use the phenotyping pipeline to propose a computer-vision workflow for leaf lesion segmentation."
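
The command-line steps in this order can also be driven from a short Python script that stops at the first failure. This is a sketch under assumptions: the helper names beginner_commands and run_all are hypothetical, and the script assumes it is started from the repo root on macOS or Linux. The commands themselves are copied from this README.

```python
import subprocess
from pathlib import Path

PYTHON = ".venv/bin/python"  # interpreter created by the setup script


def beginner_commands() -> list[list[str]]:
    """Return the first-run commands from this README, in order."""
    return [
        ["./scripts/setup_local.sh"],
        [PYTHON, "-m", "patsplants", "doctor"],
        [PYTHON, "-m", "patsplants", "list-pipelines"],
        [PYTHON, "-m", "patsplants", "run-demo-training",
         "--config", "configs/demo_multimodal_training.json",
         "--output-dir", "artifacts/demo_training"],
    ]


def run_all(repo_root: str = ".") -> None:
    """Run each command in turn; check=True raises on the first non-zero exit."""
    for command in beginner_commands():
        print("$", " ".join(command))
        subprocess.run(command, cwd=Path(repo_root), check=True)


# Printing only, so the sketch is safe to execute anywhere.
# Call run_all() from the repo root to actually run the steps.
for command in beginner_commands():
    print(" ".join(command))
```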

Local Setup Details

Prerequisites

  • macOS or Linux
  • python3 available on the command line
  • git
  • internet access for cloning SciAgent-Skills and installing Python packages

One-Command Setup

./scripts/setup_local.sh

Manual Setup

If the script fails and Pat wants to do it manually:

python3 -m venv .venv
.venv/bin/python -m pip install --upgrade pip setuptools wheel
.venv/bin/pip install -e .
./scripts/install_sciagent_skills.sh
.venv/bin/python -m patsplants doctor
.venv/bin/python -m unittest discover -s tests -v

Useful Commands

Inspect pipelines

.venv/bin/python -m patsplants list-pipelines
.venv/bin/python -m patsplants show-pipeline multimodal-plant-foundation-models

Inspect curated SciAgent mappings inside PatsPlants

.venv/bin/python -m patsplants list-skills
.venv/bin/python -m patsplants list-skills --category modeling

Generate a runbook

.venv/bin/python -m patsplants scaffold-runbook crop-knowledge-assistant --output artifacts/crop_assistant_runbook.md

Run everything that is validated in this repo

./scripts/run_demo_workflows.sh

How To Ask Codex For Help

Good prompts in this repo usually include:

  • the plant or crop,
  • the data type,
  • the biological question,
  • what output you want,
  • any constraints.

Examples:

  • "I have RNA-seq counts from maize leaves under drought and control. Help me build a PatsPlants differential expression workflow and explain each step."
  • "I have greenhouse images of soybean leaves with disease labels. Use PatsPlants to scaffold a training plan for a baseline vision model."
  • "Help me build a plant-science RAG assistant that answers questions from papers and protocols, with citations."
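
The five ingredients listed above can be turned into a reusable prompt template. The sketch below is one hypothetical way to do it; the build_prompt function and its wording are not part of PatsPlants.

```python
def build_prompt(crop: str, data_type: str, question: str, output: str, constraints: str = "") -> str:
    """Assemble a prompt from the plant, data type, question, desired output, and constraints."""
    parts = [
        f"Use PatsPlants to help me with a project on {crop}.",
        f"Data: {data_type}.",
        f"Biological question: {question}.",
        f"Output I want: {output}.",
    ]
    if constraints:
        parts.append(f"Constraints: {constraints}.")
    return " ".join(parts)


prompt = build_prompt(
    crop="maize",
    data_type="RNA-seq counts from leaves under drought and control",
    question="which genes respond to drought",
    output="a differential expression workflow with each step explained",
    constraints="must run on a laptop",
)
print(prompt)
```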

Files Worth Reading

Design Principles

  • Use existing SciAgent-Skills instead of inventing overlapping capability.
  • Start from a baseline that runs, then escalate to more advanced models.
  • Keep ML and LLM work grounded in plant-science questions and evaluation.
  • Make the first-run experience easy enough for a scientist who is new to agentic AI.

Common Problems

"The setup script failed during pip install"

Try:

.venv/bin/python -m pip install --upgrade pip setuptools wheel
.venv/bin/pip install -e .

"I do not see .sciagent-skills/"

Run:

./scripts/install_sciagent_skills.sh

"Codex is not using the scientific skills"

Make sure:

  • you opened Codex inside this repo,
  • .sciagent-skills/registry.yaml exists,
  • AGENTS.md exists at the repo root.
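
The file checks in that list can be scripted. The following is a minimal sketch that reports which of the two files named above are missing; the function name missing_integration_files is an assumption, and it only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path

# Paths named in this README, relative to the repo root.
REQUIRED = ("AGENTS.md", ".sciagent-skills/registry.yaml")


def missing_integration_files(repo_root: str = ".") -> list[str]:
    """Return the required files that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED if not (root / rel).is_file()]


missing = missing_integration_files(".")
print("all present" if not missing else f"missing: {missing}")
```

If .sciagent-skills/registry.yaml is reported missing, rerunning ./scripts/install_sciagent_skills.sh is the fix this README suggests.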

"I want to use my own plant dataset"

Start by replacing the demo config and data assumptions rather than rewriting the whole repo. The validated demo is meant to give Pat a working baseline he can adapt safely.
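
One low-risk way to "replace the demo config" is to derive a new JSON file from it instead of editing it in place. This is a sketch only: the derive_config helper is hypothetical, and this README does not document the keys inside configs/demo_multimodal_training.json, so any key passed in overrides must be checked against the real file first.

```python
import json
from pathlib import Path


def derive_config(base_path: str, new_path: str, overrides: dict) -> dict:
    """Copy a JSON config, replace selected top-level keys, and write the result.

    The update is shallow: nested objects are replaced whole, not merged.
    """
    config = json.loads(Path(base_path).read_text(encoding="utf-8"))
    if not isinstance(config, dict):
        raise ValueError("expected the base config to be a JSON object")
    config.update(overrides)
    out = Path(new_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

The derived file can then be passed to run-demo-training through --config, leaving the validated demo config untouched as a known-good fallback.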
