
Script Requirements

This page details the requirements for custom scripts in FlowAgent.

Basic Requirements

  1. Script Format
     • Must be executable
     • Must handle command-line arguments
     • Must output JSON to stdout
     • Must use stderr for errors

  2. Metadata
     • Must have accompanying metadata.json
     • Must specify all inputs and outputs
     • Must declare workflow position

  3. Error Handling
     • Must exit with non-zero status on error
     • Must provide meaningful error messages
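
The requirements above can be sketched as a minimal Python script. This is an illustration under stated assumptions, not a FlowAgent template: the `--input_file` parameter and the `line_count` output key are hypothetical names, and a real script would declare its actual inputs and outputs in metadata.json.

```python
import argparse
import json
import sys


def main(argv=None):
    """Run the script and return the process exit status."""
    parser = argparse.ArgumentParser(description="Example custom script")
    # --input_file is a hypothetical parameter name used for illustration
    parser.add_argument("--input_file", required=True)
    args = parser.parse_args(argv)

    try:
        with open(args.input_file) as handle:
            n_lines = sum(1 for _ in handle)
    except OSError as exc:
        # Errors go to stderr; the non-zero return value signals failure
        print(f"error: cannot read {args.input_file}: {exc}", file=sys.stderr)
        return 1

    # Results go to stdout as a single JSON document
    print(json.dumps({"line_count": n_lines}))
    return 0

# A real script would end with: sys.exit(main())
```

Returning the status from `main` instead of calling `sys.exit` inside it keeps the function easy to test. Only JSON is written to stdout, so a caller can parse the output without filtering.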

Metadata Schema

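The metadata.json file must follow this schema. The // comments are annotations only; standard JSON does not allow comments, so leave them out of the actual file: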

{
    "name": "string",              // Unique identifier
    "description": "string",       // Clear description
    "script_file": "string",       // Script filename
    "language": "string",          // Programming language
    "input_requirements": [        // List of inputs
        {
            "name": "string",      // Input parameter name
            "type": "string",      // File type/format
            "description": "string" // Input description
        }
    ],
    "output_types": [             // List of outputs
        {
            "name": "string",      // Output name
            "type": "string",      // File type/format
            "description": "string" // Output description
        }
    ],
    "workflow_types": [           // Compatible workflows
        "string"
    ],
    "execution_order": {          // Position in workflow
        "before": ["string"],     // Steps this script runs before
        "after": ["string"]       // Steps this script runs after
    },
    "requirements": {             // Dependencies
        "r_packages": ["string"], // Required R packages
        "python_packages": ["string"], // Required Python packages
        "system_dependencies": ["string"] // Required system tools
    }
}
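
As a rough illustration of the schema, the sketch below checks a metadata file for the top-level keys listed above. It assumes every top-level key is required, which the schema does not state explicitly. The `count_lines` metadata is hypothetical and `missing_metadata_keys` is an illustrative helper, not a FlowAgent function.

```python
import json

# Top-level keys taken from the schema above
REQUIRED_KEYS = {
    "name", "description", "script_file", "language",
    "input_requirements", "output_types", "workflow_types",
    "execution_order", "requirements",
}


def missing_metadata_keys(metadata):
    """Return the sorted list of required top-level keys absent from metadata."""
    return sorted(REQUIRED_KEYS - set(metadata))


# Hypothetical metadata for an imaginary script (note: no comments in real JSON)
example = json.loads("""
{
    "name": "count_lines",
    "description": "Counts the lines in a text file",
    "script_file": "count_lines.py",
    "language": "python",
    "input_requirements": [
        {"name": "input_file", "type": "txt", "description": "File to count"}
    ],
    "output_types": [
        {"name": "line_count", "type": "json", "description": "Number of lines"}
    ],
    "workflow_types": ["rna_seq"],
    "execution_order": {"before": ["alignment"], "after": ["fastqc"]},
    "requirements": {
        "r_packages": [],
        "python_packages": [],
        "system_dependencies": []
    }
}
""")

print(missing_metadata_keys(example))  # prints []
```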

Language-Specific Requirements

Python Scripts

  1. Argument Parsing

    import argparse
    
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_name', required=True)
    args = parser.parse_args()
    

  2. JSON Output

    import json
    
    results = {"output_file": "path/to/output.csv"}
    print(json.dumps(results))
    

R Scripts

  1. Argument Parsing

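    args <- commandArgs(trailingOnly = TRUE)
    args_dict <- list()
    # Arguments arrive as "--name value" pairs; seq_len() keeps the loop safe when args is empty
    for (i in seq_len(length(args) %/% 2) * 2 - 1) {
        args_dict[[sub("^--", "", args[i])]] <- args[i + 1]
    }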
    

  2. JSON Output

    library(jsonlite)
    
    results <- list(output_file = "path/to/output.csv")
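    # auto_unbox = TRUE writes length-one vectors as scalars ("path") rather than arrays (["path"])
    cat(toJSON(results, auto_unbox = TRUE))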
    

Bash Scripts

  1. Argument Parsing

    while [[ $# -gt 0 ]]; do
        case $1 in
            --input_file)
                INPUT_FILE="$2"
                shift 2
                ;;
            *)
                shift
                ;;
        esac
    done
    

  2. JSON Output

    echo "{\"output_file\": \"$OUTPUT_FILE\"}"
    

Standard Workflow Steps

Scripts can reference these standard steps in their execution order:

RNA-seq Workflow

  • fastqc
  • alignment
  • feature_counts
  • differential_expression

ChIP-seq Workflow

  • fastqc
  • alignment
  • peak_calling
  • motif_analysis
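
To show how these step names relate to the `execution_order` field, here is a small sketch that checks whether a declared position is consistent with the standard step order. It assumes a script references only the standard steps listed above and that the steps run in the order shown. The workflow identifiers `rna_seq` and `chip_seq` and the function name are hypothetical.

```python
# Step names copied from the lists above; the dictionary keys are hypothetical identifiers
STANDARD_STEPS = {
    "rna_seq": ["fastqc", "alignment", "feature_counts", "differential_expression"],
    "chip_seq": ["fastqc", "alignment", "peak_calling", "motif_analysis"],
}


def check_execution_order(workflow, execution_order):
    """Return a list of problems with a script's declared workflow position."""
    steps = STANDARD_STEPS[workflow]
    before = execution_order.get("before", [])
    after = execution_order.get("after", [])

    problems = [f"unknown step: {name}" for name in before + after if name not in steps]
    if problems:
        return problems

    # The script runs after every "after" step and before every "before" step,
    # so the latest "after" step has to precede the earliest "before" step.
    latest_after = max((steps.index(s) for s in after), default=-1)
    earliest_before = min((steps.index(s) for s in before), default=len(steps))
    if latest_after >= earliest_before:
        problems.append("no valid position between the declared steps")
    return problems
```

For example, `check_execution_order("rna_seq", {"after": ["alignment"], "before": ["feature_counts"]})` returns an empty list, while swapping the two step names reports that no valid position exists.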

Best Practices

  1. Input Validation
     • Check file existence
     • Validate file formats
     • Verify parameter values

  2. Output Management
     • Use consistent naming
     • Clean up temporary files
     • Document file formats

  3. Error Messages
     • Be specific and clear
     • Include troubleshooting hints
     • Log relevant details

  4. Performance
     • Handle large files efficiently
     • Clean up resources
     • Report progress for long operations

  5. Documentation
     • Include usage examples
     • Document assumptions
     • Explain algorithms used
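
Several of these practices can be combined in a few lines. The sketch below is one possible approach, assuming tabular text inputs: the function names, the allowed extensions, and the progress interval are all illustrative choices, and the scratch file merely stands in for whatever intermediate files a real script would write.

```python
import os
import sys
import tempfile


def validate_input(path, allowed_extensions=(".csv", ".tsv")):
    """Return an error message for a bad input file, or None if it looks usable."""
    if not os.path.isfile(path):
        return f"input file not found: {path} (check the path passed on the command line)"
    if not path.lower().endswith(tuple(allowed_extensions)):
        return f"unsupported format for {path}: expected one of {', '.join(allowed_extensions)}"
    return None


def count_rows(path, progress_every=100_000):
    """Stream the file line by line so large inputs are never loaded fully into memory."""
    rows = 0
    # TemporaryDirectory removes the scratch files even if an exception is raised
    with tempfile.TemporaryDirectory() as scratch_dir:
        scratch_path = os.path.join(scratch_dir, "rows.tmp")
        with open(path) as source, open(scratch_path, "w") as scratch:
            for rows, line in enumerate(source, start=1):
                scratch.write(line)  # stand-in for real intermediate output
                if rows % progress_every == 0:
                    # Progress goes to stderr so stdout stays valid JSON
                    print(f"processed {rows} rows", file=sys.stderr)
    return rows
```

Returning a message from `validate_input` rather than raising lets the caller decide how to report the problem, for instance by printing it to stderr and exiting with a non-zero status.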