Tutorial: End-to-End Research Workflow with Theta Sweep Pipeline

Note

Reading Time: ~30-35 minutes

Difficulty: Beginner to Intermediate

Prerequisites: Basic understanding of navigation trajectories

This tutorial demonstrates how to use the ThetaSweepPipeline for complete end-to-end analysis of experimental trajectory data without needing detailed knowledge of CANN implementation.


1. Introduction to Pipelines

1.1 What is a Pipeline?

A pipeline provides a high-level interface for complete analyses without requiring implementation details:

[ ]:
# Traditional approach (manual):
#   1. Load data
#   2. Create networks
#   3. Run simulation
#   4. Compute analyses
#   5. Generate visualizations
#   6. Save results

# Pipeline approach (automated)
pipeline = ThetaSweepPipeline(trajectory_data, times)
results = pipeline.run()  # Everything done!

1.2 Why Use Pipelines?

For experimental neuroscientists:

- No need to understand CANN mathematics
- Focus on your data and questions
- Reproducible, standardized analyses

For computational researchers:

- Rapid prototyping
- Parameter sweeps and batch processing
- Consistent output formats

1.3 ThetaSweepPipeline Overview

The ThetaSweepPipeline implements complete theta sweep analysis:

Inputs:

- Trajectory data (position over time)
- Timing information

Automatic processing:

- Direction cell network simulation
- Grid cell network simulation
- Theta modulation computation
- Population activity analysis

Outputs:

- Trajectory analysis plots
- Theta sweep animations
- Population activity visualizations
- Raw simulation data for custom analysis


2. Quick Start: Basic Pipeline Usage

2.1 Minimal Example

The simplest possible usage: just a trajectory and its timestamps:

[ ]:
import numpy as np
from canns.pipeline import ThetaSweepPipeline  # :cite:p:`chu2024firing,ji2025systems`

# Example: Load your experimental trajectory
# positions: shape (n_steps, 2) - [x, y] coordinates
# times: shape (n_steps,) - timestamps in seconds
positions = np.load('my_trajectory.npy')  # Your data
times = np.load('my_times.npy')

# Run complete analysis (one line!)
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times
)

results = pipeline.run(output_dir="results/")

print(f"Animation saved to: {results['animation_path']}")
print(f"Analysis plots in: results/")

That’s it! The pipeline handles everything automatically.

2.2 Understanding the Outputs

After running, you’ll find in the output directory:

results/
├── trajectory_analysis.png       # Trajectory path and statistics
├── population_activity_hd.png    # Direction cell activity
├── population_activity_gc.png    # Grid cell activity
├── theta_sweep_animation.gif     # Complete dynamics animation
└── simulation_data.npz           # Raw data for custom analysis
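
The simulation_data.npz archive lets you reload the raw results later without rerunning the simulation. A minimal sketch, assuming the archive stores the same arrays exposed through results['simulation_data'] in Section 5.3; print data.files to see what your pipeline version actually saved:

[ ]:
import numpy as np

# Reload the saved raw data; the keys below are assumed to match
# results['simulation_data'] (Section 5.3) - check data.files to confirm.
data = np.load('results/simulation_data.npz')
print(data.files)

gc_activity = data['gc_activity']   # grid cell firing (assumed key)
theta_phase = data['theta_phase']   # theta phase over time (assumed key)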

2.3 Quick Data Format Check

Your trajectory data should be:

[ ]:
# positions: (n_steps, 2) array
print(f"Position shape: {positions.shape}")  # Should be (N, 2)
print(f"Position range X: [{positions[:,0].min()}, {positions[:,0].max()}]")
print(f"Position range Y: [{positions[:,1].min()}, {positions[:,1].max()}]")

# times: (n_steps,) array
print(f"Times shape: {times.shape}")  # Should be (N,)
print(f"Duration: {times[-1] - times[0]:.2f}s")
print(f"Mean dt: {np.mean(np.diff(times)):.4f}s")

Common issues:

- Positions not in meters? Scale them appropriately (see the sketch below)
- Times not in seconds? Convert them first
- Non-uniform sampling? The pipeline handles this automatically
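
A minimal cleanup sketch for the first two issues, using hypothetical raw arrays positions_cm and times_ms recorded in centimeters and milliseconds:

[ ]:
import numpy as np

# Hypothetical raw data: positions in centimeters, times in milliseconds.
positions = positions_cm / 100.0   # cm -> m
times = times_ms / 1000.0          # ms -> s

# Optional: drop tracking failures (NaN rows) before running the pipeline.
valid = ~np.isnan(positions).any(axis=1)
positions, times = positions[valid], times[valid]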


3. Loading External Trajectory Data

3.1 From CSV Files

[ ]:
import pandas as pd

# Load from CSV
df = pd.read_csv('trajectory.csv')

# Extract positions and times
positions = df[['x', 'y']].values  # (n_steps, 2)
times = df['time'].values          # (n_steps,)

# Run pipeline
pipeline = ThetaSweepPipeline(positions, times)
results = pipeline.run()

CSV format example:

    time,x,y
    0.000,0.5,0.5
    0.001,0.501,0.502
    0.002,0.503,0.505
    …

3.2 From MATLAB Files

[ ]:
from scipy.io import loadmat

# Load MATLAB file
data = loadmat('trajectory.mat')

positions = data['positions']  # Already (n_steps, 2)
times = data['times'].flatten()  # Flatten if needed

pipeline = ThetaSweepPipeline(positions, times)
results = pipeline.run()

3.3 From Tracking Software

Common formats from tracking systems:

[ ]:
# DeepLabCut output
import pandas as pd
dlc_data = pd.read_csv('tracking_output.csv', header=[0,1,2])
x = dlc_data[('bodypart1', 'x')].values
y = dlc_data[('bodypart1', 'y')].values
positions = np.column_stack([x, y])

# Bonsai output
bonsai_data = pd.read_csv('bonsai_tracking.csv')
positions = bonsai_data[['X', 'Y']].values / 100  # Convert cm to m

# Custom tracking
# Always ensure: positions in meters, times in seconds

3.4 Synthetic Test Data

For testing or demonstration:

[ ]:
def create_test_trajectory(n_steps=1000, dt=0.002):
    """Create smooth test trajectory"""
    times = np.linspace(0, (n_steps-1)*dt, n_steps)

    # Circular trajectory with some noise
    t_param = np.linspace(0, 4*np.pi, n_steps)
    radius = 0.5
    center = np.array([0.75, 0.75])

    x = center[0] + radius * np.cos(t_param)
    y = center[1] + radius * np.sin(t_param)

    # Add small noise
    noise = np.random.normal(0, 0.01, (n_steps, 2))
    positions = np.column_stack([x, y]) + noise

    return positions, times

# Test the pipeline
positions, times = create_test_trajectory()
pipeline = ThetaSweepPipeline(positions, times, env_size=1.5)
results = pipeline.run(output_dir="test_results/")

4. Advanced Customization

4.1 Environment Configuration

Adjust environment parameters:

[ ]:
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,
    env_size=2.0,    # Environment size (meters)
    dt=0.001,        # Simulation time step (seconds)
)

When to adjust:

- env_size: match your experimental arena (1 m, 2 m, etc.)
- dt: use a finer time resolution for very fast movements (see the sketch below)
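
If you are unsure what dt to use, one possible heuristic (not a pipeline requirement) is to match the simulation step to your tracking rate:

[ ]:
import numpy as np

# A sketch of deriving dt from the tracking timestamps; treat this as a
# starting point and refine it for very fast movements.
dt_guess = float(np.median(np.diff(times)))

pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,
    env_size=2.0,
    dt=dt_guess,
)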

4.2 Network Parameters

Customize direction and grid cell networks:

[ ]:
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,

    # Direction cell network
    direction_cell_params={
        'num': 100,                  # Number of direction cells
        'adaptation_strength': 15.0,  # SFA :cite:p:`mi2014spike,li2025dynamics` strength
        'noise_strength': 0.0,       # Activity noise
    },

    # Grid cell network
    grid_cell_params={
        'num_gc_x': 100,            # Grid cells per dimension
        'adaptation_strength': 8.0,  # SFA strength
        'mapping_ratio': 5,          # Grid spacing control
        'phase_offset': 1.0/20,      # Theta sweep magnitude
    },
)

Parameter effects (a sweep sketch follows this list):

- Higher adaptation_strength: stronger theta oscillations
- Larger mapping_ratio: smaller grid spacing
- Larger phase_offset: bigger theta sweeps
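
To explore these effects systematically, here is a minimal parameter-sweep sketch over adaptation_strength; the specific values are illustrative, not recommendations:

[ ]:
# Illustrative sweep over grid cell adaptation strength; each run writes
# to its own directory so the outputs can be compared side by side.
for strength in [4.0, 8.0, 16.0]:
    pipeline = ThetaSweepPipeline(
        trajectory_data=positions,
        times=times,
        grid_cell_params={
            'num_gc_x': 100,
            'adaptation_strength': strength,
            'mapping_ratio': 5,
        },
    )
    pipeline.run(
        output_dir=f"sweep_adaptation_{strength:g}/",
        save_animation=False,  # skip animations for speed
    )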

4.3 Theta Modulation Settings

Control theta rhythm parameters:

[ ]:
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,

    theta_params={
        'theta_strength_hd': 1.0,    # HD cell modulation strength
        'theta_strength_gc': 0.5,    # Grid cell modulation strength
        'theta_cycle_len': 100.0,    # Cycle length (ms)
    },
)

Biological correspondence (see the conversion sketch below):

- theta_cycle_len=100 ms → 10 Hz theta frequency
- theta_cycle_len=125 ms → 8 Hz theta frequency
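
Because theta_cycle_len is a period in milliseconds, a target frequency converts as cycle length = 1000 / frequency. A short sketch; the other theta_params values simply repeat the defaults shown above:

[ ]:
# Pick the cycle length from a target theta frequency.
target_theta_hz = 8.0
theta_cycle_len_ms = 1000.0 / target_theta_hz  # 125 ms -> 8 Hz theta

pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,
    theta_params={
        'theta_strength_hd': 1.0,
        'theta_strength_gc': 0.5,
        'theta_cycle_len': theta_cycle_len_ms,
    },
)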

4.4 Animation and Visualization

Customize output visualizations:

[ ]:
results = pipeline.run(
    output_dir="custom_results/",
    save_animation=True,           # Generate animation
    save_plots=True,               # Save analysis plots
    animation_fps=10,              # Animation frame rate
    animation_n_step=20,           # Sample every N frames
    verbose=True,                  # Print progress
)

Performance tips:

- Lower animation_fps for smaller files
- Higher animation_n_step for faster generation
- Set save_animation=False to skip the animation entirely (fastest)


5. Complete Research Example

5.1 Realistic Scenario

Analyzing a recorded navigation session:

[ ]:
import numpy as np
import pandas as pd
from canns.pipeline import ThetaSweepPipeline  # :cite:p:`chu2024firing,ji2025systems`

# Load experimental data
print("Loading experimental trajectory...")
df = pd.read_csv('experiment_2024_session_3.csv')

# Extract and preprocess
positions = df[['x_cm', 'y_cm']].values / 100  # Convert cm to meters
times = df['timestamp_ms'].values / 1000       # Convert ms to seconds

# Data quality check
print(f"Trajectory: {len(positions)} samples")
print(f"Duration: {times[-1] - times[0]:.2f}s")
print(f"Position range: X[{positions[:,0].min():.2f}, {positions[:,0].max():.2f}]m, "
      f"Y[{positions[:,1].min():.2f}, {positions[:,1].max():.2f}]m")

# Remove any NaN values (tracking failures)
valid = ~np.isnan(positions).any(axis=1)
positions = positions[valid]
times = times[valid]
print(f"Valid samples: {len(positions)}")

5.2 Run Comprehensive Analysis

[ ]:
# Configure pipeline with experimental parameters
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,
    env_size=1.5,  # 1.5m x 1.5m arena

    direction_cell_params={
        'num': 100,
        'adaptation_strength': 15.0,  # SFA :cite:p:`mi2014spike,li2025dynamics`
        'noise_strength': 0.05,  # Small noise for realism
    },

    grid_cell_params={
        'num_gc_x': 100,
        'adaptation_strength': 8.0,
        'mapping_ratio': 5,
    },

    theta_params={
        'theta_strength_hd': 1.0,
        'theta_strength_gc': 0.5,
        'theta_cycle_len': 100.0,  # 10 Hz
    },
)

# Run analysis
print("\nRunning theta sweep analysis...")
results = pipeline.run(
    output_dir="experiment_analysis/",
    save_animation=True,
    save_plots=True,
    animation_fps=15,
    verbose=True,
)

print("\n✅ Analysis complete!")
print(f"📊 Results saved to: experiment_analysis/")
print(f"🎬 Animation: {results['animation_path']}")

5.3 Custom Post-Processing

Access raw simulation data for custom analyses:

[ ]:
# Load simulation results
sim_data = results['simulation_data']

# Available data
dc_activity = sim_data['dc_activity']    # Direction cell firing
gc_activity = sim_data['gc_activity']    # Grid cell firing
theta_phase = sim_data['theta_phase']    # Theta phase over time
internal_pos = sim_data['internal_position']  # Decoded position

# Example analysis: Phase precession :cite:p:`ji2025phase,o1993phase`
import matplotlib.pyplot as plt

# Find high activity periods
activity_threshold = gc_activity.mean() + gc_activity.std()
high_activity = gc_activity.max(axis=1) > activity_threshold

# Plot phase vs time for high activity
plt.figure(figsize=(12, 4))
plt.scatter(times[high_activity], theta_phase[high_activity],
            c=gc_activity[high_activity].max(axis=1),
            cmap='viridis', s=10, alpha=0.6)
plt.xlabel('Time (s)')
plt.ylabel('Theta Phase (rad)')
plt.title('Phase Precession During High Grid Cell Activity')
plt.colorbar(label='Max Activity')
plt.savefig('experiment_analysis/phase_precession.png', dpi=150)
plt.show()

# Compare decoded vs actual position
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.plot(positions[:,0], positions[:,1], 'k-', alpha=0.5, label='Actual')
plt.plot(internal_pos[:,0], internal_pos[:,1], 'r-', alpha=0.7, label='Decoded')
plt.xlabel('X Position (m)')
plt.ylabel('Y Position (m)')
plt.title('Position Tracking Accuracy')
plt.legend()
plt.axis('equal')

plt.subplot(1, 2, 2)
error = np.linalg.norm(positions - internal_pos, axis=1)
plt.plot(times, error, 'b-', alpha=0.7)
plt.xlabel('Time (s)')
plt.ylabel('Position Error (m)')
plt.title('Decoding Error Over Time')
plt.tight_layout()
plt.savefig('experiment_analysis/position_accuracy.png', dpi=150)
plt.show()

print(f"\n📈 Mean position error: {error.mean():.3f}m")
print(f"📈 Max position error: {error.max():.3f}m")

5.4 Batch Processing Multiple Sessions

Process multiple experimental sessions:

[ ]:
import glob
from pathlib import Path

# Find all session files
session_files = glob.glob('experiments/session_*.csv')
print(f"Found {len(session_files)} sessions to process")

# Process each session
for session_file in session_files:
    session_name = Path(session_file).stem
    print(f"\nProcessing {session_name}...")

    # Load data
    df = pd.read_csv(session_file)
    positions = df[['x_cm', 'y_cm']].values / 100
    times = df['timestamp_ms'].values / 1000

    # Run pipeline
    pipeline = ThetaSweepPipeline(positions, times, env_size=1.5)

    output_dir = f"batch_results/{session_name}/"
    results = pipeline.run(
        output_dir=output_dir,
        save_animation=False,  # Skip animation for speed
        save_plots=True,
        verbose=False,
    )

    print(f"  ✓ Complete: {output_dir}")

print("\n🎉 Batch processing complete!")

6. Next Steps

Congratulations! You’ve learned how to use the ThetaSweepPipeline for end-to-end theta sweep analysis.

Key Takeaways

  1. Pipelines simplify workflows - One-line analysis of complex models

  2. Flexible data loading - CSV, MATLAB, tracking software

  3. Automatic outputs - Plots, animations, and raw data

  4. Customizable parameters - Full control when needed

  5. Batch processing - Analyze multiple sessions efficiently

When to Use Pipelines

Perfect for:

- Experimental neuroscientists without coding expertise
- Rapid prototyping and exploratory analysis
- Standardized processing of multiple datasets
- Publication-quality figure generation
- Teaching and demonstrations

Consider the manual approach when:

- You need non-standard model architectures
- You are implementing new analysis methods
- You want fine-grained control over every step
- You are extending the pipeline functionality

Pipeline Features Summary

| Feature | Basic Usage | Advanced Usage |
|---------|-------------|----------------|
| Data input | trajectory_data, times | Multiple format loaders |
| Environment | Auto-detected | Customizable env_size, dt |
| Networks | Default parameters | Full parameter dictionaries |
| Theta | Default 10 Hz | Custom frequency, strength |
| Outputs | Standard plots | Raw data + custom analysis |
| Batch | Single session | Multi-session processing |

Continue Learning

Extending the Pipeline

Want to modify or extend the pipeline?

  1. Check source code: canns.pipeline.theta_sweep.ThetaSweepPipeline

  2. Inherit and customize:

     from canns.pipeline import ThetaSweepPipeline

     class MyCustomPipeline(ThetaSweepPipeline):
         def custom_analysis(self):
             # Add your analysis here
             pass

  3. Contribute: Submit enhancements via GitHub pull requests

Best Practices

  1. Data quality first: Clean tracking data before pipeline

  2. Start simple: Use default parameters initially

  3. Validate outputs: Check trajectory plots for sanity

  4. Document parameters: Save configuration for reproducibility (see the sketch after this list)

  5. Version control: Track pipeline version used for each analysis
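
For point 4, a minimal sketch of saving the run configuration next to the results; the dictionary simply mirrors the parameters used in Section 5.2:

[ ]:
import json

# Record the exact configuration used for this analysis run.
config = {
    'env_size': 1.5,
    'direction_cell_params': {'num': 100, 'adaptation_strength': 15.0,
                              'noise_strength': 0.05},
    'grid_cell_params': {'num_gc_x': 100, 'adaptation_strength': 8.0,
                         'mapping_ratio': 5},
    'theta_params': {'theta_strength_hd': 1.0, 'theta_strength_gc': 0.5,
                     'theta_cycle_len': 100.0},
}
with open('experiment_analysis/run_config.json', 'w') as f:
    json.dump(config, f, indent=2)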

Thank you for completing the CANN tutorial series! You now have comprehensive knowledge of CANN modeling, brain-inspired learning, and practical research workflows.