Tutorial: End-to-End Research Workflow with Theta Sweep Pipeline¶
Note
Reading Time: ~30-35 minutes
Difficulty: Beginner to Intermediate
Prerequisites: Basic understanding of navigation trajectories
This tutorial demonstrates how to use the ThetaSweepPipeline for complete end-to-end analysis of experimental trajectory data without needing detailed knowledge of CANN implementation.
1. Introduction to Pipelines¶
1.1 What is a Pipeline?¶
A pipeline provides a high-level interface for complete analyses without requiring implementation details:
[ ]:
# Traditional approach (manual)
#   1. Load data
#   2. Create networks
#   3. Run simulation
#   4. Compute analyses
#   5. Generate visualizations
#   6. Save results

# Pipeline approach (automated)
pipeline = ThetaSweepPipeline(trajectory_data, times)
results = pipeline.run() # Everything done!
1.2 Why Use Pipelines?¶
For experimental neuroscientists:
- No need to understand CANN mathematics
- Focus on your data and questions
- Reproducible, standardized analyses

For computational researchers:
- Rapid prototyping
- Parameter sweeps and batch processing
- Consistent output formats
1.3 ThetaSweepPipeline Overview¶
The ThetaSweepPipeline implements complete theta sweep analysis:
Inputs:
- Trajectory data (position over time)
- Timing information

Automatic processing:
- Direction cell network simulation
- Grid cell network simulation
- Theta modulation computation
- Population activity analysis

Outputs:
- Trajectory analysis plots
- Theta sweep animations
- Population activity visualizations
- Raw simulation data for custom analysis
2. Quick Start: Basic Pipeline Usage¶
2.1 Minimal Example¶
The simplest possible usage - just trajectory and times:
[ ]:
import numpy as np
from canns.pipeline import ThetaSweepPipeline # :cite:p:`chu2024firing,ji2025systems`
# Example: Load your experimental trajectory
# positions: shape (n_steps, 2) - [x, y] coordinates
# times: shape (n_steps,) - timestamps in seconds
positions = np.load('my_trajectory.npy') # Your data
times = np.load('my_times.npy')
# Run complete analysis (one line!)
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,
)
results = pipeline.run(output_dir="results/")
print(f"Animation saved to: {results['animation_path']}")
print(f"Analysis plots in: results/")
That’s it! The pipeline handles everything automatically.
2.2 Understanding the Outputs¶
After running, you’ll find in the output directory:
results/
├── trajectory_analysis.png # Trajectory path and statistics
├── population_activity_hd.png # Direction cell activity
├── population_activity_gc.png # Grid cell activity
├── theta_sweep_animation.gif # Complete dynamics animation
└── simulation_data.npz # Raw data for custom analysis
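The simulation_data.npz file can be reopened at any time with NumPy for your own analyses. A minimal sketch; the key names below are assumed to mirror the simulation_data dictionary shown in Section 5.3, so check data.files on your actual output:

[ ]:
import numpy as np

# Reopen the saved simulation bundle for offline analysis
data = np.load("results/simulation_data.npz")
print(data.files)  # names of the stored arrays

# Assumed keys (see Section 5.3); verify against data.files
gc_activity = data["gc_activity"]
theta_phase = data["theta_phase"]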
2.3 Quick Data Format Check¶
Your trajectory data should be:
[ ]:
# positions: (n_steps, 2) array
print(f"Position shape: {positions.shape}") # Should be (N, 2)
print(f"Position range X: [{positions[:,0].min()}, {positions[:,0].max()}]")
print(f"Position range Y: [{positions[:,1].min()}, {positions[:,1].max()}]")
# times: (n_steps,) array
print(f"Times shape: {times.shape}") # Should be (N,)
print(f"Duration: {times[-1] - times[0]:.2f}s")
print(f"Mean dt: {np.mean(np.diff(times)):.4f}s")
Common issues:
- Positions not in meters? Scale appropriately (see the sketch below)
- Times not in seconds? Convert first
- Non-uniform sampling? The pipeline handles this automatically
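When units are the issue, a quick conversion before constructing the pipeline is usually enough. A minimal sketch, assuming centimetre positions and millisecond timestamps; adjust the factors to your recording setup:

[ ]:
# Convert units before passing data to the pipeline
positions = positions / 100.0  # cm -> m
times = times / 1000.0         # ms -> s

# Drop samples with tracking failures (NaN coordinates)
valid = ~np.isnan(positions).any(axis=1)
positions, times = positions[valid], times[valid]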
3. Loading External Trajectory Data¶
3.1 From CSV Files¶
[ ]:
import pandas as pd
# Load from CSV
df = pd.read_csv('trajectory.csv')
# Extract positions and times
positions = df[['x', 'y']].values # (n_steps, 2)
times = df['time'].values # (n_steps,)
# Run pipeline
pipeline = ThetaSweepPipeline(positions, times)
results = pipeline.run()
CSV format example:

time,x,y
0.000,0.5,0.5
0.001,0.501,0.502
0.002,0.503,0.505
…
3.2 From MATLAB Files¶
[ ]:
from scipy.io import loadmat
# Load MATLAB file
data = loadmat('trajectory.mat')
positions = data['positions'] # Already (n_steps, 2)
times = data['times'].flatten() # Flatten if needed
pipeline = ThetaSweepPipeline(positions, times)
results = pipeline.run()
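One caveat: scipy.io.loadmat cannot read MATLAB v7.3 files, which are HDF5 containers. A minimal fallback sketch using h5py; the variable names 'positions' and 'times' are assumptions, so match them to your file:

[ ]:
import h5py
import numpy as np

# MATLAB files saved with -v7.3 are HDF5 and need h5py instead of loadmat
with h5py.File('trajectory.mat', 'r') as f:
    positions = np.array(f['positions']).T  # MATLAB is column-major; transpose if shapes look swapped
    times = np.array(f['times']).flatten()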
3.3 From Tracking Software¶
Common formats from tracking systems:
[ ]:
# DeepLabCut output
import pandas as pd
# DLC CSVs have three header rows (scorer, bodyparts, coords)
dlc_data = pd.read_csv('tracking_output.csv', header=[0, 1, 2], index_col=0)
scorer = dlc_data.columns.get_level_values(0)[0]  # scorer name from the top header row
x = dlc_data[(scorer, 'bodypart1', 'x')].values  # replace 'bodypart1' with your body part
y = dlc_data[(scorer, 'bodypart1', 'y')].values
positions = np.column_stack([x, y])
# Bonsai output
bonsai_data = pd.read_csv('bonsai_tracking.csv')
positions = bonsai_data[['X', 'Y']].values / 100 # Convert cm to m
# Custom tracking
# Always ensure: positions in meters, times in seconds
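Tracking systems often report pixel coordinates rather than metres. A minimal calibration sketch; arena_width_m and arena_width_px are placeholders for your own measurements:

[ ]:
# Convert pixel coordinates to metres using one known arena dimension
arena_width_m = 1.5     # physical arena width in metres (hypothetical value)
arena_width_px = 640.0  # the same width measured in camera pixels (hypothetical value)

positions = positions * (arena_width_m / arena_width_px)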
3.4 Synthetic Test Data¶
For testing or demonstration:
[ ]:
def create_test_trajectory(n_steps=1000, dt=0.002):
    """Create a smooth circular test trajectory with small positional noise."""
    times = np.linspace(0, (n_steps - 1) * dt, n_steps)

    # Circular trajectory
    t_param = np.linspace(0, 4 * np.pi, n_steps)
    radius = 0.5
    center = np.array([0.75, 0.75])
    x = center[0] + radius * np.cos(t_param)
    y = center[1] + radius * np.sin(t_param)

    # Add small noise
    noise = np.random.normal(0, 0.01, (n_steps, 2))
    positions = np.column_stack([x, y]) + noise
    return positions, times
# Test the pipeline
positions, times = create_test_trajectory()
pipeline = ThetaSweepPipeline(positions, times, env_size=1.5)
results = pipeline.run(output_dir="test_results/")
4. Advanced Customization¶
4.1 Environment Configuration¶
Adjust environment parameters:
[ ]:
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,
    env_size=2.0,  # Environment size (meters)
    dt=0.001,      # Simulation time step (seconds)
)
When to adjust:
- env_size: Match your experimental arena (1m, 2m, etc.)
- dt: Finer time resolution for very fast movements (a quick speed check is sketched below)
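One way to judge whether the default dt is adequate is to look at running speed; this heuristic is our suggestion rather than a pipeline requirement:

[ ]:
# Estimate running speed from the trajectory to sanity-check dt
speeds = np.linalg.norm(np.diff(positions, axis=0), axis=1) / np.diff(times)
print(f"Mean speed: {speeds.mean():.3f} m/s, max: {speeds.max():.3f} m/s")
# Faster trajectories generally warrant a smaller simulation dt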
4.2 Network Parameters¶
Customize direction and grid cell networks:
[ ]:
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,
    # Direction cell network
    direction_cell_params={
        'num': 100,                   # Number of direction cells
        'adaptation_strength': 15.0,  # SFA strength :cite:p:`mi2014spike,li2025dynamics`
        'noise_strength': 0.0,        # Activity noise
    },
    # Grid cell network
    grid_cell_params={
        'num_gc_x': 100,             # Grid cells per dimension
        'adaptation_strength': 8.0,  # SFA strength
        'mapping_ratio': 5,          # Grid spacing control
        'phase_offset': 1.0 / 20,    # Theta sweep magnitude
    },
)
Parameter effects:
- Higher adaptation_strength: Stronger theta oscillations
- Larger mapping_ratio: Smaller grid spacing
- Larger phase_offset: Bigger theta sweeps
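Because these are plain constructor arguments, a parameter sweep is just a loop over pipeline runs. A minimal sketch varying adaptation_strength; the values are arbitrary, and whether a partial grid_cell_params dictionary is merged with the defaults is an assumption, so pass the full dictionary if in doubt:

[ ]:
# Sweep the grid cell adaptation strength, saving each run separately
for strength in [4.0, 8.0, 16.0]:
    pipeline = ThetaSweepPipeline(
        trajectory_data=positions,
        times=times,
        grid_cell_params={'adaptation_strength': strength},
    )
    pipeline.run(
        output_dir=f"sweep/adaptation_{strength:g}/",
        save_animation=False,  # skip animations to keep the sweep fast
    )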
4.3 Theta Modulation Settings¶
Control theta rhythm parameters:
[ ]:
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,
    theta_params={
        'theta_strength_hd': 1.0,  # HD cell modulation strength
        'theta_strength_gc': 0.5,  # Grid cell modulation strength
        'theta_cycle_len': 100.0,  # Cycle length (ms)
    },
)
Biological correspondence:
- theta_cycle_len=100ms → 10 Hz theta frequency
- theta_cycle_len=125ms → 8 Hz theta frequency
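The correspondence is just the period-frequency relation: the cycle length in milliseconds is 1000 divided by the theta frequency in hertz.

[ ]:
def theta_cycle_len_ms(freq_hz):
    """Convert a target theta frequency (Hz) into a cycle length (ms)."""
    return 1000.0 / freq_hz

print(theta_cycle_len_ms(10.0))  # 100.0 ms -> 10 Hz theta
print(theta_cycle_len_ms(8.0))   # 125.0 ms -> 8 Hz theta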
4.4 Animation and Visualization¶
Customize output visualizations:
[ ]:
results = pipeline.run(
    output_dir="custom_results/",
    save_animation=True,   # Generate animation
    save_plots=True,       # Save analysis plots
    animation_fps=10,      # Animation frame rate
    animation_n_step=20,   # Sample every N frames
    verbose=True,          # Print progress
)
Performance tips:
- Lower animation_fps for smaller files
- Higher animation_n_step for faster generation
- Set save_animation=False to skip animation (faster)
5. Complete Research Example¶
5.1 Realistic Scenario¶
Analyzing a recorded navigation session:
[ ]:
import numpy as np
import pandas as pd
from canns.pipeline import ThetaSweepPipeline # :cite:p:`chu2024firing,ji2025systems`
# Load experimental data
print("Loading experimental trajectory...")
df = pd.read_csv('experiment_2024_session_3.csv')
# Extract and preprocess
positions = df[['x_cm', 'y_cm']].values / 100 # Convert cm to meters
times = df['timestamp_ms'].values / 1000 # Convert ms to seconds
# Data quality check
print(f"Trajectory: {len(positions)} samples")
print(f"Duration: {times[-1] - times[0]:.2f}s")
print(f"Position range: X[{positions[:,0].min():.2f}, {positions[:,0].max():.2f}]m, "
f"Y[{positions[:,1].min():.2f}, {positions[:,1].max():.2f}]m")
# Remove any NaN values (tracking failures)
valid = ~np.isnan(positions).any(axis=1)
positions = positions[valid]
times = times[valid]
print(f"Valid samples: {len(positions)}")
5.2 Run Comprehensive Analysis¶
[ ]:
# Configure pipeline with experimental parameters
pipeline = ThetaSweepPipeline(
    trajectory_data=positions,
    times=times,
    env_size=1.5,  # 1.5m x 1.5m arena
    direction_cell_params={
        'num': 100,
        'adaptation_strength': 15.0,  # SFA :cite:p:`mi2014spike,li2025dynamics`
        'noise_strength': 0.05,       # Small noise for realism
    },
    grid_cell_params={
        'num_gc_x': 100,
        'adaptation_strength': 8.0,
        'mapping_ratio': 5,
    },
    theta_params={
        'theta_strength_hd': 1.0,
        'theta_strength_gc': 0.5,
        'theta_cycle_len': 100.0,  # 10 Hz
    },
)
# Run analysis
print("\nRunning theta sweep analysis...")
results = pipeline.run(
    output_dir="experiment_analysis/",
    save_animation=True,
    save_plots=True,
    animation_fps=15,
    verbose=True,
)
print("\n✅ Analysis complete!")
print(f"📊 Results saved to: experiment_analysis/")
print(f"🎬 Animation: {results['animation_path']}")
5.3 Custom Post-Processing¶
Access raw simulation data for custom analyses:
[ ]:
# Load simulation results
sim_data = results['simulation_data']
# Available data
dc_activity = sim_data['dc_activity'] # Direction cell firing
gc_activity = sim_data['gc_activity'] # Grid cell firing
theta_phase = sim_data['theta_phase'] # Theta phase over time
internal_pos = sim_data['internal_position'] # Decoded position
# Example analysis: Phase precession :cite:p:`ji2025phase,o1993phase`
import matplotlib.pyplot as plt
# Find high activity periods
activity_threshold = gc_activity.mean() + gc_activity.std()
high_activity = gc_activity.max(axis=1) > activity_threshold
# Plot phase vs time for high activity
plt.figure(figsize=(12, 4))
plt.scatter(times[high_activity], theta_phase[high_activity],
            c=gc_activity[high_activity].max(axis=1),
            cmap='viridis', s=10, alpha=0.6)
plt.xlabel('Time (s)')
plt.ylabel('Theta Phase (rad)')
plt.title('Phase Precession During High Grid Cell Activity')
plt.colorbar(label='Max Activity')
plt.savefig('experiment_analysis/phase_precession.png', dpi=150)
plt.show()
# Compare decoded vs actual position
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.plot(positions[:,0], positions[:,1], 'k-', alpha=0.5, label='Actual')
plt.plot(internal_pos[:,0], internal_pos[:,1], 'r-', alpha=0.7, label='Decoded')
plt.xlabel('X Position (m)')
plt.ylabel('Y Position (m)')
plt.title('Position Tracking Accuracy')
plt.legend()
plt.axis('equal')
plt.subplot(1, 2, 2)
error = np.linalg.norm(positions - internal_pos, axis=1)
plt.plot(times, error, 'b-', alpha=0.7)
plt.xlabel('Time (s)')
plt.ylabel('Position Error (m)')
plt.title('Decoding Error Over Time')
plt.tight_layout()
plt.savefig('experiment_analysis/position_accuracy.png', dpi=150)
plt.show()
print(f"\n📈 Mean position error: {error.mean():.3f}m")
print(f"📈 Max position error: {error.max():.3f}m")
5.4 Batch Processing Multiple Sessions¶
Process multiple experimental sessions:
[ ]:
import glob
from pathlib import Path
# Find all session files
session_files = glob.glob('experiments/session_*.csv')
print(f"Found {len(session_files)} sessions to process")
# Process each session
for session_file in session_files:
    session_name = Path(session_file).stem
    print(f"\nProcessing {session_name}...")

    # Load data
    df = pd.read_csv(session_file)
    positions = df[['x_cm', 'y_cm']].values / 100
    times = df['timestamp_ms'].values / 1000

    # Run pipeline
    pipeline = ThetaSweepPipeline(positions, times, env_size=1.5)
    output_dir = f"batch_results/{session_name}/"
    results = pipeline.run(
        output_dir=output_dir,
        save_animation=False,  # Skip animation for speed
        save_plots=True,
        verbose=False,
    )
    print(f"  ✓ Complete: {output_dir}")
print("\n🎉 Batch processing complete!")
6. Next Steps¶
Congratulations! You’ve learned how to use the ThetaSweepPipeline for end-to-end theta sweep analysis.
Key Takeaways¶
Pipelines simplify workflows - One-line analysis of complex models
Flexible data loading - CSV, MATLAB, tracking software
Automatic outputs - Plots, animations, and raw data
Customizable parameters - Full control when needed
Batch processing - Analyze multiple sessions efficiently
When to Use Pipelines¶
Perfect for:
- Experimental neuroscientists without coding expertise
- Rapid prototyping and exploratory analysis
- Standardized processing of multiple datasets
- Publication-quality figure generation
- Teaching and demonstrations

Consider the manual approach when you:
- Need non-standard model architectures
- Are implementing new analysis methods
- Want fine-grained control over every step
- Are extending the pipeline functionality
Pipeline Features Summary¶
| Feature | Default | Customization |
|---------|---------|---------------|
| Data input | trajectory_data, times | Multiple format loaders |
| Environment | Auto-detected | Customizable env_size, dt |
| Networks | Default parameters | Full parameter dictionaries |
| Theta | Default 10 Hz | Custom frequency, strength |
| Outputs | Standard plots | Raw data + custom analysis |
| Batch | Single session | Multi-session processing |
Continue Learning¶
Related: Tutorials 1-7 - Understanding models in detail
Related: Scenario 3 - Brain-inspired learning
Extending the Pipeline¶
Want to modify or extend the pipeline?
1. Check the source code: canns.pipeline.theta_sweep.ThetaSweepPipeline

2. Inherit and customize:

from canns.pipeline import ThetaSweepPipeline

class MyCustomPipeline(ThetaSweepPipeline):
    def custom_analysis(self):
        # Add your analysis here
        pass

3. Contribute: Submit enhancements via GitHub pull requests
Getting Help¶
Documentation: https://canns.readthedocs.io
Examples: See the examples/pipeline/ directory
Issues: GitHub Issues
Community: Discussions and Q&A on GitHub
Best Practices¶
Data quality first: Clean tracking data before pipeline
Start simple: Use default parameters initially
Validate outputs: Check trajectory plots for sanity
Document parameters: Save configuration for reproducibility (a sketch follows this list)
Version control: Track pipeline version used for each analysis
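For the "document parameters" point, one lightweight habit is to dump the exact configuration you passed to the pipeline next to its outputs. A minimal sketch; the config dictionary is illustrative, not something canns requires:

[ ]:
import json

# Save the exact pipeline configuration alongside the results
config = {
    'env_size': 1.5,
    'grid_cell_params': {'adaptation_strength': 8.0, 'mapping_ratio': 5},
    'theta_params': {'theta_cycle_len': 100.0},
}
with open('experiment_analysis/pipeline_config.json', 'w') as f:
    json.dump(config, f, indent=2)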
Thank you for completing the CANN tutorial series! You now have comprehensive knowledge of CANN modeling, brain-inspired learning, and practical research workflows.