Compare commits


8 Commits

3 changed files with 379 additions and 208 deletions

173
README.md

@ -1,174 +1,33 @@
# AOC Sync
A Python script that polls multiple git repositories containing Advent of Code implementations written in Rust using `cargo-aoc` format. It automatically updates repositories when changes are detected, runs benchmarks, and generates a beautiful HTML comparison page showing performance metrics across users, years, and days.
A tool to poll multiple git repositories containing Advent of Code implementations and generate performance comparison reports.
## Features
## Building the Podman Image
- **Automatic Git Polling**: Monitors multiple repositories for changes
- **Flexible Repository Structure**: Supports both single-repo (all years) and multi-repo (one per year) configurations
- **Automatic Runtime Measurement**: Runs `cargo aoc` for all implemented days and parses runtime information
- **Performance Parsing**: Extracts timing data from cargo-aoc output
- **Data Storage**: SQLite database for historical performance data
- **HTML Reports**: Beautiful, responsive HTML comparison pages
- **Gap Handling**: Gracefully handles missing years, days, and parts
- **Configurable Comparisons**: Filter by specific years and days
The default configuration uses a custom Podman image (`aocsync:latest`) that has `cargo-aoc` pre-installed for faster execution.
## Requirements
To build the image:
- Python 3.7+
- Git
- Rust and Cargo
- `cargo-aoc` installed (install with `cargo install cargo-aoc`)
## Installation
1. Clone this repository:
```bash
git clone <repository-url>
cd aocsync
podman build -t aocsync:latest -f Dockerfile .
```
2. Install Python dependencies:
```bash
pip install -r requirements.txt
```
3. Ensure `cargo-aoc` is installed:
```bash
cargo install cargo-aoc
```
Note: The script runs `cargo aoc` (not `cargo aoc bench`) and parses runtime information from the output.
Alternatively, you can use the `rust:latest` image, but it will install `cargo-aoc` on each run (slower).
## Configuration
Edit `config.yaml` to configure repositories to monitor:
```yaml
# Poll interval in seconds
poll_interval: 300
# Output directory for generated HTML
output_dir: "output"
# Data storage directory
data_dir: "data"
# Repositories to monitor
repositories:
  # Single repository with all years
  - name: "user1"
    url: "https://github.com/user1/advent-of-code"
    type: "single"
    local_path: "repos/user1"
  # Multiple repositories, one per year
  - name: "user2"
    type: "multi-year"
    years:
      - year: 2023
        url: "https://github.com/user2/aoc-2023"
        local_path: "repos/user2-2023"
      - year: 2024
        url: "https://github.com/user2/aoc-2024"
        local_path: "repos/user2-2024"
# Optional: Filter specific years to compare
compare_years: [2023, 2024]
# Optional: Filter specific days to compare
# compare_days: [1, 2, 3, 4, 5]
```
See `config.yaml` for configuration options, including:
- Repository URLs and paths
- Docker/Podman container settings
- Build and registry cache directories
- Resource limits
## Usage
### Run Once
To sync repositories and generate a report once:
```bash
python aocsync.py --once
# Run sync (only processes changed repositories)
python3 aocsync.py
# Force rerun all benchmarks
python3 aocsync.py --force
```
### Force Rerun All Days
To force rerun all days even if repositories haven't changed:
```bash
python aocsync.py --once --force
```
or
```bash
python aocsync.py --once --rerun-all
```
This is useful for refreshing all data or testing changes to the script.
### Continuous Polling
To continuously poll repositories (default):
```bash
python aocsync.py
```
Or with a custom config file:
```bash
python aocsync.py --config myconfig.yaml
```
## Output
- **Database**: Performance data is stored in `data/results.db` (SQLite)
- **HTML Report**: Generated at `output/index.html` (configurable via `output_dir`)
The HTML report includes:
- Performance comparison tables for each year/day/part
- Visual highlighting of fastest and slowest implementations
- Relative speed comparisons (X times faster/slower)
- Responsive design for viewing on any device
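The relative speed comparison can be illustrated with a small sketch. This is not the report generator's actual code; the function name `relative_speeds` and the input shape (a mapping from user name to runtime in nanoseconds) are assumptions made for the example.

```python
from typing import Dict


def relative_speeds(runtimes_ns: Dict[str, int]) -> Dict[str, float]:
    """Return each user's runtime as a multiple of the fastest runtime.

    Hypothetical helper: a factor of 1.0 marks the fastest implementation,
    2.5 means "2.5 times slower than the fastest".
    """
    if not runtimes_ns:
        return {}
    fastest = min(runtimes_ns.values())
    if fastest <= 0:
        raise ValueError("runtimes must be positive")
    return {user: ns / fastest for user, ns in runtimes_ns.items()}


# Made-up numbers, only to show the shape of the result.
factors = relative_speeds({"user1": 1000, "user2": 2500})
```

Here `factors["user2"]` is 2.5, which a report could render as "2.5x slower".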
## How It Works
1. **Git Polling**: Checks each configured repository for changes by comparing local and remote commits
2. **Repository Update**: Clones new repositories or updates existing ones when changes are detected
3. **Day Detection**: Automatically finds implemented days by scanning for `day*.rs` files and Cargo.toml entries
4. **Runtime Measurement**: Runs `cargo aoc --day X` for each implemented day
5. **Parsing**: Extracts timing data from cargo-aoc output (handles nanoseconds, microseconds, milliseconds, seconds)
6. **Storage**: Stores results in SQLite database with timestamps
7. **Report Generation**: Generates HTML comparison page showing latest results
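Step 5 can be sketched as a unit-aware parser. This is a minimal illustration, not the script's actual parser: the name `parse_duration_ns`, the regular expression, and the sample strings are assumptions, and real cargo-aoc output may be formatted differently.

```python
import re
from typing import Optional

# Multipliers for converting each supported unit to nanoseconds.
_UNIT_TO_NS = {
    "ns": 1,
    "µs": 1_000,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
}

# Two-letter units are listed before plain "s" so that "ms" is not read as "s".
_DURATION_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(ns|µs|us|ms|s)\b")


def parse_duration_ns(text: str) -> Optional[int]:
    """Return the first duration found in `text` as whole nanoseconds, or None."""
    match = _DURATION_RE.search(text)
    if match is None:
        return None
    value, unit = match.groups()
    return round(float(value) * _UNIT_TO_NS[unit])
```

Normalizing every measurement to nanoseconds before storing it keeps later comparisons simple, because the database then holds a single integer unit.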
## Repository Structure Detection
The script automatically detects:
- **Year**: From repository path, name, or Cargo.toml
- **Days**: From `src/bin/day*.rs`, `src/day*.rs`, or Cargo.toml entries
- **Parts**: From cargo-aoc output (Part 1, Part 2)
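Day detection from file names can be sketched as follows. This is an illustrative sketch only: `find_implemented_days` is a hypothetical name, it covers just the two file patterns listed above (not Cargo.toml entries), and the 1 to 25 range check is an assumption based on the Advent of Code calendar.

```python
import re
from pathlib import Path
from typing import List

# Matches file stems such as "day1", "day01" or "day12_part2".
_DAY_FILE_RE = re.compile(r"day0*(\d+)", re.IGNORECASE)


def find_implemented_days(work_dir: Path) -> List[int]:
    """Collect day numbers from src/bin/day*.rs and src/day*.rs under work_dir."""
    days = set()
    for pattern in ("src/bin/day*.rs", "src/day*.rs"):
        for path in Path(work_dir).glob(pattern):
            match = _DAY_FILE_RE.match(path.stem)
            if match and 1 <= int(match.group(1)) <= 25:
                days.add(int(match.group(1)))
    return sorted(days)
```

Calling `find_implemented_days(Path("repos/user1/2024"))` would return a sorted list of the day numbers found there, or an empty list if the directory has no matching files.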
## Troubleshooting
### Cargo-aoc not found
Ensure `cargo-aoc` is installed and in your PATH:
```bash
cargo install cargo-aoc
```
### Git authentication issues
For private repositories, ensure your git credentials are configured or use SSH URLs.
### Benchmark timeouts
Each day's run is limited to 5 minutes (`timeout=300` in the `subprocess.run` call). Change that value in the code if a solution needs longer.
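A per-run timeout of this kind is typically enforced through the `timeout` argument of `subprocess.run`. The sketch below shows the general pattern; `run_with_timeout` is a hypothetical wrapper, not a function from the script, and returning `None` on timeout is a choice made for the example.

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str],
                     timeout_s: float = 300) -> Optional[subprocess.CompletedProcess]:
    """Run cmd and return its result, or None if it exceeds timeout_s seconds."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return None


# Example with a short limit: the child sleeps longer than the timeout allows.
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5)
```

In this example `slow` is `None` because the child was stopped after half a second.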
### Missing performance data
If some users/days/parts don't show up:
- Check that `cargo aoc --day X` runs successfully in the repository
- Verify the repository structure matches cargo-aoc conventions
- Ensure `cargo aoc` outputs timing information (check if it's configured to show runtime)
- Check logs for parsing errors
## License
[Your License Here]


@ -86,6 +86,18 @@ class Config:
    def rsync_config(self) -> Optional[dict]:
        return self.config.get('rsync')
    @property
    def docker_config(self) -> dict:
        """Get Podman configuration with defaults"""
        docker_config = self.config.get('docker', {})
        return {
            'build_cache_dir': docker_config.get('build_cache_dir', ''),
            'registry_cache_dir': docker_config.get('registry_cache_dir', ''),
            'memory': docker_config.get('memory', '2g'),
            'cpus': docker_config.get('cpus', '2'),
            'image': docker_config.get('image', 'aocsync:latest')
        }
class Database:
    """SQLite database for storing performance results"""
@ -355,6 +367,70 @@ class GitManager:
            logger.error(f"Error checking for changes: {e}")
            return True # Assume changes to be safe
    def has_year_changes(self, repo_path: Path, year: int, last_git_rev: str = "") -> bool:
        """Check if a specific year directory has changes since last_git_rev
        Args:
            repo_path: Path to the repository root
            year: Year to check
            last_git_rev: Last git revision we processed (empty string means check all changes)
        Returns:
            True if year directory has changes, False otherwise
        """
        repo_path = Path(repo_path)
        if not repo_path.exists() or not (repo_path / '.git').exists():
            return True # Needs to be cloned
        try:
            # Check if year directory exists
            year_dir = repo_path / str(year)
            if not year_dir.exists() or not year_dir.is_dir():
                return False # Year directory doesn't exist, no changes
            # Get current HEAD revision
            result = subprocess.run(
                ['git', 'rev-parse', '--short', 'HEAD'],
                cwd=repo_path,
                capture_output=True,
                text=True
            )
            if result.returncode != 0:
                return True # Can't determine, assume changes
            current_rev = result.stdout.strip()
            # If no last_git_rev, check if there are any commits affecting this year
            if not last_git_rev:
                # Check if year directory has any commits
                result = subprocess.run(
                    ['git', 'log', '--oneline', '--', str(year)],
                    cwd=repo_path,
                    capture_output=True,
                    text=True
                )
                return bool(result.stdout.strip())
            # Check if current revision is different from last processed
            if current_rev != last_git_rev:
                # Check if year directory was modified between last_git_rev and current
                result = subprocess.run(
                    ['git', 'diff', '--name-only', last_git_rev, 'HEAD', '--', str(year)],
                    cwd=repo_path,
                    capture_output=True,
                    text=True
                )
                if result.returncode == 0:
                    return bool(result.stdout.strip())
            return False
        except Exception as e:
            logger.error(f"Error checking year changes for {year}: {e}")
            return True # Assume changes to be safe
class CargoAOCRunner:
    """Runs cargo-aoc benchmarks and parses results"""
@ -496,10 +572,129 @@ class CargoAOCRunner:
            logger.warning(f"Could not get recent commits for {repo_path}: {e}")
        return commits
    @staticmethod
    def _run_cargo_aoc_in_container(work_dir: Path, day: int, repo_root: Path, docker_config: dict) -> subprocess.CompletedProcess:
        """Run cargo aoc in a Podman container for security
        Args:
            work_dir: Working directory (year directory) - can be absolute or relative
            day: Day number to run
            repo_root: Absolute path to repository root
            docker_config: Podman configuration dictionary
        Returns:
            CompletedProcess with stdout, stderr, returncode
        """
        repo_root = Path(repo_root).resolve()
        work_dir = Path(work_dir).resolve()
        # Ensure work_dir is under repo_root
        try:
            work_dir_rel = str(work_dir.relative_to(repo_root))
        except ValueError:
            # If work_dir is not under repo_root, this is an error
            raise ValueError(f"work_dir {work_dir} is not under repo_root {repo_root}")
        # Determine build cache directory
        build_cache_dir = docker_config.get('build_cache_dir', '')
        use_temp_build = False
        temp_build_dir = None
        if build_cache_dir:
            # Use persistent build cache directory
            build_cache_path = Path(build_cache_dir).resolve()
            build_cache_path.mkdir(parents=True, exist_ok=True)
            logger.info(f"Using persistent build cache: {build_cache_path}")
        else:
            # Create a temporary directory for cargo build artifacts (outside repo)
            import tempfile
            temp_build_dir = tempfile.mkdtemp(prefix='cargo-aoc-build-')
            build_cache_path = Path(temp_build_dir)
            use_temp_build = True
            logger.info(f"Using temporary build cache: {build_cache_path}")
        # Determine registry cache directory
        registry_cache_dir = docker_config.get('registry_cache_dir', '')
        try:
            # Build Podman command
            podman_image = docker_config.get('image', 'aocsync:latest')
            # Check if image has cargo-aoc pre-installed (aocsync:latest) or needs installation
            needs_cargo_aoc_install = podman_image == 'rust:latest' or not podman_image.startswith('aocsync')
            podman_cmd = [
                'podman', 'run',
                '--rm', # Remove container after execution
                #'--network=none', # No network access
                '--memory', docker_config.get('memory', '2g'), # Limit memory
                '--cpus', str(docker_config.get('cpus', '2')), # Limit CPU
                '--read-only', # Read-only root filesystem
                '--tmpfs', '/tmp:rw,noexec,nosuid,size=1g', # Writable /tmp for cargo
                '-v', f'{repo_root}:/repo:rw', # Mount repo read-write (cargo-aoc needs to write metadata files)
                '-v', f'{build_cache_path}:/build:rw', # Writable build directory
                '-w', f'/repo/{work_dir_rel}', # Working directory in container
            ]
            # Handle cargo registry cache and cargo home
            if registry_cache_dir:
                # Use persistent registry cache - mount entire CARGO_HOME directory
                registry_cache_path = Path(registry_cache_dir).resolve()
                registry_cache_path.mkdir(parents=True, exist_ok=True)
                # Mount the parent directory as CARGO_HOME so both registry and bin are persisted
                cargo_home_path = registry_cache_path.parent / 'cargo-home'
                cargo_home_path.mkdir(parents=True, exist_ok=True)
                podman_cmd.extend(['-v', f'{cargo_home_path}:/root/.cargo:rw'])
                logger.info(f"Using persistent cargo home (registry + bin): {cargo_home_path}")
            else:
                # Use tmpfs for cargo home (cleared after each run)
                podman_cmd.extend(['--tmpfs', '/root/.cargo:rw,noexec,nosuid,size=200m'])
            # Note: cargo-aoc installation location is handled above via CARGO_HOME mount
            # If using persistent cache, cargo bin is already mounted via cargo-home
            # If using tmpfs, cargo bin is already included in /root/.cargo tmpfs
            # Set environment variables in the container so cargo uses the mounted directories
            # Set build directory (target directory for compiled artifacts)
            podman_cmd.extend(['-e', 'CARGO_TARGET_DIR=/build/target'])
            # Set cargo home to use the mounted registry cache
            podman_cmd.extend(['-e', 'CARGO_HOME=/root/.cargo'])
            # Add Podman image and command
            if needs_cargo_aoc_install:
                # Install cargo-aoc first if not pre-installed (slower, but works with rust:latest)
                # Check if already installed to avoid reinstalling every time
                podman_cmd.extend([
                    podman_image,
                    'sh', '-c', 'if ! command -v cargo-aoc >/dev/null 2>&1; then cargo install --quiet cargo-aoc 2>/dev/null || true; fi; cargo aoc --day ' + str(day)
                ])
            else:
                # Use pre-installed cargo-aoc (faster, requires aocsync:latest image)
                podman_cmd.extend([
                    podman_image,
                    'cargo', 'aoc', '--day', str(day)
                ])
            result = subprocess.run(
                podman_cmd,
                capture_output=True,
                text=True,
                timeout=300 # 5 minute timeout
            )
            return result
        finally:
            # Clean up temporary build directory if we created one
            if use_temp_build and temp_build_dir:
                try:
                    shutil.rmtree(temp_build_dir)
                except Exception as e:
                    logger.warning(f"Failed to clean up temp build directory {temp_build_dir}: {e}")
@staticmethod
def run_benchmarks(repo_path: Path, year: int, user: str = "unknown",
repo_url: str = "", is_multi_year: bool = False,
log_file: Optional[Path] = None) -> List[PerformanceResult]:
log_file: Optional[Path] = None, docker_config: Optional[dict] = None) -> List[PerformanceResult]:
"""Run cargo aoc benchmarks and parse results
Args:
@ -511,17 +706,19 @@ class CargoAOCRunner:
log_file: Optional path to log file to append cargo aoc output to
"""
results = []
repo_path = Path(repo_path)
repo_path = Path(repo_path).resolve()
# Get git revision
git_rev = CargoAOCRunner.get_git_rev(repo_path)
# Determine the working directory
# Determine the working directory and repo root
if is_multi_year:
# For multi-year repos, repo_path is already the year directory
work_dir = repo_path
repo_root = repo_path # For multi-year, repo_path is the repo root
else:
# For single repos, check if we need to navigate to a year subdirectory
repo_root = repo_path # Repo root is the repo_path
work_dir = repo_path
year_dir = repo_path / str(year)
if year_dir.exists() and year_dir.is_dir():
@ -539,17 +736,18 @@ class CargoAOCRunner:
for day in days:
try:
logger.info(f"Running cargo aoc for {user} year {year} day {day} in {work_dir}")
# Run cargo aoc for this day (no year flag, must be in correct directory)
cmd = ['cargo', 'aoc', '--day', str(day)]
result = subprocess.run(
cmd,
cwd=work_dir,
capture_output=True,
text=True,
timeout=300 # 5 minute timeout per day
)
logger.info(f"Running cargo aoc for {user} year {year} day {day} in {work_dir} (in Podman container)")
# Run cargo aoc in a Podman container for security
# Use default docker_config if not provided
if docker_config is None:
docker_config = {
'build_cache_dir': '',
'registry_cache_dir': '',
'memory': '2g',
'cpus': '2',
'image': 'aocsync:latest'
}
result = CargoAOCRunner._run_cargo_aoc_in_container(work_dir, day, repo_root, docker_config)
# Write to log file if provided
if log_file:
@ -561,7 +759,7 @@ class CargoAOCRunner:
with open(log_file, 'a', encoding='utf-8') as f:
f.write(f"\n{'='*80}\n")
f.write(f"[{timestamp}] {user} - Year {year} - Day {day}\n")
f.write(f"Command: {' '.join(cmd)}\n")
f.write(f"Command: cargo aoc --day {day} (in Podman container)\n")
f.write(f"Working Directory: {work_dir}\n")
f.write(f"Return Code: {result.returncode}\n")
f.write(f"{'='*80}\n")
@ -931,8 +1129,9 @@ class HTMLGenerator:
if len(data_points) < 2:
return ""
# Graph dimensions
width = 600
# Graph dimensions - make responsive to fit in modal
# Modal has max-width 600px with 20px padding, so max SVG width should be ~560px
width = 560
height = 200
padding = 40
graph_width = width - 2 * padding
@ -961,9 +1160,9 @@ class HTMLGenerator:
else:
return f"{ns}ns"
# Generate SVG
# Generate SVG - make it responsive
svg_parts = []
svg_parts.append(f'<svg width="{width}" height="{height}" style="border: 1px solid #ddd; background: #fafafa;">')
svg_parts.append(f'<svg width="100%" height="{height}" viewBox="0 0 {width} {height}" preserveAspectRatio="xMidYMid meet" style="border: 1px solid #ddd; background: #fafafa; max-width: 100%;">')
# Draw axes
svg_parts.append(f'<line x1="{padding}" y1="{padding}" x2="{padding}" y2="{height - padding}" stroke="#333" stroke-width="2"/>') # Y-axis
@ -1387,6 +1586,15 @@ class HTMLGenerator:
padding: 10px;
background: white;
border-radius: 4px;
overflow-x: auto;
overflow-y: visible;
}}
.history-graph svg {{
display: block;
width: 100%;
max-width: 100%;
height: auto;
}}
.compact-commits {{
@ -1779,6 +1987,8 @@ class AOCSync:
self.html_gen = HTMLGenerator(self.config.output_dir)
self.git_manager = GitManager()
self.force_rerun = force_rerun
# Ensure Podman image exists
self._ensure_podman_image()
def process_repository(self, repo_config: dict, user_name: str):
"""Process a single repository configuration"""
@ -1789,11 +1999,7 @@ class AOCSync:
url = repo_config['url']
local_path = repo_config['local_path']
if self.force_rerun or self.git_manager.has_changes(url, local_path):
if self.force_rerun:
logger.info(f"Force rerun enabled, processing repository {user_name}...")
else:
logger.info(f"Repository {user_name} has changes, updating...")
# Always update the repo to get latest changes
if self.git_manager.clone_or_update_repo(url, local_path):
repo_path = Path(local_path)
@ -1801,20 +2007,17 @@ class AOCSync:
config_years = repo_config.get('years')
url = repo_config['url']
years_to_process = []
if config_years:
# Use years from config
for year in config_years:
self._run_and_store_benchmarks(repo_path, year, user_name,
repo_url=url, is_multi_year=False)
years_to_process = config_years
else:
# Try to determine year(s) from the repository
years = CargoAOCRunner.extract_years_from_repo(repo_path)
if years:
# Run benchmarks for each detected year
for year in years:
self._run_and_store_benchmarks(repo_path, year, user_name,
repo_url=url, is_multi_year=False)
years_to_process = years
else:
# If no year detected, check for year directories
logger.warning(f"No year detected for {user_name}, checking for year directories")
@ -1823,8 +2026,30 @@ class AOCSync:
year_dir = repo_path / str(try_year)
if year_dir.exists() and year_dir.is_dir():
logger.info(f"Found year directory {try_year} for {user_name}")
self._run_and_store_benchmarks(repo_path, try_year, user_name,
years_to_process.append(try_year)
# Process each year, checking for changes if not forcing
for year in years_to_process:
if self.force_rerun:
logger.info(f"Force rerun enabled, processing {user_name} year {year}...")
self._run_and_store_benchmarks(repo_path, year, user_name,
repo_url=url, is_multi_year=False)
else:
# Get last git_rev for this user/year from database
last_results = self.db.get_latest_results(years=[year], days=None)
last_git_rev = ""
for result in last_results:
if result['user'] == user_name and result['year'] == year:
last_git_rev = result.get('git_rev', '')
break
# Check if this year has changes
if self.git_manager.has_year_changes(repo_path, year, last_git_rev):
logger.info(f"Year {year} for {user_name} has changes, running benchmarks...")
self._run_and_store_benchmarks(repo_path, year, user_name,
repo_url=url, is_multi_year=False)
else:
logger.info(f"Year {year} for {user_name} has no changes, skipping...")
elif repo_type == 'multi-year':
# Multiple repositories, one per year
@ -1867,9 +2092,11 @@ class AOCSync:
# Create log file path in output directory
log_file = Path(self.config.output_dir) / 'cargo-aoc.log'
log_file.parent.mkdir(parents=True, exist_ok=True)
# Get Podman configuration
docker_config = self.config.docker_config
results = CargoAOCRunner.run_benchmarks(repo_path, year=year, user=user,
repo_url=repo_url, is_multi_year=is_multi_year,
log_file=log_file)
log_file=log_file, docker_config=docker_config)
# Store results
for result in results:
@ -1877,6 +2104,72 @@ class AOCSync:
        logger.info(f"Stored {len(results)} benchmark results for {user} year {year}")
    def _ensure_podman_image(self):
        """Check if the Podman image exists, build it if it doesn't"""
        docker_config = self.config.docker_config
        image_name = docker_config.get('image', 'aocsync:latest')
        # Only check/build if using aocsync:latest or custom image (not rust:latest)
        if image_name != 'rust:latest':
            # Check if image exists
            try:
                result = subprocess.run(
                    ['podman', 'images', '--format', '{{.Repository}}:{{.Tag}}'],
                    capture_output=True,
                    text=True,
                    timeout=5
                )
                # Parse image name (handle cases like "aocsync:latest" or just "aocsync")
                image_repo, image_tag = (image_name.split(':') + ['latest'])[:2]
                # Check if image exists in the output
                image_exists = False
                for line in result.stdout.strip().split('\n'):
                    if line:
                        repo, tag = (line.split(':') + ['latest'])[:2]
                        if repo == image_repo and tag == image_tag:
                            image_exists = True
                            break
                if not image_exists:
                    logger.info(f"Podman image {image_name} not found, building it...")
                    self._build_podman_image(image_name)
                else:
                    logger.debug(f"Podman image {image_name} exists")
            except Exception as e:
                logger.warning(f"Could not check for Podman image {image_name}: {e}")
                logger.info("Attempting to build image anyway...")
                self._build_podman_image(image_name)
    def _build_podman_image(self, image_name: str):
        """Build the Podman image from Dockerfile"""
        dockerfile_path = Path(__file__).parent / 'Dockerfile'
        if not dockerfile_path.exists():
            logger.error(f"Dockerfile not found at {dockerfile_path}")
            logger.error("Cannot build Podman image. Please create Dockerfile or use rust:latest image.")
            return
        try:
            logger.info(f"Building Podman image {image_name} from {dockerfile_path}...")
            result = subprocess.run(
                ['podman', 'build', '-t', image_name, '-f', str(dockerfile_path), str(dockerfile_path.parent)],
                capture_output=True,
                text=True,
                timeout=600 # 10 minute timeout for build
            )
            if result.returncode == 0:
                logger.info(f"Successfully built Podman image {image_name}")
            else:
                logger.error(f"Failed to build Podman image: {result.stderr}")
                # No automatic fallback happens here; the configured image is still used for runs
                logger.warning("Image build failed; set docker.image to 'rust:latest' in config.yaml to install cargo-aoc at run time instead")
        except subprocess.TimeoutExpired:
            logger.error("Podman build timed out")
        except Exception as e:
            logger.error(f"Error building Podman image: {e}")
    def sync_all(self):
        """Sync all repositories"""
        logger.info("Starting sync of all repositories...")


@ -13,6 +13,25 @@ rsync:
  enabled: true
  destination: "xinu.tv:/var/www/static/aoc/"
# Podman container configuration for running cargo aoc
docker:
  # Persistent directory for cargo build artifacts (speeds up rebuilds)
  # If not specified, uses temporary directory that's cleaned up after each run
  build_cache_dir: "/tmp/aocsync/build_cache_dir"
  # Persistent directory for cargo registry cache (downloaded dependencies)
  # If not specified, uses tmpfs that's cleared after each run
  registry_cache_dir: "/tmp/aocsync/registry_cache_dir"
  # Container resource limits
  memory: "2g" # Memory limit
  cpus: "2" # CPU limit
  # Podman image to use (should have cargo-aoc installed)
  # Default: "aocsync:latest" (build with: podman build -t aocsync:latest -f Dockerfile .)
  # Alternative: "rust:latest" (will install cargo-aoc on first run, slower)
  image: "aocsync:latest"
# Repositories to monitor
repositories:
  # Example: Single repository with all years