Ranktration
Rank and compare algorithms, models, or approaches using weighted multi-criteria analysis.
How It Works
- Collect trajectories - Gather approaches with measurable metrics
- Smart sampling - Select a representative sample when the dataset is large
- Pairwise battles - Compare the sampled trajectories using weighted scores
- Tournament ranking - Establish global rankings through competitive analysis
- Statistical confidence - Measure ranking stability and significance
- Final scoring - Apply ranking bonuses to produce a comprehensive evaluation
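
The steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not Ranktration's actual API: every function name, the metric names, the weights, the noise-based stability check, and the linear rank bonus are hypothetical choices made for this example. Metrics are assumed to be normalized to [0, 1] with higher values being better.

```python
import random
from itertools import combinations

def weighted_score(metrics, weights):
    """Weighted average of a trajectory's metrics (weights need not sum to 1)."""
    total = sum(weights.values())
    return sum(weights[k] * metrics[k] for k in weights) / total

def sample_trajectories(trajectories, max_size, seed=0):
    """Smart sampling stand-in: a uniform random subset when the input is large."""
    names = sorted(trajectories)
    if len(names) <= max_size:
        return names
    return sorted(random.Random(seed).sample(names, max_size))

def run_battles(names, trajectories, weights):
    """Pairwise battles: 1 point per win, 0.5 each for a tie."""
    wins = {n: 0.0 for n in names}
    for a, b in combinations(names, 2):
        sa = weighted_score(trajectories[a], weights)
        sb = weighted_score(trajectories[b], weights)
        if sa > sb:
            wins[a] += 1
        elif sb > sa:
            wins[b] += 1
        else:
            wins[a] += 0.5
            wins[b] += 0.5
    return wins

def rank(wins):
    """Tournament ranking: most wins first, names break ties deterministically."""
    return sorted(wins, key=lambda n: (-wins[n], n))

def ranking_stability(trajectories, weights, trials=200, noise=0.05, seed=0):
    """Fraction of noisy re-runs whose top entry matches the unperturbed ranking.

    Assumption: Gaussian noise on the metrics is one simple way to probe
    stability; the real tool may use a different statistical test.
    """
    rng = random.Random(seed)
    names = sorted(trajectories)
    baseline_top = rank(run_battles(names, trajectories, weights))[0]
    agree = 0
    for _ in range(trials):
        noisy = {n: {k: v + rng.gauss(0, noise) for k, v in m.items()}
                 for n, m in trajectories.items()}
        if rank(run_battles(names, noisy, weights))[0] == baseline_top:
            agree += 1
    return agree / trials

def final_scores(ranked, trajectories, weights, bonus=0.1):
    """Final scoring: weighted score plus a bonus that shrinks linearly with rank."""
    n = len(ranked)
    return {
        name: weighted_score(trajectories[name], weights)
              + bonus * (n - 1 - pos) / max(n - 1, 1)
        for pos, name in enumerate(ranked)
    }

# Hypothetical input data for illustration only.
trajectories = {
    "baseline": {"accuracy": 0.70, "speed": 0.90},
    "model_a": {"accuracy": 0.90, "speed": 0.60},
    "model_b": {"accuracy": 0.80, "speed": 0.80},
}
weights = {"accuracy": 0.7, "speed": 0.3}

sampled = sample_trajectories(trajectories, max_size=10)
ranking = rank(run_battles(sampled, trajectories, weights))
scores = final_scores(ranking, trajectories, weights)
stability = ranking_stability(trajectories, weights)
print(ranking, scores, stability)
```

In this sketch the bonus only widens the gaps between entries that the pairwise battles already separated; it never reorders them. The stability value depends on the chosen noise level, so it is best read as a relative indicator, not as a formal significance test.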
License
MIT License - see LICENSE file for details.
Credit
This implementation is inspired by and derived from the RULER (Relative Universal LLM-Elicited Rewards) framework originally developed by OpenPipe for AI evaluation and trajectory analysis in machine learning.
Contact
For questions, issues, or contributions: