Rigor is the moat. Most quant shops show only winners — we show the math, the backtests, and the verdicts on strategies that didn't pass validation. Live strategies remain undisclosed.
Hypotheses tested: 18
Killed outright: 11
Parked (insufficient data): 3
Under research / partial: 4
N-011 · May 2026 · KILLED
REFLEX — Multi-timeframe reflexive bounce
KILLED · no edge
Hypothesis
After volume-confirmed dump events, mid-frequency reflexive bounce trades on 30m/1h/2h/4h timeframes carry sufficient edge for systematic deployment with tight SL/TP.
Method
Deep multi-timeframe path-dependent backtest on 90 days of 5min klines across 78 symbols (T1+T2+T3). Realistic filters: BTC regime gate, time-of-day, 60min cooldown. Per-tier breakdown.
Results
30min TF (n=421): WR 38-42%, mean −0.10 to +0.01%
1h TF, best variant (n=479): WR 40-44%, mean −0.00 to +0.06%, monthly +1.0%
2h TF (n=695): WR 34-40%, mean −0.13 to −0.21%
4h TF (n=1089): WR 32-40%, mean −0.19 to −0.27%
Pump filter overlay (skip pumped >30%/24h): no improvement
Best Sharpe across all configs: 0.03
Earlier "validation" was wrong
The initial test on a random sample of n=100 claimed +16.8%/mo. The bootstrap CI on that sample was ±0.70% around a mean of +0.41%, which is statistically indistinguishable from zero. Scaling that noise by 30× trade frequency compounded it into a false-positive projection.
Mean falls inside zero band. No statistically significant edge.
KILLED
Killed before paper validation. Modules disabled via env, code archived but not running. No live capital deployed.
Never claim an edge from a single small sample. A confidence interval on a monthly projection requires n ≥ 300 trades with a proper out-of-sample split. Compounding noise gives the illusion of signal.
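A minimal sketch of the check that would have caught the false positive: a percentile bootstrap on per-trade returns, with an edge claimed only when the whole interval sits above zero. The function names and the synthetic sample are illustrative assumptions, not the code used in the original test.

```python
import random
import statistics

def bootstrap_mean_ci(returns, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean return."""
    rng = random.Random(seed)
    n = len(returns)
    means = []
    for _ in range(n_resamples):
        # Resample n trades with replacement and record the resampled mean.
        resample = [returns[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    lo = means[int(alpha / 2 * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

def has_edge(returns, **kwargs):
    """Claim an edge only if the entire interval sits above zero."""
    lo, _ = bootstrap_mean_ci(returns, **kwargs)
    return lo > 0.0

# Hypothetical data: 100 noisy per-trade returns (in %) with a small positive mean.
rng = random.Random(42)
sample = [rng.gauss(0.41, 3.5) for _ in range(100)]
lo, hi = bootstrap_mean_ci(sample)
print(f"mean={statistics.fmean(sample):+.2f}%  95% CI=[{lo:+.2f}%, {hi:+.2f}%]  edge={lo > 0}")
```

The projection step should only run when `has_edge` returns True; otherwise the monthly figure is compounded noise.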
N-014 · May 2026 · KILLED
PHOENIX — Capitulation reversal long
KILLED · path bias
Hypothesis
On 1473 historical dump events (≥6% drop in 60min with ≥3× volume burst), 87.9% saw price recover above pre-dump levels within 4h. Capitulation lows should be tradeable LONG with mean +7% return.
Math
Define an event as a bar window [t_0, t_1] with (p_{t_1} - p_{t_0}) / p_{t_0} ≤ θ_{drop} and window volume V_{[t_0, t_1]} ≥ k · V̄_{prior 6h}, where the tier-specific thresholds are θ_{drop} ∈ {-3%, -4%, -6%, -10%} and k ∈ {2.0, 2.5, 3.0, 4.0}.
True path-dependent simulation on 1m Binance futures klines for each event. Compared against naive close-only assumption.
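The event definition above can be sketched as a predicate over 1-minute bars. This is an illustration under stated assumptions: a 60-bar window, a 360-bar (6h) baseline, and volume compared as a per-bar average. `Bar` and `is_dump_event` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Bar:
    close: float
    volume: float

def is_dump_event(bars: List[Bar], t1: int, window: int = 60,
                  lookback: int = 360, theta_drop: float = -0.06,
                  k: float = 3.0) -> bool:
    """True if [t1 - window, t1] is a volume-confirmed dump."""
    t0 = t1 - window
    if t0 - lookback + 1 < 0:
        return False  # not enough history for the 6h volume baseline
    # Relative price change across the window.
    ret = (bars[t1].close - bars[t0].close) / bars[t0].close
    # Assumption: compare average per-bar volume inside the window
    # against average per-bar volume over the prior 6h.
    window_vol = sum(b.volume for b in bars[t0 + 1:t1 + 1]) / window
    prior_vol = sum(b.volume for b in bars[t0 - lookback + 1:t0 + 1]) / lookback
    return ret <= theta_drop and prior_vol > 0 and window_vol >= k * prior_vol
```

Each tier would call the predicate with its own `theta_drop` and `k` from the sets listed above.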
Results
Naive close-only sim (SL=10/TP=15): +1608% / 90d, Sharpe 17
Path-dependent sim (1m klines): −1494% / 90d
Events with price < peakC after entry: 1473 / 1473 (100%)
Mean further drawdown after dump: −5.98%
SL=5% trigger rate: 61.3% of trades
SL=10% trigger rate: 22.3%
KILLED
The 87.9% bounce statistic is an artifact of measuring at 4h close. In reality, price first dips further (avg −6%) before bouncing. Stops trigger before recovery. Adding any realistic SL destroys the edge. The bounce is real but untradeable for retail.
Aggregate statistics ≠ tradeable edge. Path dependency matters more than terminal distributions. Backtest entry, exit, AND every bar in between.
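To make the gap between the two simulations concrete, here is a hedged sketch for a long trade. How the original naive simulation applied SL/TP is not specified, so `close_only_pnl` simply clips the terminal return; checking the stop before the target inside a bar is a conservative assumption of ours.

```python
from typing import List, Tuple

def close_only_pnl(entry: float, final_close: float,
                   sl_pct: float, tp_pct: float) -> float:
    """Naive: look only at the terminal close, clipped to [-SL, +TP]."""
    ret = (final_close - entry) / entry * 100
    return max(-sl_pct, min(tp_pct, ret))

def path_dependent_pnl(entry: float, bars: List[Tuple[float, float, float]],
                       sl_pct: float, tp_pct: float) -> float:
    """Walk (high, low, close) bars in order; the first level touched wins."""
    sl_price = entry * (1 - sl_pct / 100)
    tp_price = entry * (1 + tp_pct / 100)
    for high, low, _close in bars:
        if low <= sl_price:   # stop checked first (conservative assumption)
            return -sl_pct
        if high >= tp_price:
            return tp_pct
    # Neither level hit: exit at the last close (timeout).
    return (bars[-1][2] - entry) / entry * 100 if bars else 0.0

# Hypothetical path: price dips 7% after entry, then recovers to +8%.
path = [(100.5, 93.0, 94.0), (101.0, 94.0, 100.5), (108.5, 100.0, 108.0)]
naive = close_only_pnl(100.0, path[-1][2], sl_pct=5, tp_pct=15)
real = path_dependent_pnl(100.0, path, sl_pct=5, tp_pct=15)
print(naive, real)
```

On this path the close-only view books a gain while the bar-by-bar walk is stopped out on the first bar, which is the same mechanism as in the PHOENIX results.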
N-015 · May 2026 · KILLED
GEOFLOW — Perelman L-distance filter
KILLED · sample bias
Hypothesis
Borrowing from Perelman's reduced-length functional from Ricci flow theory, "smooth" price approaches to a signal trigger (low geometric path energy) should correlate with higher follow-through than chaotic approaches.
Math — Perelman L-distance, discretized for price action
Where τ_k = k/N is normalized time position, σ²_k is local realized variance in 10-bar window centered on k, and r_k is the log-return at bar k.
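As an illustration only, the sketch below assumes one plausible discretization consistent with the symbols just defined: L = (1/N) · Σ_k √τ_k · (σ²_k + r_k²). Here σ²_k stands in for the curvature term and r_k² for the squared velocity in Perelman's functional. The exact form used in the test may differ, and the windowing at the series edges is our assumption.

```python
import math
from typing import List

def log_returns(prices: List[float]) -> List[float]:
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def local_realized_variance(r: List[float], i: int, window: int = 10) -> float:
    """Mean squared log-return in a window centred on index i (clipped at edges)."""
    half = window // 2
    chunk = r[max(0, i - half):min(len(r), i + half)]
    return sum(x * x for x in chunk) / len(chunk)

def l_distance(prices: List[float], window: int = 10) -> float:
    """Assumed discretization: (1/N) * sum_k sqrt(tau_k) * (sigma2_k + r_k^2)."""
    r = log_returns(prices)
    n = len(r)
    total = 0.0
    for k in range(1, n + 1):
        tau = k / n                      # normalized time position
        r_k = r[k - 1]                   # log-return at bar k
        sigma2 = local_realized_variance(r, k - 1, window)
        total += math.sqrt(tau) * (sigma2 + r_k * r_k)
    return total / n

# Hypothetical paths: a smooth drift versus a choppy zigzag.
smooth = [100.0 * 1.001 ** i for i in range(41)]
choppy = [100.0 * (1.02 if i % 2 else 1.0) for i in range(41)]
print(l_distance(smooth), l_distance(choppy))
```

Under this form a smooth approach yields a lower L than a chaotic one, which is the property the hypothesis relies on.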
Results — appeared promising on curated sample
Q1 (lowest L) WR on pump-only sample: 76.1%, avg +4.37%
Q5 (highest L) WR on pump-only sample: 52.1%, avg +2.53%
Walk-forward TEST > TRAIN (suspicious): +5.52% vs +4.67%
DD reduction vs no-filter: −72%
Reality check — 5000 random "smart-money interesting" moments
RAVEN filter (pre-condition) pass rate: 0.50% (25/5000)
Of those, true positives: 24%, not the 49% that CV claimed
GEOFLOW filter applied on top: removed 100% of TPs, kept all FPs
Final P&L: −$2.24 / $100 over 12 trades
KILLED
L-distance discriminates within pump-only labeled set, but the relationship inverts on the real-world distribution. FPs in the smart-money universe have lower L than TPs. The earlier "Sharpe 18" was a sample-bias artifact. Curated samples deceive.
A filter that works on labeled positives may not generalize. Always validate on the realistic operating distribution, including false positives.
N-016 · May 2026 · PARTIAL
V14 QUANT scoring inversion
PARTIAL · weak real edge
Hypothesis
The internal scoring function of a legacy strategy (V14 QUANT, stored in 14.6M signalsnapshots) may be overfit. Higher quality scores may not correlate positively with realized P&L.
Method
Random sample 5000 snapshots. For each: fetch forward 1m klines, simulate entry at the snapshot's own entryTop/Bot zone, walk forward bar-by-bar checking SL/TP/timeout. Compute realized pnlPct per trade. Then rank-correlate features vs outcome.
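The last step, rank-correlating a feature against realized outcome, can be sketched with a dependency-free Spearman correlation. The helper names and sample values are hypothetical; a negative coefficient between quality score and `pnlPct` is what an inverted scoring function would look like.

```python
from typing import List, Sequence

def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            out[idx] = avg
        i = j + 1
    return out

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no rank information
    return cov / (vx * vy) ** 0.5

def spearman(feature: Sequence[float], outcome: Sequence[float]) -> float:
    """Spearman rho = Pearson correlation of the ranks."""
    return pearson(ranks(feature), ranks(outcome))

# Hypothetical snapshots: higher score, lower realized pnlPct.
score = [55, 60, 72, 80, 91]
pnl_pct = [1.2, 0.8, 0.1, -0.4, -1.5]
print(spearman(score, pnl_pct))
```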
Real but small. Inverting the strategy's own scoring captures a +3-6%/month edge at 5× leverage. Strong decay TRAIN→TEST suggests the edge is regime-dependent. Worth deploying as overlay filter, not standalone.
Even "your own" scoring system can be wrong. Test the inverse. Overfit detection often hides in plain sight.
N-017 · May 2026 · PLANNED
LPPL — Log-Periodic Power Law (Sornette)
PLANNED · implementation
Hypothesis
Following Sornette's Log-Periodic Power Law model, financial bubbles exhibit faster-than-exponential growth with log-periodic oscillations preceding a critical time tc. We test whether crypto's accelerated bubble cycles (days–weeks rather than months) yield tractable tc predictions on BTC, ETH, and top alts.
Sornette (1996–present), Financial Crisis Observatory, ETH Zurich
27+ peer-reviewed papers documenting fits on major historical bubbles (1929, 1987, 2000, 2007)
Mixed crypto results: BTC 2017 and 2021 peaks fit well ex-post; out-of-sample mixed
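For reference, the standard LPPL form is ln p(t) = A + B(tc − t)^m + C(tc − t)^m · cos(ω ln(tc − t) − φ), defined for t < tc. The sketch below only evaluates the curve; fitting (typically a nonlinear search over tc, m and ω, with 0 < m < 1 and B < 0 for a bubble) is not shown, and the function name is hypothetical.

```python
import math

def lppl_log_price(t: float, tc: float, m: float, omega: float,
                   a: float, b: float, c: float, phi: float) -> float:
    """Log-price under the LPPL model at time t, for t strictly before tc."""
    if t >= tc:
        raise ValueError("LPPL is defined only for t < tc")
    dt = tc - t
    power = dt ** m
    # Power-law trend plus log-periodic oscillation that accelerates near tc.
    return a + b * power + c * power * math.cos(omega * math.log(dt) - phi)

# Hypothetical parameters: with B < 0 and 0 < m < 1 the trend rises
# faster and faster as t approaches the critical time tc.
far = lppl_log_price(5.0, tc=10.0, m=0.5, omega=8.0, a=10.0, b=-1.0, c=0.0, phi=0.0)
near = lppl_log_price(9.0, tc=10.0, m=0.5, omega=8.0, a=10.0, b=-1.0, c=0.0, phi=0.0)
print(far, near)
```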
PLANNED
Implementation queued. Edge probability estimated 30–50% based on Sornette's published track record. Crypto-specific high-frequency LPPL with multi-asset confluence has no public retail-grade implementation we are aware of.
N-007 · May 2026 · INSUFFICIENT DATA
MONITORING 2.0 — Pre-announcement drift detector
PARKED · n=3
Hypothesis
Binance MONITORING-tag announcements (volatility-warning labels) are preceded by detectable volume + range anomalies in the 6h window before the public announcement. If true, a pre-event SHORT detector could capture the post-announcement drop before the crowd reads the news.
Method
Phase 0: scrape Binance CMS catalog (catalogId=49). 20 pages × 50 articles. For each: identify coin, check if it has Binance USDT-M futures listing, fetch 1m klines spanning [T-6h, T+1h] around announcement. Compute pre-event drift, volume ratio, range expansion.
Results
Articles fetched: 16 MONITORING tags
With futures-tradable coin: 4
With sufficient kline history: 3
Pre-event drift direction: 3/3 DOWN (−2.3%, −13.0%, −19.0%)
Sample size for inference: n=3 (insufficient)
100% downward drift in pre-event window is suggestive but n=3 is noise. Bootstrap 95% CI on n=3 spans entire ±50% range.
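A quick way to see why 3/3 is noise is an exact one-sided sign test. It assumes that, under the null, each pre-event window drifts down with probability 0.5. The function names are illustrative.

```python
from math import comb

def one_sided_sign_test_p(n_down: int, n: int) -> float:
    """P(at least n_down of n windows drift down | p_down = 0.5)."""
    return sum(comb(n, k) for k in range(n_down, n + 1)) / 2 ** n

def min_unanimous_n(alpha: float = 0.05) -> int:
    """Smallest n at which an all-down sample reaches p < alpha."""
    n = 1
    while one_sided_sign_test_p(n, n) >= alpha:
        n += 1
    return n

print(one_sided_sign_test_p(3, 3))  # 0.125
print(min_unanimous_n())            # 5
```

Three unanimous moves occur 12.5% of the time by chance alone. Even a perfectly unanimous sample needs at least five events to clear the 5% level, and any realistic mix of up and down moves needs far more.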
Why sample so small
Most MONITORING tags target spot-only altcoins, not futures-tradable
CMS pagination rate-limited beyond ~16 articles
Older announcement pages removed from CMS archive
PARKED
Hypothesis remains UNTESTED, not rejected. Need n ≥ 30 via archive.org scraping or spot-price analysis. Re-attempt when broader historical data sourced.
A promising directional signal (100% down-drift) at n=3 means nothing. As a working rule of thumb, we require n ≥ 30 before running non-parametric tests and n ≥ 100 before making any robust claim.
N-004 · 2026 · KILLED
Hyperliquid whale-mirror copy strategy
KILLED · execution mismatch
Hypothesis
Top-PnL Hyperliquid traders publish every position on-chain (HL's order book is fully transparent). Mirroring their entries to Binance USDT-M futures should capture a fraction of their edge — particularly on majors (BTC, ETH, SOL) where slippage is minimal.
Method
Identify top-20 wallets by 30-day PnL on HL. WebSocket listen to their position changes. When a tracked wallet opens a position above $100K notional, fire a mirroring order on Binance same coin, same side.
Why it failed in practice
Latency mismatch: HL whale fills are typically maker-limit at deep liquidity. Any follower with public-API latency enters 0.3-1.5% worse than the originator on average.
Position duration mismatch: Whales hold hours-to-days with average DCA-in over multiple fills. Our single-shot mirror catches only the entry tip, exits poorly.
Wallet attribution: Same person operates multiple wallets. Composite NET exposure ≠ individual wallet signal.
Selection bias: "Top 30-day PnL" includes survivorship + recency bias. Wallets fall off the leaderboard the moment they have a drawdown.
Reverse-MEV risk: When public HL whale wallets are watched, they sometimes intentionally fake entries to trap copy-traders.
KILLED
Tested in earlier sessions, failed. Hard rule established: never re-propose HL wallet-copy strategies regardless of who suggests them. Architecture flaw, not parameter flaw.
Transparent doesn't mean tradeable. Public information that can only be exploited by executing faster than its originator carries zero edge for slower followers.
N-005 · 2026 · KILLED
DELISTING auto-short on Binance announcements
KILLED · adverse selection
Hypothesis
Binance DELISTING announcements (catalogId=48) trigger immediate panic dumps. Auto-shorting the announced coin within seconds of announcement should capture −15 to −40% over 24-72h.
Why it failed
Massive gap-down on first fill: by the time the futures book has enough liquidity to short, price has already dumped 20-40%. Average entry slippage is 8-15% from the announcement price.
Spot dump > futures dump: spot market panics first; futures follows but at different magnitude due to existing long liquidations forcing a bounce.
Squeeze risk: after initial dump, surviving longs squeeze shorts on thin liquidity. 30-40% counter-bounces common within 24h.
Liquidity collapse: trading volume on delisted coins evaporates within 12h, making exit difficult without significant slippage.
Distinction from MONITORING tag
MONITORING tags (catalogId=49, BURST-class strategies) are different — they happen pre-announcement with intact liquidity. DELISTING auto-short specifically refers to acting AFTER public announcement, which has structural reasons to fail.
KILLED
Hard rule: never auto-short on DELISTING announcements. MONITORING (different category) remains a valid signal source under different mechanics.
Reactive shorts on widely-broadcast events are race-conditioned against the entire market. Edge requires either earlier information or different mechanics (pre-announcement detection like MONITORING 2.0).
Aggregating signal feeds from multiple "smart money" sources (HL alerts, on-chain whale flows, exchange leaderboards) into a unified scoring system should produce above-baseline performance through diversification of signal sources.
Live result
Period: 7 days live, Q1 2026
P&L: −8.94%
Win-rate: ~38%
Average loss size: 2.4× average win size
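The win rate and loss-to-win ratio alone already imply negative expectancy. A short arithmetic sketch, measuring everything in units of the average win (function names are illustrative):

```python
def expectancy_in_win_units(win_rate: float, loss_to_win_ratio: float) -> float:
    """Expected P&L per trade, in units of the average winning trade."""
    return win_rate * 1.0 - (1.0 - win_rate) * loss_to_win_ratio

def breakeven_win_rate(loss_to_win_ratio: float) -> float:
    """Win rate at which expectancy is exactly zero."""
    return loss_to_win_ratio / (1.0 + loss_to_win_ratio)

print(expectancy_in_win_units(0.38, 2.4))  # about -1.108 average wins per trade
print(breakeven_win_rate(2.4))             # about 0.706
```

With losses 2.4× the size of wins, the system would have needed a win rate near 71% just to break even, against the ~38% observed.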
Root cause analysis
Signal aggregation amplifies false signals: when 3 independent feeds say BUY simultaneously, often it's because they're all reading the same already-developing pump that's about to reverse
No regime filter: signals were weighted equally in trending and choppy regimes, and the choppy regime accounted for 90% of losses
Implicit correlation between "smart money" sources: HL whales, Twitter influencers, and on-chain trackers all watch each other. Independent signals are not actually independent.
KILLED
−8.94% in 7 live days. Killed and replaced. Lesson reinforced for all future copy-style approaches.
Signal sources presented as "independent" rarely are. Test correlation between source returns before aggregating. Diversification doesn't help when underlying signals share hidden common factors.
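A minimal sketch of that pre-aggregation check: compute pairwise Pearson correlation between per-source return series and flag pairs above a threshold. The feed names, the sample returns and the 0.7 cutoff are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def correlated_pairs(returns_by_source: Dict[str, List[float]],
                     threshold: float = 0.7) -> List[Tuple[str, str, float]]:
    """Return source pairs whose return series are too correlated to count as independent."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(returns_by_source.items(), 2):
        rho = pearson(a, b)
        if abs(rho) >= threshold:
            flagged.append((name_a, name_b, round(rho, 3)))
    return flagged

# Hypothetical per-signal returns (%) from three feeds over the same five events.
feeds = {
    "hl_alerts": [1.0, -2.0, 3.0, -1.0, 2.0],
    "onchain_flows": [1.1, -1.9, 2.8, -1.2, 2.1],
    "leaderboard": [0.5, -0.4, -0.3, 0.6, -0.2],
}
flagged = correlated_pairs(feeds)
print(flagged)
```

Flagged pairs should be merged or down-weighted before scoring, so that agreement between them is not counted as independent confirmation.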
N-008 · May 2026 · BLOCKED
KAIROS — Surgical liquidation-flush long
BLOCKED · data dep
Hypothesis
After a forced-liquidation cascade ($1M+ in 5min on a single coin), forced sellers are exhausted and price overshoots fair value. Entry on first reversal candle with tight 1-3% SL captures the mean-reversion bounce with high win-rate.
Entry trigger: Cascade(t) = 1 AND r5m(t) ≤ −3% AND volume spike ≥ 2σ. SL at cascade-low minus 0.5% buffer. TP at Fibonacci 38-61% retracement of cascade move.
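The trigger and levels above can be sketched as follows. Two assumptions are ours: "volume spike ≥ 2σ" is read as a volume z-score of at least 2, and "Fibonacci 38-61%" is read as the standard 38.2% and 61.8% retracement ratios. All names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class KairosLevels:
    stop_loss: float
    tp_low: float   # 38.2% retracement of the cascade move
    tp_high: float  # 61.8% retracement of the cascade move

def entry_triggered(cascade_flag: bool, ret_5m: float, volume_zscore: float) -> bool:
    """Cascade(t) = 1 AND r5m(t) <= -3% AND volume spike >= 2 sigma."""
    return cascade_flag and ret_5m <= -0.03 and volume_zscore >= 2.0

def kairos_levels(cascade_high: float, cascade_low: float,
                  sl_buffer: float = 0.005) -> KairosLevels:
    """SL sits 0.5% below the cascade low; TP band retraces part of the move."""
    move = cascade_high - cascade_low
    return KairosLevels(
        stop_loss=cascade_low * (1 - sl_buffer),
        tp_low=cascade_low + 0.382 * move,
        tp_high=cascade_low + 0.618 * move,
    )

# Hypothetical cascade from 100 down to 90.
levels = kairos_levels(100.0, 90.0)
print(levels)
```

The stop distance depends on where the reversal candle enters relative to the cascade low, which is why bar resolution matters for the 1-3% SL target.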
Why blocked
Public Binance API gives 5min granularity only — insufficient for 1-3% SL precision on T2/T3 alts (normal noise = 2-5% per 5min)
Real-time liquidation aggregate at per-coin per-minute resolution is paywalled (Coinglass Standard $79/mo)
Without Coinglass: cascade detection is post-hoc only, not real-time
Tested 5 alternative entry patterns on public 5min data (flush bounce, breakout pullback, absorption, funding flip, squeeze) — ALL negative EV at SL ≤ 3%
BLOCKED
Mathematically blocked by infrastructure tier. Resume on Coinglass paid plan activation. Target: 70%+ WR with SL 1-3% on liquid alts.
Strategy viability is data-resolution-bound. Some edges require sub-minute liquidation feeds that public APIs don't expose. Pay for the data or build a different strategy.
N-012 · May 2026 · PARKED
True HFT scalping (microsecond)
PARKED · infrastructure gap
Hypothesis
Sub-second mean reversion on tick data is profitable for sufficiently fast operators.
Structural barriers
Colocation in exchange datacenter required (~$5–10K/mo per venue)
FPGA / kernel-bypass networking for sub-100μs latency
FIX or proprietary binary protocols, not REST
Capital floor for meaningful PnL at 0.5–2 bps/trade is enterprise-tier, not retail
Pivoting to mid-frequency (1–5min hold) as accessible alternative for the broader retail/prop universe.
PARKED
Real edge exists in this space but is structurally unreachable without enterprise-tier infrastructure. Not killed; revisitable if scale and infra justify the investment.
Want this rigor on your strategy?
We backtest, validate, and kill strategies systematically — until something survives. If you have an algorithm idea, or want access to our live signals, start here.