The system, and what it returns
HV-RSI buys short-term dips in strong S&P 500 names via a limit order a few percent below the close, and holds for about a week. It is in the market rarely — it holds a position on ~38% of trading days and deploys only ~8% of capital on average — so its raw return is the wrong lens. The questions that fit are whether the per-trade signal is real, whether it survives out-of-sample, and what the deployed slice earns. Across 21 years of survivorship-bias-free history, on split- and dividend-adjusted prices:
- The signal separates from random. Run a random dip-buyer through the identical machinery and its risk-normalized CAR25 clusters near zero (median ≈ 0) while it deploys 4–8× more capital. HV-RSI clears 100% of those 40 cash-matched monkeys — every seed lands below it — in-sample and out-of-sample.
- It holds up out-of-sample on large caps. On the S&P 500 the win rate is flat (64.9% → 64.9%) and profit factor improves (1.55 → 1.61) from the 2005–17 window to the 2018–26 window.
- The deployed slice compounds at ~11%/yr. Raw CAGR is 4.0% because the book sits in cash most of the time; measured over time-in-market only, the in-market rate is ~11%/yr, at a −13.4% maximum drawdown against the index’s −55%.
- Selectivity is the source of the edge. Every tested change that raises capital use (looser entry, fewer filters, more slots) lowers the risk-normalized return.
- Who it is for: an allocator who wants a low-correlation, capital-light return stream — a portfolio sleeve that earns on a small slice and leaves the rest of the book free for cash yield or other strategies. As a standalone return engine it is structurally under-deployed.
| Stage | Status |
|---|---|
| S1 · Feasibility & signals | done |
| S2 · Entry quality vs random | done |
| S3 · Cadence & regime | done |
| S4 · Portfolio P&L vs buy & hold | done |
| S5 · Robustness | done |
| S6 · Publication | done |
| S7 · Promotion | planned |
| S8 · Maintenance | planned |
- The system
- Setup & data
- The system vs. buying the index (and why exposure matters)
- A trade, drawn out (sample chart)
- Skill or luck? the cash-matched monkey
- Does it hold up out-of-sample?
- Can you deploy more capital?
- Metrics by lookback
- Sizing out-of-sample
- Exposure & holding period
- What this means for you
- Costs, slippage & cash
- Constraints
- Code, data & artifacts
1. The system
A short-term mean-reversion system specified by Glenn Osborne (January 2025). The thesis: a strong stock making fresh short-term lows on consecutive days is likely to bounce within a week, and a limit order a few percent below the close fills only when intraday selling reaches that price.
Entry signal
Fill
Liquidity filters
Exits
Sizing
Ranking
2. Setup & data
The universe uses point-in-time, survivorship-bias-free (SBF) S&P 500 membership: each historical date resolves to the names that were index members on that date, including those later delisted, acquired, or removed.
Universe (index membership — partial list)
| Item | Value |
|---|---|
| Universe | S&P 500, SBF membership (~959 ever-members) |
| Window | 2005-01-03 → 2026-05-15 (21.4y) |
| Slots | 20 × 10% of equity |
| Starting capital | $100,000 |
3. The system vs. buying the index (and why exposure matters)
Finding
On raw return the system trails buy-and-hold (4.0% vs 10.9% a year) because it holds cash most of the time. It is a capital-light book that is rarely exposed, then strikes. The applicable measures are the per-trade edge and the risk-normalized return on the slice it deploys; raw CAGR measures a different exposure profile.
| Metric | HV-RSI system | Buy & hold SPY |
|---|---|---|
| CAGR (as tested, 0% on cash) | 4.02% | 10.88% |
| Exposure-adjusted return (in-market only) | ~11.0% | 10.88% |
| Average capital deployed | ~7.7% | 100% |
| Max drawdown | −13.35% | −55.19% |
| Win rate | 65.0% | — |
| Profit factor | 1.58 | — |
| N trades / avg hold | 1,096 / 3.9 days | — |
| Final equity ($100k) | $232,089 | $908,276 |
The book runs at roughly one-thirteenth the capital at risk and one-quarter the drawdown (−13.4% vs −55.2%). The slice it deploys compounds at ~11%/yr (§10); the constraint that binds the system is how rarely it can deploy capital, addressed by the risk-normalized CAR25 in §5.
Year by year — monthly returns
System monthly returns: green positive, red negative, color intensity scaled to size. Yr% is the system’s full-year return; SPY is buy-and-hold for context. The system rarely moves much in any month (it is mostly cash) yet held up in the years the index fell hardest: +4.4% in 2008 against SPY’s −36.2%, and −2.0% in 2022 against −18.6%. 2026 is year-to-date through May.
Full monthly-returns grid (2005–2026), in %
| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Yr% | SPY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2005 | +0.5 | +0.2 | -0.3 | +1.0 | +1.4 | +0.0 | +0.1 | +0.5 | +0.4 | +4.7 | +0.0 | +0.4 | +9.2 | +5.3 |
| 2006 | +0.0 | -0.7 | +1.1 | +0.4 | -1.3 | +0.7 | +0.6 | +0.0 | -0.4 | -1.4 | -0.3 | +0.1 | -1.1 | +13.8 |
| 2007 | +0.6 | +0.2 | -2.2 | +0.0 | +0.3 | +0.8 | -2.4 | +0.6 | +0.0 | +0.0 | +0.2 | +0.4 | -1.4 | +5.3 |
| 2008 | +1.8 | +0.0 | +0.2 | +0.4 | +0.4 | +1.0 | +2.9 | +0.0 | -0.9 | -1.4 | +0.0 | +0.0 | +4.4 | -36.2 |
| 2009 | +0.0 | +0.0 | +0.0 | +0.3 | +1.8 | +1.1 | +2.8 | +0.0 | +1.6 | -3.5 | +3.6 | +0.0 | +7.8 | +22.7 |
| 2010 | +0.6 | +0.2 | -0.2 | -0.7 | +1.2 | -1.3 | +5.1 | +0.9 | +0.0 | +0.1 | +0.4 | +0.0 | +6.3 | +13.1 |
| 2011 | +0.6 | +1.3 | +1.4 | +0.2 | +2.8 | +0.4 | +0.5 | -8.2 | -0.1 | +0.8 | +0.1 | +0.9 | +0.3 | +0.9 |
| 2012 | +0.2 | +0.0 | +0.8 | +1.7 | -1.9 | +0.0 | +0.5 | +0.0 | +0.0 | -0.3 | +0.7 | +0.0 | +1.7 | +14.2 |
| 2013 | -0.1 | -0.1 | +0.2 | +2.2 | -0.0 | +2.8 | -0.2 | -0.0 | +0.3 | +0.5 | -0.4 | -0.9 | +4.3 | +29.0 |
| 2014 | -0.3 | +0.0 | +0.4 | +1.1 | +0.0 | +0.0 | -1.6 | +1.1 | +0.1 | +4.3 | +0.4 | +1.3 | +6.9 | +14.6 |
| 2015 | +0.7 | +0.4 | +0.3 | -0.1 | +0.0 | +0.1 | -1.2 | +2.5 | -0.4 | +0.0 | +0.3 | +0.4 | +3.0 | +1.3 |
| 2016 | -0.3 | +0.0 | +0.0 | +0.1 | +0.0 | -0.2 | +0.6 | -0.1 | -0.5 | -0.1 | -0.1 | +0.0 | -0.5 | +13.6 |
| 2017 | +0.8 | -0.6 | -1.0 | +0.2 | -0.2 | +0.0 | +0.6 | +0.0 | +0.1 | +0.4 | -0.1 | +0.4 | +0.6 | +20.8 |
| 2018 | +0.1 | -1.4 | +0.4 | +0.0 | -1.1 | -0.4 | -0.1 | +0.2 | -0.1 | -3.5 | -2.2 | -0.6 | -8.4 | -5.2 |
| 2019 | +0.0 | +0.0 | +0.9 | +1.2 | -0.1 | +1.3 | +0.2 | -0.1 | +0.0 | +0.0 | +1.7 | +0.3 | +5.4 | +31.1 |
| 2020 | -2.0 | -4.4 | +2.1 | +0.0 | +0.4 | +0.9 | +0.2 | +0.0 | +2.7 | +2.5 | +4.0 | +1.1 | +7.4 | +17.2 |
| 2021 | +1.9 | +3.4 | +10.5 | +0.4 | +0.0 | -0.1 | +3.4 | +0.8 | +2.9 | +1.0 | -0.4 | +0.7 | +27.0 | +30.5 |
| 2022 | +2.6 | +4.8 | +0.5 | +0.8 | -1.8 | -6.3 | +0.1 | +0.1 | -1.3 | +0.0 | +0.0 | -1.1 | -2.0 | -18.6 |
| 2023 | -0.2 | +0.3 | -2.5 | +0.0 | -0.3 | +0.7 | +0.0 | +0.8 | +0.9 | -0.2 | +0.6 | +0.4 | +0.2 | +26.7 |
| 2024 | +1.6 | +0.0 | +0.3 | +2.9 | +1.4 | +0.4 | +0.5 | +1.1 | +0.5 | -0.0 | +0.5 | -1.2 | +8.4 | +25.6 |
| 2025 | -0.4 | +4.9 | -1.3 | +4.4 | -0.0 | +0.4 | +0.0 | +0.5 | -0.4 | +2.2 | -0.2 | +0.3 | +10.7 | +18.0 |
| 2026* | +1.0 | -0.9 | -0.0 | -0.1 | +0.3 | · | · | · | · | · | · | · | +0.3 | +8.5 |
4. A trade, drawn out
One symbol from the run, with the system’s level and decisions on real (split-adjusted) price: the 150-day SMA the trend filter uses, entry marks (green) where a 3%-below limit filled after two consecutive short-term lows, and exit marks colored by reason — the bounce (close above prior high) or the 5-day time stop.
Annotated sample — KLAC (S&P 500)
5. Skill or luck? the cash-matched monkey
The monkey baseline here is cash-matched: it runs the identical machinery (same SMA150 / price / volume gates, same 3%-below limit, same exit, same 20 slots) and replaces only the short-term-low signal with a random draw from the eligible pool. 40 seeds per window, scored on canonical risk-normalized CAR25.
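The baseline described above can be sketched in a few lines. This is a minimal illustration, not the code in develop_eval.py: `random_signal`, `pick_prob`, and `percentile_cleared` are hypothetical names, the tickers and scores are placeholders, and because the text does not state how the monkey decides how many names to flag each day, a per-name coin flip stands in for that step.

```python
import random

def random_signal(eligible, pick_prob, rng):
    """Monkey entry: flag each eligible name by coin flip instead of the
    short-term-low rule. pick_prob is a hypothetical knob, not a source value."""
    return [symbol for symbol in eligible if rng.random() < pick_prob]

def percentile_cleared(system_score, monkey_scores):
    """Share of monkey runs (in %) that score strictly below the system."""
    below = sum(score < system_score for score in monkey_scores)
    return 100.0 * below / len(monkey_scores)

# One seeded generator per monkey run keeps every run reproducible.
eligible_today = ["AAA", "BBB", "CCC", "DDD"]  # placeholder tickers
picks = random_signal(eligible_today, pick_prob=0.5, rng=random.Random(7))

# Illustrative scores only: 40 monkeys drawn near zero vs. a system CAR25 of 4.16.
monkey_car25 = [random.Random(seed).uniform(-1.0, 1.0) for seed in range(40)]
print(picks, percentile_cleared(4.16, monkey_car25))
```

Everything downstream of the signal (gates, limit, exit, slots) would be shared between the system and the monkey, so the percentile isolates the selection rule alone.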
Finding
HV-RSI clears 100% of cash-matched monkeys in both windows — every one of the 40 random seeds lands below it. The monkey CAR25 clusters near zero (median ≈ 0) while each deploys 4–8× more capital (random names dip 3% just as often; those fills bounce less, at a monkey win rate ~57–58% against the system’s 65%). The “two consecutive short-term lows in an uptrend” selection is what separates it from random, and the separation is the same magnitude in-sample and out-of-sample.
| Window | HV-RSI CAR25 | HV exposure | HV win rate | Monkey CAR25 (p50) | Monkey exposure | Monkey win rate | HV percentile |
|---|---|---|---|---|---|---|---|
| In-sample 2005–17 | +4.16% | 4.9% | 64.8% | −0.04% | ~33% | ~57% | 100% |
| Out-of-sample 2018–26 | +6.39% | 11.8% | 64.8% | −0.08% | ~38% | ~58% | 100% |
CAR25 is the canonical risk-normalized annual return (Bandy safe-f at a −20% drawdown constraint), the same metric used to rank systems across the platform; it excludes cash interest. The system’s safe-f pins at the leverage ceiling in every window — the −20% drawdown limit never binds — which means its constraint is capital deployment, not risk (it is so rarely invested that even heavy leverage doesn’t reach a 20% drawdown). See §9.
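The general idea behind a drawdown-constrained fraction and CAR25 can be sketched with a small Monte Carlo. This is an illustration of the approach as described in Bandy's published work (resample trades, find the largest fraction whose 95th-percentile drawdown stays inside the limit, then read the 25th-percentile annual return), not the platform's car25_eval.py. Every function name, the 100-trade horizon, the path count, and the sample returns are assumptions made for the sketch.

```python
import random

def percentile(values, q):
    """Nearest-rank percentile of a list, q in [0, 100]."""
    ordered = sorted(values)
    return ordered[round(q / 100.0 * (len(ordered) - 1))]

def simulate(trade_returns, fraction, n_trades=100, n_paths=300, seed=1):
    """Resample trades with replacement; return (final_equity, max_drawdown) per path."""
    rng = random.Random(seed)  # fixed seed: identical paths for every fraction tested
    paths = []
    for _ in range(n_paths):
        equity = peak = 1.0
        worst = 0.0
        for _ in range(n_trades):
            equity *= max(0.0, 1.0 + fraction * rng.choice(trade_returns))
            peak = max(peak, equity)
            worst = max(worst, 1.0 - equity / peak)
        paths.append((equity, worst))
    return paths

def dd95(trade_returns, fraction):
    """95th-percentile maximum drawdown across the simulated paths."""
    return percentile([dd for _, dd in simulate(trade_returns, fraction)], 95)

def safe_f(trade_returns, dd_limit=0.20, ceiling=3.0):
    """Largest fraction (capped at ceiling) whose DD-95 stays inside dd_limit."""
    if dd95(trade_returns, ceiling) <= dd_limit:
        return ceiling  # constraint is slack, so the fraction pins at the ceiling
    lo, hi = 0.0, ceiling
    for _ in range(20):  # bisection on the fraction
        mid = (lo + hi) / 2.0
        if dd95(trade_returns, mid) <= dd_limit:
            lo = mid
        else:
            hi = mid
    return lo

def car25(trade_returns, fraction, trades_per_year, n_trades=100):
    """25th-percentile compound annual return at the given fraction."""
    years = n_trades / trades_per_year
    paths = simulate(trade_returns, fraction, n_trades)
    return percentile([eq ** (1.0 / years) - 1.0 for eq, _ in paths], 25)

# Hypothetical book-level returns per trade: small, because little capital is deployed.
quiet_trades = [0.002, 0.001, -0.001, 0.003, -0.002]
f = safe_f(quiet_trades)
print(f, round(car25(quiet_trades, f, trades_per_year=50), 4))
```

With returns this small the drawdown limit never binds and `safe_f` returns the ceiling, which is the same qualitative behavior the text reports for the system.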
6. Does it hold up out-of-sample?
Finding
The parameters are taken verbatim from the source specification (nothing was fit here), so an in-sample / out-of-sample split asks “does the same fixed rule keep working on unseen data?” On the S&P 500 the edge is flat-to-improving out-of-sample: the win rate holds and the profit factor rises.
| Metric | S&P 500 in-sample (2005–17) | S&P 500 out-of-sample (2018–26) |
|---|---|---|
| Trades | 558 | 539 |
| Win rate | 64.9% | 64.9% |
| Profit factor | 1.55 | 1.61 |
| CAGR (ex-interest) | +3.13% | +5.40% |
| Max drawdown | −12.75% | −13.36% |
| CAR25 (risk-normalized) | +4.16% | +6.39% |
Small-cap cross-check (next test). Applying the same rules to the Russell 2000 would test whether the edge is universe-dependent. Russell membership is not available in this run’s environment, so the cross-check has not been run on the split-adjusted prices used here and is not reported as a current result; running it on adjusted data is the next test. Windows: in-sample 2005-01 → 2017-12, out-of-sample 2018-01 → 2026-05.
7. Can you deploy more capital?
Finding
Since the system is capital-constrained, the question is whether a tweak can put more money to work while holding the edge. No tested lever raises out-of-sample risk-normalized CAR25 above the baseline. Selectivity is the source of the edge; loosening it admits the random-like fills from §5 that return near-zero CAR25.
| Lever (out-of-sample) | Trades | Win rate | Exposure | CAR25 | vs baseline |
|---|---|---|---|---|---|
| baseline (3% limit, 2-day, 20 slots) | 528 | 64.8% | 11.8% | +6.39% | — |
| limit 2% | 908 | 61.5% | 16.6% | +4.58% | lower |
| limit 1% | 1,567 | 60.1% | 25.0% | −0.04% | to zero |
| limit 0% (at close) | 2,321 | 57.6% | 40.7% | −0.17% | to zero |
| 1 day instead of 2 | 1,363 | 61.7% | 22.9% | +2.72% | lower |
| 30 slots | 545 | 64.0% | 11.9% | +4.26% | lower |
| 40 slots | 544 | 64.2% | 12.0% | +3.14% | lower |
Two patterns: looser entry multiplies the trade count while the marginal fills mean-revert less (win rate falls to ~58–62% and risk-normalized return falls toward zero); more slots keep the 64% win rate and dilute into weaker-ranked names. The baseline parameters sit at the risk-normalized efficient point. This is the ablation that measures how the edge responds to scaling the deployed capital.
Q. If it usually holds only one or two positions, why not size each at ~33% of the portfolio instead of ~15%?
Because the one-or-two figure is an average, not a ceiling, and the per-slot size is set by the drawdown budget rather than by typical concurrency.
- Signals cluster. On a dislocation day many names reach the buy zone at once — e.g. 2026-03-20 opened FCX, XEL, NEM, BALL and O together (§10 blotter). The twenty small slots exist to absorb those bursts, and §5 shows the edge is richest exactly when dips cluster. Sizing each position at 33% caps the book at three names and forces it to forgo the clustered fills where the mean-reversion edge concentrates.
- Each slot is already sized at the drawdown ceiling. §9 fits the Bandy safe-f at a −20% DD-95 target; it pins at the 3.0× leverage ceiling, which is safe-f ÷ 20 = ~15% per slot. That fraction is calibrated so a worst-case cluster of up to twenty concurrent dip-buys still holds DD-95 inside −20%. At 33% per slot, any cluster larger than three breaches 100% deployment and blows through that drawdown budget.
- Low exposure is not timid sizing. §9’s finding is explicit — “sizing is not the binding constraint; capital deployment is.” Each slot is already as large as the drawdown budget allows; the book is simply invited in rarely. Concentrating into three larger positions swaps the breadth that catches clusters for single-name risk the safe-f sizing exists to cap — which the §7 ablation above shows lowers the risk-normalized return, not raises it.
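The slot arithmetic in the answer above can be checked directly. A small sketch using only figures quoted in the text (3.0× ceiling, 20 slots, the five-name cluster of 2026-03-20); the function names are hypothetical.

```python
SAFE_F_CEILING = 3.0  # leverage ceiling quoted in the text
SLOTS = 20

def slot_fraction(safe_f=SAFE_F_CEILING, slots=SLOTS):
    """Per-slot size as a fraction of equity: safe-f divided by the slot count."""
    return safe_f / slots

def deployment(n_positions, per_slot):
    """Fraction of equity deployed when n_positions slots are filled."""
    return n_positions * per_slot

per_slot = slot_fraction()  # 3.0 / 20 = 0.15
for n in (1, 3, 4, 5):
    # Compare the calibrated slot size with the 33% alternative from the question.
    print(n, round(deployment(n, per_slot), 2), round(deployment(n, 0.33), 2))
```

At 33% per slot a fourth concurrent fill already takes deployment past 100% of equity, while the five-name cluster fits in 75% of equity at 15% per slot.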
8. Metrics by lookback
Trade-level and risk-adjusted metrics over trailing 12 / 24 / 36 months and the full period. Trade metrics are from the trade list; Sharpe and Calmar from the daily equity curve. 1R = the 3×ATR(14) risk box on the entry day (the Edge Lab Stage-2 risk unit); per-trade R is winsorized at ±10R. The R magnitudes are small because the system targets a one-week bounce, a fraction of a 3×ATR box.
| Window | Trades | Win rate | Avg win (R) | Avg loss (R) | Expectancy (R) | Sharpe | Calmar |
|---|---|---|---|---|---|---|---|
| Trailing 12m | — | 61.0% | +0.29 | −0.38 | +0.03 | 0.74 | 1.01 |
| Trailing 24m | — | 66.2% | +0.35 | −0.41 | +0.09 | 1.02 | 2.12 |
| Trailing 36m | — | 69.7% | +0.36 | −0.41 | +0.13 | 1.26 | 2.28 |
| Full (21y) | 1,096 | 65.0% | +0.34 | −0.43 | +0.07 | 0.64 | 0.30 |
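The expectancy column follows from the win rate and the average win and loss in R. A short sketch that recomputes it from the table rows above; `r_multiple` illustrates the 3×ATR(14) risk box with ±10R winsorizing, and its per-share P&L input is an assumption about how R is measured, not a detail confirmed by the text.

```python
def r_multiple(pnl_per_share, atr14, cap=10.0):
    """Per-trade R: P&L divided by a 3 x ATR(14) risk box, winsorized at +/-cap."""
    r = pnl_per_share / (3.0 * atr14)
    return max(-cap, min(cap, r))

def expectancy(win_rate, avg_win_r, avg_loss_r):
    """Expected R per trade; avg_loss_r is passed as a positive magnitude."""
    return win_rate * avg_win_r - (1.0 - win_rate) * avg_loss_r

# (win rate, average win in R, average loss magnitude in R) from the table.
rows = {
    "Trailing 12m": (0.610, 0.29, 0.38),
    "Trailing 24m": (0.662, 0.35, 0.41),
    "Trailing 36m": (0.697, 0.36, 0.41),
    "Full (21y)": (0.650, 0.34, 0.43),
}
for label, row in rows.items():
    print(label, round(expectancy(*row), 2))
```

The recomputed values round to the table's +0.03, +0.09, +0.13 and +0.07, so the column is internally consistent with the other three.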
9. Sizing out-of-sample
Finding
The position-sizing fraction is itself a fitted choice, so it gets its own out-of-sample test, separate from the ruleset pivot. Fitting the platform Bandy safe-f (the drawdown-constrained fraction, −20% target) on the pre-2018 trades and re-fitting on the post-2018 trades, the fraction pins at the leverage ceiling in both windows and the drawdown-95 stays well inside the −20% target. The finding: sizing is not the binding constraint — capital deployment is. The book is so rarely invested that even the ceiling fraction does not reach the drawdown limit, so the fitted size carries forward trivially.
| Window | safe-f | per slot (of 20) | CAR25 | DD-95 at safe-f |
|---|---|---|---|---|
| sizing in-sample (≤2017) | 3.0× (ceiling) | 15% | +4.2% | −11.6% |
| sizing out-of-sample (2018+) | 3.0× (ceiling) | 15% | +6.6% | −12.1% |
safe-f is the portfolio-aware Bandy fraction on the per-trade return stream, sized as safe-f ÷ 20 per slot. It pins at the analytics ceiling because the deployed slice is small; the drawdown constraint is slack. The equity curve as tested is un-levered. CAR25 excludes cash interest.
10. Exposure & holding period
This is the system’s defining trait. It holds a position on only ~38% of trading days and deploys ~8% of capital on average; the rest sits in cash. Measured over time-in-market only, the deployed slice compounds at ~11%/yr — the raw 4.0% CAGR is the in-market rate diluted by the cash it holds, not a weak signal.
| Metric | Full (21y) | In-sample (≤2017) | Out-of-sample (2018+) | Buy & hold |
|---|---|---|---|---|
| Time in market (days holding ≥1 position) | 37.8% | 22.0% | 62.5% | 100% |
| Average dollar deployment | ~7.7% | — | — | 100% |
| CAGR (raw, 0% on cash) | 4.02% | 3.13% | 5.43% | 10.88% |
| Exposure-adjusted return (in-market only) | ~11.0% | ~15.1% | ~8.8% | = CAGR |
| Average holding period | 3.9 trading days (median 4) | — | | |
Exposure-adjusted return measures the rate earned while capital is deployed and credits 0% while the book is in cash, which separates the question of whether the in-market periods are productive from the cash drag. Buy-and-hold is always invested, so its exposure-adjusted return equals its CAGR.
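One formula that reproduces the exposure-adjusted figures in the table to rounding is to compound the raw CAGR over the fraction of days in the market. Whether refinements.py computes it exactly this way is an assumption; the function name is hypothetical and the inputs are the table's own CAGR and time-in-market values.

```python
def exposure_adjusted(cagr, time_in_market):
    """Annualized in-market rate: (1 + CAGR) ** (1 / time_in_market) - 1."""
    return (1.0 + cagr) ** (1.0 / time_in_market) - 1.0

# (raw CAGR, share of trading days holding at least one position)
windows = {
    "Full (21y)": (0.0402, 0.378),
    "In-sample (<=2017)": (0.0313, 0.220),
    "Out-of-sample (2018+)": (0.0543, 0.625),
}
for label, (cagr, tim) in windows.items():
    print(label, f"{exposure_adjusted(cagr, tim):.1%}")
```

Buy-and-hold has a time-in-market of 1.0, so the same function returns its CAGR unchanged.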
Recent trade blotter — most recent closed positions
| Symbol | Entry | Exit | Hold (td) | Entry $ | Exit $ | Profit % | Profit $ | Equity at exit | Exit reason |
|---|---|---|---|---|---|---|---|---|---|
| UPS | 2026-03-06 | 2026-03-12 | 5 | 100.95 | 97.89 | −3.0% | −$694 | $227,256 | time |
| VTRS | 2026-03-06 | 2026-03-12 | 5 | 13.87 | 13.70 | −1.2% | −$287 | $227,256 | time |
| BDX | 2026-03-06 | 2026-03-12 | 5 | 163.80 | 159.63 | −2.5% | −$583 | $227,256 | time |
| COO | 2026-03-09 | 2026-03-13 | 5 | 74.25 | 69.92 | −5.8% | −$173 | $227,192 | time |
| FCX | 2026-03-20 | 2026-03-23 | 2 | 52.01 | 54.94 | +5.6% | $1,277 | $229,310 | target |
| XEL | 2026-03-20 | 2026-03-24 | 3 | 77.14 | 77.96 | +1.1% | $240 | $230,010 | target |
| NEM | 2026-03-20 | 2026-03-25 | 4 | 96.22 | 101.52 | +5.5% | $1,250 | $231,098 | target |
| BALL | 2026-03-20 | 2026-03-25 | 4 | 57.22 | 60.75 | +6.2% | $1,401 | $231,098 | target |
| O | 2026-03-20 | 2026-03-26 | 5 | 60.23 | 59.75 | −0.8% | −$183 | $231,176 | time |
| HII | 2026-03-30 | 2026-04-01 | 3 | 370.34 | 393.32 | +6.2% | $1,425 | $232,601 | target |
| T | 2026-04-13 | 2026-04-16 | 4 | 25.67 | 26.40 | +2.9% | $665 | $233,266 | target |
| LMT | 2026-04-22 | 2026-04-28 | 5 | 554.79 | 512.29 | −7.7% | −$1,785 | $231,481 | time |
| PH | 2026-05-01 | 2026-05-06 | 4 | 882.14 | 902.66 | +2.3% | $534 | $234,491 | target |
| MRNA | 2026-05-01 | 2026-05-06 | 4 | 44.56 | 48.79 | +9.5% | $2,194 | $234,491 | target |
| APH | 2026-05-05 | 2026-05-11 | 5 | 136.80 | 122.47 | −10.5% | −$2,422 | $231,787 | time |
| TFC | 2026-05-13 | 2026-05-15 | 3 | 46.36 | 46.96 | +1.3% | $302 | $232,089 | eod_close |
11. What this means for you
| Question | Answer from 21 years of data |
|---|---|
| Does the signal separate from random? | Yes — clears 100% of cash-matched random monkeys, in and out of sample. |
| Does it hold up out-of-sample? | On the S&P 500 yes (profit factor 1.55 → 1.61, win rate flat). The small-cap cross-check is a pending next test. |
| Does it beat buying the index? | Below the index on raw return; the deployed slice compounds at ~11%/yr at ~8% capital risked and ¼ the drawdown. |
| Can you scale it up? | No tested lever that deploys more capital raises out-of-sample CAR25. Selectivity is the source of the edge. |
| What kind of tool is it? | A low-beta, cash-heavy return stream with an out-of-sample edge on large caps — a portfolio sleeve that leaves the rest of the book free. |
The measured fit is a low-correlation, capital-light sleeve: it earns on the small slice it risks and leaves the rest of the book free (for cash yield or other strategies), and the next step is combining this uncorrelated stream with other systems at the portfolio level. As a standalone return engine it is structurally under-deployed; on its own the deployed slice compounds at the rates shown above.
12. Costs, slippage & cash
Finding. Commissions are negligible; slippage is the cost that matters for this system; and the cash the book holds is a return the reader must supply. The fill model (§1) books a limit touch 3% below the close — the frictions below sit on top of that.
Commissions — modeled, negligible
We modeled Interactive Brokers Pro (tiered) US-stock pricing — $0.0035/share, a $0.35 order minimum, a 1%-of-trade cap, plus the SEC Section 31 fee and FINRA TAF on sells — applied to all 1,096 round trips at three account sizes. The drag is a fraction of the raw return:
| Account | Commissions / yr | Drag on full-book CAGR | Drag on the deployed slice |
|---|---|---|---|
| $100,000 | ~$80 | ~8 bps/yr (4.02% → 3.94%) | ~1.0%/yr (~11.0% → ~10.0%) |
| $500,000 | ~$370 | ~7 bps/yr (4.02% → 3.95%) | ~0.9%/yr |
| $1,000,000 | ~$740 | ~7 bps/yr (4.02% → 3.95%) | ~0.9%/yr |
Large-cap names at $50–900/share make the per-share fee a fraction of a basis point per order; the $0.35 minimum matters only for the $100K book. Commissions lower CAGR by under ~10 bps/yr at every account size, which is immaterial.
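The per-order commission quoted above can be sketched as a small function. It uses only the rates stated in the text ($0.0035/share, $0.35 minimum, 1%-of-trade cap); the order in which the minimum and the cap are applied is an assumption, the function name is hypothetical, and the SEC and FINRA sell-side fees are left out because their rates change over time.

```python
def tiered_commission(shares, price, per_share=0.0035, minimum=0.35, cap_pct=0.01):
    """Commission for one order: per-share fee, floored at the minimum,
    then capped at a percentage of trade value (ordering is an assumption)."""
    trade_value = shares * price
    fee = max(minimum, shares * per_share)
    return min(fee, cap_pct * trade_value)

# A $10,000 slot (10% of a $100,000 book) in a $100 stock: 100 shares.
fee = tiered_commission(shares=100, price=100.0)
cost_bps = fee / (100 * 100.0) * 1e4
print(round(fee, 2), round(cost_bps, 2))
```

On that order the fee is about a third of a basis point per side, consistent with the claim that commissions are a fraction of a basis point per order.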
Slippage — the cost that matters
The book turns its deployed slice over roughly 65× a year (1,096 trades, ~3.9-day holds, ~8% average deployment), so a few basis points per fill compound into a real number on the capital actually at work. Slippage is also structurally adverse here: entries are limit-buys into selling exhaustion, and the dips cluster in dislocations (2008, 2011, 2020) where spreads widen 2–5×. A flat assumption understates the tail. The sensitivity, before any regime weighting:
| Slippage / side | Round trip | Drag on full-book CAGR | Drag on the deployed slice (~11%) |
|---|---|---|---|
| 0 bps | 0 bps | 4.02% (as tested) | ~11.0% |
| 2.5 bps | 5 bps | −26 bps (→ ~3.76%) | −3.2 pp (→ ~7.8%) |
| 5 bps | 10 bps | −51 bps (→ ~3.51%) | −6.4 pp (→ ~4.6%) |
| 10 bps | 20 bps | −103 bps (→ ~2.99%) | −12.8 pp (deployed edge gone) |
At 5 bps/side — an ordinary spread-cross on a liquid large cap — slippage costs the deployed slice ~6.4 pp/yr, more than half the ~11% it earns; at 10 bps/side the deployed edge is gone. Because the entries fire in dislocations, the realistic figure sits at the high end. The deployed-slice return must be read with a slippage assumption attached; a regime-conditional stress (slippage scaling with VIX or recent ATR) is the next test before that number is taken at face value.
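The sensitivity table can be approximately reconstructed from figures already given: trades per year, slot size, and average deployment. This is a back-of-the-envelope sketch, not the code in friction_pass.py; it assumes each trade uses one 10% slot and rounds deployment to 8%, and it matches the table to within about a basis point.

```python
TRADES = 1096      # round trips over the full window
YEARS = 21.4       # length of the test window
SLOT = 0.10        # 10% of equity per slot (setup table)
DEPLOYED = 0.08    # ~8% average capital deployed

def slippage_drag(round_trip_bps, trades=TRADES, years=YEARS,
                  slot=SLOT, deployed=DEPLOYED):
    """Return (drag on full-book CAGR in bps, drag on the deployed slice in pp)."""
    trades_per_year = trades / years
    full_book_bps = trades_per_year * slot * round_trip_bps
    deployed_pp = full_book_bps / deployed / 100.0
    return full_book_bps, deployed_pp

# Turnover of the deployed slice: capital traded per year over capital deployed.
print(round(TRADES / YEARS * SLOT / DEPLOYED, 1))
for rt in (5, 10, 20):
    full, slice_pp = slippage_drag(rt)
    print(rt, round(full, 1), round(slice_pp, 1))
```

The same arithmetic gives a turnover of roughly 64× a year on the deployed slice, which is why a few basis points per fill add up to several percentage points there.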
Cash is a return you must add
The headline CAGR is computed at 0% on idle cash, and roughly 90% of the book sits in cash on average. We deliberately do not credit a historical cash yield, because the rate varies by period and by how the reader manages the balance. For a complete picture you must add the true cash return on the ~90% idle balance at the rate you would actually earn over the period. That cash return is additive to the deployed-slice edge above and, at money-market rates, is the larger of the two components of total return.
Fill realism — does the edge survive the limit-fill assumptions?
For a short-term limit-order system the fill model is the strategy, so we stress it directly on the OOS (2018–26) decision window against the canonical Bandy CAR25. Three cuts, and they point in opposite directions — the honest answer is two-sided.
1. Deterministic frictions leave the edge intact-to-improved. Modeling realistic fills (a gap-down that opens through the limit fills at the open, and a mere touch that does not trade ~10 bps through the limit is assumed to miss the queue) plus a 5 bps/side cost does not degrade the result. Gap-open price improvement outweighs the queue misses and the cost; the headline is conservative in the direction that matters, not optimistic.
| OOS fill model | N | Win% | CAGR | CAR25 |
|---|---|---|---|---|
| optimistic (fill on any touch) | 539 | 64.9% | 5.4% | 6.7% |
| realistic (gap-open + 10 bps queue) | 515 | 65.6% | 6.2% | 8.1% |
| + 5 bps/side cost | 511 | 64.6% | 5.5% | 6.9% |
Robust to the queue assumption: sweeping the buffer over 0 / 10 / 25 bps gives CAR25 of 9.1 / 8.1 / 6.9%, at or above the optimistic 6.7% across the range. On the full 2005–26 window the same ordering holds (optimistic CAR25 5.1%, realistic 6.1%, realistic plus cost 5.4%).
2. The vulnerability is the fill rate, not the fill price. If only a fraction of the orders that clear the queue actually execute (you sit deep in the book), the edge compresses fast. Twenty seeds per level, median [min–max]:
| Share of clear-the-queue touches filled | CAGR | CAR25 |
|---|---|---|
| 100% | 6.2% | 8.1% |
| 75% | 4.6% [3.4–5.4] | 5.6% |
| 50% | 3.1% [2.2–4.0] | 3.8% |
At a 75% fill rate CAR25 already falls below the optimistic 6.7% — the edge depends on actually getting your fills.
3. The fills you get are adversely selected. Measuring the forward 5-day return from the same limit price, the touches that miss the queue returned +3.70% (87% win) versus +0.64% (64% win) for the touches that fill — a −3.06 pp selection gap. A limit buy fills precisely when the move keeps going against it and misses the touch-and-reverse bottoms. The realistic model nets ahead (cut 1) only because gap-open price improvement offsets this adverse selection; the two nearly cancel.
Bottom line. The optimistic headline is not inflated in the feared direction: realistic pricing is, if anything, slightly favorable. The genuine sensitivities are fill rate (random partial execution at ~75% or below pushes CAR25 under the optimistic 6.7%) and a structural adverse selection masked by gap-open improvement. Reconciling paper fills recorded since publication against the model is the next test; it would settle the fill rate from data rather than assumption.
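The two fill rules compared in cut 1 can be written as one function. This is a minimal sketch under the rules as stated in this section (optimistic: any touch fills at the limit; realistic: a gap-down through the limit fills at the open, and a touch must trade ~10 bps through to fill). The function name and arguments are hypothetical and are not taken from fill_realism.py.

```python
def limit_fill(prev_close, day_open, day_low,
               limit_pct=0.03, queue_bps=10.0, realistic=True):
    """Return the fill price for a buy limit set limit_pct below the prior
    close, or None if the order does not fill on this bar."""
    limit = prev_close * (1.0 - limit_pct)
    if not realistic:
        # Optimistic model: any touch of the limit books a fill at the limit.
        return limit if day_low <= limit else None
    if day_open <= limit:
        # Gap-down through the limit: filled at the open (price improvement).
        return day_open
    if day_low <= limit * (1.0 - queue_bps / 1e4):
        # Price traded far enough through the limit to clear the queue.
        return limit
    return None  # a bare touch is assumed to miss the queue

# Prior close 100 puts the limit at 97.
print(limit_fill(100.0, 96.0, 95.0))                    # gap-down: fills at the open
print(limit_fill(100.0, 99.0, 96.95))                   # bare touch: no fill
print(limit_fill(100.0, 99.0, 96.95, realistic=False))  # optimistic: fills at the limit
```

The partial-fill stress in cut 2 would sit on top of this, dropping a random share of the orders that return a price here.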
13. Constraints
- Fill model. The headline books a limit fill whenever the next day’s low touches the 3%-below price. §12 “Fill realism” stress-tests this — gap-through pricing, a queue buffer, a 5 bps cost, partial fills, and the filled-vs-missed selection. The deterministic frictions leave the edge intact-to-improved; the genuine sensitivities are fill rate and adverse selection. Real fills are fewer than the optimistic count (515 vs 539 OOS).
- Costs, slippage and cash are modeled in §12. Commissions are negligible (<~10 bps/yr); slippage is the binding friction because entries fire in dislocations and the deployed slice turns over ~65×/yr; the headline runs at 0% on idle cash, which the reader supplies at their own rate on the ~90% idle balance.
- “20 slots at 10% each” rarely binds in practice (average < 1 position held).
- The Russell 2000 cross-check has not been re-run on split-adjusted prices in this run (membership unavailable in this environment) — reported as a pending next test, not a current result.
- Prices are split- and dividend-adjusted (ratio reconstruction from adjusted close).
14. Code, data & artifacts
Everything behind the result. The survivorship-bias-free dataset and its loader stay internal to the research platform; the engine code, the daily trade logs, and the derived CSVs — listed below so you can see exactly what is included — are available as a single package on request.
Get the artifact package. The engine plus the baseline, out-of-sample, and CAR25 drivers, and every CSV behind the numbers. Enter your email and we send a personal download link (valid 30 days).
| Field | Value |
|---|---|
| run_date | 2026-06-06 |
| data_as_of | 2026-05-15 (last bar) |
| test window | 2005-01-03 → 2026-05-15; ruleset OOS pivot 2025; robustness split 2018; sizing pivot 2018 |
| universe | SBF S&P 500 membership, resolved per trading date (~959 ever-members) |
| prices | EOD, split- & dividend-adjusted (ratio reconstruction from adjusted close) |
| engine | prototype.py (NDX10 dip-buy); develop_eval.py (monkey + levers); run_oos.py (IS/OOS); car25_eval.py; refinements.py |
| rerun mode | Re-fit — prices corrected to split-adjusted |
| source | Glenn Osborne, HV-RSI specification (January 2025) |
Code
- prototype.py: engine (data prep (adjusted), short-term-low signal, 3%-limit entry, 20-slot portfolio)
- develop_eval.py: cash-matched monkey (skill) + capital-efficiency levers, canonical CAR25
- run_oos.py: in-sample / out-of-sample split
- car25_eval.py: risk-normalized CAR25 / safe-f via canonical Bandy
- refinements.py: sample chart, blotter, metrics-by-lookback, sizing OOS, exposure
- friction_pass.py: cost + realistic-fill anchors with canonical CAR25 (§12)
- fill_realism.py: queue-depth sweep, partial-fill seed distribution, adverse-selection cut (§12)
Data (CSV)
- System equity curve: daily equity, positions, cash
- Metrics by lookback: win rate, R distribution, Sharpe, Calmar
- Recent trade blotter: most recent 25 closed positions
- Buy & hold SPY equity benchmark: daily equity
- Monthly returns (heatmap data)
- Rolling 12m by regime: system vs SPY per index-return bucket
Full report
Reproduce
# Head-to-head (split-adjusted)
python prototype.py --index SPY_SandP_500 --start 2005-01-01 --end 2026-05-15
# In-sample / out-of-sample split
python run_oos.py --index SPY_SandP_500
# Cash-matched monkey + capital-efficiency levers (40 seeds/window)
python develop_eval.py --monkey-seeds 40
# Risk-normalized CAR25 / safe-f + refinement artifacts
python car25_eval.py
python refinements.py