I Traded Crypto With a Genetic Algorithm for 6 Weeks — Brutal Honest Post-Mortem

live-trading, genetic-algorithm, post-mortem, binance-futures, results

Six weeks ago the Darwin Lab bot went live on Binance Futures with $100 of real capital. As of today, the wallet sits at $104. That is a +4% return. Sounds underwhelming. The story underneath that number is worth telling.

Numbers pulled from Binance Futures /fapi/v1/income API (verifiable at /api/stats.json):

The fee line is the entire story: $44 in fees against $56 gross means roughly 78% of gross profit went to the exchange. We will get back to this.

How the System Works (Without the Secret Sauce)

Darwin Lab is built around three ideas that are not new in academic literature but are rarely implemented together in a working system:

1. Genetic algorithm strategy evolution. We start with a population of candidate trading strategies. Each strategy is a set of parameters: indicators (BBands, RSI, MACD, Divergence, ATR-based), timeframes (1m to 4h), direction bias, leverage, and position sizing rules. The genetic algorithm evolves this population across multiple generations. Strategies that trade well on historical data survive and reproduce. Strategies that fail die. After convergence, the survivors are the "champions."

2. Live paper arena for forward selection. Here is where most algo trading projects go wrong. They backtest a strategy, it looks great, they deploy it, it bleeds. The reason is simple: a backtest cannot fully simulate real-world conditions such as slippage, partial fills, changing liquidity, and regime shifts. Our solution: every evolved champion competes in a paper arena that runs on real-time market data every 2 minutes, before any real money is involved. Promotion to the live roster requires ≥15 arena trades, positive forward PnL, a win rate (WR) ≥ 25%, and a profit factor (PF) ≥ 1.2. Backtest performance is never used for live selection.

3. Multi-regime classification. The market behaves differently in bull, bear, range, and crash conditions. A strategy that works in strong_bull can destroy capital in strong_bear. The system classifies the market regime continuously (seven states: strong_bull, weak_bull, neutral, range, weak_bear, strong_bear, crash) and adjusts entry sizing (0.40× in crash, 1.20× in strong_bull). More importantly, a hostile regime blocks new entries entirely.
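To make the first idea concrete, here is a minimal sketch of a generational loop with elitism, crossover, and mutation. It is not the Darwin Lab implementation: the two-gene genome, the parameter bounds, the function names, and the toy fitness function are all assumptions made for illustration. A real fitness function would score a strategy on historical data.

```python
import random

# Hypothetical genome with two numeric genes. The real strategy genome
# (indicators, timeframes, direction bias, leverage, sizing) is not published.
def random_genome(rng):
    return {"rsi_period": rng.randint(5, 30), "leverage": rng.randint(1, 10)}

def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one parent at random.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in a}

def mutate(genome, rng, rate=0.2):
    # Small random steps, clamped to the assumed parameter bounds.
    child = dict(genome)
    if rng.random() < rate:
        step = rng.choice([-2, -1, 1, 2])
        child["rsi_period"] = min(30, max(5, child["rsi_period"] + step))
    if rng.random() < rate:
        step = rng.choice([-1, 1])
        child["leverage"] = min(10, max(1, child["leverage"] + step))
    return child

def evolve(fitness, pop_size=20, generations=15, survivors=5, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # The fittest genomes survive unchanged and breed the next generation.
        population.sort(key=fitness, reverse=True)
        elite = population[:survivors]
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = elite + children
    population.sort(key=fitness, reverse=True)
    return population[:survivors]

# Toy stand-in for a backtest score (assumption): best at rsi_period=14, leverage=3.
def toy_fitness(genome):
    return -abs(genome["rsi_period"] - 14) - abs(genome["leverage"] - 3)

champions = evolve(toy_fitness)
```

Because the elite are carried over unchanged, the best score in the population can never get worse from one generation to the next. That is also why a backtest-only fitness function tends to overfit, which is what the paper arena is meant to catch.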

What Went Wrong

Fees: the silent killer

The bot runs many small positions. At 0.05% taker fee per side on Binance Futures, a $20 position opened and closed costs $0.02 in fees. Sounds trivial. At scale — 1,178 trades averaging $25 notional — it compounds fast.

We caught this through a forensic audit (what we internally call a "fees drag" analysis). The fix is maker-only order routing: placing limit orders instead of market orders reduces fees to ~0.02% per side instead of 0.05%. That single change could cut annual fee drag by ~60% at current trade volume. Implementation is in progress.
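The fee arithmetic above is easy to check. The sketch below uses only the rates quoted in this post (0.05% taker, about 0.02% maker); the function names are illustrative, and actual rates depend on your Binance fee tier.

```python
TAKER_RATE = 0.0005  # 0.05% per side, as quoted above
MAKER_RATE = 0.0002  # ~0.02% per side, as quoted above

def round_trip_fee(notional, fee_rate_per_side):
    # One fee to open the position, one fee to close it.
    return notional * fee_rate_per_side * 2

def breakeven_move_pct(fee_rate_per_side):
    # Price move, in percent of notional, needed just to pay the round trip.
    return fee_rate_per_side * 2 * 100

taker_cost = round_trip_fee(20.0, TAKER_RATE)   # the $20 example from the text
maker_cost = round_trip_fee(20.0, MAKER_RATE)
fee_saving = 1 - maker_cost / taker_cost        # fraction of fees removed by maker routing

print(f"taker ${taker_cost:.3f}, maker ${maker_cost:.3f}, saving {fee_saving:.0%}")
print(f"break-even move at taker rates: {breakeven_move_pct(TAKER_RATE):.2f}%")
```

The break-even figure is the useful one for scalping: at taker rates a trade has to move 0.10% in your favor before it earns anything.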

Lesson: on any high-frequency small-wallet strategy, model your fees before you model your alpha.

The phantom loss loop (Pitfall #187)

When the system held multiple DCA entries for the same pair, the position reconciliation compared Binance's single aggregate position against each individual entry. This created phantom losses — the system "thought" each entry was underwater relative to the whole position — and triggered re-arm loops that executed ~67 times per day at peak.

The state cumPnL counter became useless. We now exclusively use the Binance income API as the ground truth for PnL. The fix was a two-line change: sum all entries per pair before reconciliation.
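A minimal sketch of the idea behind that fix: collapse the local DCA entries into one aggregate per pair before comparing against the exchange. The dictionary keys (`symbol`, `qty`, `price`) and function names are hypothetical, and the sketch only compares quantities; the real reconciliation logic is not shown in this post.

```python
from collections import defaultdict

def aggregate_entries(entries):
    # Collapse per-entry records into one quantity and average price per symbol.
    totals = defaultdict(lambda: {"qty": 0.0, "cost": 0.0})
    for entry in entries:
        bucket = totals[entry["symbol"]]
        bucket["qty"] += entry["qty"]
        bucket["cost"] += entry["qty"] * entry["price"]
    return {
        symbol: {"qty": t["qty"], "avg_price": t["cost"] / t["qty"]}
        for symbol, t in totals.items()
        if t["qty"] != 0
    }

def reconcile(entries, exchange_qty_by_symbol, tol=1e-9):
    # Compare aggregate against aggregate, never a single entry against the aggregate.
    local = aggregate_entries(entries)
    mismatched = []
    for symbol in sorted(set(local) | set(exchange_qty_by_symbol)):
        local_qty = local[symbol]["qty"] if symbol in local else 0.0
        if abs(local_qty - exchange_qty_by_symbol.get(symbol, 0.0)) > tol:
            mismatched.append(symbol)
    return mismatched

entries = [
    {"symbol": "BTCUSDT", "qty": 0.001, "price": 60000.0},
    {"symbol": "BTCUSDT", "qty": 0.001, "price": 58000.0},
]
print(reconcile(entries, {"BTCUSDT": 0.002}))  # two DCA entries, one exchange position
```

With the entries summed first, two DCA fills of 0.001 each match the exchange's single 0.002 position and nothing is flagged.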

Lesson: aggregate vs granular mismatches in state management are a silent bleed vector. Audit your reconciliation logic.

The substring stop-loss bug (Pitfall #200 + #201)

This one is embarrassing to write. A naked substring match:

if "STOP" in order_type:
    cancel_order(...)

This matched both STOP_MARKET (the stop-loss) and TRAILING_STOP_MARKET (the trailing runner). Every time the system tightened a trailing stop or moved a stop to break-even, it was silently cancelling the trailing mechanism in the same pass.

A counterfactual backtest replay confirmed it: 157 of 175 positions with trailing armed never had the trailing stop fire. That is an 89.7% kill rate. The entire moonshot runner capture system was dead for an extended period without obvious log output.

Fix: strict equality, order_type == "STOP_MARKET". A one-line change. Expected weekly EV recovered: ~$10/week at current volume.
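For illustration, here is the bug and an enum-based variant of the fix side by side. The helper names are hypothetical, and the enum lists only a subset of Binance Futures order types. Parsing into an enum has one extra benefit over bare string equality: an unexpected order type raises an error instead of slipping through.

```python
from enum import Enum

class OrderType(str, Enum):
    # Subset of Binance Futures order types, enough for this example.
    LIMIT = "LIMIT"
    MARKET = "MARKET"
    STOP_MARKET = "STOP_MARKET"
    TAKE_PROFIT_MARKET = "TAKE_PROFIT_MARKET"
    TRAILING_STOP_MARKET = "TRAILING_STOP_MARKET"

def is_stop_loss_buggy(order_type):
    # Substring match: also true for TRAILING_STOP_MARKET.
    return "STOP" in order_type

def is_stop_loss(order_type):
    # OrderType(...) raises ValueError on unknown strings, so typos fail loudly.
    return OrderType(order_type) is OrderType.STOP_MARKET

for raw in ("STOP_MARKET", "TRAILING_STOP_MARKET", "LIMIT"):
    print(raw, is_stop_loss_buggy(raw), is_stop_loss(raw))
```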

Lesson: substring matching on enum-like strings is a category-1 bug in financial code. Always use strict equality or an enum.

Frankenstein — 82% backtested WR → 49% live WR

Before the current system, we deployed a strategy known internally as Frankenstein (ID: eb686789). It was a 5-minute BTC scalper with an 82-86% backtested win rate across multiple windows.

After three weeks live, win rate: 49%.

The diagnosis: the 5-minute timeframe amplifies microstructure noise. The strategy was fit to historical patterns that do not generalize — classic regime-specific overfit. We killed it, improved the k-fold walk-forward validation to require profitability in 3/5 folds, and added niche forcing that pushes 30% of each evolution generation to explore non-obvious parameter regions.
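The 3/5 fold rule can be sketched as a simple gate over chronologically ordered trade results. This shows only the acceptance check, under the assumption that each fold's PnL has already been produced out of sample; a full walk-forward procedure would also re-fit on each training window. Function names are illustrative.

```python
def sequential_folds(values, k=5):
    # Contiguous, chronological folds. No shuffling, since order matters in time series.
    n = len(values)
    if n < k:
        raise ValueError("need at least one value per fold")
    bounds = [i * n // k for i in range(k + 1)]
    return [values[bounds[i]:bounds[i + 1]] for i in range(k)]

def passes_fold_gate(trade_pnls, k=5, min_profitable=3):
    # Accept only if enough individual folds are profitable on their own.
    folds = sequential_folds(trade_pnls, k)
    profitable = sum(1 for fold in folds if sum(fold) > 0)
    return profitable >= min_profitable

# Profitable overall (+92), but all of it comes from the first fold.
one_lucky_window = [100, 0, -1, -1, -1, -1, -1, -1, -1, -1]
print(sum(one_lucky_window) > 0, passes_fold_gate(one_lucky_window))
```

The example series is net positive yet fails the gate, which is the point: a single lucky window should not be enough to pass validation.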

See: Why We Killed Our Best Backtested Strategy

What Is Working

Forward selection is the only selection

The paper arena has now evaluated 3,400+ strategy variants. The live roster selects the top 10 based purely on forward performance. Of the current 10 live champions, 8 have 81–97% win rates on recent arena sessions. The 53% arena-wide WR is the signal quality floor before filtering.
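The promotion thresholds described earlier (≥15 arena trades, positive forward PnL, ≥25% WR, PF ≥ 1.2) translate into a small gate. The record fields and function names below are hypothetical, and ranking the roster by net forward PnL is an assumption; this post does not say which forward metric orders the top 10.

```python
from dataclasses import dataclass

@dataclass
class ArenaRecord:
    name: str
    trades: int
    wins: int
    gross_profit: float  # sum of winning trades
    gross_loss: float    # sum of losing trades, as a positive number

    @property
    def net_pnl(self):
        return self.gross_profit - self.gross_loss

    @property
    def win_rate(self):
        return self.wins / self.trades if self.trades else 0.0

    @property
    def profit_factor(self):
        if self.gross_loss == 0:
            return float("inf") if self.gross_profit > 0 else 0.0
        return self.gross_profit / self.gross_loss

def eligible_for_live(r):
    # All four thresholds from the post must hold at once.
    return (r.trades >= 15 and r.net_pnl > 0
            and r.win_rate >= 0.25 and r.profit_factor >= 1.2)

def select_roster(records, size=10):
    # Assumption: rank eligible strategies by net forward PnL.
    eligible = [r for r in records if eligible_for_live(r)]
    return sorted(eligible, key=lambda r: r.net_pnl, reverse=True)[:size]
```

Note that backtest results appear nowhere in the record: only arena trades feed the gate.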

Hardware stop-losses survive crashes

Every live position has a STOP_MARKET order placed directly on the exchange — not a software stop, not a conditional. This order exists on Binance's matching engine and executes even if the bot is offline. A separate watchdog cron checks every 5 minutes for positions without hardware SLs and re-arms any it finds.
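The watchdog check itself is a set difference. The sketch below works on plain dictionaries and makes no API calls; it assumes position records carry `symbol` and `positionAmt` and open orders carry `symbol` and `type`, which mirrors the shape of Binance Futures responses but should be verified against the current API documentation. The function name is hypothetical, and the re-arm step is left out.

```python
def positions_missing_stop(positions, open_orders):
    # Symbols that already have a stop-loss resting on the exchange.
    protected = {o["symbol"] for o in open_orders if o["type"] == "STOP_MARKET"}
    # Quantities are treated as strings here (assumption), hence float().
    return [
        p["symbol"]
        for p in positions
        if float(p["positionAmt"]) != 0.0 and p["symbol"] not in protected
    ]

positions = [
    {"symbol": "BTCUSDT", "positionAmt": "0.002"},
    {"symbol": "ETHUSDT", "positionAmt": "-0.050"},
    {"symbol": "SOLUSDT", "positionAmt": "0"},
]
open_orders = [
    {"symbol": "BTCUSDT", "type": "STOP_MARKET"},
    {"symbol": "ETHUSDT", "type": "TRAILING_STOP_MARKET"},
]
print(positions_missing_stop(positions, open_orders))
```

Here ETHUSDT is flagged even though it has a trailing order, because strict equality on the order type means a trailing stop does not count as the hardware stop-loss. A production check would likely also confirm the stop's side and quantity.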

Regime gating blocked the bad entries

We do not have a precise counterfactual for "what would have happened without regime gating" — that would require running a second live account without it. But looking at the correlation between entry blocks during weak_bear/range periods and the market conditions in March–April 2026, the regime classifier appears to have prevented a meaningful number of would-be losers from entering.
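For reference, the gating and sizing rules described earlier fit in a few lines. Only the two multipliers quoted in this post (0.40× in crash, 1.20× in strong_bull) are real; defaulting every other regime to 1.0× and the choice of which regimes count as hostile are placeholders, since neither is specified here.

```python
REGIMES = ("strong_bull", "weak_bull", "neutral", "range",
           "weak_bear", "strong_bear", "crash")

# The only multipliers stated in the post. Others fall back to 1.0 (assumption).
KNOWN_MULTIPLIERS = {"crash": 0.40, "strong_bull": 1.20}

def entry_size(base_size, regime, hostile=frozenset(), multipliers=KNOWN_MULTIPLIERS):
    if regime not in REGIMES:
        raise ValueError(f"unknown regime: {regime}")
    if regime in hostile:
        return 0.0  # hostile regime: no new entry at all
    return base_size * multipliers.get(regime, 1.0)

# Hypothetical hostile set, for illustration only.
hostile = frozenset({"strong_bear"})
for regime in ("strong_bull", "crash", "strong_bear"):
    print(regime, entry_size(25.0, regime, hostile))
```

Returning zero for a hostile regime keeps the block and the scaling in one place, so a caller cannot accidentally size an entry that should never have been opened.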

The Real Numbers, Verified

Every claim in this post can be verified:

curl https://darwintrade.com/api/stats.json

The /api/stats.json endpoint pulls its data directly from the Binance Futures income API and is updated hourly. The Kill Feed shows every closed position, including losses. The Dashboard shows the equity curve and drawdown.

Past performance does not guarantee future results. This is not financial advice.

Frequently Asked Questions

Q: Why share this if you're barely profitable?

Because "barely profitable after fees" with a working automated system that has caught and fixed 206+ bugs in 6 weeks is more useful information than a cherry-picked screenshot of a $5,000 win. The architecture is worth more than the current wallet size. We share the process.

Q: What's the path to actual profitability?

Maker-only order routing (cuts fee drag ~60%), sizing up once fees are controlled, and letting the genetic algorithm continue evolving better strategies. The current 63% WR with a working trailing runner at 0 fee drag would generate meaningful returns at scale.

Q: Is this open source?

No. The specific fitness weights, authority formula, R-multiple tiers, and champion DNA structure are proprietary. The high-level architecture (genetic algorithm + multi-regime + XGBoost gate + forward arena) is described conceptually without implementation detail.


Risk disclaimer: Trading futures involves substantial risk of loss. Past performance is not indicative of future results.