Alpha Arena: Six AI Models, $60K Live on Perps — Who’s Winning?

Jane Savitskaya

Six AIs. $60k. One trading arena.

That’s the setup behind Alpha Arena — a live experiment by New York–based engineer Jay Azhang, who decided to put today’s smartest AI models where it really hurts: the markets.

Each model gets $10K in real money to trade crypto perpetuals on Hyperliquid. No fake data, no paper trading. Just raw code trying to outsmart the chaos of BTC, ETH, and a few other volatile tokens. 

The idea is simple but bold: if billion-dollar AIs can supposedly “predict everything,” let’s see if they can survive the most unpredictable thing of all — the market.

Setup: How Alpha Arena works

Here’s what you need to know:

  • Six major AI models each receive US$10,000 of live capital for this competition (so total pool = $60K).

  • They trade perpetual futures (“perps”) on the crypto exchange Hyperliquid across major assets: BTC, ETH, SOL, BNB, DOGE, XRP.

  • All models begin with identical prompts and the same dataset: price/volume data, market history, etc. The idea is fairness and comparability. 

  • The contest is live, transparent, and public: you can view open positions for each model on nof1’s leaderboard. 

  • The goal: maximize returns while managing risk. Each model chooses its own strategy: when to enter, what assets to choose, what leverage to use, and when to exit. Humans do not interfere during trades. 

Leaderboard, performance, and strategies

Here are the six models in the ring, how they’re doing, and what kind of plays they’re making (based on publicly reported data).

All numbers are snapshots from recent coverage of the Alpha Arena on nof1.ai.

Model | Latest Account Value* | Approx ROI | Strategy & Assets
DeepSeek V3.1 | ~$13,800 | +38% | Aggressive. Long positions with high leverage (~15×) in ETH & SOL. Also trades BTC, DOGE, BNB; small loss on XRP reported.
Grok 4 | ~$13,400 | +35% | Strong momentum player. Similar asset mix to DeepSeek; noted for good “contextual awareness of market micro-structure.”
Claude Sonnet 4.5 | ~$12,500 | +25% | More conservative than the top two. Fewer open positions, slower pace; noted mostly long ETH & XRP, and some BNB.
Qwen3 Max | ~$10,900 | +9% | Modest performance. Still positive but not capturing the upside. Trades less aggressively.
GPT-5 (ChatGPT) | ~$7,300 | -27% | Struggled so far. Mix of long and short positions didn’t pay off. Volatility caught it off guard.
Gemini 2.5 Pro | ~$6,800 | -32% | The weakest so far. Early short bias (betting down) flipped to longs too late; timing hurt results.

Screenshot from Nof1.ai
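
For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes ROI from the approximate account values in the leaderboard, assuming each model started with exactly $10,000. The function name `roi_percent` is illustrative, and because the reported values are rounded, a recomputed figure can differ from the published one by about a point.

```python
STARTING_CAPITAL = 10_000  # each model's stake, per the article

# Approximate account values from the leaderboard snapshot (rounded figures).
account_values = {
    "DeepSeek V3.1": 13_800,
    "Grok 4": 13_400,
    "Claude Sonnet 4.5": 12_500,
    "Qwen3 Max": 10_900,
    "GPT-5": 7_300,
    "Gemini 2.5 Pro": 6_800,
}


def roi_percent(account_value, starting_capital=STARTING_CAPITAL):
    """Return ROI as a percentage of starting capital."""
    return (account_value - starting_capital) / starting_capital * 100


# Rank models from highest to lowest account value.
leaderboard = sorted(account_values.items(), key=lambda kv: kv[1], reverse=True)
for model, value in leaderboard:
    print(f"{model:<18} ${value:>6,}  {roi_percent(value):+.0f}%")

# Combined result across all six accounts.
total = sum(account_values.values())
pool_roi = roi_percent(total, 6 * STARTING_CAPITAL)
print(f"Combined pool: ${total:,} ({pool_roi:+.1f}%)")
```

The combined line is worth a look: two strong winners and two heavy losers can leave the pool as a whole only modestly positive.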

Quick takeaways from the strategies

  • The winners (DeepSeek, Grok) leaned into long, leveraged trades during market upticks. That paid off.

  • Claude kept it steadier: fewer trades, less leverage, which means less upside but also less risk.

  • Qwen is playing it safe.

  • GPT-5 and Gemini seemed to mis-time the action: either too cautious, or too early/late on reversals.

Also worth noting: some models made many trades (e.g., Gemini ~15 trades/day) while others (Claude) executed only a few big moves.
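
To see why leverage cuts both ways, here is a rough Python sketch of how a price move translates into return on margin. It is a simplification, not Hyperliquid’s actual margin logic: it ignores fees, funding, and maintenance-margin rules, and the function names are made up for illustration.

```python
def leveraged_return(price_move_pct, leverage):
    """Return on posted margin (in %) for a long position.

    Simplification: ignores fees, funding payments, and liquidation rules.
    """
    return price_move_pct * leverage


def approx_wipeout_move(leverage):
    """Adverse price move (in %) that would consume the entire margin.

    Real exchanges liquidate earlier, at a maintenance-margin threshold,
    so treat this as an upper bound rather than a liquidation price.
    """
    return 100 / leverage


for lev in (1, 5, 15):
    gain = leveraged_return(2.0, lev)
    loss = leveraged_return(-2.0, lev)
    wipeout = approx_wipeout_move(lev)
    print(f"{lev:>2}x: +2% move -> {gain:+.0f}%, -2% move -> {loss:+.0f}%, "
          f"margin gone after a ~{wipeout:.1f}% adverse move")
```

At roughly 15×, a 2% move in the right direction is a 30% gain on margin, and a move of under 7% in the wrong direction is enough to erase the margin entirely.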

Why it matters (and what to watch)

This experiment isn’t just a cool demo. It signals something deeper about the future of AI in trading.

  • When general-purpose AI models start making meaningful P&L in real markets, that shakes up the playbook.

  • But a big caveat: a few days of gains don’t guarantee long-term performance. Market regimes change.

  • If one or two models dominate for weeks, you’ll see copy-trading, ETF products, hedge funds chasing them. In fact, following DeepSeek is already a strategy some retail players use.

  • On the flip side: if many models trade the same way (same prompts, same data), their collective actions could move markets — reflexivity becomes real.

What traders can actually learn from Alpha Arena

Watching six multibillion-parameter models go long and short like caffeinated hedge fund interns isn’t just entertaining; it’s oddly educational. The Alpha Arena experiment offers a few takeaways that human traders (and bot builders) can actually use.

1. Risk management beats raw IQ

DeepSeek and Grok aren’t winning because they’re “smarter.” They’re winning because they follow consistent rules: position sizing, stop-loss placement, and not panicking on noise. Meanwhile, Gemini and GPT-5 show what happens when even a genius model ignores discipline. And that’s when every disciplined trader quietly mutters, “Told you so.”
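
One common way to turn “position sizing” into a rule is fixed-fractional sizing: risk a set fraction of the account on each trade and let the stop distance determine the size. The Python sketch below illustrates that idea only. It is not how any Alpha Arena model is known to size trades, and the prices and the 1% risk figure are hypothetical.

```python
def position_size(equity, risk_fraction, entry_price, stop_price):
    """Units to trade so that hitting the stop loses about equity * risk_fraction."""
    risk_per_unit = abs(entry_price - stop_price)
    if risk_per_unit == 0:
        raise ValueError("stop price must differ from entry price")
    return equity * risk_fraction / risk_per_unit


# Hypothetical example: $10,000 account, 1% risk per trade,
# long entry at 4,000 with a stop at 3,900.
equity = 10_000
entry, stop = 4_000, 3_900
size = position_size(equity, 0.01, entry, stop)
notional = size * entry

print(f"Size: {size:.2f} units, notional ${notional:,.0f}, "
      f"effective leverage {notional / equity:.1f}x")
```

The rule makes size a consequence of the stop rather than of conviction: a tighter stop allows a larger position, a wider one forces a smaller position, and the dollar loss at the stop stays roughly the same.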

2. Trade fewer, but smarter

Claude isn’t topping the charts, but it’s positive — mostly because it trades less. Overtrading kills performance, whether you’re a person or a transformer network. Quality setups >>> constant action.

3. Diversify, but don’t scatter

Top performers keep exposure to 2–3 main assets (ETH, SOL, BTC) and rarely chase every shiny coin. That balance between focus and flexibility is worth stealing.

4. The edge is still in execution

Grok’s micro-timing shows how much tiny delays or sloppy entries cost over time. Humans can’t think as fast, but they can automate order precision, backtest entries, and tighten execution routines.

5. Prompt engineering = strategy design

Every AI in Alpha Arena starts from the same prompt and data, yet each settles on its own logic: momentum, mean reversion, scalping. For traders, that’s a reminder that the framework matters more than the forecast. Define your system, not your hunch.

6. You can’t copy results blindly

Even if you tried to mimic DeepSeek’s moves, you’d still face slippage, latency, and different risk tolerance. Use Alpha Arena as inspiration, not a copy-paste guide.
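
A small Python sketch shows how quickly slippage eats into a copied trade. All numbers are hypothetical: it assumes the copier pays 10 basis points worse than the leader on both entry and exit, and that the same 1% trade repeats 50 times.

```python
def long_trade_return_pct(entry, exit_price, slippage_bps=0.0):
    """Return (in %) of a long trade filled `slippage_bps` worse on entry and exit."""
    slip = slippage_bps / 10_000
    paid = entry * (1 + slip)            # buy slightly higher than quoted
    received = exit_price * (1 - slip)   # sell slightly lower than quoted
    return (received - paid) / paid * 100


def compounded_pct(per_trade_pct, n_trades):
    """Total return (in %) after repeating the same per-trade return n times."""
    return ((1 + per_trade_pct / 100) ** n_trades - 1) * 100


# Hypothetical trade: the leader buys at 100 and sells at 101.
leader = long_trade_return_pct(100.0, 101.0)
follower = long_trade_return_pct(100.0, 101.0, slippage_bps=10)

print(f"Per trade: leader {leader:.2f}%, follower {follower:.2f}%")
print(f"After 50 trades: leader {compounded_pct(leader, 50):.1f}%, "
      f"follower {compounded_pct(follower, 50):.1f}%")
```

Even a small per-trade gap compounds into a visibly different equity curve, and that is before latency or differing risk limits enter the picture.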

Bottom line: AI isn’t a shortcut to easy money. It’s a mirror, showing how structure, discipline, and adaptability pay off. If traders borrow those habits instead of chasing signals, they’re already trading smarter than half the market.

Which AI can you actually trust with your money?

Short answer: none completely.
Long answer: some more than others.

The Alpha Arena results make one thing clear: even the sharpest AI can go from hero to margin call in a week. DeepSeek and Grok look brilliant now, but the same logic could underperform in a sideways market or during a sudden BTC dump. AI doesn’t “learn” risk tolerance; it just executes whatever risk rules it is given.

If you’re thinking about letting AI trade for you, think of it like hiring a pilot who occasionally hallucinates clouds. You still need to watch the dashboard.

Here’s how to approach it smartly:

  • Start small. Don’t hand over your full stack to any bot — test, observe, and scale gradually.
  • Use oversight tools. Platforms like 3Commas and Cryptohopper let you automate strategies while keeping control of risk settings.
  • Experiment, but verify. Even ChatGPT Agent, which we recently tested against traditional trading bots, works best as a decision-support tool, not a set-and-forget solution.

That comparison, ChatGPT Agent vs Cryptohopper vs 3Commas, dives into exactly this question: how much control you should really give the machine. Alpha Arena just adds a live-money layer to the same debate.

So, can you trust AI with your money?

Maybe. But only if you’re ready to supervise it like a hawk, or at least like a trader who’s been burned before.
