
Elon Musk says xAI is targeting 50 million 'H100 equivalent' AI GPUs in five years, with 230k GPUs, including 30k GB200s, reportedly already operational for training Grok

Four banks of xAI's HGX H100 server racks, holding eight servers each.
(Image credit: ServeTheHome)

Leading AI companies love to tout the number of GPUs they use or plan to deploy. Just yesterday, OpenAI announced plans to build infrastructure to power two million GPUs, but now Elon Musk has revealed an even more colossal target: the equivalent of 50 million H100 GPUs deployed for AI over the next five years. Yet while the number of H100 equivalents looks massive, the actual number of GPUs xAI would have to deploy is considerably smaller. The power they will consume is another matter.
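The arithmetic behind both claims is simple enough to sanity-check. Here is a back-of-the-envelope sketch in Python, assuming roughly 1 PFLOPS of dense BF16 throughput and a 700 W TDP per H100 (both figures appear in the tables below):

```python
# Back-of-the-envelope math for 50 million H100 equivalents.
# Assumptions: ~0.99 dense BF16 PFLOPS and a 700 W TDP per H100 (see tables below).
H100_BF16_PFLOPS = 0.99
H100_TDP_W = 700
TARGET_H100_EQUIVALENTS = 50_000_000

total_pflops = TARGET_H100_EQUIVALENTS * H100_BF16_PFLOPS
total_zettaflops = total_pflops / 1e6              # 1 ZettaFLOPS = 1e6 PFLOPS
print(f"Compute: ~{total_zettaflops:.1f} ZettaFLOPS of dense BF16")  # ~49.5

total_gw = TARGET_H100_EQUIVALENTS * H100_TDP_W / 1e9
print(f"Power:   {total_gw:.0f} GW for the GPU silicon alone")       # 35 GW
```

That 35 GW counts only the GPUs themselves; cooling, networking, and the rest of the data center come on top, which is why fewer, denser packages matter so much.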

50 ZettaFLOPS for AI training

Nvidia enterprise GPU roadmap

| Year | 2022 | 2023 | 2024 | 2025 | 2026 | 2027 |
| --- | --- | --- | --- | --- | --- | --- |
| Architecture | Hopper | Hopper | Blackwell | Blackwell Ultra | Rubin | Rubin Ultra |
| GPU | H100 | H200 | B200 | B300 (Ultra) | VR200 | VR300 (Ultra) |
| Process technology | 4N | 4N | 4NP | 4NP | N3P (3NP?) | N3P (3NP?) |
| Physical configuration | 1x reticle-sized GPU | 1x reticle-sized GPU | 2x reticle-sized GPUs | 2x reticle-sized GPUs | 2x reticle-sized GPUs, 2x I/O chiplets | 4x reticle-sized GPUs, 2x I/O chiplets |
| FP4 PFLOPS (per package) | - | - | 10 | 15 | 50 | 100 |
| FP8/FP6 PFLOPS (per package) | 2 | 2 | 4.5 | 10 | ? | ? |
| INT8 PFLOPS (per package) | 2 | 2 | 4.5 | 0.319 | ? | ? |
| BF16 PFLOPS (per package) | 0.99 | 0.99 | 2.25 | 5 | ? | ? |
| TF32 PFLOPS (per package) | 0.495 | 0.495 | 1.12 | 2.5 | ? | ? |
| FP32 PFLOPS (per package) | 0.067 | 0.067 | 0.08 | 0.083 | ? | ? |
| FP64 / FP64 tensor TFLOPS (per package) | 34/67 | 34/67 | 40 | 1.39 | ? | ? |
| Memory | 80 GB HBM3 | 141 GB HBM3E | 192 GB HBM3E | 288 GB HBM3E | 288 GB HBM4 | 1 TB HBM4E |
| Memory bandwidth | 3.35 TB/s | 4.8 TB/s | 8 TB/s | 8 TB/s | 13 TB/s | 32 TB/s |
| GPU TDP | 700 W | 700 W | 1,200 W | 1,400 W | 1,800 W | 3,600 W |
| CPU | 72-core Grace | 72-core Grace | 72-core Grace | 72-core Grace | 88-core Vera | 88-core Vera |
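Taking the roadmap's dense BF16 row at face value, each package's worth in "H100 equivalents" is a simple ratio, as the sketch below shows (Rubin-generation BF16 rates are still unpublished, so they are omitted):

```python
# "H100 equivalents" per package, from the dense BF16 PFLOPS row above.
BF16_PFLOPS = {"H100": 0.99, "H200": 0.99, "B200": 2.25, "B300 (Ultra)": 5.0}

h100 = BF16_PFLOPS["H100"]
for gpu, pflops in BF16_PFLOPS.items():
    print(f"{gpu:12s} = {pflops / h100:.2f}x H100")  # B200 ~2.27x, B300 ~5.05x
```

Those ratios land close to the clean generation-over-generation doubling that the projection below assumes.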


| GPU Model | TFLOPS (dense) | Power per GPU (W) | GPUs Needed | Total Power (GW) |
| --- | --- | --- | --- | --- |
| H100 | 1,000 | 700 | 50,000,000 | 35.00 |
| B200 | 2,400 | 1,200 | 20,833,333 | 25.00 |
| B300 | 4,800 | 1,400 | 10,416,666 | 14.58 |
| Rubin | 9,600 | 1,800 | 5,208,333 | 9.37 |
| Rubin Ultra | 19,200 | 3,600 | 2,604,166 | 9.37 |
| Feynman | 38,400 | ? | 1,302,083 | 4.685 (?) |
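The "GPUs Needed" and "Total Power" columns follow from the same ratio arithmetic. A minimal sketch, using the dense TFLOPS and TDP figures assumed in the table (the Feynman entries extrapolate the doubling trend and are not Nvidia specifications):

```python
# Reproduce the projection: GPUs needed to match 50M H100s of dense compute,
# plus total power draw and performance per watt. Figures are the table's assumptions.
TARGET_TFLOPS = 50_000_000 * 1_000     # 50M H100s at 1,000 dense TFLOPS each

gpus = {
    # name: (dense TFLOPS per package, TDP in watts; None where unannounced)
    "H100":        (1_000,    700),
    "B200":        (2_400,  1_200),
    "B300":        (4_800,  1_400),
    "Rubin":       (9_600,  1_800),
    "Rubin Ultra": (19_200, 3_600),
    "Feynman":     (38_400,  None),    # assumed 2x Rubin Ultra; TDP unknown
}

for name, (tflops, tdp) in gpus.items():
    needed = TARGET_TFLOPS // tflops
    line = f"{name:12s} {needed:>12,} GPUs"
    if tdp is not None:
        line += f", {needed * tdp / 1e9:5.2f} GW, {tflops / tdp:.2f} TFLOPS/W"
    print(line)
```

Performance per watt still improves every generation through Rubin and holds flat at Rubin Ultra (5.33 TFLOPS/W for both), which is why the projected total power keeps shrinking even as individual packages approach 3.6 kW.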

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.

  • Bikki
    Thanks for the analysis, very well done.
    Sam with the power infra and Musk with the compute infra may meet at each other’s ends in 2029.
  • joeer77
    If you don't own Nvidia stock, you are missing the boat!
  • vanadiel007
    100 million, 200, 700 million!

    And we still will have no intelligence in AI. In my opinion, they are glorified data search engines.
    The only accomplishment I can see them making is dethroning Google as a search engine.
  • Gururu
    Dollars to donuts, in that timeframe we will have tech (hardware/software, if not completely hardware) running at least tenfold more efficiently than "H100 equivalents".
  • bigdragon
    No wonder my electricity, water, and gaming costs keep increasing. All these hardware improvements should help Grok hallucinate in a fraction of the amount of time it takes now! I'm so excited to have it unreliably perform all the same tasks Siri can do with 1,000,000x the environmental impact!
  • jp7189
    16-bit performance hasn't changed in quite a while. The only things changing are the <mostly marketing> tricks used to report the numbers. Apples to apples, it shakes out like this (source: TechPowerUp):

    H100 SXM = 267.6 TFLOPS
    H200 SXM = 267.6
    B200 SXM = 248.3*

    *These are single-die numbers; the B200 has 2x dies per package, so while the package is a good bit faster, the underlying point is the same: Nvidia doesn't have any fundamentally groundbreaking new performance now or anywhere in sight.
  • JRStern
    Gururu said:
    Dollars to donuts, in that timeframe we will have tech (hardware/software, if not completely hardware) running at least tenfold more efficiently than "H100 equivalents".
    We (or they) can already build today's chatbots about 1000x faster than five years ago.
    By next year they'll add another 10x at least when all the B200/B300s are up and running.
    There are several different hardware and software improvements kicking around now that claim another 10x or better, each.
    Basically by Christmas the world will have all the GPUs they'll need for the next ten years.

    By 2035, high school kids will build their own chatbots with components from Hobby Lobby and their own smartphones.
  • JRStern
    jp7189 said:
    16-bit performance hasn't changed in quite a while. The only things changing are the <mostly marketing> tricks used to report the numbers. Apples to apples, it shakes out like this (source: TechPowerUp):

    H100 SXM = 267.6 TFLOPS
    H200 SXM = 267.6
    B200 SXM = 248.3*

    *These are single-die numbers; the B200 has 2x dies per package, so while the package is a good bit faster, the underlying point is the same: Nvidia doesn't have any fundamentally groundbreaking new performance now or anywhere in sight.
    The addition of more HBM helps overall throughput.
    So does faster networking.
    So do FP8 and FP4.
    There are several other improvements kicking around. FP16 may not change but there is much else that can improve, a lot.
  • JRStern
    I mean, Musk talks a lot, and no doubt retains the option of changing his mind.
    Altman's megalomaniac dreams and fears are apparently fully present in Musk as well.
    HOWEVER, the idea that AI means scale and scale means AI has pretty much already failed.
    Musk can spend every penny he can lay hands on buying GPUs and get himself in the history books as the craziest dude in history.
    He may already be up for that with Starship and his Mars project, which is already on the edge of flaming out. There's no purpose in it; Mars is a poor destination, and it will cost at least 3x what Musk has imagined to complete. Is it worth that?
    I salute Musk's madness for trying, with so little analysis. Full employment for thousands of engineers and craftsmen. Better than just building giant yachts and stuff.
    Musk's SpaceX is a great success, though they had to hose him down a few times so they could put in the discipline to make it work.
    Musk's Tesla gave us the modern EV ten years before it would otherwise have arrived. Whether that is a good thing or not, I can't say.
    Musk bought Twitter and cleaned it up like Hercules and the Augean stables, a story for all time, even if he blew an extra $20b on it simply because he ran his mouth - that's a story for all time, too.
    And he got Trump elected or we'd now have President Kamala, you can rate that as you like.
    Crazy old world, ain't it.
    But the whole world doesn't need 50 million H100s now or ever.
    Better he build a ten gigawatt Magic 8 Ball.
  • dalek1234
    3600 Watts per GPU in 5 years? Does performance-per-watt no longer matter?