
Why graphics cards for AI and cryptocurrency?


I know the answer from AI, but I don’t know why traditional processors can’t outperform an Nvidia graphics card with 4 times the cores.

I looked at one of the graphics cards and it was $5,000.

But I think there is a demand here. If someone could utilize traditional processors for better parallel processing than Nvidia, they would save the PC industry.

Of course I am not on board with AI. It makes changes to the computer while you program with it. But I can’t ignore it. I just don’t understand why traditional processors can’t be as efficient.

8 hours ago, Trurl said:

I know the answer from AI, but I don’t know why traditional processors can’t outperform an Nvidia graphics card with 4 times the cores.

I looked at one of the graphics cards and it was $5,000.

But I think there is a demand here. If someone could utilize traditional processors for better parallel processing than Nvidia, they would save the PC industry.

Of course I am not on board with AI. It makes changes to the computer while you program with it. But I can’t ignore it. I just don’t understand why traditional processors can’t be as efficient.

It’s not 4x the cores

My Mac’s CPU has 8 cores; I think their top of the line these days is 24.

According to Nvidia, a GPU has thousands. One thousand is more than 40x. (40 x 24 = 960)

https://blogs.nvidia.com/blog/why-gpus-are-great-for-ai/

So “thousands” would be ~100x the number of cores

1 hour ago, swansont said:

I think their top of the line these days is 24.

Not really. The AMD Ryzen Threadripper PRO 9995WX has 96 cores / 192 threads.

https://www.cpubenchmark.net/cpu.php?cpu=AMD+Ryzen+Threadripper+PRO+9995WX&id=6693

1 hour ago, swansont said:

According to Nvidia, a GPU has thousands. One thousand is more than 40x. (40 x 24 = 960)

The H100, i.e. what is used for ChatGPT, has 16896 CUDA cores.

The A100 has 6912 CUDA cores.

9 hours ago, Trurl said:

I looked at one of the graphics cards and it was $5,000.

A graphics card for $5k is too weak for this. Such cards are used for bitcoin mining.

The A100 used for ChatGPT goes for $10-20k and the H100 for $30-40k.

30,000 or more such A100/H100 cards are used simultaneously.

10 minutes ago, swansont said:

Are those used in Macs, as I had specified?

The best one as of 2026 from Mx line is this one:

https://www.cpubenchmark.net/cpu.php?cpu=Apple+M5+Max+18+Core&id=7231

Total Cores: 18 Cores, 18 Threads

Primary Cores: 6 Cores, 6 Threads, 2.0 GHz Base, 4.6 GHz Turbo

Secondary Cores: 12 Cores, 12 Threads, 3.0 GHz Base, 4.3 GHz Turbo

It looks poor in comparison with the best AMD and Intel on the multi-thread charts: roughly 3x slower (173k vs 58k score).

The M5 only wins on the single-thread charts. https://www.cpubenchmark.net/singleThread.html

4 hours ago, swansont said:

I think you've got the wrong page, because here we see some kind of carrot grater... :P

[Image: GRATER.png]

PS. But seriously, the M5 Max (release date March 2026) is newer and faster than the M2 Ultra (released 2023).

4 hours ago, swansont said:

“M2 Ultra chip for phenomenal performance 24-core CPU”

“Think differently” (which I interpret as: "think like an idiot")

Search for M5 and M2 on this list:

(they are about 1/3 of the way down the page)

(just a few lines below the M5 there is an Intel for $420, and a bit below that an AMD Ryzen 9 for $400)


I purchased 2 Xeon workstations for $800 each. They have 2 processors at 6 cores each. So that’s 12 cores.

I am way behind on modern computers because I find that new computers lack useful software. In my opinion, modern software is not useful. I know there is a lot of free software, but the user experience for purchased software is bad.

My figuring was to use the Xeon servers to crunch numbers. But a web search leads to AI, and it says graphics cards are used for bitcoin and AI because their cores process small matrices faster.

My question is: why can’t 12 Xeon cores crunch numbers better than a graphics card? I’m sure Intel is wondering the same thing. There is a market for someone who can solve Intel’s problem.

I would like to see a return to older models of computer design. I want a computer that does what I tell it, not one where I OK changes to my computer without knowing what it is changing.

Basically I want an AI that can be trained on traditional processors.

Of course I don’t have the knowledge of how a CPU or GPU actually works, but maybe it could be as simple as organizing tasks. I have done a little research, and parallel processing is hard to tune in simple programs. For instance, when programming a math algorithm, a search that led to Reddit said it is more efficient to let the computer itself decide how to utilize the cores.

This again is the trend to let the computer do the thinking.

BTW

Mathematica now supports more than 5 cores after a software reinstall.

Do you think I spent my $1600 on the Xeon workstations wisely? I mean, I can’t afford a graphics card that costs as much as a car. And I don’t want to crunch numbers blindly. By that I mean I can edit Mathematica; I don’t want to train a GPU with my equations, have it write a program, and have it give me a number.

Am I just old? Will future processors be built for something other than AI?

4 hours ago, Trurl said:

Do you think I spent my $1600 on the Xeon workstations wisely?

That depends on which Xeon you bought and when..

But a Xeon with 6 cores sounds outdated now, and $800 is actually too much.

Find your Xeon on cpubenchmark.net and link it here.

For example, the Intel Xeon E5-2660 @ 2.20GHz, Cores: 8, Threads: 16:

https://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+E5-2660+%40+2.20GHz&id=1219

has a score of 8074. That is ~14% (1/7) of the best Mac laptop in the world (because the M5 scores 58514).

Two such Xeons, and it is about 1/4 of the best Mac laptop in the world.

But such a CPU, refurbished (because it is too old and not produced anymore), is available here for... 8 USD.

For it, you obviously need a special motherboard, and that can cost some money, but it is still only 94 USD (HP ProLiant DL380p G8 732143-001 Dual Socket LGA2011 Server).

You add 2 coolers, some old graphics card (for a start), and a lot of memory: one DDR3 module for 8 USD, 24 slots x 8 USD = 192 USD, 4 GB x 24 = 96 GB RAM.

Price so far: about 8x2 + 94 + 192 = ~300 USD... and that is only because I maxed out the DDR3 slots ;)

4 hours ago, Trurl said:

it is more efficient to let the computer itself decide how to utilize the cores.

Operating systems have always decided, for every program, which thread runs on which core. And they switch over time, so that no single core sits at 100% while the other cores are 0% (idle); otherwise one core would overheat while the others don't.

The user or a program can set the "Affinity". You have been able to do this for ages, even on Windows XP if I recall correctly, inside Task Manager. With it you can force one specific process to use one specific core. But you don't want to do that without a good reason, e.g. the program is obsolete and does not work well on multi-core CPUs.
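
For reference, here is a minimal programmatic sketch (my own illustration, not from the thread; Linux/glibc only) that pins the calling thread to core 0 using the non-portable pthread_setaffinity_np(); the rough Windows equivalent is SetThreadAffinityMask().

#include <pthread.h>
#include <sched.h>
#include <iostream>

int main() {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);   // allow only core 0

    // Pin the calling thread to the chosen core
    int err = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (err != 0) {
        std::cerr << "pthread_setaffinity_np failed: " << err << std::endl;
        return 1;
    }

    std::cout << "Pinned to core 0" << std::endl;
    return 0;
}

Compile with g++ affinity.cpp -o affinity -pthread (the file name is just an example).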

4 hours ago, Trurl said:

This again is the trend to let the computer do the thinking.

I think they simply meant that instead of a human specifying the number of threads, it is read from the CPU info and that thread count is used, so each thread (in the programming sense) runs on one thread (in the CPU sense, which sometimes means a core).

Simply ask ChatGPT to generate it:

#include <iostream>
#include <thread>

int main() {
    // Get the number of hardware-supported threads (logical cores)
    unsigned int threadCount = std::thread::hardware_concurrency();

    std::cout << "Available threads: " << threadCount << std::endl;

    return 0;
}

(don't ask ChatGPT for too much in a single request; ask for one simple logic block at a time, and at that it is OK)

Usage:

#include <thread>
#include <vector>

int main() {
    // Get hardware thread count
    unsigned int n = std::thread::hardware_concurrency();

    // Fallback in case the value is not available
    if (n == 0) n = 4;

    std::vector<std::thread> threads;

    for (unsigned int i = 0; i < n; ++i) {
        threads.emplace_back([]() {
            // Work to be done in each thread
        });
    }

    // Join all threads
    for (auto& t : threads) {
        t.join();
    }

    return 0;
}

I compiled it with g++ src.cpp -o dst and both worked fine.

You do your heavy computation inside the lambda passed to emplace_back().
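
To make that concrete, here is a minimal sketch (my own illustration, not from the thread) that splits an arbitrary example range across all hardware threads and counts primes by simple trial division inside each lambda:

#include <algorithm>
#include <atomic>
#include <cstdint>
#include <iostream>
#include <thread>
#include <vector>

// Simple trial-division primality check (for illustration only)
static bool isPrime(uint64_t n) {
    if (n < 2) return false;
    for (uint64_t i = 2; i * i <= n; ++i)
        if (n % i == 0) return false;
    return true;
}

int main() {
    const uint64_t from = 2, to = 1000000;    // arbitrary example range

    unsigned int n = std::thread::hardware_concurrency();
    if (n == 0) n = 4;                        // fallback, as above

    std::atomic<uint64_t> primeCount{0};
    std::vector<std::thread> threads;

    // Split [from, to) into n roughly equal chunks, one per thread
    uint64_t chunk = (to - from + n - 1) / n;
    for (unsigned int i = 0; i < n; ++i) {
        uint64_t lo = from + i * chunk;
        uint64_t hi = std::min(lo + chunk, to);
        threads.emplace_back([lo, hi, &primeCount]() {
            uint64_t local = 0;
            for (uint64_t x = lo; x < hi; ++x)
                if (isPrime(x)) ++local;
            primeCount += local;              // one atomic update per thread
        });
    }

    for (auto& t : threads) t.join();

    std::cout << "Primes in [" << from << ", " << to << "): "
              << primeCount << std::endl;
    return 0;
}

It compiles the same way with g++ (add -pthread on Linux).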

4 hours ago, Trurl said:

I purchased 2 Xeon workstations for $800 each. They have 2 processors at 6 cores each. So that’s 12 cores.

2 machines x 2 CPUs x 6 cores = 24 cores, by my count ;)

And they probably have 2 threads per core, so 48 threads in total.

That's a very good setup for a mathematician, a 3D artist, or simply a gamer... ;)

But you need to split the work wisely between threads, and have network communication between the machines, so they can tell each other which range of values each one is crunching, etc.

If it is meant for primes, it can be pretty simple: the 1st machine only tests candidates with last digit 1 or 3, and the 2nd machine only 7 or 9 (or 1 & 7 and 3 & 9). We can safely exclude last digits 2, 4, 6, 8 and 5, 0 for obvious reasons.

You can set some environment variable to 1 on one machine and 2 on the other, then read it inside the C++ code and from that crunch different ranges (so no actual network communication is needed; that could be hard for you). A small sketch combining this with the last-digit split follows the snippet below.

ChatGPT generated:

#include <cstdlib>
#include <iostream>
#include <string>

// Get integer value from environment variable
// Returns defaultValue if variable is missing or invalid
int getEnvInt(const char* name, int defaultValue) {
    // Try to get environment variable
    if (const char* val = std::getenv(name)) {
        try {
            // Convert string to integer
            return std::stoi(val);
        } catch (...) {
            // Conversion failed (invalid format, overflow, etc.)
        }
    }

    // Return default if not found or conversion failed
    return defaultValue;
}

int main() {
    int machine = getEnvInt("MACHINE", 0);
    std::cout << "Machine: " << machine << std::endl;
}

Set the environment variable to a different value on each machine, then read it in the code and voila, they crunch different ranges.
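
Putting the two ideas together, here is a minimal sketch (my own illustration, with a placeholder range and no real primality test): machine 1 only looks at candidates ending in 1 or 3, machine 2 at those ending in 7 or 9, based on a MACHINE environment variable.

#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <string>

int main() {
    // Read MACHINE from the environment (defaults to machine 1)
    int machine = 1;
    if (const char* val = std::getenv("MACHINE")) {
        try { machine = std::stoi(val); } catch (...) {}
    }

    // Machine 1 takes candidates ending in 1 or 3, machine 2 takes 7 or 9
    const int digitA = (machine == 1) ? 1 : 7;
    const int digitB = (machine == 1) ? 3 : 9;

    for (uint64_t n = 11; n < 100; n += 2) {     // tiny placeholder range
        const int last = static_cast<int>(n % 10);
        if (last != digitA && last != digitB)
            continue;                            // the other machine's share
        // ... run the actual primality test on n here ...
        std::cout << "machine " << machine << " would test " << n << "\n";
    }
    return 0;
}

Run it as MACHINE=1 ./worker on one machine and MACHINE=2 ./worker on the other (on Windows, set MACHINE in the environment before starting the program).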

6 hours ago, Trurl said:

Basically I want a Ai that can be trained on traditional processors.

Training your own LLM on your home computer makes very little sense. You have ChatGPT, DeepSeek, Gemini, etc.

Why train your own? It won't be comparable to the available engines, i.e. it will make even more errors.

Training ChatGPT takes from a few months to a year, on 30,000+ GPU cards. What can you do with your single machine?

While you are at it, you can try projects from https://huggingface.co/

It is kind of like a "GitHub for LLMs".

6 hours ago, Trurl said:

And I don’t want to crunch numbers blindly.

An LLM won't change this, even if you train your own LLM on your own hardware. It will not give answers that were not already inside its training material. It won't give you as-yet-unknown magic mathematical formulas, nor will it give you new physical equations and theories, etc. Simply forget about such ridiculous tasks.

I had a discussion with ChatGPT about writing an algorithm for finding patterns in a number.

I.e. you don't need to test every divisor from 2 to sqrt(333) to find that 333 is divisible by 111 (or 3), just because there is a visible pattern in this number straight away.

But it is visible to a human, not to a computer / algorithm. A pattern in the decimal system won't be visible in the binary or hexadecimal system, and vice versa.

So, instead of a brute-force algorithm, find a pattern in a number and you know it is not a prime.

If a pattern is not easily visible, do the brute-force method to be sure.
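
As one concrete example of such a pattern check, here is a minimal sketch (my own illustration, not from the discussion with ChatGPT): the decimal digit-sum test for divisibility by 3 rules out a number like 333 before any brute force is attempted.

#include <cstdint>
#include <iostream>

// True if the decimal digit sum is divisible by 3 (then the number is too)
static bool divisibleBy3(uint64_t n) {
    uint64_t sum = 0;
    for (uint64_t x = n; x > 0; x /= 10) sum += x % 10;
    return sum % 3 == 0;
}

// Brute-force fallback: trial division up to sqrt(n)
static bool isPrimeBruteForce(uint64_t n) {
    if (n < 2) return false;
    for (uint64_t i = 2; i * i <= n; ++i)
        if (n % i == 0) return false;
    return true;
}

int main() {
    const uint64_t n = 333;                      // the example from above
    if (n > 3 && divisibleBy3(n))
        std::cout << n << " is not prime (decimal digit-sum pattern)\n";
    else
        std::cout << n << (isPrimeBruteForce(n) ? " is prime\n" : " is not prime\n");
    return 0;
}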

Did you try to write probabilistic primality tests in C/C++?

https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test
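
For reference, a minimal Miller-Rabin sketch (my own illustration, not from the thread). It relies on GCC/Clang's __uint128_t for overflow-free modular multiplication, and uses the fixed witness set 2..37, which is known to be deterministic for 64-bit inputs.

#include <cstdint>
#include <iostream>

// Modular multiplication using a 128-bit intermediate to avoid overflow
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m) {
    return (__uint128_t)a * b % m;
}

// Modular exponentiation: base^exp mod m
static uint64_t powmod(uint64_t base, uint64_t exp, uint64_t m) {
    uint64_t result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = mulmod(result, base, m);
        base = mulmod(base, base, m);
        exp >>= 1;
    }
    return result;
}

// Miller-Rabin test, deterministic for 64-bit inputs with this witness set
bool isPrime(uint64_t n) {
    if (n < 2) return false;
    for (uint64_t p : {2ULL, 3ULL, 5ULL, 7ULL, 11ULL, 13ULL, 17ULL, 19ULL, 23ULL, 29ULL, 31ULL, 37ULL})
        if (n % p == 0) return n == p;

    // Write n - 1 as d * 2^r with d odd
    uint64_t d = n - 1;
    int r = 0;
    while ((d & 1) == 0) { d >>= 1; ++r; }

    for (uint64_t a : {2ULL, 3ULL, 5ULL, 7ULL, 11ULL, 13ULL, 17ULL, 19ULL, 23ULL, 29ULL, 31ULL, 37ULL}) {
        uint64_t x = powmod(a, d, n);
        if (x == 1 || x == n - 1) continue;
        bool composite = true;
        for (int i = 1; i < r; ++i) {
            x = mulmod(x, x, n);
            if (x == n - 1) { composite = false; break; }
        }
        if (composite) return false;
    }
    return true;
}

int main() {
    for (uint64_t n : {97ULL, 341ULL, 561ULL, 2147483647ULL})
        std::cout << n << (isPrime(n) ? " is prime" : " is not prime") << std::endl;
    return 0;
}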

Graphics cards rely on massively parallel 'simple' processors that are optimized for shifting bits/bytes/words.
Complex processors (Intel/AMD) can do many more complex operations.
The 'simple' processors are adequate for the tasks required for AI/bitcoin mining, yet the thousands of cores in them will dissipate as much power as a complex processor with 8 cores.

Which would you use?

18 minutes ago, MigL said:

Graphics cards rely on massively parallel 'simple' processors that are optimized for shifting bits/bytes/words.
Complex processors (Intel/AMD) can do many more complex operations.

This is ChatGPT-generated CUDA (nVidia GPUs only) for prime testing:

#include <stdio.h>
#include <math.h>

// Kernel function to check if numbers are prime
__global__ void checkPrimes(int *numbers, int *results, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // Make sure we don't go out of bounds
    if (idx < n) {
        int num = numbers[idx];

        // Numbers below 2 are not prime
        if (num < 2) {
            results[idx] = 0;
            return;
        }

        int isPrime = 1;

        // Check divisibility up to sqrt(num)
        for (int i = 2; i <= sqrt((float)num); i++) {
            if (num % i == 0) {
                isPrime = 0;
                break;
            }
        }

        results[idx] = isPrime;
    }
}

int main() {
    const int N = 10;

    int h_numbers[N] = {2, 3, 4, 5, 16, 17, 19, 20, 23, 24};
    int h_results[N];

    int *d_numbers, *d_results;

    // Allocate memory on GPU
    cudaMalloc((void**)&d_numbers, N * sizeof(int));
    cudaMalloc((void**)&d_results, N * sizeof(int));

    // Copy data from host to device
    cudaMemcpy(d_numbers, h_numbers, N * sizeof(int), cudaMemcpyHostToDevice);

    // Define block and grid sizes
    int threadsPerBlock = 256;
    int blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock;

    // Launch kernel
    checkPrimes<<<blocksPerGrid, threadsPerBlock>>>(d_numbers, d_results, N);

    // Copy results back to host
    cudaMemcpy(h_results, d_results, N * sizeof(int), cudaMemcpyDeviceToHost);

    // Print results
    for (int i = 0; i < N; i++) {
        printf("%d is %s\n", h_numbers[i], h_results[i] ? "prime" : "not prime");
    }

    // Free GPU memory
    cudaFree(d_numbers);
    cudaFree(d_results);

    return 0;
}

Compilation:

nvcc prime_cuda.cu -o prime_cuda
./prime_cuda

Some things are not possible on a GPU (like calling operating system functions, network connections, disk access, or other hardware, etc.), but calculating prime numbers is not one of them.

What Trurl wants is pretty simple and easy (see, I just asked ChatGPT and voila). The only weak link is him.

This code should work on any nVidia card with CUDA, even one for $20.

Download nVidia CUDA Compiler from:

https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/

27 minutes ago, MigL said:

Which would you use?

The CPU is a supervisor programming the GPU: sending data and receiving the ready result.

Search the net for your graphics card model + "cuda cores" => you will know approximately what speed increase to expect from it vs CPU cores/threads.
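
Alternatively, a minimal sketch (my own illustration, not from the thread) that queries the installed card through the CUDA runtime API; note the CUDA core count itself is not reported directly, only the SM count (multiProcessorCount), which you multiply by the cores-per-SM of that GPU architecture.

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA-capable device found\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("Device %d: %s\n", i, prop.name);
        printf("  Compute capability: %d.%d\n", prop.major, prop.minor);
        printf("  Multiprocessors (SMs): %d\n", prop.multiProcessorCount);
        printf("  Global memory: %zu MB\n", prop.totalGlobalMem / (1024 * 1024));
    }
    return 0;
}

Compile with nvcc device_info.cu -o device_info (the file name is just an example).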

ChatGPT made a silly mistake which will slow down this code. Good thing I noticed. The fixed part:

float limit = sqrtf((float)num);

for (int i = 2; i <= limit; i++) {
    if (num % i == 0) {
        isPrime = 0;
        break;
    }
}

Otherwise sqrt() is called on every iteration.

PS. @Trurl, what graphics card do you have in your Xeon workstations? Do they have PCI-Express slots or similar?

27 minutes ago, Sensei said:

This code should work on any nVidia card with CUDA, even one for $20.

An nVidia card for $20 has 384 CUDA cores; one for $100 has 960-1024. No need to buy some expensive monster just to run tests. Compile the code, test it, and benchmark it.

Once you see that it works and makes sense, and you need more video RAM and more speed, you can invest more money in it.
