# Why do people need fast/strong computers?

On this diagram of a Skylake die, I show what is useful to me. Intel won't be delighted with it, and Amd neither, but this is what I really need. Maybe a third player gets inspiration from it?

• I don't need the graphics processor at all, nor the display controller.
• I don't need three of the four cores.
• I don't need three-quarters of the 256-bit Avx.
• I don't need hyperthreading nor the Sse and Avx instruction set extensions, so most of the sequencer can drop out. This is not represented on the diagram, but it costs a lot of area, power, and cycle length.
• How much L1 and L2 do I need? Do I need an L3? Unclear to me.
• The Usb links do not need to be on the Cpu. But I do need a good Dram and Pci-E interface.

So of the execution area and power, I use 1/10th to 1/30th. The rest of the die consumes far less and can stay as is.

The Core 2 already added much hardware that is useless to me, but it did accelerate my applications a lot. Since then, progress has been minimal, something like 25%. Could that possibly be a reason why customers don't buy new computers any more?

For new applications, a few software vendors and programmers sometimes make software that uses several cores and the vector instructions. Better: compilers are beginning to automatically vectorize source code that is vectorial in nature but written sequentially - excellent, necessary, and it is beginning to work. Fine.

But I want to accelerate my old applications. I don't have the source, and the authors have changed their activity - there is no way these get rewritten or even recompiled. Sorry for that.

That's why I suggested that some magic software (difficult task!) should take existing binaries and vectorize them. It's not always possible, it's a hard nut to crack (artificial intelligence maybe), and it may sometimes produce wrong binaries, but it would take advantage of the available Sse and Avx.

The much better way would be for the processor manufacturer to improve the sequencer so that it runs several loop passes in parallel. This too is difficult for sure, since the sequencer doesn't have as much time as an optimizer to make smart choices, but it would apply to binaries that can't be modified. The Skylake has hardware for four 64b mul-acc (in just one core), and a better sequencer could then take advantage of it. Even better, this would work on an OS that isn't aware of the Avx registers.

Hey, is anybody there?

Intel's Knights Landing is the new thing for supercomputing. One socket carries 1152 multiplier-accumulators on 64b floats working at 1.3GHz to crunch 3TFlops on 64b. That's 3 times its predecessor and much more than a usual Core Cpu. Better: the new toy accesses the whole Dram directly, not through a Core Cpu, and Intel stresses its ability to run an OS directly.
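As a quick sanity check of the 3TFlops figure, here is a minimal sketch of the arithmetic. It assumes (my assumption, not stated above) that each multiplier-accumulator counts as two floating-point operations per cycle, one multiply and one add, which is the usual convention for peak numbers. The function name `peak_flops` is made up for illustration.

```python
# Peak 64-bit throughput from the figures quoted in the post.
MAC_UNITS = 1152       # 64b multiplier-accumulators per socket (from the post)
CLOCK_HZ = 1.3e9       # 1.3 GHz (from the post)
FLOPS_PER_MAC = 2      # assumption: one multiply + one add per cycle


def peak_flops(units, clock_hz, flops_per_unit):
    """Theoretical peak: every unit busy on every cycle."""
    return units * clock_hz * flops_per_unit


peak = peak_flops(MAC_UNITS, CLOCK_HZ, FLOPS_PER_MAC)
print(f"{peak / 1e12:.2f} TFlops")
```

Under that assumption the product comes out at roughly 3 TFlops, consistent with the quoted figure. It is a peak value; real code rarely keeps every unit busy on every cycle.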

In other words, one can make a Pc of it, far better than with its predecessor, and software would use the new component a bit more easily than the predecessor. This shines new light on the question: "Why do people need faster computers?" - which, for the new component, would mean: "How many people would buy a Pc built around the Knights Landing?"

----------

My first answer would be: I still want the many single-tasked operations of a Pc run quickly on the machine, but I don't want two separate main Dram in the machine so a separate Core Cpu is excluded, hence please have within the Knights Landing some good single-task ability. Maybe one tile that accelerates to 3GHz if the others do nothing. Or have one Core in the chip that accesses the unique Dram properly and slows down a lot when most of the chip runs.

----------

Finite elements are an obvious consumer of compute power in a professional Pc. 3D fields (static electromagnetism, stress...) run better and better on a Core, but 3D+time are still very heavy: fluid dynamics, convection - with natural convection being the worst.

Finite elements usually run together with Cad programs, which themselves demand processing power, but less as linear algebra and more as if-then-else binaries. Fine, the Knights Landing handles them better than a Gpu does, as it runs the x86-64 instruction set and has a good sequencer.

Then you have many scientific applications that may run inefficiently on a vector Cpu with slow Dram, like simulating the collision of molecules, folding proteins... These are a better fit for chips designed with one compute unit per very wide Dram access, as I describe elsewhere.

What would need a different design are databases. Presently they run on standard Pcs, which are incredibly inefficient on such binaries. What is needed is agility on unpredictable branches and small accesses anywhere in the whole Dram - for which nothing has improved in the past 20 years. Many programming techniques of artificial intelligence have the same needs and are expected to spread now. Web servers have similar needs too.

For databases and AI, but in fact for most applications, we need the Dram latency to improve a lot, not the compute capacity. This need has not been stressed recently because neither the Os, video games, nor common applications require it heavily, but databases do and they are getting increasingly important.
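To make the latency point concrete, here is a minimal sketch of the access pattern I mean: a pointer chase, where each read gives the address of the next one, so the processor cannot start a read before the previous one has returned. The names `make_single_cycle` and `chase` are made up for illustration. In Python the interpreter overhead hides most of the Dram effect; the same loop written in C over an array much larger than the caches is the usual way to measure memory latency.

```python
import random


def make_single_cycle(n, seed=0):
    """Sattolo's algorithm: a random permutation that forms one cycle of length n."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)          # 0 <= j < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, start=0):
    """Follow the chain back to the start; every lookup depends on the previous one."""
    node = start
    steps = 0
    while True:
        node = nxt[node]              # the next index is unknown until this read completes
        steps += 1
        if node == start:
            return steps


table = make_single_cycle(100_000)
print(chase(table))                   # number of dependent reads needed to visit every entry
```

Because the permutation is a single cycle, the chase touches every entry exactly once, in an order the hardware cannot predict. This is the behaviour of an index lookup or a linked structure in a database, as opposed to the streaming access that vector units like.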

Apart from gaming, what are the main things an average person could use a strong computer for, in areas such as business or school?

Being able to have 15 different programs open and 20 different windows in your internet browser is important; the laptop I'm using right now can't handle it.

MigL said:

In case anyone else wants one...
AMD has just introduced the Threadripper 3990X, built using 7 um ( Zen 2 ) technology.

Should that be 7 nm?

This is a young person who asks why we need a fast computer. I was in graphics school in 1997, the computers were state of the art at the time, and storage was the main factor. If you made a graphic in Photoshop you had no media to back up your file. Thirty people printing at a time made the network crawl.

It used to be you had to buy a computer every year because yours went out of date.

Now I have 12 cores. I think the problem is programming so that those cores work as one. But I am more interested in using less processing power to solve tasks. From what I know of programming, it is more complex to program multiple cores. Does anyone know any good sites on C++ and Python that discuss this? Many cores made the Sega Saturn and PlayStation 3 hard to program. Games were delayed for the PS3. Xbox 360 games were smoother. So even with the 7 cores of the Cell processor, if you cannot program them efficiently they do not get used.
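For what it's worth, here is a minimal sketch of what "programming the cores" can look like in Python, using the standard library's `concurrent.futures.ProcessPoolExecutor`. The helper names (`split_range`, `sum_squares`, `parallel_sum_squares`) and the toy workload are made up for illustration. It uses processes rather than threads because, in standard CPython, threads do not run Python bytecode on several cores at once.

```python
from concurrent.futures import ProcessPoolExecutor


def split_range(n, parts):
    """Cut range(n) into `parts` contiguous (start, stop) chunks of near-equal size."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def sum_squares(bounds):
    """Work done by one core: an independent partial sum."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_squares(n, workers=4):
    """Farm the chunks out to separate processes, then combine the partial results."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_squares, split_range(n, workers)))
```

Call `parallel_sum_squares(1_000_000)` from under an `if __name__ == "__main__":` guard, since some platforms start the workers by re-importing the script. The extra complexity shows even in this toy case: the work has to be cut into independent pieces, sent out, and recombined, and that only pays off when each piece is large compared with the cost of starting the workers.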

On the other hand, programmers complained the Nintendo Wii wasn’t powerful enough.

If you're into animation and media that requires a ton of rendering, you definitely need to have a fast computer.

