Sean R. Lynch ☑️ is a user on literati.org.

I suspect people trying to find alternate CPU architectures that don't suffer from #Spectre-like bugs have misunderstood how fundamental the problem is.

Your CPU will not go fast without caches. Your CPU will not go fast without speculative execution. Solving the problem will require more silicon, not less.

I don't think the market will accept the performance hit implied by simpler architectures. OS, compiler and VM (including the browser) workarounds are the way this will get mitigated.

@HerraBRE you're right, BUT big caveat here: that this is necessary is software's fault. Remember, programmers add abstractions as fast as (often faster than) Moore's law. Our computers are ridiculously powerful and still would be even without out-of-order (OOO) or speculative execution; we've just grown accustomed to hugely overpowered machines and designed our software with that in mind.

@sir @HerraBRE Our languages are at fault, too, since they only offer very limited forms of parallelism.

@seanl @HerraBRE parallelism has very little to do with the problem

@sir @HerraBRE Lack of parallelism increases demand for single-threaded performance. Are you saying that there will always be demand for the fastest single-threaded performance we can get, even at the cost of lower overall performance per watt?
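
One way to put rough numbers on the link between limited parallelism and single-threaded demand is Amdahl's law. The thread does not mention it; it is brought in here only as the standard idealized model. If a fraction p of the work can run in parallel on n cores, the speedup is 1 / ((1 - p) + p / n). A minimal sketch follows, with the function name and sample fractions chosen purely for illustration:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup under Amdahl's law (an idealized model, not a measurement)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative fractions only; real workloads have to be measured.
    for p in (0.5, 0.9, 0.99):
        print(p, [round(amdahl_speedup(p, n), 2) for n in (2, 8, 64)])
```

Under this model a program that is 90% parallel can never run 10x faster, however many cores it gets. The serial remainder is then limited by how fast a single thread runs.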

@seanl @sir Some important computer science problems are proven to not be solvable by parallel processing.

So, yes.

For the ones that are... we have GPUs. 😉

Sean R. Lynch ☑️ @seanl

@HerraBRE @sir But the speedup from speculative execution IS from parallelism. We're just asking the CPU to find it instead of the compiler. So couldn't you move the smarts into the compiler?
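
The parallelism in question is instruction-level: instructions that do not depend on each other can issue in the same cycle. Below is a toy sketch of what "moving the smarts into the compiler" could look like as static scheduling. It assumes unit latency for every instruction, and both the `schedule` helper and the five-instruction program are made up for illustration:

```python
def schedule(deps: dict[str, set[str]], width: int) -> list[list[str]]:
    """Greedy list scheduling: each cycle, issue up to `width` instructions
    whose dependencies all completed in earlier cycles (unit latency assumed)."""
    done: set[str] = set()
    remaining = list(deps)  # preserves program order
    cycles = []
    while remaining:
        ready = [i for i in remaining if deps[i] <= done][:width]
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        cycles.append(ready)
        done.update(ready)
        remaining = [i for i in remaining if i not in ready]
    return cycles


# Hypothetical instruction stream: a and b are independent, c needs both,
# d is independent, e needs c and d.
PROGRAM = {
    "a": set(),
    "b": set(),
    "c": {"a", "b"},
    "d": set(),
    "e": {"c", "d"},
}

if __name__ == "__main__":
    for width in (1, 2):
        print(width, schedule(PROGRAM, width))
```

With one issue slot this program takes five cycles; with two it takes three. The catch is the unit-latency assumption: a load that misses the cache takes much longer at run time, and a static schedule cannot see that.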

@sir @HerraBRE Yes, I've switched from languages to compilers, and the blame from languages to ISAs.

@clacke @sir yes, and that's how you get Itanium. No seriously, that was what you were supposed to do with Itanium and nobody used it because it was such a pain in the ass to duplicate all the effort to get efficient pipelining in every compiler. Plus it wasn't backwards compatible.

@tekk @sir @clacke Itanium had lots of problems. Lack of backward compatibility just meant people weren't willing to adopt it just to get 64-bit. Intel also made the mistake of trying to use it to get people to use their own compiler IIRC, just like they've done with TBB and their other proprietary crap. None of these problems would happen with a more open approach.

@clacke @sir @tekk These days if you want to sell a new architecture you just add support to LLVM and maybe GCC; you don't try to get people to switch compilers.

@seanl @clacke @tekk @sir The biggest problem is that Intel's own Itanium was utter crap (remember the joke from the time: Intel has 1,000 engineers with 5 years of experience and HP has 40 engineers with 25 years of experience). By the time they shipped HP's Itanium, it was too late. Nobody wanted to deal with that turd. Also, AMD happened.

@seanl @HerraBRE @sir this might be interesting on platforms where JIT compilation is more prevalent - server-side Java, Android anyone?

(there's a history of companies making custom chips for Java; the latest I can think of is Azul's Vega, although that's tuned more for massive multi-core concurrency)

@seanl @herrabre @sir Yes and no. The CPU has access to runtime state that an ahead-of-time compiler does not know (the branch predictor is a form of that, but then it is what gets exploited). So you could not move _those_ smarts. But yes, you can move a lot, and the compiler generally has more resources, especially computation time.

@seanl

This is essentially what Intel bet on for the Itanium architecture. It failed spectacularly in this respect for two reasons. First, compilers are pretty good already, and while you can improve them some, the biggest gains are in the past. Second, a lot of the branching depends on runtime data: the compiler can at best know which outcome is likelier, but the CPU either knows or can take both paths when it doesn't.

@HerraBRE @sir
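
The point about runtime-dependent branches can be shown with a toy simulation. This is a sketch under stated assumptions, not a model of any real CPU. The "compiler" here is a fixed always-taken or never-taken guess, the "CPU" is a single 2-bit saturating counter (a textbook predictor design), and the workload is a hypothetical branch whose behavior flips halfway through the input:

```python
import random


def static_accuracy(outcomes, predict_taken):
    """Accuracy of a fixed, compile-time guess (always taken or never taken)."""
    return sum(o == predict_taken for o in outcomes) / len(outcomes)


def two_bit_accuracy(outcomes):
    """Accuracy of one 2-bit saturating counter (states 0-3, predict taken if >= 2)."""
    state, hits = 2, 0
    for taken in outcomes:
        hits += (state >= 2) == taken
        # Move toward 3 on taken, toward 0 on not taken, saturating at both ends.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return hits / len(outcomes)


def phased_branch(n=10_000, seed=0):
    """Hypothetical workload: the branch is mostly taken in the first half
    and mostly not taken in the second, as if driven by input data."""
    rng = random.Random(seed)
    first = [rng.random() < 0.95 for _ in range(n // 2)]
    second = [rng.random() < 0.05 for _ in range(n // 2)]
    return first + second


if __name__ == "__main__":
    outcomes = phased_branch()
    print("static taken:    ", static_accuracy(outcomes, True))
    print("static not taken:", static_accuracy(outcomes, False))
    print("2-bit counter:   ", two_bit_accuracy(outcomes))
```

On this made-up workload either static guess is right about half the time, while the counter adapts after the phase change and stays well ahead. Profile-guided compilation can narrow the gap only when the profile matches the data seen in production.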

@sir @HerraBRE @seanl

The other bit is that compiler improvements also help the other CPUs, so the gap between them and the ones without OOO etc. will still be there.