Will MPP Survive?

Caution: Out-of-school, forward-looking, conceptual thinking ahead. Please do not post comments telling me that this makes no sense and that I'm an idiot. I already know that I'm an idiot. Please do post comments and tell me why this makes no sense.

A while ago I saw this ad for a system with 80 cores and 640GB of RAM, and I thought, "Wow, you could make a nice little database server out of that".

That got me wondering... how large will SMP systems get? The possibilities for stuffing hardware into one box seem limited, but what about inverse virtualization - multiple machines acting like one rather than one machine acting like multiple?

I'm not a hardware guy, particularly as it relates to clustering, so I don't have even the faintest guess as to a realistic answer. But on a conceptual level, doesn't that seem possible? If VMware can make one machine act like three without requiring application changes, how long will it be before it's possible to make three machines act like one without application changes?

I discovered later that the system promoted in the ad quietly claims to do just that. I'm in no position to evaluate that claim, so take this as my wholly uneducated reading of it. Still, maybe this isn't such a crazy idea.

Present technology aside, however, virtualized SMP is starting to feel inevitable to me. Finding ways to stuff more and more processors and RAM into a single box seems self-defeating. Abstracting the requisite connections out of the hardware and into software, and relying on network technologies to make the necessary communications fast enough, seems like a much smarter way to scale a system.

Where this leads, of course, is to the title question: If sufficiently fast and scalable SMP becomes possible, will MPP databases survive? Particularly if solid-state storage makes I/O transfer costs almost nil, thus removing distributed I/O as an MPP advantage, what's to stop an SMP database from being competitive with the MPP databases? Specialized implementations will always be faster, sure, but will they be fast enough to warrant the extra complexity? Might Oracle still win the database wars simply by waiting?

I don't know the answer. I don't even know if the premise is valid, to be honest. Sometimes completely naive questions lead to interesting answers, though, so I figured I'd throw the idea out there. Please, someone educate me. :-)

Holy Missing the Point, Batman!

From this otherwise uninteresting article:

"Our clients say, 'My god, is [query] performance worth sacrificing all the other gains'" of traditional database systems. They include ... the enterprise's existing investment in trained database administrators...

Holding on to a technology because you've already trained people to use it is what I call "growing dinosaurs". It makes understandable short-term sense, but anyone using it as a serious long-term justification is in big, T-rex-size trouble.

And for the record, the other reasons to stick with existing database systems that are listed in the article are all short-term maturity problems, not serious fundamental flaws. I wonder how IBM and others will attempt to downplay column-oriented systems in a few years when all those wrinkles have been ironed out...