Cloud Computing: Good or Bad for Open Source?

by Glyn Moody

Cloud computing: you may have heard of it. It seems to be everywhere these days, and if you believe the hype, there's a near-unanimous consensus that it's the future. Actually, a few of us have our doubts, but leaving that aside, I think it's important to ask where open source stands if the cloud computing vision *does* come to fruition. Would that be a good or bad thing for free software?

Richard Stallman has no doubts:

"It's stupidity. It's worse than stupidity: it's a marketing hype campaign," he told The Guardian.

"Somebody is saying this is inevitable – and whenever you hear somebody saying that, it's very likely to be a set of businesses campaigning to make it true."

The 55-year-old New Yorker said that computer users should be keen to keep their information in their own hands, rather than hand it over to a third party.

Go on, Richard, tell us what you really think.

The problem is that even if – or maybe even though – he is right, people are going to use cloud computing solutions because they are so convenient (well, that's my excuse). Obviously, we need to mitigate the risks of doing so, for example by insisting on the right to move our data out of such services, and requiring stringent privacy safeguards. So, again, re-framing my question: assuming we can sort out issues of security, privacy and the rest, and use cloud computing as well as it can be used, is it good or bad for open source?

At one level, it looks pretty good. Cloud computing is about harnessing economies of scale; that, in its turn, almost forces suppliers to deploy free software, because the licensing costs for the software that keeps the cloud humming (or whatever noise clouds make) do *not* scale for traditional closed-source software (unless, of course, you are Microsoft, and can use the code for free). And indeed, we find that much of Amazon's and Google's cloud computing infrastructure is based on free software.

So in that sense, cloud computing is a huge win for open source. As a result, it will always be cheaper to run enterprise applications on GNU/Linux in the cloud, for example, and this may be enough to steer cost-conscious companies in that direction, given that they also won't have to worry about the messy hands-on stuff like installing or maintaining free software.

What's more problematic is that the use of free software by cloud computing providers does not trigger the distribution clause of the basic GNU GPL. This means that any cloudy tweaks made to free software by companies like Amazon or Google are not necessarily contributed back to the community. The use of the GNU Affero GPL solves that in theory, but not in practice, since the core infrastructure programs – Linux, MySQL etc. – don't use it.

So this would seem to suggest that a move from on-premises to cloud computing would actually *reduce* the contributions of code back to these projects. Now, it's plainly not in the interests of the cloud computing providers to kill off the very applications they depend on, so presumably some kind of compromise will be found whereby they contribute back some of their tweaks to help improve the code they run. But the more pervasive cloud computing becomes, the fewer on-premises deployments of free software there will be, and the fewer independent external contributions back to those projects there are likely to be.

The situation for general users of Gmail and Google Docs, say, is even worse. There, they are unlikely even to be aware that they are running on free software – at least companies migrating to the cloud have to choose which platform to run their apps on. This makes me wonder whether the open source world needs to address this problem directly. I think it does, if it wants to remain relevant to the vast majority of computer users; the question is how.

Ideally, what we need is a completely open source cloud computing infrastructure on which applications providing people with things like (doubly) free email and word processing services could be offered. Now, it's clearly not possible to create the kind of huge facilities that Amazon, Google and Microsoft are building around the world. Not even Mr Shuttleworth, with all his millions, could sustain that for long without charging somewhere along the line. So simply running open source programs like Eucalyptus is not going to work. The trick here is not to fight the battle on the opponents' terms, but to come up with something completely different.

For example, how about creating an open source, *distributed* cloud? By downloading and running some free code on your computer, you could contribute processing power and disc space that collectively creates a global, distributed cloud computing system. You would benefit by being able to use services that run on it, and at the same time you would help to sustain the entire open source cloud ecosystem in a scalable fashion. Collateral benefits would be resilience – it would be almost impossible to take down such a cloud – plus integral privacy if data is scattered across thousands of machines in the right way.
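To make "scattered across thousands of machines in the right way" a little more concrete, here is a minimal Python sketch of one possible approach: XOR secret splitting, where each machine holds a share that looks like random noise on its own. This is purely illustrative and is not how Swarm, Freenet or any existing system is known to work; the function names `split_into_shares` and `combine_shares` are hypothetical, and the "machines" are just entries in a list.

```python
import secrets


def split_into_shares(data: bytes, n: int) -> list[bytes]:
    """Split data into n shares; every share is needed to rebuild it.

    Hypothetical helper for illustration only. The first n - 1 shares are
    random bytes; the last share is the data XORed with all of them.
    """
    if n < 2:
        raise ValueError("need at least two shares")
    shares = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    last = data
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    shares.append(last)
    return shares


def combine_shares(shares: list[bytes]) -> bytes:
    """XOR all shares together to recover the original data."""
    result = bytes(len(shares[0]))  # starts as all zero bytes
    for share in shares:
        result = bytes(a ^ b for a, b in zip(result, share))
    return result


# Each share would be stored on a different volunteer machine.
message = b"my private document"
shares = split_into_shares(message, 5)
recovered = combine_shares(shares)
```

Note the trade-off in this simple scheme: it gives privacy but works against resilience, because losing any single share makes the data unrecoverable. A real distributed cloud would presumably need some form of redundancy or a threshold scheme, so that a subset of the machines is enough to rebuild the data.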

Is there something like that already? The nearest thing I could find is Swarm, “a true, distributed programming language.” This comes from Ian Clarke, probably best known for his Freenet, which is:

free software which lets you anonymously share files, browse and publish "freesites" (web sites accessible only through Freenet) and chat on forums, without fear of censorship. Freenet is decentralised to make it less vulnerable to attack, and if used in "darknet" mode, where users only connect to their friends, is very difficult to detect.

There's clearly a lot of commonality between Freenet and distributed cloud computing systems. And as far as I can tell from Clarke's video on Swarm, it seems to be addressing the right issues, although I'd be interested to hear the views of people whose programming skills are better than mine, which never progressed beyond Fortran.

So, is Swarm the way forward for open source cloud computing? Or are there other, better projects out there that might solve some of the issues raised? Or should we just stick to Google and be grateful? Your comments, as ever, are welcome.

Follow me @glynmoody on Twitter or identi.ca.