Protocol Buffers: Google's Open Source Sidestep of XML
If you've ever wondered how Google manages to deal with all the information thrown at it in a given second, much less an hour or day, then listen up, because we now know a big part of the answer: Protocol Buffers. Even better, Google has released them into the wild under the Apache license.
So, what are Protocol Buffers? They're Google's way of encoding the "thousands of different data formats" used to carry information through the labyrinth of Google projects. Google developers describe XML as "an extremely expensive proposition" when you have as much traffic as they do. Protocol Buffers, by contrast, allow the developer to "define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data...using a variety of languages." According to Google, the result is twenty to one hundred times faster than XML and three to ten times smaller. Currently they are available for Python, Java, and C++, with Perl under development.
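To get a feel for where the space savings come from, here is a rough Python sketch that hand-encodes one tiny record using the published Protocol Buffers wire format (numbered fields, varint integers, length-prefixed strings) and compares it with an equivalent XML snippet. The record itself (a person with a name in field 1 and an id in field 2), the XML layout, and the helper function names are all made up for illustration. This is not the generated code Google describes, which would normally do this work for you.

```python
# Illustrative only: a hand-rolled encoder for one small record, following the
# Protocol Buffers wire format. Real projects would rely on the source code
# generated by the protocol buffer compiler instead of anything like this.

def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)

def encode_int_field(field_number: int, value: int) -> bytes:
    """Field key (wire type 0 = varint) followed by the varint value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)

def encode_string_field(field_number: int, text: str) -> bytes:
    """Field key (wire type 2 = length-delimited), byte length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data

# Hypothetical record: name is field 1, id is field 2.
binary = encode_string_field(1, "Ada") + encode_int_field(2, 150)
xml = b"<person><name>Ada</name><id>150</id></person>"

print(binary.hex())  # 0a03416461109601
print(len(binary))   # 8
print(len(xml))      # 45
```

For this one toy record the binary form is 8 bytes against 45 for the XML, mostly because field names are replaced by small numbers and there are no closing tags. Actual ratios depend on the data, so treat this as an intuition pump rather than a benchmark.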
Why did Google decide to go Open Source? "We have many other projects we would like to release as [O]pen [S]ource that use protocol buffers, so to do this, we needed to release protocol buffers first." We are certainly interested to see just what these other projects are, and will be paying close attention when they see the light of day. If you're up for digging in right away, take a stroll over to Google Code and grab yourself a copy.