
Speaking of Games…

This is a dumb one but totally sucks me in because I suffer from OCD. (hehe, it’s embedded below so you can play right now…in fact, I’m gonna play *just one* game right now…*just one* and then I’m closing the browser, I promise :)

Gobby: Open-Source, Cross-Platform Collaborative Text Editing!

Shag and I were collaboratively hacking on a new radio station for the archive, and we needed a collaborative text editor. SubEthaEdit is great, but Mac-only. Shag found Gobby, which is like an open-source SubEtha that works great on Linux.

If you haven’t used a collaborative editor before, multiple authors can work on the same files, and everyone sees each other’s edits in real time, differentiated by background color. Gobby has syntax highlighting, integrated chat, over-the-wire encryption, and is a pretty solid text editor too. We love it! Here are some ideas we had for future patches:

  • indent-region
  • sound cue upon message receipt
  • auto-indent
  • birds-eye view of the file to watch changes go in
  • function dropdown

As soon as we get indent-region and function dropdowns patched in, I’ll switch my main unix editor from KDevelop to Gobby.

On Software Development

PaulR said this at lunch and we all lost it:

Software isn’t finished until the last user is dead.

Original source unknown…

Using git For Large Scale Digital Archiving: An Outline

Here are some notes on how one might re-architect Internet Archive infrastructure to meet some additional goals:

  • easy to set up and replicate
  • provide versioning and transactions
  • handle more media types well
  • better ingest/locate/read apis
  • better search

The current architecture looks like this:
iaarch.png

The diagram is simplified a lot. There are currently about 1800 nodes in the cluster, most of which are storage nodes (low-power 1U nodes with four 1TB hard drives). The deriver nodes are used for crunching things like pdfs and h.264s, and there are about 300 of those. There are 5 www frontends, hidden behind a couple of load balancers, and the database server has at least one read-only secondary.

What I like about the current infrastructure:

  • Easy to add more storage. Some other archival solutions do not scale well, since they insist all hard drives be connected to the same machine. This starts to break down at the petabox scale.
  • Easy to add more bandwidth. Currently IA is pushing 5+Gbps of outbound bandwidth. Every storage node runs an Apache server, which lessens load on the homenode. Homenode load is a problem with other archival systems.
  • Database hits are not required to locate an item on the cluster. When an item is requested through the Locator service, a multicast is sent, and machines that have the item will respond. This lessens the load on the DB server, which is important when getting thousands of web requests per second.
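
To make the locate-by-asking idea concrete, here is a minimal in-process sketch. It is not the real Locator service: there is no actual multicast, and the `StorageNode` class, the `locate` function, and the node and item names are all made up for illustration. It only shows that the nodes themselves can act as the index, so no database lookup is needed.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical stand-in for a storage node that knows which items it holds."""
    hostname: str
    items: set = field(default_factory=set)

    def answer(self, item_id):
        # A real node would reply to a multicast packet. Here we return our
        # hostname if we hold the item, or None to stay silent.
        return self.hostname if item_id in self.items else None


def locate(item_id, nodes):
    """Ask every node and collect the hostnames of those that respond.

    No central database is consulted: the nodes are the index.
    """
    replies = (node.answer(item_id) for node in nodes)
    return [host for host in replies if host is not None]


# Made-up cluster: the item lives on a primary node and on its backup.
nodes = [
    StorageNode("node-a", {"item-1", "item-2"}),
    StorageNode("node-b", {"item-1"}),
    StorageNode("node-c", {"item-3"}),
]
```

Calling `locate("item-1", nodes)` returns both machines that hold the item, which matches the "two machines ready to serve the same content" setup described in this post.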

What I find interesting about the current infrastructure:
  • RAID is not used. Items are backed up on to a secondary machine when added to the archive.
  • This is mostly due to “RAID is hard to get right” and cost
  • This means there are two machines (and two apaches) ready to serve the same content.
  • One machine can be taken down for repair while the content is still online.
  • I would like to see use of either RAID or maybe RAID-Z

An idea on how to re-architect things using git as a storage backend to provide versioning and transactions
  • git is the version control system used for the linux kernel.
  • git is a totally new way to operate on data. Read this if you are a non-believer.
  • We could keep the infrastructure mostly the same as IA, but store items as git repositories. This would not be a large architecture change.
  • git would become a supported access protocol, in addition to http, ftp, and rsync. Backups could be as simple as a git pull. We could git clone the entire cluster.
  • We would get versioning!

Changes needed to repo.git to make it useful in an archive cluster:
  • Change reguser.cgi to tie into the existing user database (talk to dbserver)
  • Change regprog.cgi to work in a cluster environment. Repositories are inited in /{0-4}/items/id/id.git on a primary node (talk to catalog/homenode)
  • Use post-commit hook to queue backup and derive tasks (talk to catalog)
  • Change gitweb to show custom view of movie, audio, texts (book), and photo collections. Software collections would show standard gitweb view.
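
A rough sketch of two of the pieces above, under stated assumptions: I am reading the leading `{0-4}` in the path as a disk or mount index, and the task queue is just a Python list standing in for whatever the catalog really uses. The function names `repo_path` and `post_commit` are hypothetical, not part of repo.git.

```python
VALID_DISKS = range(5)  # assumption: "/{0-4}" means disk index 0 through 4


def repo_path(item_id, disk):
    """Build /{disk}/items/{id}/{id}.git following the layout described above."""
    if disk not in VALID_DISKS:
        raise ValueError(f"disk must be in 0-4, got {disk}")
    return f"/{disk}/items/{item_id}/{item_id}.git"


def post_commit(item_id, queue):
    """Hypothetical post-commit hook body: queue backup and derive tasks.

    A real hook would talk to the catalog. Here the queue is a plain list.
    """
    for task in ("backup", "derive"):
        queue.append({"task": task, "item": item_id})
    return queue
```

For example, `repo_path("myitem", 3)` gives `/3/items/myitem/myitem.git`, and each commit to that repository would add one backup task and one derive task to the queue.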

I don’t think this would take too long to implement, but I’m lacking co-conspirators these days.. Maybe when shag makes it to SF we will have to knock something out :)

Ruby 1.9 gains block-level scope

I was watching this Google TechTalk that Yukihiro Matsumoto gave on Ruby and learned that Ruby 1.9 had block-level scope.. cool!

Apparently, block scope is some sort of thing with me:

Remove Leopard Dock’s Obnoxious Mirror Effect

Apple seems to be trading in productivity for flash … here is how to disable the “reflecting mirror” effect from the Dock in Leopard:

defaults write com.apple.dock no-glass -boolean YES

Then send kill -1 (SIGHUP) to the Dock process so that it relaunches and picks up the new setting.

Link to the blog I found this on

Page-Turner Coffee

Peliom sends in this observation about Philz:

I got the “fire alarm” coffee from philz this AM


sooooo good!


certain coffees I call “page turners” because I take a sip and then I have to take another sip, and on and on


just like a book I can’t put down


CC by-nc Photo by Scott Beale/Laughing Squid

WiiMe

wiime1.jpg
They sent me 9 consecutive text messages this morning, alternating between “Wiis are in stock at Amazon! Grab one!” and “No Wiis available :-( they were up for 8 seconds.”

I’d pretty much given up until the 9th msg came through and I got one! yay! I can’t believe that it’s been a year already and it’s *STILL* so hard to get a Wii on Amazon. Yes, I know it’s possible to get one at a store if you live in Missouri (or just not in the Bay Area)…or if you wait in line at Best Buy at the crack of dawn, but I really didn’t want to do that (cause I’m lazy). I just wanted to get one online for the office and WiiMe came through!

Memories of Programming the Mac, Pre-OSX

Last night, I had a dream about programming the Mac back in the old days, before OS X. The more I think about it, the more I think I’m still dreaming. Did this stuff really happen? I remember:

  • MPW, the Macintosh Programmer’s Workshop. The old mac didn’t have a console, or even run.exe. We had MPW, which gave us a commandline of sorts. We could access cvs using MPW. It is still being distributed by Apple.
  • MacsBug. The low-level debugger. It’s hard to believe this is all we had. We loved it. DebugStr() was the poor Mac Programmer’s console. My SE/30 (and all Macs) had a “Programmer’s Key” that would invoke MacsBug. If you didn’t have MacsBug installed, the built-in MicroBug would come up instead. Apple still distributes MacsBug. Fortunately, I’ll never need it again.
  • Vague memories of Projector, which was Apple’s version control thing, and Jasik Debugger (The Debugger)
  • MrC. This was Apple’s C compiler. We used MPW to compile our code with MrC. Even if we used Metrowerks to initially write the code, MrC was what we used to compile the engineering builds. Back in the day, the shipping versions were actually compiled by a *third* compiler, from Motorola, which ran on an AIX box or something.
  • Metrowerks CodeWarrior. I loved CodeWarrior. It was blazingly fast. It had a source-level debugger, which often ignored my breakpoints. It had a great IDE. It had a great Editor. The project files had a .µ extension, no joke. I bought my first copy of CodeWarrior in 1995, at the student rate, using the proceeds of my first real programming job (which is where I met peliom). CodeWarrior still brings back warm memories.
  • BBEdit. I’ve been using BBEdit since forever. It doesn’t suck. It makes me happy, in a security blanket kind of way.
  • Pascal. The Mac Toolbox interface was originally Pascal. Pascal was *the* way to write mac apps way back when. I tried to learn Pascal before I learned C, but never got anywhere.
  • Think C and Think Pascal. Compilers sold by Symantec. I learned C programming using Think C on an SE/30.
  • EvenBetterBusError. I don’t remember BusError, or BetterBusError, but EvenBetterBusError sticks in my mind. I don’t remember what it did, or why I needed it, but I think it was a System Extension.
  • System Extensions. Marching across the screen on boot up. Little friends there to make your life better. The thing people noticed when booting OS X for the first time was that their little friends were all gone.
  • Inside Macintosh. This was the Mac API documentation, originally in Pascal. A giant set of bound volumes, or available in electronic form. I think they were in HelpViewer or DocView format or something..
  • Pascal strings. You still needed them for DebugStr and window titles and such. c2pstr() was often used.
  • PlayMPEGInWindow(). I don’t remember if this was the exact function name, but peliom and I were trying to display MPEG video, and when we tried to see how QuickTime programmers did it, we found this function, and it cracked us up. So easy! When peliom and I both ended up working at Apple, we ended up working directly with the guy that wrote PlayMPEGInWindow(). Small world.
  • System7 Pack. SpeedyFinder7. Greg’s Buttons. These were crazy programs that modified the system in crazy ways.
  • Talking Moose.
  • HyperCard. A programming environment that was way too easy to use. Kids could write awesome, fully-functional programs. It was obviously too powerful, and had to be killed off. One day I found out the guy who worked across the hall was the guy that wrote the HyperCard parser. I was in awe.
  • MoreMasters(). You had to call this several times at app startup to allocate master pointers. Really.
  • WaitNextEvent(). You had to call it in your stupid event loop. If you didn’t, no other apps would get scheduled on the CPU.
  • The MultiFinder. WTF? Finder->Special->Set Startup->Start Up System with MULTIFINDER!!!
  • The Chooser. Background Printing. AppleTalk. I never understood the Chooser.
  • RAM Cache. Built right into the System 6 Control Panel.
  • Command-I Get Info. Increase the Application memory size.
  • Option+”About this Mac”. You can see the sun setting over the hills in Cupertino. When I worked at the lab with peliom, we made a video streaming app, and I rendered a 3D version of this scene using Bryce for our About Box. I wish I still had that around somewhere. When I got to Apple, I saw this same view out of my office every day.
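
For anyone who never met a Pascal string: it stores its length in the first byte instead of ending with a NUL, which is why it tops out at 255 characters. Here is a small sketch of the conversion that c2pstr() performed. The real routine converted the buffer in place. This hypothetical `c_to_pascal` helper returns a new value instead, so treat it as an illustration of the layout, not of the Toolbox call.

```python
def c_to_pascal(c_string: bytes) -> bytes:
    """Turn a NUL-terminated C string into a length-prefixed Pascal string."""
    # Keep only the bytes before the first NUL terminator.
    text = c_string.split(b"\x00", 1)[0]
    if len(text) > 255:
        # The length has to fit in a single leading byte.
        raise ValueError("a Pascal string holds at most 255 bytes")
    return bytes([len(text)]) + text
```

So the five characters of `b"Hello\x00"` end up as a length byte of 5 followed by `Hello`, with no terminator.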

That’s all for now. I’ll leave by thanking all those responsible for gcc and gdb. And UNIX. Thank you.

Only One More Year of Email!

Prof. Knuth on email:

I have been a happy man ever since January 1, 1990, when I no longer had an email address. I’d used email since about 1975, and it seems to me that 15 years of email is plenty for one lifetime

I’ve been using email since 1993, and I am so done with it. One more year, and then I can pull a Knuth. (via)

Ruby is SERIOUS BUSINESS!

Tim Bray is MAD that RubyConf was on a weekend. _why agrees:

People, Ruby isn’t a game. It isn’t a hobby. It’s certainly not a very good food source and it’s not an article of clothing. You can’t just put Ruby in the wash with a load of whites. Nice try, but no. No. Jeez, grow a brain. Ruby isn’t a tambourine you can bang loudly in my ear. I’m trying to use my iPhone here, guy.

And Ruby is not some bachelor’s party with a foxy lady in a sherlock holmes hat. Hardly: Ruby is all dads. Put a petticoat on, woman. Pop those balloons. We’re all getting paid here and we’re all having kids here. Get with the program.

Ruby is serious business. Real business and totally bankable. Fact: You cannot do it late at night. The office is closed during those hours. You should be in bed like all the other dads. Now, have a nightcap and go put your PJs on, we’ve got to wake up early tomorrow, it’s pancake day.

I love _why. Read the whole thing, it’s spot on.

Let me see… I think I can pencil you in between Ubuntu installs.

I did two things last week: sleep, and install Ubuntu. That’s all I did. Actually, I didn’t really sleep very much, because I was busy installing Ubuntu about 54,000 times. Here, I made a chart:

Liveblogging an Ubuntu 7.10 installation

Photo_102.jpg

Bob, Shag and I are trying to move our book scanning hardware to Ubuntu 7.10 - the Gutsy Gibbon. It’s a ridiculous process, and our hardware is crap. Here are some notes:

  • chai:20 (4:20) - Started up the installer app on the live cd. Unfortunately the screen rez is 800×600, so we can’t see the important back/next/ok buttons on the bottom of the installer panel. What kind of installer requires greater than 800×600 screen rez?
  • chai:23 - Somehow, by logging the Live CD user out and fucking with the screen rez, we got the screen to display a larger screen res, but we can’t see the entire desktop on our screen. Moving the mouse around seems to pan the desktop, which would kinda work, if we could see the mouse cursor.
  • chai:25 - We are asked for the timezone, and San Francisco isn’t one of the available options. Los Angeles is. However, we opt to move to La Paz.
  • chai:30 - It is now officially time for chai.
  • chai:40 - We have found that starting a lot of xeyes processes lets us estimate where the invisible mouse cursor should be. There are fifty eyeballs on our screen.
  • chai:45 - Bob starts playing minesweeper
  • chai:48 - Someone figures out that this version of xeyes lets us resize the window, so there is a GIANT EYEBALL staring at me
  • chai:50 - Installation done, rebooting!
  • Mouse works after reboot! Now to try and scan books!

Photo_101.jpg

Photo_10.jpg

“The System” … 2007



After 9 months of putting it off, I’m creating new folders and filing everything I have in my “To Be Filed” inbox …. I don’t know what happened but … 9 months … geez.

Here’s a little tip when real life presents a bulk-insert task like this … when you’re trying to find something it really helps to have the papers sorted chronologically. On the other hand you know that you will probably never look at these statements and receipts again … this is a good case for using Lazy Evaluation. I just dump everything in the proper category (e.g. Bank Statements, 2007) and then don’t order them chronologically until I actually have to find something like a specific charge on an old credit card statement.
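
If you want the filing trick spelled out in code, here is a toy sketch. The `FilingCabinet` class and its method names are invented for this example. The point is only that filing is a cheap append, and the sort is deferred until the first time someone actually goes looking in that folder.

```python
from collections import defaultdict
from datetime import date


class FilingCabinet:
    """Toy model of lazy filing: dump now, sort only when searching."""

    def __init__(self):
        self._folders = defaultdict(list)  # category -> [(date, description)]
        self._sorted = set()               # categories currently in date order

    def file(self, category, paper_date, description):
        # Filing is just an append. No ordering work happens here.
        self._folders[category].append((paper_date, description))
        self._sorted.discard(category)

    def find(self, category, wanted_date):
        # Pay the sorting cost only on the first search after new filing.
        if category not in self._sorted:
            self._folders[category].sort()
            self._sorted.add(category)
        return [desc for d, desc in self._folders[category] if d == wanted_date]
```

Folders that are never searched are never sorted, which is the whole appeal for paperwork you will probably never look at again.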

Link to some interesting GTD stuff

Four Posts About Copyright

Copyright issues are so technical and complex that talking about copyright in a public forum is very hard. Who wants to listen to me talk about “changes to traditional contours of copyright protection”? Nobody! Not even me, and I *love* listening to myself talk!

Well, here are four posts about copyright that hopefully won’t bore you to death.

1. Rick Falkvinge and the Pirate Party of Sweden

IMG_0389.JPG

I got to hear Rick speak when he came to visit the Internet Archive, and he blew me away. Due to lobbying by the (mostly US-based) entertainment industry, broader copyright protection laws are enacted around the world every year. These expanding copyright laws threaten privacy and other civil liberties. In response to this, Rick founded the Piratpartiet in Sweden. They are now the 10th largest political party in Sweden, and are starting to influence real policy change. Rick does a great job of explaining problems with complex copyright laws to the general public. Check out his Google tech talk:

2. Antigua, Online Gambling, the WTO, and Hollywood
The WTO has ruled that the US ban on offshore internet gambling is illegal. The US disagrees, and refuses to lift the ban. Antigua argues that the ban has cost the country $3.4 billion in damages, and has asked the WTO for permission to violate copyright law and distribute US movies and music as a form of compensation.

3. SQLite, the public domain, Germany, and submarine patents
sqlite is an awesome, free, open-source filesystem-based database engine that is in the public domain, which means anyone can use it for any purpose they want. Almost every large technology company embeds sqlite into one of their products.

In Germany, the public domain doesn’t exist as it does in the US. In Germany, authors can’t dedicate a work into the public domain, and thus can’t contribute to sqlite!

Also, due to patent concerns, sqlite uses 17-year-old technology, exclusively.. Crazy stuff!

4. A big victory: Golan v. Gonzales

Remember when Kahle v. Gonzales was heard in the 9th Circuit? Well, that went poorly. However, in the case of Golan v. Gonzales, the 10th Circuit has ruled unanimously that the First Amendment review clause in Eldred has been triggered, and the case has been remanded to the district court. This bodes well for a Supreme Court review of Kahle v. Gonzales as well.

Massively Asymmetrical Bandwidth

Results of the Speakeasy bandwidth test, 60Mbit down/1Mbit up:
picture6.png
Can you imagine how different the net would be if everyone had this kind of bandwidth at home, and it was symmetric? I think the market for desktop apps would collapse overnight, and the licensers of broadcast spectrum would pass laws to cripple YouTube bitrates.

A non-profit story

One of the perks of working as a programmer for a non-profit is that you get to collaborate with some top-notch hackers. The other day, I needed to find out the return values of the rm command, so of course I type ‘man rm’ on the Ubuntu command line. The manual page for rm didn’t list the return values, but it did list the authors. One of the names in the Authors section seemed vaguely familiar. “Hmmmm…”, I thought, “isn’t that the guy upstairs?” We ran upstairs and caught him off guard. “Um… that was a long time ago…” he told us. We suggested that he add a return values section to the man page, and then ran back downstairs, giggling like schoolkids. He probably thought we were nuts :)
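
When the man page is silent, you can always just ask rm itself. A small sketch, assuming a Unix-like system with rm on the PATH. The `rm_status` helper is a name I made up. POSIX only promises 0 on success and a value greater than 0 on error, so the code checks for nonzero and does not rely on a specific failure number.

```python
import subprocess


def rm_status(path):
    """Run rm on a single path and return its exit status."""
    # "--" ends option parsing, so a path starting with "-" is still a path.
    result = subprocess.run(["rm", "--", path], stderr=subprocess.DEVNULL)
    return result.returncode
```

Removing a file that exists should report 0, and removing the same path a second time should report something nonzero.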

Help! Disney is breaking the copyright protection on my movie!

I made a movie, encrypted it with a ZIP password, and put it online on my website so only paying customers can view it. Paying customers can download it from here and watch it.

However, it appears that Disney has published the key needed to decrypt this movie on their website! In fact, a quick google search shows that every web page on disney.com contains this key!!

Does anyone have any advice on how to get Disney to cease and desist their copyright circumvention???

AJAX Best Practices

I saw this banner while surfing the web:

Here, I’ll save you $1300:

Best Practices for AJAX content development, rule #1: When developing AJAX web apps, don’t waste your time trying to work around Safari’s many DOM/Javascript/Canvas bugs. Instead, force your mac users to upgrade to Firefox*.

*Corollary to rule #1: At least force them to upgrade to a nightly build of WebKit, but even then, you’re walking into a very buggy minefield.

I would have saved a few weeks of development time if I had followed rule #1 sooner.

Firefox: It sucks, but it sucks much less.

The ECONO BIN EB-200 !!!

I’ve always been looking for some way to store, display, see and manage my maps so they were stored out of the way but not hard to get to. And no folding. I hate folding. This has been for years and last week I threw up my hands and just decided to order the first thing I found on the internet.

I had to back down off of that because the first thing I found costs $1249. It was then that I began to understand that this might get a little pricey. Even finding poster display hardware was a pain. Google search “poster display” and let me tell you … you’re not going to find anything that helps you display your posters.

So I opted for the EB-200 “Econobin” at a mere $200. I know as soon as I post this someone is probably going to tell me I can get the same thing at IKEA for $79.99, but whatever …. I like the industrial look. And this thing is built to last, it’s going to be the only thing left in my apartment after The Big One.



The ECONO BIN arrived wrapped in so much packaging I had to play like a field medic and cut it all off. It’s made up of decent square and round powder coated steel tubes.



Here is The BIN assembled with the copious packaging in the foreground. I bet the UPS guy was glad I came downstairs to meet him and drag these boxes up myself.


And here are some extremely organized maps. Shown here are the SFBC Bicycle Map, The NYC Subway Centennial Map of 2004, and the AAA Baja California Travel Map.




Naturally I was kind of smoking crack when I bought this thing. It’s way too big, awkwardly shaped and doesn’t fit anywhere. But I like it and I’ll get to see my maps a lot more now. And I have a map bin in my house !!

Rules for the New Bubble

Remember the previous bubble?

Remember how we used to go out every day of the week?

Remember dancing all night and watching the sunrise? On a Wednesday? Remember the Best Tuesdays Ever?

Remember how much beer we drank? Remember how much beer we drank at work?

What happened? Now we all work too much. And type too much. We’re old and sober, with hurty wrists.

Rules for the new bubble:

  1. You must see at least five friends every week. Housemates and workmates don’t count. Have more dinner parties. Eat more sushi.
  2. At least twice a month, you must go out on a weeknight. Listen to music, go dancing, go out on a date.
  3. Once a month, you are to go on an adventure. A real adventure. Mountains, Hot Springs, and Playa are all recommended.
  4. Re-institute the 11-o’clock rule: you have to leave for work by 11am. The best thing about this rule is that you are highly encouraged to break it, and if you aren’t out of the house by 11, you might as well go to brunch.
  5. One day a week you are to not touch the computer. No web browsing or email. No typing! Go outside and enjoy the beautiful city and hang out with your awesome friends!

What do you guys think?

Pushy Architects Unveil Groundbreaking Technology For Teen Titans

I just got the strangest email.. It came from pandemicrawfrankfurters at yourhippyfriendskilledaroosterbyblowingtoomuchsmokeinitsbeak.com

I don’t know what you are selling, dude, but I want in. Just let me know where to bury the shoebox full of money.

BREAKING NEWS – BREAKING NEWS – BREAKING NEWS – BREAKING NEWS

Pushy Architects Unveil Groundbreaking Technology For Teen Titans, Brush Aside Conspiracy Theorists Claims of Failure*

BUTTE, MT – NOVEMBER 31, 2008 –

In a remarkable discovery by Scientists who know things of importance, Banned substances composed of gold, freeze-dried tomatoes, bunions, cold frankfurters and extremely wet paste made in Peru have been combined in an innovative fashion to reveal what man has always feared: granite no longer weighs as much as it did in the Stone Age.

Against all odds and despite incredible circumstances of little merit, these Scientists determined this astounding fact while testing 5-year-old Beowulfs infected with the passive-aggressive DNA of long-extinct wooly mammoths.

During the initial testing phase of this mouth-watering experiment, Scientists were surprised to realize the 5-year-old Beowulfs responded with extreme vim and vigor to granite containing the aforementioned banned substances, adamantly licking the granite for days on end with little concern for their surroundings. Eventually, these Scientists came to the conclusion that their test subjects were drawn to the granite because the highly toxic DNA injections created an additional saliva gland completely obsessed with granite. The obsessive licking stripped the granite of its mineral exoskeleton, which proved to be the majority of granite weight.

“It’s a startling discovery, one for the ages. I suspect this will alter the future of man for days to come,” exclaimed Dr. Waz, one of the lead Scientists assigned to the project by his lazy neighbor, Jo-Jo. “I enjoy making things that people look at with their eyes, not their ears. This will eventually cure leprosy, we believe. If Lazarus were alive today, he’d be shooting arrows into the ground.”

Because this information is just now being released to the general public, there are fears of rampant rationing of atrophied orangutan livers, especially among sonar enthusiasts. But Jordan Jordanian, co-director of the experiment, dismisses these fears as short-sighted superstition.

“Spaghetti and meatballs will never be quite the same, as long as I have anything to say about it,” whispered Mr. Jordanian while sipping a bowl of spinach. “We stand by our findings. The results speak for themselves.”

The results quickly added, “Why can’t we save all the rhinoplasty victims, for God’s sake? Man wasn’t placed here to erect large monuments. We were placed here to erect social and sexual mores for the needy. What more needs to be said?”

What happens next, only the future knows. And the future isn’t talking in a loud voice anymore.

“Bacon will always be better than pork chops, that’s one thing you can count on,” echoed the future from a previous interview edited for television.

Yet, there are some who believe the Scientists need to conduct further tests before releasing these findings to the public.
“As a pushy architect with a zest for killing wild boars on partly cloudy days, I fear for the safety of teen titans from sea to shining sea,” brayed dr. P. P. timmmii, founder of the Swiss Foundation for Found Founders. “It’s still OK to prefer pavement. Granite is for the weak and ill-fitted. But I will say this – those stupid jackoffs who go to Burning Man every year will finally be able to ride faster, once we replace the granite. Gravity isn’t pretty.”

For more information regarding this important discovery by the Scientists, contact your local Notary and ask for more peanuts. Zygotes not included without written permission from your kidnapper. For less information, please
visit:
http://yourhippyfriendskilledaroosterbyblowingtoomuchsmokeinitsbeak.com/

DIY MultiTouch Keyboard Roadmap

avrusbkey

Today my AVR USBKey dev board finally came and I’m on my way to making an open source clone of my beloved TouchStream keyboard. I’m using the Cypress CapSense parts for the multitouch sensing and AVR parts for doing the processing and communication. The first prototypes will have most of the processing done in software, actually, and then I’ll decide between AVR and ARM7 later.

Here is the roadmap:

  1. Write OSX userspace app to communicate with AVR over USB and emulate mouse/scroll wheel
  2. Prototype (onetouch) CapSense slider using Cypress PSoC chip
  3. Have PSoC chip communicate with AVR using SPI or CAN bus
  4. Implement slider that emulates scroll wheel that I can attach to the side of a Cinema Display (using userspace app for processing).
  5. Prototype small 2D multitouch touchpad using one PSoC chip communicating with AVR
  6. Make larger 2D multitouch surface with multiple PSoC drivers, all talking to the AVR
  7. Work on gesture recognition code in the userspace app
  8. Port gesture code to the AVR or ARM7
  9. Get keyboard to work as a HID device without drivers or the userspace control app.
  10. Done with version one!

How to keep WordPress from borking your post

Sometimes you just want to post a code snippet or some xml on your blog, but WordPress borks the formatting. So we installed the iG Syntax Hilighter, which mostly works great. It uses GeSHi under the hood.

But sometimes, the syntax highlighter does too much highlighting. I wrote a minimal GeSHi language file for ‘nocode’, which basically lets me use the syntax highlighter like a glorified pre tag. No more worrying about wptexturize() pulling a Swedish Chef on your post! Here is the language file if you want to use it:

NOCODE:
<?php
/*************************************************************************************
 * nocode.php
 * -------
 * Author: rajbot
 * I wanted a <pre> that didn't fuck everything up..
 ************************************************************************************/

$language_data = array (
    'LANG_NAME' => 'NOCODE',
    'COMMENT_SINGLE' => array(),
    'QUOTEMARKS' => array(),
    'KEYWORDS' => array(),
    'OBJECT_SPLITTERS' => array(
        ),
    'REGEXPS' => array(),
    'SCRIPT_DELIMITERS' => array(
        )
);
?>

Long Term Preservation for Open Source Software

Even though I never used the SourceForge compile farm, reading about its recent demise made me sad for some reason. It also made me think about the important role SourceForge plays in the Open Source world.

When SourceForge started in 1999, they provided a unique service that helped start thousands of open source projects, both large and small. Today, however, the core functionality of SF project hosting is easy to replicate by simply installing Subversion and Trac on a hosted server. A few competitors such as Google and Savannah have also sprung up. The services that SourceForge provides are no longer unique, but that’s not what makes it so important.

When software authors use a version control system like Subversion to write software, they preserve the history of all their changes. Although it is possible to preserve a log of edits in other mediums, most authors, musicians, painters, sculptors, or other “content creators” (ugh) do not save this kind of detailed history of their work.

Imagine being able to take your favorite book and roll back every change the author made, one edit at a time, so you could see the author’s thought process, and learn how a bunch of words were arranged to create something beautiful. It would be a huge learning experience for new authors. This is something that new software authors can do easily with open source software. Being able to explore fine-grain history makes software a unique kind of content. Future generations would find this information hugely valuable, just like we would find having Shakespeare’s first, second, third, fourth, and fifth drafts hugely valuable.

Open source software is worth preserving, both for its utility, and for the history it provides. Open source projects start and die out all the time, but hosting the project on SourceForge means that even if an author stops developing it, the world will still have access to it. Others would be able to look at the source code, view its history, and even incorporate the code into their own Open Source software.

SourceForge is owned by VA Software, a company that provides a great service to the Open Source world. But it is a company that can be bought by someone who might not understand the value of long-term preservation of open source software. Companies in general aren’t concerned with doing anything on a long-term scale. As far as I know, there is no one thinking about how to preserve software repositories for 100 or 1000 years. This stuff is important. How are we going to preserve it?
