Tagged: archive

  • rajbot 4:07 pm on January 18, 2008 Permalink | Reply
    Tags: archive, crawlers, data, datasets   

    theinfo.org: for people with large data sets 

    Aaron Swartz has launched theinfo.org, a wiki for people who crawl and analyze large datasets.

    This is a site for large data sets and the people who love them: the scrapers and crawlers who collect them, the academics and geeks who process them, the designers and artists who visualize them. It’s a place where they can exchange tips and tricks, develop and share tools together, and begin to integrate their particular projects.

     
  • rajbot 12:54 am on December 20, 2007 Permalink | Reply
    Tags: archive, CC0, CCZero

    Creative Commons launches new CC Zero License 

    May, Shag, and I went to the Creative Commons 5-Year Birthday Party and got to hear Lawrence Lessig announce a new CC License, CC Zero. Licensing a work under CC0 is similar to placing it in the Public Domain, but CC0 is meant to work better internationally. Did you know Germany (and maybe other EU countries) don’t allow authors to dedicate their own works into the public domain? I’m glad smart people are working on this problem!

    The CC 5-Year party was cool, but the sound system was turned down so low that it was hard to hear Lessig and Gilberto Gil. Fortunately, DJ Spooky turned up the sound for his set, but that caused others to complain that they could no longer talk over the music.

    Here’s the press release and wiki page for CC0 (where I got the CC0 image above). The tool to generate the CC0 machine-readable license should be available on Jan 15.

    Lawrence Lessig’s blog has a list of the amazing number of announcements at the CC 5-Year party.

     
  • rajbot 12:19 am on December 7, 2007 Permalink | Reply
    Tags: archive, InternetArchive

    Behind the scenes at the Internet Archive 

    When you are building a digital library to provide Universal Access to Human Knowledge, how do you hold all the data?

    You start with a few racks of machines to hold the data using redundant storage:

    The red boxes are built by Capricorn. Each one is a 1U half-depth low-power server that can hold four 1TB hard drives:

    Add a bunch of homemade routers:

    And some BigIron: (this thing pushed 6Gb/s today!)

    Now you need to power it up:
    IMG_3715.JPG

    And cool it down:
    IMG_3705.JPG

    And fill it with books:
    IMG_3714.JPG

    For some reason, you need a 1980s-era Connection Machine:
    IMG_3712.JPG

    Finally, no Archive is complete without a world-class Linux kernel hacker:
    IMG_3703.JPG

    IMG_3711.JPG

     
  • rajbot 7:54 pm on November 21, 2007 Permalink | Reply
    Tags: archive, netradio, streaming

    Creative Commons Radio from archive.org 

    Shag and I are testing a netradio station, streaming CC ShareAlike-licensed tracks from archive.org. Give these a listen and let us know what you think!

    Ambient Drone Electronic Folk Indie IDM Pop

    Ambient:
    Drone:
    Electronic:
    Folk:
    Indie:
    IDM:
    Pop:

     
    • may 4:11 pm on November 26, 2007 Permalink | Reply

      is there a way to listen without starting up itunes?

    • rajbot 4:19 pm on November 26, 2007 Permalink | Reply

      Sure, just go here and click on the ‘Click To Listen’ link. It should open in your default mp3 player. If iTunes starts up, then the default player is set wrong… are you using windows?

    • rajbot 4:27 pm on November 26, 2007 Permalink | Reply

      Alternatively, you can open up your favorite mp3 player, choose the ‘Open Stream’ or ‘Open URL’ menu option, and paste in one of the .m3u URLs from above…

    • may 7:05 pm on November 26, 2007 Permalink | Reply

      oh, I guess i can use itunes then. i was just wondering if there was a way to stream the music from the browser (i’ve been listening to music via pandora a lot and also through a music blog app via facebook. i think i sent you a linky today :-)

    • rajbot 12:46 am on November 27, 2007 Permalink | Reply

      Oh, of course!

      I updated the post to have in-browser players! The ID3 tags don’t seem to show up, but you can listen to the stations just fine. Maybe I need to install a newer audio plugin for wordpress…

    • may 12:51 pm on November 27, 2007 Permalink | Reply

      awesome! thank you!!

    • shag 2:26 pm on November 27, 2007 Permalink | Reply

      heh, we need to hack in resampling support to the server. some of the bitrate changes wig audacious out.

    • rajbot 2:28 pm on November 27, 2007 Permalink | Reply

      Yeah, I have a bunch of ideas of how to make this much more awesome. Do you want to get lunch tomorrow and talk about it?

    • may 5:17 pm on November 28, 2007 Permalink | Reply

      i just discovered this neat music site today! http://www.seeqpod.com/
      apparently it used to be a lawrence berkeley national labs project

  • rajbot 5:34 pm on November 20, 2007 Permalink | Reply
    Tags: archive

    Something for you to listen to: CC-licensed ShareAlike Albums 

    Here are roughly 2000 albums, all licensed under a Creative Commons ShareAlike license. Tons of awesome stuff for you to listen to, all tagged by genre, artist, and label!

    The code used to make the list of albums can be found at the NetLabelShareAlike wiki page.

     
  • rajbot 5:21 pm on November 4, 2007 Permalink | Reply
    Tags: archive

    What Will Libraries Look Like in the Future? 

    For the Open Content Alliance meeting two weeks ago, the conference room at the Internet Archive HQ was transformed into a prototype library that will soon be open to the public. Here are some pictures of what Brewster calls the Open Library.

    When you enter, you are greeted with a sign that explains the library:

    This is a prototype library of the future that has access to millions of books, videos, and audio items from thousands of libraries worldwide. This library fits into a small room but still can house music, videos, one of a kind or popular books, and a librarian. It has download capabilities for patrons with music players, e-books, audio books and storage devices, and a Print on Demand machine that can print and bind a book in ten minutes.

    The purpose of the open library is to provide universal access to all published knowledge. By using digitizing equipment, computer storage, and the Internet, we can realize the dream of the Library of Alexandria.

    IMG_1688.JPG

    When you walk in, the first thing that grabs your attention is the Espresso Book Machine, which can print and bind a book in about ten minutes.

    The EBM completely changes the physical structure of the library. Using the public access terminal in the library or your own laptop, you can order one of the 200,000+ books from the Internet Archive book collection. It takes about five minutes of preparation and another five minutes of printing, and then a perfect-bound book shoots out of the machine. Here is some video of the EBM in action.

    Even though this prototype library is physically quite small, it has a collection larger than 80% of the libraries in the US. The Internet Archive book collection is growing at a rapid pace (15,000 books a month and rising). Soon, this might be the largest library in the world, and you will be able to put one in every town!

    IMG_1699.JPG

    In the two pictures above, you can see the ingredients of the Library of the Future:

    • Librarian’s Desk
    • Ten Minute Press
    • A public internet terminal, for ordering books from other libraries, printing books out, and filling up your iPod/ebook reader.
    • One-of-a-Kind Books, including:
    • E-Book Readers, in this case, the OLPC
    • Banned Books
    • Foreign-language books
    • Local-interest and technology books
    • 78 rpm records, and other non-book material
    • A comfy chair

    What do you think? Anything we should add to the prototype Open Library?

     
    • bobslobster 9:33 am on November 6, 2007 Permalink | Reply

      yeah…coffee. a regular espresso machine may complement the book machine quite nicely.

    • may 2:44 pm on November 6, 2007 Permalink | Reply

      I think the library of future should look like a living room / cafe with lots and lots of cooshy chairs and couches :-)

    • mangtronix 8:00 pm on December 4, 2007 Permalink | Reply

      awesome!

    • el dorado 4:15 pm on January 12, 2008 Permalink | Reply

      Very interesting article.
      Thanks.
      Regard from Poland

  • rajbot 5:01 pm on October 25, 2007 Permalink | Reply
    Tags: archive

    Liveblogging an Ubuntu 7.10 installation 

    Photo_102.jpg

    Bob, Shag and I are trying to move our book scanning hardware to Ubuntu 7.10 – the Gutsy Gibbon. It’s a ridiculous process, and our hardware is crap. Here are some notes:

    • chai:20 (4:20) – Started up the installer app on the live cd. Unfortunately the screen rez is 800×600, so we can’t see the important back/next/ok buttons on the bottom of the installer panel. What kind of installer requires greater than 800×600 screen rez?
    • chai:23 – Somehow, by logging the Live CD user out and fucking with the screen rez, we got the screen to display a larger screen res, but we can’t see the entire desktop on our screen. Moving the mouse around seems to pan the desktop, which would kinda work, if we could see the mouse cursor.
    • chai:25 – We are asked for the timezone, and San Francisco isn’t one of the available options. Los Angeles is. However, we opt to move to La Paz.
    • chai:30 – It is now officially time for chai.
    • chai:40 – We have found that starting a lot of xeyes processes lets us estimate where the invisible mouse cursor should be. There are fifty eyeballs on our screen.
    • chai:45 – Bob starts playing minesweeper
    • chai:48 – Someone figures out that this version of xeyes lets us resize the window, so there is a GIANT EYEBALL staring at me
    • chai:50 – Installation done, rebooting!
    • Mouse works after reboot! Now to try and scan books!

    Photo_101.jpg

    Photo_10.jpg

     
    • tracey pooh 11:28 am on October 31, 2007 Permalink | Reply

      3l33t h@x0r5!

      when you are in next, you, steve and i can try to see how far down a “unified gutsy image” we can get for all archive boxes, yayzors.

    • rajbot 12:56 pm on November 3, 2007 Permalink | Reply

      The unified image should contain important debugging tools like xeyes and maybe glgears.

  • rajbot 3:02 pm on October 4, 2007 Permalink | Reply
    Tags: archive

    Pics from the Prelinger Library 

    We went to the Illuminated Corridor event, Prelinger on Prelinger, at the Prelinger Library last night. Lots of video art! Some pics:

    linky to pics on flickr

     
  • rajbot 8:45 pm on September 30, 2007 Permalink | Reply
    Tags: archive

    The Copyright Database has been set Free 

    Rick Prelinger of the Internet Archive, along with university libraries and other public interest groups, asked the Register of Copyrights to free the copyright cataloging database, which sells for $86,625.

    Although the Copyright Office has decided to continue charging for the database, the fine folk at public.resource.org have set the copyright database free!

    Archivists and researchers will be happy tonight! Download away!

     
  • rajbot 4:18 pm on September 25, 2007 Permalink | Reply
    Tags: archive

    Video of the Espresso Book Machine printing a book! 

    This is the first time I got the Espresso Book Machine to print and bind a book without human intervention! I happened to capture a video of Flatland being printed. Very cool!

    http://www.archive.org/download/EspressoBookMachineFlatland/EspressoBookMachineFlatland.flv
    (click play to start) (link to other sizes)

     
  • rajbot 12:28 pm on September 20, 2007 Permalink | Reply
    Tags: archive

    Video and Pics of the Espresso Book Machine 

    Here is a short video of a test run of the Open Content Alliance’s Espresso Book Machine, an automatic print-on-demand robot that makes perfect-bound paperback books. The Espresso Book Machine was created by On Demand Books.

    This video was shot during configuration of the machine, so you can see the printing/binding process, but the book gets stuck and comes out mangled. I’ll upload another video after the machine is set up.

    http://www.archive.org/download/EspressoBookMachineTestRun/EspressoBookMachineTestRun.flv
    (press play to start video) (link to other sizes)

    IMG_0994.JPG

    IMG_0995.JPG

    IMG_0998.JPG

     
  • rajbot 4:17 pm on July 24, 2007 Permalink | Reply
    Tags: archive

    Backup genny? We don’t need no backup genny! 

    It seems like half the net just got knocked out by six back-to-back power outages in downtown San Francisco. A bunch of great sites went down: archive.org, craigslist, LJ, yelp. Did Slide go down, too?

    A bunch of our racks are still powered down…

     
    • rajbot 4:24 pm on July 24, 2007 Permalink | Reply

      Laughing Squid says 365 Main went down too. That’s crazy… I can understand our datacenter going down for an hour, or United Layer going down for a day, but 365 Main??? WTF?

    • rajbot 4:58 pm on July 24, 2007 Permalink | Reply

      It was a transformer explosion.

      365 Main claims to have TEN of these 2.1 MW Hitec Continuous Power Systems, and three 20,000 gallon diesel tanks. A lot of good that did.

    • Adam Rosi-Kessel 5:40 pm on July 24, 2007 Permalink | Reply

      Could this be why Netflix was offline for almost 24 hours???

    • rajbot 6:00 pm on July 24, 2007 Permalink | Reply

      Yup, Netflix was down for the same reason…

    • rajbot 6:08 pm on July 24, 2007 Permalink | Reply

      Here is a pic of the nerd convention at 365 Main today:
      http://tastic.brillig.org/~jwb/dorks.jpg

    • rajbot 6:44 pm on July 24, 2007 Permalink | Reply

      Sorry, I was wrong I think. The netflix outage seems to predate the power outage. Laughing Squid has updated their post to point to this article, which claims that the netflix outage is unrelated.

    • rajbot 6:59 pm on July 24, 2007 Permalink | Reply

      Talked to some people working at 365 main to get their racks up.. it seems only part of 365 main went down.. heard lots of bad things about the competence of their staff. Can’t believe people are paying so much for that crap.

    • may 12:25 pm on July 26, 2007 Permalink | Reply

      apparently this is what really happened…


      Breaking News: All Online Data Lost After Internet Crash

    • may 7:52 pm on August 20, 2007 Permalink | Reply

      okay, this is only tangentially related but it cracked me up (and demonstrates how fragile our networks are!)

      from the skype blog http://tinyurl.com/ytlck3

      “On Thursday, 16th August 2007, the Skype peer-to-peer network became unstable and suffered a critical disruption. The disruption was triggered by a massive restart of our users’ computers across the globe within a very short timeframe as they re-booted after receiving a routine set of patches through Windows Update.

      The high number of restarts affected Skype’s network resources. This caused a flood of log-in requests, which, combined with the lack of peer-to-peer network resources, prompted a chain reaction that had a critical impact. “

  • rajbot 2:05 pm on July 16, 2007 Permalink | Reply
    Tags: archive

    Announcing the Open Library! 

    Announcing The Open Library!

    What if there was a library which held every book? Not every book on sale, or every important book, or even every book in English, but simply every book—our planet’s cultural legacy.

    First, the library must be on the Internet. No physical space could be as big or as universally accessible as a public web site. The site would be like Wikipedia—a public resource that anyone in any country could access and that others could rework into different formats.

    Second, it must be grandly comprehensive. It would take catalog entries from every library and publisher and random Internet user who is willing to donate them. It would link to places where each book could be bought, borrowed, or downloaded. It would collect reviews and references and discussions and every other piece of data about the book it could get its hands on.

    But most importantly, such a library must be fully open. Not simply “free to the people,” as the grand banner across the Carnegie Library of Pittsburgh proclaims, but a product of the people: letting them create and curate its catalog, contribute to its content, participate in its governance, and have full, free access to its data. In an era where library data and Internet databases are being run by money-seeking companies behind closed doors, it’s more important than ever to be open.

    So let us do just that: let us build the Open Library.

    From Aaron Swartz’s blog:

    I thought of the smartest programmers and designers I knew and gave them a ring, sat down for coffee with them, threatened to fly out to their homes and knock on their doors. In the end, we got together an amazing group of people — all sworn to secrecy of course — and in the past few months we’ve put together what’s probably the biggest project I ever worked on.

    So today I’m extraordinarily proud to announce the Open Library project. Our goal is to build the world’s greatest library, then put it up on the Internet free for all to use and edit. Books are the place you go when you have something you want to share with the world — our planet’s cultural legacy. And never has there been a bigger attempt to bring them all together.

    Congrats Aaron and team!

     
  • rajbot 10:48 pm on July 3, 2007 Permalink | Reply
    Tags: archive

    First Edition Principia Discordia Recovered from JFK Assassination Archive

    This is highly weird. In April 2006, a First Edition copy of the Principia Discordia was recovered from the John F. Kennedy Archives (see routing slip). Here is a bit of detail on how it was found:

    I stumbled upon knowledge of the Dead SeePresident Scrolls purely by chance – a reference number on a scan of a copy of something I did not believe I was looking at: so much so that I passed over the title page of the first edition of the Principia Discordia (How The West Was Lost) many times before it dawned on me what it was before my eyes.

    On that sheet was an Accession Number. And that number pointed to a secret which has lain hidden for over 30 years, trapped unseen in a musty, dusty vault in Maryland.

    As luck would have it, the Rev. Karl Musser happened to be in the neighbourhood of that very vault, and willing to do me a favour, All Blessings Unto Him.

    But how did these papers end up in the Assassination Archive in the first place?

    In the late sixties, founding Discordian Kerry Thornley, who had been in the Marines with Oswald, found himself under the microscope of those investigating the Assassination of John F. Kennedy. Such Official Investigations generate a Paper Trail – evidence proffered is indexed and stored… preserved against the erosion of time. (Well, mostly…)

     
  • rajbot 11:46 pm on July 2, 2007 Permalink | Reply
    Tags: archive

    How the Pentagon Papers Came to be Published by the Beacon Press: A Remarkable Story Told by Whistleblower Daniel Ellsberg, Dem Presidential Candidate Mike Gravel and Unitarian Leader Robert West 

    http://www.archive.org/download/dn2007-0702_vid/dn2007-0702.flv
    You should watch this episode of Democracy Now.

    Thirty-five years ago this weekend, Beacon Press lost a Supreme Court case brought against it by the US government for publishing the first full edition of the Pentagon Papers. It is now well known how the New York Times first published excerpts of the top-secret documents in June 1971. But less well known is how the Beacon Press – a small, nonprofit publisher affiliated with the Unitarian Universalist Association – came to publish the complete 7,000 pages that exposed the true history of U.S. involvement in Vietnam. Their publication led the Press into a spiral of two and a half years of harassment, intimidation, near-bankruptcy, and the possibility of criminal prosecution.

    Today, we hear the story from three men at the center of the storm: Former Pentagon and RAND Corporation analyst, famed whistleblower Daniel Ellsberg who leaked the Pentagon Papers to the New York Times. Mike Gravel – the former Alaska Senator who is now a Democratic Presidential candidate – who tells the dramatic story of how he entered the Pentagon Papers into the Congressional record and got them to the Beacon Press. And Robert West, the former president of the Unitarian Universalist Association which owned the Press and agreed to risk publication of the Pentagon Papers. [includes rush transcript]

    This is a story that has rarely been told in its entirety. Last weekend I moderated an event at the Unitarian Universalist conference in Portland, Oregon commemorating the publication of the Pentagon Papers and its relevance today.

    (via Tracey)

     
  • rajbot 8:42 pm on June 19, 2007 Permalink | Reply
    Tags: archive, corruption, FreeCulture

    Lessig To Shift Efforts From Free Culture To Fighting Corruption 

    For the last 10 years, Lawrence Lessig has been at the forefront of the Free Culture movement. At the iCommons summit, Lessig announced that he will stop working on Free Culture issues, and shift his work to fighting corruption:

    I don’t want to be a part of that business. And more importantly, I don’t want this kind of business to be a part of public policy making. We’ve all been whining about the “corruption” of government forever. We all should be whining about the corruption of professions too. But rather than whining, I want to work on this problem that I’ve come to believe is the most important problem in making government work.

    Best of luck, Professor Lessig! You make the world a better place, and we are all thankful! (via brewster)

     
  • rajbot 2:19 pm on May 24, 2007 Permalink | Reply
    Tags: archive

    reCAPTCHA: stop spam and help digitize books 

    reCAPTCHA is a project by Prof. Luis von Ahn at CMU.

    Over 60 million CAPTCHAs are solved every day by people around the world. reCAPTCHA channels this human effort into helping to digitize books from the Internet Archive. When you solve a reCAPTCHA, you help preserve literature by deciphering a word that was not readable by computers.

    reCAPTCHA is a great project. I added the WordPress plugin to TikiRobot, which will hopefully reduce all the crap that Akismet fails to catch. If you haven’t seen Prof. von Ahn’s TechTalk on Human Computation, check it out. It’s very good!

    His other projects are The ESP Game and PeekABoom.

    Update: Here is a quote from Brewster:

    “I think it’s a brilliant idea — using the Internet to correct OCR mistakes,” said Brewster Kahle, director of the Internet Archive, in a statement. “This is an example of why having open collections in the public domain is important. People are working together to build a good, open system.”

     
  • rajbot 10:00 pm on May 18, 2007 Permalink | Reply
    Tags: archive

    A non-profit story 

    One of the perks of working as a programmer for a non-profit is that you get to collaborate with some top-notch hackers. The other day, I needed to find out the return values of the rm command, so of course I typed ‘man rm’ on the Ubuntu command line. The manual page for rm didn’t list the return values, but it did list the authors. One of the names in the Authors section seemed vaguely familiar. “Hmmmm…”, I thought, “isn’t that the guy upstairs?” We ran upstairs and caught him off guard. “Um… that was a long time ago…” he told us. We suggested that he add a return values section to the man page, and then ran back downstairs, giggling like schoolkids. He probably thought we were nuts :)
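
    When the man page is silent on return values, one way to find out is to run rm and look at its exit status. Here is a minimal sketch in Python, assuming a POSIX-style rm on the PATH. POSIX specifies 0 when every named file was removed and a value greater than 0 on error; the exact nonzero value depends on the implementation. The helper name rm_exit_status is made up for this example.

```python
import os
import subprocess
import tempfile


def rm_exit_status(path: str) -> int:
    """Run `rm -- path` and return its exit status (hypothetical helper)."""
    result = subprocess.run(
        ["rm", "--", path],          # "--" keeps odd filenames from being read as options
        stdout=subprocess.DEVNULL,   # discard rm's messages; only the status matters here
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # Create a throwaway file, remove it, then try to remove it again.
    fd, scratch = tempfile.mkstemp()
    os.close(fd)
    print("existing file:", rm_exit_status(scratch))
    print("missing file: ", rm_exit_status(scratch))
```

    The first call removes a file that exists, the second hits a file that is already gone, so the two statuses show the success and failure cases side by side.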

     
  • rajbot 9:13 pm on April 26, 2007 Permalink | Reply
    Tags: archive

    Something for you to listen to: Sound-Free #4 

    I installed the Audio Player WordPress plugin so we can easily embed mp3s into our posts.

    To demonstrate, I’ll point you to episode 4 of the excellent Sound-Free podcast, which digs through a huge number of Creative Commons and netlabel releases and mixes the best stuff.

    I embedded the above flash mp3 player using the following code:
    [nocode]

    [/nocode]

    Perfect late night mix. Listening back, it reminds me of the days when I use to listen to late night radio, flicking between commercial FM stations and DJ’s with a licence to experiment (where did they go?). Hence some tracks that I’ve christened softrock-electronica, some fine 4-4 house and plenty of other oddments…

    Here is the track listing and the archive.org page for Sound-Free #4.

     
    • may 7:44 am on April 27, 2007 Permalink | Reply

      ooh, this is useful. thanks for finding & installing it!

  • rajbot 10:07 pm on April 7, 2007 Permalink | Reply
    Tags: archive

    Cybersquatters block the Internet Archive Wayback Machine 

    We know that domain squatters are scummy, but I never knew they were this scummy. The squatters who picked up http://goldengatetunnel.com/ are specifically blocking the Wayback Machine. Their robots.txt file not only prevents any new crawls, but since the Wayback Machine applies robots.txt retroactively, we can’t even see the previously archived (pre-squatter) web site :(
    [nocode]
    User-agent: ia_archiver
    Disallow: /
    User-agent: *
    Disallow:
    [/nocode]
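
    To see what those rules mean in practice, here is a minimal sketch that feeds the robots.txt quoted above to Python's standard urllib.robotparser and asks which crawlers may fetch the front page. The helper name is_allowed and the agent name SomeOtherBot are made up for this example. This only shows how a standard parser reads the rules; it does not model how the Wayback Machine itself applies them to already-archived pages.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted above, copied verbatim.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://goldengatetunnel.com/"
    # "SomeOtherBot" is a placeholder for any crawler other than ia_archiver.
    for agent in ("ia_archiver", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(agent, page) else "blocked"
        print(agent, verdict)
```

    The empty Disallow: line under User-agent: * means "nothing is disallowed", so every crawler except ia_archiver gets through.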

    Anyway, Dodger pointed me to this awesome bumper sticker.. Those who know, go below!
    a.jpg

     
    • Herve 1:54 pm on April 9, 2007 Permalink | Reply

      The golden gate tunnel has a donut court with NINE different donut vendors. Don’t miss it next time you go to Marin.
      I have some extra stickers. I can be bribed with donuts…

  • rajbot 9:57 am on March 10, 2007 Permalink | Reply
    Tags: archive

    Long Term Preservation for Open Source Software 

    Even though I never used the SourceForge compile farm, reading about its recent demise made me sad for some reason. It also made me think about the important role SourceForge plays in the Open Source world.

    When SourceForge started in 1999, they provided a unique service that helped start thousands of open source projects, both large and small. Today, however, the core functionality of SF project hosting is easy to replicate by simply installing Subversion and Trac on a hosted server. A few competitors such as Google and Savannah have also sprung up. The services that SourceForge provides are no longer unique, but that’s not what makes it so important.

    When software authors use a version control system like Subversion to write software, they preserve the history of all their changes. Although it is possible to preserve a log of edits in other mediums, most authors, musicians, painters, sculptors, or other “content creators” (ugh) do not save this kind of detailed history of their work.

    Imagine being able to take your favorite book and roll back every change the author made, one edit at a time, so you could see the author’s thought process, and learn how a bunch of words were arranged to create something beautiful. It would be a huge learning experience for new authors. This is something that new software authors can do easily with open source software. Being able to explore fine-grained history makes software a unique kind of content. Future generations would find this information hugely valuable, just like we would find having Shakespeare’s first, second, third, fourth, and fifth drafts hugely valuable.

    Open source software is worth preserving, both for its utility, and for the history it provides. Open source projects start and die out all the time, but hosting the project on SourceForge means that even if an author stops developing it, the world will still have access to it. Others would be able to look at the source code, view its history, and even incorporate the code into their own Open Source software.

    SourceForge is owned by VA Software, a company that provides a great service to the Open Source world. But it is a company that can be bought by someone who might not understand the value of long-term preservation of open source software. Companies in general aren’t concerned with doing anything on a long-term scale. As far as I know, there is no one thinking about how to preserve software repositories for 100 or 1000 years. This stuff is important. How are we going to preserve it?

     
    • peliom 9:44 pm on March 10, 2007 Permalink | Reply

      Wow, that is sad about the compile farm, and certainly doesn’t send a positive message about VA’s ability to continue supporting sourceforge.

      The compile farm adds *huge* value to the OS community because it allows developers with very little resources (e.g. college students) to get their software to at least compile on a machine/system they don’t own. Then someone who has that system (who may not be a developer) can test it.

      I totally agree with you about preserving open source projects. In 99 years all this stuff is going to be an anthropologist’s wet dream.

      I would be curious to know what the cost structure is like for Sourceforge. But whatever the cost, I’m sure it is well within the means of say, Wikipedia. They like to save things and they can raise tons of money in a jiffy. Of course archive.org would be a sensible place as well ;-)

      What I think is interesting is that as time goes on, the ratio of hardware power to source code is going up up up. We are controlling these huge behemoth machines with tiny text files of source code.

  • rajbot 9:19 pm on February 8, 2007 Permalink | Reply
    Tags: archive

    Giving Away Books 

    A couple weeks ago, we went to a discussion at the Commonwealth Club entitled The Future of the Book: Dead or Alive.

    IMG_2900.JPG

    Unsurprisingly, no one thought that books were going to die out anytime soon. Brewster Kahle showed off a prototype book reader on the OLPC, and said something which I’ve never heard anyone say before. He held up a book produced by the Internet Archive Bookmobile and said that now it costs less to print a book and give it away than it does to loan it out from a library. And that means we can give away books and pay authors at the same time.

     
  • rajbot 11:13 pm on December 2, 2006 Permalink | Reply
    Tags: archive

    Dec 6: Film Fragments of 20th-Century San Francisco 

    Rick Prelinger is showing rare early video footage of San Francisco on Wednesday!

    Film is much more than entertainment—it’s rich and often vivid evidence that backs up imperfect memories and infuses institutional histories with traces of everyday life. Drawing from the Bert Gould collection of silent and early sound films, the Vista collection of exuberant early-1960s city views, diverse home movies and industrial films, this program will include rarely-seen views of San Francisco, contextualizing them in time and space. We hope for audience participation, especially to help identify mystery scenes!

    The exhibit is showing at CounterPULSE on Wednesday at 8pm. Let’s go!

    Update: here is the upcoming listing.

     
    • may 8:48 am on December 4, 2006 Permalink | Reply

      cool! unfortunately wednesday night is butter night (bike loop around the city…yep even in the dark). but if you guys end up going, I’ll try to meet up with you afterwards :-)

  • rajbot 5:04 pm on November 29, 2006 Permalink | Reply
    Tags: archive

    New DMCA Exemptions Granted 

    From Brewster:

    Thanks to the hard work of two great law school students of Peter Jaszi of American University, Jieun Kim and Doug Agopsowicz, the Internet Archive and other libraries may continue to preserve software and video game titles without fear of going to jail. This is a happy moment, but on the other hand this exception is so limited it leaves the overall draconian nature of the DMCA in effect. A total of more than $50,000 of pro-bono lawyer time has been spent to just affect this exemption and its continuation. We hope that Congress, and other governments, will pass more balanced copyright laws to allow at least libraries, archives, research and scholarship to flourish without the current dark clouds of litigation.

    Links to announcement, full recommendation (pdf, still reading through it..), mostly useless slashdot discussion, and Diesel Sweeties Play-Old-Videogames tshirt.

     
  • rajbot 10:55 pm on November 16, 2006 Permalink | Reply
    Tags: archive

    Trip Report: Mechanics’ Institute Library 

    This afternoon, Peliom and I took a field trip to the Mechanics’ Institute Library, a private library that has been around since 1854. It’s in a great old building at 57 Post Street, and they’ve been in that location since 1906! The Institute’s mission was to provide a technical library at a time when such resources were scarce in San Francisco. In contrast to the California Academy of Sciences, which was founded the previous year, the Mechanics’ Institute was formed as a corporation, with shareholders as well as dues-paying subscribers.


    A trip to the Library feels like a trip to old San Francisco. We arrived in the afternoon and found a perfectly quiet library with mostly older patrons. A few people were asleep in the comfy leather chairs, which seemed like a nice escape from the hustle of the financial district just outside the giant windows. Since we weren’t members, we got day passes ($10) and went upstairs to the periodical reading room, which is pictured here.


    Despite being such an old-school institution, the Library is surprisingly modern. Their 150K volume collection is kept up-to-date by adding 3000 items annually, and they subscribe to 600 periodicals (which you can check out!). The computers were all in use, and the Library offers access to several reference databases for patrons. They also have wifi. We spent a good part of our visit camped out in the reference room, using the wifi and chatting over IM (the library is so quiet that no one even whispers). Here is Peliom hiding behind a Fart Party bag among many late 19th-century volumes.


    In addition to books, the Library has a large video selection (located in the Ladies Parlor), a small CD collection, and lots of tech books (Head Rush Ajax was checked out). The Mechanics’ Institute also hosts the oldest Chess Club in the US within the Library. There were probably 50-75 patrons using the library while we were there, mostly in the two large reading areas but a few at the tables scattered in the maze-like stacks. There are 5000 members total, and membership is $95 annually ($35 for students). (OK peliom, it’s your turn! Make this post better!)

    (peliom): Wow, everything looks beautiful, thanks for taking the pictures! I’m afraid all I have to offer is this Slide Show.

    The Mechanics’ is a fantastic place with fabulous resources. All those catwalks and low-ceilinged bookshelves! If you’re a Web 2.0 warrior without a fixed location, or just need a break, the $95 yearly membership fee is a steal. That’s cheaper than dot-Mac!!

    A Note: the WiFi unfortunately uses WEP encryption. They will happily give you the password at the info desk (assuming you have a day pass or a membership), but make sure to just ask for the “wireless network instructions for Mac”; otherwise they might get confused. The key is printed at the bottom of the one-page printout.

    (rajbot again): Peliom touched on one of the two reasons why I won’t be getting a membership to the Mechanics’ Institute. The first reason involves the concept of library access. There are a few different kinds of private libraries, and this library isn’t one that cares about Universal Access to Human Knowledge. The Mechanics’ Institute goes out of its way to limit access. When we first arrived we couldn’t even get inside, since they have a swipecard reader on the front door and we weren’t members, just visitors wanting to purchase a day pass. We got in by waiting for someone to exit and then sneaking inside like criminals. This library has very little outreach; it’s definitely not the kind of library that would fund a bookmobile. They go so far as to lock down their wireless internet access. Even the shopping mall across the street has free, open wifi. Private libraries definitely have their place, but this one feels like it tries too hard to be exclusive. Member dues only make up 11% of their revenue, so their stuffy attitude seems odd and rubs me the wrong way.

    The second reason I won’t be joining the Mechanics’ Institute is its lack of technical depth. I was excited by the fact that there was a technical library in SF, but if you want to keep up with the state of the art, you need access to the relevant journals in your field. In my case, this means access to IEEE Xplore, which this library doesn’t provide. Their new titles are mostly non-technical, with a very small number of novice computer books and a couple of general-interest science volumes. I loved walking through their stacks and saw a huge number of non-technical books that I would love to have the time to read. But for my casual reading, the internet and Amazon better serve my needs. It was a fun visit, and I’m sure I’ll visit many more times on a day pass, especially when I’m in the financial district and want to play a game of chess followed by a leisurely nap in a big chair :)

    (peliom): BWAHAHAHAHAH!!! “relevant journals in your field?” … who is going to order the only journal rajbot will ever need?

    Not me … uh uh, no way …

     
    • may 2:39 pm on November 17, 2006 Permalink | Reply

      drat, I wish I could have gone, but alas, I was at work :-(
