Libraries are well known for archiving all kinds of materials, from books and manuscripts to art and other tangible objects. Now, with the advent and proliferation of the internet, libraries are beginning to archive webpages and other born-digital material. Internet archives are increasingly significant in the legal field: websites and their content change rapidly, and they now serve as crucial evidence in all types of cases (issues concerning authenticating this type of evidence will be discussed in a future post). The International Internet Preservation Consortium was organized with the mission to “acquire, preserve and make accessible knowledge and information from the Internet for future generations everywhere” (http://netpreserve.org/about-us/mission-goals). The Library of Congress is one of many contributing members, which also include the well-known Internet Archive (a non-profit website) and libraries and universities across the globe. In 2004, the Library of Congress’ Office of Strategic Initiatives created a Web Archiving Team to support the goal of managing and sustaining at-risk digital content.
The Library of Congress’ web archives can be viewed at http://www.loc.gov/websites/collections/, where you can browse the various collections. One collection of particular interest to lawyers is the library’s Legal Blawg Archive. This archive, which began in 2007, collects legal blogs and aims to capture as much of each site as possible, including HTML pages, images, Flash, PDFs, audio, and video files, to provide context for future researchers. At the moment, the library is collecting 134 law-related blogs. Other collections in the web archives include an Iraq war archive and a visual image website archive.
If you are the owner of a legal blog or other website, you may be contacted by the Library of Congress for permission to have your website included, if it meets the library’s collection policies. If you agree, you maintain the copyright to your website and its materials, and the library’s webcrawler will collect images of your website about once a week. The pages are then made publicly accessible about a year later. Take a look at the websites already archived, and stay tuned for more on legal issues concerning online archives and use of web content.