There is, right now, a lawsuit going on which could have sweeping ramifications for The Internet Archive, content publishers (of all kinds), and the future of digital media archives.
Four book publishers -- Hachette, Wiley, Penguin Random House, & HarperCollins -- have sued The Internet Archive over the "Controlled Digital Lending" program.
As much as it pains me to say it, The Internet Archive is most likely going to lose this fight. Not because The Internet Archive is fighting the unstoppable behemoth of corporate media... but because, quite simply, The Internet Archive is wrong.
And, as a result of their creation of the Controlled Digital Lending (CDL) program, there is a very real chance that the (extremely valuable and useful) services of The Internet Archive may be ultimately shut down.
Internet Archive Background (The Short Version)
The Internet Archive started back in 1996 -- with the simple goal of archiving web pages. For multiple years, they created snapshots of absolutely massive numbers of websites (both personal and corporate), eventually leading to the public launch of "The Wayback Machine" (which allowed searching of how specific websites looked on specific dates) in 2001.
As the years went on, The Internet Archive dramatically expanded the type of material that they archived, including out-of-print newspapers, magazines, books, public domain music and movies, abandoned software, and more.
The archive currently holds:
- 832 Billion archived webpages.
- 38 Million printed materials (magazines, books, etc.).
- 2.6 Million pieces of software.
- 11.6 Million video files.
- 15 Million audio files.
- 4.7 Million images.
And counting. Absolutely massive amounts of storage and bandwidth -- not to mention the human effort put into archiving and cataloging that material.
In order to finance all of this, The Internet Archive operates on a budget that is a tiny fraction of that of other foundations -- bringing in between $20 and $30 Million per year.
In other words: The Internet Archive, while well funded, is not a behemoth. They don't have hundreds of Millions of dollars in monetary assets (like Wikimedia). Comparatively, The Internet Archive is practically on a shoestring budget.
The Legal Gray Area
All of which is important to bear in mind when you consider that much of the content hosted by The Internet Archive... is not, necessarily, completely legal to share publicly.
- A huge quantity of the material archived is clearly legal to share -- often because it is within the Public Domain.
- Yet, another large chunk of material falls distinctly into a legal gray area: straddling the line of Copyright Law and Fair Use.
- And many other archived items (such as some software from the 1980s) are technically under copyright and, legally, should not be distributed -- but because nobody is earning revenue from those older pieces of software, nobody objects to their availability on The Internet Archive (usually).
Could The Internet Archive be sued out of existence, should enough Copyright holders challenge the archiving and availability of some of those works? You bet. That is, absolutely, a very real possibility.
But, thankfully, that hasn't happened. Thanks in large part to so much material being in that "gray area" of legality, combined with other material simply not being currently profitable for any Copyright holder.
In a way, it's a sort of stalemate. The Internet Archive continues to publish and distribute Copyrighted works... and the Copyright holders allow it because most of those works tend to be non-profitable or out of print.
Then, in 2011, The Internet Archive began down a road that was destined to get them into legal trouble.
Enter: Controlled Digital Lending
The idea of "Controlled Digital Lending" (CDL) is simple: Take a physical book, scan it to create a digital version, then allow people to download that digital book.
We aren't just talking about extremely old, out of Copyright texts here. Many of the books being scanned and distributed by The Internet Archive are currently being printed and sold, with authors and publishers who still retain the Copyright on them.
And this isn't simply a handful of Copyrighted books, either. 3.6 Million books, still under Copyright, are distributed digitally by The Internet Archive.
To help illustrate the problem here, imagine the following scenario:
- You buy a DVD of Marvel's "Avengers: Infinity War".
- You then rip that DVD, and turn it into a digital file (such as an .MP4).
- You then put up a website offering anyone who wants to watch it... to download it from you directly.
What do you think Disney / Marvel would have to say about that? Would you get in some level of legal trouble? You bet your tuchus you would!
If you purchase a physical work (such as a book or movie), that simply doesn't give you the right to make a digital copy and distribute it to others. That, right there, is "Piracy". And every adult knows that is going to get you into hot water.
Even if you stated -- as The Internet Archive has -- that you only allow as many people to download the digital file as you have physical copies. Irrelevant. You'd still get in trouble.
This was, quite possibly, the biggest example of "poking the beehive with a stick" I've seen in a long, long time. The folks running The Internet Archive had to know, from day one, that this was going to get them sued.
Then The Internet Archive Made it Worse
On March 24, 2020, The Internet Archive launched the "National Emergency Library".
This program was launched, in response to the lockdowns during the COVID pandemic, with the stated goal of providing digital copies of books to people who couldn't get to a library. It was, in essence, the "Controlled Digital Lending" system... but without the need to wait for your turn.
Want a book? Grab it. For free. It's yours. The author or copyright holder doesn't even need to know about it.
The restrictions -- as vague and difficult to enforce as they were -- that existed within the Controlled Digital Lending system were gone. And publishers were, obviously, not happy.
The Inevitable Lawsuit
In 2020, four publishers (Hachette, Wiley, Penguin Random House, & HarperCollins) came together to file "Hachette v. Internet Archive" -- alleging that over 33,000 different titles, of theirs, were being distributed, without their permission, by The Internet Archive.
A claim that was easy to prove with a simple search on The Internet Archive's website. Complete with details on the number of people who downloaded each book. The end result? The publishers claimed hundreds of millions in damages.
Which, again, The Internet Archive had to know was coming. It was simply too obvious. They built a website that, in essence, stated, "We pirated your books and distributed them to exactly this number of people".
On March 24th, 2023, the judge in this case (Judge John G. Koeltl of the U.S. District Court in Manhattan) handed down his judgement. And it was exactly what you would expect:
“At bottom, [the Internet Archive’s] fair use defense rests on the notion that lawfully acquiring a copyrighted print book entitles the recipient to make an unauthorized copy and distribute it in place of the print book, so long as it does not simultaneously lend the print book. But no case or legal principle supports that notion. Every authority points the other direction.”
This was a judgement that was destined to come to pass from the moment The Internet Archive started the Controlled Digital Lending system.
Just because you buy a book, you don't -- under the current laws -- have the rights to take copyrighted material, copy it, and distribute it however you wish. The law is both clear and well understood by... just about everyone.
The Bizarre Response from The Internet Archive
On December 15th, 2023, The Internet Archive (being represented, in part, by the Electronic Frontier Foundation) filed a brief in their appeal of that judgement. Of that appeal, the founder of The Internet Archive, Brewster Kahle, made the following statement:
"Why should everyone care about this lawsuit? Because it is about preserving the integrity of our published record, where the great books of our past meet the demands of our digital future. This is not merely an individual struggle; it is a collective endeavor for society and democracy struggling with our digital transition. We need secure access to the historical record. We need every tool that libraries have given us over the centuries to combat the manipulation and misinformation that has now become even easier."
They are fighting for "democracy" and against "misinformation". None of which has any relevance to the court case.
Followed by:
"The stakes of the lower court decision are high. Publishers coordinated by the AAP (Association of American Publishers), have removed hundreds of thousands of books from controlled digital lending. The publishers have taken more than 500 banned books from our lending library, such as 1984, The Color Purple, and Maus. This is a devastating loss for digital learners everywhere."
The statement that "publishers have taken more than 500 banned books from our lending library" is more than a little misleading. Not only are the books listed readily available in libraries and book stores across the entire country... but they were not removed from the Internet Archive's "Controlled Digital Lending" system because they were "banned" or objectionable in some way.
Those books, along with many others, are under Copyright. And The Internet Archive violated that by copying the books, and distributing digital files without consent of the publisher or author.
The Internet Archive seems to be attempting to suggest that there is some sort of anti-book activist movement trying to ban books from The Internet Archive. When the real truth is... authors and publishers are making the case that The Internet Archive is stealing their property and giving it to others (in exchange for donations). No activists or book banning involved.
In fact, the statement from The Internet Archive does not actually address the core issue within the lawsuit. Instead it makes a number of unrelated statements that appear designed to cause fear around some sort of nonexistent war on libraries.
Such as this odd closing line:
"In the face of challenges to truth, libraries are more vital than ever."
Truly bizarre.
Especially when you consider that The Internet Archive is not representing the libraries of America in this case -- many libraries offer digital lending services that they negotiate with publishers. What The Internet Archive is doing is for The Internet Archive.
What Happens Now?
A judge has ruled on the case (against The Internet Archive) and an appeal has been filed.
So... what happens next? What real, practical impact will this have on The Internet Archive, Digital media, Libraries, and the like?
- There is no reason to believe that the first judge's decision will be reversed on appeal. Copyright law is pretty well established and tested -- and The Internet Archive was clearly in the wrong, from a legal perspective.
- The more The Internet Archive spends on failed lawsuits -- and programs that cause them to get sued -- the less money they have to run the rest of their programs (such as The Wayback Machine).
- Every lawsuit they lose -- dealing with illegal copying and distribution of Copyrighted material -- is going to increase the odds of more lawsuits being filed against them. The Internet Archive is, in effect, opening the floodgates of potential lawsuits across the spectrum of archived material (including music, software, and more).
- Because the "Controlled Digital Lending" program is part of The Internet Archive, the entire organization is liable for any damages. And, quite frankly, they don't have the money to spare to afford those damages.
- None of this will have much impact on libraries -- which have a variety of digital lending systems in place (working with a variety of publishers).
All of that is fairly obvious to any outside observer. Even someone who is a big fan of The Internet Archive (as I am) can see how the current course being followed will lead to some significant negative outcomes.
Worst case scenario?
- The Internet Archive (including The Wayback Machine, and the entire archive of software, music, and other cultural items) will be forced to shut down due to legal liabilities (and legal defense costs) from years of Copyright infringement.
- Other people, foundations, and companies interested in archiving culturally significant material will be increasingly hesitant to do so (they don't want to get sued out of existence either).
- Obtaining public domain and historical material will be significantly harder going forward.
All because of the Controlled Digital Lending program -- the Internet Archive simply pushed it too far. Far beyond the "legal gray areas" they previously operated in.
If any of those items actually come to pass, that would truly be a shame. The Internet Archive provides a valuable service for the world -- one which I use both personally and professionally.
How likely do I think that "worst case scenario" is? Pretty gosh darned likely. In large part because The Internet Archive has seemed determined -- from day one -- to make it happen.