Scraping Google To See What Happens

freeandfun1

VIP Member
Feb 14, 2004
Wasn't sure where to put this, so I put it here. Some might not be interested in this, but I thought it was an interesting discussion of who really owns stuff posted on the net. I find it interesting that Google wants to be able to scrape every website on the web (otherwise, how could they search for what you are looking for?), yet they don't want others to do it to them...

Daniel Brandt, Google's most relentless thorn, has released code which scrapes Google, sans ads. Techdirt covers it here. The Register (also a Google thorn) covers it here. Highlights:

Brandt fully expects Google to throw legal and technical resources at him, but says he welcomes the challenge if only to clarify copyright issues.

Google took people's free stuff and made a $50 billion business from it, he argues.

"The commercialization of the web became possible only because tens of thousands of noncommercial sites made the web interesting in the first place," he writes. "All search engines should make a stable, bare-bones, ad-free, easy-to-scrape version of their results available for those who want to set up nonprofit repeaters. Even if it cuts into their ad profits slightly, there's no easier way to give back some of what they stole from us."

OK, there are a lot of issues here, and I really must write the book. Really...must...write...aww hell. I'll say this, in any case: Google hasn't stolen anything from anyone. Has the company profited from innovation in assembly and the architecture of participation? Hell yes. But that's OK, after all, those who innovate in assembling data, and those who take the patterns from the aggregate and make sense of them for the individual, well, they deserve the rewards of the marketplace.

But the question of public data as a copyrightable fact is an interesting one. It's been around the legislative maypole (as noted here) and I don't have time to get fully smart on it, but it is an interesting dilemma.

Think of the implications for the public domain material in the Google Print/Library project, for instance....
 
