<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: The New Zealand Web Harvest 2008 Harvests Too Much</title>
	<atom:link href="http://sandfly.net.nz/blog/2008/10/the-new-zealand-web-harvest-2008-harvests-too-much/feed/" rel="self" type="application/rss+xml" />
	<link>http://sandfly.net.nz/blog/2008/10/the-new-zealand-web-harvest-2008-harvests-too-much/</link>
	<description>Blog, blog, blog, blog</description>
	<lastBuildDate>Mon, 08 Mar 2010 19:57:04 +1300</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Andrew</title>
		<link>http://sandfly.net.nz/blog/2008/10/the-new-zealand-web-harvest-2008-harvests-too-much/comment-page-1/#comment-338</link>
		<dc:creator>Andrew</dc:creator>
		<pubDate>Thu, 16 Oct 2008 00:45:40 +0000</pubDate>
		<guid isPermaLink="false">http://sandfly.net.nz/blog/?p=245#comment-338</guid>
		<description>Courtney,

I can see your reasoning; I just think you have made a &quot;brave&quot; call that will likely cause trouble for some people. But it&#039;s great to see that you guys are being so responsive about any problems, and hopefully nothing too bad will happen.

Aside from that, The National Web Harvest sounds like a fantastic project. Rock on NatLib!</description>
		<content:encoded><![CDATA[<p>Courtney,</p>
<p>I can see your reasoning; I just think you have made a &#8220;brave&#8221; call that will likely cause trouble for some people. But it&#8217;s great to see that you guys are being so responsive about any problems, and hopefully nothing too bad will happen.</p>
<p>Aside from that, The National Web Harvest sounds like a fantastic project. Rock on NatLib!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Courtney Johnston</title>
		<link>http://sandfly.net.nz/blog/2008/10/the-new-zealand-web-harvest-2008-harvests-too-much/comment-page-1/#comment-337</link>
		<dc:creator>Courtney Johnston</dc:creator>
		<pubDate>Wed, 15 Oct 2008 04:54:47 +0000</pubDate>
		<guid isPermaLink="false">http://sandfly.net.nz/blog/?p=245#comment-337</guid>
		<description>Hi Andrew

Thanks for the feedback (I know people usually say that snarkily, but I&#039;m really sincere here).

We&#039;ve &lt;a href=&quot;http://librarytechnz.natlib.govt.nz/2008/10/2008-web-harvest-let-us-know-how-we-can.html&quot; rel=&quot;nofollow&quot;&gt;just blogged&lt;/a&gt; about the Web Harvest &amp; the pain it&#039;s causing (and the comments we&#039;ve seen). Our intentions really are good - to collect &amp; preserve &amp; make accessible NZ&#039;s digital heritage for people in the future, the same way we do already for books &amp; newspapers &amp; photographs - and we&#039;re trying to respond and fix people&#039;s problems as quickly as we can.

You can send comments &amp; requests for us to change the crawler&#039;s behaviour to &lt;a href=&quot;mailto:web-harvest-2008@natlib.govt.nz&quot; rel=&quot;nofollow&quot;&gt;web-harvest-2008@natlib.govt.nz&lt;/a&gt;.

Thanks heaps,

Courtney</description>
		<content:encoded><![CDATA[<p>Hi Andrew</p>
<p>Thanks for the feedback (I know people usually say that snarkily, but I&#8217;m really sincere here).</p>
<p>We&#8217;ve <a href="http://librarytechnz.natlib.govt.nz/2008/10/2008-web-harvest-let-us-know-how-we-can.html" rel="nofollow">just blogged</a> about the Web Harvest &amp; the pain it&#8217;s causing (and the comments we&#8217;ve seen). Our intentions really are good &#8211; to collect &amp; preserve &amp; make accessible NZ&#8217;s digital heritage for people in the future, the same way we do already for books &amp; newspapers &amp; photographs &#8211; and we&#8217;re trying to respond and fix people&#8217;s problems as quickly as we can.</p>
<p>You can send comments &amp; requests for us to change the crawler&#8217;s behaviour to <a href="mailto:web-harvest-2008@natlib.govt.nz" rel="nofollow">web-harvest-2008@natlib.govt.nz</a>.</p>
<p>Thanks heaps,</p>
<p>Courtney</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andrew</title>
		<link>http://sandfly.net.nz/blog/2008/10/the-new-zealand-web-harvest-2008-harvests-too-much/comment-page-1/#comment-336</link>
		<dc:creator>Andrew</dc:creator>
		<pubDate>Wed, 15 Oct 2008 02:19:03 +0000</pubDate>
		<guid isPermaLink="false">http://sandfly.net.nz/blog/?p=245#comment-336</guid>
		<description>Yeah, I could do that (Rick Astley must be archived for future generations!). However, it is robots.txt that is supposed to keep bots out, not some mod_rewrite tricks.

I feel I have honored my end of the bargain in warning bots away from parts of my site that it would be useless to spider; NatLib is not upholding their responsibilities when running their bot.</description>
		<content:encoded><![CDATA[<p>Yeah, I could do that (Rick Astley must be archived for future generations!). However, it is robots.txt that is supposed to keep bots out, not some mod_rewrite tricks.</p>
<p>I feel I have honored my end of the bargain in warning bots away from parts of my site that it would be useless to spider; NatLib is not upholding their responsibilities when running their bot.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave</title>
		<link>http://sandfly.net.nz/blog/2008/10/the-new-zealand-web-harvest-2008-harvests-too-much/comment-page-1/#comment-335</link>
		<dc:creator>Dave</dc:creator>
		<pubDate>Wed, 15 Oct 2008 02:11:57 +0000</pubDate>
		<guid isPermaLink="false">http://sandfly.net.nz/blog/?p=245#comment-335</guid>
		<description>easy fix:
RewriteCond %{HTTP_USER_AGENT} ^NLNZHarvester2008
RewriteRule ^(.*)$ http://tinyurl.com/2w4apm</description>
		<content:encoded><![CDATA[<p>easy fix:<br />
RewriteCond %{HTTP_USER_AGENT} ^NLNZHarvester2008<br />
RewriteRule ^(.*)$ <a href="http://tinyurl.com/2w4apm" rel="nofollow">http://tinyurl.com/2w4apm</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
