Nutch – Plugin Tutorial

Nutch

In one of my previous posts about Nutch, I already mentioned plugins. The plugin system is central to how Nutch works and allows you to customize Nutch to your personal needs in a very flexible and maintainable way. Everybody who …

Nutch – How It Works

Nutch

After installing Nutch as described in my previous post, you can either follow this tutorial without having to think much, or first get a sense of how Nutch actually works. I recommend doing both in parallel. And since you won’t find …

Nutch – Installation

Nutch

Nutch is a flexible and powerful open-source tool for web crawling, developed by the Apache Software Foundation and its community. It integrates with Apache Solr for indexing and with the highly popular Apache Hadoop, which actually started …