REVIEW

Book Review: Webbots, Spiders and Screen Scrapers by Michael Schrenk

Written by T. Michael Testi
Published May 11, 2007

According to Michael Schrenk, the internet is bigger and better than what a mere browser allows. Webbots, Spiders and Screen Scrapers was written to show you how to take advantage of the vast resources available on the internet. When you are relegated to the world of a browser, you are limited in what is available to you. The goal of Webbots, Spiders and Screen Scrapers is to open up the Web and enhance your online experience.

What is the problem with browsers? A browser is a manual tool that downloads and renders websites. You still need to decide whether a website is relevant to you. Your browser cannot think. It cannot anticipate your actions, and it won't notify you when something important happens. To accomplish this, you will need the automation and intelligence only available in a webbot, also known as a web robot.

Webbots, Spiders and Screen Scrapers contains 28 chapters that break down into four sections. I will focus on the four sections, highlighting chapters as needed. To work with this book, you will need a fundamental understanding of HTML and of how the internet works. Be aware that this book is not going to teach you how to program, or how things like TCP/IP, the protocol of the internet, work. Pretty much any Pentium-class computer running Windows, Linux or Mac OS will do. You will also want to get PHP, cURL and MySQL, all of which are available free on the internet. Again, this book will not teach you how to use these products; rather, it uses them to teach you how to create webbots, spiders and screen scrapers.

Part one, "Fundamental Concepts and Techniques," introduces the concepts of web automation and explores the elementary techniques that will allow you to harness the resources of the web. It begins by explaining why it is fun to write webbots and how writing webbots can be a rewarding career. It tells you where to get ideas for webbot projects and discusses existing as well as potential webbot projects. You will learn how to download web pages, parse those pages, automatically submit forms, and manage large amounts of data. All of these topics set you up for the rest of the book.
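To give a feel for the download-and-parse step mentioned above, here is a minimal sketch. The book works in PHP with cURL; this example swaps in Python's standard library instead, and the names download_page, LinkParser, parse_links and the user-agent string are my own illustrations, not anything taken from the book.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def download_page(url, user_agent="example-webbot/0.1", timeout=10):
    """Fetch a page and return its body as text (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Turn relative links into absolute ones.
            self.links.append(urljoin(self.base_url, href))


def parse_links(html, base_url):
    """Return every link found in an HTML string, as absolute URLs."""
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links
```

In practice a webbot would call download_page on a real address and hand the result to parse_links. The parser also works on any HTML string you already have, which makes it easy to try without touching the network.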

Part two, "Projects," expands on the concepts that you learned in part one. According to the author, with further development, any of these projects could be transformed into a marketable product. The projects include price-monitoring webbots, which collect and analyze online prices from any number of websites. There is an image-capturing webbot that will download all of the images from a website, as well as webbots that read and send email. All in all, there are eleven projects included in part two.

Part three, "Advanced Technical Considerations," explores the finer technical aspects of webbot and spider development. Here the author shares some hard-learned lessons while teaching you how to write some specialized webbots and spiders. You will learn about spiders: webbots that find and follow links, both within a single website and across the wider web, searching out specific information. You will learn how to create snipers: webbots that automatically purchase items from places like auction sites when a specific set of criteria has been met. You will also find out how to deal with cryptography, authentication and scheduling.
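The idea of a spider, a webbot that finds and follows links, can be sketched in a few lines. What follows is my own rough illustration in Python rather than the book's PHP, and it is not the author's implementation. The crawl function, its max_depth and same_site_only parameters, and the in-memory SITE dictionary that stands in for a real website are all assumptions made for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Gather the raw href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl(start_url, fetch, max_depth=2, same_site_only=True):
    """Breadth-first spider.

    `fetch` is any callable that maps a URL to HTML text, or None on failure.
    Returns the URLs that were successfully fetched, in visiting order.
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            link = urljoin(url, href).split("#")[0]  # absolute URL, fragment dropped
            if same_site_only and urlparse(link).netloc != start_host:
                continue
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# A hypothetical four-page site held in memory, so no network is needed.
SITE = {
    "http://example.com/": '<a href="/a">a</a> <a href="/b">b</a> '
                           '<a href="http://elsewhere.example/">x</a>',
    "http://example.com/a": '<a href="/">home</a> <a href="/c">c</a>',
    "http://example.com/b": '<a href="/a#top">a again</a>',
    "http://example.com/c": "leaf page",
}
order = crawl("http://example.com/", SITE.get, max_depth=2)
print(order)
```

Passing the fetcher in as a parameter keeps the link-following logic separate from the downloading, so the same loop could be pointed at a real HTTP fetcher later. The depth limit and the set of already-seen URLs are what stop the spider from wandering forever or revisiting pages.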

T. Michael Testi is a photographer, writer, software developer and ardent fan of fantasy football and horse race handicapping. He also blogs at PhotographyTodayNet and at All This and Everything Else.
Section: Books
Filed Under: Books: Computers and Internet, Review, Sci/Tech: Computers, Sci/Tech: Internet, Sci/Tech: Programming, Sci/Tech: Software
Part of a feature: The RAM Review