If you are reading this post, you are likely already familiar with ScrapeBox, dubbed by the company itself “The Swiss Army Knife of SEO” for its versatility and wide distribution. Many consider ScrapeBox an excellent SEO tool that also makes link prospecting much easier.
ScrapeBox (or SB for short) has become popular among webmasters and web developers, especially for Google harvesting and scraping and for blog commenting. It can even scrape articles from the internet within five minutes or less.
This software supports the constant daily demands for SEO and Internet Marketing of companies of all sizes. It is also a beneficial tool for individuals and freelancers worldwide.
As its name suggests, ScrapeBox collects (or scrapes) information off the internet. Some of its features include:
- Harvesting of proxies
- Being able to create a sitemap of a website
- Ability to make an RSS feed
- Collecting keyword ideas
- Backlink checker
- Domain name checker
- Collecting websites based on a footprint
- And more…
Answers When the Harvester Isn’t Working
From time to time, the harvester in ScrapeBox may fail to provide the data you are requesting.
If you are scraping a search engine (such as Google) and you are not getting any results for your query, it is generally due to one of two reasons:
1. The terms you are scraping simply do not have any results to return.
2. Your proxies are blocked or some other error is happening. The easiest way to check this is to go to the settings menu and uncheck “use multi-threaded harvester”. Then run the harvest request again. ScrapeBox will display each query, the proxy used and the result, including any error messages. For example:
If you see lines that say:
Results 0 completed using proxy xxx.xxx.xxx.xxx:xxx , then it means that the request finished, but the engine returned no results.
If you see:
Results 0 Error xxx received using proxy xxx.xxx.xxx.xxx:xxx – Then you can look at the error message and generally determine the problem.
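If you copy those harvester lines into a text file, a short script can sort them into "no results" and "error" cases for you. The following is only a sketch in Python: the line format is assumed from the two examples above (real ScrapeBox output may differ), and the name classify_line is made up for illustration.

```python
import re

# Assumed line shapes, based only on the two examples in this post.
LINE_RE = re.compile(
    r"Results\s+(?P<count>\d+)\s+"
    r"(?:completed|Error\s+(?P<error>\d+)\s+received)\s+"
    r"using proxy\s+(?P<proxy>\S+)"
)

def classify_line(line):
    """Return (status, proxy, detail) for one harvester line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None  # not a line shape we recognize
    proxy = match.group("proxy")
    if match.group("error"):
        # The request failed; detail is the numeric error code.
        return ("error", proxy, int(match.group("error")))
    count = int(match.group("count"))
    # The request finished; zero results means the engine returned nothing.
    status = "no_results" if count == 0 else "ok"
    return (status, proxy, count)
```

Many "error" lines from the same proxy point at that proxy, while "no_results" across every proxy suggests the search terms themselves return nothing.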
Issues with the Harvester
Generally speaking when you have a problem with ScrapeBox’s Harvester tools these are the basic steps to follow.
1. Restart your computer…a solution for a myriad of issues.
2. It may be that the proxies that were found are not accepted by the search engine you are using. To check this, try a different proxy that is not banned and see whether the request goes through.
3. Update ScrapeBox to the latest version. Frequently, the updated version provides a solution to fix the problem.
4. You may need to authenticate your proxies for your local server IP address. This can usually be done from the online portal of your proxy provider. However, if your proxies are properly authenticated and you are still experiencing issues with them, the proxies themselves are likely at fault, showing errors such as “connection refused or timed out” or 404 (the proxy is damaged). If you experience connection failures like this, the best thing to do is to reboot your proxies, which may involve contacting your proxy provider.
5. Take a moment to go back through your settings and double-check the most obvious things, particularly “maximum connections” and “RND Delay Range.” Often a simple setting can cause great grief.
6. Reinstall the ScrapeBox software. Yes, reinstalling any program is a pain, but it may be the simplest way to solve the problem. Re-download the latest version of ScrapeBox and try your request again.
7. Visit YouTube. Both the creator of ScrapeBox and other knowledgeable “geeks” have posted short tutorials on troubleshooting that should help resolve the majority of issues.
8. If all else fails contact support directly: http://www.ScrapeBox.com/contact-us
Error Codes for a Troubled Harvester
If there is a problem with the harvester, a specific error message may be received. For a complete explanation on all error codes choose the “help menu” in ScrapeBox: Help -> Server Error Code Reference.
The most common error messages you will receive are:
- Error 302 – Your IP is blocked
- Error 404 – The proxy is bad or was never found
- Error 407 – Your proxy requires authentication
- Error 500 (when commenting to WordPress) – Your data is bad; most likely your email addresses are invalid. Double-check that you entered a valid email address
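If you keep your own notes or logs while troubleshooting, the list above is easy to turn into a small lookup table. This is a minimal Python sketch, not part of ScrapeBox: the meanings are copied from this post, and the names ERROR_HINTS and explain_error are hypothetical. ScrapeBox's built-in reference remains the authoritative source.

```python
# Meanings as listed in this post; not an official ScrapeBox table.
ERROR_HINTS = {
    302: "Your IP is blocked",
    404: "The proxy is bad or was never found",
    407: "Your proxy requires authentication",
    500: "Bad data when commenting to WordPress; check your email addresses",
}

def explain_error(code):
    """Return a short hint for an error code, with a fallback for unknown codes."""
    fallback = "Unknown code; see Help -> Server Error Code Reference"
    return ERROR_HINTS.get(code, fallback)
```

For example, explain_error(407) returns the authentication hint, while any code not in the table points you back to the Help menu.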
The Handy Harvester
Those who own ScrapeBox have several options in using the harvester:
Search Engine Harvester
ScrapeBox can harvest thousands of URLs from over 30 search engines such as Google, Yahoo and Bing in seconds through its powerful and trainable URL harvester.
One of the biggest advantages of the ScrapeBox Search Engine Harvester is its detailed statistics. After a run, it reports the number of results obtained for every keyword in every search engine.
The harvester can also save the keyword with every harvested URL so you can easily identify which keywords produced which results.
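To see why keeping the keyword next to each URL is useful, here is a small Python sketch that groups harvested URLs by keyword and counts them. It assumes you have loaded your saved results as (keyword, URL) pairs; that layout and the function names are assumptions for illustration, not ScrapeBox's actual file format.

```python
from collections import defaultdict

def urls_by_keyword(rows):
    """Group harvested URLs under the keyword that produced them."""
    grouped = defaultdict(list)
    for keyword, url in rows:
        grouped[keyword].append(url)
    return dict(grouped)

def result_counts(rows):
    """Count how many URLs each keyword produced."""
    return {kw: len(urls) for kw, urls in urls_by_keyword(rows).items()}

# Made-up sample data for illustration only.
sample = [
    ("seo tools", "http://example.com/a"),
    ("seo tools", "http://example.org/b"),
    ("link building", "http://example.net/c"),
]
print(result_counts(sample))
```

Keywords with low counts are candidates to drop from the next harvest.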
SEO gurus praise URL harvesting with ScrapeBox as a way to identify high-PageRank sites, and say this is where ScrapeBox does its magic. When you select “Start Harvesting,” ScrapeBox will automatically search for related blog sites containing the keywords you selected. Results can be filtered, sorted and manipulated to provide the exact information you need. You can then export the data as an Excel sheet for easy use.
ScrapeBox features a lightning fast, multi-threaded keyword scraper.
This tool is capable of taking one or more keywords and scraping thousands of related keywords in just a few seconds. The ScrapeBox Keyword Harvester is able to produce thousands of long-tail keywords from a single base keyword. (As a reminder, long-tail keywords are extremely valuable to folks working with SEO. Long-tail keywords are those three- and four-word phrases that are very specific to whatever you are selling, and they are essential when a customer uses a highly specific search phrase.)
Why is this important? If you offer a product or service, ScrapeBox’s Keyword Harvester can provide detailed data on the keywords and key phrases people are searching for. This valuable harvester tool offers a simple method for optimizing your company’s website SEO or products to target exactly what people are searching for, which leads to a higher conversion rate.
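Once you have a large keyword list, you may want to keep only the long-tail phrases. The Python sketch below applies the rule of thumb from this post (three or more words) to a plain list of keywords. The function names and the three-word threshold are illustrative assumptions, not a ScrapeBox feature.

```python
def is_long_tail(phrase, min_words=3):
    """Treat a phrase as long-tail if it has at least min_words words."""
    return len(phrase.split()) >= min_words

def long_tail_only(keywords, min_words=3):
    """Keep unique long-tail phrases, preserving their original order."""
    seen = set()
    kept = []
    for kw in keywords:
        # Normalize case and spacing so near-duplicates collapse.
        normalized = " ".join(kw.lower().split())
        if normalized in seen or not is_long_tail(normalized, min_words):
            continue
        seen.add(normalized)
        kept.append(normalized)
    return kept
```

Raising min_words to 4 narrows the list further if three-word phrases are still too broad for your niche.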
Included in ScrapeBox are a powerful proxy harvester and tester. This handy proxy checker allows you to check if your proxies are working properly. It also permits the user to keep their work private through the use of thousands of free proxies. Usually the best way to find a Google proxy is to use the built-in ScrapeBox Proxy Harvester.
Since lists of proxy websites are published daily, it is virtually impossible to manually visit each site and test them. ScrapeBox’s Proxy Manager offers a far simpler solution. It has 22 proxy sources already built in, plus it allows you to add custom sources by adding the URLs of any sites that publish proxies.
When you run the Proxy Harvester, it will visit each website and extract all the proxies from the pages and automatically remove the duplicate proxies that may be published on multiple web sites. So with one click you can pull in thousands of proxies from numerous websites.
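The extract-and-deduplicate idea is simple enough to sketch in a few lines of Python. This is not how ScrapeBox itself is implemented; it only illustrates the concept, assuming the page text has already been downloaded and that proxies appear in the common ip:port form. The name extract_proxies is hypothetical.

```python
import re

# Matches IPv4 ip:port pairs such as 1.2.3.4:8080 in free text.
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\b")

def extract_proxies(pages):
    """Pull unique ip:port proxies out of page texts, in first-seen order."""
    seen = set()
    proxies = []
    for text in pages:
        for ip, port in PROXY_RE.findall(text):
            # Skip impossible addresses and ports.
            if any(int(octet) > 255 for octet in ip.split(".")):
                continue
            if not 0 < int(port) <= 65535:
                continue
            proxy = f"{ip}:{port}"
            if proxy not in seen:
                seen.add(proxy)
                proxies.append(proxy)
    return proxies
```

A proxy that is listed on several sites ends up in the result only once, which is the same effect the Proxy Harvester gives you with one click.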
For your proxies to work with URL Profiler, they will need to be authenticated. You can either authenticate via IP address or username and password.
If your proxies are not authenticated, ScrapeBox will respond with a “failed” message for that specific proxy URL.
A key benefit of the ScrapeBox Proxy Harvester is that it has a “Trainable Proxy Scanner” feature. This means you can fully configure where you want to scrape proxies from and meet any specific needs of your business.
The ScrapeBox Custom Harvester is an additional feature included (at no charge) when you purchase ScrapeBox. It gives you the freedom to add your own search engines to ScrapeBox, so you can harvest URLs from just about any website that has a search feature.
A Few Additional SEO Tools
The ScrapeBox harvester can be used for more than scraping proxies. Once you have your proxies working, you can try them with:
The Backlink Checker
Every SEO enthusiast loves backlinks. Links are truly the building blocks of SEO and they play a major role in the ranking power of any website.
The number of backlinks is one indicator of a website’s popularity or importance. Leading search engines, especially Google, give more weight to websites that have a number of quality backlinks.
Quality is the key here. When search engines calculate the relevance of a site to a keyword, they consider the number of quality inbound links to that site.
ScrapeBox includes a free backlink checker that returns up to 1,000 links in the report. This is a great tool for doing backlink audits for your website, or for doing competitive analysis to view the URLs where your competitors have obtained their backlinks. So if you need to find out where your competitors are getting their links, you will want to employ this excellent feature.
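A list of up to 1,000 backlink URLs is easier to read when it is summarized by referring domain. The Python sketch below does that with the standard library only. It assumes you have the backlink URLs as a plain list of strings, and the function name referring_domains is made up for this example.

```python
from collections import Counter
from urllib.parse import urlparse

def referring_domains(backlink_urls):
    """Count backlinks per host, most frequent first."""
    counts = Counter()
    for url in backlink_urls:
        host = urlparse(url).netloc.lower()
        # Treat www.example.com and example.com as the same site.
        if host.startswith("www."):
            host = host[4:]
        if host:
            counts[host] += 1
    return counts.most_common()
```

Hosts near the top of the list are where a competitor gets most of their links, which makes them a natural starting point for your own outreach.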
The Bulk URL Shortener
If your business regularly uses Facebook or Twitter it is essential that the URL be shortened in order to make it fit within the specified number of characters. The majority of programs allow the user to only do this process one URL at a time…a method that is not feasible when you have dozens of URLs to compact all at the same time.
One of the more popular Add-Ons from ScrapeBox is the URL Shortener. All you have to do is type in your list of URLs and you will almost instantly receive the new, shorter links you require.
The WHOIS Checker
Without this tool, it becomes a pain to check ownership of multiple websites on WHOIS.com. ScrapeBox delivers an easy solution. Its WHOIS Checker is a reliable tool that can check multiple domain names simultaneously with the click of a button.
A Final Word
The conversation of white hat vs. black hat SEO is a popular topic these days. For the most part, “white hat” is seen as being ethical and contributing to the web and not spamming while “black hat” is viewed as spamming the web to benefit your site. Originally, ScrapeBox was thought to be a black hat tool of darkness.
Today, the updates, features and innovations of ScrapeBox have moved it into the realm of a white hat SEO gadget, finding new life as a timesaver that is simple to learn and relatively inexpensive to purchase.
Although ScrapeBox is Windows-only software, it delivers a helpful set of tools to organize and speed up manual searches and link processing. SB delivers an impressive amount of power that can help speed up daily tasks and support a variety of SEO projects.
ScrapeBox software requires only a one-time payment for a lifetime license. This includes all updates (there have been over 400 updates since ScrapeBox was released in 2009), loads of free add-ons and hundreds of add-on updates. All you have to do now is start harvesting.