Version History
- v8.3 (Released 27.12.2011)
- v8.2 (Released 06.07.2011)
- v8.1 (Released 27.12.2010)
- v8.0 (Released 01.07.2010)
- v7.3 (Released 17.06.2009)
- v7.2 (Released 10.04.2009)
- v7.1 (Released 15.04.2008)
- v6.0 (Released 17.04.2007)
- v5.0 (Released 06.07.2006)
- v4.3 (Released 20.09.2005)
- v4.0 (Released 11.12.2004)
Testimonials
Web Addresses
“We downloaded and ran the trial version of your web link extractor. I compared it to another program and yours kicked its butt. Yours scanned 9000 files while finding over 1500 links vs. the other, which only scanned 1200 files and found only about 400 links. (This was using the exact same search file).”
Proven uses of WDE
1. I want to extract contact data of travel related companies.
2. I want to extract contact data of travel related companies of Australia.
3. I want to get more data of travel related companies of Australia.
4. I want to extract all data from a website http://www.mydomain.com
6. I have a list of urls in a file and I want to extract data from those urls.
8. I want to extract real estate companies phone / fax numbers of Canada, Toronto area.
9. I want to build a domain list of health/medicine related websites.
(1) I want to extract contact data of travel related companies.
Go to New Session Dialog
Select "Source = Search Engines"
Enter travel in Keyword Box
Select what type of data you want to extract (email, phone, fax, ...)
Select "Save Data" folder, i.e. where the program will save the data
Select Save Format - CSV or line by line
Click OK button
(2) I want to extract contact data of travel related companies of Australia.
Repeat (1) but select "Engine = Australia" from the Engine Listing Dialog. You can launch this dialog by clicking the "Engines" button on the New Session - General Tab.
By default US/International Engines are selected.
(3) I want to get more data of travel related companies of Australia.
Repeat (2) but use more keywords like
- travel
- hotel
- cruise
(4) I want to extract all data from a website.
Go to New Session Dialog
Select "Source = WebSite"
Enter website URL in Starting Address box: like http://www.mydomain.com
Select depth = 0 (to spider the entire website; see more about depth here)
Select what type of data you want to extract (email, phone, fax, ...)
Select "Save Data" folder, i.e. where the program will save the data
Select Save Format - CSV or line by line
Click OK button
(5) I want to extract all photographers' contact data from a Yahoo directory like http://dir.yahoo.com/Arts/Visual_Arts/Photography/Photographers/ to send them an invitation to visit my new photographer forum.
Go to New Session Dialog
Select "Source = WebSite"
Enter website URL in Starting Address box: like http://dir.yahoo.com/Arts/Visual_Arts/Photography/Photographers/
Select depth = 0; check "Stay within Full URL"
This combination of 2 settings tells WDE to process the entire photographers directory but no other part of the Yahoo directory.
Select what type of data you want to extract (email, phone, fax, ...)
Select "Save Data" folder, i.e. where the program will save the data
Select Save Format - CSV or line by line
Now go to the External Site tab - select "Follow External URLs" - select Spidering Mode (Intelligent, or you define the depth)
[ Intelligent Spidering: When you set this mode, WDE uses a special technique and only processes pages that are likely to contain contact information (phone, fax, email). ]
Now go back to the General tab and click the OK button
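As a rough illustration of what the depth = 0 plus "Stay within Full URL" combination implies, the Python sketch below checks whether a discovered link sits under the starting address. The function name within_full_url and the prefix-matching rule are assumptions made for this example; how WDE actually decides this internally is not described on this page.

```python
from urllib.parse import urlsplit


def within_full_url(start_url, candidate):
    """Return True if candidate is on the same host and under the start path.

    Hypothetical helper: this is one plausible reading of
    "Stay within Full URL", not WDE's documented behaviour.
    """
    start, cand = urlsplit(start_url), urlsplit(candidate)
    if start.netloc.lower() != cand.netloc.lower():
        return False
    # Treat the starting path as a directory prefix.
    prefix = start.path if start.path.endswith("/") else start.path + "/"
    return cand.path == start.path or cand.path.startswith(prefix)


START = "http://dir.yahoo.com/Arts/Visual_Arts/Photography/Photographers/"

if __name__ == "__main__":
    for link in (START + "Portrait/", "http://dir.yahoo.com/Arts/Visual_Arts/"):
        print(link, within_full_url(START, link))
```

Under this reading, links deeper in the photographers directory are followed, while links to other parts of the same host are skipped.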
(6) I have a list of urls in a file and I want to extract data from those urls.
Go to New Session Dialog
Select "Source = URLs from File"
Enter the url file path in the File name box. This file must be a plain text file with one URL per line, and each line must start with the http:// string.
Select Depth = 0 to extract the entire website for each website listed in the text file, or select "process 1 page only" to spider only the specified url.
Select what type of data you want to extract (email, phone, fax, ...)
Select "Save Data" folder, i.e. where the program will save the data
Select Save Format - CSV or line by line
Click OK button
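Before loading a URL file into WDE, it can help to check that it meets the format described above (plain text, one URL per line, each starting with http://). The Python sketch below is one way to do that; the function name load_url_list is hypothetical and the check is only as strict as the rule stated in the steps.

```python
def load_url_list(text):
    """Split file contents into (usable URLs, rejected lines).

    A line is usable if, after trimming whitespace, it starts with
    "http://". Blank lines are ignored.
    """
    good, bad = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.lower().startswith("http://"):
            good.append(line)
        else:
            bad.append(line)
    return good, bad


if __name__ == "__main__":
    sample = "http://www.mydomain.com\nwww.example.com\n"
    urls, rejected = load_url_list(sample)
    print("usable:", urls)
    print("rejected:", rejected)
```

Rejected lines can then be corrected by hand (for example by adding the missing http:// prefix) before the file is used in a session.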
(7) I want to compile a list of offshore, banking, tax related websites that do link exchange with other sites.
Go to New Session Dialog
Select "Source = Search Engines"
Generate Keywords using following 2 lists:
- offshore banking tax accounting
- link exchange trade links swap link add url
Select Extract Meta Tag and Extract Email
Select "Save Data" folder, i.e. where the program will save the data
Select Save Format - CSV
Now go to External Site Tab. Select "Follow External URL". Select Spidering Mode = I will Select the Depth. Select "Process 1 Page Only". Select "Spider Base URL only"
Now go to the Filters - Text Filters tab. Check "page must contain following text". Enter the following strings in the box:
- links.html
- link.html
- resource.html
- add url
- submit url
- add your site
- submit your site
This way WDE will extract data only from those websites that do link exchange or add urls to their directories.
Now go back to the General tab and click the OK button.
After extraction is complete, go to the Data Tab - Meta Tag list. These are the related sites that do link exchange with other sites.
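To show what generating keywords from the 2 lists in this use case could look like, the Python sketch below pairs every topic word with every link-exchange phrase. Both the pairing rule and the way the second list is split into phrases ("link exchange", "trade links", "swap link", "add url") are assumptions for illustration; WDE's own keyword generator may combine the lists differently.

```python
from itertools import product


def combine_keywords(topics, qualifiers):
    """Return every "topic qualifier" pairing as a search keyword."""
    return [f"{t} {q}" for t, q in product(topics, qualifiers)]


# Assumed split of the two lists given in the steps above.
topics = ["offshore", "banking", "tax", "accounting"]
qualifiers = ["link exchange", "trade links", "swap link", "add url"]

if __name__ == "__main__":
    for keyword in combine_keywords(topics, qualifiers):
        print(keyword)
```

With 4 entries in each list this yields 16 keyword phrases, which is why a short pair of lists can drive a fairly broad search.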
(8) I want to extract real estate companies phone / fax numbers of Canada, Toronto area.
Go to New Session Dialog
Select "Source = Search Engines"
Enter real estate in Keyword Box
Select "Engine = Canada" from the Engine Listing Dialog. You can launch this dialog by clicking the "Engines" button
Select Extract phone, fax
Select "Save Data" folder, i.e. where the program will save the data
Select Save Format - CSV or line by line
Now go to Filters Tab - Data Filters - Phone/Fax box. Enter
416
1416
in both boxes, so that WDE will extract only those phone/fax numbers that contain 416 or 1416
(See more info about phone/fax filter in Session Details page)
Now back to General tab and Click OK button.
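The effect of the Phone/Fax filter in this use case can be approximated outside WDE as well, for example to re-check an exported list. The Python sketch below keeps a number if its digits contain any of the filter entries. The function name matches_phone_filter and the digits-only "contains" comparison are assumptions; the Session Details page remains the reference for how WDE applies the filter.

```python
import re


def matches_phone_filter(number, fragments):
    """Return True if the number's digits contain any filter fragment."""
    digits = re.sub(r"\D", "", number)  # drop spaces, dashes, brackets
    return any(fragment in digits for fragment in fragments)


FILTER = ["416", "1416"]

if __name__ == "__main__":
    for phone in ("(416) 555-0123", "+1 416 555 0123", "(613) 555-0123"):
        print(phone, matches_phone_filter(phone, FILTER))
```

Note that under this reading a number such as 905-416-0123 also passes, because the digits 416 appear in the middle of the number.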
(9) I want to build a domain list of health/medicine related websites.
Go to New Session Dialog
Select "Source = Search Engines"
Enter following Keywords:
health
medicine
and so on...
Select Extract URL (select Base URL)
Select "Save Data" folder, i.e. where the program will save the data
Select Save Format - line by line
Click OK button
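For readers who want to post-process a saved line-by-line URL file into a domain list themselves, the Python sketch below reduces each address to scheme plus host and removes duplicates. Treating "Base URL" as scheme plus host is an assumption made for this example, and base_urls is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def base_urls(urls):
    """Reduce URLs to unique scheme://host/ entries, keeping first-seen order."""
    seen = set()
    result = []
    for url in urls:
        parts = urlsplit(url.strip())
        if not parts.netloc:
            continue  # skip lines that are not absolute URLs
        base = f"{parts.scheme}://{parts.netloc.lower()}/"
        if base not in seen:
            seen.add(base)
            result.append(base)
    return result


if __name__ == "__main__":
    pages = [
        "http://www.example.com/a/b.html",
        "http://www.example.com/c",
        "http://health.example.org/x?y=1",
    ]
    print("\n".join(base_urls(pages)))
```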
(10) I have a url list in a SQL database. I want to extract the url, title, description, keywords, and plain page text of html <BODY> to </BODY> and merge them into the database.
WDE cannot access a SQL database. You need to export the url list from the SQL database to a plain text disk file, and use this file in WDE.
Go to New Session Dialog
Select "Source = URLs from File"
Enter the url file path in the File name box. This file must be a plain text file with one URL per line, and each line must start with the http:// string.
Select "process 1 page only" to extract the meta tags of the specified root domain. If you need to extract the meta tags of ALL pages of each website, then select depth=0
Select Extract Meta tag, Extract Body (you can set text size limit by clicking ... button)
Select "Save Data" folder, i.e. where the program will save the data
Select Save Format - CSV
Uncheck "View - Display data in data tab" for very large URL Meta tag extractions, so that WDE will not display data within the program but will write directly to the disk file. This improves program performance.
Click OK button
After extraction is complete, you can import this csv file (metatag.txt) into the SQL database and do further processing, queries, etc...
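The export step at the start of this use case depends on your database. As a minimal sketch, the Python code below writes a URL column to a plain text file with one http:// URL per line, using SQLite as a stand-in. The table name sites, the column name url, and the function name export_urls are all hypothetical; substitute your own database driver and schema.

```python
import sqlite3


def export_urls(conn, out_path, table="sites", column="url"):
    """Write one http:// URL per line to out_path; return the count written.

    table and column are trusted identifiers chosen by the caller
    (hypothetical names here), not user input.
    """
    rows = conn.execute(f"SELECT {column} FROM {table}").fetchall()
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for (url,) in rows:
            url = (url or "").strip()
            if url.lower().startswith("http://"):
                out.write(url + "\n")
                count += 1
    return count


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sites (url TEXT)")
    conn.execute("INSERT INTO sites VALUES ('http://www.mydomain.com')")
    print(export_urls(conn, "urls.txt"), "urls written")
```

The resulting file can be entered in the File name box as described in the steps above.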
Country Specific Search Engine List
Intelligent Spidering Mode
(extracts more contact data without visiting entire site)
Parser options
Meta tags limit, prefixes and more