Support
If you have a question that isn't answered here, then please
contact us.
What is WebExtract? [Top]
WebExtract is a very powerful email extractor which can find a
variety of different email addresses from websites on the Internet.
It can find addresses from specific countries, domains and areas of
interest. It's simply one of the most powerful tools out there. Look
on the main page for the entire list of features found in WebExtract.
Does this program contain ANY malware? [Top]
WebExtract is, and always will be, 100% free of all kinds of
harmful code such as spyware, adware and viruses. If you don't like
the program you can always uninstall it without risking any kind of
harm to your system.
System requirements [Top]
Windows 95/98/2000/2003 Server/ME/XP/Vista
256 MB of RAM (512 MB recommended)
Internet connection
Good to know about Windows XP SP2 [Top]
Users running Windows XP with Service Pack 2 installed have been
limited by Microsoft to 10 simultaneous half-open (incomplete)
outbound connection attempts. This limit can be removed by running
an unofficial patch found at
lvllord.de
Remember that this is a third-party patch and it's only recommended
for experienced computer users.
How do I start it? [Top]
After you have installed it, open your Start menu and click on
'Programs'. Under that menu you will see a group called 'WebExtract Shareware'.
How do I use it? [Top]
WebExtract basically has two different modes, keyword extraction
and URL extraction. The different methods of extraction only define
how websites are found. Under the first tab in the program
("Keywords") you can enter different phrases and words that you want
the program to use when querying the search engine. The sites found
will be relevant to the words and phrases added under the "Keywords"
tab.
The other mode is under the tab "URL list", where you can enter
URLs directly. The URLs entered in the list will be spidered for
emails and no search engines will be used. Remember that if you want
to use the URL extraction mode you need to enable it under that tab
with the setting "Work in URL list mode".
Those are basically the two different ways WebExtract can find
emails.
Read on to see explanations of the other tabs and functions in the
program.
How do I save the emails? [Top]
When you click on "Start" you will be asked if you are ready to
begin; press "Yes". The next window that appears is where you
set where the emails will be saved. By default, emails will be
saved automatically when more than 100,000 have been found. This can
be changed by setting a different number in the field "Autosave
emails when more than this number have been found" under the tab
"Expert". The emails will also be saved when you click on "Stop" or
when the search has been completed.
What is a keyword? [Top]
Keywords are an essential part of the program. They are used to
query the search engine in order to find sites to harvest for
emails.
This is how it works:
1. You add the keyword "cats" in the keyword list and click on
"Start".
2. WebExtract goes to the search engine and makes a search using the
keyword.
3. The search engine processes the keyword and returns the matching
sites.
4. WebExtract starts visiting the sites returned by the search
engine in order to find emails.
5. When the program is running low on sites it takes the next
keyword in the list and starts from step 2.
6. When no more sites can be found, the program stops.
That's how it works.
How do I get started as quickly as possible? [Top]
First of all you will need some keywords that will be used with the
search engine to get sites to harvest. For now, click on the tab
"Keywords" and then click on the "Import test keywords" button to
import some common keywords that you can test with. Now click on the
"Connection settings" tab and define the number of harvesting
connections that you would like to use. Use about 50 if your
connection is a slow or medium-fast DSL or cable connection, about
100-200 if you have a good connection (fast DSL, cable or a T1), and
200 or above if your connection is faster than a T1. If you are
unsure about this setting, set it to 100.
You are now ready to begin harvesting (we will get into the other
settings in the next section).
Click on "Start". It will ask you if you are ready to begin
harvesting; click on "Yes". You will now need to select the path
where all the accepted emails will be saved. Define a path or keep
the default path; it's your choice. Then click on "Done".
The program will now begin to query the search engine with the
keywords that you have defined. It will get as many sites as defined
in the connection settings, then it will harvest them for emails.
There is no need to do anything more now; it will work until it runs
out of keywords and sites. Good luck!
How can I find sites and emails from specific countries? [Top]
If you want to use your keywords to find sites and emails from, for
example, Australia, then do the following:
1. Go to the tab "Keywords" and add some keywords which you would
like to use.
2. Under the tab "Expert" enter this in the field called "Custom
query string": &as_sitesearch=.au
3. Make sure that "Work in URL list mode" under the tab "URL list"
is disabled.
4. So far you have made the settings to only get sites from
Australia from the search engine. If you also want to make sure that
all the emails that you find are truly Australian, then do the
following, otherwise skip to step 5: Under the tab "Email parsing"
select the option "Only accept emails with these domain extensions"
and enter the following in the box below that option: .au
5. You're all set, click on "Start" to begin.
If you want to change the country then just replace .au in steps 2
and 4 with any other country code which you can find in the list
below, like .fi for Finland, .eg for Egypt or .ar for Argentina.
If this didn't work then click on the button "Reset" and wait five
seconds to return all the settings to normal. Then start from step 1
again. If it still doesn't work then contact support.
Here are the country-code top-level domains for each country in
the world that has Internet access.
.ac – Ascension Island
.ad – Andorra
.ae – United Arab Emirates
.af – Afghanistan
.ag – Antigua and Barbuda
.ai – Anguilla
.al – Albania
.am – Armenia
.an – Netherlands Antilles
.ao – Angola
.aq – Antarctica
.ar – Argentina
.as – American Samoa
.at – Austria
.au – Australia
.aw – Aruba
.ax – Aland Islands
.az – Azerbaijan
.ba – Bosnia and Herzegovina
.bb – Barbados
.bd – Bangladesh
.be – Belgium
.bf – Burkina Faso
.bg – Bulgaria
.bh – Bahrain
.bi – Burundi
.bj – Benin
.bm – Bermuda
.bn – Brunei Darussalam
.bo – Bolivia
.br – Brazil
.bs – Bahamas
.bt – Bhutan
.bv – Bouvet Island
.bw – Botswana
.by – Belarus
.bz – Belize
.ca – Canada
.cc – Cocos (Keeling) Islands
.cd – Congo, The Democratic Republic of the
.cf – Central African Republic
.cg – Congo, Republic of
.ch – Switzerland
.ci – Cote d'Ivoire
.ck – Cook Islands
.cl – Chile
.cm – Cameroon
.cn – China
.co – Colombia
.cr – Costa Rica
.cu – Cuba
.cv – Cape Verde
.cx – Christmas Island
.cy – Cyprus
.cz – Czech Republic
.de – Germany
.dj – Djibouti
.dk – Denmark
.dm – Dominica
.do – Dominican Republic
.dz – Algeria
.ec – Ecuador
.ee – Estonia
.eg – Egypt
.eh – Western Sahara
.er – Eritrea
.es – Spain
.et – Ethiopia
.eu – European Union
.fi – Finland
.fj – Fiji
.fk – Falkland Islands (Malvinas)
.fm – Micronesia, Federated States of
.fo – Faroe Islands
.fr – France
.ga – Gabon
.gb – United Kingdom
.gd – Grenada
.ge – Georgia
.gf – French Guiana
.gg – Guernsey
.gh – Ghana
.gi – Gibraltar
.gl – Greenland
.gm – Gambia
.gn – Guinea
.gp – Guadeloupe
.gq – Equatorial Guinea
.gr – Greece
.gs – South Georgia and the South Sandwich Islands
.gt – Guatemala
.gu – Guam
.gw – Guinea-Bissau
.gy – Guyana
.hk – Hong Kong
.hm – Heard and McDonald Islands
.hn – Honduras
.hr – Croatia/Hrvatska
.ht – Haiti
.hu – Hungary
.id – Indonesia
.ie – Ireland
.il – Israel
.im – Isle of Man
.in – India
.io – British Indian Ocean Territory
.iq – Iraq
.ir – Iran, Islamic Republic of
.is – Iceland
.it – Italy
.je – Jersey
.jm – Jamaica
.jo – Jordan
.jp – Japan
.ke – Kenya
.kg – Kyrgyzstan
.kh – Cambodia
.ki – Kiribati
.km – Comoros
.kn – Saint Kitts and Nevis
.kp – Korea, Democratic People's Republic
.kr – Korea, Republic of
.kw – Kuwait
.ky – Cayman Islands
.kz – Kazakhstan
.la – Lao People's Democratic Republic
.lb – Lebanon
.lc – Saint Lucia
.li – Liechtenstein
.lk – Sri Lanka
.lr – Liberia
.ls – Lesotho
.lt – Lithuania
.lu – Luxembourg
.lv – Latvia
.ly – Libyan Arab Jamahiriya
.ma – Morocco
.mc – Monaco
.md – Moldova, Republic of
.me – Montenegro
.mg – Madagascar
.mh – Marshall Islands
.mk – Macedonia, The Former Yugoslav Republic of
.ml – Mali
.mm – Myanmar
.mn – Mongolia
.mo – Macao
.mp – Northern Mariana Islands
.mq – Martinique
.mr – Mauritania
.ms – Montserrat
.mt – Malta
.mu – Mauritius
.mv – Maldives
.mw – Malawi
.mx – Mexico
.my – Malaysia
.mz – Mozambique
.na – Namibia
.nc – New Caledonia
.ne – Niger
.nf – Norfolk Island
.ng – Nigeria
.ni – Nicaragua
.nl – Netherlands
.no – Norway
.np – Nepal
.nr – Nauru
.nu – Niue
.nz – New Zealand
.om – Oman
.pa – Panama
.pe – Peru
.pf – French Polynesia
.pg – Papua New Guinea
.ph – Philippines
.pk – Pakistan
.pl – Poland
.pm – Saint Pierre and Miquelon
.pn – Pitcairn Island
.pr – Puerto Rico
.ps – Palestinian Territory, Occupied
.pt – Portugal
.pw – Palau
.py – Paraguay
.qa – Qatar
.re – Reunion Island
.ro – Romania
.rs – Serbia
.ru – Russian Federation
.rw – Rwanda
.sa – Saudi Arabia
.sb – Solomon Islands
.sc – Seychelles
.sd – Sudan
.se – Sweden
.sg – Singapore
.sh – Saint Helena
.si – Slovenia
.sj – Svalbard and Jan Mayen Islands
.sk – Slovak Republic
.sl – Sierra Leone
.sm – San Marino
.sn – Senegal
.so – Somalia
.sr – Suriname
.st – Sao Tome and Principe
.su – Soviet Union (being phased out)
.sv – El Salvador
.sy – Syrian Arab Republic
.sz – Swaziland
.tc – Turks and Caicos Islands
.td – Chad
.tf – French Southern Territories
.tg – Togo
.th – Thailand
.tj – Tajikistan
.tk – Tokelau
.tl – Timor-Leste
.tm – Turkmenistan
.tn – Tunisia
.to – Tonga
.tp – East Timor
.tr – Turkey
.tt – Trinidad and Tobago
.tv – Tuvalu
.tw – Taiwan
.tz – Tanzania
.ua – Ukraine
.ug – Uganda
.uk – United Kingdom
.um – United States Minor Outlying Islands
.us – United States
.uy – Uruguay
.uz – Uzbekistan
.va – Holy See (Vatican City State)
.vc – Saint Vincent and the Grenadines
.ve – Venezuela
.vg – Virgin Islands, British
.vi – Virgin Islands, U.S.
.vn – Vietnam
.vu – Vanuatu
.wf – Wallis and Futuna Islands
.ws – Samoa
.ye – Yemen
.yt – Mayotte
.yu – Yugoslavia
.za – South Africa
.zm – Zambia
.zw – Zimbabwe
How can I find emails from a specific domain? [Top]
This can be divided into two different cases. The first is when you
want to find emails that are located on specific domain(s) of your
choice.
For example, if you chose the domain tripod.com and the program
finds the site http://hello.tripod.com/contact.html then ALL the
valid emails on that URL will be accepted regardless of whether they
contain @tripod.com or not.
The second is when you want to find emails that contain the domain
of your choice.
For example, if you chose the domain tripod.com and the program
finds the site http://www.anywhere.com/page.html then ONLY the valid
emails which contain @tripod.com on page.html will be accepted.
The following methods use the domain tripod.com as an example;
this can of course be changed to any other domain of your choice.
The first method
1. Go to the tab "Keywords" and add some keywords which you would
like to use.
2. Under the tab "Expert" enter this in the field called "Custom
query string": &as_sitesearch=tripod.com
3. Make sure that "Work in URL list mode" under the tab "URL list"
is disabled.
4. Click on "Start" to begin.
The second method
1. Go to the tab "Keywords" and add keywords in the following
format: @tripod.com keyword
where keyword is any common word of your choice. This will greatly
increase your chances of getting emails containing the string
@tripod.com.
2. Under the tab "Email parsing" select the option "Only accept
emails with these domains" and enter the following in the box below
that option: tripod.com
3. Make sure that "Work in URL list mode" under the tab "URL list"
is disabled.
4. Click on "Start" to begin.
If this didn't work then click on the button "Reset" and wait five
seconds to return all the settings to normal. Then start from step 1
again. If it still doesn't work then contact support.
How can I find each email on a specific site? [Top]
If you want to find emails from, for example,
http://www.nexo-sa.com/asp/news/news.asp
and all the other URLs on that site, then do the following:
1. Go to the tab "URL list" and enable the setting "Work in URL list
mode".
2. Add the URL of the page at which the harvesting should begin, in
this case
http://www.nexo-sa.com/asp/news/news.asp
3. Go to the tab "Connection settings" and change the value
"Harvesting connections" to 2. The reason it should be so low is that
this is the maximum number of simultaneous connections a normally
configured web server allows from one single IP.
4. Under the same tab, disable the function "Avoid potential
spam-traps".
5. Go to the tab "Email parsing" and disable "Discard emails with
these usernames".
6. Go to the tab "Deep crawling" and enable that function. Also
enable the box that says "Only accept URL's that contain these
strings" and enter the following in that box: nexo-sa.com
We do this so the deep crawling function will only get URLs from
the site nexo-sa.com and no other outside domain.
7. You are good to go, click on "Start" to begin.
If this didn't work then click on the button "Reset" and wait five
seconds to return all the settings to normal. Then start from step 1
again. If it still doesn't work then contact support.
Why should I purchase the full version? [Top]
The free Shareware version has some limitations compared to the
registered full version. All these limits will be removed when you
purchase the full version. For a list of the limitations please see
the buy now page.
URL list [Top]
If you have a list of URLs you would like WebExtract to spider for
emails, enter them here. Make sure you enable "Work in URL list
mode" for this function to be enabled.
Work in URL list mode
When you enable this you are entering the URL list mode of
WebExtract, meaning that it will not use any keywords that you have
added. It will only spider the URLs in the list for emails.
Generate URL's (X to Y, Step Z)
This function can generate URLs using a logical pattern. By default
(this can be changed) it will start at X (1) and go up to Y (100),
increasing by Z (1) until Y (100) has been reached.
Click on the "Generate" button to further explore this function. The
number generated will replace the string %NUMBER% in your URL.
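The pattern described for "Generate URL's" can be illustrated with a short Python sketch. This is not WebExtract's own code; the function name generate_urls and the example.com template are made up for illustration, and it assumes that Y is included in the range.

```python
def generate_urls(template, start=1, stop=100, step=1):
    """Replace %NUMBER% in the template with each value from start to stop.

    Assumption: the upper bound (Y) is inclusive, as the description
    "until Y has been reached" suggests.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    return [
        template.replace("%NUMBER%", str(number))
        for number in range(start, stop + 1, step)
    ]


# Hypothetical template: X = 1, Y = 10, Z = 3 gives the numbers 1, 4, 7, 10.
for url in generate_urls("http://www.example.com/page%NUMBER%.html", 1, 10, 3):
    print(url)
```

With the default values (1 to 100, step 1) the same template would produce 100 URLs.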
Connection Settings [Top]
Harvesting connections
This is the number of connections that will be used to connect
to the websites found by the search engine or in your URL list. Use
about 50 if your connection is a slow or medium-fast DSL or cable
connection, about 100-200 if you have a good connection (fast DSL,
cable or a T1), and 200 or above if your connection is faster than a
T1. If you are unsure about this setting, set it to 100.
Searchengine connections
This is the number of connections that will be used to connect
to the search engine to get websites. This is disabled by default
since the "Emulate human behavior" function under the tab "Expert"
is enabled, which in turn disables this. Don't disable that function
unless you know what you are doing; doing so may put you at risk of
getting temporarily blocked by the search engine.
Timeout (seconds)
This is the number of seconds a connection waits without receiving
a response from the server before it times out and moves on. Timeout
applies to both the search engine connections and the harvesting
connections. The lower you keep this, the faster it will go through
websites, but you will also find fewer emails. The higher you set it,
the more emails it will find, but it will work slower. A good
suggestion is to keep it in the middle. The default of 30 seconds is
good.
Sites to find before harvest
This is the minimum number of sites it will accept from the
search engine before starting to harvest, unless you run out of
keywords. It works like this: it gets 1000 sites and harvests them
for emails, then it gets another 1000 sites and harvests them, and
so on. The higher you set it, the more memory it will consume. Keep
this at 1000 unless you have some specific reason to change it.
Maximum URL's to remember
This is an important setting if you plan on harvesting several
million emails. It keeps the duplication of emails from getting too
high and prevents you from visiting the same site twice. It keeps
this number of the last visited sites in memory and makes sure that
you don't visit any of them again during this session. Note that it
might take quite a lot of RAM if you set this too high. You should
have at least 256 MB of RAM to have it at 50,000, but 512 MB is
preferred for speed and overall performance. Don't change this unless
you are positive your computer can handle it, otherwise you may
experience poor performance after a few hours.
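The trade-off between memory and repeat visits can be pictured with a small Python sketch of a bounded "recently seen" list. It is only an illustration of the idea, not the program's actual implementation; the class name RecentUrls is hypothetical, and it assumes the oldest entry is forgotten first once the limit is reached.

```python
from collections import OrderedDict


class RecentUrls:
    """Remember at most max_size URLs; forget the oldest one when full."""

    def __init__(self, max_size=50_000):
        self.max_size = max_size
        self._seen = OrderedDict()  # keeps insertion order, oldest first

    def seen_before(self, url):
        """Return True if url is remembered; otherwise remember it."""
        if url in self._seen:
            return True
        self._seen[url] = None
        if len(self._seen) > self.max_size:
            self._seen.popitem(last=False)  # drop the oldest entry
        return False


recent = RecentUrls(max_size=2)
print(recent.seen_before("http://www.example.com/a"))  # first visit
print(recent.seen_before("http://www.example.com/a"))  # remembered
```

A larger limit remembers more sites and so avoids more repeat visits, at the cost of holding more entries in RAM.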
Save related URL with each email address
If you want to know which site each email was found on, enable
this. They will be saved in the format of "site url","email" in the
accepted emails file.
Save ALL of the URL's
This will save all URLs WebExtract comes across, regardless of
whether there is an email on them or not. They will be saved in the
format of "site url","email" in the accepted emails file.
Keep cache URL formatting
When enabled the program will keep the original Google cache
URL formatting instead of determining the source URL. They will be
saved in the format of "site url","email" in the accepted emails
file. This setting will only be used if you have enabled getting
sites from the Google cache.
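If you want to process a file saved in the "site url","email" format with your own tools, it can be read as ordinary comma-separated text. The Python sketch below assumes one quoted pair per line; the sample lines and the function name read_pairs are invented for illustration.

```python
import csv
import io

# Invented sample in the "site url","email" format described above.
SAMPLE = (
    '"http://www.example.com/contact.html","info@example.com"\n'
    '"http://www.example.com/about.html","sales@example.com"\n'
)


def read_pairs(text):
    """Return (site_url, email) tuples, skipping lines with other shapes."""
    reader = csv.reader(io.StringIO(text))
    return [(row[0], row[1]) for row in reader if len(row) == 2]


for site_url, email in read_pairs(SAMPLE):
    print(site_url, "->", email)
```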
Avoid potential spam-traps
This is checked by default and makes sure that the program
doesn't collect any harmful emails. It avoids several different
spam-traps and so-called honeypots that would otherwise cause you a
headache by possibly getting your server shut down.
Accept no more than this number of Bytes from an URL
This setting makes sure that the program doesn't get stuck
downloading huge sites. Keeping it at 150,000 bytes is just fine.
150,000 bytes is about 150 KB, which is about 0.15 MB. It's up to you
what you want to keep this at; if you don't know, don't change it.
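The unit conversion for the byte limit is easy to check. The Python sketch below is only a worked calculation; the helper names are hypothetical, and the truncate function simply shows what "accept no more than this number of bytes" means for a block of data.

```python
LIMIT_BYTES = 150_000


def describe(num_bytes):
    """Express a byte count in decimal (kB, MB) and binary (KiB) units."""
    return {
        "kB": num_bytes / 1000,
        "MB": num_bytes / 1_000_000,
        "KiB": num_bytes / 1024,
    }


def truncate(data, limit=LIMIT_BYTES):
    """Keep only the first `limit` bytes of the data."""
    return data[:limit]


print(describe(LIMIT_BYTES))
print(len(truncate(b"x" * 200_000)))
```

In decimal units the default works out to 150 kB or 0.15 MB; in binary units it is a little under 147 KiB.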
Search engine user-agent, Morphing - These 2 settings work together.
The search engine user-agent is how the program identifies itself to
the search engine it communicates with. If you set it to morph it
will generate a different user-agent each time, making it nearly
impossible for anyone except you to keep track of it. It also
prevents it from being blocked based on the user-agent. If you
uncheck morph user-agent it will identify as one of the agents you
chose in the list of agents. You can also type in your own agent to
identify as.
Harvesting user-agent, Morphing - These 2 settings work together.
Same as above, except that this user-agent setting refers to the
sites that the program visits and harvests emails from.
Extend keywords
This will extend the keywords with various extensions to
increase the number of emails you find. If you set it to Disabled
it will not do this at all; it will be slower but the emails will be
very targeted. If you set it to Default it will do it a little and
give the best return rate of emails; it will be very fast and the
emails will be quite targeted. If you set it to High it will do it
a lot and will work a long time on each keyword, finding quite a lot
of emails, but it won't be very fast and the emails won't be
especially targeted.
Expert [Top]
Domain name servers (DNS)
Here you can add the DNS servers that the program will use to
resolve hosts to IPs. This is a very important factor for getting
good speed and for being able to harvest at all. The faster the DNS
server is, the faster you can harvest. On startup your default DNS
servers will be detected and added to the dropdown list. If you do
not wish to use them, simply click on clear list. Then use the field
below to add the one of your choice. Good luck.
Emulate human behavior
This will emulate human behavior when getting sites from the
search engine. It will prevent you from getting blocked. When
enabling this the program will override your setting for "Searchengine
connections" and set it to 1. Between each connection it will pause
for X seconds plus a short random interval to increase the odds of
being taken for a human. Set the number of seconds between each
query in the field to the right of this option.
Autosave emails when more than this number have been found
This is the number of emails to keep in memory before saving
them to a file. The last X emails (where X is the value of this
field) contain no duplicates. The higher you set it, the more RAM it
will take and the fewer duplicated emails the total list will
contain. If you get too many duplicate emails when performing a
large harvesting session, try increasing this value. But remember to
have a good deal of RAM available if you increase it.
Custom query string
This is a very effective field if you use it right. With it you
can add parameters to the queries the program makes to the search
engine, just as Google adds them in the address bar. For example, if
you set the field to "&as_sitesearch=tripod.com" (without the
quotes), then only sites from the domain tripod.com will be returned
in the search results. You can also replace tripod.com with, for
example, .tw to only get sites from Taiwan. If you set the field to
"&lr=lang_it" then only sites in Italian will be displayed ("it"
being short for "Italian"). You can also combine two settings like
this: "&as_sitesearch=.co.uk&lr=lang_en".
With that query only sites from the United Kingdom written in
English will be returned. You can learn more about what you can add
in this field by going to http://www.google.com/advanced_search and
checking how the different settings you make change the address
bar. It might be a little complicated at first, but you will soon
get the hang of it.
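A custom query string is an ordinary list of name=value pairs joined with "&". The Python sketch below, using only the standard urllib.parse module, shows how such a string can be put together and taken apart again. The function names are made up for illustration, and the only parameters used are the ones already mentioned above (as_sitesearch and lr).

```python
from urllib.parse import parse_qsl, urlencode


def build_custom_query(**params):
    """Join name=value pairs into a string that starts with '&'."""
    return "&" + urlencode(params)


def explain(custom_query):
    """Split a custom query string back into a name -> value mapping."""
    return dict(parse_qsl(custom_query.lstrip("&")))


query = build_custom_query(as_sitesearch=".co.uk", lr="lang_en")
print(query)
print(explain(query))
```

Building the string this way also takes care of escaping any characters that are not allowed in a URL.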
Only get sites from the Google cache (VERY FAST)
Using this option you can scan the Google cache for emails. Since
all the Google servers are on very fast connections you will never
have bad speeds with this one if your keywords are decent. You can
retrieve over 2 million emails per hour if you're on a fast
connection. It consumes keywords and sites at an impressive rate.
Remember, though, that Google will detect this and quickly block your
IP for a few hours. This is NOT recommended unless you really know
what you are doing.
Do not visit URL's on these domains
When enabled, and if you have any domains listed in the field
below, any URL located on any of the domains entered in the field
will be skipped and will not be harvested or even connected to. This
is a useful setting for dodging "bad neighbourhoods" on the Internet.
Email parsing [Top]
Neither
This makes sure that neither of the two functions below,
"Only accept emails with these domain extensions" and
"Discard emails with these domain extensions", is used.
Only accept emails with these domain extensions
Only the domains and domain extensions of your choice will be
accepted if you select this. You can use this to only accept emails
from, for example, Spain and China. You would do this by adding this
string to the textbox: .es .cn
Discard emails with these domain extensions
Same as the previous function, except that instead of only
accepting the specified extensions, any emails that have the
extensions will be discarded.
Only accept emails with these domains
Only the domains of your choice will be accepted if you select
this. You can use this to only accept from, for example hotmail.com
and aol.com. You would do this by adding this string to the textbox:
hotmail.com aol.com
Discard emails with these usernames
Discards any email that has any of the usernames found in the
textbox.
Discard emails with these strings
Discards any email that contains any of the strings in the
textbox.
Discard emails containing more than this number of digits in a
row
Self-explanatory.
Discard emails that are longer than this number of characters
Self-explanatory.
Never extract more than this number of emails from an URL
This makes sure that you never get more than X emails (300 by
default) from one single URL. This protects you in case someone has
constructed 50,000 bad emails and put them on their site to mess
with you.
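The difference between filtering on a domain extension and filtering on a whole domain can be shown with a few lines of Python. This is a simplified sketch, not the program's own matching rules: the function names and sample addresses are invented, and it assumes the textbox contents are separated by blank spaces as in the examples above (Spain is .es and China is .cn).

```python
def domain_of(email):
    """Return the lower-cased part after the last '@'."""
    return email.rsplit("@", 1)[-1].lower()


def has_extension(email, extensions_text):
    """True if the address ends with one of the extensions, e.g. '.es .cn'."""
    extensions = extensions_text.lower().split()
    return any(domain_of(email).endswith(ext) for ext in extensions)


def has_domain(email, domains_text):
    """True if the address is on one of the domains, e.g. 'hotmail.com aol.com'."""
    return domain_of(email) in domains_text.lower().split()


# Invented sample addresses.
emails = ["ana@example.es", "li@example.cn", "bob@hotmail.com", "eve@example.org"]

print([e for e in emails if has_extension(e, ".es .cn")])
print([e for e in emails if has_domain(e, "hotmail.com aol.com")])
```

The "Discard" variants are the same tests with the result reversed.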
Proxies [Top]
At the moment SOCKS v5, HTTP and HTTPS proxies are supported in the
program. Any other types of proxies will not work.
Enable proxysupport on all connections
If you enable this the proxies in the list will be used for
harvesting and connecting to the search engine. Make sure that you
actually have proxies, otherwise it won't work.
Enable proxysupport only on search engine connections
If you enable this the proxies in the list will be used for
connecting to the search engine only. Make sure that you actually
have proxies, otherwise it won't work.
Proxies are separated with
This defines how each proxy in the list is separated. For
example, if your list looks like this: "231.4.12.1:1080
98.24.11.75:1080 56.123.2.1:1080" then each proxy in your list is
separated by a blank space. This function only applies if you have
chosen to load proxies from an URL or a local file.
Proxy load interval (m)
This defines the interval at which the list of proxies you use
is refreshed. Remember that the proxies aren't checked for validity;
that is up to you to take care of for the moment. If, for example,
this is set to 2 and you have chosen to load proxies from an URL,
then every 2 minutes that URL will be queried for proxies, the
current ones will be removed and the new ones will be added.
This function only applies if you have chosen to load proxies from
an URL or a local file.
Autoload proxies from this URL
Defines the URL from where the list of proxies will be
retrieved. Make sure that you enter the correct URL, otherwise it
won't work.
Autoload proxies from "proxies.txt" in the application directory
This will load proxies from the file proxies.txt that will be
residing in the same directory as the WebExtract executable. Keep
this file updated with fresh proxies to keep WebExtract flawlessly
working if you are using proxies. This is especially useful if you
have a program which is constantly finding fresh proxies for you.
Deep crawling [Top]
This is a very effective function if you want to go deeper into a
site. Keep in mind though that you are less likely to come across
emails using this function. If you activate deep crawling the program
will follow any links on sites it comes across that match the
settings which you define. Then it will attempt to extract more
emails and sites from the URLs it finds until the maximum depth
level has been reached (depth level and maximum sites to extract
from each site can be altered to fit your preference). Enabling this
feature gives you the ability to find a practically unlimited number
of sites with as little as one URL, or one keyword, as a starting
point.
Neither
This makes sure that neither of the two functions below,
"Ignore URL's that contain these strings" and "Only accept
URL's that contain these strings", is used.
Ignore URL's that contain these strings
Doesn't visit any URL that contains any of the strings defined
in the textbox.
Only accept URL's that contain these strings
Only visits an URL that contains any of the strings defined in
the textbox.
Never go deeper than this number of levels
If you visit one site and the program finds a few links, and
you then visit one of those links, you are at depth level 2. If
you visit yet another URL found on one of the pages at depth 2, you
will be at depth 3.
Never extract more than this number of URL's per site
This is the maximum number of URLs that will be extracted from
each individual URL the program comes across. It will extract the
first X URLs it finds on each URL.
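How the depth level, the per-URL limit and the "Only accept URL's that contain these strings" filter interact can be seen in a small Python sketch. It works on a made-up, in-memory map of links and never touches the network. The function name, the link map and the order of operations (filter first, then the per-URL limit) are assumptions made for illustration, not a description of the program's internals.

```python
from collections import deque


def crawl_levels(links, start, max_depth=2, max_links_per_page=2, must_contain=""):
    """Walk a link map breadth-first and return the depth of each page reached.

    The starting page is depth 1, pages linked from it are depth 2, and so on.
    """
    depth_of = {start: 1}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depth_of[page] >= max_depth:
            continue  # never go deeper than the configured level
        accepted = [url for url in links.get(page, []) if must_contain in url]
        for url in accepted[:max_links_per_page]:  # only the first X URLs
            if url not in depth_of:
                depth_of[url] = depth_of[page] + 1
                queue.append(url)
    return depth_of


# Invented link map: each page lists the links found on it.
LINKS = {
    "site.example/": ["site.example/a", "other.example/x",
                      "site.example/b", "site.example/c"],
    "site.example/a": ["site.example/a1"],
    "site.example/b": ["site.example/b1"],
    "site.example/a1": ["site.example/deep"],
}

print(crawl_levels(LINKS, "site.example/", max_depth=3,
                   max_links_per_page=2, must_contain="site.example"))
```

In this run the link to other.example is filtered out, site.example/c is dropped by the limit of 2 links per page, and site.example/deep is never reached because it would be at depth 4.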
Status [Top]
This will keep you updated on what is happening in the program:
how many emails have been accepted, how many have been discarded
(emails are only discarded based on your email parsing settings),
how many are invalid, how many accepted emails you collect each
hour, how many sites are left before the search engine will be
queried for more, how much data you have received and how many
keywords are left. When you run out of keywords and sites the search
is finished.
It won't start, it says I'm missing some file? [Top]
If the program says that you're missing an OCX or DLL file then
download and install this file from the Microsoft website (restart
your computer after you have installed it):
Runtime library
If it still won't work then contact us
and tell us which file you are missing.
No data is being downloaded? [Top]
This is a common problem if you are running a firewall or other data
controlling software which can be blocking WebExtract.
Make sure that you set your firewall to allow WebExtract full access
to the Internet. Also make sure that you have administrator access
on the system where WebExtract is running.
If that doesn't work then temporarily shut down your firewall and
try again.
If you still have a problem then contact us with as much detailed
information as possible and we will help you.
I have a question, how do I reach you? [Top]
You can find our contact information on this
page. Feel free to contact us with any questions that you might
have and we will get back to you as soon as possible, normally
within 24 hours.