June 02, 2003 interesting spam trap | ||||
---|---|---|---|---|
| ||||
http://www.unclebobsuncle.com/antispam.html roland :-) |
June 02, 2003 Re: interesting spam trap | ||||
---|---|---|---|---|
| ||||
Posted in reply to roland | Wow, thanks for bringing that to our attention. I can't wait to put that on my site. Quite funny too.
"roland" <--nancyetroland@free.fr> wrote in message news:bbgc75$2339$1@digitaldaemon.com...
> http://www.unclebobsuncle.com/antispam.html
>
> roland :-)
|
June 03, 2003 Re: interesting spam trap | ||||
---|---|---|---|---|
| ||||
Posted in reply to roland | Interesting indeed, but it does not work. Besides, most of the statements on the page have no grounds.

First of all, any decent spider or crawler keeps track of the URLs it has processed. Think about it: every decent website probably has circular references of the form x.html -> y.html -> z.html -> x.html. I know for a fact that quite a few of my sites have many of these. Obviously this is something anyone developing a spider or crawler, which I have done ;-), will run into. So the idea is cute, but I don't think it really works.

Second, quite a bit of the page is generated through JavaScript. Many spiders or crawlers do NOT run JavaScript. I know for a fact that JavaScript is a serious challenge for many of the search engines on the internet.

Third, some more advanced spiders or crawlers do not just look at mailto: tags, but recognize an '@' and check the prefix and suffix. They run the complete string through an email syntax checker to make sure the address contains only legal email address characters and actually ends with an existing Top Level Domain (TLD) such as .com, .net, etc., and later check the domain through DNS and/or Whois.

Fourth, the invalid email addresses have no effect on spammers. They will burn some more bandwidth, but as they usually use non-existent From: and Return-Path: addresses in their messages, anyone but the spammer will receive the bounces.

Fifth, if the spammer actually had some form of decency, bulk mailed to a list, and honored a removal mechanism, that mechanism usually is intelligent enough to keep track of bounces, probe them, and then remove them from the list automagically. See for instance http://www.ezmlm.org/, which works with MySQL http://www.mysql.com/, through which it is rather easy to maintain a database with millions of email addresses.
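To make the first and third points concrete, here is a minimal sketch in Python. It is not the crawler mentioned in this post: the in-memory `SITE` map, the tiny `KNOWN_TLDS` set, and the simplified `EMAIL_RE` pattern are all assumptions made up for illustration, and the later DNS/Whois check is left out. It only shows that a visited set stops a circular link chain, and that an '@'-based scan with a TLD check drops malformed addresses.

```python
import re
from collections import deque

# Hypothetical in-memory "site": page name -> (outgoing links, page text).
# x.html -> y.html -> z.html -> x.html forms the circular reference.
SITE = {
    "x.html": (["y.html"], "Contact: alice@example.com"),
    "y.html": (["z.html"], "Nothing here, just a stray @ sign."),
    "z.html": (["x.html"], "Write to bob@example.net or fake@host.invalidtld"),
}

# Assumed, deliberately tiny TLD list; a real checker would use the full list.
KNOWN_TLDS = {"com", "net", "org"}

# Simplified address pattern: legal-looking prefix, '@', dotted suffix.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return addresses found in text whose last label is a known TLD."""
    found = []
    for candidate in EMAIL_RE.findall(text):
        tld = candidate.rsplit(".", 1)[1].lower()
        if tld in KNOWN_TLDS:
            found.append(candidate)
    return found


def crawl(start):
    """Breadth-first crawl that never processes the same URL twice."""
    visited = set()
    queue = deque([start])
    order, emails = [], []
    while queue:
        url = queue.popleft()
        if url in visited or url not in SITE:
            continue  # already processed (or unknown): skip it
        visited.add(url)
        order.append(url)
        links, text = SITE[url]
        emails.extend(extract_emails(text))
        queue.extend(links)
    return order, emails


if __name__ == "__main__":
    pages, addresses = crawl("x.html")
    print(pages)
    print(addresses)
```

The crawl terminates after three pages even though z.html links back to x.html, because the second visit to x.html is skipped by the `visited` check.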
To actually *fight* SPAM, what would make sense is to report SPAM ASAP at http://www.spamcop.net/, as that results in more than just reporting. One of the great features is that once a lot of people start reporting a certain SPAM, spamcop will add the originating IP address to bl.spamcop.net, which can be used by email receiving servers (SMTP servers) to block incoming email if it comes from one of the many blocked IP addresses. Unfortunately, most people just seem to delete SPAM, and most email providers do not seem to use bl.spamcop.net for email blocking.

Of course, not publishing your email address ANYWHERE on the internet would help the most! ;-) However, I have noticed that quite a few companies that collect email addresses with online sales or other forms of subscription also sell those email addresses to others...

Just my 2 cents... Oh, in case there is any doubt... ;-) I have written a couple of crawlers, including crawlers that handle JavaScript very well. I have been hosting Internet services for 3 years. I do report almost all spam at http://www.spamcop.net/ and yes, the mail servers here do check bl.spamcop.net (and a few others) before they actually receive the email, that is, if the domain owners want it. Check http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml for some statistics on SPAM blocking...
Recently I patched the SMTP server again so it blocks all non-existent email addresses on local domains.

roland wrote:
> http://www.unclebobsuncle.com/antispam.html
>
> roland :-)

--
ManiaC++
Jan Knepper
|
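For anyone wondering how a receiving server checks a list such as bl.spamcop.net, the usual DNS blocklist convention is to reverse the octets of the connecting IPv4 address, append the list's zone, and look the resulting name up in DNS. A rough Python sketch follows; the function names are made up for illustration, and this is not how rblsmtpd or the patched server described above is implemented. `is_listed` needs network access, and treating any successful resolution as "listed" is a simplification.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="bl.spamcop.net"):
    """Build the DNS name to look up for an IPv4 address in a blocklist zone."""
    # IPv4Address raises a ValueError subclass for malformed input.
    addr = ipaddress.IPv4Address(ip)
    octets = str(addr).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone="bl.spamcop.net"):
    """Return True if the blocklist zone has an address record for this IP.

    Performs a real DNS lookup, so it only works with network access.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # name does not resolve: not listed (or lookup failed)
    return True


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address, used here as a placeholder.
    print(dnsbl_query_name("192.0.2.10"))
```

A server doing this at SMTP connect time can refuse the message before accepting any data, which is the "before they actually receive the email" part.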
June 03, 2003 Re: interesting spam trap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jan Knepper | And are you running sendmail, qmail, or postfix?
"Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDBFCDB.9D953019@smartsoft.us...
> Just my 2 cents... Oh, in case there is any doubt... ;-) I have
> written a couple of crawlers and actually also crawlers that do
> handle JavaScripts very well. I have been hosting Internet
> services for 3 years. I do report almost all spam at
> http://www.spamcop.net/ and yes, the mail servers here do check
> bl.spamcop.net (and a few others) before they actually receive
> the email, well that is if the domain owners want it. Check
> http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
> for some statistics on SPAM blocking...
> Recently I patched the SMTP server again so it does block all
> non-existent email addresses on local domains.
|
June 03, 2003 Re: interesting spam trap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jan Knepper | "Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDBFCDB.9D953019@smartsoft.us...
> Just my 2 cents... Oh, in case there is any doubt... ;-) I have written a couple of crawlers and actually also crawlers that do handle JavaScripts very well. I have been hosting Internet services for 3 years. I do report almost all spam at http://www.spamcop.net/ and yes, the mail servers here do check bl.spamcop.net (and a few others) before they actually receive the email, well that is if the domain owners want it. Check http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml for some statistics on SPAM blocking...
I use a JavaScript-generated mailto: on the digitalmars web pages. Are the JavaScript-aware scrapers able to figure those out?
|
June 03, 2003 Re: interesting spam trap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jan Knepper | hello
thanks for the interesting information
cheers
roland
|
June 03, 2003 Re: interesting spam trap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | Walter wrote:
> "Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDBFCDB.9D953019@smartsoft.us...
> > Just my 2 cents... Oh, in case there is any doubt... ;-) I have written a couple of crawlers and actually also crawlers that do handle JavaScripts very well. I have been hosting Internet services for 3 years. I do report almost all spam at http://www.spamcop.net/ and yes, the mail servers here do check bl.spamcop.net (and a few others) before they actually receive the email, well that is if the domain owners want it. Check http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml for some statistics on SPAM blocking...
>
> I use a javascript generated mailto: on the digitalmars web pages. Are the javascript aware scrapers able to figure those out?
Yes! My crawler will pick those up without ANY problem.
Jan
|
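As a rough illustration of why a script-generated mailto: is weak protection, here is a Python sketch that recovers the address without running any JavaScript at all. The `SNIPPET` is a made-up example of the common string-concatenation trick, not the actual script on the digitalmars pages, and `decode_mailto` is a hypothetical helper, not the crawler mentioned above. It only handles single-quoted literals joined with `+`; anything that computes the address at run time would need a real JavaScript engine.

```python
import re

# Made-up example of an address split across string literals.
SNIPPET = """document.write('<a href="mai' + 'lto:' + 'user' + '@' + 'example.com">mail</a>');"""

# Matches the contents of single-quoted JavaScript string literals.
SINGLE_QUOTED = re.compile(r"'([^']*)'")

# Matches the address part of a mailto: link.
MAILTO = re.compile(r"mailto:([^\"'?\s>]+)")


def decode_mailto(script):
    """Join the string literals in a script and pull out mailto: addresses."""
    joined = "".join(SINGLE_QUOTED.findall(script))
    return MAILTO.findall(joined)


if __name__ == "__main__":
    print(decode_mailto(SNIPPET))
```

Joining the literals rebuilds the anchor tag as plain text, after which an ordinary mailto: scan finds the address.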
June 03, 2003 Re: interesting spam trap | ||||
---|---|---|---|---|
| ||||
Posted in reply to KarL | Definitely not sendmail...
Patched qmail...
KarL wrote:
> And are you running sendmail, qmail, or postfix?
>
> "Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDBFCDB.9D953019@smartsoft.us...
>
> > Just my 2 cents... Oh, in case there is any doubt... ;-) I have
> > written a couple of crawlers and actually also crawlers that do
> > handle JavaScripts very well. I have been hosting Internet
> > services for 3 years. I do report almost all spam at
> > http://www.spamcop.net/ and yes, the mail servers here do check
> > bl.spamcop.net (and a few others) before they actually receive
> > the email, well that is if the domain owners want it. Check
> > http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
> > for some statistics on SPAM blocking...
> > Recently I patched the SMTP server again so it does block all
> > non-existent email addresses on local domains.
|
June 03, 2003 Re: interesting spam trap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jan Knepper | "Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDC8F17.9616A80A@smartsoft.us...
> Walter wrote:
> > I use a javascript generated mailto: on the digitalmars web pages. Are the
> > javascript aware scrapers able to figure those out?
> Yes! My crawler will pick those up without ANY problem.
Does that mean I have to write a CGI program to do it? <g>
|
June 03, 2003 Re: interesting spam trap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | Walter wrote:
> "Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDC8F17.9616A80A@smartsoft.us...
> > Walter wrote:
> > > I use a javascript generated mailto: on the digitalmars web pages. Are the
> > > javascript aware scrapers able to figure those out?
> > Yes! My crawler will pick those up without ANY problem.
>
> Does that mean I have to write a cgi program to do it? <g>
No, I can provide you with that, if you want...
|
Copyright © 1999-2021 by the D Language Foundation