interesting spam trap
June 02, 2003
http://www.unclebobsuncle.com/antispam.html

roland :-)

June 02, 2003
Wow, thanks for bringing that to our attention. I can't wait to put that on my site. Quite funny, too.

"roland" <--nancyetroland@free.fr> wrote in message news:bbgc75$2339$1@digitaldaemon.com...
> http://www.unclebobsuncle.com/antispam.html
>
> roland :-)
>


June 03, 2003
Interesting indeed, but it does not work. Besides, most of the statements on the page have no basis.

First of all, any decent spider or crawler keeps track of the URLs it has processed. Think about it: nearly every decent website probably has circular references of the form x.html -> y.html -> z.html -> x.html. I know for a fact that quite a few of my sites have many of these. Obviously this is something anyone developing a spider or crawler, which I have done ;-), will run into. So the idea is cute, but I don't think it really works.
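
A minimal sketch of the visited-URL bookkeeping described above. It is not any particular crawler's code: the in-memory SITE table and the crawl() helper are hypothetical stand-ins for real page fetching, chosen so the loop x.html -> y.html -> z.html -> x.html can be shown without network access.

```python
from collections import deque

# Hypothetical in-memory "site": page -> pages it links to.
# x.html -> y.html -> z.html -> x.html is the circular reference.
SITE = {
    "x.html": ["y.html"],
    "y.html": ["z.html"],
    "z.html": ["x.html"],
}


def crawl(start, links_of):
    """Breadth-first crawl that processes each URL at most once."""
    seen = {start}            # every URL already queued or processed
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)     # "process" the page
        for link in links_of(url):
            if link not in seen:   # a loop back to a known URL is ignored
                seen.add(link)
                queue.append(link)
    return order


print(crawl("x.html", lambda url: SITE.get(url, [])))
# prints ['x.html', 'y.html', 'z.html']
```

Because the link back to x.html is already in the seen set, the crawl stops after three pages instead of looping forever.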

Second, quite a bit of the page is generated through JavaScript. Many spiders or crawlers do NOT run JavaScript. I know for a fact that JavaScript is a serious challenge for many of the search engines on the internet.

Third, some more advanced spiders or crawlers do not just look at mailto: tags, but recognize an '@' and check the prefix and suffix. They run the complete string through an email syntax checker to make sure the address contains only legal email address characters and actually ends with an existing Top Level Domain (TLD) such as .com, .net, etc., and later check the domain through DNS and/or Whois.
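
As a rough illustration of that kind of check, here is a sketch that finds '@' candidates anywhere in a text, applies a simplified syntax pattern, and keeps only addresses ending in a known TLD. The pattern, the short KNOWN_TLDS set, and the harvest() name are assumptions made for the example; the later DNS/Whois step is left out.

```python
import re

# Illustrative subset only; a real tool would load the full TLD list.
KNOWN_TLDS = {"com", "net", "org"}

# Deliberately simplified address pattern, not a full RFC-grade parser:
# a local part of common characters, '@', then dot-separated labels.
CANDIDATE = re.compile(r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}")


def harvest(text):
    """Return address-like strings in text whose last label is a known TLD."""
    found = []
    for match in CANDIDATE.finditer(text):
        address = match.group(0)
        tld = address.rsplit(".", 1)[1].lower()
        if tld in KNOWN_TLDS:      # reject made-up suffixes
            found.append(address)
    return found


sample = "write to bob@example.com or trap@bogus.invalidtld, not a@b"
print(harvest(sample))
# prints ['bob@example.com']
```

Note that this works on plain page text, so it does not depend on the address appearing inside a mailto: link.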

Fourth, the invalid email addresses have no effect on spammers. They will burn some more bandwidth, but as spammers usually use non-existent From: and Return-Path: addresses in their messages, anyone but the spammer will receive the bounces.

Fifth, if the spammer actually had some form of decency, bulk mailed to a list, and honored a removal mechanism, that mechanism is usually intelligent enough to keep track of bounces, probe them, and then remove them from the list automagically. See for instance http://www.ezmlm.org/, which works with MySQL (http://www.mysql.com/), through which it is rather easy to maintain a database with millions of email addresses.

To actually *fight* SPAM, what would make sense is to report SPAM ASAP at http://www.spamcop.net/, as that results in more than just reporting. One of the great features is that once a lot of people start reporting a certain SPAM, spamcop will add the originating IP address to bl.spamcop.net, which can be used by email receiving servers (SMTP servers) to block incoming email if it comes from one of the many blocked IP addresses. Unfortunately, most people just seem to delete SPAM and most email providers do not seem to use bl.spamcop.net for email blocking.
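
For readers unfamiliar with how a receiving server consults a list such as bl.spamcop.net, here is a sketch of the usual DNS blocklist convention: reverse the octets of the sender's IPv4 address, append the list's zone, and look the name up. The function names are hypothetical, only IPv4 is handled, and this is not the code of any particular SMTP server.

```python
import socket


def dnsbl_name(ip, zone="bl.spamcop.net"):
    """Build the DNS name to query: reversed IPv4 octets plus the zone."""
    octets = ip.split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone="bl.spamcop.net"):
    """Return True if the blocklist answers for this IP.

    Needs working DNS. A failed lookup is treated as "not listed",
    which also hides resolver or network errors; a real server would
    tell those cases apart.
    """
    try:
        socket.gethostbyname(dnsbl_name(ip, zone))
        return True
    except socket.gaierror:
        return False


print(dnsbl_name("192.0.2.1"))
# prints 1.2.0.192.bl.spamcop.net
```

A server doing this during the SMTP conversation can refuse the message before it is received at all.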

Of course, not publishing your email address ANYWHERE on the internet would help the most! ;-) However, I have noticed that quite a few companies that collect email addresses through online sales or other forms of subscription also sell those email addresses to others...

Just my 2 cents... Oh, in case there is any doubt... ;-) I have
written a couple of crawlers and actually also crawlers that do
handle JavaScripts very well. I have been hosting Internet
services for 3 years. I do report almost all spam at
http://www.spamcop.net/ and yes, the mail servers here do check
bl.spamcop.net (and a few others) before they actually receive
the email, well that is if the domain owners want it. Check
http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
for some statistics on SPAM blocking...
Recently I patched the SMTP server again so that it blocks all
non-existent email addresses on local domains.



roland wrote:

> http://www.unclebobsuncle.com/antispam.html
>
> roland :-)

--
ManiaC++
Jan Knepper


June 03, 2003
And do you run sendmail or qmail or postfix?

"Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDBFCDB.9D953019@smartsoft.us...

> Just my 2 cents... Oh, in case there is any doubt... ;-) I have
> written a couple of crawlers and actually also crawlers that do
> handle JavaScripts very well. I have been hosting Internet
> services for 3 years. I do report almost all spam at
> http://www.spamcop.net/ and yes, the mail servers here do check
> bl.spamcop.net (and a few others) before they actually receive
> the email, well that is if the domain owners want it. Check
> http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
> for some statistics on SPAM blocking...
> Recently I patched the SMTP server again so that it blocks all
> non-existent email addresses on local domains.



June 03, 2003
"Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDBFCDB.9D953019@smartsoft.us...
> Just my 2 cents... Oh, in case there is any doubt... ;-) I have written a couple of crawlers and actually also crawlers that do handle JavaScripts very well. I have been hosting Internet services for 3 years. I do report almost all spam at http://www.spamcop.net/ and yes, the mail servers here do check bl.spamcop.net (and a few others) before they actually receive the email, well that is if the domain owners want it. Check http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml for some statistics on SPAM blocking...

I use a javascript generated mailto: on the digitalmars web pages. Are the javascript aware scrapers able to figure those out?


June 03, 2003
hello

thanks for the interesting information

cheers

roland


June 03, 2003
Walter wrote:

> "Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDBFCDB.9D953019@smartsoft.us...
> > Just my 2 cents... Oh, in case there is any doubt... ;-) I have written a couple of crawlers and actually also crawlers that do handle JavaScripts very well. I have been hosting Internet services for 3 years. I do report almost all spam at http://www.spamcop.net/ and yes, the mail servers here do check bl.spamcop.net (and a few others) before they actually receive the email, well that is if the domain owners want it. Check http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml for some statistics on SPAM blocking...
>
> I use a javascript generated mailto: on the digitalmars web pages. Are the javascript aware scrapers able to figure those out?

Yes! My crawler will pick those up without ANY problem.

Jan


June 03, 2003
Definitely not sendmail...
Patched qmail...



KarL wrote:

> And are you run sendmail or qmail or postfix?
>
> "Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDBFCDB.9D953019@smartsoft.us...
>
> > Just my 2 cents... Oh, in case there is any doubt... ;-) I have
> > written a couple of crawlers and actually also crawlers that do
> > handle JavaScripts very well. I have been hosting Internet
> > services for 3 years. I do report almost all spam at
> > http://www.spamcop.net/ and yes, the mail servers here do check
> > bl.spamcop.net (and a few others) before they actually receive
> > the email, well that is if the domain owners want it. Check
> > http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
> > for some statistics on SPAM blocking...
> > Recently I patched the SMTP server again so that it blocks all
> > non-existent email addresses on local domains.

June 03, 2003
"Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDC8F17.9616A80A@smartsoft.us...
> Walter wrote:
> > I use a javascript generated mailto: on the digitalmars web pages. Are the
> > javascript aware scrapers able to figure those out?
> Yes! My crawler will pick those up without ANY problem.

Does that mean I have to write a cgi program to do it? <g>


June 03, 2003
Walter wrote:

> "Jan Knepper" <jan@smartsoft.us> wrote in message news:3EDC8F17.9616A80A@smartsoft.us...
> > Walter wrote:
> > > I use a javascript generated mailto: on the digitalmars web pages. Are the
> > > javascript aware scrapers able to figure those out?
> > Yes! My crawler will pick those up without ANY problem.
>
> Does that mean I have to write a cgi program to do it? <g>

No, I can provide you with that, if you want...

