Censor list words ?

All,
I am starting to compile a list of "censor" words for use on our site, for
example,

$search = array("fuck",
"ass",
"dick",
"fucker",
"fucking",
"cunt",
"pussy",
"cock",
"c0ck",
"cum",
"twat",
"clit",
"bitch",
"fuk",
"fuking",
"motherfucker");

Does anyone have, or know where to find, an existing list that I can use?
Thanks.
Jul 17 '05 #1
My advice: Don't even try. It's too easy to get around. I can easily type F
U C K to get around your filter.

Jul 17 '05 #2
This is no easy task; it needs fuzzy logic. Ah, pussywillow! Oh, did you know
there was Dick Sargent, or was it Dick York, on Bewitched? I also got a new
cummerbund :)

--
Mike Bradley
http://www.gzentools.com -- free online php tools
Jul 17 '05 #3
Chung Leong wrote:
My advice: Don't even try. It's too easy to get around. I can easily type F
U C K to get around your filter.

<snip>!!!!

Not hard to write a filter to find that
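
For illustration only, a minimal sketch of such a filter might look like the following. The helper name spaced_word_pattern() is made up, "darn" stands in for a real list entry, and the separator set (whitespace, dot, underscore, dash) is an assumption, as is an ASCII-only word list.

```php
<?php
// Hypothetical helper (not from the thread): builds a regex that matches a
// word even when separators sit between its letters.
function spaced_word_pattern(string $word): string
{
    // Escape each letter so regex metacharacters in a list entry stay literal.
    $letters = array_map(
        fn(string $ch): string => preg_quote($ch, '/'),
        str_split($word)
    );
    // Allow any run of whitespace, dots, underscores or dashes between letters.
    return '/\b' . implode('[\s._-]*', $letters) . '\b/i';
}

// "darn" is a placeholder for an entry from the censor list.
$pattern = spaced_word_pattern('darn');
$clean = preg_replace($pattern, '****', 'Well D A R N, that d.a.rn thing');
// $clean is now "Well ****, that **** thing"
```

Loosening the pattern like this also widens the net for false positives, so it only covers one of the many ways around a list.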

~Cameron
Jul 17 '05 #4
Cameron wrote:
Chung Leong wrote:
My advice: Don't even try. It's too easy to get around. I can easily type F
U C K to get around your filter.

<snip>!!!!

Not hard to write a filter to find that


F<wrongtag/>U<browser_ignores_this/>C<another_one with_parameters="true"/>K

:))
--
--= my mail box only accepts =--
--= Content-Type: text/plain =--
--= Size below 10001 bytes =--
Jul 17 '05 #5
Pedro Graca wrote:
Cameron wrote:
Chung Leong wrote:
My advice: Don't even try. It's too easy to get around. I can easily type F
U C K to get around your filter.


<snip>!!!!

Not hard to write a filter to find that

F<wrongtag/>U<browser_ignores_this/>C<another_one with_parameters="true"/>K

:))


What has the browser got to do with it? The browser doesn't filter the
stuff, the server-side script does that, and any forum or whatever that
allows all tags to pass through and into a message isn't a very good
one, IMHO.

~Cameron
Jul 17 '05 #6
Cameron wrote:
What has the browser got to do with it? The browser doesn't filter the
stuff, the server-side script does that, and any forum or whatever that
allows all tags to pass through and into a message isn't a very good
one, IMHO.


I agree. It's MHO too.

It is just one more thing for the OP to pay attention to.
He has to filter bad words *and* tags (allowed (if any) and disallowed).

As another poster said, I think the best approach is to take all the text,
remove the HTML tags, and replace a few [...] tags with <...> for output
formatting:

<?php
$submitted = 'BAD <wrongtag>WORDS</wrongtag> ARE [b]NOT[/b] WELCOME';

// remove HTML tags ([^>]* cannot run past the closing >, so no modifier is needed)
$submitted = preg_replace('@<[^>]*>@', '', $submitted);

// substitute valid [...] by <...> (only b, i and u are allowed, case-insensitive)
$submitted = preg_replace('@\[(/?(?:b|i|u))\]@i', '<$1>', $submitted);

// $submitted is now "BAD WORDS ARE <b>NOT</b> WELCOME"
?>
--
--= my mail box only accepts =--
--= Content-Type: text/plain =--
--= Size below 10001 bytes =--
Jul 17 '05 #7
You would have to filter for these too

F U CK
F U C K
F UCK
F.UCK
FU.CK
FUC.K
F.U.CK
....

The hard part is writing something that filters out these various
combinations while not generating false positives. People will always find
ways around it anyway (use capital i as l, type in all caps and use l as i,
"FUC|<"). And they can always, kurwa macz, cuss in a different language.
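
To make the lookalike problem concrete, here is a rough sketch of the kind of normalisation pass a filter would need before matching. The table and the function name normalize_lookalikes() are invented for this example, and any real table would be longer and still incomplete.

```php
<?php
// Hypothetical lookalike table (an assumption, not a complete mapping).
const LOOKALIKES = [
    '|<' => 'k',
    '0'  => 'o',
    '1'  => 'l',
    '3'  => 'e',
    '@'  => 'a',
    '$'  => 's',
];

function normalize_lookalikes(string $text): string
{
    // strtr() with an array tries the longest keys first and never rescans
    // its own output, so "|<" is replaced as a unit.
    return strtr(strtolower($text), LOOKALIKES);
}

// normalize_lookalikes('C0O|<IE') gives "cookie"
```

The result should only be used as a matching copy, since the mapping mangles legitimate digits, and it cannot tell a capital i from a lowercase L once everything is lowercased.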

User "Cameron" <fo*@bar.invalid> wrote in message
news:c0**********@newsg3.svr.pol.co.uk...
Chung Leong wrote:
My advice: Don't even try. It's too easy to get around. I can easily type F U C K to get around your filter.

<snip>!!!!

Not hard to write a filter to find that

~Cameron

Jul 17 '05 #8
On Sun, 15 Feb 2004 10:31:53 -0500, Chung Leong wrote:
You would have to filter for these too

F U CK
F U C K
F UCK
F.UCK
FU.CK
FUC.K
F.U.CK
...

snip

Let's not forget ars, fuk, sht and cck.

And don't be too trigger-happy with any swear-filters, or the entire
population of Scunthorpe might start wondering why you're blocking them.
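
One common way to soften that problem is to match whole words only. A minimal sketch, assuming a plain array of list entries; contains_whole_word() is a made-up name and "ham" is a harmless placeholder for a listed word:

```php
<?php
// Hypothetical helper: true only when a listed word appears as a whole word,
// not as a fragment of a longer one.
function contains_whole_word(string $text, array $words): bool
{
    if ($words === []) {
        return false; // an empty alternation would match at any word boundary
    }
    $alternatives = array_map(
        fn(string $w): string => preg_quote($w, '/'),
        $words
    );
    $pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/i';
    return preg_match($pattern, $text) === 1;
}

// contains_whole_word('I love ham', ['ham'])                -> true
// contains_whole_word('Greetings from Birmingham', ['ham']) -> false
```

Word boundaries do nothing for words that are offensive in one sense and innocent in another, so this only removes the substring kind of false positive.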
Jul 17 '05 #9
"Chung Leong" <ch***********@hotmail.com> wrote in message
news:wZ********************@comcast.com...
You would have to filter for these too

F U CK
F U C K
F UCK
F.UCK
FU.CK
FUC.K
F.U.CK
...

The hard part is writing something that filters out these various
combinations while not generating false positives. People will always find
ways around it anyway (use capital i as l, type in all caps and use l as i, "FUC|<"). And they can always, kurwa macz, cuss in a different language.


Lest we forget *false* positives:
* cock a doodle do
* pussy willow
* cockroach (not sure about spelling, but looks cool)
* The Mexican rode the ass into town. (ass=donkey)
* I rode her ass all night (I borrowed her donkey :)

Hmm, may I expand on your "FUC|<": FUC|&lt;
--
Mike Bradley
http://www.gzentools.com -- free online php tools
Jul 17 '05 #10
Thanks for the many unhelpful posts. I couldn't care less about every which way
to spell fuck. I am simply looking for a list.
I have some routines that work pretty well using both metaphone and
levenshtein when compared against a list. The LIST is what I am looking for.
All my searches, filters, etc. use "sound" spelling, and it works pretty damn
well. It doesn't work with special characters or anything, though. A good
example: tattoo, tatoo, tat2, tatu, tatooz, tattwo, etc. all return the
same results. Same with fuck, phuck, fuhck, etc.
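
The routines themselves are not posted, so the following is only a guess at the general shape of a metaphone plus levenshtein comparison. The function name sounds_like_listed() and the default distance threshold of 1 are assumptions; metaphone() and levenshtein() are the PHP built-ins.

```php
<?php
// Hypothetical matcher: compares the metaphone key of a candidate word with
// the metaphone key of each list entry, tolerating a small edit distance.
function sounds_like_listed(string $word, array $list, int $maxDistance = 1): bool
{
    $key = metaphone($word);
    foreach ($list as $entry) {
        // levenshtein() counts the single-character edits between two strings.
        if (levenshtein($key, metaphone($entry)) <= $maxDistance) {
            return true;
        }
    }
    return false;
}

// sounds_like_listed('tatoo', ['tattoo'])  -> true
// sounds_like_listed('banana', ['tattoo']) -> false
```

Metaphone keys are short, so even a threshold of 1 is fairly loose; setting $maxDistance to 0 demands identical keys.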

So, again, anyone with a LIST of words ?
Jul 17 '05 #11
There are only like 7 words, but then you have to worry about it being not a
bad word in another context or language.

Besides, we were not trying to be *unhelpful*, just trying to let you know
there is more to it. Perhaps we all should just be narrow-minded and answer
just your question.

Here's my list:
fuck off
fuck you

oh wait, you wanted just the root words:
fuck
dick, nope that's not one, it's someone's name, Richard
cunt
asshole
mutherfucker
Btw, Phuket is not a bad word; it's the name of a city on one of the islands,
or an island name, I do not remember.

--
Mike Bradley
http://www.gzentools.com -- free online php tools
Jul 17 '05 #12
Mangina wrote:
So, again, anyone with a LIST of words ?


You might want to try your luck at
news:alt.usage.english or news:alt.english.usage

Anyway, after a bit of googling I found a site that might interest you:
http://www.georgecarlin.com/georgeca...irty/2443.html
--
--= my mail box only accepts =--
--= Content-Type: text/plain =--
--= Size below 10001 bytes =--
Jul 17 '05 #13
Too funny, I saw his bit the other night on TV; that's where I came up with "only
about 7" :)

--
Mike Bradley
http://www.gzentools.com -- free online php tools
Jul 17 '05 #14

On 15-Feb-2004, "Mangina" <su*****@microsoft.com> wrote:
Thanks for the many unhelpful posts. I can care less about every which way
to spell fuck. I am simply looking for a list.
I have some routines that work pretty well using both metaphone and
levenshtein when compared against a list. The LIST is what I am looking for.
All my searches, filters, etc. using "sound" spelling, and works pretty damn
good. Now it doesn't work with special characters or anything. A good
example is, tattoo, tatoo, tat2, tatu, tatooz, tattwo, etc. all return the
same results. Same with fuck phuck fuhck etc.

So, again, anyone with a LIST of words ?


Profanity is very much contextual. I think most of the other respondents
were simply trying to warn you that censorship is not so simple; even
phonetic matches are problematic (pussy cat). Add foreign languages to the
mix and the odds of false matches go way up. Phuc is a common name
in Vietnamese and Phuket is a city in Thailand.
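
One partial mitigation is an allowlist of known-legitimate words that is consulted before any phonetic comparison. A small sketch under that assumption; is_flagged() and both arrays are hypothetical, and the phonetic step here is simply "same metaphone key":

```php
<?php
// Hypothetical check: allowlisted words are never flagged, everything else is
// compared phonetically against the blocked list.
function is_flagged(string $word, array $blocked, array $allowed): bool
{
    $lower = strtolower($word);
    // Exact, case-insensitive allowlist lookup comes first.
    if (in_array($lower, array_map('strtolower', $allowed), true)) {
        return false;
    }
    $key = metaphone($lower);
    foreach ($blocked as $entry) {
        if ($key === metaphone($entry)) {
            return true;
        }
    }
    return false;
}

// With "tatoo" allowlisted, it is not flagged even though it sounds like
// the blocked placeholder "tattoo":
// is_flagged('tatoo', ['tattoo'], ['tatoo']) -> false
```

An allowlist only covers the names someone thought to add, so it reduces false matches rather than removing them.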

This link has some common profane words:
http://en.wikipedia.org/wiki/Profanity

--
Tom Thackrey
www.creative-light.com
tom (at) creative (dash) light (dot) com
do NOT send email to ja*********@willglen.net (it's reserved for spammers)
Jul 17 '05 #15
Wxll, lxt sxx hxw wxll yxxr "sxxnd" spxllxng hxndlxs thxs! Hx, hx, hx. Yxx
cxn gx fxck yxxrsxlf! Gx gxvx cxnnxlxngxs tx x dxnkxy's xss! Xh, xnd BTW, X
dxn't hxvx x fxckxng lxst xf wxrds.

Jul 17 '05 #16
You may want to look at the source of phpBB (<http://www.phpbb.com/>)

--
"Success = 10% sweat + 90% tears"
If you live in USA, please support John Edwards.
Email: rrjanbiah-at-Y!com
Jul 17 '05 #17
actually, it filtered just fine. however, it replaced all the x's with
chinky. not sure why. hmmm.

"Chung Leong" <ch***********@hotmail.com> wrote in message
news:mr********************@comcast.com...
Wxll, lxt sxx hxw wxll yxxr "sxxnd" spxllxng hxndlxs thxs! Hx, hx, hx. Yxx
cxn gx fxck yxxrsxlf! Gx gxvx cxnnxlxngxs tx x dxnkxy's xss! Xh, xnd BTW, X dxn't hxvx x fxckxng lxst xf wxrds.

Jul 17 '05 #18
On Sun, 15 Feb 2004 13:38:56 -0900, Mangina wrote:
Thanks for the many unhelpful posts. I can care less about every which way
to spell fuck. I am simply looking for a list. snip
So, again, anyone with a LIST of words ?

snip


The most verbose listing would be Roger's Profanisaurus, which can be found at:

http://www.viz.co.uk/profanisaurus/profanis.htm

Complete a-z listing, along with explanations of what it all means.

If that's too much, these lyrics pretty much cover it:

http://www.lyricscafe.com/b/blink182/familyreunion.htm

Seriously, take a look at the Advertising Standards Authority (UK)'s survey
on the offensiveness of words.

As a PDF, it can be found here:
http://www.asa.org.uk/research/documents/upl_10.pdf

Appendix 2 includes a list of all the words they surveyed on and how
offensive they were rated. Other appendices show which words have been
complained about on television programmes and adverts.

You'll have to do a bit of copy-and-paste, but it should give you the list
you so desperately need.

Oh, and I am not responsible for the content of other websites, so please
don't sue me if seeing swear words on screen makes you upset.

AdamT

Jul 17 '05 #19
Thank you. Actually, I love swear words; however, a site that I am working
on does not. It's one of those "see what others have searched for..." lists.
This way, I can weed out most phrases with a simple list, then run some of
my "soundspelling" functions to further check, etc.

Thanks for the very informative post =)
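
Roughly, the two-stage idea for the search-phrase list might be sketched as below. The function name filter_search_phrases() is invented, "darn" is a placeholder list entry, and the second stage uses a plain metaphone key comparison as a stand-in for the actual "soundspelling" functions, which are not shown in this thread.

```php
<?php
// Hypothetical two-stage filter for a "what others searched for" list:
// stage 1 is an exact list lookup, stage 2 a phonetic (metaphone) lookup.
function filter_search_phrases(array $phrases, array $blocked): array
{
    $blockedExact = array_flip(array_map('strtolower', $blocked));
    $blockedKeys  = array_flip(array_map('metaphone', $blocked));

    $kept = [];
    foreach ($phrases as $phrase) {
        // Split on runs of non-word characters and drop empty pieces.
        $words = preg_split('/\W+/', strtolower($phrase), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word) {
            if (isset($blockedExact[$word]) || isset($blockedKeys[metaphone($word)])) {
                continue 2; // drop the whole phrase and move on to the next one
            }
        }
        $kept[] = $phrase;
    }
    return $kept;
}

$shown = filter_search_phrases(
    ['cheap flights', 'darn weather', 'php tutorial'],
    ['darn']
);
// $shown is ['cheap flights', 'php tutorial']
```

Dropping the whole phrase rather than masking a word suits a public list of searches, where a hidden entry costs nothing.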
Jul 17 '05 #20
