
[DEV] MS Word HTML Crap Cleaner Help! (Work in Progress)

Posted: Tue 5. Dec 2006, 15:49
by marcus@localhorst
Hi,
I have started setting up a filter for all those annoying MS Word crap tags that backend users keep pasting into the FCK Editor. (I know about FCK's features and its semi-automatic cleanup, and I know my editors, who don't give a fuck about using it ;-) They write their text in Word, copy & paste it into the editor, and the next step is calling me to ask why the layout is fucked up.

To stop those suckers, I will write a bunch of regexes that strip out all the empty tags, proprietary attributes, proprietary tags, etc.
So I need your contributions: the most hated tags you keep finding in your source code.
Maybe you can also tell me about your experience with this annoying stuff and how you handle it.
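
To make the idea concrete, here is a minimal sketch of such a regex filter. It is written in Python for illustration only (not the filter from this thread), and the function name, the pattern names, and the choice of what counts as "Word crap" are all assumptions. Regexes are not a full HTML parser, so treat this as a starting point.

```python
import re

# Conditional comments that Word emits, e.g. <!--[if gte mso 9]> ... <![endif]-->
CONDITIONAL_COMMENT = re.compile(r"<!--\[if.*?<!\[endif\]-->", re.DOTALL | re.IGNORECASE)

# Namespaced (proprietary) tags such as <o:p>, </o:p>, <w:WordDocument>, <st1:place>
NAMESPACED_TAG = re.compile(r"</?[a-zA-Z][\w-]*:[\w-]+[^>]*>")

# class="MsoNormal", class='MsoTitle', class=MsoNormal
MSO_CLASS = re.compile(
    r"""\s+class\s*=\s*(?:"Mso[^"]*"|'Mso[^']*'|Mso[^\s>]*)""", re.IGNORECASE
)

# Quoted style attributes that contain an mso-* property.
# Assumption: the whole attribute is dropped, including any non-mso styles in it.
MSO_STYLE = re.compile(
    r"""\s+style\s*=\s*(?:"[^"]*mso-[^"]*"|'[^']*mso-[^']*')""", re.IGNORECASE
)

# Elements that contain nothing but whitespace or &nbsp;
EMPTY_TAG = re.compile(
    r"<(span|font|b|i|u|em|strong|p)\b[^>]*>(?:\s|&nbsp;)*</\1>", re.IGNORECASE
)


def clean_word_html(html: str) -> str:
    """Strip the most common MS Word leftovers from pasted HTML (hypothetical helper)."""
    html = CONDITIONAL_COMMENT.sub("", html)
    html = NAMESPACED_TAG.sub("", html)
    html = MSO_CLASS.sub("", html)
    html = MSO_STYLE.sub("", html)
    # Removing an empty inner tag can leave its parent empty, so repeat until stable.
    while True:
        html, count = EMPTY_TAG.subn("", html)
        if count == 0:
            return html


if __name__ == "__main__":
    pasted = '<p class="MsoNormal"><span lang="EN-US">Hello<o:p></o:p></span></p>'
    print(clean_word_html(pasted))
```

The patterns only touch attributes and tags that carry a Word marker (Mso classes, mso- styles, namespaced tags), so ordinary markup passes through unchanged. New offenders can be added as further patterns.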

EDIT: On hosts that allow exec(), you can use HTMLTidy, which works fine for this.
http://phpwcms.de/forum/viewtopic.php?p=68348 (OG removed the stuff in newer phpwcms versions!)
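
For reference, a rough sketch of calling the HTML Tidy command-line tool from a script. It is shown in Python rather than PHP's exec(), and it assumes a `tidy` binary is on the PATH. The helper names are made up for illustration, and the option set is just one plausible choice for Word cleanup, not the configuration from the linked post.

```python
import shutil
import subprocess


def tidy_command(tidy_bin="tidy"):
    """Build the argument list for a Word-cleanup run of HTML Tidy (hypothetical helper)."""
    return [
        tidy_bin,
        "-quiet",
        "--word-2000", "yes",                    # strip Word 2000 specific markup
        "--bare", "yes",                         # strip Microsoft-specific HTML
        "--clean", "yes",                        # replace presentational markup
        "--drop-proprietary-attributes", "yes",  # remove non-standard attributes
        "--show-body-only", "yes",               # output only the body fragment
        "--show-warnings", "no",
    ]


def clean_with_tidy(html, tidy_bin="tidy"):
    """Pipe html through Tidy; return None if Tidy is missing or reports errors."""
    if shutil.which(tidy_bin) is None:
        return None
    result = subprocess.run(
        tidy_command(tidy_bin), input=html, capture_output=True, text=True
    )
    # Tidy exits with 0 on success, 1 on warnings and 2 on errors.
    return result.stdout if result.returncode in (0, 1) else None
```

Returning None when the binary is missing lets the caller fall back to a regex filter on hosts where external commands are not allowed.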

Thanks and best wishes
marcus

Posted: Mon 11. Dec 2006, 08:37
by phalancs
Hey!

This is exactly what I was looking for. I have no clue how to help, but please keep us informed about news on this!

:)

Posted: Mon 11. Dec 2006, 08:42
by marcus@localhorst
phalancs wrote: Hey!

This is exactly what I was looking for. I have no clue how to help, but please keep us informed about news on this!
:)
You could provide some of the MS Word crap tags like <o:p> or all that other suspect stuff.
Thanks and best
marcus

ok

Posted: Mon 11. Dec 2006, 08:43
by phalancs
I will have a look and post!