I have a text area that users typically paste content from Microsoft Word into. I am using TinyMCE for formatting. The problem is that the string that gets pasted always contains style definitions that are commented out. I need a way to strip this commented stuff out of the string.
Here is an example of the comments that get added:
<!-- /* Font Definitions */ @font-face {font-family:"Courier New"; panose-1:2 7 3 9 2 2 5 2 4 4; mso-font-charset:0; mso-generic-font-family:auto; mso-font-pitch:variable; mso-font-signature:3 0 0 0 1 0;} @font-face {font-family:Wingdings; panose-1:5 2 1 2 1 8 4 8 7 8; mso-font-charset:2; -->
This is just a very small chunk of it; it usually goes on for hundreds of lines.
Anyway, I'm using strip_tags to get rid of unwanted HTML tags, and I've tried the following preg_replace, but the style comments are always still there:
$e_description = preg_replace('/<!--(.|\s)*?-->/', '',$_POST['description']);
Any suggestions on how to get rid of this junk?
Thanks.
From stackoverflow
-
Why not just add the ms modifiers (m is multi-line, s is "dot-all", where . matches all characters, including newlines):
preg_replace('/<!--.*?-->/ms', '', $_POST['description']);
That MAY work for you (try it out)...
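As a minimal sketch of the suggestion above: the `s` modifier is the one doing the work here, since `m` only changes how `^` and `$` behave and this pattern uses neither. The helper name `strip_html_comments` and the sample input are made up for illustration and are not from the original post. The sketch also checks for a `null` return, because `preg_replace` returns `null` on failure, which is one way a pattern like `(.|\s)*?` may misbehave on very long input.

```php
<?php
// Hypothetical helper (not from the original post): removes HTML comments,
// including ones that span many lines.
function strip_html_comments(string $html): string
{
    // s (dot-all) lets . match newlines; .*? is lazy, so each match
    // stops at the first --> instead of swallowing later comments.
    $clean = preg_replace('/<!--.*?-->/s', '', $html);

    // preg_replace returns null on error; fall back to the input unchanged.
    return $clean === null ? $html : $clean;
}

// Made-up sample resembling a fragment pasted from Word.
$input = "<p>Hello</p><!-- /* Font Definitions */\n"
       . "@font-face {font-family:Wingdings;}\n--><p>World</p>";

// Strip the comments first, then the remaining tags.
$e_description = strip_tags(strip_html_comments($input));
```

If the comments still survive, it may be worth checking whether the pasted text arrives entity-encoded (`&lt;!--` rather than `<!--`), in which case this pattern would not match; that is only a guess about the setup described in the question.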
Mikulas Dite: I would rather suggest `'//ims'`, since a user may want to input a simple comment. Even this is quite hazardous.
Daelan: This doesn't do anything: //ms, and this replaces everything in the string, not just the commented area: '//ims'. Thanks for the suggestions though.