Friday 25 February 2011

Oracle SQL STRIP HTML Tags

You can strip all HTML Tags from a text string using:

SELECT REGEXP_REPLACE('string or column containing HTML goes here','<[^>]+>','') FROM DUAL

-- CREATE FUNCTION

CREATE OR REPLACE
FUNCTION "STRIPHTML" (strArg IN CLOB)
RETURN CLOB IS
BEGIN
RETURN regexp_replace(strArg, '<[^>]+>', NULL);
END STRIPHTML;

2 comments:

Anonymous said...

Hi there ,

This blog was pretty useful however when I use this tip in my query 'SELECT
REGEXP_REPLACE(x.DESCRIPTION, '<[^>]+>', '') "Dup_DESCRIPTION"
FROM XX_Table x '

The output still is left with '&nbsp , &'etc . Can you please tell me the pattern to remove it

Anonymous said...

Well after a couple of tries i got the following :

SELECT REGEXP_REPLACE(PPRO.DESCRIPTION, '<[^>]+>|\&(nbsp;)|(amp;)', ' ') "Dup_DESCRIPTION"
FROM XX_Tablex