Remove bad structured data

Paolo G
Hi,

I'm trying to remove the default Joomla structured data, as suggested here: www.tassos.gr/joomla-extensions/google-s...ult-joomla-microdata, but without success. I put this string in the Search field, but it doesn't match anything:
itemscope itemtype=(\"?)http(s?):\/\/schema.org\/(Article|BlogPosting|Blog|BreadcrumbList|AggregateRating)(\"?)
The same happens with:
itemscope itemtype="https://schema.org/Article"

How can I fix this? Could the plugin order be the problem?

Many thanks
Peter van Westen (ADMIN)
Make sure you set the Search Area to Everywhere or HTML Head.

Replacing URLs can be a bit tricky, as the Joomla SEF plugin can change stuff. So it is not always clear what the URLs actually look like at the time ReReplacer gets to see the HTML.

Regarding plugin ordering, that could be part of it.
See: regularlabs.com/blog/242-plugin-order-is-important
Please post a rating at the Joomla! Extensions Directory
Paolo G
Dear Peter,

Many thanks for your support. After some tests I can see that ReReplacer does have an effect on the page, so perhaps the structured data is not removed because my rule does not match the generated markup.
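
One way to check this outside Joomla is to run the search string against sample markup. The sketch below uses Python's re module, which handles this particular pattern the same way. Both HTML samples are made up for illustration (the second is modelled on the RDFa-style attributes in the rule that follows); they are not copied from the real page.

```python
import re

# The search string from the first post, unchanged.
MICRODATA_RULE = re.compile(
    r'itemscope itemtype=(\"?)http(s?):\/\/schema.org\/'
    r'(Article|BlogPosting|Blog|BreadcrumbList|AggregateRating)(\"?)'
)

# Hypothetical samples: one microdata tag, one RDFa-style tag.
microdata_tag = '<div itemscope itemtype="https://schema.org/Article">'
rdfa_tag = ('<article id="article-1" class="uk-article" '
            'typeof="Article" vocab="https://schema.org/">')

microdata_hit = MICRODATA_RULE.search(microdata_tag) is not None
rdfa_hit = MICRODATA_RULE.search(rdfa_tag) is not None

print("microdata matched:", microdata_hit)
print("RDFa matched:", rdfa_hit)
```

If the page uses typeof/vocab attributes instead of itemscope/itemtype, a rule written for microdata has nothing to match, whatever the plugin order is.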

Looking at the bad structured data, I tried to write a new rule like this:
<article id=(.*) class="uk-article" data-permalink=(.*) typeof="Article" vocab="https://schema.org/">

    <meta property="name" content=(.*)>
    <meta property="author" typeof="Person" content=(.*)>
    <meta property="dateModified" content=(.*)>
    <meta property="datePublished" content=(.*)>
    <meta class="uk-margin-remove-adjacent" property="articleSection" content=(.*)>

Yesterday the rule worked and the articles were displayed correctly, without any errors in the structured data. Checking again this morning, I found that the same pages are blank; the effect was probably delayed by the cache. After removing the rule, the pages display again, but obviously with the wrong structured data.

I'm aware that this goes beyond ReReplacer support, but I'd like to ask what is wrong with the rule I wrote. I hope you can help me.

Thanks and regards
Peter van Westen (ADMIN)
This is down to your Regular Expressions. It is doing what you are telling it to do.

By default, quantifiers in Regular Expressions are greedy. So '<.*>' will find the first '<' and grab everything up to the last '>' it can reach.
To make '.*' non-greedy, change it to '.*?'.
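
A small Python sketch of the difference (Python's re module treats greedy and non-greedy quantifiers the same way here; the sample HTML is made up):

```python
import re

# Hypothetical one-line snippet with two <meta> tags and a paragraph between them.
html = ('<meta property="name" content="A">'
        '<p>Text</p>'
        '<meta property="author" content="B">')

# Greedy: '.*' runs to the end of the line, then backs up to the LAST '>'.
greedy = re.search(r'<meta property="name" content=.*>', html).group(0)

# Non-greedy: '.*?' stops at the FIRST '>' that lets the pattern match.
lazy = re.search(r'<meta property="name" content=.*?>', html).group(0)

print(greedy)
print(lazy)
```

Here the greedy version swallows the paragraph and the second tag as well, while the non-greedy version stops at the end of the first tag.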

Also, you are grouping the dynamic stuff with parentheses: '(.*)'
Groups can be referenced in the replacement. But they do take up memory (and time).
And you don't need them, so remove the parentheses.
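
To illustrate with a Python sketch (the sample tag is made up, and the \1 reference is Python's replacement syntax, used here only to show what a group is for):

```python
import re

# Hypothetical sample tag.
tag = '<meta property="name" content="My article">'

with_group = re.compile(r'<meta property="name" content=(.*?)>')
without_group = re.compile(r'<meta property="name" content=.*?>')

# Both patterns match exactly the same text.
same_match = with_group.search(tag).group(0) == without_group.search(tag).group(0)

# Only the first pattern stores a group that the replacement can reuse.
reused = with_group.sub(r'<!-- removed: \1 -->', tag)

# When the match is simply deleted, the stored group is never used.
deleted = without_group.sub('', tag)

print(same_match, with_group.groups, without_group.groups)
print(reused)
```

So the parentheses only pay off when the replacement actually refers back to the captured text.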

If you simply want to remove all <meta> tags after the <article> tags, you could do:
<article [^>]*>(\s*<meta [^>]*>)+
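
A quick way to try that rule outside Joomla is the Python sketch below. The page fragment is made up, modelled on the tags in the rule above. The thread does not say what goes in the Replace field, so the plain '<article>' replacement is an assumption; an empty replacement would also delete the opening <article> tag and leave its closing tag orphaned.

```python
import re

# The suggested search rule, unchanged.
RULE = re.compile(r'<article [^>]*>(\s*<meta [^>]*>)+')

# Hypothetical page fragment.
page = '''<main>
<article id="article-1" class="uk-article" data-permalink="https://example.org/a" typeof="Article" vocab="https://schema.org/">

    <meta property="name" content="My article">
    <meta property="author" typeof="Person" content="Jane Doe">
    <meta property="dateModified" content="2024-01-02">

    <h1>My article</h1>
    <p>Body text.</p>
</article>
</main>'''

# Assumed replacement: a bare opening tag, so </article> stays balanced.
cleaned = RULE.sub('<article>', page)

print(cleaned)
```

Because '[^>]*' cannot cross a '>', each part of the rule stays inside a single tag, and the match ends at the last consecutive <meta> tag instead of running on through the rest of the page.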
Paolo G
Thanks a lot Peter,

Your rule does its job: ReReplacer now completely removes the structured data that Joomla generated incorrectly.

It will surely be useful to other users too, thanks.

Best regards