This page lists various MediaWiki 1.6.1 parser tests that fail HTML validation. All were found by fuzz testing MediaWiki with a modified PHP port of the Python port of mangleme.
Test | Wiki Source | Validate HTML | Tidy HTML | Security aspects? (see note 1) | Visible artefacts? | Notes and any extra info
MediaWiki/Parser1 | Export Wiki Source | W3C Validator | Tidy HTML | No | Yes | Strikes out almost all text. Explanation for this test plus Parser1-hidden, Parser2, Parser3, Parser4 and Parser5.
MediaWiki/Parser1-hidden | Export Wiki Source | W3C Validator | Tidy HTML | No | Yes | Hides almost all text, which also makes all page links unavailable.
MediaWiki/Parser2 | Export Wiki Source | W3C Validator | Tidy HTML | No | No |
MediaWiki/Parser3 | Export Wiki Source | W3C Validator | Tidy HTML | No | No |
MediaWiki/Parser4 | Export Wiki Source | W3C Validator | Tidy HTML | No | No |
MediaWiki/Parser5 | Export Wiki Source | W3C Validator | Tidy HTML | No | Yes | Shrinks font, moves the top page action links up about 5 pixels and left about 10 pixels.
MediaWiki/Parser6 | Export Wiki Source | W3C Validator | Tidy HTML | No | Yes | Shrinks font, moves the left navigation bar down about 160 pixels, strikes out almost all text.
MediaWiki/Parser7 | Export Wiki Source | W3C Validator | Tidy HTML | No | No | Completely fixed in 1.6.1 - valid HTML, no artefacts, no Tidy errors.
MediaWiki/Parser8 | Export Wiki Source | W3C Validator | Tidy HTML | No | No |
MediaWiki/Parser9 | Export Wiki Source | W3C Validator | Tidy HTML | No | No |
MediaWiki/Parser10 | Export Wiki Source | W3C Validator | Tidy HTML | No | No |
MediaWiki/Parser11 | Export Wiki Source | W3C Validator | Tidy HTML | No (previously Yes) | No | Explanation. Security aspects now fixed in 1.6, although it still fails W3C validation.
MediaWiki/Parser12 | Export Wiki Source | W3C Validator | Tidy HTML | No (previously Yes) | No | Explanation. Security aspects now fixed in 1.6, although it still fails W3C validation.
MediaWiki/Parser13 | Export Wiki Source | W3C Validator | Tidy HTML | Yes | No | Drops the '<a href="xxx' string. Explanation for this test plus Parser14 and Parser14-table.
MediaWiki/Parser14 | Export Wiki Source | W3C Validator | Tidy HTML | Yes | No |
MediaWiki/Parser14-table | Export Wiki Source | W3C Validator | Tidy HTML | Yes | No |
MediaWiki/Parser15 | Export Wiki Source | W3C Validator | Tidy HTML | No | No | Generates Tidy error due to <caption> tags out of order. As of 1.6.1 it just fails validation.
MediaWiki/Parser16 | Export Wiki Source | W3C Validator | Tidy HTML | Yes | No | Generates Tidy error due to <th> tags out of order. As of 1.6.1, it now drops the '<a href="xxx' string.
MediaWiki/Parser17 | Export Wiki Source | W3C Validator | Tidy HTML | No (previously Yes) | No | Completely fixed in 1.6.1 - valid HTML, no artefacts, no Tidy errors.
MediaWiki/Parser18 | Export Wiki Source | W3C Validator | Tidy HTML | No (previously Yes) | No | Completely fixed in 1.6.1 - valid HTML, no artefacts, no Tidy errors.
MediaWiki/Parser19 | Export Wiki Source | W3C Validator | Tidy HTML | No (previously Yes) | No | Completely fixed in 1.6.1 - valid HTML, no artefacts, no Tidy errors.
MediaWiki/Parser20 | Export Wiki Source | W3C Validator | Tidy HTML | No | No | Generates multi-line hrefs. Passes W3C validation, but Tidy gives warnings, and the links do not act like normal links (in Firefox, at least): clicking on them does nothing.
MediaWiki/Parser21 | Export Wiki Source | W3C Validator | Tidy HTML | Yes | No |
1: For the above table, a "security aspect" is anything that causes the start of a tag to be missing, the end of a tag to be missing, or attributes that should not be there to be injected. For example:
- <p><td><s></p> would not be considered to have a security aspect: it is invalid HTML, but every tag is complete (none is malformed).
- <a href="http://as<td></td><td class="external free"><p>user text here would be considered to have a security aspect because the "href" string is not properly terminated, so the "external free" part is injected as attributes.
- A string missing the start of a tag would also be considered to have a security aspect, e.g. <th>|||||" class="external free" title="https://||||||" rel="nofollow">https://</th>, because the <a href="xxx part has been cut off. This is probably not exploitable, but it is certainly a worse category of bug than tags merely appearing in the wrong order.
To sum up: if tags are merely in the wrong order but are otherwise complete and well-formed, it is not a security issue; otherwise it is considered a potential security issue and is listed as "Yes" in the above table.
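The rule in note 1 can be roughly approximated in code. The following is only a sketch, not the procedure used to fill in the table: the function name `security_aspects` and its messages are hypothetical, and it assumes that literal `<` and `>` characters in text and attribute values are always escaped in well-formed output. It looks for a `>` with no open tag (start of a tag missing) and for a tag that is never closed (end of a tag missing). Injected attributes are not detected directly, only as a side effect of those two conditions. Comments, scripts and similar constructs are not handled.

```python
def security_aspects(fragment: str) -> list[str]:
    """Heuristic check for note 1; an empty list corresponds to "No"."""
    findings: list[str] = []

    def note(message: str) -> None:
        # Record each kind of finding only once.
        if message not in findings:
            findings.append(message)

    in_tag = False  # True while between '<' and the matching '>'
    quote = None    # quote character of an open attribute value, if any

    for ch in fragment:
        if not in_tag:
            if ch == "<":
                in_tag = True
            elif ch == ">":
                note("'>' outside any tag: start of a tag is missing")
        elif quote is not None:
            if ch == quote:
                quote = None
            elif ch == "<":
                note("'<' inside an attribute value: value or tag not terminated")
        elif ch in "\"'":
            quote = ch
        elif ch == "<":
            note("'<' inside an open tag: end of a tag is missing")
        elif ch == ">":
            in_tag = False

    if in_tag:
        note("input ends inside a tag: end of a tag is missing")
    return findings


if __name__ == "__main__":
    # The three examples from note 1.
    samples = [
        '<p><td><s></p>',
        '<a href="http://as<td></td><td class="external free"><p>user text here',
        '<th>|||||" class="external free" title="https://||||||" rel="nofollow">https://</th>',
    ]
    for sample in samples:
        print("Yes" if security_aspects(sample) else "No", "-", sample)
```

Applied to the three examples in note 1, the sketch reports no findings for the first and at least one finding for the other two, which matches the classification described there. It deliberately does not check tag order or nesting, since note 1 treats those as plain validation failures.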