Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration. #13

@buntekuh55


When parsing a website with content like this:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xml:lang="de" lang="de" xmlns="http://www.w3.org/1999/xhtml">
<head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<!-- 
	This website is powered by TYPO3 - inspiring people to share!
	TYPO3 is a free open source Content Management Framework initially created by Kasper Skaarhoj and licensed under GNU/GPL.
	TYPO3 is copyright 1998-2016 of Kasper Skaarhoj. Extensions are copyright of their respective owners.
	Information and contribution at http://typo3.org/
-->

<title>Test</title>
<meta name="generator" content="TYPO3 CMS" />
<meta name="robots" content="index,follow" />
<meta name="copyright" content="Test" />
<meta name="revisit-after" content="7 days" />
<meta name="title" content="Test" />
<meta name="date" content="2018-03-29" />

</head>
<body class="layout0 language2 type0 pid8860 parentPid8852 ">

Test


</body>
</html>

the following warning is raised:
Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

Are there any workarounds with MWC?
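
For context, the message itself names two ways out: pass bytes instead of a Unicode string, or drop the XML declaration. As far as I can tell the wording matches the ValueError that lxml raises for a `str` that starts with `<?xml ... encoding=...?>`; whether MWC parses with lxml at this point is an assumption. Below is a minimal sketch of both options, assuming the page is already available as a Python `str`. The helpers `strip_xml_declaration` and `to_bytes` are hypothetical names and not part of MWC.

```python
import re

# Matches a leading XML declaration such as <?xml version="1.0" encoding="utf-8"?>
_XML_DECL = re.compile(r'^\s*<\?xml[^>]*\?>\s*', re.IGNORECASE)


def strip_xml_declaration(text: str) -> str:
    """Hypothetical helper: return text without a leading XML declaration."""
    return _XML_DECL.sub('', text, count=1)


def to_bytes(text: str, encoding: str = 'utf-8') -> bytes:
    """Hypothetical helper: encode the page so the parser can honour the
    declaration itself. The encoding should match the one declared in the page."""
    return text.encode(encoding)


page = '<?xml version="1.0" encoding="utf-8"?>\n<html><body>Test</body></html>'

without_decl = strip_xml_declaration(page)  # option 1: str without declaration
as_bytes = to_bytes(page)                   # option 2: bytes with declaration

# Either value could then be handed to the parser. Assuming lxml is in use:
#   from lxml import etree
#   etree.fromstring(as_bytes)
```

Encoding to bytes keeps the document unchanged, so it is probably the less invasive of the two. Stripping the declaration only makes sense when the text has already been decoded correctly.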
