AttributeError: 'NoneType' object has no attribute 'strip' #19

@Amirelkanov

Description

Description: Getting an AttributeError when passing an HTML-like string with a corrupted <style> tag to the AdvancedHTMLParser.AdvancedHTMLParser().parseStr method.

String input:

<!DOCTYPE html><html><head><title>W33ZpsIOCysn9GGU45y0LW9EpuPHBlAuxCRRusKRvowefQLMy2</title><style:p { color: red; }</style></head><body><ul><li>rp52OnfCuzqBsp7</li><li>wrAAhIfvfpvMeyoTdmoF1oxezMhscNlgTqo0fPhfUS7XWZvECi2iVMsldLpqJq6W34KuOeoJ74cx5</li><li>8ymeXTKNEDb3jDnYwKt3lFMc4s7pJxDIVgSXljWIlOjv7JGr8cXf8SJOmpiyD05PyTzj9UATCFo1XqBpCqXR7KcjUYinCI4kZYI</li></ul> 6L1gB6g0z</body></html>

Bytearray input:

[60, 33, 68, 79, 67, 84, 89, 80, 69, 32, 104, 116, 109, 108, 62, 60, 104, 116, 109, 108, 62, 60, 104, 101, 97, 100, 62, 60, 116, 105, 116, 108, 101, 62, 87, 51, 51, 90, 112, 115, 73, 79, 67, 121, 115, 110, 57, 71, 71, 85, 52, 53, 121, 48, 76, 87, 57, 69, 112, 117, 80, 72, 66, 108, 65, 117, 120, 67, 82, 82, 117, 115, 75, 82, 118, 111, 119, 101, 102, 81, 76, 77, 121, 50, 60, 47, 116, 105, 116, 108, 101, 62, 60, 115, 116, 121, 108, 101, 58, 112, 32, 123, 32, 99, 111, 108, 111, 114, 58, 32, 114, 101, 100, 59, 32, 125, 60, 47, 115, 116, 121, 108, 101, 62, 60, 47, 104, 101, 97, 100, 62, 60, 98, 111, 100, 121, 62, 60, 117, 108, 62, 60, 108, 105, 62, 114, 112, 53, 50, 79, 110, 102, 67, 117, 122, 113, 66, 115, 112, 55, 60, 47, 108, 105, 62, 60, 108, 105, 62, 119, 114, 65, 65, 104, 73, 102, 118, 102, 112, 118, 77, 101, 121, 111, 84, 100, 109, 111, 70, 49, 111, 120, 101, 122, 77, 104, 115, 99, 78, 108, 103, 84, 113, 111, 48, 102, 80, 104, 102, 85, 83, 55, 88, 87, 90, 118, 69, 67, 105, 50, 105, 86, 77, 115, 108, 100, 76, 112, 113, 74, 113, 54, 87, 51, 52, 75, 117, 79, 101, 111, 74, 55, 52, 99, 120, 53, 60, 47, 108, 105, 62, 60, 108, 105, 62, 56, 121, 109, 101, 88, 84, 75, 78, 69, 68, 98, 51, 106, 68, 110, 89, 119, 75, 116, 51, 108, 70, 77, 99, 52, 115, 55, 112, 74, 120, 68, 73, 86, 103, 83, 88, 108, 106, 87, 73, 108, 79, 106, 118, 55, 74, 71, 114, 56, 99, 88, 102, 56, 83, 74, 79, 109, 112, 105, 121, 68, 48, 53, 80, 121, 84, 122, 106, 57, 85, 65, 84, 67, 70, 111, 49, 88, 113, 66, 112, 67, 113, 88, 82, 55, 75, 99, 106, 85, 89, 105, 110, 67, 73, 52, 107, 90, 89, 73, 60, 47, 108, 105, 62, 60, 47, 117, 108, 62, 32, 54, 76, 49, 103, 66, 54, 103, 48, 122, 60, 47, 98, 111, 100, 121, 62, 60, 47, 104, 116, 109, 108, 62]

Code that reproduces the error:

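import AdvancedHTMLParser

# Minimal failing input (see "Additional information"); the full string above fails the same way.
string_input = "<s</style>"

parser = AdvancedHTMLParser.AdvancedHTMLParser()
parser.parseStr(string_input)  # raises AttributeError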

Expected Result: Invalid input is ignored, or a specific exception (such as MultipleRootNodeException) is raised.

Actual Result:

Traceback (most recent call last):
  File "C:\Users\AmEl\IdeaProjects\Joker2023\src\main\python\main.py", line 55, in main
    python_method(input_data)
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Parser.py", line 980, in parseStr
    self.feed(html)
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Parser.py", line 948, in feed
    HTMLParser.feed(self, contents)
  File "C:\Users\AmEl\AppData\Local\Programs\Python\Python312\Lib\html\parser.py", line 111, in feed
    self.goahead(0)
  File "C:\Users\AmEl\AppData\Local\Programs\Python\Python312\Lib\html\parser.py", line 171, in goahead
    k = self.parse_starttag(i)
        ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\AppData\Local\Programs\Python\Python312\Lib\html\parser.py", line 338, in parse_starttag
    self.handle_starttag(tag, attrs)
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Parser.py", line 138, in handle_starttag
    newTag = AdvancedTag(tagName, attributeList, isSelfClosing, ownerDocument=self)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Tags.py", line 196, in __init__
    myAttributes[key] = value
    ~~~~~~~~~~~~^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\SpecialAttributes.py", line 96, in __setitem__
    tag.style = StyleAttribute(value, tag)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\SpecialAttributes.py", line 424, in __init__
    self._styleDict = StyleAttribute.styleToDict(styleValue)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\SpecialAttributes.py", line 650, in styleToDict
    styleStr = styleStr.strip()
               ^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'strip'
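
The last frame shows styleToDict calling .strip() on a value that is None. One possible guard, sketched below in standalone form, is to treat a missing value as an empty style. Note that `style_to_dict` is a hypothetical stand-in written for illustration; it is not the library's actual `StyleAttribute.styleToDict`, whose full behavior is not shown in this report.

```python
def style_to_dict(style_str):
    """Parse 'a: b; c: d' into a dict, tolerating None (hypothetical helper)."""
    # A valueless attribute arrives as None; treat it like an empty style.
    if style_str is None:
        return {}
    result = {}
    for declaration in style_str.strip().split(";"):
        # Skip empty fragments and fragments without a "name: value" shape.
        if ":" not in declaration:
            continue
        name, _, value = declaration.partition(":")
        name, value = name.strip(), value.strip()
        if name:
            result[name] = value
    return result


print(style_to_dict(None))
print(style_to_dict("color: red; "))
```

With a guard like this, a style attribute without a value would yield an empty style instead of an AttributeError, which corresponds to the "ignore invalid input" option in the expected result.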

Additional information:

  • OS: Windows 10, 22H2 (19045.4984)
  • Python version: Python 3.12.6
  • The error can also be reproduced with a much shorter input: <s</style>
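
A likely cause, consistent with the traceback but not verified against the library source: Python's html.parser reports an attribute that has no value as a (name, None) pair, and the malformed markup appears to be tokenized so that "style" ends up as such a valueless attribute. The sketch below uses only the standard library and a well-formed valueless attribute to show where the None comes from; `AttrCollector` is a name made up for this example.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Records every start tag together with its attribute list."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None when
        # the attribute is written without "=value".
        self.seen.append((tag, attrs))


collector = AttrCollector()
collector.feed("<p style>text</p>")
collector.close()
print(collector.seen)
```

Here the style attribute is delivered with the value None. Any handler that assumes a string at this point, as styleToDict does in the traceback, will fail on .strip().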

P.S. The same information is available in reportAttributeError.txt.
