@Earlopain
Collaborator

Many characters have special meaning and break formatting

Before:
[image]

After:
[image]

@byroot can you check if this fixes #3895 for you? It doesn't break when I run it locally even though I use the same version as you.

@byroot
Member

byroot commented Feb 2, 2026

Yep, that works for me.

# This module contains methods for escaping characters in Doxygen comments.
module Doxygen
  def self.escape(value)
    value.gsub(/[\.*%!`#<>_+-]/, '\\\\\0')
Member

I'm not certain this is correct though, as e.g. many comments do contain markdown.

The one comment I had an issue with had a single backtick, but many others have proper markdown code blocks.

Member

e.g. I think your PR breaks:

comment: "frozen by virtue of a `frozen_string_literal: true` comment or `--enable-frozen-string-literal`; only for adjacent string literals like `'a' 'b'`"
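To illustrate the concern, here is a minimal sketch that completes the helper from the diff excerpt (the two closing `end` lines are assumed, since the excerpt is cut off) and runs it on part of the comment quoted above. The backticks that delimit the code span come back escaped, so Doxygen would presumably render them literally instead of as inline code.

```ruby
# Sketch of the escaping helper from the diff excerpt; the `end` lines are assumed.
module Doxygen
  # Prefix every character that may act as markup with a backslash.
  def self.escape(value)
    value.gsub(/[\.*%!`#<>_+-]/, '\\\\\0')
  end
end

# A comment that relies on a markdown code span loses it once escaped.
puts Doxygen.escape("adjacent string literals like `'a' 'b'`")
# prints: adjacent string literals like \`'a' 'b'\`
```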

Collaborator Author

Ah, right. I tried \verbatim but that wraps the result in a code block. I think I can just rename this method and it should be OK.

Collaborator Author

(It's only applied to that one specific place where token comments are shown)

Member

I still don't follow. These comments do contain Doxygen code (markdown really, but WTV), so it shouldn't be fully escaped.

IMO it's not that we're missing some automated escaping, it's that the YAML file does contain invalid markdown/Doxygen, so the fix can only be manual.
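For the manual route, a rough heuristic could help locate candidates. The sketch below is hypothetical and not part of the PR: `unbalanced_backticks?` is an invented helper name and the sample comments are illustrative. It flags any comment with an odd number of backticks, which cannot pair up into code spans.

```ruby
# Hypothetical lint: flag comments whose backticks cannot pair up into code spans.
def unbalanced_backticks?(comment)
  comment.count("`").odd?
end

# Illustrative sample data, not the full config.
comments = {
  "BACKTICK" => "`",
  "adjacent literals" => "only for adjacent string literals like `'a' 'b'`",
  "KEYWORD_DO" => "do",
}

suspicious = comments.select { |_name, text| unbalanced_backticks?(text) }.keys
puts suspicious.inspect
# prints: ["BACKTICK"]
```

This only catches stray backticks; other characters with special meaning would still need a manual look.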

Collaborator Author

It's only for these comments

prism/config.yml

Lines 327 to 655 in 62c7300

tokens:
  # The order of the tokens at the beginning is important, because we use them
  # for a lookup table.
  - name: EOF
    value: 1
    comment: final token in the file
  - name: BRACE_RIGHT
    comment: "}"
  - name: COMMA
    comment: ","
  - name: EMBEXPR_END
    comment: "}"
  - name: KEYWORD_DO
    comment: "do"
  - name: KEYWORD_ELSE
    comment: "else"
  - name: KEYWORD_ELSIF
    comment: "elsif"
  - name: KEYWORD_END
    comment: "end"
  - name: KEYWORD_ENSURE
    comment: "ensure"
  - name: KEYWORD_IN
    comment: "in"
  - name: KEYWORD_RESCUE
    comment: "rescue"
  - name: KEYWORD_THEN
    comment: "then"
  - name: KEYWORD_WHEN
    comment: "when"
  - name: NEWLINE
    comment: "a newline character outside of other tokens"
  - name: PARENTHESIS_RIGHT
    comment: ")"
  - name: PIPE
    comment: "|"
  - name: SEMICOLON
    comment: ";"
  # Tokens from here on are not used for lookup, and can be in any order.
  - name: AMPERSAND
    comment: "&"
  - name: AMPERSAND_AMPERSAND
    comment: "&&"
  - name: AMPERSAND_AMPERSAND_EQUAL
    comment: "&&="
  - name: AMPERSAND_DOT
    comment: "&."
  - name: AMPERSAND_EQUAL
    comment: "&="
  - name: BACKTICK
    comment: "`"
  - name: BACK_REFERENCE
    comment: "a back reference"
  - name: BANG
    comment: "! or !@"
  - name: BANG_EQUAL
    comment: "!="
  - name: BANG_TILDE
    comment: "!~"
  - name: BRACE_LEFT
    comment: "{"
  - name: BRACKET_LEFT
    comment: "["
  - name: BRACKET_LEFT_ARRAY
    comment: "[ for the beginning of an array"
  - name: BRACKET_LEFT_RIGHT
    comment: "[]"
  - name: BRACKET_LEFT_RIGHT_EQUAL
    comment: "[]="
  - name: BRACKET_RIGHT
    comment: "]"
  - name: CARET
    comment: "^"
  - name: CARET_EQUAL
    comment: "^="
  - name: CHARACTER_LITERAL
    comment: "a character literal"
  - name: CLASS_VARIABLE
    comment: "a class variable"
  - name: COLON
    comment: ":"
  - name: COLON_COLON
    comment: "::"
  - name: COMMENT
    comment: "a comment"
  - name: CONSTANT
    comment: "a constant"
  - name: DOT
    comment: "the . call operator"
  - name: DOT_DOT
    comment: "the .. range operator"
  - name: DOT_DOT_DOT
    comment: "the ... range operator or forwarding parameter"
  - name: EMBDOC_BEGIN
    comment: "=begin"
  - name: EMBDOC_END
    comment: "=end"
  - name: EMBDOC_LINE
    comment: "a line inside of embedded documentation"
  - name: EMBEXPR_BEGIN
    comment: "#{"
  - name: EMBVAR
    comment: "#"
  - name: EQUAL
    comment: "="
  - name: EQUAL_EQUAL
    comment: "=="
  - name: EQUAL_EQUAL_EQUAL
    comment: "==="
  - name: EQUAL_GREATER
    comment: "=>"
  - name: EQUAL_TILDE
    comment: "=~"
  - name: FLOAT
    comment: "a floating point number"
  - name: FLOAT_IMAGINARY
    comment: "a floating pointer number with an imaginary suffix"
  - name: FLOAT_RATIONAL
    comment: "a floating pointer number with a rational suffix"
  - name: FLOAT_RATIONAL_IMAGINARY
    comment: "a floating pointer number with a rational and imaginary suffix"
  - name: GLOBAL_VARIABLE
    comment: "a global variable"
  - name: GREATER
    comment: ">"
  - name: GREATER_EQUAL
    comment: ">="
  - name: GREATER_GREATER
    comment: ">>"
  - name: GREATER_GREATER_EQUAL
    comment: ">>="
  - name: HEREDOC_END
    comment: "the end of a heredoc"
  - name: HEREDOC_START
    comment: "the start of a heredoc"
  - name: IDENTIFIER
    comment: "an identifier"
  - name: IGNORED_NEWLINE
    comment: "an ignored newline"
  - name: INSTANCE_VARIABLE
    comment: "an instance variable"
  - name: INTEGER
    comment: "an integer (any base)"
  - name: INTEGER_IMAGINARY
    comment: "an integer with an imaginary suffix"
  - name: INTEGER_RATIONAL
    comment: "an integer with a rational suffix"
  - name: INTEGER_RATIONAL_IMAGINARY
    comment: "an integer with a rational and imaginary suffix"
  - name: KEYWORD_ALIAS
    comment: "alias"
  - name: KEYWORD_AND
    comment: "and"
  - name: KEYWORD_BEGIN
    comment: "begin"
  - name: KEYWORD_BEGIN_UPCASE
    comment: "BEGIN"
  - name: KEYWORD_BREAK
    comment: "break"
  - name: KEYWORD_CASE
    comment: "case"
  - name: KEYWORD_CLASS
    comment: "class"
  - name: KEYWORD_DEF
    comment: "def"
  - name: KEYWORD_DEFINED
    comment: "defined?"
  - name: KEYWORD_DO_LOOP
    comment: "do keyword for a predicate in a while, until, or for loop"
  - name: KEYWORD_END_UPCASE
    comment: "END"
  - name: KEYWORD_FALSE
    comment: "false"
  - name: KEYWORD_FOR
    comment: "for"
  - name: KEYWORD_IF
    comment: "if"
  - name: KEYWORD_IF_MODIFIER
    comment: "if in the modifier form"
  - name: KEYWORD_MODULE
    comment: "module"
  - name: KEYWORD_NEXT
    comment: "next"
  - name: KEYWORD_NIL
    comment: "nil"
  - name: KEYWORD_NOT
    comment: "not"
  - name: KEYWORD_OR
    comment: "or"
  - name: KEYWORD_REDO
    comment: "redo"
  - name: KEYWORD_RESCUE_MODIFIER
    comment: "rescue in the modifier form"
  - name: KEYWORD_RETRY
    comment: "retry"
  - name: KEYWORD_RETURN
    comment: "return"
  - name: KEYWORD_SELF
    comment: "self"
  - name: KEYWORD_SUPER
    comment: "super"
  - name: KEYWORD_TRUE
    comment: "true"
  - name: KEYWORD_UNDEF
    comment: "undef"
  - name: KEYWORD_UNLESS
    comment: "unless"
  - name: KEYWORD_UNLESS_MODIFIER
    comment: "unless in the modifier form"
  - name: KEYWORD_UNTIL
    comment: "until"
  - name: KEYWORD_UNTIL_MODIFIER
    comment: "until in the modifier form"
  - name: KEYWORD_WHILE
    comment: "while"
  - name: KEYWORD_WHILE_MODIFIER
    comment: "while in the modifier form"
  - name: KEYWORD_YIELD
    comment: "yield"
  - name: KEYWORD___ENCODING__
    comment: "__ENCODING__"
  - name: KEYWORD___FILE__
    comment: "__FILE__"
  - name: KEYWORD___LINE__
    comment: "__LINE__"
  - name: LABEL
    comment: "a label"
  - name: LABEL_END
    comment: "the end of a label"
  - name: LAMBDA_BEGIN
    comment: "{"
  - name: LESS
    comment: "<"
  - name: LESS_EQUAL
    comment: "<="
  - name: LESS_EQUAL_GREATER
    comment: "<=>"
  - name: LESS_LESS
    comment: "<<"
  - name: LESS_LESS_EQUAL
    comment: "<<="
  - name: METHOD_NAME
    comment: "a method name"
  - name: MINUS
    comment: "-"
  - name: MINUS_EQUAL
    comment: "-="
  - name: MINUS_GREATER
    comment: "->"
  - name: NUMBERED_REFERENCE
    comment: "a numbered reference to a capture group in the previous regular expression match"
  - name: PARENTHESIS_LEFT
    comment: "("
  - name: PARENTHESIS_LEFT_PARENTHESES
    comment: "( for a parentheses node"
  - name: PERCENT
    comment: "%"
  - name: PERCENT_EQUAL
    comment: "%="
  - name: PERCENT_LOWER_I
    comment: "%i"
  - name: PERCENT_LOWER_W
    comment: "%w"
  - name: PERCENT_LOWER_X
    comment: "%x"
  - name: PERCENT_UPPER_I
    comment: "%I"
  - name: PERCENT_UPPER_W
    comment: "%W"
  - name: PIPE_EQUAL
    comment: "|="
  - name: PIPE_PIPE
    comment: "||"
  - name: PIPE_PIPE_EQUAL
    comment: "||="
  - name: PLUS
    comment: "+"
  - name: PLUS_EQUAL
    comment: "+="
  - name: QUESTION_MARK
    comment: "?"
  - name: REGEXP_BEGIN
    comment: "the beginning of a regular expression"
  - name: REGEXP_END
    comment: "the end of a regular expression"
  - name: SLASH
    comment: "/"
  - name: SLASH_EQUAL
    comment: "/="
  - name: STAR
    comment: "*"
  - name: STAR_EQUAL
    comment: "*="
  - name: STAR_STAR
    comment: "**"
  - name: STAR_STAR_EQUAL
    comment: "**="
  - name: STRING_BEGIN
    comment: "the beginning of a string"
  - name: STRING_CONTENT
    comment: "the contents of a string"
  - name: STRING_END
    comment: "the end of a string"
  - name: SYMBOL_BEGIN
    comment: "the beginning of a symbol"
  - name: TILDE
    comment: "~ or ~@"
  - name: UAMPERSAND
    comment: "unary &"
  - name: UCOLON_COLON
    comment: "unary ::"
  - name: UDOT_DOT
    comment: "unary .. operator"
  - name: UDOT_DOT_DOT
    comment: "unary ... operator"
  - name: UMINUS
    comment: "-@"
  - name: UMINUS_NUM
    comment: "-@ for a number"
  - name: UPLUS
    comment: "+@"
  - name: USTAR
    comment: "unary *"
  - name: USTAR_STAR
    comment: "unary **"
  - name: WORDS_SEP
    comment: "a separator between words in a list"
  - name: __END__
    comment: "marker for the point in the file at which the parser should stop"
(https://ruby.github.io/prism/c/ast_8h.html#adb9ccf5e858ff029b17095d6298cf42b)

Makes sense for these I think, there is no markdown (yet). You would have to escape many other places manually otherwise. I tried that but doxygen doesn't really do what I want it to.

I'm also fine with your PR. I never looked at the generated docs before so it doesn't really matter that much to me.
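As a rough sketch of the scoped approach described here, the snippet below escapes only token comments taken from a small slice of the list above. The `token_doc_lines` helper and the `/** ... */` output shape are hypothetical stand-ins for wherever the real template emits token comments, and the closing `end` lines of the helper are assumed.

```ruby
require "yaml"

# Same helper as in the diff excerpt (closing `end` lines assumed).
module Doxygen
  def self.escape(value)
    value.gsub(/[\.*%!`#<>_+-]/, '\\\\\0')
  end
end

# A few entries copied from the token list; the quoted heredoc avoids interpolating "#{".
CONFIG = YAML.safe_load(<<~'YAML')
  tokens:
    - name: BACKTICK
      comment: "`"
    - name: EMBEXPR_BEGIN
      comment: "#{"
    - name: LESS_EQUAL_GREATER
      comment: "<=>"
    - name: KEYWORD_DO
      comment: "do"
YAML

# Hypothetical rendering step: only token comments pass through the escape.
def token_doc_lines(config)
  config.fetch("tokens").map do |token|
    "/** #{Doxygen.escape(token.fetch("comment"))} */"
  end
end

puts token_doc_lines(CONFIG)
```

Comments elsewhere in the config would bypass this path, so any markdown they contain stays untouched.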

Member

Ah I see. I guess that currently works since none of these comments use Doxygen formatting, but it would be a problem in the future if one wanted to.

You have more context than me so up to you (or to Kevin).

Many characters have special meaning and break formatting
@Earlopain force-pushed the doxygen-comment-verbatim branch from c588996 to 0b9d516 on February 2, 2026 11:59
Collaborator

@kddnewton left a comment

Yeah this is fine.

@kddnewton merged commit 745355f into ruby:main on Feb 2, 2026
66 checks passed