CWG3177 [lex.pptoken] *header-name* formation rule is defined in terms of itself

### Reference (section label): [lex.pptoken], [cpp.pre]

### Issue description:

[[lex.pptoken]/5.4.2.1](https://eel.is/c++draft/lex.pptoken#5.4.2.1) is supposed to indicate when we form a *header-name* token, and says we only do so "immediately after the include, embed, or import preprocessing token in a #include ([cpp.include]), #embed ([cpp.embed]), or import ([cpp.import]) directive".

But [[cpp.pre]/2.2](https://eel.is/c++draft/cpp.pre#2.2) specifies when we are processing an import directive, and says that happens only when we encounter "an import preprocessing token immediately followed on the same logical source line by a [header-name](https://eel.is/c++draft/lex.header#nt:header-name), <, [identifier](https://eel.is/c++draft/lex.name#nt:identifier), or : preprocessing token".

So, given:

```c++
import <foo>;
```

... we form a *header-name* token if we're processing an import directive, and we're processing an import directive if we form a *header-name* token. This rule lacks foundation.

We first got the [lex.pptoken] side of this from [P1703R1](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1703r1.html). We then got the [cpp.pre] side of this from [P1857R3](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1857r3.html). The author of P1857R3 [has indicated](https://github.com/llvm/llvm-project/issues/190693#issuecomment-4203243858) the intent was that we form a *header-name* token if possible in a context where we *could* form an import directive.

Also, the wording uses the terms "immediately after" and "immediately following" without defining them, and the intended meaning is not the obvious one: we don't mean "with no intervening characters", and we don't mean merely "as the very next preprocessing token"; we actually mean "as the very next preprocessing token *and on the same logical source line*".
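To make the three candidate readings concrete, here is a minimal sketch that models only what lies between two preprocessing tokens. The `Between` struct and the three predicate names are hypothetical and exist only for illustration; nothing here is taken from the draft or from any implementation.

```c++
// Hypothetical model of the stretch of source between two preprocessing tokens.
struct Between {
    int whitespace_chars;  // non-newline whitespace (comments count as whitespace)
    int new_lines;         // new-line characters remaining after line splicing
    int pp_tokens;         // preprocessing tokens between the two tokens
};

// Reading 1: "with no intervening characters" (not what is meant).
bool no_intervening_characters(const Between& b) {
    return b.whitespace_chars == 0 && b.new_lines == 0 && b.pp_tokens == 0;
}

// Reading 2: "as the very next preprocessing token" (also not what is meant).
bool very_next_token(const Between& b) {
    return b.pp_tokens == 0;
}

// Reading 3: the intended one. The very next preprocessing token,
// and on the same logical source line.
bool intended_reading(const Between& b) {
    return b.pp_tokens == 0 && b.new_lines == 0;
}
```

Under this model, `import <foo>;` on one line is `Between{1, 0, 0}`, which fails reading 1 but satisfies readings 2 and 3. Putting `<foo>;` on the line after `import` gives `Between{0, 1, 0}`, which satisfies reading 2 but not the intended reading.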

### Suggested resolution:

Change in [lex.pptoken]/5.4:

> Otherwise, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token, even if that would cause further lexical analysis to fail, except that
> - [(5.4.1)](https://eel.is/c++draft/lex#pptoken-5.4.1) a [string-literal](https://eel.is/c++draft/lex#nt:string-literal) token is never formed when a [header-name](https://eel.is/c++draft/lex#nt:header-name) token can be formed, and
> - [(5.4.2)](https://eel.is/c++draft/lex#pptoken-5.4.2) a [header-name](https://eel.is/c++draft/lex#nt:header-name) ([[lex.header]](https://eel.is/c++draft/lex#header)) is only formed
>     - [(5.4.2.1)](https://eel.is/c++draft/lex#pptoken-5.4.2.1) immediately after the include<del>,</del> <ins>or</ins> embed<del>, or import</del> preprocessing token in a #include ([[cpp.include]](https://eel.is/c++draft/cpp.include))<del>,</del> <ins>or</ins> #embed ([[cpp.embed]](https://eel.is/c++draft/cpp.embed))<del>, or import ([[cpp.import]](https://eel.is/c++draft/cpp.import))</del> directive, respectively, or
>     - <ins>(5.4.2.1+) immediately after an `import` preprocessing token that is at the start of a logical source line, or</ins>
>     - [(5.4.2.2)](https://eel.is/c++draft/lex#pptoken-5.4.2.2) immediately after a preprocessing token sequence of `__has_include` or `__has_embed` immediately followed by `(` in a `#if`, `#elif`, or `#embed` directive ([[cpp.cond]](https://eel.is/c++draft/cpp.cond), [[cpp.embed]](https://eel.is/c++draft/cpp.embed))[.](https://eel.is/c++draft/lex#pptoken-5.4.sentence-1)
>
> <ins>A preprocessing token is considered to be *immediately* after another preprocessing token if the tokens are on the same logical source line and there are no intervening preprocessing tokens.</ins>

It might also be worth investigating whether the wording in [cpp.pre] can be cleaned up by making more use of "logical source line" terminology in place of talking about whitespace not containing a newline character, since they're different ways of saying the same thing and we currently use a mixture of the two approaches. (It's also surprising that a `/* \n */` comment is considered to be whitespace that does not contain a newline character for these purposes!)
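As a rough illustration of how the suggested bullet (5.4.2.1+) breaks the circularity, the sketch below decides whether a *header-name* can be formed by looking only at the text of one logical source line, without asking whether that line is an import directive. The function names `skip_ws` and `header_name_after_import` are hypothetical. The sketch assumes ASCII identifiers, assumes comments have already been replaced by spaces, and models the suggested wording literally (`import` at the very start of the line); it is not how any compiler implements this.

```c++
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Skip spaces and tabs. Comments are assumed to have been replaced by spaces already.
std::size_t skip_ws(std::string_view s, std::size_t pos) {
    while (pos < s.size() && (s[pos] == ' ' || s[pos] == '\t')) ++pos;
    return pos;
}

// Hypothetical helper: `line` is a single logical source line. Returns the spelling
// of the header-name formed immediately after an `import` token at the start of
// the line, or std::nullopt if no header-name is formed there.
std::optional<std::string> header_name_after_import(std::string_view line) {
    constexpr std::string_view kw = "import";
    std::size_t pos = skip_ws(line, 0);
    if (line.substr(pos, kw.size()) != kw) return std::nullopt;
    pos += kw.size();
    // `import` must be a whole identifier, not a prefix of e.g. `imported`.
    if (pos < line.size() &&
        (std::isalnum(static_cast<unsigned char>(line[pos])) || line[pos] == '_'))
        return std::nullopt;
    // Whitespace is not a preprocessing token, so it does not break "immediately after".
    pos = skip_ws(line, pos);
    if (pos >= line.size()) return std::nullopt;  // the next token is on another line
    char close;
    if (line[pos] == '<') close = '>';
    else if (line[pos] == '"') close = '"';
    else return std::nullopt;
    std::size_t end = line.find(close, pos + 1);
    // A header-name needs a closing delimiter and a non-empty character sequence.
    if (end == std::string_view::npos || end == pos + 1) return std::nullopt;
    return std::string(line.substr(pos, end - pos + 1));
}
```

For example, `header_name_after_import("import <foo>;")` yields `<foo>`, while `"import foo;"`, `"x = import <foo>;"`, and a line consisting only of `import` all yield no header-name. The decision depends only on the preceding token and its position on the line, which is the point of the suggested change.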
