Semantic Tokens
Lexical Tokens
Emitted from a lexical pass over the source file, independent of the AST.
Comments
Numbers, characters, strings
Keywords (including
override,final)Preprocessor directives (
#if,#define, etc.)Header names in
#includeMacro names at
#definesiteLiteral prefixes and suffixes — highlight encoding prefixes and type suffixes as distinct tokens
cppR"(raw string)" // R prefix u8"utf-8 string" // u8 prefix L"wide string" // L prefix 0xFF // 0x prefix 0b1010 // 0b prefix 42u // u suffix 3.14f // f suffix 1'000'000 // digit separatorsUser-defined literal suffixes — highlight the suffix as a separate token linked to the literal operator
cppusing namespace std::chrono_literals; auto d = 500ms; // ms suffix → operator""ms auto s = "hello"s; // s suffix → operator""s auto x = 3.14_deg; // _deg suffix → operator""_degEscape sequences within string/character literals
cpp"hello\nworld" // \n highlighted differently from surrounding text '\x41' // \x41 highlighted as escapeDeclarator vs operator disambiguation for
&,*,&&— distinguish pointer/reference declarators from arithmetic/logical operators (clangd#1421)cppint* p; // * is a declarator (pointer type) int x = a * b; // * is an arithmetic operator int& r = x; // & is a declarator (reference type) int y = a & b; // & is a bitwise operator
Token Types
Declarations & References
Namespace and namespace alias
Class, struct, union, enum
Type alias (
typedef,using)Function, method, constructor, destructor, operator
Variable (local, global, parameter, field, binding)
Enum member
Template parameter (type and non-type)
Concept
Label (
gototarget)Structured binding variables — should highlight as local variables, not global (clangd#485)
cppauto [x, y] = point; // x, y should be highlighted as local variablesDeduction guides
cpptemplate<typename T> vector(T, T) -> vector<T>; // deduction guide should get a tokenusing enum— the enum name should get a semantic token (clangd#1283)cppusing enum Color; // Color should be highlighted as enum typeExplicit instantiation (clangd#316)
cpptemplate class vector<int>; // vector and int should be highlightedMember initializer list fields (clangd#122)
cppstruct Widget { int width, height; Widget(int w, int h) : width(w), height(h) {} // ^^^^^ ^^^^^^ should be highlighted as fields };Lambda init-capture — should consistently get a token (clangd#868)
cppauto f = [val = compute()] {}; // val should be highlighted as local variableUsing declarations — consistent token for the introduced name (clangd#2619)
cppusing std::vector; // vector should get a semantic tokensizeof...— the pack parameter should get a token (clangd#213)cpptemplate<typename... Ts> constexpr auto count = sizeof...(Ts); // Ts should be highlighted as templateParameter
Module Tokens
-
import/export/modulekeywords highlighted distinctly - Module names in import declarations
- Full AST-level token resolution for module declarations (currently lexical fallback)
Attributes
Attribute names should receive semantic tokens
cpp[[nodiscard]] int compute(); [[deprecated("use v2")]] void old_func(); [[maybe_unused]] int x;Vendor-specific attribute namespaces
cpp[[gnu::packed]] struct S {}; // gnu namespace [[clang::optnone]] void f(); // clang namespaceExpressions inside attributes should be fully highlighted (clangd#2209)
cpp[[assume(ptr != nullptr)]]; // ptr, nullptr should get tokens
Additional Token Types
- Primitive token type for built-in types (
int,float,void, etc.) - Bracket token type for matching bracket pairs (
[],(),{},<>)
Token Modifiers
Implemented
- Declaration vs definition distinction
- Static (class static members, static local variables, static functions)
- Readonly / const
- Deprecated (
[[deprecated]]) - Abstract (pure virtual methods)
- Virtual
- Default library (symbols from system headers)
- Constructor / destructor marker
- Templated (clice extension, not part of the standard LSP semantic tokens protocol)
- DependentName for dependent names in templates (clice extension)
Planned
Deduced modifier for deduced types (e.g.,
auto,decltype)Scope modifiers — function scope, class scope, file scope, global scope (clangd#352)
cppint global; // globalScope static int file_local; // fileScope struct Foo { int member; // classScope void bar() { int local; // functionScope } };Mutable reference / mutable pointer — mark arguments passed by non-const reference or pointer (clangd#839)
cppvoid modify(int& out); modify(x); // x should have usedAsMutableReference modifierUser-defined operator modifier — distinguish built-in vs user-defined operators (clangd#1521)
cppVec a, b; auto c = a + b; // + should have userDefined modifier int x = 1 + 2; // + is built-in, no modifierObject-like vs function-like macro distinction — currently all macros receive the Macro token type (clangd#2649)
cpp#define MAX_SIZE 1024 // object-like macro (no distinct kind yet) #define CHECK(x) assert(x) // function-like macro (no distinct kind yet)Context-dependent readonly —
conston the value vs on the pointer level (clangd#1585)cppconst int* p; // p itself is NOT readonly (pointer can change) int* const q; // q IS readonly (pointer is const) const int* const r; // r IS readonly
Conflict / Ambiguity
C++ allows structurally different entities to share the same name. When a single name refers to multiple entities of different kinds, the semantic token type cannot be unambiguously determined. These cases receive a special Conflict token type (not a modifier) and are displayed in a neutral color (e.g., gray).
Overloaded names via
using— a name may introduce both a type and a functioncppnamespace N { struct Widget {}; void Widget(); // legal in C++: function shares name with struct } using N::Widget; // brings both → conflict: type or function?Dependent name ambiguity across instantiations — a dependent call may resolve to different kinds in different instantiations
cpptemplate<typename T> void process(T& obj) { obj.foo(); // foo could be a method or a functor data member } struct A { void foo(); }; // A::foo is a method struct B { std::function<void()> foo; }; // B::foo is a variable process(a); // T = A → foo is a method (function color) process(b); // T = B → foo is a variable (variable color) // conflict: no single correct token typeInjected class name — inside a class, the class name can refer to both the type and the constructor
cppstruct Widget { Widget(int); Widget create() { return Widget(42); // Widget here: type? constructor? } };
Dependent Name Highlighting
Highlighting of names in uninstantiated template bodies that depend on template parameters.
Dependent type names (clangd#154, clangd#214)
cpptemplate<typename T> void process() { typename T::value_type val; // value_type should be highlighted as type T::static_func(); // static_func should be highlighted }Dependent template names (clangd#484)
cpptemplate<typename T> void process(T& obj) { obj.template get<int>(); // get should be highlighted as method }Heuristic-based coloring — when a dependent name has a likely target, use that target's kind (clangd#297)
cpptemplate<typename T> void process(std::vector<T>& v) { v.push_back(val); // push_back could be colored as method (vector's known member) }Members imported from dependent base via
using(clangd#686)cpptemplate<typename T> struct Derived : Base<T> { using Base<T>::member; void foo() { member(); } // member should get a token };Dependent names with overloaded targets — when the name resolves to an overload set of mixed kinds (clangd#1057)
Inactive Code Regions
Dim inactive preprocessor branches (
#if 0, false#ifdef, etc.) (clangd#132)cpp#ifdef _WIN32 // active code (normal highlighting) #else // inactive code (dimmed / grayed out) #endifCorrect inactive region boundaries with
#elifchains (clangd#602)cpp#if defined(A) // ... #elif defined(B) // the #elif directive line itself should NOT be dimmed #else // ... #endifPreserve syntax highlighting within inactive regions (clangd#1664)
cpp#if 0 int x = 42; // should still have keyword/type coloring, just dimmed std::string s; // not plain gray text #endifDon't treat inactive regions as comments — they should remain distinct from actual comment blocks (clangd#1545)
Unreachable code dimming — code after unconditional
return,throw, infinite loops, etc. (clangd#1828)cppvoid foo() { return; int x = 42; // unreachable — should be dimmed }
Format String Highlighting
std::format/std::printformat string placeholders (clangd#1709)cppstd::format("Hello, {}! You are {} years old.", name, age); // ^^^^^^ ^^ ^^ // string placeholder placeholder (different color)Highlight invalid format specifiers as errors
Protocol Support
- Full document semantic tokens (
textDocument/semanticTokens/full) - UTF-16 delta-encoded token positions
- Range-based semantic tokens (
textDocument/semanticTokens/range) — only compute tokens for the visible viewport, critical for performance on large files - Delta updates (
textDocument/semanticTokens/full/delta) — send only changed tokens since the previous response, reducing payload size on incremental edits
Token Correctness
Known issues from clangd that clice aims to get right from the start:
Constructors and destructors use the
methodtoken type, notclass(clangd#1509)Destructor token covers the leading
~character (clangd#2078)Destructor token modifiers match the declaration (clangd#872)
Apply token modifiers to operands of overloaded operators (clangd#2547)
cppvoid process(Vec& out); out += rhs; // out should have usedAsMutableReference modifier (via operator+=)Don't mark
autoparameter in a lambda/function astypeParameter(clangd#1390)cppauto f = [](auto x) {}; // auto here is NOT a template type parameter void g(auto y) {} // same — abbreviated function template, not typeParameterNested name specifier in pointer-to-member should get a token (clangd#2235)
cppint Foo::*ptr; // Foo should be highlighted as class::new— thenewkeyword should get a semantic token even with global scope prefix (clangd#1627)co_yield/co_awaitshould not lose highlighting when the coroutine return type is a template (clangd#2437)cpptemplate<typename T> generator<T> gen(T val) { co_yield val; // co_yield should still be highlighted as keyword }
Changelog
| Date | Change | PR |
|---|---|---|
| — | Initial semantic token types and modifiers, full document tokens | — |