Skip to content

Inliner, Break Optimization and Fixing UCS Tail Calls#423

Open
AnsonYeung wants to merge 34 commits intohkust-taco:hkmc2from
AnsonYeung:inliner
Open

Inliner, Break Optimization and Fixing UCS Tail Calls#423
AnsonYeung wants to merge 34 commits intohkust-taco:hkmc2from
AnsonYeung:inliner

Conversation

@AnsonYeung
Copy link
Copy Markdown
Contributor

@AnsonYeung AnsonYeung commented Mar 21, 2026

This PR implements inlining. As naive inlining generates unnecessary labels and break, this PR adds break optimization to optimize away useless breaks. The UCS part of #435 is addressed.

Method inlining is not enabled with this PR.

Copy link
Copy Markdown
Contributor

@LPTK LPTK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic looks a bit more complex than I expected. I am curious: what is the main source of complexity?

import semantics.Elaborator.State

object Inliner:
object TermSymbolPath:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls document why we need this strange extractor.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the Call node takes in a path, this allow matching for a specific function or method. However method can be virtual and also the this parameter needs to be adjusted for inlining and so method is not actually inlined right now. I added this as the doc:

// Reference to a function body can occur as a.f or f, this handles both cases.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also mathc virtual calls, right? But presumably virtual functions won't be in the mapping we're maintaining so that's fine?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now method is not considered at all for inlining so it's fine. However I think later if a class have 2 methods f and g where f calls g, then we need to know that g is not virtual / final in order to inline. Inlining methods should be somewhat important since most functions in a program will be methods.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can consider method inlining later. Following the previous version of MLscript, I intend to make methods not virtual by default, requiring a virtual modifier to make them so.

@AnsonYeung
Copy link
Copy Markdown
Contributor Author

AnsonYeung commented Mar 24, 2026

The logic looks a bit more complex than I expected. I am curious: what is the main source of complexity?

The main complexity imo is that function definition and usage is not in order, which requires at least 2 traversals. The function could be mutually recursive for example. Even in this case, one function could be inlined into another if one of the function is only used once or is small enough.

Other complexities are symbol renaming, and spread arguments. Multiple chained calls to multiple param list is not supported for inlining currently.

A global function symbol to definition map would simplify this a bit, it would mean no more Define node for functions however.

@LPTK
Copy link
Copy Markdown
Contributor

LPTK commented Mar 24, 2026

The main complexity imo is that function definition and usage is not in order, which requires at least 2 traversals. The function could be mutually recursive for example. Even in this case, one function could be inlined into another if one of the function is only used once or is small enough.

Oh, this makes me think about that old GHC paper where they had what looked like a reasonable solution: Secrets of the Glasgow Haskell Compiler inliner. It would be great if you could see whether you could implement something like their solution (sec. 4.2). Would also be a nice paper to present in a group meeting/seminar.

Other complexities are symbol renaming, and spread arguments. Multiple chained calls to multiple param list is not supported for inlining currently.

Would it help to make Call take multiple argument lists? I've been thinking that doing so would make sense, to reflect multiple parameter lists.

A global function symbol to definition map would simplify this a bit, it would mean no more Define node for functions however.

Can't you just define a reusable pass that creates that mapping? We could use it in other places as well, so that would be useful. We can't really do away with Define nodes because functions are scoped, at least nested ones. And carrying around/updating a map all over the place isn't necessarily a win, as far as I can see.

@LPTK
Copy link
Copy Markdown
Contributor

LPTK commented Apr 1, 2026

When I try to run the tests locally, I get in CamlLightTest.mls:

//│ ═══[RUNTIME ERROR] RangeError: Maximum call stack size exceeded

Does inlining remove tail-recursion optimization opportunities?

@AnsonYeung AnsonYeung changed the title Inliner Inliner, Break Optimization and Fixing UCS Tail Calls Apr 1, 2026
@AnsonYeung AnsonYeung marked this pull request as ready for review April 2, 2026 10:57
@AnsonYeung
Copy link
Copy Markdown
Contributor Author

I'm not sure whether SymbolRefresher really handles everything. The logic is quite weird since a symbol can be introduced in many weird ways.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces an IR-level inliner in the codegen optimization pipeline and adds control-flow cleanups (notably eliminating redundant breaks/labels) to keep generated IR/JS smaller and to avoid UCS consequent-sharing patterns that accidentally move tail calls out of tail position (addressing the UCS part of #435). It also wires the feature into diff-test configuration with :noInline / :inlineThreshold directives and updates many golden test outputs accordingly.

Changes:

  • Add configurable inlining (Config.Inliner) and run it from BlockSimplifier, with diff-test directives to toggle/parameterize it.
  • Add/extend control-flow simplifications (useless-break removal, match/begin simplifications) and UCS normalization tweaks to preserve tail positions under sharing.
  • Update a large set of expected JS/IR outputs and add new inliner-focused test coverage.

Reviewed changes

Copilot reviewed 71 out of 71 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
hkmc2DiffTests/src/test/scala/hkmc2/MLsDiffMaker.scala Adds :noInline and :inlineThreshold directives and wires them into Config.inlining.
hkmc2DiffTests/src/test/scala/hkmc2/JSBackendDiffMaker.scala Adjusts BlockSimplifier invocation to provide required contextual parameters.
hkmc2/shared/src/test/mlscript/wasm/ControlFlow.mls Updates wasm golden output for new control-flow behavior.
hkmc2/shared/src/test/mlscript/ucs/patterns/RestTuple.mls Updates JS golden output after break/label optimization changes.
hkmc2/shared/src/test/mlscript/ucs/normalization/SimplePairMatches.mls Updates JS golden output after UCS/codegen control-flow changes.
hkmc2/shared/src/test/mlscript/ucs/normalization/ExcessiveDeduplication.mls Updates golden outputs; adds a regression-style case around optimization behavior.
hkmc2/shared/src/test/mlscript/ucs/normalization/DeduplicationWhile.mls Updates while + deduplication golden output.
hkmc2/shared/src/test/mlscript/ucs/normalization/Deduplication.mls Updates golden outputs for redundant break removal in switches/matches.
hkmc2/shared/src/test/mlscript/ucs/general/JoinPoints.mls Updates golden outputs reflecting simpler join-point codegen.
hkmc2/shared/src/test/mlscript/ucs/general/BooleanPatterns.mls Updates switch golden output after break elision.
hkmc2/shared/src/test/mlscript/tailrec/TailRecOpt.mls Updates tailrec golden output; shows interaction with inlining outputs.
hkmc2/shared/src/test/mlscript/optim/AbortivePrefix.mls Updates IR golden output for abortive-prefix control flow representation.
hkmc2/shared/src/test/mlscript/opt/DeadVarRemoval.mls Updates optimized IR golden output (inlined temp/value naming/structure).
hkmc2/shared/src/test/mlscript/opt/DeadSelRemoval.mls Updates golden output after dead-branch simplification.
hkmc2/shared/src/test/mlscript/opt/DeadObjRemoval.mls Updates golden output after inlining affects trivial call sites.
hkmc2/shared/src/test/mlscript/llir/ControlFlow.mls Updates LLIR golden output for simplified match branches.
hkmc2/shared/src/test/mlscript/llir/BasicCpp.mls Updates C++ golden output for match/default codegen changes.
hkmc2/shared/src/test/mlscript/lifter/Loops.mls Updates golden output after control-flow simplification in lifted code.
hkmc2/shared/src/test/mlscript/lifter/FunInFun.mls Updates golden output after control-flow simplification in nested functions.
hkmc2/shared/src/test/mlscript/invalml/InvalMLCodeGen.mls Updates golden output (redundant breaks removed).
hkmc2/shared/src/test/mlscript/handlers/StackSafety.mls Updates golden output (breaks removed in handler-generated state machines).
hkmc2/shared/src/test/mlscript/handlers/SavedVars.mls Updates golden output (breaks removed in handler-generated state machines).
hkmc2/shared/src/test/mlscript/handlers/RecursiveHandlers.mls Updates golden output (breaks removed in handler-generated state machines).
hkmc2/shared/src/test/mlscript/handlers/NoStackSafety.mls Updates golden output; shows inlining effects even under warning scenarios.
hkmc2/shared/src/test/mlscript/handlers/NonLocalReturns.mls Updates expected diagnostics/output; adds TODO note about inliner interaction.
hkmc2/shared/src/test/mlscript/handlers/Effects.mls Updates handler golden output for break removal and line shifts.
hkmc2/shared/src/test/mlscript/handlers/Debugging.mls Updates golden stack traces/line numbers and handler control-flow output.
hkmc2/shared/src/test/mlscript/deforest/todos.mls Updates deforest golden output after symbol refresh/flow changes.
hkmc2/shared/src/test/mlscript/deforest/simple.mls Updates deforest golden output (name hygiene/ids affected by new passes).
hkmc2/shared/src/test/mlscript/deforest/nestedMatch.mls Updates deforest golden output (name/id shifts).
hkmc2/shared/src/test/mlscript/deforest/determinism.mls Updates deforest golden output after symbol refreshing and simplification.
hkmc2/shared/src/test/mlscript/deforest/clashes.mls Updates deforest golden output (inlined value naming).
hkmc2/shared/src/test/mlscript/deforest/basic.mls Updates deforest golden output; reflects symbol-refresh and match simplifications.
hkmc2/shared/src/test/mlscript/codegen/Throw.mls Updates JS golden output for inlining behavior around throw.
hkmc2/shared/src/test/mlscript/codegen/SwitchSpecialization.mls Updates JS golden output after redundant break removal.
hkmc2/shared/src/test/mlscript/codegen/ScopedBlocksAndHandlers.mls Updates JS golden output; shows inlining affecting handler-local lambdas.
hkmc2/shared/src/test/mlscript/codegen/SanityChecks.mls Updates sanitized JS golden output; shows inlining at call sites.
hkmc2/shared/src/test/mlscript/codegen/PartialApps.mls Updates JS golden output (parameter hygiene changes).
hkmc2/shared/src/test/mlscript/codegen/ParamClasses.mls Updates JS golden output for inlined trivial wrapper calls.
hkmc2/shared/src/test/mlscript/codegen/MergeMatchArms.mls Updates golden output after join-point/break optimization.
hkmc2/shared/src/test/mlscript/codegen/InnerNameHygiene.mls Disables inlining for name-hygiene test stability.
hkmc2/shared/src/test/mlscript/codegen/Inliner.mls New test file covering inliner behavior and limits.
hkmc2/shared/src/test/mlscript/codegen/InlineLambdas.mls Disables inlining to keep lambda-specific expectations stable.
hkmc2/shared/src/test/mlscript/codegen/Getters.mls Disables inlining to keep getter-related expectations stable.
hkmc2/shared/src/test/mlscript/codegen/FunInClass.mls Updates golden output for inlining within class method bodies / closures.
hkmc2/shared/src/test/mlscript/codegen/Functions.mls Disables inlining for a specific initialization-order regression test.
hkmc2/shared/src/test/mlscript/codegen/FirstClassFunctionTransform.mls Disables inlining to keep :checkIR repro stable; updates line numbers.
hkmc2/shared/src/test/mlscript/codegen/CaseShorthand.mls Updates JS golden output (scrutinee naming hygiene).
hkmc2/shared/src/test/mlscript/block-staging/Syntax.mls Updates staged IR golden output after begin/match simplification.
hkmc2/shared/src/test/mlscript/basics/MemberProjections.mls Disables inlining for a member-projection error repro; updates golden output.
hkmc2/shared/src/test/mlscript/basics/CompanionModules_Functions.mls Updates expected runtime error output after codegen changes.
hkmc2/shared/src/test/mlscript/backlog/ToTriage.mls Disables inlining for a tail-position triage case; updates line numbers.
hkmc2/shared/src/test/mlscript/backlog/TailRecFailure.mls Updates lowered IR golden output demonstrating tailrec fix under sharing.
hkmc2/shared/src/test/mlscript/backlog/Lifter.mls Disables inlining for a lifter-related repro; updates line numbers.
hkmc2/shared/src/test/mlscript/apps/parsing/CamlLightTest.mls Moves “example of” output blocks; updates expected transcript positioning.
hkmc2/shared/src/test/mlscript-compile/Predef.mjs Updates compiled JS runtime output (notably equals) to reflect new codegen.
hkmc2/shared/src/test/mlscript-compile/apps/parsing/Lexer.mls Enables lifting and rewrites some calls to preserve tail-call optimization.
hkmc2/shared/src/main/scala/hkmc2/semantics/ucs/Normalization.scala Avoids introducing break/result-temp when continuation is already return/throw (tail-position preservation).
hkmc2/shared/src/main/scala/hkmc2/MLsCompiler.scala Adjusts BlockSimplifier invocation to provide proper contextual scoping.
hkmc2/shared/src/main/scala/hkmc2/Config.scala Adds Config.Inliner and new inlining config field with default.
hkmc2/shared/src/main/scala/hkmc2/codegen/wasm/text/WatBuilder.scala Treats missing match default as nop during lowering.
hkmc2/shared/src/main/scala/hkmc2/codegen/SymbolRefresher.scala New utility for refreshing symbols during transformations/inlining.
hkmc2/shared/src/main/scala/hkmc2/codegen/Printer.scala Adds printing support for AssignDynField.
hkmc2/shared/src/main/scala/hkmc2/codegen/Lowering.scala Updates comment explaining mayRet in lowering context.
hkmc2/shared/src/main/scala/hkmc2/codegen/js/JSBuilder.scala Switch/match codegen tweaks: tolerate no-default, and avoid redundant break in abortive arms.
hkmc2/shared/src/main/scala/hkmc2/codegen/deforest/Rewrite.scala Replaces ad-hoc symbol refresh logic with SymbolRefresher and tightens assumptions.
hkmc2/shared/src/main/scala/hkmc2/codegen/BufferableTransform.scala Refreshes/rebinds parameter symbols while rewriting bufferable classes.
hkmc2/shared/src/main/scala/hkmc2/codegen/BlockTraverser.scala Traverses disambiguation term symbols in Value.Ref for correctness of analyses.
hkmc2/shared/src/main/scala/hkmc2/codegen/BlockTransformer.scala Routes Value.Ref local rewriting through applyLocal to support custom remappers.
hkmc2/shared/src/main/scala/hkmc2/codegen/BlockSimplifier.scala Adds inliner and label/break optimizations; drives inlining based on Config.inlining.
hkmc2/shared/src/main/scala/hkmc2/codegen/Block.scala Simplifies Match/Begin construction when defaults/arms are empty and when rest is empty.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +187 to +191
// * Remove useless break
case Break(label) if tailLabels.contains(label) =>
tl.log(s"Break ${label} is eliminated: current tail label list is ${tailLabels}")
registerChange
End()
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tl.log(...) is used here, but tl is not in scope. BlockSimplifier currently takes an unnamed contextual TL, so this will not compile. Either name the contextual parameter (e.g., (using ..., tl: TL)) or replace these calls with summon[TL].log(...) (and similarly for the other tl.log usages in this file).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tl is a methods defined in hkmc2/shared/src/main/scala/utils/TraceLogger.scala.

Comment on lines +262 to +270
def matchArgs(args: List[Arg], params: ParamList): Option[List[(VarSymbol, Result)]] =
if args.exists(_.spread.isDefined) then
// we require a precise match when any arg is a spread arg
if params.restParam.isEmpty then return N
if args.exists(_.spread.exists(!_.isEager)) then return N
val pairs = args.zip(params.params.iterator.map((_, false)) ++ params.restParam.map((_, true)))
if pairs.exists((arg, param) => arg.spread.isDefined =/= param._2) then return N
S(pairs.map((arg, param) => (param._1.sym, arg.value)))
else
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matchArgs claims to require a “precise match when any arg is a spread arg”, but it never checks that args.size matches the expected arity (fixed params + the single rest param). Because zip truncates to the shorter list, extra call arguments can be silently ignored and missing call arguments won’t be detected, which can lead to incorrect inlining/misbinding. Add an explicit size/shape check before zipping (and reject calls with more than one spread).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AnsonYeung What happens when there is a length mismatch and zip drops some elements from either side? Is the call semantics of the non-inlined version preserved? Please add a comment about that as well as regression tests for both cases.

case bms: BlockMemberSymbol =>
val newBms = new BlockMemberSymbol(bms.nme, Nil, bms.nameIsMeaningful)
newBms.tsym = bms.tsym.map: t =>
val newOwner = t.owner.map(o => existingMapping.getOrElse(o, o).asInstanceOf)
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newOwner is computed via existingMapping.getOrElse(o, o).asInstanceOf (context-inferred cast). If existingMapping ever contains a non-TermSymbol mapping for an owner, this will throw at runtime. Consider using a type-safe match (e.g., existingMapping.get(o).collect { case ts: TermSymbol => ts }.getOrElse(o)) so unexpected mappings are handled explicitly.

Suggested change
val newOwner = t.owner.map(o => existingMapping.getOrElse(o, o).asInstanceOf)
val newOwner = t.owner.map: o =>
existingMapping.get(o) match
case Some(ts: TermSymbol) => ts
case _ => o

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code proposed by Copilot does seem better?

Copy link
Copy Markdown
Contributor

@LPTK LPTK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! The code looks pretty clean, although still under-documented.

Please address the comments, and otherwise LGTM.

@@ -0,0 +1,242 @@
:js

:sjs
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use :soir consistently, instead of :sjs? Here, and everywhere else in the PR.

//│ };
//│ —————————————————| Output |—————————————————————————————————————————————————————————————————————————

:soir
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
:soir
// * TODO: support spread arguments like this one
:soir

Comment on lines +187 to +191
// * Remove useless break
case Break(label) if tailLabels.contains(label) =>
tl.log(s"Break ${label} is eliminated: current tail label list is ${tailLabels}")
registerChange
End()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tl is a methods defined in hkmc2/shared/src/main/scala/utils/TraceLogger.scala.

Comment on lines +262 to +270
def matchArgs(args: List[Arg], params: ParamList): Option[List[(VarSymbol, Result)]] =
if args.exists(_.spread.isDefined) then
// we require a precise match when any arg is a spread arg
if params.restParam.isEmpty then return N
if args.exists(_.spread.exists(!_.isEager)) then return N
val pairs = args.zip(params.params.iterator.map((_, false)) ++ params.restParam.map((_, true)))
if pairs.exists((arg, param) => arg.spread.isDefined =/= param._2) then return N
S(pairs.map((arg, param) => (param._1.sym, arg.value)))
else
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AnsonYeung What happens when there is a length mismatch and zip drops some elements from either side? Is the call semantics of the non-inlined version preserved? Please add a comment about that as well as regression tests for both cases.

case bms: BlockMemberSymbol =>
val newBms = new BlockMemberSymbol(bms.nme, Nil, bms.nameIsMeaningful)
newBms.tsym = bms.tsym.map: t =>
val newOwner = t.owner.map(o => existingMapping.getOrElse(o, o).asInstanceOf)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code proposed by Copilot does seem better?

Comment on lines +175 to +177
tailLabels += lbl
val result = applyBlock(bod)
tailLabels -= lbl
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use freshLabelCtx/nestLabelCtx?

Comment on lines +213 to +215
if (scrut2 is scrut) &&
(arms2 is arms) &&
(dflt2 is dflt) && (rst2 is rst)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (scrut2 is scrut) &&
(arms2 is arms) &&
(dflt2 is dflt) && (rst2 is rst)
if (scrut2 is scrut)
&& (arms2 is arms)
&& (dflt2 is dflt)
&& (rst2 is rst)

if (sub2 is sub) && (rst2 is rst) then b else Begin(sub2, rst2)

case Match(scrut, arms, dflt, _: End) => super.applyBlock(b)
case Match(scrut, arms, dflt, rst) =>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of duplicating this error-prone code, can we just modify BlockTransformer so that it uses an applyMatchArmBlock in these positions, that we can easily override? Try to keep it DRY.

):
def isPrivate = !symbolsToPreserve.contains(defn.sym)

def canBeInlineEliminated =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this doc correct?

Suggested change
def canBeInlineEliminated =
/** Whether the definition can be removed 1from the program after it has been inlined. */
def canBeInlineEliminated =

or please update with a more accurate doc.

def isPrivate = !symbolsToPreserve.contains(defn.sym)

def canBeInlineEliminated =
isPrivate && !isMethod && useCount <= 1 && !hasNakedRef && !isLoopBreaker
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we eliminate functions used several times?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants