Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: rust-lang/rust
Failed to load repositories. Confirm that selected base ref is valid, then try again.
Loading
base: master
Choose a base ref
...
head repository: pnkfelix/rust
Failed to load repositories. Confirm that selected head ref is valid, then try again.
Loading
compare: copy-prop-upvar-to-elim-local-for-62958
Choose a head ref
Can’t automatically merge. Don’t worry, you can still create the pull request.
  • 6 commits
  • 18 files changed
  • 1 contributor

Commits on Aug 17, 2023

  1. refactoring: move replace_base utility function to root of rustc_mi…

    …r_transform crate.
    
    incorporated review feedback: simplify replace_base via Place::project_deeper
    pnkfelix committed Aug 17, 2023
    Copy the full SHA
    6277dbb View commit details

Commits on Aug 18, 2023

  1. UpvarToLocalProp is a MIR optimization for Future size blowup.

    Problem Overview
    ----------------
    
    Consider: `async fn(x: param) { a(&x); b().await; c(&c); }`. It desugars to:
    
    `fn(x: param) -> impl Future { async { let x = x; a(&x); b().await; c(&c); }`
    
    and within that desugared form, the `let x = x;` ends up occupying *two*
    distinct slots in the generator: one for the upvar (the right-hand side) and one
    for the local (the left-hand side).
    
    The UpvarToLocalProp MIR transformation tries to detect the scenario where we
    have a generator with upvars (which look like `self.field` in the MIR) that are
    *solely* moved/copied to non-self locals (and those non-self locals may likewise
    be moved/copied to other locals). After identifying the full set of locals that
    are solely moved/copied from `self.field`, the transformation replaces them all
    with `self.field` itself.
    
    Note 1: As soon as you have a use local L that *isn't* a local-to-local
    assignment, then that is something you need to respect, and the propagation will
    stop trying to propagate L at that point. (And likewise, writes to the
    self-local `_1`, or projections thereof, need to be handled with care as well.)
    
    Note 2: `_0` is the special "return place" and should never be replaced with
    `self.field`.
    
    In addition, the UpvarToLocalProp transformation removes all the silly
    `self.field = self.field;` assignments that result from the above replacement
    (apparently some later MIR passes try to double-check that you don't have any
    assignments with overlapping memory, so it ends up being necessary to do this
    no-op transformation to avoid assertions later).
    
    Note 3: This transformation is significantly generalized past what I
    demonstrated on youtube; the latter was focused on matching solely `_3 = _1.0`,
    because it was a proof of concept to demostrate that a MIR transformation
    prepass even *could* address the generator layout problem.
    
    Furthermore, the UpvarToLocalProp transformation respects optimization fuel: you
    can use `-Z fuel=$CRATE=$FUEL` and when the fuel runs out, the transformation
    will stop being applied, or be applied only partially.
    
    Note 4: I did not put the optimization fuel check in the patching code for
    UpvarToLocalProp: once you decide to replace `_3` with `_1.0` in `_3 = _1.0;`,
    you are committed to replacing all future reads of `_3` with `_1.0`, and it
    would have complicated the patch transformation to try to use fuel with that
    level of control there. Instead, the way I used the fuel was to have it control
    how many local variables are added to the `local_to_root_upvar_and_ty` table,
    which is the core database that informs the patching process, and thus gets us
    the same end effect (of limiting the number of locals that take part in the
    transformation) in a hopefully sound manner.
    
    Note 5: Added check that we do not ever call `visit_local` on a local that is
    being replaced. This way we hopefully ensure that we will ICE if we ever forget
    to replace one.
    
    But also: I didnt think I needed to recur on place, but failing to do so meant I
    overlooked locals in the projection. So now I recur on place.
    
    Satisfying above two changes did mean we need to be more aggressive about
    getting rid of now useless StorageLive and StorageDead on these locals.
    
    Note 6: Derefs invalidate replacement attempts in any context, not just
    mutations.
    
    Updates
    -------
    
    rewrote saw_upvar_to_local.
    
    Namely, unified invalidation control paths (because I realized that when looking
    at `_l = _1.field`, if you need to invalidate either left- or right-hand side,
    you end up needing to invalidate both).
    
    Also made logic for initializing the upvar_to_ty map more robust: Instead of
    asserting that we encounter each upvar at most once (because, when chains stop
    growing, we cannot assume that), now just ensure that the types we end up
    inserting are consistent. (Another alternative would be to bail out of the
    routine if the chain is not marked as growing; I'm still debating about which of
    those two approaches yields better code here.)
    
    Fixed a bug in how I described an invariant on `LocalState::Ineligible`.
    
    Updated to respect -Zmir_opt_level=0
    pnkfelix committed Aug 18, 2023
    Copy the full SHA
    f0cbd03 View commit details
  2. InlineFutureIntoFuture is a peephole optimization improving UpvarToLo…

    …calProp.
    
    Problem Overview
    ----------------
    
    When `async fn` is desugared, there's a whole bunch of local-to-local moves that
    are easily identified and eliminated. However, there is one exception: the
    sugaring of `.await` does `a = IntoFuture::into_future(b);`, and that is no longer obviously a move from the viewpoint of the analysis.
    
    However, for all F: Future, `<F as IntoFuture>::into_future(self)` is
    "guaranteed" to be the identity function that returns `self`.
    
    So: this matches `a = <impl Future as IntoFuture>::into_future(b);` and replaces
    it with `a = b;`, based on reasoning that libcore's blanket implementation of
    IntoFuture for impl Future is an identity function that takes `self` by value.
    
    This transformation, in tandem with UpvarToLocalProp, is enough to address both
    case 1 and case 2 of Rust issue 62958.
    
    InlineFutureIntoFuture respects optimization fuel, same as UpvarToLocalProp
    (much simpler to implement in this case).
    
    inline-future-into-future: improved comments during code walk for a rubber duck.
    
    MERGEME inline_future_into_future revised internal instrumentation to print out arg0 type
    
    (because that is what is really going to matter and I should be doing more to
    let it drive the analysis.)
    
    Updates
    -------
    
    respect -Zmir_opt_level=0
    pnkfelix committed Aug 18, 2023
    Copy the full SHA
    e88e117 View commit details
  3. Updated the future-sizes tests to opt-into the MIR transformations (r…

    …egardless
    
    of their default settings) and then updated the numbers to reflect the
    improvements those transformations now yield.
    pnkfelix committed Aug 18, 2023
    Copy the full SHA
    83e15f8 View commit details
  4. Updated the print-type-sizes async.rs test to opt into the new MIR

    transformations added here (regardless of their default settings) and updated
    the numbers in the output to reflect the improvements those transformations
    yield.
    pnkfelix committed Aug 18, 2023
    Copy the full SHA
    6dbe78d View commit details
  5. Some local unit tests to track progress and capture interesting cases…

    … as I identify them.
    
    issue-62958-c.rs was reduced from the tracing-attributes proc-macro crate.
    
    issue-62958-d.rs was reduced from the doc test attached to `AtomicPtr::from_mut_slice`.
    
    issue-62958-e.rs covers some important operational characteristics.
    pnkfelix committed Aug 18, 2023
    Copy the full SHA
    f4a1d4b View commit details
9 changes: 1 addition & 8 deletions compiler/rustc_mir_transform/src/generator.rs
Original file line number Diff line number Diff line change
@@ -182,14 +182,7 @@ impl<'tcx> MutVisitor<'tcx> for PinArgVisitor<'tcx> {
}
}

fn replace_base<'tcx>(place: &mut Place<'tcx>, new_base: Place<'tcx>, tcx: TyCtxt<'tcx>) {
place.local = new_base.local;

let mut new_projection = new_base.projection.to_vec();
new_projection.append(&mut place.projection.to_vec());

place.projection = tcx.mk_place_elems(&new_projection);
}
use crate::replace_base;

const SELF_ARG: Local = Local::from_u32(1);

162 changes: 162 additions & 0 deletions compiler/rustc_mir_transform/src/inline_future_into_future.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,162 @@
//! Converts `y = <Future as IntoFuture>::into_future(x);` into just `y = x;`,
//! since we "know" that matches the behavior of the blanket implementation of
//! IntoFuture for F where F: Future.
//!
//! FIXME: determine such coalescing is sound. In particular, check whether
//! specialization could foil our plans here!
//!
//! This is meant to enhance the effectiveness of the upvar-to-local-prop
//! transformation in reducing the size of the generators constructed by the
//! compiler.
use crate::MirPass;
use rustc_index::IndexVec;
use rustc_middle::mir::interpret::ConstValue;
use rustc_middle::mir::visit::MutVisitor;
use rustc_middle::mir::*;
use rustc_middle::ty::{self, Ty, TyCtxt};
use rustc_span::def_id::DefId;

pub struct InlineFutureIntoFuture;
impl<'tcx> MirPass<'tcx> for InlineFutureIntoFuture {
fn is_enabled(&self, sess: &rustc_session::Session) -> bool {
sess.mir_opt_level() > 0 // on by default w/o -Zmir-opt-level=0
}

fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
let Some(into_future_fn_def_id) = tcx.lang_items().into_future_fn() else { return; };
let Some(future_trait_def_id) = tcx.lang_items().future_trait() else { return; };
let mir_source_def_id = body.source.def_id();
trace!("Running InlineFutureIntoFuture on {:?}", body.source);
let local_decls = body.local_decls().to_owned();
let mut v = Inliner {
tcx,
into_future_fn_def_id,
future_trait_def_id,
mir_source_def_id,
local_decls,
};
v.visit_body(body);
}
}

struct Inliner<'tcx> {
tcx: TyCtxt<'tcx>,
mir_source_def_id: DefId,
into_future_fn_def_id: DefId,
future_trait_def_id: DefId,
local_decls: IndexVec<Local, LocalDecl<'tcx>>,
}

#[derive(Copy, Clone, PartialEq, Eq)]
enum FoundImplFuture {
Yes,
No,
}

#[derive(Copy, Clone, PartialEq, Eq)]
enum FoundIntoFutureCall {
Yes,
No,
}

struct ImplFutureCallingIntoFuture<'tcx> {
args: Vec<Operand<'tcx>>,
destination: Place<'tcx>,
target: Option<BasicBlock>,
}

impl<'tcx> Inliner<'tcx> {
// This verifies that `ty` implements `Future`, according to the where
// clauses (i.e. predicates) attached to the source code identified by
// `mir_source_def_id`).
fn does_ty_impl_future(&self, ty: Ty<'tcx>) -> FoundImplFuture {
let mir_source_predicates = self.tcx.predicates_of(self.mir_source_def_id);
let predicates = mir_source_predicates.instantiate_identity(self.tcx);
for pred in &predicates.predicates {
let Some(kind) = pred.kind().no_bound_vars() else { continue; };
let ty::ClauseKind::Trait(trait_pred) = kind else { continue; };
let ty::TraitPredicate { trait_ref, polarity: ty::ImplPolarity::Positive } = trait_pred else { continue };

// FIXME: justify ignoring `substs` below. My current argument is
// that `trait Future` has no generic parameters, and the blanket
// impl of `IntoFuture` for all futures does not put any constrants
// on the associated items of those futures. But it is worth running
// this by a trait system expert to validate.
let ty::TraitRef { def_id: trait_def_id, .. } = trait_ref;
let self_ty = trait_ref.self_ty();
if trait_def_id == self.future_trait_def_id {
if self_ty == ty {
return FoundImplFuture::Yes;
}
}
}
FoundImplFuture::No
}
}

impl<'tcx> MutVisitor<'tcx> for Inliner<'tcx> {
fn tcx<'a>(&'a self) -> TyCtxt<'tcx> {
self.tcx
}
fn visit_basic_block_data(&mut self, _bb: BasicBlock, bb_data: &mut BasicBlockData<'tcx>) {
let Some(term) = &mut bb_data.terminator else { return; };
let Some(result) = self.analyze_terminator(term) else { return; };
let ImplFutureCallingIntoFuture {
args, destination: dest, target: Some(target)
} = result else { return; };

// At this point, we have identified this terminator as a call to the
// associated function `<impl Future as IntoFuture>::into_future`
// Due to our knowledge of how libcore implements Future and IntoFuture,
// we know we can replace such a call with a trivial move.

let Some(arg0) = args.get(0) else { return; };

trace!("InlineFutureIntoFuture bb_data args:{args:?} dest:{dest:?} target:{target:?}");

bb_data.statements.push(Statement {
source_info: term.source_info,
kind: StatementKind::Assign(Box::new((dest, Rvalue::Use(arg0.clone())))),
});
term.kind = TerminatorKind::Goto { target }
}
}

impl<'tcx> Inliner<'tcx> {
fn analyze_terminator(
&mut self,
term: &mut Terminator<'tcx>,
) -> Option<ImplFutureCallingIntoFuture<'tcx>> {
let mut found = (FoundImplFuture::No, FoundIntoFutureCall::No);
let &TerminatorKind::Call {
ref func, ref args, destination, target, fn_span: _, unwind: _, call_source: _
} = &term.kind else { return None; };
let Operand::Constant(c) = func else { return None; };
let ConstantKind::Val(val_const, const_ty) = c.literal else { return None; };
let ConstValue::ZeroSized = val_const else { return None; };
let ty::FnDef(fn_def_id, substs) = const_ty.kind() else { return None; };
if *fn_def_id == self.into_future_fn_def_id {
found.1 = FoundIntoFutureCall::Yes;
} else {
trace!("InlineFutureIntoFuture bail as this is not `into_future` invocation.");
return None;
}
let arg0_ty = args.get(0).map(|arg0| arg0.ty(&self.local_decls, self.tcx()));
trace!("InlineFutureIntoFuture substs:{substs:?} args:{args:?} arg0 ty:{arg0_ty:?}");
let Some(arg0_ty) = arg0_ty else { return None; };
found.0 = self.does_ty_impl_future(arg0_ty);
if let (FoundImplFuture::Yes, FoundIntoFutureCall::Yes) = found {
trace!("InlineFutureIntoFuture can replace {term:?}, a {func:?} call, with move");
if !self.tcx.consider_optimizing(|| {
format!("InlineFutureIntoFuture {:?}", self.mir_source_def_id)
}) {
return None;
}
let args = args.clone();
Some(ImplFutureCallingIntoFuture { args, destination, target })
} else {
None
}
}
}
13 changes: 13 additions & 0 deletions compiler/rustc_mir_transform/src/lib.rs
Original file line number Diff line number Diff line change
@@ -76,6 +76,7 @@ mod ffi_unwind_calls;
mod function_item_references;
mod generator;
pub mod inline;
mod inline_future_into_future;
mod instsimplify;
mod large_enums;
mod lower_intrinsics;
@@ -104,6 +105,7 @@ mod simplify_comparison_integral;
mod sroa;
mod uninhabited_enum_branching;
mod unreachable_prop;
mod upvar_to_local_prop;

use rustc_const_eval::transform::check_consts::{self, ConstCx};
use rustc_const_eval::transform::promote_consts;
@@ -491,6 +493,13 @@ fn run_runtime_lowering_passes<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
// `AddRetag` needs to run after `ElaborateDrops`. Otherwise it should run fairly late,
// but before optimizations begin.
&elaborate_box_derefs::ElaborateBoxDerefs,
// `InlineFutureIntoFuture` needs to run before `UpvarToLocalProp`, because its
// purpose is to enhance the effectiveness of the latter transformation.
&inline_future_into_future::InlineFutureIntoFuture,
// `UpvarToLocalProp` needs to run before `generator::StateTransform`, because its
// purpose is to coalesce locals into their original upvars before fresh space is
// allocated for them in the generator.
&upvar_to_local_prop::UpvarToLocalProp,
&generator::StateTransform,
&add_retag::AddRetag,
&Lint(const_prop_lint::ConstProp),
@@ -626,3 +635,7 @@ fn promoted_mir(tcx: TyCtxt<'_>, def: LocalDefId) -> &IndexVec<Promoted, Body<'_

tcx.arena.alloc(promoted)
}

fn replace_base<'tcx>(place: &mut Place<'tcx>, new_base: Place<'tcx>, tcx: TyCtxt<'tcx>) {
*place = new_base.project_deeper(&place.projection[..], tcx)
}
Loading