Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ILSpy produces code with a redundant return statement not present in the CIL code #2924

Closed
ElektroKill opened this issue Mar 14, 2023 · 5 comments · Fixed by #2972
Closed
Labels
Bug Decompiler The decompiler engine itself

Comments

@ElektroKill
Copy link
Contributor

Input code

private static Hashtable DuplicateReturnTest(string[] arg)
{
	var results = new Hashtable();

	using (MemoryStream memoryStream = new MemoryStream())
	{
		if (memoryStream != null)
		{
			foreach (string name in arg)
			{

			}
		}
	}

	return results;
}

IL produced by compiler:

.method private hidebysig static 
		class [mscorlib]System.Collections.Hashtable DuplicateReturnTest (
			string[] arg
		) cil managed 
	{
		// Method begins at RVA 0x2050
		// Header size: 12
		// Code size: 49 (0x31)
		.maxstack 2
		.locals init (
			[0] class [mscorlib]System.Collections.Hashtable results,
			[1] class [mscorlib]System.IO.MemoryStream memoryStream,
			[2] string[],
			[3] int32
		)

		IL_0000: newobj instance void [mscorlib]System.Collections.Hashtable::.ctor()
		IL_0005: stloc.0
		IL_0006: newobj instance void [mscorlib]System.IO.MemoryStream::.ctor()
		IL_000b: stloc.1
		.try
		{
			IL_000c: ldloc.1
			IL_000d: brfalse.s IL_0023

			IL_000f: ldarg.0
			IL_0010: stloc.2
			IL_0011: ldc.i4.0
			IL_0012: stloc.3
			IL_0013: br.s IL_001d
			// loop start (head: IL_001d)
				IL_0015: ldloc.2
				IL_0016: ldloc.3
				IL_0017: ldelem.ref
				IL_0018: pop
				IL_0019: ldloc.3
				IL_001a: ldc.i4.1
				IL_001b: add
				IL_001c: stloc.3

				IL_001d: ldloc.3
				IL_001e: ldloc.2
				IL_001f: ldlen
				IL_0020: conv.i4
				IL_0021: blt.s IL_0015
			// end loop

			IL_0023: leave.s IL_002f
		} // end .try
		finally
		{
			IL_0025: ldloc.1
			IL_0026: brfalse.s IL_002e

			IL_0028: ldloc.1
			IL_0029: callvirt instance void [mscorlib]System.IDisposable::Dispose()

			IL_002e: endfinally
		} // end handler

		IL_002f: ldloc.0
		IL_0030: ret
	} // end of method DuplicatedReturn::DuplicateReturnTest

Erroneous output

private static Hashtable DuplicateReturnTest(string[] arg)
{
	Hashtable results = new Hashtable();
	using MemoryStream memoryStream = new MemoryStream();
	if (memoryStream != null)
	{
		for (int i = 0; i < arg.Length; i++)
		{
			_ = arg[i];
		}
		return results;
	}
	return results;
}

There is an additional return statement in this method found inside the if block which also does not appear in the IL of the method.
The difference between foreach/for is expected due to compiler optimization

Details

  • Product in use: ILSpy version 8.0.0.7261-preview3
@ElektroKill ElektroKill added Bug Decompiler The decompiler engine itself labels Mar 14, 2023
@siegfriedpammer
Copy link
Member

This is the code responsible for the behavior:

// We don't want to always inline the return directly, because this
// might force us to place the return within a loop, when it's better
// placed outside.
// But we do want to move the return block into the correct try-finally scope,
// so that loop detection at least has the option to put it inside
// the loop body.
context.Step("Copy return block into try block", branch);
Block blockCopy = (Block)branch.TargetBlock.Clone();
BlockContainer localContainer = branch.Ancestors.OfType<BlockContainer>().First();
blocksToAdd.Add((localContainer, blockCopy));
branch.TargetBlock = blockCopy;

I haven't looked into it, but it might be possible to make it less aggressive by checking whether there is any "useful" code after the end of the finally block other than a return.

Further analysis reveals that early in the decompiler pipeline the code would have been decompiled using an early return, which would not look as strange as the code above does... however, then the ReduceNestingTransform kicks in and actually increases the nesting by moving everything into the if-statement.

@dgrunwald
Copy link
Member

Another option would be to extend the "common exit point" detection to detect both returns as equivalent exit points, and which will cause ConditionDetection to remove the return inside the if. AFAIK we already do this for value-less return;, but not for return local;.

In general, there is no one to one correspondence between IL ret and C# return. The C# compiler often combines multiple return into a single ret (especially because IL ret is not allowed inside try-blocks!), so ILSpy needs to duplicate those again (otherwise functions where the returns were merged often cannot be decompiled without goto).

@ElektroKill
Copy link
Contributor Author

ElektroKill commented Apr 10, 2023

I did some further testing using System.Private.CoreLib from .NET 7.0.4 as a test file (ILSpy version used a compiled from source build from #2959). I decompiled the assembly to a project, modified the step mentioned by @siegfriedpammer in ControlFlowSimplification to be more strict and only copy the return block if it has one incoming edge (this restricted version pretty much removed all return duplication done by this step) verified that this modification leads in all unit tests being passed and proceeded to decompile the same DLL again. After diffing the two directories in WinMerge I got some interesting results.

Decompiler change applied:
The condition here got changed by adding && branch.TargetBlock.IncomingEdgeCount == 1 to it.

else if (branch.TargetContainer != branch.Ancestors.OfType<BlockContainer>().First())
{
// We don't want to always inline the return directly, because this
// might force us to place the return within a loop, when it's better
// placed outside.
// But we do want to move the return block into the correct try-finally scope,
// so that loop detection at least has the option to put it inside
// the loop body.
context.Step("Copy return block into try block", branch);
Block blockCopy = (Block)branch.TargetBlock.Clone();
BlockContainer localContainer = branch.Ancestors.OfType<BlockContainer>().First();
blocksToAdd.Add((localContainer, blockCopy));
branch.TargetBlock = blockCopy;
}

The general conclusion is that the current return duplication is way too aggressive and can lead to code that is often times polluted with return statements or different from the source code (eg. source code uses a single return with a variable while ILSpy optimizes this variable out but inserts tons of return statements). In all cases where this change led to a difference between the unmodified decompiler and the modified decompiler, the new code generation matched the source code from https://github.com/dotnet/runtime/tree/v7.0.4 exactly or bore a stronger resemblance to it.

Here are two examples of the original source code compared against the two decompiler versions tested:

Example: SemaphoreSlim.Release(int)

Original source code:

public int Release(int releaseCount)
{
    CheckDispose();

    if (releaseCount < 1)
    {
        throw new ArgumentOutOfRangeException(
            nameof(releaseCount), releaseCount, SR.SemaphoreSlim_Release_CountWrong);
    }
    int returnCount;

    lock (m_lockObjAndDisposed)
    {
        // Read the m_currentCount into a local variable to avoid unnecessary volatile accesses inside the lock.
        int currentCount = m_currentCount;
        returnCount = currentCount;

        // If the release count would result exceeding the maximum count, throw SemaphoreFullException.
        if (m_maxCount - currentCount < releaseCount)
        {
            throw new SemaphoreFullException();
        }

        // Increment the count by the actual release count
        currentCount += releaseCount;

        // Signal to any synchronous waiters, taking into account how many waiters have previously been pulsed to wake
        // but have not yet woken
        int waitCount = m_waitCount;
        Debug.Assert(m_countOfWaitersPulsedToWake <= waitCount);
        int waitersToNotify = Math.Min(currentCount, waitCount) - m_countOfWaitersPulsedToWake;
        if (waitersToNotify > 0)
        {
            // Ideally, limiting to a maximum of releaseCount would not be necessary and could be an assert instead, but
            // since WaitUntilCountOrTimeout() does not have enough information to tell whether a woken thread was
            // pulsed, it's possible for m_countOfWaitersPulsedToWake to be less than the number of threads that have
            // actually been pulsed to wake.
            if (waitersToNotify > releaseCount)
            {
                waitersToNotify = releaseCount;
            }

            m_countOfWaitersPulsedToWake += waitersToNotify;
            for (int i = 0; i < waitersToNotify; i++)
            {
                Monitor.Pulse(m_lockObjAndDisposed);
            }
        }

        // Now signal to any asynchronous waiters, if there are any.  While we've already
        // signaled the synchronous waiters, we still hold the lock, and thus
        // they won't have had an opportunity to acquire this yet.  So, when releasing
        // asynchronous waiters, we assume that all synchronous waiters will eventually
        // acquire the semaphore.  That could be a faulty assumption if those synchronous
        // waits are canceled, but the wait code path will handle that.
        if (m_asyncHead is not null)
        {
            Debug.Assert(m_asyncTail is not null, "tail should not be null if head isn't null");
            int maxAsyncToRelease = currentCount - waitCount;
            while (maxAsyncToRelease > 0 && m_asyncHead is not null)
            {
                --currentCount;
                --maxAsyncToRelease;

                // Get the next async waiter to release and queue it to be completed
                TaskNode waiterTask = m_asyncHead;
                RemoveAsyncWaiter(waiterTask); // ensures waiterTask.Next/Prev are null
                waiterTask.TrySetResult(result: true);
            }
        }
        m_currentCount = currentCount;

        // Exposing wait handle if it is not null
        if (m_waitHandle is not null && returnCount == 0 && currentCount > 0)
        {
            m_waitHandle.Set();
        }
    }

    // And return the count
    return returnCount;
}

The original decompiler output:

public int Release(int releaseCount)
{
	CheckDispose();
	if (releaseCount < 1)
	{
		throw new ArgumentOutOfRangeException("releaseCount", releaseCount, SR.SemaphoreSlim_Release_CountWrong);
	}
	lock (m_lockObjAndDisposed)
	{
		int currentCount = m_currentCount;
		int num = currentCount;
		if (m_maxCount - currentCount < releaseCount)
		{
			throw new SemaphoreFullException();
		}
		currentCount += releaseCount;
		int waitCount = m_waitCount;
		int num2 = Math.Min(currentCount, waitCount) - m_countOfWaitersPulsedToWake;
		if (num2 > 0)
		{
			if (num2 > releaseCount)
			{
				num2 = releaseCount;
			}
			m_countOfWaitersPulsedToWake += num2;
			for (int i = 0; i < num2; i++)
			{
				Monitor.Pulse(m_lockObjAndDisposed);
			}
		}
		if (m_asyncHead != null)
		{
			int num3 = currentCount - waitCount;
			while (num3 > 0 && m_asyncHead != null)
			{
				currentCount--;
				num3--;
				TaskNode asyncHead = m_asyncHead;
				RemoveAsyncWaiter(asyncHead);
				asyncHead.TrySetResult(true);
			}
		}
		m_currentCount = currentCount;
		if (m_waitHandle != null)
		{
			if (num == 0)
			{
				if (currentCount > 0)
				{
					m_waitHandle.Set();
					return num;
				}
				return num;
			}
			return num;
		}
		return num;
	}
}

The modified decompiler output:

public int Release(int releaseCount)
{
	CheckDispose();
	if (releaseCount < 1)
	{
		throw new ArgumentOutOfRangeException("releaseCount", releaseCount, SR.SemaphoreSlim_Release_CountWrong);
	}
	int num;
	lock (m_lockObjAndDisposed)
	{
		int currentCount = m_currentCount;
		num = currentCount;
		if (m_maxCount - currentCount < releaseCount)
		{
			throw new SemaphoreFullException();
		}
		currentCount += releaseCount;
		int waitCount = m_waitCount;
		int num2 = Math.Min(currentCount, waitCount) - m_countOfWaitersPulsedToWake;
		if (num2 > 0)
		{
			if (num2 > releaseCount)
			{
				num2 = releaseCount;
			}
			m_countOfWaitersPulsedToWake += num2;
			for (int i = 0; i < num2; i++)
			{
				Monitor.Pulse(m_lockObjAndDisposed);
			}
		}
		if (m_asyncHead != null)
		{
			int num3 = currentCount - waitCount;
			while (num3 > 0 && m_asyncHead != null)
			{
				currentCount--;
				num3--;
				TaskNode asyncHead = m_asyncHead;
				RemoveAsyncWaiter(asyncHead);
				asyncHead.TrySetResult(true);
			}
		}
		m_currentCount = currentCount;
		if (m_waitHandle != null && num == 0 && currentCount > 0)
		{
			m_waitHandle.Set();
		}
	}
	return num;
}
Example: ResourceManager.InternalGetResourceSet(CultureInfo, bool, bool)

Original source code:

protected virtual ResourceSet? InternalGetResourceSet(CultureInfo culture, bool createIfNotExists, bool tryParents)
{
    Debug.Assert(culture != null, "culture != null");
    Debug.Assert(_resourceSets != null);

    Dictionary<string, ResourceSet> localResourceSets = _resourceSets;
    ResourceSet? rs = null;
    CultureInfo? foundCulture = null;
    lock (localResourceSets)
    {
        if (localResourceSets.TryGetValue(culture.Name, out rs))
        {
            return rs;
        }
    }

    ResourceFallbackManager mgr = new ResourceFallbackManager(culture, _neutralResourcesCulture, tryParents);

    foreach (CultureInfo currentCultureInfo in mgr)
    {
        lock (localResourceSets)
        {
            if (localResourceSets.TryGetValue(currentCultureInfo.Name, out rs))
            {
                // we need to update the cache if we fellback
                if (culture != currentCultureInfo) foundCulture = currentCultureInfo;
                break;
            }
        }

        // InternalGetResourceSet will never be threadsafe.  However, it must
        // be protected against reentrancy from the SAME THREAD.  (ie, calling
        // GetSatelliteAssembly may send some window messages or trigger the
        // Assembly load event, which could fail then call back into the
        // ResourceManager).  It's happened.

        rs = _resourceGroveler.GrovelForResourceSet(currentCultureInfo, localResourceSets,
                                                   tryParents, createIfNotExists);

        // found a ResourceSet; we're done
        if (rs != null)
        {
            foundCulture = currentCultureInfo;
            break;
        }
    }

    if (rs != null && foundCulture != null)
    {
        // add entries to the cache for the cultures we have gone through

        // currentCultureInfo now refers to the culture that had resources.
        // update cultures starting from requested culture up to the culture
        // that had resources.
        foreach (CultureInfo updateCultureInfo in mgr)
        {
            AddResourceSet(localResourceSets, updateCultureInfo.Name, ref rs);

            // stop when we've added current or reached invariant (top of chain)
            if (updateCultureInfo == foundCulture)
            {
                break;
            }
        }
    }

    return rs;
}

The original decompiler output:

protected virtual ResourceSet? InternalGetResourceSet(CultureInfo culture, bool createIfNotExists, bool tryParents)
{
	Dictionary<string, ResourceSet> resourceSets = _resourceSets;
	ResourceSet value = null;
	CultureInfo cultureInfo = null;
	lock (resourceSets)
	{
		if (resourceSets.TryGetValue(culture.Name, out value))
		{
			return value;
		}
	}
	ResourceFallbackManager resourceFallbackManager = new ResourceFallbackManager(culture, _neutralResourcesCulture, tryParents);
	foreach (CultureInfo item in resourceFallbackManager)
	{
		lock (resourceSets)
		{
			if (resourceSets.TryGetValue(item.Name, out value))
			{
				if (culture != item)
				{
					cultureInfo = item;
				}
				break;
			}
		}
		value = _resourceGroveler.GrovelForResourceSet(item, resourceSets, tryParents, createIfNotExists);
		if (value != null)
		{
			cultureInfo = item;
			break;
		}
	}
	if (value != null && cultureInfo != null)
	{
		foreach (CultureInfo item2 in resourceFallbackManager)
		{
			AddResourceSet(resourceSets, item2.Name, ref value);
			if (item2 == cultureInfo)
			{
				return value;
			}
		}
		return value;
	}
	return value;
}

The modified decompiler output:

protected virtual ResourceSet? InternalGetResourceSet(CultureInfo culture, bool createIfNotExists, bool tryParents)
{
	Dictionary<string, ResourceSet> resourceSets = _resourceSets;
	ResourceSet value = null;
	CultureInfo cultureInfo = null;
	lock (resourceSets)
	{
		if (resourceSets.TryGetValue(culture.Name, out value))
		{
			return value;
		}
	}
	ResourceFallbackManager resourceFallbackManager = new ResourceFallbackManager(culture, _neutralResourcesCulture, tryParents);
	foreach (CultureInfo item in resourceFallbackManager)
	{
		lock (resourceSets)
		{
			if (resourceSets.TryGetValue(item.Name, out value))
			{
				if (culture != item)
				{
					cultureInfo = item;
				}
				break;
			}
		}
		value = _resourceGroveler.GrovelForResourceSet(item, resourceSets, tryParents, createIfNotExists);
		if (value != null)
		{
			cultureInfo = item;
			break;
		}
	}
	if (value != null && cultureInfo != null)
	{
		foreach (CultureInfo item2 in resourceFallbackManager)
		{
			AddResourceSet(resourceSets, item2.Name, ref value);
			if (item2 == cultureInfo)
			{
				break;
			}
		}
	}
	return value;
}

The following .zip file contains the compiled assembly I used for the test, the source codes produced by the original decompiler (#2959) as well as the patched decompiler (changes described above), a regular patch file to show the differences, and a HTML diff report generated by WinMerge.
results.zip

Overall this change looks like it positively impacts the decompiler output without any consequences. I am yet to find code that produces less accurate results with the aforementioned change applied. If this change is acceptable and the green light is given I can make a PR where I apply the change proposed above in perhaps, a more optimal way (block cloning will no longer be necessary as this patch pretty much makes this transformation equivalent to moving the block into a different parent container).

Feedback on my proposed change and research is greatly appreciated!

@dgrunwald
Copy link
Member

I'm pretty sure there used to be a reason why this code was cloning, not just moving.
I thought that code like:

try {
   if (a) return 1;
   if (b) return 2;
} finally { ... }

would result in both return statements using the same ret instruction. Is that no longer the case?
Even if this changed, we still need to support assemblies compiled by older C# compiler versions where this used to be the case.

I would expect that your change causes the tests to fail. Although there's some really weird interactions with the InlineReturnTransform which is someone else's re-implementation of the same return-duplicating logic (for some other unrelated reason), so maybe that saves the cases that this code in ControlFlowSimplification was meant to address?

@ElektroKill
Copy link
Contributor Author

All the tests definitely do pass! image

I also went ahead and tested the following code (based on the code snippet you suggested):

public int M(bool a, bool b) {
	try {
		if (a) return 1;
		if (b) return 2;
	} finally {
		Console.WriteLine();
	}
	return 0;
}

IL code produced:

.try
{
    IL_0000: ldarg.1
    IL_0001: brfalse.s IL_0007

    IL_0003: ldc.i4.1
    IL_0004: stloc.0
    IL_0005: leave.s IL_0018

    IL_0007: ldarg.2
    IL_0008: brfalse.s IL_000e

    IL_000a: ldc.i4.2
    IL_000b: stloc.0
    IL_000c: leave.s IL_0018

    IL_000e: leave.s IL_0016
} // end .try
finally
{
    IL_0010: call void [System.Console]System.Console::WriteLine()
    IL_0015: endfinally
} // end handler

IL_0016: ldc.i4.0
IL_0017: ret

IL_0018: ldloc.0
IL_0019: ret

and it decompiles into an identical code snippet (ignoring code style differences):

public int M(bool a, bool b)
{
	try
	{
		if (a)
		{
			return 1;
		}
		if (b)
		{
			return 2;
		}
	}
	finally
	{
		Console.WriteLine();
	}
	return 0;
}

Checking the debug steps, it looks like the InlineReturnTransform is duplicating the return block starting at IL_0018 and placing it inside the try region so your assumption was correct!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Bug Decompiler The decompiler engine itself
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants