This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Generic Enum implementation #22161

Closed
TylerBrinkley wants to merge 56 commits

@TylerBrinkley TylerBrinkley commented Jan 23, 2019

From my experience with Enums.NET here is a generic implementation for enums.

No new APIs have been added, but this implementation makes it easy to add new generic APIs once they are approved, which would provide even better performance.

There are two known breaking changes: the parameter name included in thrown ArgumentExceptions differs, and methods that used to throw for float, double, IntPtr, and UIntPtr enums now work as expected.

The order in which exceptions are thrown may also change when a method call has multiple argument errors.

Here are the benchmark results of this generic enum implementation.

Notes:

  • Before is what's in master so these gains are already on top of the gains from Improve performance of Enum.{Try}Parse #21214.
  • Compare, Equals, and HasFlag are not included as their implementation was not changed for performance reasons.
  • For performance reasons the implementation uses a hybrid search: a linear search for enums with 32 members or fewer, and a binary search for enums with more than 32 members. The threshold of 32 was chosen by benchmarking linear vs binary search in the worst-case scenario for linear search.
  • Big is defined as a 36 member enum.
  • Small is defined as a 3 member enum.
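
The hybrid search described in the notes could look roughly like the sketch below. It is an illustration only, not the PR's code: the class and method names are hypothetical, and it assumes the member values are held in a sorted array of distinct `ulong` values.

```csharp
using System;

// Hypothetical helper, not taken from the PR: looks up a value in a sorted
// array of distinct enum values, picking the strategy by member count.
public static class HybridSearch
{
    // Threshold reported in the notes above.
    private const int LinearSearchThreshold = 32;

    // Returns the index of value in sortedValues, or -1 if it is absent.
    public static int IndexOf(ulong[] sortedValues, ulong value)
    {
        if (sortedValues.Length <= LinearSearchThreshold)
        {
            // Small enums: a plain scan.
            for (int i = 0; i < sortedValues.Length; i++)
            {
                if (sortedValues[i] == value)
                {
                    return i;
                }
            }
            return -1;
        }

        // Large enums: binary search over the sorted values.
        int index = Array.BinarySearch(sortedValues, value);
        return index >= 0 ? index : -1;
    }
}
```

With the enum sizes used in the benchmarks, a 3 member enum would take the linear path and a 36 member enum the binary path.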

Summary:

  • GetValues shows a huge performance gain of ~16-58x depending on enum size.
  • IsDefined has ~5.7x performance gain for contiguous enums and ~3.2-5x performance gain for non-contiguous enums depending on enum size.
  • ToString with a defined value has ~1.8-2.9x performance gain depending on enum size.
  • Generic Parse of names has ~1.7x performance gain.
  • Non-generic Parse of names has ~1.9x performance gain.
  • Generic Parse of values has ~1.6x performance gain.
  • Non-generic Parse of values has ~1.8x performance gain.
  • ToObject(Object) has ~2.3x performance gain.
  • ToObject(Int32) has ~3x performance gain.
  • The IConvertible.To* methods have ~3.2x performance gain.

| Method | Before Allocation (bytes) | After Allocation (bytes) | Allocation Reduction | Before Time (ns) | After Time (ns) | Time Improvement |
|---|---|---|---|---|---|---|
| Format(D) | 32 | 32 | 0% | 67.08 | 50.56 | 1.33x |
| Format(G) | 0 | 0 | 0% | 78.98 | 42.79 | 1.85x |
| Format(X) | 40 | 40 | 0% | 76.93 | 60.43 | 1.27x |
| Format(F) | 64 | 64 | 0% | 89.01 | 73.12 | 1.22x |
| GetHashCode | 0 | 0 | 0% | 5.59 | 5.23 | 1.07x |
| GetNameEnumDefined(BigEnum) | 0 | 0 | 0% | 77.78 | 43.37 | 1.79x |
| GetNameEnumDefined(SmallEnum) | 0 | 0 | 0% | 67.96 | 25.57 | 2.66x |
| GetNameEnumUndefined(BigEnum) | 0 | 0 | 0% | 73.40 | 43.08 | 1.70x |
| GetNameEnumUndefined(SmallEnum) | 0 | 0 | 0% | 66.28 | 25.44 | 2.60x |
| GetNames(BigEnum) | 312 | 312 | 0% | 52.97 | 80.09 | 0.66x |
| GetNames(SmallEnum) | 48 | 48 | 0% | 33.36 | 56.75 | 0.59x |
| GetNameUnderlyingDefined(BigEnum) | 0 | 0 | 0% | 82.67 | 55.14 | 1.50x |
| GetNameUnderlyingDefined(SmallEnum) | 0 | 0 | 0% | 72.90 | 40.06 | 1.82x |
| GetNameUnderlyingUndefined(BigEnum) | 0 | 0 | 0% | 78.74 | 54.38 | 1.45x |
| GetNameUnderlyingUndefined(SmallEnum) | 0 | 0 | 0% | 68.40 | 40.66 | 1.68x |
| GetTypeCode | 0 | 0 | 0% | 5.79 | 4.81 | 1.20x |
| GetUnderlyingType | 0 | 0 | 0% | 33.11 | 20.40 | 1.62x |
| GetValues(BigEnum) | 1032 | 168 | 84% | 3881.06 | 66.35 | 58.49x |
| GetValues(SmallEnum) | 112 | 40 | 64% | 466.43 | 28.77 | 16.21x |
| IsDefinedEnumFalse(BigContiguousEnum) | 0 | 0 | 0% | 135.53 | 22.91 | 5.92x |
| IsDefinedEnumFalse(BigNonContiguousEnum) | 0 | 0 | 0% | 134.04 | 42.16 | 3.18x |
| IsDefinedEnumFalse(SmallContiguousEnum) | 0 | 0 | 0% | 132.13 | 23.19 | 5.70x |
| IsDefinedEnumFalse(SmallNonContiguousEnum) | 0 | 0 | 0% | 125.68 | 25.51 | 4.93x |
| IsDefinedEnumTrue(BigContiguousEnum) | 0 | 0 | 0% | 141.09 | 22.97 | 6.14x |
| IsDefinedEnumTrue(BigNonContiguousEnum) | 0 | 0 | 0% | 133.69 | 42.93 | 3.11x |
| IsDefinedEnumTrue(SmallContiguousEnum) | 0 | 0 | 0% | 129.01 | 23.00 | 5.61x |
| IsDefinedEnumTrue(SmallNonContiguousEnum) | 0 | 0 | 0% | 125.48 | 25.17 | 4.99x |
| IsDefinedStringFalse(BigEnum) | 0 | 0 | 0% | 168.71 | 161.26 | 1.05x |
| IsDefinedStringFalse(SmallEnum) | 0 | 0 | 0% | 50.79 | 46.38 | 1.10x |
| IsDefinedStringTrue(BigEnum) | 0 | 0 | 0% | 125.96 | 123.78 | 1.02x |
| IsDefinedStringTrue(SmallEnum) | 0 | 0 | 0% | 64.35 | 54.21 | 1.19x |
| IsDefinedUnderlyingFalse(BigContiguousEnum) | 0 | 0 | 0% | 95.39 | 25.86 | 3.69x |
| IsDefinedUnderlyingFalse(BigNonContiguousEnum) | 0 | 0 | 0% | 95.80 | 45.45 | 2.11x |
| IsDefinedUnderlyingFalse(SmallContiguousEnum) | 0 | 0 | 0% | 88.63 | 26.66 | 3.32x |
| IsDefinedUnderlyingFalse(SmallNonContiguousEnum) | 0 | 0 | 0% | 89.22 | 29.82 | 2.99x |
| IsDefinedUnderlyingTrue(BigContiguousEnum) | 0 | 0 | 0% | 96.10 | 26.12 | 3.68x |
| IsDefinedUnderlyingTrue(BigNonContiguousEnum) | 0 | 0 | 0% | 96.55 | 45.65 | 2.11x |
| IsDefinedUnderlyingTrue(SmallContiguousEnum) | 0 | 0 | 0% | 89.10 | 26.65 | 3.34x |
| IsDefinedUnderlyingTrue(SmallNonContiguousEnum) | 0 | 0 | 0% | 84.73 | 28.94 | 2.93x |
| ParseFlagsGeneric | 0 | 0 | 0% | 167.94 | 120.63 | 1.39x |
| ParseFlagsNonGeneric | 24 | 24 | 0% | 247.94 | 164.25 | 1.51x |
| ParseNameGeneric(BigEnum) | 0 | 0 | 0% | 94.74 | 59.57 | 1.59x |
| ParseNameGeneric(ByteEnum) | 0 | 0 | 0% | 84.72 | 48.65 | 1.74x |
| ParseNameGeneric(Int16Enum) | 0 | 0 | 0% | 79.24 | 45.24 | 1.75x |
| ParseNameGeneric(Int32Enum) | 0 | 0 | 0% | 78.29 | 47.00 | 1.67x |
| ParseNameGeneric(Int64Enum) | 0 | 0 | 0% | 76.32 | 45.38 | 1.68x |
| ParseNameGeneric(SbyteEnum) | 0 | 0 | 0% | 80.51 | 48.45 | 1.66x |
| ParseNameGeneric(SmallEnum) | 0 | 0 | 0% | 77.57 | 36.16 | 2.15x |
| ParseNameGeneric(UInt16Enum) | 0 | 0 | 0% | 82.50 | 47.15 | 1.75x |
| ParseNameGeneric(UInt32Enum) | 0 | 0 | 0% | 82.44 | 46.51 | 1.77x |
| ParseNameGeneric(UInt64Enum) | 0 | 0 | 0% | 79.96 | 45.40 | 1.76x |
| ParseNameNonGeneric(BigEnum) | 24 | 24 | 0% | 181.37 | 94.55 | 1.92x |
| ParseNameNonGeneric(ByteEnum) | 24 | 24 | 0% | 160.69 | 81.94 | 1.96x |
| ParseNameNonGeneric(Int16Enum) | 24 | 24 | 0% | 153.85 | 83.81 | 1.84x |
| ParseNameNonGeneric(Int32Enum) | 24 | 24 | 0% | 153.47 | 83.40 | 1.84x |
| ParseNameNonGeneric(Int64Enum) | 24 | 24 | 0% | 159.13 | 82.19 | 1.94x |
| ParseNameNonGeneric(SByteEnum) | 24 | 24 | 0% | 152.43 | 86.43 | 1.76x |
| ParseNameNonGeneric(SmallEnum) | 24 | 24 | 0% | 148.10 | 75.53 | 1.96x |
| ParseNameNonGeneric(UInt16Enum) | 24 | 24 | 0% | 160.32 | 84.41 | 1.90x |
| ParseNameNonGeneric(UInt32Enum) | 24 | 24 | 0% | 161.67 | 86.41 | 1.87x |
| ParseNameNonGeneric(UInt64Enum) | 24 | 24 | 0% | 161.09 | 85.14 | 1.89x |
| ParseValueGeneric(ByteEnum) | 0 | 0 | 0% | 44.53 | 29.77 | 1.50x |
| ParseValueGeneric(Int16Enum) | 0 | 0 | 0% | 43.41 | 30.12 | 1.44x |
| ParseValueGeneric(Int32Enum) | 0 | 0 | 0% | 45.73 | 24.83 | 1.84x |
| ParseValueGeneric(Int64Enum) | 0 | 0 | 0% | 41.81 | 25.07 | 1.67x |
| ParseValueGeneric(SByteEnum) | 0 | 0 | 0% | 43.49 | 29.89 | 1.46x |
| ParseValueGeneric(UInt16Enum) | 0 | 0 | 0% | 42.64 | 29.97 | 1.42x |
| ParseValueGeneric(UInt32Enum) | 0 | 0 | 0% | 45.52 | 23.97 | 1.90x |
| ParseValueGeneric(UInt64Enum) | 0 | 0 | 0% | 41.97 | 26.56 | 1.58x |
| ParseValueNonGeneric(ByteEnum) | 24 | 24 | 0% | 111.61 | 60.01 | 1.86x |
| ParseValueNonGeneric(Int16Enum) | 24 | 24 | 0% | 117.45 | 64.25 | 1.83x |
| ParseValueNonGeneric(Int32Enum) | 24 | 24 | 0% | 115.11 | 59.90 | 1.92x |
| ParseValueNonGeneric(Int64Enum) | 24 | 24 | 0% | 117.02 | 60.61 | 1.93x |
| ParseValueNonGeneric(SByteEnum) | 24 | 24 | 0% | 112.19 | 64.04 | 1.75x |
| ParseValueNonGeneric(UInt16Enum) | 24 | 24 | 0% | 115.44 | 63.54 | 1.82x |
| ParseValueNonGeneric(UInt32Enum) | 24 | 24 | 0% | 116.70 | 59.73 | 1.95x |
| ParseValueNonGeneric(UInt64Enum) | 24 | 24 | 0% | 114.41 | 60.83 | 1.88x |
| IConvertible.ToByte | 24 | 0 | 100% | 22.37 | 6.99 | 3.20x |
| IConvertible.ToInt16 | 24 | 0 | 100% | 21.82 | 6.68 | 3.27x |
| IConvertible.ToInt32 | 24 | 0 | 100% | 22.83 | 6.15 | 3.72x |
| IConvertible.ToInt64 | 24 | 0 | 100% | 22.16 | 7.54 | 2.94x |
| ToObject(Object) | 24 | 24 | 0% | 82.76 | 36.22 | 2.28x |
| ToObject(Int32) | 24 | 24 | 0% | 79.07 | 26.18 | 3.02x |
| IConvertible.ToSByte | 24 | 0 | 100% | 22.25 | 6.64 | 3.35x |
| ToStringDefined(BigEnum) | 0 | 0 | 0% | 37.01 | 20.92 | 1.77x |
| ToStringDefined(SmallEnum) | 0 | 0 | 0% | 31.93 | 10.85 | 2.94x |
| ToStringFlags(SingleFlag) | 0 | 0 | 0% | 20.73 | 21.19 | 0.98x |
| ToStringFlags(ThreeFlags) | 64 | 64 | 0% | 44.97 | 49.95 | 0.90x |
| ToStringFlags(FiveFlags) | 88 | 88 | 0% | 57.16 | 66.29 | 0.86x |
| ToStringFlags(InvalidFlagCombination) | 32 | 32 | 0% | 51.36 | 49.58 | 1.04x |
| ToStringFormat(D) | 32 | 32 | 0% | 27.08 | 29.78 | 0.91x |
| ToStringFormat(G) | 0 | 0 | 0% | 40.91 | 24.21 | 1.69x |
| ToStringFormat(X) | 40 | 40 | 0% | 38.81 | 35.91 | 1.08x |
| ToStringFormat(F) | 64 | 64 | 0% | 45.78 | 51.08 | 0.90x |
| ToStringUndefined(SmallEnum) | 0 | 0 | 0% | 49.32 | 21.02 | 2.35x |
| ToStringUndefined(BigEnum) | 32 | 32 | 0% | 63.85 | 46.22 | 1.38x |
| IConvertible.ToUInt16 | 24 | 0 | 100% | 22.38 | 6.91 | 3.24x |
| IConvertible.ToUInt32 | 24 | 0 | 100% | 21.41 | 7.17 | 2.99x |
| IConvertible.ToUInt64 | 24 | 0 | 100% | 22.41 | 7.46 | 3.00x |
| TryParseMissingGeneric(BigEnum) | 0 | 0 | 0% | 124.47 | 88.77 | 1.40x |
| TryParseMissingGeneric(SmallEnum) | 0 | 0 | 0% | 83.81 | 41.60 | 2.01x |
| TryParseMissingNonGeneric(BigEnum) | 0 | 0 | 0% | 133.68 | 122.32 | 1.09x |
| TryParseMissingNonGeneric(SmallEnum) | 0 | 0 | 0% | 84.97 | 75.17 | 1.13x |
| TryParseOverflowGeneric | 0 | 0 | 0% | 56.13 | 37.16 | 1.51x |
| TryParseOverflowNonGeneric | 0 | 0 | 0% | 59.84 | 70.22 | 0.85x |

Provides implementation for https://github.com/dotnet/corefx/issues/15453.

@TylerBrinkley
Author

I've reduced the internal use of generics with an Enum type argument to just the following methods on EnumBridge. These seem to be the minimal set for a high-performance generic enum implementation of the currently exposed APIs.

public abstract Array GetValuesNonGeneric();
public abstract TEnum ToObject(ulong value);
public abstract object ToObjectNonGeneric(ulong value);
public abstract bool IsEnum(object value);

Now, I'd like to merge the changes upstream into this branch which include the migration of Enum to the shared section but I need some guidance on what can be shared.

@jkotas
Member

jkotas commented Mar 22, 2019

The "Activator.CreateInstance(typeof(EnumBridge<,,>).MakeGenericType(enumType, underlyingType, typeof(UnderlyingOperations)))" call is still problematic. This call will generate a pile of code per enum type. This is bad for footprint in CoreCLR, and it does not work at all for Mono and CoreRT full AOT environments that this implementation is shared with now.

Instead, there should be only 10 or so optimized implementations per the underlying type and there should be some unsafe code used to convert the enum to the underlying type. The implementation should not use Activator.CreateInstance at all.
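
One way to read the suggestion above is sketched below. It is an assumption-laden illustration, not code from this PR or from the runtime: `EnumBits`, `ToUInt64`, and `SampleSByteEnum` are hypothetical names. The enum is reinterpreted as its underlying integral type with `Unsafe.As` and widened to `ulong`, so any downstream logic can be written once per underlying type, or once over `ulong`, instead of once per enum. The thin generic wrapper is only there to keep the example self-contained.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical sample enum with a signed, non-default underlying type.
public enum SampleSByteEnum : sbyte { MinusOne = -1, Zero = 0 }

// Hypothetical sketch: reinterpret an enum as its underlying type.
public static class EnumBits
{
    public static ulong ToUInt64<TEnum>(TEnum value) where TEnum : struct, Enum
    {
        // For enum types, Type.GetTypeCode reports the underlying type's code.
        // Signed values are sign-extended by the unchecked conversion.
        switch (Type.GetTypeCode(typeof(TEnum)))
        {
            case TypeCode.SByte:  return unchecked((ulong)Unsafe.As<TEnum, sbyte>(ref value));
            case TypeCode.Byte:   return Unsafe.As<TEnum, byte>(ref value);
            case TypeCode.Int16:  return unchecked((ulong)Unsafe.As<TEnum, short>(ref value));
            case TypeCode.UInt16: return Unsafe.As<TEnum, ushort>(ref value);
            case TypeCode.Int32:  return unchecked((ulong)Unsafe.As<TEnum, int>(ref value));
            case TypeCode.UInt32: return Unsafe.As<TEnum, uint>(ref value);
            case TypeCode.Int64:  return unchecked((ulong)Unsafe.As<TEnum, long>(ref value));
            case TypeCode.UInt64: return Unsafe.As<TEnum, ulong>(ref value);
            default:
                throw new InvalidOperationException("Unsupported underlying type.");
        }
    }
}
```

Each `Unsafe.As` call reads exactly the number of bytes of the underlying type, which is why the dispatch on the underlying type has to happen before the reinterpretation.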

@TylerBrinkley
Author

@jkotas Thanks, here I thought using Activator.CreateInstance would actually reduce my static footprint. I'll change that.

@TylerBrinkley
Author

@jkotas I've removed the use of Activator.CreateInstance but I do still use some reflection. I'm assuming that will be an issue with AOT?

@jkotas
Member

jkotas commented Mar 22, 2019

Yes, MakeGenericType has the same problem.

@TylerBrinkley
Author

It looks like MakeGenericType is already used in the shared partition, here and here.

What seems to be the issue using it here?

@MichalStrehovsky
Member

> It looks like MakeGenericType is already used in the shared partition, here and here.

See the comment here:

/// For properties of reference types, we use a generic helper class to get the value. This enables us to use MethodInfo.CreateDelegate
/// to build a fast getter. We can get away with this on .NET Native, because we really only need one runtime instantiation of the
/// generic type, since it's only instantiated over reference types (and thus all instances are shared).

The .NET Native reference is specifically about how MakeGenericType is problematic for AOT, and it explains that this particular use is fine because it's over reference types. The use in this pull request is over value types.

The other reference is part of reflection implementation. Reflection will have references to this API by definition.

@TylerBrinkley
Author

@MichalStrehovsky Thanks for explaining that.

I'd love to get this implementation into .NET Core, but I'm definitely out of my depth with the restrictions now imposed by sharing the implementation across AOT runtimes. Is there anyone who could help with this? Is the way this is implemented not feasible now that it's shared with AOT runtimes? Thanks.

@jkotas
Member

jkotas commented Mar 25, 2019

I do not think that the current implementation would work well for .NET Core either. The problem surfaces in a different way in .NET Core, as extra footprint: a lot more code gets JITed for each enum.

I think the idea behind this optimization is good, it just needs to be implemented differently: #22161 (comment).

@TylerBrinkley
Author

TylerBrinkley commented Mar 26, 2019

While it could be implemented without EnumBridge being instantiated over the Enum type, we lose a lot of performance benefits. Here are the methods that benefit from a generic implementation.

IsEnum

Used By

  • IsDefined
  • GetName
  • Format

Non-Generic Implementation

Type valueType = value.GetType();
if (valueType.IsEnum && valueType.IsEquivalentTo(enumType))

Generic Implementation

value is TEnum;

At least 1.7x performance improvement but probably much more as I don't have tests for just this method.

ToObject

Used By

  • Parse
  • ToObject

Non-Generic Implementation

InternalBoxEnum(rt, uint64Value);

Generic Implementation

TUnderlying underlying = default(TUnderlyingOperations).ToObject(uint64Value);
return Unsafe.As<TUnderlying, TEnum>(ref underlying);

Approximately 3.0x performance improvement.

GetValues

Non-Generic Implementation

ulong[] values = Enum.InternalGetValues(this);
Array ret = Array.CreateInstance(this, values.Length);
for (int i = 0; i < values.Length; i++)
{
    object val = Enum.ToObject(this, values[i]);
    ret.SetValue(val, i);
}
return ret;

Generic Implementation

EnumCache<TUnderlying, TUnderlyingOperations> cache = EnumCache<TEnum, TUnderlying, TUnderlyingOperations>.Instance;
TUnderlying[] values = cache._values;
int nonNegativeStart = cache._nonNegativeStart;
int length = values.Length;
TEnum[] array = new TEnum[length];
for (int i = nonNegativeStart; i < length; ++i)
{
    array[i - nonNegativeStart] = Unsafe.As<TUnderlying, TEnum>(ref values[i]);
}
int start = length - nonNegativeStart;
for (int i = 0; i < nonNegativeStart; ++i)
{
    array[start + i] = Unsafe.As<TUnderlying, TEnum>(ref values[i]);
}
return array;

Approximately 16-58x performance improvement depending on enum size.

@jkotas
Member

jkotas commented Mar 26, 2019

You are using the generic code clones as a cache. Producing this code (JITing, etc.) is expensive. The cost to populate this cache is large.

Instead, you should be able to get similar performance improvements by caching data. The allocation of this data is cheap compared to producing the code.

For example, the implementation of GetValues from your last example can be:

if (cache._enumValues == null)
{
    ... populate _enumValues using the existing non-generic algorithm ...
}

return cache._enumValues.Clone();

There is a separate question on whether the performance of repeated calls to Enum.GetValues() is common enough to warrant this caching.
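
A minimal, runnable sketch of the data-caching idea follows. It is not the runtime's implementation: `EnumValuesCache` is a hypothetical name, the cache is keyed by `Type` in a `ConcurrentDictionary` purely for illustration, and the public `Enum.GetValues` stands in for the existing non-generic algorithm.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical sketch: cache the values array per enum type as data,
// and hand out clones so callers cannot mutate the cached copy.
public static class EnumValuesCache
{
    private static readonly ConcurrentDictionary<Type, Array> s_values =
        new ConcurrentDictionary<Type, Array>();

    public static Array GetValues(Type enumType)
    {
        // Populated once per enum type; Enum.GetValues stands in for
        // "the existing non-generic algorithm" and throws for non-enum types.
        Array cached = s_values.GetOrAdd(enumType, t => Enum.GetValues(t));

        // Array.Clone makes a shallow copy, which is sufficient for enum values.
        return (Array)cached.Clone();
    }
}
```

The first call for a given enum pays the full cost of the non-generic algorithm; later calls only pay for the dictionary lookup and the clone.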

@TylerBrinkley
Author

@jkotas Thanks for taking the time to explain that.

I will adjust the implementation to use the non-generic implementations and re-benchmark and see where that puts us.

@TylerBrinkley
Author

I just re-benchmarked with the non-generic implementation, and here are the main changes.

  • GetValues goes from ~16-58x performance gain to ~1.2-1.4x
  • For contiguous enums IsDefined goes from ~5.7x performance gain to ~3.7x
  • For non-contiguous enums IsDefined goes from ~3.2-5x performance gain to ~2.8-3.3x
  • Non-generic Parse of names goes from ~1.9x performance gain to ~1.2x
  • Non-generic Parse of values goes from ~1.8x performance gain to ~1.1x
  • ToObject(object) goes from ~2.3x performance gain to 1.0x, as the implementation is unchanged
  • ToObject(int) goes from ~3x performance gain to 1.0x, as the implementation is unchanged

@TylerBrinkley
Author

I'm going to close this and re-open in the new dotnet/platform repo when that's done.

8 participants