-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[X86][AVX10] Permit AVX512 options/features used together with AVX10 #71318
Conversation
This patch relaxes the driver logic to permit combinations between AVX512 and AVX10 options and makes sure we have a unified behavior between options and features combination. Here are rules we are following when handle these combinations: 1. evex512 can only be used for avx512xxx options/features. It will be ignored if used without them; 2. avx512xxx and avx10.xxx are options in two worlds. Avoid to use them together in any case. It will enable a common super set when they are used together. E.g., "-mavx512f -mavx10.1-256" euqals "-mavx10.1-512". Compiler emits warnings when user using combinations like "-mavx512f -mavx10.1-256" in case they won't get unexpected result silently.
@llvm/pr-subscribers-clang-driver @llvm/pr-subscribers-backend-x86 Author: Phoebe Wang (phoebewang) ChangesThis patch relaxes the driver logic to permit combinations between AVX512 and AVX10 options and makes sure we have a unified behavior between options and features combination. Here are rules we are following when handle these combinations:
Compiler emits warnings when user using combinations like "-mavx512f -mavx10.1-256" in case they won't get unexpected result silently. Patch is 42.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71318.diff 33 Files Affected:
diff --git a/clang/include/clang/Basic/DiagnosticCommonKinds.td b/clang/include/clang/Basic/DiagnosticCommonKinds.td
index 9f0ccd255a32148..8084a4ce0d1751b 100644
--- a/clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ b/clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -346,6 +346,8 @@ def err_opt_not_valid_on_target : Error<
"option '%0' cannot be specified on this target">;
def err_invalid_feature_combination : Error<
"invalid feature combination: %0">;
+def warn_invalid_feature_combination : Warning<
+ "invalid feature combination: %0">, InGroup<DiagGroup<"invalid-feature-combination">>;
def warn_target_unrecognized_env : Warning<
"mismatch between architecture and environment in target triple '%0'; did you mean '%1'?">,
InGroup<InvalidCommandLineArgument>;
diff --git a/clang/lib/Basic/Targets/X86.cpp b/clang/lib/Basic/Targets/X86.cpp
index eec3cd558435e2a..9cfda95f385d627 100644
--- a/clang/lib/Basic/Targets/X86.cpp
+++ b/clang/lib/Basic/Targets/X86.cpp
@@ -119,9 +119,13 @@ bool X86TargetInfo::initFeatureMap(
setFeatureEnabled(Features, F, true);
std::vector<std::string> UpdatedFeaturesVec;
- bool HasEVEX512 = true;
+ std::vector<std::string> UpdatedAVX10FeaturesVec;
+ int HasEVEX512 = -1;
bool HasAVX512F = false;
bool HasAVX10 = false;
+ bool HasAVX10_512 = false;
+ std::string LastAVX10;
+ std::string LastAVX512;
for (const auto &Feature : FeaturesVec) {
// Expand general-regs-only to -x86, -mmx and -sse
if (Feature == "+general-regs-only") {
@@ -131,35 +135,50 @@ bool X86TargetInfo::initFeatureMap(
continue;
}
- if (Feature.substr(0, 7) == "+avx10.") {
- HasAVX10 = true;
- HasAVX512F = true;
- if (Feature.substr(Feature.size() - 3, 3) == "512") {
- HasEVEX512 = true;
- } else if (Feature.substr(7, 2) == "1-") {
- HasEVEX512 = false;
+ if (Feature.substr(1, 6) == "avx10.") {
+ if (Feature[0] == '+') {
+ HasAVX10 = true;
+ if (Feature.substr(Feature.size() - 3, 3) == "512")
+ HasAVX10_512 = true;
+ LastAVX10 = Feature;
+ } else if (HasAVX10 && Feature == "-avx10.1-256") {
+ HasAVX10 = false;
+ HasAVX10_512 = false;
+ } else if (HasAVX10_512 && Feature == "-avx10.1-512") {
+ HasAVX10_512 = false;
}
+ // Postpone AVX10 features handling after AVX512 settled.
+ UpdatedAVX10FeaturesVec.push_back(Feature);
+ continue;
} else if (!HasAVX512F && Feature.substr(0, 7) == "+avx512") {
HasAVX512F = true;
+ LastAVX512 = Feature;
} else if (HasAVX512F && Feature == "-avx512f") {
HasAVX512F = false;
- } else if (HasAVX10 && Feature == "-avx10.1-256") {
- HasAVX10 = false;
- HasAVX512F = false;
- } else if (!HasEVEX512 && Feature == "+evex512") {
+ } else if (HasEVEX512 != true && Feature == "+evex512") {
HasEVEX512 = true;
- } else if (HasEVEX512 && Feature == "-avx10.1-512") {
- HasEVEX512 = false;
- } else if (HasEVEX512 && Feature == "-evex512") {
+ continue;
+ } else if (HasEVEX512 != false && Feature == "-evex512") {
HasEVEX512 = false;
+ continue;
}
UpdatedFeaturesVec.push_back(Feature);
}
- if (HasAVX512F && HasEVEX512)
- UpdatedFeaturesVec.push_back("+evex512");
- else if (HasAVX10)
- UpdatedFeaturesVec.push_back("-evex512");
+ llvm::append_range(UpdatedFeaturesVec, UpdatedAVX10FeaturesVec);
+ // HasEVEX512 is a three-states flag. We need to turn it into [+-]evex512
+ // according to other features.
+ if (HasAVX512F) {
+ UpdatedFeaturesVec.push_back(HasEVEX512 == false ? "-evex512" : "+evex512");
+ if (HasAVX10 && !HasAVX10_512 && HasEVEX512 != false)
+ Diags.Report(diag::warn_invalid_feature_combination)
+ << LastAVX512 + " " + LastAVX10 + "; will be promoted to avx10.1-512";
+ } else if (HasAVX10) {
+ if (HasEVEX512 != -1)
+ Diags.Report(diag::warn_invalid_feature_combination)
+ << LastAVX10 + (HasEVEX512 ? " +evex512" : " -evex512");
+ UpdatedFeaturesVec.push_back(HasAVX10_512 ? "+evex512" : "-evex512");
+ }
if (!TargetInfo::initFeatureMap(Features, Diags, CPU, UpdatedFeaturesVec))
return false;
diff --git a/clang/lib/Driver/ToolChains/Arch/X86.cpp b/clang/lib/Driver/ToolChains/Arch/X86.cpp
index 848c26ddb43e4ae..fbe665bdd5c8afb 100644
--- a/clang/lib/Driver/ToolChains/Arch/X86.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/X86.cpp
@@ -229,7 +229,6 @@ void x86::getX86TargetFeatures(const Driver &D, const llvm::Triple &Triple,
<< D.getOpts().getOptionName(LVIOpt);
}
- bool HasAVX10 = false;
for (const Arg *A : Args.filtered(options::OPT_m_x86_AVX10_Features_Group)) {
StringRef Name = A->getOption().getName();
A->claim();
@@ -251,7 +250,6 @@ void x86::getX86TargetFeatures(const Driver &D, const llvm::Triple &Triple,
#endif
Features.push_back(Args.MakeArgString((IsNegative ? "-" : "+") + Name));
- HasAVX10 = true;
}
// Now add any that the user explicitly requested on the command line,
@@ -271,14 +269,9 @@ void x86::getX86TargetFeatures(const Driver &D, const llvm::Triple &Triple,
continue;
}
- StringRef AVX512Name = Name;
bool IsNegative = Name.startswith("no-");
if (IsNegative)
Name = Name.substr(3);
- if (HasAVX10 && (Name.startswith("avx512") || Name == "evex512")) {
- D.Diag(diag::warn_drv_unused_argument) << AVX512Name;
- continue;
- }
Features.push_back(Args.MakeArgString((IsNegative ? "-" : "+") + Name));
}
diff --git a/clang/lib/Headers/avx2intrin.h b/clang/lib/Headers/avx2intrin.h
index 9196c8c7d24f7c8..2bb0fa39c465967 100644
--- a/clang/lib/Headers/avx2intrin.h
+++ b/clang/lib/Headers/avx2intrin.h
@@ -15,8 +15,8 @@
#define __AVX2INTRIN_H
/* Define the default attributes for the functions in this file. */
-#define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("avx2"), __min_vector_width__(256)))
-#define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx2"), __min_vector_width__(128)))
+#define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("avx2,no-evex512"), __min_vector_width__(256)))
+#define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx2,no-evex512"), __min_vector_width__(128)))
/* SSE4 Multiple Packed Sums of Absolute Difference. */
/// Computes sixteen sum of absolute difference (SAD) operations on sets of
diff --git a/clang/lib/Headers/avx512bf16intrin.h b/clang/lib/Headers/avx512bf16intrin.h
index ce1dd2ee5bdfe0e..b28d2e243f2cb80 100644
--- a/clang/lib/Headers/avx512bf16intrin.h
+++ b/clang/lib/Headers/avx512bf16intrin.h
@@ -23,7 +23,8 @@ typedef __bf16 __bfloat16 __attribute__((deprecated("use __bf16 instead")));
__attribute__((__always_inline__, __nodebug__, __target__("avx512bf16,evex512"), \
__min_vector_width__(512)))
#define __DEFAULT_FN_ATTRS \
- __attribute__((__always_inline__, __nodebug__, __target__("avx512bf16")))
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512bf16,no-evex512")))
/// Convert One BF16 Data to One Single Float Data.
///
diff --git a/clang/lib/Headers/avx512bwintrin.h b/clang/lib/Headers/avx512bwintrin.h
index df3c7294fba7a08..51dba5427b0fc0a 100644
--- a/clang/lib/Headers/avx512bwintrin.h
+++ b/clang/lib/Headers/avx512bwintrin.h
@@ -20,7 +20,9 @@ typedef unsigned long long __mmask64;
/* Define the default attributes for the functions in this file. */
#define __DEFAULT_FN_ATTRS512 __attribute__((__always_inline__, __nodebug__, __target__("avx512bw,evex512"), __min_vector_width__(512)))
#define __DEFAULT_FN_ATTRS64 __attribute__((__always_inline__, __nodebug__, __target__("avx512bw,evex512")))
-#define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("avx512bw")))
+#define __DEFAULT_FN_ATTRS \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512bw,no-evex512")))
static __inline __mmask32 __DEFAULT_FN_ATTRS
_knot_mask32(__mmask32 __M)
diff --git a/clang/lib/Headers/avx512dqintrin.h b/clang/lib/Headers/avx512dqintrin.h
index 225d3eaf57faea4..88b48e3a32070b6 100644
--- a/clang/lib/Headers/avx512dqintrin.h
+++ b/clang/lib/Headers/avx512dqintrin.h
@@ -16,7 +16,9 @@
/* Define the default attributes for the functions in this file. */
#define __DEFAULT_FN_ATTRS512 __attribute__((__always_inline__, __nodebug__, __target__("avx512dq,evex512"), __min_vector_width__(512)))
-#define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("avx512dq")))
+#define __DEFAULT_FN_ATTRS \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512dq,no-evex512")))
static __inline __mmask8 __DEFAULT_FN_ATTRS
_knot_mask8(__mmask8 __M)
diff --git a/clang/lib/Headers/avx512fintrin.h b/clang/lib/Headers/avx512fintrin.h
index 5823728f22252b2..4f172c74b31cbb2 100644
--- a/clang/lib/Headers/avx512fintrin.h
+++ b/clang/lib/Headers/avx512fintrin.h
@@ -168,8 +168,12 @@ typedef enum
/* Define the default attributes for the functions in this file. */
#define __DEFAULT_FN_ATTRS512 __attribute__((__always_inline__, __nodebug__, __target__("avx512f,evex512"), __min_vector_width__(512)))
-#define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx512f"), __min_vector_width__(128)))
-#define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("avx512f")))
+#define __DEFAULT_FN_ATTRS128 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512f,no-evex512"), __min_vector_width__(128)))
+#define __DEFAULT_FN_ATTRS \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512f,no-evex512")))
/* Create vectors with repeated elements */
diff --git a/clang/lib/Headers/avx512fp16intrin.h b/clang/lib/Headers/avx512fp16intrin.h
index a9428c6feba2e91..4123f10c3951312 100644
--- a/clang/lib/Headers/avx512fp16intrin.h
+++ b/clang/lib/Headers/avx512fp16intrin.h
@@ -25,10 +25,12 @@ typedef _Float16 __m512h_u __attribute__((__vector_size__(64), __aligned__(1)));
__attribute__((__always_inline__, __nodebug__, \
__target__("avx512fp16,evex512"), __min_vector_width__(512)))
#define __DEFAULT_FN_ATTRS256 \
- __attribute__((__always_inline__, __nodebug__, __target__("avx512fp16"), \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512fp16,no-evex512"), \
__min_vector_width__(256)))
#define __DEFAULT_FN_ATTRS128 \
- __attribute__((__always_inline__, __nodebug__, __target__("avx512fp16"), \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512fp16,no-evex512"), \
__min_vector_width__(128)))
static __inline__ _Float16 __DEFAULT_FN_ATTRS512 _mm512_cvtsh_h(__m512h __a) {
diff --git a/clang/lib/Headers/avx512ifmavlintrin.h b/clang/lib/Headers/avx512ifmavlintrin.h
index 3284ee182004b86..8787cd471d42396 100644
--- a/clang/lib/Headers/avx512ifmavlintrin.h
+++ b/clang/lib/Headers/avx512ifmavlintrin.h
@@ -15,8 +15,14 @@
#define __IFMAVLINTRIN_H
/* Define the default attributes for the functions in this file. */
-#define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx512ifma,avx512vl"), __min_vector_width__(128)))
-#define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("avx512ifma,avx512vl"), __min_vector_width__(256)))
+#define __DEFAULT_FN_ATTRS128 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512ifma,avx512vl,no-evex512"), \
+ __min_vector_width__(128)))
+#define __DEFAULT_FN_ATTRS256 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512ifma,avx512vl,no-evex512"), \
+ __min_vector_width__(256)))
#define _mm_madd52hi_epu64(X, Y, Z) \
((__m128i)__builtin_ia32_vpmadd52huq128((__v2di)(X), (__v2di)(Y), \
diff --git a/clang/lib/Headers/avx512pfintrin.h b/clang/lib/Headers/avx512pfintrin.h
index b8bcf49c6b249c3..f853be021a2dd37 100644
--- a/clang/lib/Headers/avx512pfintrin.h
+++ b/clang/lib/Headers/avx512pfintrin.h
@@ -14,9 +14,6 @@
#ifndef __AVX512PFINTRIN_H
#define __AVX512PFINTRIN_H
-/* Define the default attributes for the functions in this file. */
-#define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("avx512pf")))
-
#define _mm512_mask_prefetch_i32gather_pd(index, mask, addr, scale, hint) \
__builtin_ia32_gatherpfdpd((__mmask8)(mask), (__v8si)(__m256i)(index), \
(void const *)(addr), (int)(scale), \
@@ -92,6 +89,4 @@
__builtin_ia32_scatterpfqps((__mmask8)(mask), (__v8di)(__m512i)(index), \
(void *)(addr), (int)(scale), (int)(hint))
-#undef __DEFAULT_FN_ATTRS
-
#endif
diff --git a/clang/lib/Headers/avx512vbmivlintrin.h b/clang/lib/Headers/avx512vbmivlintrin.h
index c5b96ae8ada70a9..848ca2d18c3cea5 100644
--- a/clang/lib/Headers/avx512vbmivlintrin.h
+++ b/clang/lib/Headers/avx512vbmivlintrin.h
@@ -15,9 +15,14 @@
#define __VBMIVLINTRIN_H
/* Define the default attributes for the functions in this file. */
-#define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx512vbmi,avx512vl"), __min_vector_width__(128)))
-#define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("avx512vbmi,avx512vl"), __min_vector_width__(256)))
-
+#define __DEFAULT_FN_ATTRS128 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vbmi,avx512vl,no-evex512"), \
+ __min_vector_width__(128)))
+#define __DEFAULT_FN_ATTRS256 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vbmi,avx512vl,no-evex512"), \
+ __min_vector_width__(256)))
static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_permutex2var_epi8(__m128i __A, __m128i __I, __m128i __B)
diff --git a/clang/lib/Headers/avx512vlbf16intrin.h b/clang/lib/Headers/avx512vlbf16intrin.h
index f5b8911fac2aeb8..89c9f49c7aed0fd 100644
--- a/clang/lib/Headers/avx512vlbf16intrin.h
+++ b/clang/lib/Headers/avx512vlbf16intrin.h
@@ -15,12 +15,14 @@
#ifndef __AVX512VLBF16INTRIN_H
#define __AVX512VLBF16INTRIN_H
-#define __DEFAULT_FN_ATTRS128 \
- __attribute__((__always_inline__, __nodebug__, \
- __target__("avx512vl, avx512bf16"), __min_vector_width__(128)))
-#define __DEFAULT_FN_ATTRS256 \
- __attribute__((__always_inline__, __nodebug__, \
- __target__("avx512vl, avx512bf16"), __min_vector_width__(256)))
+#define __DEFAULT_FN_ATTRS128 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vl,avx512bf16,no-evex512"), \
+ __min_vector_width__(128)))
+#define __DEFAULT_FN_ATTRS256 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vl,avx512bf16,no-evex512"), \
+ __min_vector_width__(256)))
/// Convert Two Packed Single Data to One Packed BF16 Data.
///
diff --git a/clang/lib/Headers/avx512vlbitalgintrin.h b/clang/lib/Headers/avx512vlbitalgintrin.h
index 5154eae14cbb3c9..377e3a5ea571327 100644
--- a/clang/lib/Headers/avx512vlbitalgintrin.h
+++ b/clang/lib/Headers/avx512vlbitalgintrin.h
@@ -15,8 +15,14 @@
#define __AVX512VLBITALGINTRIN_H
/* Define the default attributes for the functions in this file. */
-#define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx512vl,avx512bitalg"), __min_vector_width__(128)))
-#define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("avx512vl,avx512bitalg"), __min_vector_width__(256)))
+#define __DEFAULT_FN_ATTRS128 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vl,avx512bitalg,no-evex512"), \
+ __min_vector_width__(128)))
+#define __DEFAULT_FN_ATTRS256 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vl,avx512bitalg,no-evex512"), \
+ __min_vector_width__(256)))
static __inline__ __m256i __DEFAULT_FN_ATTRS256
_mm256_popcnt_epi16(__m256i __A)
diff --git a/clang/lib/Headers/avx512vlbwintrin.h b/clang/lib/Headers/avx512vlbwintrin.h
index 148af5ab9a34d87..9aedba0669991a2 100644
--- a/clang/lib/Headers/avx512vlbwintrin.h
+++ b/clang/lib/Headers/avx512vlbwintrin.h
@@ -15,8 +15,14 @@
#define __AVX512VLBWINTRIN_H
/* Define the default attributes for the functions in this file. */
-#define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx512vl,avx512bw"), __min_vector_width__(128)))
-#define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("avx512vl,avx512bw"), __min_vector_width__(256)))
+#define __DEFAULT_FN_ATTRS128 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vl,avx512bw,no-evex512"), \
+ __min_vector_width__(128)))
+#define __DEFAULT_FN_ATTRS256 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vl,avx512bw,no-evex512"), \
+ __min_vector_width__(256)))
/* Integer compare */
diff --git a/clang/lib/Headers/avx512vlcdintrin.h b/clang/lib/Headers/avx512vlcdintrin.h
index cc8b72528d01269..923e2c551a97a86 100644
--- a/clang/lib/Headers/avx512vlcdintrin.h
+++ b/clang/lib/Headers/avx512vlcdintrin.h
@@ -14,9 +14,14 @@
#define __AVX512VLCDINTRIN_H
/* Define the default attributes for the functions in this file. */
-#define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx512vl,avx512cd"), __min_vector_width__(128)))
-#define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("avx512vl,avx512cd"), __min_vector_width__(256)))
-
+#define __DEFAULT_FN_ATTRS128 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vl,avx512cd,no-evex512"), \
+ __min_vector_width__(128)))
+#define __DEFAULT_FN_ATTRS256 \
+ __attribute__((__always_inline__, __nodebug__, \
+ __target__("avx512vl,avx512cd,no-evex512"), \
+ __min_vector_width__(256)))
static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_broadcastmb_epi64 (__mmask8 __A)
diff --git a/clang/lib/Headers/avx512vldqintrin.h b/clang/lib/Headers/avx...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Ping~ |
clang/lib/Basic/Targets/X86.cpp
Outdated
@@ -119,9 +119,13 @@ bool X86TargetInfo::initFeatureMap( | |||
setFeatureEnabled(Features, F, true); | |||
|
|||
std::vector<std::string> UpdatedFeaturesVec; | |||
bool HasEVEX512 = true; | |||
std::vector<std::string> UpdatedAVX10FeaturesVec; | |||
int HasEVEX512 = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use std::optional ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to use enum. It's a 3-status flag. std::optional isn't much useful here.
#define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx2"), __min_vector_width__(128))) | ||
#define __DEFAULT_FN_ATTRS256 \ | ||
__attribute__((__always_inline__, __nodebug__, \ | ||
__target__("avx2,no-evex512"), __min_vector_width__(256))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does the function targeted at avx2 need no-evex512
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have defined parts AVX512 intrinsics with no-evex512
and some of them will call into these AVX2 intrinsics.
Then we are facing a problem that we cannot call them in some cases because we didn't specify no-evex512
for them.
#define __DEFAULT_FN_ATTRS_MMX \ | ||
__attribute__((__always_inline__, __nodebug__, __target__("mmx,sse2"), \ | ||
__min_vector_width__(64))) | ||
__attribute__((__always_inline__, __nodebug__, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does the function targeted at sse2 need no-evex512?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same reason as above.
clang/lib/Basic/Targets/X86.cpp
Outdated
HasAVX10 = false; | ||
HasAVX512F = false; | ||
} else if (!HasEVEX512 && Feature == "+evex512") { | ||
} else if (HasEVEX512 != true && Feature == "+evex512") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparing a int value with true
and false
may be confusing. I suggest "std::optional"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "std::optional" doesn't help here because we need to distinguish the uninitialized status and false too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can check has_value
for uninitialized value. But enum looks good to me, too.
UpdatedFeaturesVec.push_back("+evex512"); | ||
else if (HasAVX10) | ||
UpdatedFeaturesVec.push_back("-evex512"); | ||
llvm::append_range(UpdatedFeaturesVec, UpdatedAVX10FeaturesVec); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean the flags for AVX10 Features will be in the vector UpdatedFeaturesVec
2 times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, there is a continue
in handling avx10*
features
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
I'm a little bit confused, What's the expected behavior of |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Maybe we can add some explanations about why we add attribute no-evex512
for intrinsics in the description of the PR. It's a little tricky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
Thanks @KanRobert @e-kud |
Hi Phoebe, starting seeing this error on rather old codes after this patch landed . error: always_inline function '_mm_setzero_pd' requires target feature 'evex512', but would be inlined into function '_mm_getexp_pd' that is compiled without support for 'evex512' |
@ronlieb Do you have a reproducer for this problem? I just checked the definition of both intrinsics have |
thanks, will try workaround for now ... |
reverts: [X86][AVX10] Permit AVX512 options/features used together with AVX10 (llvm#71318) breaks pgmath build Change-Id: I14334b22129122d0aa04076dc992d45cf344f6c7
here is a small reproducer , compile with produces error : #include <immintrin.h> #if !(defined _CPU) extern "C" double log_d_scalar(double); double attribute ((noinline)) log_d_scalar(double a_input) #ifdef AVX512F |
@ronlieb The reproducer can compile successfully in trunk: https://godbolt.org/z/hvKhGq9bq
Make sure |
smallest example and used latest upstream llvm build from this morning clang++ -march=haswell t.cpp t.cpp:8:10: error: always_inline function '_mm_getexp_sd' requires target feature 'avx512f', but would be inlined into function 'log_d_scalar' that is compiled without support for 'avx512f'
1 error generated. |
I also ran into something similar, when compiling Qt; I filed #72106 with a different reproducer. |
…lvm#71318) This patch relaxes the driver logic to permit combinations between AVX512 and AVX10 options and makes sure we have a unified behavior between options and features combination. Here are rules we are following when handle these combinations: 1. evex512 can only be used for avx512xxx options/features. It will be ignored if used without them; 2. avx512xxx and avx10.xxx are options in two worlds. Avoid to use them together in any case. It will enable a common super set when they are used together. E.g., "-mavx512f -mavx10.1-256" euqals "-mavx10.1-512". Compiler emits warnings when user using combinations like "-mavx512f -mavx10.1-256" in case they won't get unexpected result silently. Function target feature attribute follows the same rule now. We have to add "no-evex512" feature for intrinsics shared between AVX512 and AVX10. We also add "no-evex512" for early ISAs like AVX etc., because some of them are called by AVX512 intrinsics. Change-Id: I3585e8fed47bfca6fedf94b86a72816ae2ee466d
…lvm#71318) This patch relaxes the driver logic to permit combinations between AVX512 and AVX10 options and makes sure we have a unified behavior between options and features combination. Here are rules we are following when handle these combinations: 1. evex512 can only be used for avx512xxx options/features. It will be ignored if used without them; 2. avx512xxx and avx10.xxx are options in two worlds. Avoid to use them together in any case. It will enable a common super set when they are used together. E.g., "-mavx512f -mavx10.1-256" euqals "-mavx10.1-512". Compiler emits warnings when user using combinations like "-mavx512f -mavx10.1-256" in case they won't get unexpected result silently. Function target feature attribute follows the same rule now. We have to add "no-evex512" feature for intrinsics shared between AVX512 and AVX10. We also add "no-evex512" for early ISAs like AVX etc., because some of them are called by AVX512 intrinsics.
…lvm#72126) llvm#71318 failed to clear EVEX512 feature for intended intrinsics. Fixes llvm#72106
This patch relaxes the driver logic to permit combinations between AVX512 and AVX10 options and makes sure we have a unified behavior between options and features combination.
Here are rules we are following when handle these combinations:
Compiler emits warnings when user using combinations like "-mavx512f -mavx10.1-256" in case they won't get unexpected result silently.
Function target feature attribute follows the same rule now. We have to add "no-evex512" feature for intrinsics shared between AVX512 and AVX10. We also add "no-evex512" for early ISAs like AVX etc., because some of them are called by AVX512 intrinsics.