author    Bob Wilson <bob.wilson@apple.com>    2010-02-03 17:23:56 +0000
committer Bob Wilson <bob.wilson@apple.com>    2010-02-03 17:23:56 +0000
commit    3992feb075b27ff37b63017078a977206f97d10d (patch)
tree      b0a3b27e967e37fd0d790e98c9dfee96918a8095 /lib/Transforms
parent    f9fba3087e840e2f6912d96c932f80d8b6698a84 (diff)
Adjust the heuristics used to decide when SROA is likely to be profitable.
The SRThreshold value makes perfect sense for checking if an entire aggregate should be promoted to a scalar integer, but it is not so good for splitting an aggregate into its separate elements. A struct may contain a large embedded array along with some scalar fields that would benefit from being split apart by SROA. Even if the total aggregate size is large, it may still be good to perform SROA. Thus, the most important piece of this patch is simply moving the aggregate size comparison vs. SRThreshold so that it guards only the aggregate promotion.

We have also been checking the number of elements to decide if an aggregate should be split up. The limit of "SRThreshold/4" seemed rather arbitrary, and I don't think it's very useful to derive this limit from SRThreshold anyway. I've collected some data showing that the current default limit of 32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct types.

One thing suggested by the data is that distinguishing between structs and arrays might be useful. There are (obviously) a lot more large arrays than large structs (as measured by the number of elements and not the total size -- a large array inside a struct still counts as a single element given the way we do SROA right now). Out of 8377 arrays where we successfully performed SROA while compiling a large set of benchmarks, only 16 of them had more than 8 elements. And, for those 16 arrays, it's not at all clear that SROA was actually beneficial. So, to offset the compile time cost of investigating more large structs for SROA, the patch lowers the limit on array elements to 8.

This fixes Apple Radar 7563690.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95224 91177308-0d34-0410-b5e6-96231b3b80d8
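For illustration only (the struct and function names below are hypothetical, not taken from the patch), consider a local aggregate whose total size exceeds the default SRThreshold of 128 bytes but which has only a few top-level elements. Before this change, the size check made SROA skip it entirely; with the reordered heuristic, the three-element struct is still considered for splitting, so the scalar fields can be promoted while the large array stays in memory:

// Hypothetical C++ source; the local aggregate becomes an alloca of a
// three-element struct type in LLVM IR.
struct Buffer {
  int  length;       // scalar field that benefits from being split out
  char data[256];    // large embedded array; pushes the total size past 128
  int  checksum;     // another scalar field
};

int useBuffer() {
  Buffer B;          // ~264-byte alloca, but only 3 top-level elements
  B.length = 0;
  B.checksum = 42;
  // Old order: AllocaSize (264) > SRThreshold (128) bailed out before the
  // element-count check. New order: ShouldAttemptScalarRepl() sees a struct
  // with 3 <= 32 elements, so SROA can still split B and let mem2reg promote
  // the two scalar fields.
  return B.length + B.checksum;
}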
Diffstat (limited to 'lib/Transforms')
-rw-r--r--  lib/Transforms/Scalar/ScalarReplAggregates.cpp  38
1 file changed, 20 insertions, 18 deletions
diff --git a/lib/Transforms/Scalar/ScalarReplAggregates.cpp b/lib/Transforms/Scalar/ScalarReplAggregates.cpp
index de02caf..900d119 100644
--- a/lib/Transforms/Scalar/ScalarReplAggregates.cpp
+++ b/lib/Transforms/Scalar/ScalarReplAggregates.cpp
@@ -202,12 +202,18 @@ bool SROA::performPromotion(Function &F) {
return Changed;
}
-/// getNumSAElements - Return the number of elements in the specific struct or
-/// array.
-static uint64_t getNumSAElements(const Type *T) {
+/// ShouldAttemptScalarRepl - Decide if an alloca is a good candidate for
+/// SROA. It must be a struct or array type with a small number of elements.
+static bool ShouldAttemptScalarRepl(AllocaInst *AI) {
+ const Type *T = AI->getAllocatedType();
+ // Do not promote any struct into more than 32 separate vars.
if (const StructType *ST = dyn_cast<StructType>(T))
- return ST->getNumElements();
- return cast<ArrayType>(T)->getNumElements();
+ return ST->getNumElements() <= 32;
+ // Arrays are much less likely to be safe for SROA; only consider
+ // them if they are very small.
+ if (const ArrayType *AT = dyn_cast<ArrayType>(T))
+ return AT->getNumElements() <= 8;
+ return false;
}
// performScalarRepl - This algorithm is a simple worklist driven algorithm,
@@ -266,22 +272,18 @@ bool SROA::performScalarRepl(Function &F) {
// Do not promote [0 x %struct].
if (AllocaSize == 0) continue;
+ // If the alloca looks like a good candidate for scalar replacement, and if
+ // all its users can be transformed, then split up the aggregate into its
+ // separate elements.
+ if (ShouldAttemptScalarRepl(AI) && isSafeAllocaToScalarRepl(AI)) {
+ DoScalarReplacement(AI, WorkList);
+ Changed = true;
+ continue;
+ }
+
// Do not promote any struct whose size is too big.
if (AllocaSize > SRThreshold) continue;
- if ((isa<StructType>(AI->getAllocatedType()) ||
- isa<ArrayType>(AI->getAllocatedType())) &&
- // Do not promote any struct into more than "32" separate vars.
- getNumSAElements(AI->getAllocatedType()) <= SRThreshold/4) {
- // Check that all of the users of the allocation are capable of being
- // transformed.
- if (isSafeAllocaToScalarRepl(AI)) {
- DoScalarReplacement(AI, WorkList);
- Changed = true;
- continue;
- }
- }
-
// If we can turn this aggregate value (potentially with casts) into a
// simple scalar value that can be mem2reg'd into a register value.
// IsNotTrivial tracks whether this is something that mem2reg could have