Core: Optimize DeleteFileIndex by aokolnychyi · Pull Request #8157 · apache/iceberg

aokolnychyi · 2023-07-26T20:15:18Z

This PR improves and refactors DeleteFileIndex.

Avoid the cost of repeatedly converting min/max bounds, which hurts index lookup performance.
Use dataSequenceNumber from ContentFile instead of ManifestEntry to support distributed planning in the future.

This change relies on existing tests and adds a new benchmark.

Results prior to this change:

Benchmark                                                    Mode  Cnt  Score   Error  Units
PlanningBenchmark.localPlanningWithMinMaxFilter                ss    5  9.740 ± 2.333   s/op
PlanningBenchmark.localPlanningWithPartitionAndMinMaxFilter    ss    5  3.008 ± 0.044   s/op
PlanningBenchmark.localPlanningWithoutFilter                   ss    5  9.569 ± 1.309   s/op

Results after this change:

Benchmark                                                    Mode  Cnt  Score   Error  Units
PlanningBenchmark.localPlanningWithMinMaxFilter                ss    5  5.618 ± 1.297   s/op
PlanningBenchmark.localPlanningWithPartitionAndMinMaxFilter    ss    5  2.668 ± 0.210   s/op
PlanningBenchmark.localPlanningWithoutFilter                   ss    5  5.699 ± 0.595   s/op
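
For a rough sense of scale, the mean scores above can be turned into speedup ratios. The snippet below is only arithmetic on the reported means; the class and method names are made up for illustration, and it ignores the error margins, which are fairly wide for the first and third benchmarks.

```java
import java.util.Locale;

public class PlanningSpeedup {
  // ratio of the mean time before the change to the mean time after it
  static double speedup(double beforeSeconds, double afterSeconds) {
    return beforeSeconds / afterSeconds;
  }

  public static void main(String[] args) {
    // mean scores (s/op) copied from the two benchmark tables above
    System.out.printf(Locale.ROOT, "MinMaxFilter: %.2fx%n", speedup(9.740, 5.618));
    System.out.printf(Locale.ROOT, "PartitionAndMinMaxFilter: %.2fx%n", speedup(3.008, 2.668));
    System.out.printf(Locale.ROOT, "WithoutFilter: %.2fx%n", speedup(9.569, 5.699));
  }
}
```

On the means alone this works out to roughly 1.73x, 1.13x, and 1.68x respectively.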

This would be even more important for equality deletes.

aokolnychyi · 2023-07-26T20:16:56Z

+
+  // a delete file wrapper that caches the converted boundaries for faster boundary checks
+  // this class is not meant to be exposed beyond the delete file index
+  private static class IndexedDeleteFile {


Here is how I found the issue:

aokolnychyi · 2023-07-26T20:17:35Z


  DeleteFileIndex(
-      Map<Integer, PartitionSpec> specsById,
+      Map<Integer, PartitionSpec> specs,


Renamed specsById to stay on one line below.

Do we need to care about style like this later?

I feel like it is OK to do given that this PR already contains a few cosmetic changes.

aokolnychyi · 2023-07-26T20:18:22Z

  }

  DeleteFile[] forDataFile(long sequenceNumber, DataFile file) {
+    if (isEmpty) {


No need to derive the partition and do anything with the streams if the index is empty.

aokolnychyi · 2023-07-26T20:19:58Z

      // read all of the matching delete manifests in parallel and accumulate the matching files in
      // a queue
-      Queue<ManifestEntry<DeleteFile>> deleteEntries = new ConcurrentLinkedQueue<>();
+      Queue<DeleteFile> files = new ConcurrentLinkedQueue<>();


I will need this change for distributed planning as well.

aokolnychyi · 2023-07-26T21:20:24Z


  private static <T> boolean rangesOverlap(
-      Type.PrimitiveType type,
+      Types.NestedField field,


Passing field to stay on a single line when calling this method.

aokolnychyi · 2023-07-26T21:57:41Z

    return Arrays.stream(files, start, files.length);
  }

+  private static DeleteFileGroup index(


These index methods only exist because the old constructor is package private and is used in tests. I had to keep that for compatibility.

rdblue · 2023-07-26T22:22:19Z

+      if (convertedLowerBounds == null) {
+        synchronized (this) {
+          if (convertedLowerBounds == null) {
+            this.convertedLowerBounds = convertBounds(wrapped.lowerBounds());


Is the idea here to convert all bounds at once to avoid a null check in lowerBound?

If so, I don't think this is a good idea. The delete index is only going to use a few bound values (overlap uses only the equality delete's equality field ID set), so converting all of them is probably unnecessary. Plus, calling lowerBounds() in lowerBound(int) already incurs a null check to see if the bounds are converted. So lazily converting each lower bound probably costs no more.

The underlying method convertBounds only converts file path for position deletes and equality field ids for equality deletes. You are right, they are converted at the same time. I did that to avoid using a concurrent hash map and computeIfAbsent (which has performance issues).

Ah, I see. Sounds good then.

I am still debating. We probably need a concurrent hash map to load each value one by one, right?

Here is the problem I was talking about: https://bugs.openjdk.org/browse/JDK-8161372

This should help then: apache/uniffle#766

I spent a bit more time thinking about this and I don't think it would be worth the extra complexity. Let's keep this as is for now. We only index equality IDs and all columns must be checked to discard a file.

I also checked Caffeine caches and they have some workarounds but I don't think we need them here.
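
The trade-off discussed in this thread can be sketched roughly as follows. This is not the Iceberg implementation: `LazyBoundsSketch`, `BulkCachedBounds`, `PerKeyCachedBounds`, and `convert` are hypothetical names, every bound is assumed to be an 8-byte long, and the field is declared `volatile` here because double-checked locking needs it for safe publication. The per-key variant reads with `get` before falling back to `computeIfAbsent`, a common way to keep the hot path lock-free on Java 8, where the linked JDK issue reports contention even when the key is already present.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyBoundsSketch {
  // counts conversions so the caching effect is observable
  static final AtomicInteger CONVERSIONS = new AtomicInteger();

  // stand-in for decoding a serialized bound; assumes each buffer holds one long
  static Long convert(ByteBuffer buf) {
    CONVERSIONS.incrementAndGet();
    return buf.getLong(buf.position()); // absolute read, buffer position is unchanged
  }

  // option 1: convert all indexed bounds at once, guarded by double-checked locking
  static class BulkCachedBounds {
    private final Map<Integer, ByteBuffer> raw;
    private volatile Map<Integer, Long> converted = null;

    BulkCachedBounds(Map<Integer, ByteBuffer> raw) {
      this.raw = raw;
    }

    Long lowerBound(int id) {
      if (converted == null) {
        synchronized (this) {
          if (converted == null) {
            Map<Integer, Long> result = new HashMap<>();
            raw.forEach((fieldId, buf) -> result.put(fieldId, convert(buf)));
            converted = result; // publish only after the map is fully built
          }
        }
      }
      return converted.get(id);
    }
  }

  // option 2: convert each bound on first use, backed by a concurrent map
  static class PerKeyCachedBounds {
    private final Map<Integer, ByteBuffer> raw;
    private final ConcurrentHashMap<Integer, Long> converted = new ConcurrentHashMap<>();

    PerKeyCachedBounds(Map<Integer, ByteBuffer> raw) {
      this.raw = raw;
    }

    Long lowerBound(int id) {
      Long value = converted.get(id); // lock-free fast path for keys already converted
      if (value == null && raw.containsKey(id)) {
        value = converted.computeIfAbsent(id, key -> convert(raw.get(key)));
      }
      return value;
    }
  }

  public static void main(String[] args) {
    Map<Integer, ByteBuffer> raw = new HashMap<>();
    raw.put(1, ByteBuffer.allocate(8).putLong(0, 10L));
    raw.put(2, ByteBuffer.allocate(8).putLong(0, 20L));

    BulkCachedBounds bulk = new BulkCachedBounds(raw);
    bulk.lowerBound(1);
    bulk.lowerBound(1);
    System.out.println("bulk conversions: " + CONVERSIONS.get());

    int before = CONVERSIONS.get();
    PerKeyCachedBounds perKey = new PerKeyCachedBounds(raw);
    perKey.lowerBound(1);
    perKey.lowerBound(1);
    System.out.println("per-key conversions: " + (CONVERSIONS.get() - before));
  }
}
```

With two indexed bounds, the first lookup on `BulkCachedBounds` converts both and later lookups convert nothing, while `PerKeyCachedBounds` converts only the keys that are actually requested at the price of a concurrent map. When only a handful of fields are indexed and all of them must be checked anyway, the simpler bulk variant loses little.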

rdblue · 2023-07-26T22:31:25Z

+
+        } else {
+          for (int id : equalityFieldIds()) {
+            Type type = spec.schema().findField(id).type();


I just checked and spec.schema() is used in both conversions so the type should always match.

rdblue

Looks correct overall. I had some comments about mostly minor things.

zinking · 2023-07-27T01:52:56Z

@rdblue @aokolnychyi you might be interested in revisiting PR #5760, as it helps further reduce the time spent in canContainDeleteFile.

ConeyLiu · 2023-07-27T03:01:29Z

+      T deleteUpper) {
+    Type.PrimitiveType type = field.type().asPrimitiveType();
+    Comparator<T> cmp = Comparators.forType(type);
    T dataLower = Conversions.fromByteBuffer(type, dataLowerBuf);


So most of the performance improvement comes from here.

ConeyLiu · 2023-07-27T03:02:52Z

-
-    return comparator.compare(deleteLower, dataUpper) <= 0
-        && comparator.compare(dataLower, deleteUpper) <= 0;
+    return cmp.compare(deleteLower, dataUpper) <= 0 && cmp.compare(dataLower, deleteUpper) <= 0;


Here, maybe we can avoid the conversion for the upper bound. But this is a minor improvement.

Updated to match the checks we have for the file path.
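
For reference, the overlap test in this diff is the standard closed-interval check: two ranges overlap when each one starts at or before the other ends. A minimal sketch, assuming the bounds are already decoded and a `Comparator` is passed in directly (the method in the PR takes a field and serialized data bounds instead); `RangeOverlapSketch` is a hypothetical name:

```java
import java.util.Comparator;

public class RangeOverlapSketch {
  // closed ranges [dataLower, dataUpper] and [deleteLower, deleteUpper] overlap
  // when the delete range starts no later than the data range ends, and vice versa
  static <T> boolean rangesOverlap(
      Comparator<T> cmp, T dataLower, T dataUpper, T deleteLower, T deleteUpper) {
    return cmp.compare(deleteLower, dataUpper) <= 0 && cmp.compare(dataLower, deleteUpper) <= 0;
  }

  public static void main(String[] args) {
    Comparator<Long> cmp = Comparator.naturalOrder();
    // data file covers [10, 20]
    System.out.println(rangesOverlap(cmp, 10L, 20L, 15L, 30L)); // delete [15, 30] intersects
    System.out.println(rangesOverlap(cmp, 10L, 20L, 20L, 25L)); // delete [20, 25] touches at 20
    System.out.println(rangesOverlap(cmp, 10L, 20L, 21L, 30L)); // delete [21, 30] is disjoint
  }
}
```

Because the bounds are inclusive, ranges that merely touch at an endpoint count as overlapping, so such a delete file cannot be discarded.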

aokolnychyi · 2023-07-27T04:59:31Z

@zinking, we have discussed it a bit during one of the community syncs. I'll take another look over the weekend.

aokolnychyi · 2023-07-27T06:09:33Z

I'll merge this one to unblock the distributed planning effort. Thanks everyone for reviewing!

github-actions Bot added spark core labels Jul 26, 2023

aokolnychyi force-pushed the improve-delete-index branch from 6b0b1ba to 745d9e3 Compare July 26, 2023 21:24

aokolnychyi mentioned this pull request Jul 26, 2023

Spark 3.4: Support distributed planning #8123

Merged

rdblue approved these changes Jul 26, 2023

Core: Optimize DeleteFileIndex

32a3d1b

aokolnychyi force-pushed the improve-delete-index branch from 745d9e3 to 32a3d1b Compare July 27, 2023 04:47

aokolnychyi merged commit 9cc9a5d into apache:master Jul 27, 2023

zhongyujiang pushed a commit to zhongyujiang/iceberg that referenced this pull request Apr 16, 2025

[Cherry-Pick] Core: Optimize DeleteFileIndex (apache#8157)

11de37b

6 participants
