
GH-3044: HashJoin iterator construction no longer eagerly builds hash probe table. #3047

Open · wants to merge 1 commit into base: main from 2025-03-05-lazyjoin
Conversation

Contributor
@Aklakan Aklakan commented Mar 5, 2025

GitHub issue resolved #3044
Pull request description: Due to locking, timeouts in QueryExecDataset were postponed until the (so far eagerly performed) hash probe table construction completed. This PR proposes to initialize the hash probe table lazily, so that iterator construction becomes cheap and the lock is released quickly.

I added a test case that fails without the proposed fix.
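The core idea can be pictured with a small, self-contained sketch of the lazy-initialization pattern (plain Java; the class and names below are illustrative, not the actual Jena code): constructing the holder is cheap, and the expensive probe-table build is deferred until first access.

```java
import java.util.function.Supplier;

public class Main {
    // Hypothetical sketch: the expensive value (think: hash probe table)
    // is only built on the first call to get(), so construction is cheap
    // and any lock held around construction can be released quickly.
    static class Lazy<T> {
        private Supplier<T> supplier;
        private T value;

        Lazy(Supplier<T> supplier) { this.supplier = supplier; }

        synchronized T get() {
            if (supplier != null) {
                value = supplier.get();   // deferred expensive work
                supplier = null;          // build exactly once
            }
            return value;
        }
    }

    static int buildCount = 0;

    public static void main(String[] args) {
        Lazy<String> table = new Lazy<>(() -> { buildCount++; return "probe-table"; });
        System.out.println(buildCount);   // 0: construction did no work
        System.out.println(table.get());  // probe-table
        System.out.println(buildCount);   // 1: built on first access
        table.get();
        System.out.println(buildCount);   // 1: not rebuilt
    }
}
```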


  • Tests are included.
  • (Documentation change and updates are provided for the Apache Jena website)
  • Commits have been squashed to remove intermediate development commit messages.
  • Key commit messages start with the issue number (GH-xxxx)

By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.


See the Apache Jena "Contributing" guide.

@Aklakan Aklakan force-pushed the 2025-03-05-lazyjoin branch from 1d84676 to 0661332 (March 6, 2025 08:51)
@@ -51,7 +51,7 @@ public static void consume(RowSet rowSet)
      * This operation consumes the RowSet.
      */
     public static long count(RowSet rowSet)
-    { return rowSet.rewindable().size(); }
+    { long c[] = {0}; rowSet.forEach(b -> ++c[0]); return c[0]; }
Contributor Author (@Aklakan):

Intent: Avoid memory copy.

Member @rvesse Mar 6, 2025:

Or maybe rowSet.stream().count() would be cleaner and still avoid the memory copy (though it has conversion to Stream overhead)?

Contributor Author @Aklakan Mar 6, 2025:

Good point! Changed to the cleaner stream version.

Member @afs Mar 6, 2025:

Maybe it does not matter with this destructive operation, but the cost of stream() is not trivial.
forEach does not use stream.

Member:

> Intent: Avoid memory copy.

RowSetMem rewind should not be a memory copy. It shares the rows array.

A rewindable RowSet (e.g. RowSetMem) goes back to the beginning, whereas forEach continues the iteration.

Contributor Author @Aklakan Mar 6, 2025:

Behavior so far (IIRC):

  • For non-RowSetMem instances counting would create an intermediate in-memory copy.
  • The count for a RowSetMem instance would be the total number of bindings in the in-memory row set, rather than the number of remaining bindings.

I added the method RowSetStream.forEachRemaining; the stack trace for RowSetOps.count using rowSet.stream().count() then shows a delegation to the forEachRemaining method (with some stream-creation overhead that should be negligible for larger counts):

[Screenshot: stack trace]

@Aklakan Aklakan force-pushed the 2025-03-05-lazyjoin branch 2 times, most recently from 026f326 to c1ec70f (March 6, 2025 09:53)
@@ -68,6 +70,17 @@ private static Iterator<BindingNodeId> access(NodeTupleTable nodeTupleTable, Bin
iterMatches = x.iterator();
}

// Add cancel check.
Contributor Author @Aklakan Mar 6, 2025:

As a separate case where timeouts get ignored (besides the hash probe table construction), a cancel check is needed here for the following query on TDB2:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?x { ?x rdfs:label ?y . ?y rdfs:label ?z }

time curl http://localhost:3030/ds --data-urlencode 'query=PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?x { ?x rdfs:label ?y . ?y rdfs:label ?z }' --data-urlencode 'timeout=1'

This was run against a preloaded TDB2 DBpedia dataset (~30 GB).
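The shape of such a cancel check can be pictured with a small, hypothetical wrapper (plain Java, not the actual Jena classes): a shared flag is consulted on every hasNext()/next(), so a timeout can interrupt a long match scan between elements.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class Main {
    // Hypothetical sketch: wraps an iterator and checks a cancel flag on
    // every hasNext()/next(), similar in spirit to the cancel check added
    // to the TDB match stages (names here are illustrative, not Jena API).
    static class CancellableIterator<T> implements Iterator<T> {
        private final Iterator<T> delegate;
        private final AtomicBoolean cancelled;

        CancellableIterator(Iterator<T> delegate, AtomicBoolean cancelled) {
            this.delegate = delegate;
            this.cancelled = cancelled;
        }

        private void checkCancelled() {
            if (cancelled.get())
                throw new IllegalStateException("cancelled");
        }

        @Override public boolean hasNext() { checkCancelled(); return delegate.hasNext(); }
        @Override public T next()          { checkCancelled(); return delegate.next(); }
    }

    public static void main(String[] args) {
        AtomicBoolean cancel = new AtomicBoolean(false);
        Iterator<Integer> it = new CancellableIterator<>(List.of(1, 2, 3).iterator(), cancel);
        System.out.println(it.next());   // 1
        cancel.set(true);                // e.g. triggered by a timeout
        try {
            it.next();
        } catch (IllegalStateException e) {
            System.out.println("cancelled");
        }
    }
}
```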

@Aklakan Aklakan marked this pull request as draft March 6, 2025 10:27
Contributor Author

Aklakan commented Mar 6, 2025

Set to draft because I am still investigating this query on in-memory datasets.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?x { ?x rdfs:label ?y . ?y rdfs:label ?z }

@Aklakan Aklakan force-pushed the 2025-03-05-lazyjoin branch 2 times, most recently from 0adb79d to 7a98372 (March 6, 2025 11:01)
@Aklakan Aklakan marked this pull request as ready for review March 6, 2025 11:02
Contributor Author

Aklakan commented Mar 6, 2025

So for the "empty join" case I had to add cancel signal checks to StageMatchTriple (added a test case) and StageMatchTuple (so far only tested locally against a Fuseki). On a larger scale, the issue is that the iterator returned by graph.find() would already need cancellation support itself. For example, a custom graph.find() implementation could sequentially search a large file for matching triples; if there are no matches in that file, then a call to hasNext would scan all data without any means to interrupt the process. Right?

@Aklakan Aklakan force-pushed the 2025-03-05-lazyjoin branch from 7a98372 to 6b61668 (March 6, 2025 16:10)
Contributor Author

Aklakan commented Mar 6, 2025

To summarize:

  • Aborting the now lazy HashJoin has a dedicated test.
  • For the empty join case, there is a dedicated test for StageMatchTriple based on a custom GraphBase.find() implementation that returns an infinite stream of matches.
  • But: I am not sure how to mock a similar test for StageMatchTuple in TDB1/2, because it requires a large dataset or mocking NodeIds. Here is a brief screen capture of the effect of this PR's code:
CancelStageMatchTuple.webm
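A plain-Java analogue of the infinite-matches test idea (illustrative names, not the actual Jena test classes): find() yields matches forever, so only an external cancel flag, e.g. set by a timeout, can stop the consuming loop.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicBoolean;

public class Main {
    // Hypothetical stand-in for a GraphBase.find() that never runs out of
    // matches; without a cancel check, consuming its results would never end.
    static Iterator<String> infiniteFind() {
        return new Iterator<>() {
            @Override public boolean hasNext() { return true; }
            @Override public String next()    { return "match"; }
        };
    }

    // Consume matches until the cancel flag is raised; returns how many
    // matches were seen before cancellation took effect.
    static long consumeUntilCancelled(Iterator<String> it, AtomicBoolean cancelled) {
        long n = 0;
        while (!cancelled.get() && it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean cancelled = new AtomicBoolean(false);
        // Simulated query timeout: another thread raises the flag shortly.
        Thread timeout = new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) { }
            cancelled.set(true);
        });
        timeout.start();
        long seen = consumeUntilCancelled(infiniteFind(), cancelled);
        timeout.join();
        System.out.println("terminated after " + seen + " matches");
    }
}
```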


Successfully merging this pull request may close these issues.

timeout parameter is not always respected by fuseki server
3 participants