Result table disappears before reading all partitions

I’m trying to read query results in parallel. I’m following the docs at Read Query Results in Parallel · SingleStore Documentation

When I create result tables for two queries and then try to read from them, I get an error for one of them like: Aggregator result table 'some_db.some_table' doesn't exist. Sometimes I also get: Error: 2318-HY000: Leaf Error (127.0.0.1:3307): Internal result table 'some_db::1_15_pdt.some_table(2)' doesn't exist.

After creating these result tables, I make sure they exist by running a SHOW RESULT TABLES query.

This happens whether I use single-reader mode or create materialized result tables.

I’m testing this with very tiny tables (5-25 rows), where all rows are in a single partition. There are 8 partitions by default.

I’m running SingleStore in a local container using the memsql/cluster-in-a-box:alma-8.0.4-c190bb9c08-4.0.10-1.14.4 image. I’m also using the com.singlestore:singlestore-jdbc-client:1.2.0 JDBC driver.

How can I debug this issue? The docs are pretty sparse about this feature. I only stumbled on SHOW RESULT TABLES in the release notes, as it's not even documented.

It looks like the Spark connector retries queries that read from a result table if the table doesn't exist (yet): https://github.com/memsql/singlestore-spark-connector/blob/82afc8def1e3ce2cff26bac3d656c2fabd3c8b75/src/main/scala/com/singlestore/spark/SinglestoreRDD.scala#L114
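
For reference, the gist of that retry (a simplified illustration with the JDBC driver, not the connector's actual code) is something like:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ResultTableRetry {
    // Simplified sketch of a "retry until the result table is visible" read.
    // The real logic lives in SinglestoreRDD.scala; this is only an illustration.
    static ResultSet readWithRetry(Connection conn, String sql, int maxAttempts)
            throws SQLException, InterruptedException {
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Statement stmt = conn.createStatement();
                return stmt.executeQuery(sql);
            } catch (SQLException e) {
                // 2318 is the "result table doesn't exist" code from my logs;
                // assuming that's the condition being retried.
                if (e.getErrorCode() != 2318) {
                    throw e;
                }
                last = e;
                Thread.sleep(100L * attempt); // back off before retrying
            }
        }
        throw last;
    }
}
```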

Why is that necessary?

Hello @jwas, you need to keep the connection that created the result table alive until you've finished reading all the results.
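
A rough sketch of that with the JDBC driver, assuming table and query names as placeholders and the CREATE/read/DROP statements as I understand them from the parallel-read docs page:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ParallelResultRead {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:singlestore://127.0.0.1:3306/some_db"; // placeholder URL/credentials

        // The connection that creates the result table has to stay open
        // until every partition has been read.
        try (Connection creator = DriverManager.getConnection(url, "root", "");
             Statement create = creator.createStatement()) {

            create.execute("CREATE RESULT TABLE some_table AS SELECT * FROM tiny_table");

            // Reads can run on other connections (e.g. one per partition),
            // as long as `creator` is still open.
            for (int p = 0; p < 8; p++) {
                try (Connection reader = DriverManager.getConnection(url, "root", "");
                     Statement read = reader.createStatement();
                     ResultSet rs = read.executeQuery(
                             "SELECT * FROM ::some_table WHERE partition_id() = " + p)) {
                    while (rs.next()) {
                        // consume rows from partition p
                    }
                }
            }

            // Only after all reads are done: drop the result table and let `creator` close.
            create.execute("DROP RESULT TABLE some_table");
        }
    }
}
```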


That helped, thanks! Maybe consider adding that to the docs.

Hello @oyeliseiev-ua, I have a similar issue with the Spark SingleStore connector.

I'm reading with the option:

.option("parallelRead.Features", "readFromAggregatorsMaterialized")
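
The surrounding read is set up roughly like this (endpoint, database, and table names are placeholders for my real config; only the parallelRead.Features option above is the relevant part):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SinglestoreRead {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("singlestore-parallel-read") // placeholder app name
                .getOrCreate();

        // Placeholder connection settings; the real job has its own endpoint/credentials.
        Dataset<Row> df = spark.read()
                .format("singlestore")
                .option("ddlEndpoint", "singlestore-host:3306")
                .option("database", "some_db")
                .option("parallelRead.Features", "readFromAggregatorsMaterialized")
                .load("some_table");

        df.show();
    }
}
```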

I get this error randomly and the Spark job times out.

24/01/09 08:05:47 WARN ErrorPacket: Error: 2318-HY000: Aggregator result table <<>> doesn't exist.

This occurs mostly when two Spark jobs are hitting the same SingleStore database in parallel.

Any ideas?