Order of rows when using table and json_to_array

Hi there.

I stumbled upon something that is behaving differently than what I would expect.
I have a column containing a json array, which I at times need to join with other data. I need to be able to get the index (or position) of the array element.

When only using table and json_to_array, I can compute the index using row_number() over ():

SELECT x.table_col,
       ROW_NUMBER() OVER () index_in_array
  FROM TABLE(JSON_TO_ARRAY('["a", "b", "c"]')) x;

+-----------+----------------+
| table_col | index_in_array |
+-----------+----------------+
| "a"       |              1 |
| "b"       |              2 |
| "c"       |              3 |
+-----------+----------------+

But when I join with a table, the order is reversed:

SELECT t.greeting,
       x.table_col,
       ROW_NUMBER() OVER (PARTITION BY t.greeting) index_in_array
  FROM (SELECT 'hello' greeting FROM dual
         union all
        SELECT 'world' greeting FROM dual
       ) t
  JOIN TABLE(JSON_TO_ARRAY('["a", "b", "c"]')) x;

+----------+-----------+----------------+
| greeting | table_col | index_in_array |
+----------+-----------+----------------+
| hello    | "c"       |              1 |
| hello    | "b"       |              2 |
| hello    | "a"       |              3 |
| world    | "c"       |              1 |
| world    | "b"       |              2 |
| world    | "a"       |              3 |
+----------+-----------+----------------+

I would expect to see 1 for "a", 2 for "b" and 3 for "c".

Is this expected behavior? If so, how do I reliably get the position of the element in the array?

TABLE function output is unordered. We have a feature request open for a function like TABLE_EX that will have a serial number column in the output. No ETA for that. Maybe next year.

Your ROW_NUMBERS will get layered on to the result of the join, after the fact, in your example.

If you really need this, you could implement a Wasm TVF that includes a serial number column with the data from each array element.

Thank you for replying :blush:

Would really appreciate that table_ex function. Do you have a place for feature requests where customers can vote? :grinning_face_with_smiling_eyes:

Do you know if there will be a noticeable performance hit when using such a TVF as you suggest?

Sure, you can open a topic in the Feedback channel requesting a feature, and people can vote it up.

Here is an example:

And the performance of TABLE_EX should be similar to TABLE, if we do it. It’ll take a little longer due to having to output more columns.

Any updates on this feature request? We are really struggling with this issue. Or, is there and example of the TVF?
Thanks,
Tim

A Wasm TVF is still the best solution right now. We have not created an equivalent built-in. Still might do it at some point in the future.

Actually, some internal people at SingleStore solved this as part of a larger block of code by using this approach:

TABLE(our_function(json_array_input))

In “our_function” they used PSQL, not Wasm. They took the documents in the input json array, and put them in an enclosing document that also had a serial number property. Then the resulting docs were put back into an array and returned.

This was necessary because TABLE only outputs one column.

Their code is way too hairy to share here, because it is more general and solves more issues. But this approach can work.

Thanks Hanson. I’ll reach out to my sales rep on this. This issue is blocking us from moving forward with our evaluation.