Is a multi-valued index supported in SingleStore for JSON arrays?

I have a JSON column containing an array. In fact, in this column I have a list of IDs, and in my queries I filter on this column with JSON_ARRAY_CONTAINS_DOUBLE("columnName", 1234).
Is there a way to index this?

Welcome to the forums, v.dhakate! :wave:

Are you running the managed or self-hosted service? What version number?

Hi @v.dhakate,

Can you tell me more about your app and why you want to index the array?

You can’t index the json array right now. We are working on a feature that will let the array be stored in columnstore format for fast filtering. That might help.

If you really need it indexed, you may want to consider normalizing the numbers into another table, putting an index on that table, and (maybe) adding indexes on the join columns of the two tables.
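For instance, a minimal sketch of that normalization (table and column names here are hypothetical):

CREATE TABLE parent (id INT PRIMARY KEY, j JSON);

-- One row per array element, with hash indexes for the filter and the join.
CREATE TABLE parent_ids (
  parent_id INT,
  val DOUBLE,
  KEY (val) USING HASH,
  KEY (parent_id) USING HASH
);

-- The JSON_ARRAY_CONTAINS_DOUBLE filter then becomes an indexed join:
SELECT p.* FROM parent p JOIN parent_ids i ON i.parent_id = p.id WHERE i.val = 1234;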

Also, check out our TABLE function. You can do things like this:

singlestore> create table t(a int, j json);
Query OK, 0 rows affected (0.03 sec)

singlestore> insert t values (1,'[500,600,700]');
Query OK, 1 row affected (0.23 sec)

singlestore> insert t values (2,'[501,601,700]');
Query OK, 1 row affected (0.00 sec)

singlestore> select * from t where json_array_contains_double(j,501);
+------+---------------+
| a    | j             |
+------+---------------+
|    2 | [501,601,700] |
+------+---------------+
1 row in set (0.04 sec)

singlestore> select a, tbl.* from t, table(json_to_array(j)) tbl;
+------+-----------+
| a    | table_col |
+------+-----------+
|    1 | 500       |
|    1 | 600       |
|    1 | 700       |
|    2 | 501       |
|    2 | 601       |
|    2 | 700       |
+------+-----------+
6 rows in set (0.03 sec)
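For the membership-filter use case, the same construct can be combined with a WHERE clause. A sketch (untested here; it uses SingleStore's :> cast operator, since the elements come back as JSON):

select distinct t.a
from t, table(json_to_array(j)) tbl
where tbl.table_col :> double = 501;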

Also, depending on your app, you might be able to store the json array as text and use full text search on it.


Thanks. I will try using full text search.


Hey @hanson hope all is well. I was going to create a similar thread until stumbling across this one.

“We are working on a feature that will let the array be stored in columnstore format for fast filtering”

Has there been any update with the team regarding the above?

I'm trying to do something similar: running a filter on an indexed column "id_char" produces favorable results, and I'd like to replicate this on a JSON column "id_json", yet the performance over 1 billion rows isn't anywhere near as favorable as with the first query.

Old Query:
SELECT * FROM table1 WHERE id_char = "0x12345";

table1 has the following structure and includes the data below:

CREATE TABLE table1 (id_char CHAR(7), SORT KEY(), KEY(id_char) USING HASH);
"0x12345"
"0x67890"

==========

New Query:
SELECT * FROM table2 WHERE JSON_ARRAY_CONTAINS_STRING(id_json, "0x12345");

table2 has the following structure and includes the data below:

CREATE TABLE table2 (id_json JSON, SORT KEY());
["0x12345", "0xabcde", "0xfedcb"]
["0x67890", "0xaaaaa", "0xbbbbb"]
["0xccccc", "0xddddd", "0x12345"]

Thanks

We have a new built-in function family in 8.1, JSON_MATCH_ANY and its MATCH_PARAM_* helpers, that can help with this kind of thing.

For example:

singlestore> select * from table2 
where json_match_any(match_param_string_strict()="0x12345", id_json);
+---------------------------------+
| id_json                         |
+---------------------------------+
| ["0x12345","0xabcde","0xfedcb"] |
| ["0xccccc","0xddddd","0x12345"] |
+---------------------------------+
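The same family has numeric variants. A hedged sketch for the numeric-array case from the first post, using match_param_double_strict() (table and column names are hypothetical):

select * from t where json_match_any(match_param_double_strict() = 1234, j);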

This was added in part so our Kai MongoDB-compatible API would work more easily and run faster.

In the future, we’re planning to add enhancements that will make the

select ... from table(json_to_array(...)) ...

type of query run a lot faster.

Please let us know if json_match_any() helps, and by how much.


Thanks @hanson,

I had actually followed the advice you gave the original poster last year and created a new column with the same data, but formatted as LONGTEXT with a FULLTEXT index:

ALTER TABLE table2 ADD COLUMN id_text LONGTEXT;
ALTER TABLE table2 ADD FULLTEXT (id_text);
UPDATE table2 SET id_text = id_json;

It appears JSON_MATCH_ANY against the JSON column performs worse than MATCH() AGAINST() on the LONGTEXT column (32 seconds vs. 944 ms) over identical datasets. Is there anything that needs to be modified on my JSON column?

Below is a comparison between each profile, taken in SingleStore Studio, over a table with 193M rows:

==========

SELECT * FROM table2 A WHERE MATCH(A.id_text) AGAINST ('0xaabbccdd') ORDER BY A.row DESC;

==========

SELECT * FROM table2 A WHERE json_match_any(match_param_string_strict()='0xaabbccdd', A.id_json) ORDER BY A.row DESC;

MATCH() AGAINST() is going to use an index while json_match_any() won't, which explains the difference. Consider creating a computed column in text format from the JSON column, putting a full-text index on that, and searching with MATCH() AGAINST() if that works for you.
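A minimal sketch of that approach, assuming SingleStore's PERSISTED computed-column syntax and its :> cast operator (the table name is hypothetical):

CREATE TABLE table2_ft (
  id_json JSON,
  id_text AS id_json :> LONGTEXT PERSISTED LONGTEXT,  -- text copy of the array
  FULLTEXT (id_text),
  SORT KEY ()
);

-- Search the indexed text rendering instead of the JSON column:
SELECT * FROM table2_ft WHERE MATCH(id_text) AGAINST ('0x12345');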

We’ll keep in mind this need to do full-text-style search on JSON as a feature request.


Do I understand correctly that SingleStore does not currently support any kind of index for a simple array (int or string) or JSON array?

We often have data like

tags = ["home", "office", "worldwide"]
tag_id = [3839, 38383, 37838]
color = ["red", "blue", "black"]
status = [1, 3, 2, 10, 47]
etc.

We often use these fields as filters. Indices would be very useful for this.

Contrary to your description of JSON indexes (via computed columns), this data has no other properties from which you could derive a single, unique field to index.

I have not found a way to create a computed column from an array and then give it an index. It works with single JSON fields, but not with a JSON array. If there were a function like ARRAY_TO_STRING(separator), you could create a computed column from the JSON array and then give it a full-text index. But I have not found such a function.

What is the best way to create an index for these simple arrays (or JSON arrays) that are very often used as filters?

I replied to the last question from @losch on another forum post that had the same question.