I'm attempting to use JSONB
with JDBC, which means that I have to avoid any of the operators which use the '?' character (as the PostgreSQL JDBC driver has no escaping for this character). Taking a simple table:
CREATE TABLE jsonthings(d JSONB NOT NULL);
INSERT INTO jsonthings VALUES
('{"name":"First","tags":["foo"]}')
, ('{"name":"Second","tags":["foo","bar"]}')
, ('{"name":"Third","tags":["bar","baz"]}')
, ('{"name":"Fourth","tags":["baz"]}');
CREATE INDEX idx_jsonthings_name ON jsonthings USING GIN ((d->'name'));
Using the command line I can run a simple select and it uses the index as expected:
EXPLAIN ANALYZE SELECT d FROM jsonthings WHERE d->'name' ? 'First';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on jsonthings (cost=113.50..30236.13 rows=10000 width=61) (actual time=0.024..0.025 rows=1 loops=1)
Recheck Cond: ((d -> 'name'::text) ? 'First'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx_jsonthings_name (cost=0.00..111.00 rows=10000 width=0) (actual time=0.015..0.015 rows=1 loops=1)
Index Cond: ((d -> 'name'::text) ? 'First'::text)
Planning time: 0.073 ms
Execution time: 0.047 ms
(7 rows)
Since I cannot use the ?
character I resorted to using a function which underpins the ?
operator. However, it is not using the index:
EXPLAIN ANALYZE SELECT d FROM jsonthings WHERE jsonb_exists(d->'name','First');
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Seq Scan on jsonthings (cost=10000000000.00..10000263637.06 rows=3333334 width=61) (actual time=0.016..3135.119 rows=1 loops=1)
Filter: jsonb_exists((d -> 'name'::text), 'First'::text)
Rows Removed by Filter: 10000003
Planning time: 0.051 ms
Execution time: 3135.138 ms
(5 rows)
Why is this happening, and what can I do to make the function use the index? Note that in reality the table has another 10MM rows in it and I also have enable_seqscan
turned off so this isn't a case of the planner deciding not to use the index.
In response to a comment I tried using a custom operator instead:
CREATE OPERATOR ### (
PROCEDURE = jsonb_exists,
LEFTARG = jsonb,
RIGHTARG = text,
RESTRICT = contsel,
JOIN = contjoinsel);
But that has the same problem:
EXPLAIN ANALYZE SELECT d FROM jsonthings WHERE d->'name' ### 'First';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Seq Scan on jsonthings (cost=10000000000.00..10000263637.06 rows=10000 width=61) (actual time=0.012..3381.608 rows=1 loops=1)
Filter: ((d -> 'name'::text) ### 'First'::text)
Rows Removed by Filter: 10000003
Planning time: 0.046 ms
Execution time: 3381.623 ms
(5 rows)
Update
The latest PostgreSql driver (as of March 2015) has the ability to escape the ?
character so this specific case is no longer a problem.
The default operator class for GIN indexes on json
columns jsonb_ops
only supports these operators (per documentaion):
Name Indexed Data Type Indexable Operators ... jsonb_ops jsonb ? ?& ?| @>
You can achieve this the other way round: Create a simple IMMUTABLE
SQL function using the ?
operator, which can be inlined and will use the index just like the operator itself:
CREATE OR REPLACE FUNCTION jb_contains(jsonb, text)
RETURNS bool AS
'SELECT $1 ? $2' LANGUAGE sql IMMUTABLE;
This works, I tested in Postgres 9.4 ...
However, you have been asking the wrong question. There are two basic misconceptions in your question.
The jsonb
operator ?
cannot be used to search for values. Only for keys or array elements. The manual:
Description:
Does the string exist as a top-level key within the JSON value?
You got the wrong operator, the WHERE
condition cannot work: WHERE d->'name' ? 'First'
The expression index you have doesn't makes sense either way
CREATE INDEX idx_jsonthings_name ON jsonthings USING GIN ((d->'name'));
The expression d->'name'
returns a jsonb
value. You would need d
->>
'name'
to get the value as text
.
But that would still be pointless. Since the value of the name
key is a simple string, a GIN index (while possible) makes no sense to begin with.
You don't need the operator ?
- so no workaround either.
Here are two ways that would actually work:
Plain GIN index on d
and use the "contains" operator @>
:
CREATE INDEX idx_jsonthings_d_gin ON jsonthings USING GIN (d);
SELECT d FROM jsonthings WHERE d @> '{"name":"First"}'
You could even use the more specialized operator class jsonb_path_ops
. See:
B-tree expression index on d->>'email'
and test with good old =
:
CREATE INDEX idx_jsonthings_d_email ON jsonthings ((d->>'email'));
SELECT d FROM jsonthings WHERE d->>'email' = 'First';
The second index would be considerably smaller and the query faster.
@>
as per your first solution. — Jun 01, 2015 at 08:08 External links referenced by this document: