Kiwi queries originating from LDP requests are run as prepared queries. In PSQL 9.1 and before, these queries are executed using a generic query plan, allowing for arbitrary entries for each variable (including pattern matching, etc...) in the executed query.
This started to become a problem for us after making some unrelated changes to the indexes and memory profile of our server. Typical GET requests for LDP-RSs result in handfuls of queries like:
2015-10-21 11:32:26 EDT LOG: duration: 1773.118 ms execute S_663/C_664: SELECT id,subject,predicate,object,context,deleted,inferred,creator,createdAt,deletedAt FROM triples WHERE deleted = false AND subject = $1 AND predicate = $2 AND object = $3 AND context = $4 2015-10-21 11:32:26 EDT DETAIL: parameters: $1 = '623995867504693248', $2 = '564051914606792704', $3 = '568948696633593856', $4 = '564051914489352192' 2015-10-21 11:32:29 EDT LOG: duration: 1773.106 ms plan: Query Text: SELECT id,subject,predicate,object,context,deleted,inferred,creator,createdAt,deletedAt FROM triples WHERE deleted = false AND subject = $1 AND predicate = $2 AND object = $3 AND context = $4 Index Scan using idx_triples_cop on triples (cost=0.00..13.74 rows=1 width=66) (actual time=1773.104..1773.104 rows=0 loops=1) Index Cond: ((context = $4) AND (object = $3) AND (predicate = $2)) Filter: (subject = $1) Buffers: shared hit=1139876
The total duration of these queries ranges from 1.2 up to 17 seconds. Note that the query plan involves an index scan on idx_triples_cop
, followed by a filter on subject
. Note also the shared buffer hit count of 1139876
.
An equivalent query run without PREPARE
, results in the following plan:
marmotta=# explain (analyze on, buffers on) SELECT id,subject,predicate,object,context,deleted,inferred,creator,createdAt,deletedAt FROM triples WHERE deleted = false AND subject = '623995867504693248' AND predicate = '564051914606792704' AND object = '568948696633593856' AND context = '564051914489352192'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Index Scan using idx_triples_spo on triples (cost=0.00..20.18 rows=1 width=66) (actual time=0.018..0.018 rows=0 loops=1) Index Cond: ((subject = 623995867504693248::bigint) AND (predicate = 564051914606792704::bigint) AND (object = 568948696633593856::bigint)) Filter: (context = 564051914489352192::bigint) Buffers: shared hit=4 Total runtime: 0.044 ms
In this case, we see a scan on idx_triples_spo
, and a filter on context
. The relevant context appears to be the base LDP graph, which contains many statements with common predicates and objects, but different subjects. The result of changing the query index to a more suitable plan for the executed query is a total time of 0.044 ms
; we see results up to 1.5 ms
when running queries this way. The shared buffer hit count is reduced to 4
accounting for the several thousand-fold performance difference.
Postgresql 9.2 allows custom query plans for prepared queries, and should alleviate this problem. See: http://blog.endpoint.com/2014/04/custom-plans-prepared-statements-in.html