Generic Query Plans in PSQL 9.1

Kiwi queries originating from LDP requests are run as prepared queries. In PSQL 9.1 and before, these queries are executed using a generic query plan, allowing for arbitrary entries for each variable (including pattern matching, etc...) in the executed query.

This started to become a problem for us after making some unrelated changes to the indexes and memory profile of our server. Typical GET requests for LDP-RSs result in handfuls of queries like:

postgresql generic plan

2015-10-21 11:32:26 EDT LOG:  duration: 1773.118 ms  execute S_663/C_664: SELECT id,subject,predicate,object,context,deleted,inferred,creator,createdAt,deletedAt FROM triples WHERE deleted = false AND subject = $1 AND predicate = $2 AND object = $3 AND context = $4
2015-10-21 11:32:26 EDT DETAIL:  parameters: $1 = '623995867504693248', $2 = '564051914606792704', $3 = '568948696633593856', $4 = '564051914489352192'
2015-10-21 11:32:29 EDT LOG:  duration: 1773.106 ms  plan:
	Query Text: SELECT id,subject,predicate,object,context,deleted,inferred,creator,createdAt,deletedAt FROM triples WHERE deleted = false AND subject = $1 AND predicate = $2 AND object = $3 AND context = $4
	Index Scan using idx_triples_cop on triples  (cost=0.00..13.74 rows=1 width=66) (actual time=1773.104..1773.104 rows=0 loops=1)
	  Index Cond: ((context = $4) AND (object = $3) AND (predicate = $2))
	  Filter: (subject = $1)
	  Buffers: shared hit=1139876

The total duration of these queries ranges from 1.2 up to 17 seconds. Note that the query plan involves an index scan on idx_triples_cop, followed by a filter on subject. Note also the shared buffer hit count of 1139876.

An equivalent query run without PREPARE, results in the following plan:

postgresql custom plan

marmotta=# explain (analyze on, buffers on) SELECT id,subject,predicate,object,context,deleted,inferred,creator,createdAt,deletedAt FROM triples WHERE deleted = false AND subject = '623995867504693248' AND predicate = '564051914606792704' AND object = '568948696633593856' AND context = '564051914489352192';
QUERY PLAN 
-----------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_triples_spo on triples (cost=0.00..20.18 rows=1 width=66) (actual time=0.018..0.018 rows=0 loops=1)
Index Cond: ((subject = 623995867504693248::bigint) AND (predicate = 564051914606792704::bigint) AND (object = 568948696633593856::bigint))
Filter: (context = 564051914489352192::bigint)
Buffers: shared hit=4
Total runtime: 0.044 ms

In this case, we see a scan on idx_triples_spo, and a filter on context. The relevant context appears to be the base LDP graph, which contains many statements with common predicates and objects, but different subjects. The result of changing the query index to a more suitable plan for the executed query is a total time of 0.044 ms; we see results up to 1.5 ms when running queries this way. The shared buffer hit count is reduced to 4 accounting for the several thousand-fold performance difference.

Postgresql 9.2 allows custom query plans for prepared queries, and should alleviate this problem. See: http://blog.endpoint.com/2014/04/custom-plans-prepared-statements-in.html