Various PostgreSQL changes (not covered elsewhere herein)
The following changes have been made to our production PostgreSQL GUCs since starting to work with Marmotta.
- Enabled logging of slow statements, currently > 200ms
- Set shared_buffers to 3072MB, where the system has 7GB real memory (and the system was upgraded to a c3.xlarge Amazon EC2 instance)
- Increased maintenance_work_mem to 100MB
- Turned off synchronous_commit
- Turned on logging of queries that create temporary files (log_temp_files).
- Set random_page_cost to 1.5 because we have SSD drives
- Set work_mem to 38MB
We have also tuned autovacuuming to run more frequently than the default on our two largest tables by setting the autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor storage parameters on triples and nodes. We have lowered these to attempt vacuuming and analysis when a smaller proportion of rows are changed, though it hasn't made a noticeable impact on normal query performance. The last vacuuming of triples caused significant I/O wait when it was running, but did not disrupt anything outside of that. It did not make anything run noticeably faster after it was done.
With triples at 294M rows, we have autovacuum_vacuum_scale_factor set to 0.05 and autovacuum_analyze_scale_factor set to 0.025. Vacuuming should happen again after 58M rows have been changed, which will probably happen in a couple of weeks. We're eager to receive suggestions as to whether vacuuming or analyisis should happen more or less often for a table of this size.