How can I optimize this PostgreSQL query?


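Below is a postgres query that seems to take far longer than I would expect. The field_instances table is indexed on both form_instance_id and field_id, and the form_instances table is indexed on workflow_state. So I thought this would be a fast query, but it takes forever. Can someone help me interpret the query plan and tell me what kinds of indexes to add to speed it up? Thanks.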

explain analyze
select form_id,form_instance_id,answer,field_id
from form_instances,field_instances
where workflow_state = 'DRqueued'
    and form_instance_id = form_instances.id
    and field_id = 'Book_EstimatedDueDate';
                                                                               QUERY PLAN                                                                                
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=8733.85..95692.90 rows=9277 width=29) (actual time=2550.000..15430.000 rows=11431 loops=1)
   Hash Cond: (field_instances.form_instance_id = form_instances.id)
   ->  Bitmap Heap Scan on field_instances  (cost=2681.11..89071.72 rows=47567 width=25) (actual time=850.000..13690.000 rows=51726 loops=1)
         Recheck Cond: ((field_id)::text = 'Book_EstimatedDueDate'::text)
         ->  Bitmap Index Scan on index_field_instances_on_field_id  (cost=0.00..2669.22 rows=47567 width=0) (actual time=830.000..830.000 rows=51729 loops=1)
               Index Cond: ((field_id)::text = 'Book_EstimatedDueDate'::text)
   ->  Hash  (cost=5911.34..5911.34 rows=11312 width=8) (actual time=1590.000..1590.000 rows=11431 loops=1)
         ->  Bitmap Heap Scan on form_instances  (cost=511.94..5911.34 rows=11312 width=8) (actual time=720.000..1570.000 rows=11431 loops=1)
               Recheck Cond: ((workflow_state)::text = 'DRqueued'::text)
               ->  Bitmap Index Scan on index_form_instances_on_workflow_state  (cost=0.00..509.11 rows=11312 width=0) (actual time=650.000..650.000 rows=11509 loops=1)
                     Index Cond: ((workflow_state)::text = 'DRqueued'::text)
 Total runtime: 15430.000 ms
(12 rows)

Answers

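When you say the field_instances table is indexed on both form_instance_id and field_id, do you mean there are separate indexes on form_instance_id and on field_id? Try replacing the single-column index on form_instance_id with a concatenated index on (form_instance_id, field_id).

An index is a quick lookup that tells you where the rows matching the index are. The database then has to read those rows to do whatever you asked for. So you always want your index to be as specific as possible. If you put two indexes on a table, you have two different ways to do lookups, but a query usually only takes advantage of one of them. If you put a concatenated index on the table, you can efficiently look up by the first field in the index, by the first two fields, and so on. (So a concatenated index on (a, b) lets you do fast lookups on a, and even faster lookups on both a and b, but does not help you do lookups on b alone.)

Right now the database finds every row in form_instances that has the right state. Separately, it finds every row in field_instances that has the right field id. Then it does a hash join: it builds a lookup hash from one result set and scans the other result set for matches. With my suggestion, it should first find all the form_instances of interest. It will then go to the index, find the field_instances that match both the form instance and the field id, and so arrive at exactly the rows of interest. Because the index is more specific, the database has far fewer rows of data to process to answer your query.

http://explain.depesz.com is an excellent online tool that helps you spot the expensive steps visually. I pasted your results into the tool and got this analysis: http://explain.depesz.com/s/VIk. Without seeing the tables and indexes, though, it is hard to say anything definite.

Going only by the SQL and the column names (one would really need to know the data in the tables), I would ask whether you really need the index on workflow_state, assuming its values are not very distinct. Dropping it probably won't improve the select, but it will speed up inserts and updates. Also try making the field_id check the first condition in the where clause.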
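
The concatenated-index suggestion above could be tried roughly as follows. This is only a sketch under assumptions: the name of the existing single-column index is guessed from the naming pattern visible in the query plan (index_<table>_on_<column>), and the name of the new index is made up for illustration. Check the real index names with \d field_instances in psql before dropping anything.

```sql
-- Hypothetical name for the new concatenated (multicolumn) index.
-- It can serve lookups on form_instance_id alone and on
-- (form_instance_id, field_id) together, but not on field_id alone.
CREATE INDEX index_field_instances_on_form_instance_id_and_field_id
    ON field_instances (form_instance_id, field_id);

-- Assumed name of the old single-column index; it becomes redundant
-- once the concatenated index exists. IF EXISTS makes this a no-op
-- if the guess at the name is wrong.
DROP INDEX IF EXISTS index_field_instances_on_form_instance_id;

-- Refresh planner statistics so the new index is costed sensibly.
ANALYZE field_instances;

-- Re-run the original query to compare the plan and the runtime.
EXPLAIN ANALYZE
SELECT form_id, form_instance_id, answer, field_id
FROM form_instances, field_instances
WHERE workflow_state = 'DRqueued'
    AND form_instance_id = form_instances.id
    AND field_id = 'Book_EstimatedDueDate';
```

The existing index on field_id is left in place here, since the concatenated index does not help lookups on field_id alone. Whether the planner actually switches away from the hash join depends on the table statistics, so the new plan is worth comparing against the old one rather than assuming an improvement.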
