database - Fastest way to move data from one table to another in Postgres -


background info

on postgres server have few tables accessed business intelligence applications ideally should stay available of time. tables flushed , reloaded our etl pipeline on nightly basis (i know... due legacy setup can't use incremental updates here). loading process takes pretty long , not bullet-proof stable yet.

to increase availability of tables, decided use staging tables load downstream data our etl pipeline first, , if loaded successfully, copy data on actual production tables.

here copy function created purpose:

create or replace function guarded_copy(src text, dest text) returns void     $$       declare         c1 integer;         c2 integer;       begin         execute format('select count(*) %i', src) c1;         execute format('select count(*) %i', dest) c2;         if c1>=c2           execute format('truncate table %i cascade;', dest);           execute format('insert %i select * %i;', dest, src);         end if;       end     $$ language plpgsql volatile; 

the idea truncate dest table , load data src table, if src table(staging table) has more rows dest table(actual production table). , works.

note actual production table(dest) has constraints , indexes, while staging table(src) configured no indexes or constraints speed loading process etl.

problem

the issue function above data copy can expensive because of indexes , constraints on dest table.

question

  1. what's better way achieve same goal?
  2. i thinking of dropping/disabling index on dest before data copy step , add right after. how in sql function?
  3. i thinking swapping 2 tables renaming them, requires indexes on 1 table being copied other. how can inside function?

edit 1

postgres version:
postgresql 9.2.6 on x86_64-unknown-linux-gnu, compiled gcc (ubuntu/linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

table constraints:
on table dest have (unique) primary key on column id, , indexes on timestamp columns.

edit 2

this question , this question helped. option 3 above, think code below close want.

create or replace function guarded_swap(src text, dest text) returns void     $$       declare         c1 integer;         c2 integer;         _query text;       begin         execute format('select count(*) %i', src) c1;         execute format('select count(*) %i', dest) c2;         if c1>=c2            -- create indexes in src table           _query in               select format('%s;', replace(pg_get_indexdef(ix.indexrelid), dest, src))               pg_class t, pg_class i, pg_index ix               t.oid = ix.indrelid                     , i.oid = ix.indexrelid                     , t.relkind = 'r' , i.relkind = 'i'                     , t.oid= dest::regclass               order                   t.relname, i.relname           loop               execute _query;           end loop;            -- drop indexes in dest table           _query in               select format('drop index %s;', i.relname)               pg_class t, pg_class i, pg_index ix               t.oid = ix.indrelid                     , i.oid = ix.indexrelid                     , t.relkind = 'r' , i.relkind = 'i'                     , t.oid= dest::regclass               order                   t.relname, i.relname           loop               execute _query;           end loop;           -- create constraints in src table           _query in               select                   format ('alter table %s add constraint %s %s;', src,                       replace(conname, dest, src),                       pg_get_constraintdef(oid))               pg_constraint               contype = 'p' , conrelid = dest::regclass           loop               execute _query;           end loop;            -- drop constraints in dest table           _query in               select                   format ('alter table %s drop constraint if exists %s;', dest, conname)               pg_constraint               conrelid = dest::regclass           loop               execute _query;           end loop;            -- swap table names           execute format('alter table %i rename %i;', dest, concat(dest, '_old'));           execute format('alter table %i rename %i;', src, dest);           execute format('alter table %i rename %i;', concat(dest, '_old'), src);           end if;       end     $$ language plpgsql volatile;  

edit 3

another thought: pks , fks might unnecessary on tables used analytics purposes. indexes concern here.

create or replace function guarded_swap(src text, dest text) returns void     $$       declare         c1 integer;         c2 integer;         _idx_name text;         _query text;         _qs text[];       begin         execute format('select count(*) %i', src) c1;         execute format('select count(*) %i', dest) c2;         if c1>=c2            -- drop indexes in dest table           _idx_name, _query in               select i.relname, format('%s;', pg_get_indexdef(ix.indexrelid))               pg_class t, pg_class i, pg_index ix               t.oid = ix.indrelid                     , i.oid = ix.indexrelid                     , t.relkind = 'r' , i.relkind = 'i'                     , t.oid= dest::regclass               order                   t.relname, i.relname           loop               _qs := array_append(_qs, _query);               execute format('drop index if exists %s;', _idx_name);           end loop;            -- swap table names           execute format('alter table %i rename %i;', dest, concat(dest, '_old'));           execute format('alter table %i rename %i;', src, dest);           execute format('alter table %i rename %i;', concat(dest, '_old'), src);            -- build indexes in dest table           foreach _query in array _qs           loop               execute _query;           end loop;          end if;       end     $$ language plpgsql volatile;  

if don't have other objects depending on tables (like views , foreign keys), it's simple operation drop existing table dest , rename new table src.
that's item 3. in list, without consideration:

but requires indexes on 1 table being copied other.

you don't "copy" index. create new identical indexes on new table src in background. table ready , switcheroo matter of milliseconds. however, if go route concurrent load on table, must prepared of these error messages in concurrent transactions:

error:  not open relation oid 123456 

related answer detailed explanation , sample code on dba.se:


Comments

Popular posts from this blog

css - SVG using textPath a symbol not rendering in Firefox -

Java 8 + Maven Javadoc plugin: Error fetching URL -

node.js - How to abort query on demand using Neo4j drivers -