Oracle 12.1 upgrade

Postgress Remote Database Administration, Monitoring and Support

  • SATS Remote DBAis dedicated to optimizing database uptime and performance, and to reducing the cost of IT ownership. Our Like Clockwork RDBA service saves you time, recourses, continuity, and helps you to ally IT tasks with the business goals of your organization. Many other companies chose SATS for our dependability, cost-effectiveness, and expert technological support. SATS RDBA advantages help our customers achieve costs savings of up to 75% while increasing the quality of DBA coverage.
  • SATS Remote DBA could be a one stop service provider for all this platforms:
  • Oracle, PostgreSQL, Hadoop, MS SQL, MySQL, Mongo DB

For specific inquiries please contact SATS RDBA Services or call 650 889-5533.

Speeding up Postgres by using Matviews

PROBLEM CASE

Sometimes we would like to allow frequent access to a big and active tables. However we quite quickly realize those big joins along with many active connections do not perform as we would expect. For the live sample - lets take two large tables - orders and customers.


		Table "public.customers"
Column | Type | Modifiers
----------------------+-----------------------+---------------------------------------------------------------
customerid | integer | not null default nextval('customers_customerid_seq'::regclass)
firstname | character varying(50) | not null
lastname | character varying(50) | not null
address1 | character varying(50) | not null
address2 | character varying(50) |
city | character varying(50) | not null
state | character varying(50) |
zip | integer |
country | character varying(50) | not null
region | smallint | not null
email | character varying(50) |
phone | character varying(50) |
creditcardtype | integer | not null
creditcard | character varying(50) | not null
creditcardexpiration | character varying(50) | not null
username | character varying(50) | not null
password | character varying(50) | not null
age | smallint |
income | integer |
gender | character varying(1) |
Indexes: "customers_pkey" PRIMARY KEY, btree (customerid)
Referenced by: TABLE "cust_hist" CONSTRAINT "fk_cust_hist_customerid" FOREIGN KEY (customerid) REFERENCES customers(customerid) ON DELETE CASCADE
TABLE "orders" CONSTRAINT "fk_customerid" FOREIGN KEY (customerid) REFERENCES customers(customerid) ON DELETE SET NULL


		Table "public.orders"
Column | Type | Modifiers
-------------+---------------+----------------------------------------------------------
orderid | integer | not null default nextval('orders_orderid_seq'::regclass)
orderdate | date | not null
customerid | integer |
netamount | numeric(12,2) | not null
tax | numeric(12,2) | not null
totalamount | numeric(12,2) | not null
Indexes:
"orders_pkey" PRIMARY KEY, btree (orderid)
"ix_order" btree (orderdate)
"ix_order_custid" btree (customerid)
Foreign-key constraints:
"fk_customerid" FOREIGN KEY (customerid) REFERENCES customers(customerid) ON DELETE SET NULL
Referenced by: TABLE "orderlines" CONSTRAINT "fk_orderid" FOREIGN KEY (orderid) REFERENCES orders(orderid) ON DELETE CASCADE

Lets there be lots of rows in both tables: select count(*) from customers; -> 5120000
and select count(*) from orders; 12288000 rows.
Management needs to see a list of "heavy shoppers" for last 7 days displayed quite often. But select below which provides data wanted, runs long time:


		SELECT o.customerid, c.firstname, c.lastname, sum(o.totalamount) SUM
FROM orders o, customers c
WHERE c.customerid=o.customerid and o.orderdate > now() - interval '7 day'
GROUP BY o.customerid, c.customerid ORDER BY SUM DESC LIMIT 100;

Problem with this select that it runs 6 seconds. Total runtime: 5602.532 ms (18 rows)
What can we do? indexing and partitioning will only get you so far, what we can do is to "cache" a view to fit the select statement (frequently used and natively available in ORACLE) - called Materialized Views (or matviews)
While no native matview support exists in PostgreSQL as of version 9.4, it is relatively easy to create a pglsql logic to control the situation. The logic below uses matviews and refreshes them into tables as it goes.

  • Step 1 - Creation of matviews functions and data dictionary

    Create the mat view table (data dictionary) this will include the original view for each mat view and the last refresh date and duration as well.

    
    			CREATE TABLE matviews (
    mv NAME NOT NULL PRIMARY KEY
    , view NAME NOT NULL
    , last_refresh TIMESTAMP WITH TIME ZONE
    , refresh_time INTEGER
    );

    Create the create_matview, drop_matview and refresh_matview as per below:
    
    		CREATE OR REPLACE FUNCTION create_matview(NAME, NAME)
    RETURNS VOID
    SECURITY DEFINER
    LANGUAGE plpgsql AS '
    DECLARE
    matview ALIAS FOR $1;
    view_name ALIAS FOR $2;
    entry matviews%ROWTYPE;
    BEGIN
    SELECT * INTO entry FROM matviews WHERE mv = matview;
    IF FOUND THEN
    RAISE EXCEPTION ''Materialized view ''''%'''' already exists.'',
    matview;
    END IF;
    EXECUTE ''REVOKE ALL ON '' || view_name || '' FROM PUBLIC'';
    EXECUTE ''GRANT SELECT ON '' || view_name || '' TO PUBLIC'';
    EXECUTE ''CREATE TABLE '' || matview || '' AS SELECT * FROM '' ||view_name;
    EXECUTE ''REVOKE ALL ON '' || matview || '' FROM PUBLIC'';
    EXECUTE ''GRANT SELECT ON '' || matview || '' TO PUBLIC'';
    INSERT INTO matviews (mv, view, last_refresh,refresh_time)
    VALUES (matview, view_name, CURRENT_TIMESTAMP,0);
    RETURN;
    END
    ';
    CREATE OR REPLACE FUNCTION drop_matview(NAME) RETURNS VOID SECURITY DEFINER
    LANGUAGE plpgsql AS '
    DECLARE
    matview ALIAS FOR $1;
    entry matviews%ROWTYPE;
    BEGIN
    SELECT * INTO entry FROM matviews WHERE mv = matview;
    IF NOT FOUND THEN
    RAISE EXCEPTION ''Materialized view % does not exist.'', matview;
    END IF;
    EXECUTE ''DROP TABLE '' || matview;
    DELETE FROM matviews WHERE mv=matview;
    RETURN;
    END
    ';
    CREATE OR REPLACE FUNCTION refresh_matview(name) RETURNS VOID
    SECURITY DEFINER
    LANGUAGE plpgsql AS '
    DECLARE
    matview ALIAS FOR $1;
    entry matviews%ROWTYPE;
    start_time TIMESTAMP WITH TIME ZONE;
    BEGIN
    start_time:= timeofday()::timestamp;
    RAISE NOTICE ''Matview Refresh Start Time is %s'',start_time;
    SELECT * INTO entry FROM matviews WHERE mv = matview;
    IF NOT FOUND THEN
    RAISE EXCEPTION ''Materialized view % does not exist.'', matview;
    END IF;
    EXECUTE ''DELETE FROM '' || matview;
    EXECUTE ''INSERT INTO '' || matview
    || '' SELECT * FROM '' || entry.view;
    RAISE NOTICE ''Matview Refresh End Time is %s (total %seconds)'',
    timeofday()::timestamp,round(extract(epoch from(timeofday()::timestamp-start_time)));
    UPDATE matviews
    SET last_refresh= timeofday()::timestamp,
    refresh_time= round(extract(epoch from ( timeofday()::timestamp-start_time)))
    WHERE mv=matview;
    RETURN;
    END
    ';
  • Step 2 - creation of the mission-specific view.

    To create the matview we will use the create_matview function.

    
    		bar=# SELECT create_matview('mv_top_customer', 'v_top_customer');
    create_matview
    ----------------
    (1 row)
    Time: 5559.486 ms

    After that the mv_top_customer matview is created and could be used. You can even create indexes on the matview table as needed (though it is suggested to drop them before refresh in some cases).

    Now select from the created matview will run in a fraction of the time: (vs original select from source tables)

    
    		bar=# explain analyze select * from mv_top_customer;
    Seq Scan on mv_top_customer (cost=0.00..12.70 rows=270 width=272) (actual time=0.008..0.020 rows=100 loops=1)
    Total runtime: 0.044 ms
    (2 rows)
    Time: 0.397 ms
  • Step 3 - Administration of the matview

    To refresh , run select refresh_matview('mv_top_customer');
    
    		bar=# select refresh_matview('mv_top_customer');
    NOTICE: Matview Refresh Start Time is 2013-10-21 23:05:03.297062+03s
    NOTICE: Matview Refresh End Time is 2013-10-21 23:05:08.675053s (total 5 seconds)
    refresh_matview
    -----------------
    (1 row) Time: 5398.158 ms

    Note that refresh still runs the original 6 seconds but we don't care as it is "behind the scenes"

    Last step is to create a simple crontab
    script to refresh the matview as needed, something along the lines of following will work:

    
    		postgres@test-01:~$ crontab -l
    0 * * * * echo "select refresh_matview('mv_top_customer')" | psql -d bar > /tmp/refresh.log 2>&1

    The matviews table will let you know once how long the last refresh took and when the rest refresh occurred.

    
    		bar=# select * from matviews;
    mv | view | last_refresh | refresh_time
    -----------------+----------------+-------------------------------+--------------
    mv_top_customer | v_top_customer | 2013-10-21 23:05:08.675362+03 | 5

    If needed, matview could be dropped using drop_matview('mv_top_customer')

You can change the origin view definition (v_top_customer in this case) and this will be updated in the next matview.
Just note that every change in the view columns or table defenitons into the view will break the
matview and it will have to be dropped and recreated


Questions/Comments? - feel free to contact Cerberus (which is me) on RDBA.net forums under Oracle DBA corner.