SQL analyser for Vertica. A wrapper of uber/queryparser.
Features:
- Provides the following information (check out uber/queryparser for more detail):
- What columns were accessed and in what clauses (
SORT,GROUP, etc.) - What are the join columns (
JOIN ON) - What tables were accessed
- Table lineage
- What columns were accessed and in what clauses (
- Built in catalog usage
- Simple json output
- HTTP server mode
Build:
stack setupstack build
Docker:
docker pull lukasmi/vertica-query-parser:latest(lukasmi/vertica-query-analyser)
Usage:
vq-analyser catalog_file [query_file]- specify catalog file and optionally query file, if query file is not specified then stdin is used.vq-analyser catalog_file -d directory- specify catalog file and directory in which all*.sqlfiles will be analysed. Will create files after the original ones:<original>.sql.jsonon analysis success and<original>.sql.txton failure.vq-analyser catalog_file -s -p port- specify catalog file and port on which to run http server- API
200andapplication/jsonon success400andtext/plainwhen input is incorrect (unparsable/unresolvable)404andtext/plainwhen resource path is specified405andtext/plainwhen not usingPOST500andtext/plainon server internal error503andtext/plainon server processing timeout
- API
Examples:
catalog.sql:
CREATE SCHEMA demo;
CREATE TABLE demo.foo (a INT, b INT);
CREATE TABLE demo.bar AS SELECT * FROM demo.foo;Commands (pipes to stedolan/jq to format the json):
- Single file -
vq-analyser catalog.sql queries.sql | jq '.'- Using docker -
docker run -it -v /host/path:/container/path lukasmi/vertica-query-analyser /container/path/catalog.sql /container/path/queries.sql | jq '.'
- Using docker -
- Using http server mode:
- Start the server -
vq-analyser catalog.sql -s -p 3000- Using docker -
docker run -it -p 3000:3000 -v /host/path:/container/path lukasmi/vertica-query-analyser /container/path/catalog.sql -s -p 3000
- Using docker -
- Issue POST -
cat queries.sql | curl localhost:3000 --data-binary @- | jq '.'
- Start the server -
- Column resolving
queries.sql:
SELECT * FROM demo.bar WHERE a IS NOT NULL ORDER BY a;
- output:
[ { "statement": "SELECT", "lineage": [], "columns": [ { "clause": "ORDER", "column": "demo.bar.a" }, { "clause": "SELECT", "column": "demo.bar.a" }, { "clause": "WHERE", "column": "demo.bar.a" }, { "clause": "SELECT", "column": "demo.bar.b" } ], "tables": [ "demo.bar" ], "joins": [] } ] - Joins
queries.sql:
SELECT foo.a, foo.b FROM demo.foo foo JOIN demo.bar bar ON foo.a = bar.a;
- output:
[ { "statement": "SELECT", "lineage": [], "columns": [ { "clause": "JOIN", "column": "demo.bar.a" }, { "clause": "JOIN", "column": "demo.foo.a" }, { "clause": "SELECT", "column": "demo.foo.a" }, { "clause": "SELECT", "column": "demo.foo.b" } ], "tables": [ "demo.bar", "demo.foo" ], "joins": [ { "left": "demo.bar.a", "right": "demo.foo.a" } ] } ] - Lineage
queries.sql:
CREATE TABLE demo.baz AS SELECT foo.a, foo.b FROM demo.foo foo JOIN demo.bar bar ON foo.a = bar.a;
- output
[ { "statement": "CREATE_TABLE", "lineage": [ { "decedent": "demo.baz", "ancestors": [ "demo.bar", "demo.foo" ] } ], "columns": [ { "clause": "JOIN", "column": "demo.bar.a" }, { "clause": "JOIN", "column": "demo.foo.a" }, { "clause": "SELECT", "column": "demo.foo.a" }, { "clause": "SELECT", "column": "demo.foo.b" } ], "tables": [ "demo.bar", "demo.baz", "demo.foo" ], "joins": [ { "left": "demo.bar.a", "right": "demo.foo.a" } ] } ]