Data connectivity & integrationSQL warehousingSQL dialect

SQL dialect

Foundry SQL is generally a subset of Spark SQL with ANSI compliance.

Quick reference

Common operations

Copied!1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- Create a table
CREATE TABLE `/path/to/table` (id INT, name STRING);

-- Insert data
INSERT INTO `/path/to/table` VALUES (1, 'foo'), (2, 'bar');

-- Basic query
SELECT * FROM `/path/to/table` WHERE id = 1;

-- Join tables
SELECT a.*, b.name
FROM `/path/to/table1` a
LEFT JOIN `/path/to/table2` b ON a.id = b.id;

-- Aggregate
SELECT category, COUNT(*), AVG(price)
FROM `/path/to/products`
GROUP BY category
HAVING COUNT(*) > 10;

-- Window function
SELECT name, salary,
       RANK() OVER (ORDER BY salary DESC) as rank
FROM `/path/to/employees`;

-- CTE (Common Table Expression)
WITH summary AS (
  SELECT category, SUM(amount) as total
  FROM `/path/to/sales`
  GROUP BY category
)
SELECT * FROM summary WHERE total > 1000;

Commonly used functions

Category	Functions
Aggregation	`count()`, `sum()`, `avg()`, `min()`, `max()`, `count(DISTINCT col)`, `collect_set()`
String	`upper()`, `lower()`, `trim()`, `concat()`, `concat_ws()`, `contains()`, `split()`, `substr()`
Date/Time	`current_date()`, `current_timestamp()`, `date_add()`, `date_diff()`, `to_date()`, `year()`, `month()`, `day()`
Math	`abs()`, `round()`, `ceil()`, `floor()`, `pow()`, `sqrt()`, `mod()`
Array	`array()`, `array_contains()`, `array_size()`, `array_distinct()`, `explode()`, `flatten()`
Conditional	`CASE WHEN ... THEN ... END`, `coalesce()`, `regexp_like()`, `regexp_extract()`
Window	`ROW_NUMBER()`, `RANK()`, `DENSE_RANK()`, `LAG()`, `LEAD()`

Table references

Copied!1
2
3
4
5
6
7
8
-- By path
SELECT * FROM `/path/to/table`;

-- By RID
SELECT * FROM `ri.rid.for.table.1234`;

-- With branch
SELECT * FROM `/path/to/table`.branch_master;

Key syntax notes

Subqueries must be wrapped in parentheses.
Created tables are Iceberg by default but can be overridden with USING <format> syntax.
Only Iceberg tables support INSERT, UPDATE, and DELETE.

Table creation

CREATE [ OR REPLACE ] [ ( col_name1 col_type1, ... ) ] [ USING table_format ] [ AS select_statement ]

If not specified, the table format will be Iceberg. Valid table formats include:

Iceberg: iceberg
Parquet: parquet
Avro: avro

Copied!1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
-- examples
CREATE TABLE `/path/to/table` AS
SELECT *
FROM `/path/to/table`;

CREATE TABLE `/path/to/table` USING parquet AS
SELECT *
FROM `/path/to/table`;

CREATE TABLE `/path/to/table`
(
    id   INT,
    name STRING
);

CREATE OR REPLACE TABLE `/path/to/table` (id INT, name STRING);

Table alteration

Iceberg tables support data appends with INSERT, UPDATE, and DELETE statements. Non-Iceberg tables do not currently support table alteration.

Insert

INSERT [ INTO ] table_identifier { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] | query }

Copied!1
2
3
4
-- examples
INSERT INTO `/path/to/table` VALUES (1, 'foo'), (2, 'bar');

INSERT INTO `/path/to/table` SELECT id, name FROM `/path/to/table2`;

Update

UPDATE table_identifier
SET column_name = value [, ...]
[ WHERE condition ]

Copied!1
2
3
4
5
6
7
-- examples
UPDATE `/path/to/table`
SET col = 'new value';

UPDATE `/path/to/table`
SET col = 'new value'
WHERE col = 'old value';

Delete

DELETE FROM table_identifier WHERE condition

Copied!1
DELETE FROM `/path/to/table` WHERE col = 'to delete'

Alter metadata

Branches can be created on Iceberg tables with the ALTER TABLE syntax.

ALTER TABLE table_identifier CREATE BRANCH branch_name

Copied!1
ALTER TABLE `/path/to/table` CREATE BRANCH feature_branch

Query composition

[ WITH with_query [ , ... ] ]
select_statement
  [ { UNION | INTERSECT | EXCEPT } select_statement, ... ]
  [ ORDER BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] } ]
  [ WINDOW { window_name AS ( window_spec ) [, ...] } ]
  [ LIMIT { ALL | expression } ]

WITH clause (common table expressions)

Where with_query is:

cte_name [ ( column_name [, ...] ) ] AS ( select_statement )

Copied!1
2
3
4
5
6
7
-- examples
WITH sales_summary (cust_id, total_sales, order_count) AS (
  SELECT customer_id, SUM(total_amount), COUNT(*)
  FROM `/data/orders`
  GROUP BY customer_id
)
SELECT * FROM sales_summary WHERE total_sales > 10000;

SELECT statement

Where select_statement is:

SELECT [ ALL | DISTINCT ] { [ named_expression | regex_column_names | * ] [, ...] }
  FROM { from_item [, ...] }
  [ WHERE boolean_expression ]
  [ GROUP BY { expression | ROLLUP(...) | CUBE(...) | GROUPING SETS(...) } [ , ... ] ]
  [ HAVING boolean_expression ]
  [ LIMIT { ALL | expression } [ OFFSET expression ] ]

Named expressions

Where named_expression is one of:

Expression with optional alias:

expression [ AS alias ]

Copied!1
2
3
4
-- examples
SELECT count(*) AS total_count FROM `/path/to/table`;

SELECT name, salary * 1.1 AS adjusted_salary FROM `/path/to/employees`;

Window function:

window_function() OVER ( window_spec )
window_function() OVER window_name

Where window_spec is:

[ PARTITION BY expression [, ...] ]
[ ORDER BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] ]

Copied!1
2
3
4
5
6
7
8
9
10
11
12
13
-- examples
SELECT name, ROW_NUMBER() OVER (ORDER BY salary DESC) as rank
FROM `/path/to/employees`;

SELECT category, product_id,
       ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) as category_rank
FROM `/path/to/products`;

-- Using named window
SELECT customer_id, order_date, total,
       ROW_NUMBER() OVER w as order_sequence
FROM `/path/to/orders`
WINDOW w AS (PARTITION BY customer_id ORDER BY order_date);

FROM clause

Where from_item is one of:

Table reference

Tables can be referenced by filesystem path or resource identifier (RID). Both support optional branch specifications for querying specific Iceberg branches. Branch names that contain special characters need to be escaped with backticks.

{ `/path/to/table`[.branch_<branch name>] | `ri.rid.for.table.1234`[.branch_<branch name>] }

Copied!1
2
3
4
5
6
7
8
9
10
-- examples
SELECT * FROM `/path/to/table`;

SELECT * FROM `/path/to/table`.branch_master;

SELECT * FROM `/path/to/table` WHERE id = 1;

SELECT * FROM `ri.rid.for.table.1234`;

SELECT * FROM `ri.rid.for.table`.`branch_abc\xyz`;

Time travel queries to a specific version are possible with the VERSION AS OF syntax using a snapshot ID for Iceberg tables and a transaction RID for Foundry datasets.

Copied!1
2
3
4
-- Examples
SELECT * FROM `/path/to/table` VERSION AS OF `ri.foundry.main.transaction.1234`;

SELECT * FROM `/path/to/table` VERSION AS OF 1234567;

Other time travel syntax is not currently supported.

Join

relation
  { CROSS | LEFT [ OUTER ] | RIGHT [ OUTER ] | FULL [ OUTER ] }
  JOIN relation
  ON boolean_expression

Copied!1
2
3
4
-- examples
SELECT * FROM `/path/to/table` LEFT JOIN `/path/to/table2` ON id = id2;

SELECT * FROM `/path/to/table` RIGHT OUTER JOIN `/path/to/table2` ON id = id2;

Subquery

( select_statement )

Copied!1
2
3
4
5
6
7
8
9
-- examples
SELECT * FROM (SELECT * FROM `/path/to/table`);

SELECT * FROM (SELECT * FROM `/path/to/table`) WHERE id = 1;

SELECT * FROM (
  VALUES (9001, 1001, 1, DATE '2023-06-01', 'credit_card', 1329.98, 'completed',
          TIMESTAMP '2023-06-01 11:00:00')
) AS orders(order_id, customer_id, product_id, order_date, payment_method, total, status, created_at);

Subqueries must be wrapped in parentheses. This is enforced in Foundry even though it is not always required in Spark SQL.

Complete query examples

Copied!1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- Complex query with joins and filtering
SELECT DISTINCT o.order_id,
                c.name,
                o.total_amount,
                p.category
FROM `/data/orders` o
         LEFT JOIN `/data/customers` c ON o.customer_id = c.id
         CROSS JOIN (SELECT product_id, category
                     FROM `/data/products`
                     WHERE category = 'Electronics') p ON o.product_id = p.product_id
WHERE o.order_date > DATE '2024-01-01'
  AND o.total_amount > 1000.0
ORDER BY o.total_amount DESC
LIMIT 100;

-- Aggregation with HAVING clause
SELECT c.name,
       COUNT(o.order_id) as order_count,
       SUM(o.total_amount) as total_spent
FROM `/data/customers` c
       LEFT JOIN `/data/orders` o ON c.id = o.customer_id
GROUP BY c.name
HAVING COUNT(o.order_id) > 0;

-- UNION query with multiple joins
SELECT emp_id, name, dept, 'Active' AS status
FROM `/hr/employees` e
         LEFT OUTER JOIN `/hr/departments` d ON e.dept_id = d.id
WHERE e.active = true
UNION
SELECT emp_id, name, dept, 'Inactive' AS status
FROM (SELECT *
      FROM `/hr/employees`
      WHERE termination_date < current_timestamp()) e
         RIGHT JOIN `/hr/departments` d ON e.dept_id = d.id
WHERE e.active = false
ORDER BY dept ASC, status DESC NULLS LAST;

Iceberg references

Foundry SQL supports Iceberg references to query metadata. Review the official documentation ↗.

Copied!1
2
3
4
-- examples
SELECT * FROM `/path/to/table/`.files;

SELECT * FROM `/path/to/table/`.snapshots;

Functions

Functions are available to transforms data in more complex ways. Refer to the Spark SQL documentation ↗ for detailed information on each function below.

Array functions

Function	Description	Example	Furnace	Ontology
`array(expr, ...)`	Create an array with given elements	`SELECT array(1, 2, 3)`	Supported	Supported
`array_distinct(array)`	Remove duplicate values from array	`SELECT array_distinct(tags) FROM table`	Supported	Not supported
`array_intersect(array1, array2)`	Elements in both arrays without duplicates	`SELECT array_intersect(arr1, arr2) FROM table`	Supported	Not supported
`array_union(array1, array2)`	Elements in either array without duplicates	`SELECT array_union(arr1, arr2) FROM table`	Supported	Not supported
`array_except(array1, array2)`	Elements in array1 but not in array2	`SELECT array_except(arr1, arr2) FROM table`	Supported	Not supported
`array_join(array, delimiter [, nullReplacement])`	Concatenate array elements with delimiter	`SELECT array_join(tags, ',') FROM table`	Supported	Not supported
`array_max(array)`	Maximum value in array	`SELECT array_max(ARRAY(1, 5, 3))`	Supported	Not supported
`array_min(array)`	Minimum value in array	`SELECT array_min(ARRAY(1, 5, 3))`	Supported	Not supported
`array_position(array, element)`	Position of first occurrence (1-based, 0 if not found)	`SELECT array_position(tags, 'item') FROM table`	Supported	Not supported
`array_remove(array, element)`	Remove all matching elements	`SELECT array_remove(tags, 'old') FROM table`	Supported	Not supported
`array_size(array)`	Number of elements in array	`SELECT array_size(tags) FROM table`	Supported	Not supported
`array_contains(array, value)`	True if array contains value	`SELECT array_contains(tags, 'new') FROM table`	Supported	Not supported
`array_repeat(element, count)`	Create array with element repeated count times	`SELECT array_repeat('x', 5)`	Supported	Not supported
`arrays_overlap(array1, array2)`	True if arrays have common non-null elements	`SELECT arrays_overlap(arr1, arr2) FROM table`	Supported	Not supported
`flatten(arrayOfArrays)`	Flatten nested arrays into single array	`SELECT flatten(nested_arrays) FROM table`	Supported	Not supported
`sequence(start, stop, step)`	Generate sequence of integers	`SELECT sequence(1, 10, 2)`	Supported	Not supported
`shuffle(array)`	Random permutation of array	`SELECT shuffle(tags) FROM table`	Supported	Not supported
`slice(array, start, length)`	Subset of array from start with length	`SELECT slice(tags, 1, 2) FROM table`	Supported	Not supported
`sort_array(array[, ascendingOrder])`	Sort array in ascending (default) or descending order	`SELECT sort_array(tags) FROM table`	Supported	Not supported
`get(array, index)`	Returns element at given index (0-based)	`SELECT get(tags, 0) FROM table`	Supported	Not supported
`explode(array)`	Generates a new row for each element in the array	`SELECT explode(tags) FROM table`	Supported	Not supported

Map functions

Function	Description	Example	Furnace	Ontology
`map(key1, value1, key2, value2, ...)`	Create a map from key-value pairs	`SELECT map('a', 1, 'b', 2)`	Supported	Not supported

Struct functions

Function	Description	Example	Furnace	Ontology
`struct(expr1, expr2, ...)`	Create a struct from expressions	`SELECT struct('red', 'large', 2.0)`	Supported	Not supported

Date and timestamp functions

Function	Description	Example	Furnace	Ontology
`current_timestamp()`	Returns current timestamp	`SELECT current_timestamp()`	Supported	Supported
`current_date()`	Returns current date	`SELECT current_date()`	Supported	Supported
`date_add(expr, num_days)`	Add days to date	`date_add(order_date, 7)`	Supported	Not supported
`date_sub(expr, num_days)`	Subtract days from date	`date_sub(order_date, 7)`	Supported	Not supported
`from_unixtime(expr [, format])`	Convert Unix timestamp to timestamp	`from_unixtime(1609459200)`	Supported	Not supported
`unix_timestamp([expr] [, format])`	Convert timestamp to Unix timestamp	`unix_timestamp(current_timestamp())`	Supported	Not supported
`day(date)`	Extract day of month from date	`SELECT day(order_date) FROM table`	Supported	Supported
`hour(timestamp)`	Extract hour from timestamp	`SELECT hour(order_timestamp) FROM table`	Supported	Not supported
`minute(timestamp)`	Extract minute from timestamp	`SELECT minute(order_timestamp) FROM table`	Supported	Not supported
`month(date)`	Extract month from date	`SELECT month(order_date) FROM table`	Supported	Supported
`quarter(date)`	Extract quarter from date	`SELECT quarter(order_date) FROM table`	Supported	Supported
`second(timestamp)`	Extract second from timestamp	`SELECT second(order_timestamp) FROM table`	Supported	Not supported
`year(date)`	Extract year from date	`SELECT year(order_date) FROM table`	Supported	Supported
`to_date(str [, format])`	Convert string to date	`SELECT to_date('2024-01-15')`	Supported	Not supported
`date_diff(date1, date2)`	Subtract two dates	`SELECT date_diff('2020-01-01', '2020-01-02')`	Supported	Supported

Mathematical functions

Function	Description	Example	Furnace	Ontology
`abs(x)`	Absolute value	`SELECT abs(-5)`	Supported	Supported
`acos(x)`	Arc cosine	`SELECT acos(0.5)`	Supported	Not supported
`asin(x)`	Arc sine	`SELECT asin(0.5)`	Supported	Not supported
`atan(x)`	Arc tangent	`SELECT atan(1)`	Supported	Not supported
`atan2(y, x)`	Arc tangent of y/x	`SELECT atan2(1, 1)`	Supported	Not supported
`cbrt(x)`	Cube root	`SELECT cbrt(27)`	Supported	Not supported
`ceil(x)`	Round up to nearest integer	`SELECT ceil(3.7)`	Supported	Not supported
`ceiling(x)`	Round up to nearest integer	`SELECT ceiling(3.7)`	Supported	Not supported
`cos(x)`	Cosine	`SELECT cos(pi())`	Supported	Not supported
`cosh(x)`	Hyperbolic cosine	`SELECT cosh(0)`	Supported	Not supported
`degrees(x)`	Convert radians to degrees	`SELECT degrees(pi())`	Supported	Not supported
`e()`	Euler's number	`SELECT e()`	Supported	Not supported
`exp(x)`	e raised to power x	`SELECT exp(1)`	Supported	Not supported
`floor(x)`	Round down to nearest integer	`SELECT floor(3.7)`	Supported	Not supported
`ln(x)`	Natural logarithm	`SELECT ln(e())`	Supported	Not supported
`log(base, x)`	Logarithm with specified base	`SELECT log(2, 8)`	Supported	Not supported
`log10(x)`	Base 10 logarithm	`SELECT log10(100)`	Supported	Not supported
`log2(x)`	Base 2 logarithm	`SELECT log2(8)`	Supported	Not supported
`mod(n, m)`	Modulus (remainder)	`SELECT mod(10, 3)`	Supported	Not supported
`pi()`	Pi constant	`SELECT pi()`	Supported	Not supported
`pow(x, p)`	x raised to power p	`SELECT pow(2, 3)`	Supported	Not supported
`power(x, p)`	x raised to power p	`SELECT power(2, 3)`	Supported	Not supported
`radians(x)`	Convert degrees to radians	`SELECT radians(180)`	Supported	Not supported
`rand()`	Random value between 0 and 1	`SELECT rand()`	Supported	Not supported
`random()`	Random value between 0 and 1	`SELECT random()`	Supported	Not supported
`round(x, d)`	Round to d decimal places	`SELECT round(3.7, 1)`	Supported	Not supported
`sign(x)`	Sign function (-1, 0, 1)	`SELECT sign(-5)`	Supported	Not supported
`sin(x)`	Sine	`SELECT sin(pi() / 2)`	Supported	Not supported
`sinh(x)`	Hyperbolic sine	`SELECT sinh(0)`	Supported	Not supported
`sqrt(x)`	Square root	`SELECT sqrt(16)`	Supported	Not supported
`tan(x)`	Tangent	`SELECT tan(pi() / 4)`	Supported	Not supported
`tanh(x)`	Hyperbolic tangent	`SELECT tanh(0)`	Supported	Not supported

String functions

Function	Description	Example	Furnace	Ontology
`chr(n)`	Returns character from Unicode code point	`SELECT chr(65)`	Supported	Not supported
`char(n)`	Returns character from code point (alias for chr)	`SELECT char(65)`	Supported	Not supported
`contains(str, substr)`	True if string contains substring	`SELECT contains('hello', 'ell')`	Supported	Not supported
`concat_ws(sep, str1, ...)`	Concatenate strings with separator	`SELECT concat_ws(',', 'a', 'b')`	Supported	Not supported
`concat(str1, ...)`	Concatenate strings	`SELECT concat('a', 'b')`	Supported	Supported
`length(str)`	Length of string in characters	`SELECT length('hello')`	Supported	Supported
`char_length(str)`	Length of string (alias for length)	`SELECT char_length('hello')`	Supported	Supported
`character_length(str)`	Length of string (alias for length)	`SELECT character_length('hello')`	Supported	Supported
`lower(str)`	Convert to lowercase	`SELECT lower('HELLO')`	Supported	Supported
`upper(str)`	Convert to uppercase	`SELECT upper('hello')`	Supported	Supported
`lpad(str, len, pad)`	Left pad string to length	`SELECT lpad('hi', 5, 'x')`	Supported	Not supported
`rpad(str, len, pad)`	Right pad string to length	`SELECT rpad('hi', 5, 'x')`	Supported	Not supported
`ltrim(str)`	Remove leading whitespace	`SELECT ltrim(' hello')`	Supported	Not supported
`rtrim(str)`	Remove trailing whitespace	`SELECT rtrim('hello ')`	Supported	Not supported
`trim(str)`	Remove leading and trailing whitespace	`SELECT trim(' hello ')`	Supported	Supported
`replace(str, search, replace)`	Replace all occurrences	`SELECT replace('hello', 'l', 'x')`	Supported	Supported
`substr(str, pos, len)`	Extract substring	`SELECT substr('hello', 2, 3)`	Supported	Supported
`substring(str, pos, len)`	Extract substring	`SELECT substring('hello', 2, 3)`	Supported	Supported
`split(str, delimiter)`	Split string into array	`SELECT split('a,b,c', ',')`	Supported	Not supported
`split_part(str, delimiter, index)`	Get part from split string (1-based)	`SELECT split_part('a,b,c', ',', 2)`	Supported	Not supported
`instr(str, substr)`	Find substring position (1-based, 0 if not found)	`SELECT instr('hello', 'll')`	Supported	Not supported
`position(substr IN str)`	Find substring position (SQL standard syntax)	`SELECT position('ll' IN 'hello')`	Supported	Not supported
`startswith(str, prefix)`	Check if string starts with prefix	`SELECT startswith('hello', 'he')`	Supported	Not supported
`luhn_check(str)`	Validate string using Luhn algorithm	`SELECT luhn_check('79927398713')`	Supported	Not supported

Conditional functions

Function	Description	Example	Furnace	Ontology
`coalesce(expr1, expr2, ...)`	Returns first non-null expression	`SELECT coalesce(col, 'default')`	Supported	Not supported
`regexp_like(str, pattern)`	Test if string matches regex pattern	`SELECT regexp_like('hello123', '[0-9]+')`	Supported	Not supported
`regexp_extract(str, pattern)`	Extract first substring matching pattern	`SELECT regexp_extract('hello123', '[0-9]+')`	Supported	Supported
`regexp_extract_all(str, pattern)`	Extract all substrings matching pattern	`SELECT regexp_extract_all('a1b2c3', '[0-9]+')`	Supported	Not supported
`regexp_replace(str, pattern, replacement)`	Replace all matches with replacement string	`SELECT regexp_replace('hello123', '[0-9]+', 'X')`	Supported	Supported
`regexp_count(str, pattern)`	Count occurrences of pattern in string	`SELECT regexp_count('a1b2c3', '[0-9]+')`	Supported	Not supported

Conversion functions

Function	Description	Example	Furnace	Ontology
`cast(expr AS type)`	Convert value to specified type	`cast(count() AS float)`	Supported	Supported
`try_cast(expr AS type)`	Convert value to specified type, returns NULL on failure	`try_cast('abc' AS int)`	Supported	Not supported

Aggregate functions

Function	Description	Example	Furnace	Ontology
`count()`	Count all rows	`SELECT count() FROM table`	Supported	Supported
`count(DISTINCT col)`	Count distinct values in column	`SELECT count(DISTINCT id) FROM table`	Supported	Supported if single property
`sum(col)`	Sum values in column	`SELECT sum(cost) FROM table`	Supported	Supported
`avg(col)`	Mean average values in column	`SELECT avg(weight) FROM table`	Supported	Supported
`min(col)`	Minimum value in column	`SELECT min(price) FROM table`	Supported	Supported
`max(col)`	Maximum value in column	`SELECT max(price) FROM table`	Supported	Supported
`min_by(x, y)`	Value of x associated with minimum value of y	`SELECT min_by(name, age) FROM table`	Supported	Not supported
`max_by(x, y)`	Value of x associated with maximum value of y	`SELECT max_by(name, age) FROM table`	Supported	Not supported
`count_if(condition)`	Count rows where condition is true	`SELECT count_if(age > 18) FROM table`	Supported	Not supported
`bool_and(condition)`	True if all values are true	`SELECT bool_and(active) FROM table`	Supported	Not supported
`bool_or(condition)`	True if any value is true	`SELECT bool_or(active) FROM table`	Supported	Not supported
`bit_and(col)`	Bitwise AND of all non-null integer values	`SELECT bit_and(flags) FROM table`	Supported	Not supported
`bit_or(col)`	Bitwise OR of all non-null integer values	`SELECT bit_or(flags) FROM table`	Supported	Not supported
`bit_xor(col)`	Bitwise XOR of all non-null integer values	`SELECT bit_xor(flags) FROM table`	Supported	Not supported
`every(condition)`	True if all values are true (alias for bool_and)	`SELECT every(active) FROM table`	Supported	Not supported
`stddev(col)`	Sample standard deviation	`SELECT stddev(salary) FROM table`	Supported	Supported
`stddev_pop(col)`	Population standard deviation	`SELECT stddev_pop(salary) FROM table`	Supported	Supported
`stddev_samp(col)`	Sample standard deviation (alias for stddev)	`SELECT stddev_samp(salary) FROM table`	Supported	Supported
`variance(col)`	Sample variance	`SELECT variance(salary) FROM table`	Supported	Not supported
`var_pop(col)`	Population variance	`SELECT var_pop(salary) FROM table`	Supported	Not supported
`var_samp(col)`	Sample variance	`SELECT var_samp(salary) FROM table`	Supported	Not supported
`corr(y, x)`	Pearson correlation coefficient	`SELECT corr(sales, ads) FROM table`	Supported	Not supported
`covar_pop(y, x)`	Population covariance	`SELECT covar_pop(y, x) FROM table`	Supported	Not supported
`covar_samp(y, x)`	Sample covariance	`SELECT covar_samp(y, x) FROM table`	Supported	Not supported
`skewness(col)`	Skewness	`SELECT skewness(values) FROM table`	Supported	Not supported
`regr_intercept(y, x)`	Linear regression intercept	`SELECT regr_intercept(y, x) FROM table`	Supported	Not supported
`regr_slope(y, x)`	Linear regression slope	`SELECT regr_slope(y, x) FROM table`	Supported	Not supported
`grouping(col)`	Returns 1 if column is aggregated, 0 otherwise (for use with ROLLUP/CUBE/GROUPING SETS)	`SELECT grouping(category) FROM table GROUP BY ROLLUP(category)`	Supported	Not supported
`collect_set(col)`	Returns an array of unique values from column	`SELECT collect_set(name) FROM table GROUP BY category`	Supported	Not supported

Window functions

Function	Description	Example	Furnace	Ontology
`ROW_NUMBER()`	Assigns sequential row number within partition	`ROW_NUMBER() OVER (ORDER BY col)`	Supported	Supported
`RANK()`	Rank with gaps for ties	`RANK() OVER (ORDER BY salary DESC)`	Supported	Supported
`DENSE_RANK()`	Rank without gaps for ties	`DENSE_RANK() OVER (ORDER BY salary DESC)`	Supported	Supported
`PERCENT_RANK()`	Percentage ranking of value in group	`PERCENT_RANK() OVER (ORDER BY salary)`	Supported	Not supported
`CUME_DIST()`	Cumulative distribution of value	`CUME_DIST() OVER (ORDER BY salary)`	Supported	Not supported
`NTILE(n)`	Divide rows into n buckets	`NTILE(4) OVER (ORDER BY salary)`	Supported	Not supported
`LAG(expr[, offset[, default]])`	Value from previous row	`LAG(salary, 1) OVER (ORDER BY date)`	Supported	Supported
`LEAD(expr[, offset[, default]])`	Value from next row	`LEAD(salary, 1) OVER (ORDER BY date)`	Supported	Supported
`FIRST_VALUE(expr)`	First value in window frame	`FIRST_VALUE(salary) OVER (ORDER BY date)`	Supported	Supported
`LAST_VALUE(expr)`	Last value in window frame	`LAST_VALUE(salary) OVER (ORDER BY date)`	Supported	Supported
`NTH_VALUE(expr, n)`	Nth value in window frame	`NTH_VALUE(salary, 2) OVER (ORDER BY date)`	Supported	Supported
Aggregate functions	Standard aggregates (count, sum, avg, min, max, stddev, variance, etc.) can be used as window functions	`SUM(amount) OVER (PARTITION BY customer_id)`	Supported	Only count, sum, min max, avg

Window functions require an OVER clause with optional PARTITION BY and ORDER BY:

Copied!1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
-- examples
SELECT id, name, ROW_NUMBER() OVER (ORDER BY name) as row_num
FROM `/path/to/table`;

SELECT category, name, price,
       ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) as rank_in_category
FROM `/path/to/products`;

SELECT order_id, customer_id, order_date, total_amount,
       SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) as running_total,
        AVG(total_amount) OVER (PARTITION BY customer_id) as customer_avg,
        COUNT(*) OVER (PARTITION BY customer_id) as customer_order_count
FROM `/path/to/orders`;

SELECT id, amount,
       ROW_NUMBER() OVER w as sequence_num
FROM `/path/to/transactions`
WINDOW w AS (PARTITION BY customer_id ORDER BY transaction_date);

Misc functions

Function	Description	Example	Furnace	Ontology
`zorder()`	Iceberg Z-order optimization (Iceberg only)	`CALL zorder('table', 'col1,col2')`	Supported	Not supported

Geospatial functions

Function	Description	Example	Furnace	Ontology
`st_point(x, y)`	Creates a point geometry from X, Y coordinates	`SELECT st_point(-122.4, 37.8)`	Supported	Not supported
`st_contains(geom1, geom2)`	Returns true if geom1 contains geom2	`WHERE st_contains(boundary, location)`	Supported	Not supported
`st_equals(geom1, geom2)`	Returns true if geometries are spatially equal	`SELECT st_equals(point1, point2)`	Supported	Not supported
`st_geomfromgeojson(json)`	Creates geometry from GeoJSON string	`SELECT st_geomfromgeojson('{"type":"Point","coordinates":[0,0]}')`	Supported	Not supported
`st_astext(geom)`	Returns geometry as Well-Known Text (WKT) string	`SELECT st_astext(st_point(0, 0))`	Supported	Not supported
`st_h3cellids(geom, level, fullCover)`	Returns an array of H3 cells that cover the geometry	`SELECT st_h3cellids(st_point(0, 0), 8, true)`	Supported	Not supported

Geometry columns cannot be returned directly. Use st_astext() to convert to WKT format.

Arithmetic operators

Operator	Description	Example	Furnace	Ontology
`+`	Plus	`SELECT count() + 5`	Supported	Supported
`-`	Minus	`SELECT count() - 1`	Supported	Supported
`*`	Times	`SELECT sum(weight * 2)`	Supported	Supported
`/`	Divide	`SELECT (avg(a) + avg(b)) / 2`	Supported	Supported

Boolean operators

Operator	Description	Example	Furnace	Ontology
`=`	Equal to	`WHERE id = 1`	Supported	Supported
`<>`, `!=`	Not equal to	`WHERE status <> 'inactive'`	Supported	Supported
`<`	Less than	`WHERE age < 18`	Supported	Supported
`<=`	Less than or equal to	`WHERE price <= 100.0`	Supported	Supported
`>`	Greater than	`WHERE salary > 50000`	Supported	Supported
`>=`	Greater than or equal to	`WHERE score >= 90`	Supported	Supported
`AND`	Logical AND	`WHERE active = true AND verified = true`	Supported	Supported
`OR`	Logical OR	`WHERE status = 'new' OR status = 'pending'`	Supported	Supported
`NOT`	Logical NOT	`WHERE NOT deleted`	Supported	Supported
`IN`	Value in list	`WHERE category IN ('A', 'B', 'C')`	Supported	Supported
`LIKE`	Pattern matching	`WHERE name LIKE 'John%'`	Supported	Supported
`EXISTS`	True if subquery returns rows	`WHERE EXISTS (SELECT 1 FROM orders WHERE orders.customer_id = customers.id)`	Supported	Supported

Other expressions

Expression	Description	Example	Furnace	Ontology
`CASE ... WHEN`	Switch expression	`SELECT CASE WHEN mark >= 70 THEN 'A' WHEN mark >= 60 THEN 'B' ELSE 'C' END`	Supported	Supported
`INTERVAL`	Time interval literal	`SELECT order_date + INTERVAL 30 DAY`, `WHERE created_at > current_timestamp() - INTERVAL 1 HOUR`	Supported	Not supported

The IF(condition, trueValue, falseValue) function is not supported. Use CASE WHEN condition THEN trueValue ELSE falseValue END instead.

Case-when statements

Copied!1
2
3
4
CASE WHEN condition THEN result
     [WHEN condition THEN result]
     [ELSE result]
END

Date/time intervals

Copied!1
2
3
4
5
6
7
8
9
10
11
INTERVAL { yearMonthIntervalQualifier | dayTimeIntervalQualifier }

yearMonthIntervalQualifier
 { YEAR [TO MONTH] |
   MONTH }

dayTimeIntervalQualifier
 { DAY [TO { HOUR | MINUTE | SECOND } ] |
   HOUR [TO { MINUTE | SECOND } ] |
   MINUTE [TO SECOND] |
   SECOND }

Copied!1
2
3
4
5
-- examples
INTERVAL '1' YEAR
INTERVAL '6' MONTH
INTERVAL '25' DAY
INTERVAL '6' YEAR TO MONTH

Data types

Type	Spark equivalent	Furnace	Ontology
`BOOLEAN`	`BOOLEAN`	Supported	Supported
`SHORT`	`SHORT`	Supported	Supported
`INT`	`INT`	Supported	Supported
`LONG`	`LONG`	Supported	Supported
`FLOAT`	`FLOAT`	Supported	Supported
`DOUBLE`	`DOUBLE`	Supported	Supported
`DATE`	`DATE`	Supported	Supported
`TIMESTAMP`	`TIMESTAMP`	Supported	Supported
`TIMESTAMP_NTZ`	`TIMESTAMP_NTZ`	Supported	Supported
`STRING`	`STRING`	Supported	Supported
`BINARY`	`BINARY`	Supported	Not supported
`DECIMAL`	`DECIMAL`	Supported	Supported
`MAP<key, value>`	`MAP<key, value>`	Supported	Not supported
`ARRAY<element_type>`	`ARRAY<element_type>`	Supported	Supported
`STRUCT<field1_name: field1_type, field2_name: field2_type, …>`	`STRUCT<field1_name: field1_type, field2_name: field2_type, …>`	Supported	Not supported

Copied!1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
CREATE TABLE `/path/to/table` (
  col1 BOOLEAN,
  col2 SHORT,
  col3 INT,
  col4 LONG,
  col5 FLOAT,
  col6 DOUBLE,
  col7 DATE,
  col8 TIMESTAMP,
  col9 TIMESTAMP_NTZ,
  col10 STRING,
  col11 BINARY,
  col12 DECIMAL(10, 2),
  col13 MAP<STRING, STRING>,
  col14 ARRAY<STRING>,
  col15 STRUCT<col1: STRING, col2: STRING>
);

←

PREVIOUSOntology SQL

NEXTSQL warehousing in Foundry

→