Spark SQL Date Functions – Complete List

In this post we will address Spark SQL Date Functions, its syntax and what it does. There are 28 Spark SQL Date functions, meant to address string to date, date to timestamp, timestamp to date, date additions, subtractions and current date conversions. If you are a beginner to Spark SQL, please read our post on Spark tutorial for beginners: Apache Spark Concepts for a refresher. Spark SQL is the Apache Spark module for processing structured data. There are a couple of different ways to begin executing Spark SQL queries.

  • API : When writing and executing Spark SQL from Scala, Java, Python or R, a SparkSession is still the entry point. Once a SparkSession has been established, a DataFrame or a Dataset needs to be created on the data before Spark SQL can be executed.

  • Spark SQL CLI : This Spark SQL Command Line interface is a lifesaver for writing and testing out SQL. However, the SQL is executed against Hive, so make sure test data exists in some capacity.

For experimenting with the various Spark SQL Date Functions, using the Spark SQL CLI is definitely the recommended approach. The table below lists the 28 Spark SQL Date functions as of Spark 2.3.2 documentation release.

Function
Syntax
Description
add_months
add_months(start_date, num_months)
Returns a date with months specified by num_months added
current_date
current_date()
Returns the current date
current_timestamp
current_timestamp()
Returns the current timestamp
date
date(expr)
Converts a string to a date
date_add
date_add(start_date,num_days)
Returns the date plus number of days specified by integer num_days
date_format
date_format(timestamp, fmt)
Returns a date in string format
date_sub
date_sub(start_date, num_days)
Returns a date which is num_days less than start_date
date_trunc
date_trunc(fmt,ts)
Returns a timestamp truncated to the unit specified by format [“YEAR”, “YYYY”, “YY”, “MON”, “MONTH”, “MM”, “DAY”, “DD”, “HOUR”, “MINUTE”, “SECOND”, “WEEK”, “QUARTER”]
datediff
datediff(endDate, startDate)
Returns the difference between the two dates
day
day(date)
Returns the day from the date or timestamp
dayofmonth
dayofmonth(date)
Returns the month from the date or timestamp
dayofweek
dayofweek(date)
Returns the day of week for the date or timestamp
dayofyear
dayofyear(date)
Returns the day of year from the date or timestamp
month
month(date)
Returns the month of the year from the date or timestamp
minute
minute(timestamp)
Returns the minute from the string or timestamp
months_between
months_between(timestamp1, timestamp2)
Returns the number of months between the two timestamps
next_day
next_day(start_date, day_of_week)
Returns the day after the start_date plus day_of_week
now
now()
Returns the current timestamp
quarter
quarter(date)
Returns the quarter of the year from the specified date
second
second(timestamp)
Returns the second component of the string or timestamp
timestamp
timestamp(expr)
Returns the timestamp from the specified expression
to_date
to_date(date_str[, fmt])
Converts a string in date format to a Date datatype specified by fmt
to_timestamp
to_timestamp(timestamp[, fmt])
Converts a string in timestamp format to timestamp specified by fmt
to_unix_timestamp
to_unix_timestamp(expr[, pattern])
Returns the UNIX timestamp
to_utc_timestamp
to_utc_timestamp(timestamp, timezone)
Returns a given timestamp in the UTC timestamp format
unix_timestamp
unix_timestamp([expr[, pattern]])
Returns the UNIX timestamp of current or specified time
weekofyear
weekofyear(date)
Returns the week of the year from the given date
year
year(date)
Returns the year component of the date/timestamp

Spark SQL Related Resources

Programming Guide

The official Apache Spark v2.3.2 Spark SQL Programming Guide.

Application Programming Interface

The official Apache Spark SQL API documentation

Email us at : info@obstkel.com

Copyright 2019 © OBSTKEL LLC. All rights Reserved.
Scroll to Top