The pyspark.sql.functions module provides the to_date() function, which converts a timestamp to a date (DateType) by truncating the time part of the Timestamp column. In this tutorial, I will show you PySpark examples of how to convert timestamps to dates using both the DataFrame API and SQL.
to_date() – converts a Timestamp to a Date.
A PySpark timestamp (TimestampType) holds a value in the format yyyy-MM-dd HH:mm:ss.SSSS, and a date (DateType) uses the format yyyy-MM-dd. Use the to_date() function to truncate the time from a timestamp, that is, to convert a timestamp column to a date column on a DataFrame.
Using to_date() – Convert Timestamp String to Date
In this example, we will use the to_date() function to convert a TimestampType (or string) column to a DateType column. The input to this function should be a timestamp column or a string in the timestamp format, and it returns only the date part as a DateType column.
Convert TimestampType (timestamp) to DateType (date)
This example converts the PySpark TimestampType column to DateType.
Using Column cast() Function
Here is another way to convert a TimestampType column (or a timestamp string) to DateType, using the cast() function.
PySpark SQL – Convert Timestamp to Date
The following are similar examples using PySpark SQL. If you come from an SQL background, these come in handy.
Complete Example
In this example, we have learned how to convert a timestamp column to a date column using the to_date() and cast() functions.