Use the to_timestamp() function to convert a String to Timestamp (TimestampType) in PySpark. The converted time is displayed in the default format yyyy-MM-dd HH:mm:ss.SSS. I will explain how to use this function with a few examples.
Syntax – to_timestamp()
This function has two signatures, defined in PySpark SQL Date & Timestamp Functions. The first takes just one argument, which should be a string in the default timestamp format 'yyyy-MM-dd HH:mm:ss.SSS'. When the input is not in this format, it returns null (or raises an error if ANSI mode is enabled).
The second signature takes an additional String argument to specify the format of the input timestamp. It supports the pattern letters of Java's SimpleDateFormat; Spark 3.0 and later use Spark's own datetime patterns, which are largely similar. Using this additional argument, you can convert a String in any format to Timestamp type in PySpark.
Convert String to PySpark Timestamp Type
In the first example, we convert a string that is already in the PySpark default format to Timestamp type. Since the input DataFrame column is in the default timestamp format, we use the first signature for the conversion. The second example uses the cast function to do the same.
In this snippet, we just add a new column, timestamp, by converting the input column from string to Timestamp type.
Custom String Format to Timestamp Type
This example converts an input timestamp string from a custom format to PySpark Timestamp type. To do this, we use the second syntax, which takes an additional argument to specify a user-defined pattern for date-time formatting.
If you want to convert a string to date format instead, use the to_date() function. And here is another example to convert Timestamp to custom string pattern format.