
Using datetime in Machine Learning


I have a pandas dataset with various features, including a datetime feature. It looks like this:

               DD SSCL1 SEG_CLASS_CODE  FCLCLD  PASS_BK  SA  AU  DTD  DAY_OF_YEAR
    0  2018-01-01     C              C       1        0   0  18   -1            1
    1  2018-01-01     C              C       0        0   7  26   -1            1
    2  2018-01-01     C              C       0        0   9  18   -1            1
    3  2018-01-01     C              C       1       10   0  18   -1            1
    4  2018-01-01     C              C       0        9   1  18   -1            1

I need to use the DD column to train the model. The problem is how to encode this column.

I can't use cyclic feature encoding, described in "How to handle date variable in machine learning data pre-processing", because in the field for which I am training the model, 2020 is not the same as 2018, and February 2022 is not the same as February 2023. So the same month or day in different years cannot be treated as equivalent.

My idea is to somehow transform the datetime to an integer, for example the total number of days, hours, minutes, or seconds since some starting point, but I do not know which starting point to use (maybe January 1st, 1970, as usual). The easiest way is dataset['DD'].apply(lambda x: x.value), which returns nanoseconds since January 1st, 1970, so I'll get something like this:

    0          1514764800000000000
    1          1514764800000000000
    2          1514764800000000000
    3          1514764800000000000
    4          1514764800000000000
                      ...
    1450583    1577577600000000000
    1450584    1577664000000000000
    1450585    1577664000000000000
    1450586    1577145600000000000
    1450587    1577232000000000000
    Name: DD, Length: 1450588, dtype: int64
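The same conversion can be sketched without apply, as a vectorized subtraction from a reference date. This is only a minimal sketch under assumptions: the three-row frame and the DD_DAYS column name are made up for illustration, and the reference date is a choice, not a requirement (the Unix epoch is used here; any other fixed date, such as the earliest date in the training data, would only shift every value by a constant).

```python
import pandas as pd

# Hypothetical sample standing in for the real dataset (only the DD column).
dataset = pd.DataFrame(
    {"DD": pd.to_datetime(["2018-01-01", "2018-01-02", "2019-12-31"])}
)

# Assumed reference point: the Unix epoch. Any fixed date would do.
reference = pd.Timestamp("1970-01-01")

# Subtracting a Timestamp from a datetime column yields timedeltas;
# .dt.days extracts the whole number of days as integers.
dataset["DD_DAYS"] = (dataset["DD"] - reference).dt.days

print(dataset)
```

Days keep the numbers small and readable; since the dates here carry no time of day, nothing is lost compared with nanoseconds.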

After that I would like to use MinMaxScaler or StandardScaler.
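To make the scaling step concrete, here is a minimal sketch of the arithmetic those two scalers perform, written by hand in pandas rather than with scikit-learn. The day counts are hypothetical values (days since 1970-01-01), and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical day counts standing in for the converted DD column.
days = pd.Series([17532, 17533, 17897, 18261], dtype="float64")

# Min-max scaling to [0, 1]: the formula MinMaxScaler applies
# with its default feature range.
min_max = (days - days.min()) / (days.max() - days.min())

# Standardization to zero mean and unit variance. StandardScaler uses
# the population standard deviation, hence ddof=0 here.
standardized = (days - days.mean()) / days.std(ddof=0)

print(min_max.tolist())
print(standardized.tolist())
```

Both transforms are affine, so the ordering and spacing of dates are preserved. A date later than any seen when the minimum and maximum were computed would map to a value above 1 under min-max scaling.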

So, is there a way to encode the datetime column that meets these requirements?

