Time Series - Sequence-to-Seqeuence Prediction

Hi there,

I would like to predict one week of time series data (the spacing between time steps is always the same: one data point every two minutes). So let’s say I have data from 4 weeks and would like to predict the 5th week. It would be a “4-weeks-sequence-to-1-week-sequence” prediction.

My data looks like this:
train/features: data of weeks 1-4
train/labels: data of week 5 (the week that follows the training data)

How would you handle such a use case?
Would you prefer to work with an LSTM or an Auto-Encoder?

@clasch-student sorry, feel free to ping me if you don’t get a reply within a day or so. Sometimes replies slip through the cracks when things get busy.

For time series, a Conv1D or an LSTM will be fine. I would just try both and see which works better for your use case. There’s no one-size-fits-all solution.


@agibsonccc No worries :slight_smile:

Tried it out with LSTM and the results look quite good.

What I was wondering is whether the format of my data is correct.
I have a train_features.csv with 168 time steps per line and a train_labels.csv with 168 time steps per line. Line 1 of train_features.csv is week 1 and line 1 of train_labels.csv is week 2; line 2 of train_features.csv is week 3 and line 2 of train_labels.csv is week 4, and so on…

What do you think about that? Is my data in a good shape or should I do this in another way to improve the learning/training process?

train_features.csv:

0	2	1	0	0	0	0	0	0	0	0	0	1	0	3	1	1	3	2	0	3	1	1	0	0	0	2	1	0	1	2	1	0	2	5	2	2	0	0	2	1	3	1	1	1	1	2	0	1	0	0	0	0	0	0	2	6	4	3	0	0	0	1	0	2	0	0	0	4	1	4	1	1	3	6	6	8	6	4	8	21	14	17	14	16	16	35	36	38	35	36	41	41	34	30	21	9	12	12	14	7	3	2	4	18	19	25	25	28	23	39	59	57	46	39	33	38	37	30	19	12	9	10	11	6	7	5	4	8	9	5	16	14	20	28	35	42	33	38	31	47	39	21	25	20	10	8	7	3	3	7	8	14	25	28	37	21	30	31	48	52	42	36	24	37	22	37	20
10	10	9	6	3	7	10	7	26	27	33	29	20	15	37	30	44	46	42	28	27	25	18	15	13	6	4	3	1	1	1	1	3	2	1	1	1	3	2	1	1	2	1	1	1	1	3	5	3	1	1	1	1	1	1	1	1	1	1	1	1	3	4	5	1	5	3	2	1	1	1	1	3	4	5	3	5	5	11	12	34	37	51	39	34	48	35	24	18	17	8	3	3	7	9	4	4	3	2	2	6	3	10	20	31	35	40	29	32	29	45	39	35	27	24	23	38	32	29	22	14	8	9	10	8	9	7	14	23	36	37	24	29	35	37	49	58	51	42	41	43	29	26	25	15	5	12	9	3	4	17	12	32	29	41	36	32	34	50	63	41	42	33	28	27	27	22	26
8	11	10	10	4	4	10	12	36	35	35	37	42	38	32	60	36	35	26	37	41	22	17	21	9	5	3	1	0	0	0	0	0	0	0	0	0	0	0	2	1	2	0	4	1	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	2	3	3	1	3	1	6	5	0	2	10	6	8	5	6	11	30	39	48	41	31	42	34	44	56	63	47	46	42	39	39	41	14	9	13	8	2	3	9	26	49	41	53	36	41	43	33	78	64	40	61	42	39	40	35	27	13	11	12	8	6	5	5	13	19	43	63	41	34	35	38	66	58	26	33	39	39	42	29	33	19	17	13	8	12	4	6	10	23	56	36	45	40	30	39	41	48	48	37	45	40	42	47	19
8	3	5	3	0	2	3	2	18	33	30	30	23	32	35	32	52	56	45	35	29	34	22	15	10	4	1	0	0	0	0	0	0	0	0	0	0	0	2	1	2	4	3	4	2	1	1	0	1	0	0	0	0	0	0	0	0	0	0	0	1	3	0	2	2	5	6	2	1	1	7	5	1	5	2	3	4	1	0	8	29	45	52	42	42	32	49	48	58	53	39	44	40	21	28	20	9	6	4	2	4	3	9	6	45	40	57	45	37	28	42	59	65	53	45	36	38	41	50	22	19	10	3	7	4	1	6	13	34	37	46	53	42	34	38	45	63	67	48	39	36	32	43	36	21	6	7	12	10	2	5	20	23	36	33	37	36	32	44	58	60	66	52	44	30	39	25	21
13	12	4	10	8	3	11	7	18	16	25	21	19	20	35	28	35	35	22	16	32	26	22	19	11	1	0	1	2	1	0	0	0	0	0	2	1	0	2	2	3	1	2	3	1	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	3	6	6	1	3	5	4	8	8	2	14	14	33	30	33	36	32	29	41	43	49	55	44	33	26	27	35	21	19	11	15	8	4	2	17	9	41	50	33	45	25	22	38	46	54	56	34	32	23	23	30	32	14	10	9	12	5	4	3	18	27	42	41	27	28	37	62	52	65	46	42	35	32	37	33	38	20	18	9	8	7	11	19	14	38	45	48	33	28	32	44	46	60	44	33	42	47	53	32	17
12	5	14	4	4	4	9	6	31	29	39	19	21	10	19	47	37	21	27	30	27	11	26	14	8	5	1	5	2	1	0	0	1	0	0	0	0	0	2	0	2	2	1	5	4	6	2	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	3	0	0	0	1	0	0	0	2	10	7	6	8	22	31	49	52	55	27	35	39	40	60	56	57	37	25	24	31	21	15	16	11	14	7	8	12	23	26	37	53	45	20	32	64	49	43	64	40	30	38	34	29	25	16	13	14	9	4	5	12	32	30	33	36	40	33	47	52	59	56	50	41	51	53	33	27	15	17	12	8	5	6	7	10	12	28	38	43	31	44	37	57	64	54	49	43	48	32	40	19	10
9	9	10	5	9	5	2	15	21	23	26	29	25	33	48	31	27	31	13	4	28	31	14	13	9	1	0	0	0	0	0	0	0	0	0	0	2	3	3	0	0	3	0	2	1	1	0	1	0	0	0	0	3	0	2	0	0	0	0	0	0	0	1	4	4	1	1	3	8	4	4	2	1	1	5	6	9	6	2	20	35	29	49	40	33	16	22	30	34	44	38	40	31	40	33	23	25	21	10	8	8	5	4	23	30	32	39	35	26	34	43	41	48	46	46	31	26	39	35	23	27	16	5	6	9	8	18	34	34	36	31	29	32	41	36	54	49	47	42	45	40	32	15	11	9	12	7	6	3	3	10	17	23	25	31	24	18	14	23	34	40	36	29	36	35	28	23	28
13	3	11	5	0	5	8	9	15	27	31	23	16	18	30	43	40	41	39	34	36	38	23	19	11	10	0	2	0	0	2	0	0	0	2	1	2	0	3	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	2	1	0	1	0	0	0	1	1	0	0	4	8	2	0	0	0	0	0	8	5	3	10	15	26	32	29	22	21	30	34	46	30	35	36	24	33	23	27	27	22	10	8	7	6	8	10	15	32	37	40	32	29	28	24	41	24	45	34	28	28	26	32	19	20	14	10	9	7	6	6	20	28	32	40	23	26	19	36	40	31	36	36	35	45	37	44	21	17	13	13	14	11	8	14	24	32	26	47	38	33	26	40	39	46	35	28	51	44	24	23	23

train_labels.csv:

13	11	9	7	8	1	8	13	35	36	31	31	18	30	27	30	34	43	32	33	34	23	17	18	17	13	5	4	1	1	1	1	2	1	1	1	1	1	1	2	3	2	2	6	6	8	5	4	2	1	1	1	3	2	2	3	1	1	1	1	1	1	1	1	1	1	2	1	5	3	4	4	2	6	2	4	5	2	10	15	23	35	31	40	29	30	43	45	49	44	40	35	33	37	42	40	24	19	20	12	6	6	18	14	39	32	35	39	35	32	36	64	54	51	45	36	33	36	28	36	18	10	12	2	3	4	4	17	8	20	29	41	45	48	70	46	45	69	58	52	39	27	23	27	21	12	12	10	4	6	7	16	28	33	40	33	28	21	40	41	42	54	34	36	31	26	14	21
10	7	7	10	4	0	8	11	25	22	31	38	28	37	36	37	37	38	35	34	35	29	20	14	3	3	3	1	0	0	0	0	0	0	0	0	0	1	0	0	1	0	0	0	0	2	3	1	1	0	0	0	3	0	0	1	0	0	0	0	0	1	0	0	0	0	0	2	0	2	1	1	1	0	4	2	3	0	10	15	22	34	50	42	33	42	44	44	47	43	38	42	43	43	30	20	15	9	12	24	12	6	15	12	23	48	34	27	22	33	40	43	55	41	42	28	46	46	22	18	8	4	10	10	8	6	11	16	20	26	22	25	20	24	35	49	84	60	46	30	33	37	26	22	8	7	6	8	2	2	3	16	27	53	39	21	26	30	34	47	45	50	43	34	42	41	27	28
17	13	12	7	2	2	2	15	23	40	52	39	38	28	38	31	36	31	36	26	30	34	24	13	6	7	4	1	0	2	2	1	1	0	0	0	0	4	2	0	2	1	0	0	0	2	1	4	2	1	2	6	4	2	2	1	5	3	1	1	0	0	0	0	4	0	0	5	6	7	1	0	1	0	4	5	6	1	6	9	49	46	44	33	30	34	58	53	60	56	64	48	42	33	22	25	13	14	8	7	2	7	7	15	34	31	33	44	35	24	46	50	66	47	51	39	39	29	36	22	33	11	8	8	8	5	3	4	18	35	43	43	32	39	38	43	48	46	39	38	46	51	25	28	23	15	8	6	5	1	2	6	20	32	44	35	26	22	49	47	58	48	41	31	21	38	31	25
15	5	7	9	12	3	11	19	20	36	57	36	23	18	33	27	31	32	28	28	28	21	7	5	4	5	7	3	4	8	3	0	1	2	2	0	0	0	0	0	1	4	1	3	3	2	0	0	0	0	0	2	4	1	0	6	0	0	2	1	0	0	0	1	2	1	0	0	0	0	0	0	0	0	4	7	5	2	17	11	36	55	37	32	29	39	42	39	47	50	39	46	47	44	33	18	9	7	8	8	4	3	12	15	34	28	40	44	29	41	46	63	59	44	44	43	41	35	25	28	21	6	4	4	5	0	9	4	27	48	38	29	19	49	48	59	63	45	39	35	38	34	50	29	28	10	8	7	4	0	7	10	27	31	31	29	27	31	44	45	51	45	42	46	28	21	26	20
9	11	7	5	2	6	7	16	36	39	43	35	29	25	32	45	45	32	40	26	29	21	12	11	3	5	2	0	1	0	0	1	0	0	4	1	3	1	0	2	0	0	0	0	0	2	1	0	0	0	0	0	0	0	0	0	0	2	0	0	0	1	1	0	6	2	3	0	2	1	0	2	0	0	0	3	6	2	9	12	27	38	46	42	34	49	48	52	49	57	53	39	33	52	44	24	10	6	4	3	2	0	9	10	37	32	34	37	27	30	38	57	45	43	48	45	45	38	32	31	13	9	7	7	2	2	7	12	35	29	46	34	34	39	47	64	46	55	42	29	37	33	39	27	26	19	18	11	7	6	12	17	38	33	30	43	33	37	35	62	64	51	48	58	42	29	30	20
7	6	14	6	3	5	12	10	13	29	35	35	32	26	27	31	42	24	36	28	25	10	9	8	3	1	0	1	0	0	0	0	0	1	0	0	0	2	0	0	2	5	0	0	2	0	0	0	0	0	4	2	1	1	0	0	0	0	4	3	0	0	0	2	1	1	1	0	3	1	0	0	1	0	6	7	5	4	11	13	31	45	47	55	42	50	40	55	54	55	35	34	26	30	18	13	8	11	7	9	8	8	11	13	29	34	36	31	27	27	53	47	68	35	43	32	49	33	26	14	13	10	11	9	6	1	11	31	34	49	46	43	22	43	64	55	40	53	46	34	23	16	8	13	9	2	3	1	5	5	15	14	36	44	39	21	22	27	48	54	47	36	26	25	28	20	22	11
16	14	9	8	8	2	1	2	4	5	4	3	2	5	9	18	25	32	30	31	23	32	35	19	8	2	7	2	0	0	2	0	1	0	2	0	0	0	0	0	0	2	2	3	1	0	1	0	0	0	0	2	1	0	0	0	0	0	0	0	0	0	0	2	2	1	2	0	0	0	0	0	1	2	1	2	1	2	0	3	0	2	5	2	4	3	16	20	24	27	38	28	27	26	30	34	18	10	5	12	8	4	10	10	35	32	35	30	20	26	27	46	43	49	36	44	40	30	27	17	7	13	8	8	6	7	6	8	27	24	35	29	28	30	30	31	43	28	45	43	35	15	27	22	20	20	7	5	3	2	2	3	18	28	33	28	22	38	33	48	47	50	54	34	23	32	34	27
14	11	3	5	2	5	7	13	29	34	22	32	27	34	25	39	37	34	44	43	24	18	17	9	11	8	2	2	3	1	1	0	0	0	0	0	0	0	0	0	0	0	0	2	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	2	2	0	0	0	2	3	1	0	0	2	1	2	1	4	8	15	32	41	22	33	40	39	54	65	59	49	44	45	47	46	25	20	21	15	7	13	8	4	7	10	21	32	38	38	49	42	41	52	42	46	31	28	25	36	22	16	9	13	5	2	2	5	7	17	34	30	34	35	33	36	26	38	41	32	31	41	41	32	32	26	18	17	6	5	7	4	7	6	29	27	54	45	32	35	46	46	54	38	41	34	43	33	28	17
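For reference, each tab-separated line of these files can be read into a plain array of time-step values. This is just a minimal plain-Java sketch (the class name is made up, nothing DL4J-specific):

```java
import java.util.Arrays;

public class RowParser {
    // Parse one tab-separated CSV line into a double[] of time-step values.
    static double[] parseRow(String line) {
        return Arrays.stream(line.trim().split("\\s+"))
                     .mapToDouble(Double::parseDouble)
                     .toArray();
    }

    public static void main(String[] args) {
        // Truncated example rows: line i of the features file is one week,
        // line i of the labels file is the week that follows it.
        double[] features = parseRow("0\t2\t1\t0");
        double[] labels   = parseRow("13\t11\t9\t7");
        // Both files must have the same number of time steps per row.
        System.out.println(features.length == labels.length);
    }
}
```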

@agibsonccc Please advise :slight_smile:

@clasch-student yes this all looks good. See here for the description:

As long as you follow the format you’re fine.

Just note that for supervised learning, if you are doing regression, you may want to consider label normalization in addition to input normalization.
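To illustrate what label normalization means here, a minimal min-max scaler in plain Java (a sketch only; in practice DL4J ships normalizers such as NormalizerMinMaxScaler, where fitLabel(true) enables label normalization):

```java
public class LabelScaler {
    private double min, max;

    // Fit on the training labels only, then apply the same transform everywhere.
    public void fit(double[] labels) {
        min = Double.POSITIVE_INFINITY;
        max = Double.NEGATIVE_INFINITY;
        for (double v : labels) {
            if (v < min) min = v;
            if (v > max) max = v;
        }
    }

    // Map a label into [0, 1].
    public double transform(double v) {
        return (v - min) / (max - min);
    }

    // Map a network output back to the original label scale.
    public double inverseTransform(double v) {
        return v * (max - min) + min;
    }
}
```

The inverse transform matters: predictions come out on the normalized scale and need to be mapped back before you interpret them as counts.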


Hi @agibsonccc

Great - thanks for your help - really appreciate it!

Another question popped up: I am facing “gaps” in the time series data. Is there any possibility to fill them automatically in DL4J, or do I need to fill them myself, e.g. with interpolation, during data preparation (before training)?

@clasch-student Generally interpolation or approximation would be the way to go. At this point it will depend on the characteristics of your time series data, though; seasonality being one example. Could you elaborate on the gaps a little bit?

Hi @agibsonccc

Decided to fill the gaps with interpolation, but did it “manually”. Is there a possibility to do it with DL4J? Maybe something similar to pandas.DataFrame.interpolate?
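The “manual” filling might look something like this plain-Java sketch of linear interpolation, with gaps marked as NaN (a hypothetical helper, not a DL4J API):

```java
public class GapFiller {
    // Fill NaN gaps by linear interpolation between the nearest known neighbours.
    // Leading/trailing NaNs are filled with the nearest known value.
    static double[] interpolate(double[] series) {
        double[] out = series.clone();
        int n = out.length;
        int prev = -1; // index of the last non-NaN value seen
        for (int i = 0; i < n; i++) {
            if (Double.isNaN(out[i])) continue;
            if (prev == -1) {
                for (int j = 0; j < i; j++) out[j] = out[i]; // leading gap
            } else if (i - prev > 1) {
                double step = (out[i] - out[prev]) / (i - prev);
                for (int j = prev + 1; j < i; j++) out[j] = out[prev] + step * (j - prev);
            }
            prev = i;
        }
        if (prev >= 0) {
            for (int j = prev + 1; j < n; j++) out[j] = out[prev]; // trailing gap
        }
        return out;
    }
}
```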

@agibsonccc Please advise :slight_smile:

@clasch-student you might want to give tablesaw a shot (see tablesaw/InterpolatorTest.java at master · jtablesaw/tablesaw · GitHub), then just convert the results to ndarrays.

You could also look at using our Python execution support: you can pass ndarrays directly in memory to a Python script, where they are parsed as NumPy arrays, and get the results back out in Java.

Great - will have a look at it - thanks :slight_smile: