Suppose I work for a national pizza company. We have a large chain of 24/7 pizza restaurants. I want to predict when any given customer is most likely to buy a pizza. These are all people who have ordered from us before, and they can order pizzas multiple times a day.
I divide the 24-hour day into twelve 2-hour time slots.
For training data I create an input record for each customer. Each line looks like this:
1,0,4,0,0,1,0,0,0,3,0,0
which is the total number of pizzas the customer has ever bought in each time slot: 1 pizza in time slot 0, 4 pizzas in time slot 2, and so on.
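Roughly, each record is built like this (a sketch; orderTimes is a stand-in for the per-customer order history):

import java.time.LocalDateTime;
import java.util.List;

// Bucket each historical order into one of twelve 2-hour slots and count.
static int[] buildSlotCounts(List<LocalDateTime> orderTimes) {
    int[] slotCounts = new int[12];
    for (LocalDateTime t : orderTimes) {
        slotCounts[t.getHour() / 2]++; // hours 0..23 map to slots 0..11
    }
    return slotCounts; // e.g. {1,0,4,0,0,1,0,0,0,3,0,0}
}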
The training labels look like this:
1,0,1,0,0,1,0,0,0,1,0,0
which is a 0/1 flag for each time slot: has that customer ever ordered a pizza in that time slot before?
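The labels follow directly from the slot counts (reusing slotCounts from the sketch above):

// 0/1 per slot: has this customer ever ordered in that slot?
int[] labels = new int[12];
for (int i = 0; i < 12; i++) {
    labels[i] = slotCounts[i] > 0 ? 1 : 0;
}
// counts {1,0,4,0,0,1,0,0,0,3,0,0} -> labels {1,0,1,0,0,1,0,0,0,1,0,0}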
The model is a simple classifier with 12 output layers, one per time slot (after all, a customer can order pizzas in more than one slot per day). Each output layer is binary: just a yes/no on whether the customer would buy a pizza in that time slot.
import org.deeplearning4j.nn.conf.ComputationGraphConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Sgd;
import org.nd4j.linalg.lossfunctions.LossFunctions;

ComputationGraphConfiguration.GraphBuilder builder = new NeuralNetConfiguration.Builder()
        .updater(new Sgd(0.01))
        .graphBuilder()
        .addInputs("input")
        // Shared hidden layer over the 12 slot counts
        .addLayer("L1", new DenseLayer.Builder()
                .nIn(columnsInput)
                .nOut(layer1Size)
                .weightInit(WeightInit.ZERO)
                .activation(Activation.SIGMOID)
                .build(), "input");

// One binary output layer per time slot; all 12 are identical, so build them in a loop
String[] outputNames = new String[12];
for (int i = 0; i < 12; i++) {
    outputNames[i] = "out" + (i + 1);
    builder.addLayer(outputNames[i], new OutputLayer.Builder()
            .activation(Activation.SIGMOID)
            .lossFunction(LossFunctions.LossFunction.XENT)
            .nIn(layer1Size).nOut(2).build(), "L1");
}

ComputationGraphConfiguration conf = builder
        .setOutputs(outputNames)
        // InputType.feedForward() takes the feature count (12 here), not the batch size
        .setInputTypes(InputType.feedForward(columnsInput))
        .build();
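Training is then wired up along these lines (a sketch; features, labelsPerSlot, and numEpochs are assumed to be prepared elsewhere, and with nOut(2) each of the 12 label arrays would need two columns, e.g. a [no, yes] one-hot encoding of the 0/1 flags):

import org.deeplearning4j.nn.graph.ComputationGraph;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.MultiDataSet;

ComputationGraph model = new ComputationGraph(conf);
model.init();

// features: [batch, 12] matrix of slot counts.
// labelsPerSlot: 12 label arrays, one per output layer ([batch, 2] each given nOut(2)).
MultiDataSet trainData = new MultiDataSet(new INDArray[]{features}, labelsPerSlot);
for (int epoch = 0; epoch < numEpochs; epoch++) {
    model.fit(trainData);
}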
After training, the model will take any new customer's ordering record, in the same format as the training records, and return a probability for each time slot (output layer).
Does this model make sense, or should I have configured it differently? Imagine a scenario where a customer is presented with a coupon in any time slot where they are predicted to buy a pizza. At this stage the model should simply do a good job of encoding purchase history, but it can then be built upon by adding features such as geography, age, etc.
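For concreteness, the coupon decision I have in mind would look roughly like this (a sketch: the 0.5 cutoff is arbitrary, and reading column 1 as the "yes" probability assumes the two-column labels are encoded [no, yes]):

import org.nd4j.linalg.api.ndarray.INDArray;

// One INDArray per output layer, in the order passed to setOutputs()
INDArray[] slotOutputs = model.output(newCustomerFeatures);
for (int slot = 0; slot < 12; slot++) {
    double pOrder = slotOutputs[slot].getDouble(0, 1); // column 1 = "yes" (assumed encoding)
    if (pOrder > 0.5) { // arbitrary cutoff
        System.out.printf("Offer a coupon in slot %d (p=%.2f)%n", slot, pOrder);
    }
}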
And to be clear, the problem I am working on is simplified here due to confidentiality. It is not actually about real-life pizzas, but the issues are otherwise identical.