Implementing a simple k-means clustering model


I have an Excel worksheet with rows of data in columns A through E.
The data represents a time series: 5 consecutive hourly prices for a currency pair.

I wish to train a clustering model where k=25, BUT I want the clusters to be based on the plot or profile of the 5 data points (see examples attached), NOT on the actual prices themselves.

Can this be done in DL4J?

If so, please explain the difference between setting up what I am trying to do (using 5*1 vectors) and straightforward clustering based on the various price levels.

The latter gives you 25 clusters based on the average price of the 5 data points.
This is NOT what I want.



@Bob_M we had a clustering algorithm at one point (see Maven Central Repository Search).

It was originally a community pull request that wasn’t really getting many updates. Due to lack of interest from the user base and customers, I removed it. Commercially (we have to pay the bills too, you know :) ) it didn’t make sense for us to sink time into it.

Rather than leaving code we weren’t updating in the code base, I removed it a few years ago.

You can look into Smile or Tribuo for k-means.
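
On the shape-versus-price-level part of the question: the k-means step is the same in both setups; what differs is the 5-element vector passed to it. One common way to cluster on profile rather than level is to normalise each row first, for example by subtracting the row's mean and dividing by its standard deviation. The sketch below illustrates that idea only. The z-score normalisation is an assumption made for this example (other choices, such as hourly returns, are possible), the class `ShapeKMeans`, its method names and the sample prices are hypothetical, and the clustering is a bare-bones Lloyd's loop with farthest-first initialisation, not the Smile or Tribuo implementation.

```java
import java.util.Arrays;

public class ShapeKMeans {

    // Z-score one row (assumed normalisation): subtract the row mean and divide
    // by the row standard deviation, so the price level and scale drop out.
    public static double[] normalize(double[] row) {
        double mean = Arrays.stream(row).average().orElse(0.0);
        double var = 0.0;
        for (double v : row) var += (v - mean) * (v - mean);
        double std = Math.sqrt(var / row.length);
        double[] out = new double[row.length];
        for (int i = 0; i < row.length; i++) {
            // A completely flat row has std == 0; map it to all zeros.
            out[i] = std == 0.0 ? 0.0 : (row[i] - mean) / std;
        }
        return out;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
        return sum;
    }

    // Minimal Lloyd's k-means; returns the cluster index assigned to each row.
    public static int[] cluster(double[][] data, int k, int maxIter) {
        int n = data.length, d = data[0].length;
        double[][] centroids = new double[k][];
        // Farthest-first initialisation (a simple deterministic choice):
        // start from row 0, then repeatedly take the row farthest from all chosen centroids.
        centroids[0] = data[0].clone();
        for (int c = 1; c < k; c++) {
            int far = 0;
            double farDist = -1.0;
            for (int i = 0; i < n; i++) {
                double nearest = Double.MAX_VALUE;
                for (int p = 0; p < c; p++) {
                    nearest = Math.min(nearest, squaredDistance(data[i], centroids[p]));
                }
                if (nearest > farDist) { farDist = nearest; far = i; }
            }
            centroids[c] = data[far].clone();
        }
        int[] labels = new int[n];
        Arrays.fill(labels, -1); // forces at least one full assignment pass
        for (int iter = 0; iter < maxIter; iter++) {
            boolean changed = false;
            // Assignment step: attach each row to its nearest centroid.
            for (int i = 0; i < n; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(data[i], centroids[c])
                            < squaredDistance(data[i], centroids[best])) best = c;
                }
                if (labels[i] != best) { labels[i] = best; changed = true; }
            }
            if (!changed) break;
            // Update step: move each centroid to the mean of its rows.
            double[][] sums = new double[k][d];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[labels[i]]++;
                for (int j = 0; j < d; j++) sums[labels[i]][j] += data[i][j];
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // empty cluster: keep the old centroid
                for (int j = 0; j < d; j++) centroids[c][j] = sums[c][j] / counts[c];
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows: two rising and two falling profiles at different levels.
        double[][] prices = {
            {1.10, 1.11, 1.12, 1.13, 1.14},
            {1.50, 1.51, 1.52, 1.53, 1.54},
            {1.14, 1.13, 1.12, 1.11, 1.10},
            {1.54, 1.53, 1.52, 1.51, 1.50},
        };
        double[][] shapes = Arrays.stream(prices)
                .map(ShapeKMeans::normalize)
                .toArray(double[][]::new);
        // Rows are grouped by direction, not by price level.
        System.out.println(Arrays.toString(cluster(shapes, 2, 100))); // prints [0, 0, 1, 1]
    }
}
```

With real data you would read the rows from the worksheet, normalise them the same way and call `cluster(shapes, 25, 100)`. Passing the raw rows instead tends to group them by price level, because the level differences between rows are usually much larger than the movements within a 5-hour window.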