DZone

We built a workflow to train a model. It runs fast enough on our local machine, which may not be especially powerful. So far.

The data set is growing. Each month a considerable number of new records is added. Each month the training workflow becomes slower. Should we start thinking about scalability? Should we consider big data platforms? Could our neat and elegant KNIME workflow be replicated on a big data platform? Indeed it can.

Source: DZone