Beware of relying on future data in machine learning

Machine learning almost always involves analyzing historical data in a way that allows us to predict the future. However, it is easy to build a model that erroneously relies on future data. Sometimes we catch this mistake early in model development; other times, it goes unnoticed until the model is complete. I will present two cases where we should be especially careful not to use future data.

Stock performance based on company statements

Companies listed on a stock exchange publish various financial statements, such as income statements, cash flow statements, and balance sheets. One may try to predict a company’s performance using this data. Where is the risk of using future data in this case?
Let’s assume we have a database of historical financial statements. Each statement has start and end dates that define the period it covers. We would like to create a training sample from this data. Let’s say we want to predict a company’s performance based on the data available on January 2, 2020. In our database we have annual statements for the period 2019-01-01 to 2019-12-31.
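
The period dates alone do not say when a statement became available to the public. Below is a minimal sketch of a point-in-time filter, assuming each record also carries a publication date. The `Statement` class, the `published` field, the `available_as_of` helper, and all sample values are hypothetical and are not taken from the database described above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statement:
    company: str
    period_start: date
    period_end: date
    published: date  # hypothetical field: when the statement became public


def available_as_of(statements, as_of):
    """Keep only statements that were already public on the as_of date."""
    return [s for s in statements if s.published <= as_of]


statements = [
    # Publication dates below are invented for illustration.
    Statement("ACME", date(2018, 1, 1), date(2018, 12, 31), date(2019, 3, 15)),
    Statement("ACME", date(2019, 1, 1), date(2019, 12, 31), date(2020, 3, 13)),
]

as_of = date(2020, 1, 2)

# Filtering on period_end alone lets the 2019 annual statement through,
# even though (in this invented data) it was not yet published on as_of.
naive = [s for s in statements if s.period_end <= as_of]
safe = available_as_of(statements, as_of)

print(len(naive), len(safe))
```

In this sketch the naive filter keeps both statements while the publication-date filter keeps only the 2018 one, which is the kind of gap that can let future data slip into a training sample.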

What makes a good user story in agile software development?

Introduction

Our clients frequently ask us what a perfect user story should look like in order to facilitate cooperation between business and development teams. Below is a short write-up on the subject, based on our long experience.

User stories are a central element of the agile/scrum methodology, as they define every piece of work done by an agile team. There are several important guidelines that need to be followed to create proper user stories that fit well with the overall process.