Trying to gather some data
All new technologies hitting agriculture are based on analyzing data. But there is an old truth of all analysis: crap in, crap out.
Then came artificial intelligence. Machine learning. Big data analytics. Unstructured data. The promise was that the algorithms would figure it all out themselves. Just throw the data at them, and they would spew out knowledge.
But then the old truth prevailed: crap in, crap out.
So I am very sorry to be the messenger of bad news, but unless you fix your data management practices first, you have no chance of successfully deploying any of the new technologies under the precision farming umbrella.
The problems of data management are not specific to agriculture. They are common to all industries, and they typically include the following:
Data is stored in silos. Spray logs are in a database supplied by the co-op. Harvest logs are in a spreadsheet and in the receipts from the wholesaler, and the two sources do not match. Pictures are in the family photo album. Notes are in a paper notebook. The agronomist's observations and recommendations are on a printed form.
Data formats are not harmonized. Even simple things like dates are messed up because they are logged differently in different computer programs. Units for areas, lengths, and application rates differ from one system to the next.
Data is not gathered diligently (because it’s not being used for anything anyway). You forgot to record some jobs done in the fields. You don’t remember precisely when the first bloom was last year. You are pretty sure last week's spray was noted on that piece of paper which was in one of the tractors last time you saw it.
The solution to the data problems is not that complicated. It just requires a bit of determination and structure:
Data needs to be gathered. Just stop forgetting to document things: if it’s not documented, it didn’t happen. And if it didn’t happen, you can’t learn from it.
Make sure the formats match up. When you split your farm into fields, and you name the fields, stick to it and be consistent. If the name of the field is “green orchard” in one system, and you name it “green orchards” in another, there are problems already. Computers don’t treat inconsistencies nicely.
Store everything in one place. (This is not a necessity; it just makes your life a little easier.) And make sure it’s a place that helps you share that data with those who need it to deliver future services. For example, if you deploy a drone in the field, how will you tell it where to go? You can’t talk to it. If all your relevant field data is already in the right place, in the right formats, your data will serve as the translator between you and the drone.
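To make the "formats match up" point concrete, here is a minimal sketch in Python of what harmonizing could look like before records from different systems are stored together. Everything in it is an assumption for illustration: the alias table, the list of date layouts (day-first is assumed for dotted and slashed dates), and the set of area units are hypothetical and would need to reflect what your own systems actually produce.

```python
from datetime import date, datetime

# Hypothetical alias table: every spelling variant maps to one canonical name.
FIELD_ALIASES = {
    "green orchard": "green_orchard",
    "green orchards": "green_orchard",
}

# Conversion factors to hectares (an assumed set of input units).
AREA_TO_HECTARES = {
    "ha": 1.0,
    "m2": 0.0001,
    "acre": 0.40468564224,
}

# Date layouts we expect to meet, tried in order.
# Assumption: dotted and slashed dates are written day first.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def normalize_field(name: str) -> str:
    """Return the canonical field name, or fail loudly on an unknown one."""
    key = " ".join(name.lower().split())  # ignore case and extra spaces
    if key not in FIELD_ALIASES:
        raise ValueError(f"unknown field name: {name!r}")
    return FIELD_ALIASES[key]


def parse_date(text: str) -> date:
    """Parse a date written in any of the known layouts."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def to_hectares(value: float, unit: str) -> float:
    """Convert an area to hectares."""
    return value * AREA_TO_HECTARES[unit.lower()]
```

Failing loudly on an unknown field name is a deliberate choice in this sketch: a rejected record gets fixed today, while a silently accepted "green orchards" becomes a second field that nobody notices until the analysis is wrong.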
This may look easy in theory. It's really hard in real life. My own farming business, which I run with two partners, is entering its 8th season, and we still don't have our data management fully under control. We operate 50 hectares, divided into 32 fields, with 5 different crop categories (apples, plums, pears, strawberries, raspberries) and a bunch of varieties of those crops. We have estimates and approximations, but we just can’t conclude which of the combinations have been the better business for us. We also struggle to allocate seasonal workers’ hours to those fields and varieties. And when our accounting system shows a certain cost of pesticides, we can’t easily allocate that cost to fields and crops.
In the realm of data analytics, the timestamp is one of the most important attributes. When did something happen? If you want to save the observation of the first bloom for future reference, and you take a picture of it, it’s not worth a lot without knowledge of when it happened, right? The rest of the world is happy using the Gregorian calendar for this. Farmers are not.
Plants don’t really care a lot about dates. They care about temperature. The plant’s calendar is measured in temperature-days. We therefore need to store all data with two timestamps, one of them for us humans to know when something happened, and the other one for the plants to know when something happened. We call this the phenological timeline, and storing this for all data points at the farm is going to make a world of difference when future analytics is deployed.
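As a rough illustration of the second timestamp, the sketch below accumulates temperature-days, which agronomy commonly calls growing degree days, and attaches the running total to an observation. It uses the simple averaging method (the daily mean of minimum and maximum temperature, minus a base temperature, never below zero). The base temperature of 5 °C, the temperatures, and the bloom date are all made up for the example; the right base temperature depends on the crop.

```python
from datetime import date

BASE_TEMP_C = 5.0  # assumed base temperature; the right value depends on the crop


def daily_degree_days(t_min: float, t_max: float, base: float = BASE_TEMP_C) -> float:
    """Degree days for one day, using the simple averaging method."""
    return max(0.0, (t_min + t_max) / 2.0 - base)


def phenological_timeline(daily_temps, base: float = BASE_TEMP_C) -> dict:
    """Map each date to the degree days accumulated up to and including it.

    daily_temps: iterable of (date, t_min, t_max), one entry per day,
    starting at whatever date you choose as the start of the season.
    """
    total = 0.0
    timeline = {}
    for day, t_min, t_max in sorted(daily_temps):
        total += daily_degree_days(t_min, t_max, base)
        timeline[day] = total
    return timeline


# Made-up temperatures, for illustration only.
temps = [
    (date(2021, 4, 1), 2.0, 12.0),  # mean 7 -> 2 degree days
    (date(2021, 4, 2), 0.0, 6.0),   # mean 3 -> 0 degree days
    (date(2021, 4, 3), 6.0, 16.0),  # mean 11 -> 6 degree days
]
timeline = phenological_timeline(temps)

# Store both timestamps with the observation: one for humans, one for plants.
observation = {
    "field": "green_orchard",
    "event": "first bloom",
    "date": date(2021, 4, 3),
}
observation["degree_days"] = timeline[observation["date"]]
```

With both values stored, a first bloom in one season can be compared with a first bloom in another by accumulated warmth rather than by calendar date.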
Insofar as precision agriculture is a journey, data management needs to be the main focus at this stage of it. It may seem boring, but the rewards are there waiting. From good data come many opportunities. We'll explore the first of them in the next blog post.
This is part three in a seven-part series on a farmer’s journey to precision agriculture.