From notepad to big data

In the last 98 years, researchers at Wageningen UR and other institutes have collected an unbelievably huge amount of information. What have we done with all that data? And how are we going to deal with it in the future?

Discover the opportunities of open data and big data in this historical overview of research at Wageningen UR.

1918 - 1980

Institutional data collection

Between 1918 and the 1980s, little use was made of computers for research. Activities such as microbiological research, counting birds or mapping out social networks were carried out by researchers themselves. Around the 1960s, the first computers came into use.

Data collection was often an initiative of the university or of other institutes, such as the government. As a result, much data remained within the walls of an institute. For example, the government collected land registry data primarily for its own use.

Era
Data collection method
Purpose of data collection

Laboratory for Surveying

Surveying

Data and information (positions) were noted down in logbooks.

In the 1950s, survey data and information (positions) were passed on to colleagues orally, who noted down all the information in logbooks. This information was then transferred to reports by the Laboratory for Surveying.

Landbouw Economisch Instituut (now: LEI Wageningen UR)

Computers in economic research

Collecting data from farmers for cost calculations and as input for policymaking. During the initial years this was all done by hand. In the 60s and 70s LEI started to use computers and developed its first model for Dutch agriculture.

LEI has been collecting data from farmers since the 1940s for cost calculations and as input for policymaking. During the initial years this was all done by hand; the computer made its entrance in the 1960s. Instructions and data were fed into the computers on punch cards and then analysed.

During that period, researchers often used factor analysis. This digital analysis increased the insight into the differences between agricultural companies and indicated ways for improving results.

In the 1970s, LEI also developed its first model for Dutch agriculture. This ‘Intermodel’ made it possible to study and explain the development of the agricultural sector. Additionally, researchers were able to use the model to track the possible consequences of policy decisions.

Prof.dr.ir. RD Politiek

Analogue measurements for cattle breeding

Analogue measurements to study the relationship between animal size and milk production.

Professor Politiek had a major influence on research into cattle breeding and milk production. He conducted research into relevant properties for selection procedures in the cattle breeding sector. In doing so, Politiek demonstrated, amongst other things, the relationship between animal size and milk production.

Politiek and Chardon analysed their own (analogue) measurements. They shared the results of the study outside the field of science, with dairy farmers.

1980 - 2000

The computer goes mainstream

From the 1980s, we used computers more often for carrying out research. Computer models were used in experiments and simulations.

Based on existing information about a specific crop, a researcher tested his hypothesis, for example: Is my assumption about the causes of a bad harvest correct?

Prof.dr.ir. CT de Wit

Modelling plant growth

First use of computers and dynamic models to simulate crop growth.

De Wit was one of the first agricultural researchers to make use of computers and dynamic models to simulate crop growth. Up until the 1980s these types of models were predominantly used by economists. Researchers all over the world view the application by De Wit as revolutionary.

De Wit, C.T., 1992. Resource use efficiency in agriculture. Agricultural Systems 40, 125-15.

"Based on my theory, it was predicted that wheat could produce 10,000 kilograms of dry matter per hectare, while the actual figure was 4,000 kilograms. At the time, everyone thought I was mad. Today that figure is as high as 12,000 kilograms"

Wageningen UR

Opening of Computechnion

Computer science education and the Wageningen UR Computer Centre expanded rapidly.

In the 1980s computer science education and the Computer Centre expanded rapidly. Wageningen UR therefore opened a new building in 1987: the Computechnion. At that moment, the Computechnion possessed the most powerful computer cluster of all the Dutch universities.

Researchers made use of the computers to perform complex simulations. A news item on 10 September 1987 states: "(…) via the Computechnion we can simulate reality. Complex simulation models provide a picture of the growth of a crop, for example, without any farmer, the weather, wind or earth becoming involved."

Ir. CA van Diepen, ir. HL Boogaard, dr. AJW de Wit

MARS-OP predicts crop yield

Monitoring seasonal crop yields using meteorological information.

In 1998, MARS-OP started as a service of EC-JRC, supported by Wageningen UR. Initially, the programme mapped out all the crops and harvests within Europe. Today, MARS-OP monitors seasonal crop yields using meteorological information. This information provides input for the expected yield of crops in Europe. Using these forecasts, aid organisations can, for example, determine where and when a shortage or abundance of agricultural products will arise.

2000 - present

Digital data as forecaster

In recent years, simulations have become much more complex and more complete. We even use digital data within a research field to make forecasts, which makes research a powerful policy instrument.

A well-known example is the use of climate models to forecast rising water levels. Another is the identification of positive tipping points, which indicate which initial measures enable a wooded area to restore itself.

Dr.ir. F van den Berg, ir. FM Peeters

FOCUSPEARL model

Forecasting behaviour of plant protection products.

Since 1989, the leaching of plant protection products to the groundwater has been assessed using a model that simulates the behaviour of these substances in the soil-plant system. In the 1990s, this assessment was done in the Netherlands based on a standard scenario for a sensitive soil profile.

At the end of the 1990s, Alterra, RIVM and PBL developed the FOCUSPEARL model. This model has since been used to assess the risks of leaching to the groundwater as part of the registration procedure for plant protection products within the EU.

For registration at the national level, the Board for the Authorisation of Plant Protection Products and Biocides (Ctgb) in the Netherlands uses the GeoPEARL model. This model has been in use since 2004 and includes GIS data.

This makes it possible to assess the risk of leaching for the product for which registration has been requested.

dr. NJJP Koenderink

MatchX

Shopping without barcodes.

In an average supermarket, 30,000 different kinds of articles are individually recognised and registered using a barcode. ITAB, Europe’s market leader for checkouts, has asked Wageningen UR Food & Biobased Research to develop intelligent software that can automatically recognise all products, even when the barcode is unreadable or incorrect. The MatchX software developed by Food & Biobased Research is implemented in the EasyFlow checkout.

The MatchX software creates a "digital fingerprint" for every product. This fingerprint consists of the product's weight, shape, volume, colour and material composition. From it, the MatchX software can determine the identity of a product with 99% accuracy without using a barcode. This is more accurate than the average cashier!
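The idea of identifying a product from its fingerprint can be illustrated with a minimal sketch. It is not the actual recognition software: the `Fingerprint` class, the scaling factors, the catalogue entries and the nearest-neighbour matching are all assumptions made for illustration, and shape and material composition are left out to keep it short.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    """Hypothetical, simplified fingerprint (shape and material omitted)."""
    weight_g: float
    volume_ml: float
    colour_rgb: tuple  # mean colour, each channel 0-255

    def as_vector(self):
        # Scale every feature to a roughly comparable range (assumed scales).
        r, g, b = self.colour_rgb
        return (self.weight_g / 1000.0, self.volume_ml / 1000.0,
                r / 255.0, g / 255.0, b / 255.0)


def identify(measured, catalogue):
    """Return the catalogue name whose fingerprint is closest to the measurement."""
    return min(
        catalogue,
        key=lambda name: math.dist(measured.as_vector(),
                                   catalogue[name].as_vector()),
    )


# Invented example products; the values are illustrative only.
CATALOGUE = {
    "milk 1 l": Fingerprint(1030.0, 1000.0, (245, 245, 240)),
    "cola 330 ml": Fingerprint(350.0, 330.0, (200, 30, 30)),
    "beans 400 g": Fingerprint(460.0, 425.0, (40, 120, 60)),
}

# A slightly noisy measurement taken at the checkout.
scanned = Fingerprint(355.0, 332.0, (190, 35, 40))
print(identify(scanned, CATALOGUE))  # prints "cola 330 ml"
```

A real system would need far more robust features and a way to reject products that match nothing in the catalogue; the sketch only shows how several measurements together can stand in for a barcode.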

The EasyFlow checkout identifies all products, weighs fruit and vegetables automatically, and determines the total price of your groceries. For the customer this means fast payments and convenience during shopping. For the supermarket it means fewer mistakes and more employees available for providing service.

Wageningen UR Library

RAF aerial photos WWII

The Wageningen UR library digitises and archives historical images.

The Wageningen UR library digitises and archives historical images, among other things. A combination of these old images and new data can result in new insights. The aerial photos that the RAF took during the Second World War are particularly interesting. These can contribute to research into the changes in topography and land use in the 20th century. Additionally, the images provide indications for initiating archaeological research. For example, they show the former locations of concentration camps and give an indication of the damage done by the bombings in Rotterdam.

1000 bull genomes consortium

1,147 bulls in one database

Which bull is suitable for breeding?

The complete DNA sequences of 1,147 bulls have been determined. As part of the 1000 bull genomes project, this DNA is now stored in a database. Researchers believe that the data will make it possible to determine more quickly which bulls are suitable for breeding cows that produce more milk, for example, or that result in fewer greenhouse gas emissions.

Is a bull suitable for breeding? "In the past, we were only able to answer this question after about seven years; with genomic prediction we can do so already at birth."

present - 2030

Open data and big data

The origin of data is becoming increasingly diverse. Scientists and researchers are increasingly gaining access to detailed satellite information, crowdsourced data, information that is collected via social media, data derived from tracking behaviour on a large scale, etc. What is new here is that we are becoming better at combining large amounts of data from various fields of research.

The scientific community can only make use of these opportunities if more ‘open’ data becomes available. In this way, the availability of data will no longer be a limiting factor for some types of research. Which new relationships can the researcher discover? Which new sources of data can the researcher use for this purpose?

Dr.ir. MW den Besten, dr.ir. GJ Steeneveld, PAJ Daane MM BSc

Big data, food safety and food sustainability

Big data is becoming available at different levels of detail and in different domains.

Big data is becoming available at different levels of detail and in different domains. Integrating it across disciplines provides guidelines for researching complex issues, for example in the field of food safety and sustainability.

Big data plays a role here at three distinct levels. At the micro level, big data helps to understand and predict the behaviour of micro-organisms in the food chain. At the meso level, it provides a view of local environmental factors, such as fluctuations in air temperature. At the macro level, it helps to predict the behaviour of consumers and manufacturers, as well as global issues. This reveals how consumers and producers use products: what are the consequences for the optimal shelf life of products?

By combining data on the behaviour of micro-organisms, weather forecasts and expected consumer behaviour at different levels of detail, researchers are better able to predict which product carries the greatest health risk, or where in the chain extra measures are required.

Dr.ir. PA Jansen, Y Liefting BSc

Camera traps

Camera traps to study wildlife.

The CameraTrapLab uses camera traps to study wildlife. These devices take series of photos when animals move in front of an infrared motion sensor. By running cameras at a wide range of randomly selected locations, the researchers obtain a representative picture of an area or habitat.

Based on the images, researchers can estimate the species composition, population sizes, behaviours in specific environments, and daily activity patterns of wildlife.
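As a rough illustration of how a daily activity pattern could be derived from camera trap records, the sketch below counts detections of one species per hour of the day. The record format, the function name `hourly_activity` and the sample detections are invented for this example and do not describe the CameraTrapLab's actual database or methods.

```python
from collections import Counter
from datetime import datetime


def hourly_activity(detections, species):
    """Count detections of one species per hour of the day (0-23)."""
    counts = Counter(
        datetime.fromisoformat(timestamp).hour
        for timestamp, name in detections
        if name == species
    )
    return [counts.get(hour, 0) for hour in range(24)]


# Invented sample records: (ISO timestamp of the photo series, species).
DETECTIONS = [
    ("2015-05-01T05:12:00", "roe deer"),
    ("2015-05-01T05:48:00", "roe deer"),
    ("2015-05-01T21:30:00", "roe deer"),
    ("2015-05-02T02:05:00", "wild boar"),
    ("2015-05-02T05:20:00", "roe deer"),
]

profile = hourly_activity(DETECTIONS, "roe deer")
peak_hour = max(range(24), key=profile.__getitem__)
print(peak_hour, profile[peak_hour])  # prints "5 3"
```

With many cameras and long recording periods, the same per-hour counts give a picture of when a species is most active; a real analysis would also correct for how long each camera was running.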

Images and data are stored in a large database. This allows researchers to do large-scale comparisons across species, areas and years.

GODAN: Global Open Data for Agriculture and Nutrition

Make more data openly available for governments, industry and research.

Wageningen UR is actively involved in GODAN – an international network whose goal is to make more data openly available and to play a global and connecting role between governments, industry and research regarding the use of open data. In collaboration with the Dutch Ministry of Economic Affairs, Wageningen UR has seconded a researcher to the GODAN secretariat.

LEI Wageningen UR

FarmDigital

Standardising data for more efficient input and distribution.

Agricultural entrepreneurs have to record more and more data for the government, consumers and customers, among others. All these parties want to know how safely or sustainably food has been produced.

Much of this data is currently stored in different systems. The content is often difficult to share with stakeholders. FarmDigital looks for ways to standardise datasets and make them shareable via an independent platform. In this way, an entrepreneur will soon only have to input his data once to be able to easily share the information.

prof.dr.ir. JL Top

Tiffany

Standardising and sharing research data safely.

Publishing research is more than writing an article. The underlying raw data and the methods used should be accessible and understandable. Tiffany is an online application that allows scientists to save their data and methods in a coherent and transparent way. This ensures that the whole trajectory from the first research question to the final publication is recorded. Datasets are suitable for reuse and easy to share.

The scientists themselves determine when and with whom their research is shared. In this way the data is available to others, but the scientists can prevent misuse of their work.

dr.ir. J Bremmer

BIGt&u

Making market information from various resources available to the horticulture industry.

Growers in the horticulture industry often lack the market information they need to cultivate crops efficiently and expand their business. This includes information on how their product is sold and appreciated.

The BIGt&u project aims to simplify access to various data sources, such as social media, market research and import/export data. This makes relevant data available to the whole horticulture industry and allows for market-oriented growing.