Scientific discovery is born of observing phenomena, stating a hypothesis, and forming theories around the resulting data. This is how we come to understand our world and the various reactions which occur within it.
An explosion in data collection has broadened the horizons of science; we can now observe more phenomena than ever before with a level of precision which last century’s scientists could only have dreamed of.
This is thanks to satellites and particle accelerators, among other marvels of modern technology. The multiplicity of the world and universe is being better explored with each scientific generation, and this progress could be about to accelerate dramatically.
It is thus crucial that data sharing becomes more widespread, especially in the sciences. As the U.S. National Science Foundation states in a report:
“Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants.”
Can we store research data securely and make it accessible?
Permanently storing data from academic research is a minefield of security risks – not even offline data storage is safe all of the time – and the go-to option tends to be a data archive.
The problem with such repositories, although they are secure, is that they tend to act as a ‘final destination’ for data. Research is far from its most accessible in this environment.
Cloud storage services are the choice of many in academia, but they pose several problems, especially when storing potentially high-value intellectual property. The security of these platforms cannot be guaranteed and, as such, high-risk information should be stored elsewhere.
Good old-fashioned local storage is always an option, and indeed probably fine for low-risk data, but the hardware has to be periodically upgraded to avoid obsolescence. This is a costly and labour-intensive endeavour.
New technologies may hold the key
A key concern for researchers is to make their data permanently available – yet immutable – in order to prevent data manipulation. But efficient methods of keeping research data, let alone making it freely accessible, are currently lacking.
In terms of peer review, Decentralized Science is a startup that aims to do exactly what its name says: it wants to distribute articles for review and have them returned more quickly than ever before, using distributed technologies such as blockchain.
According to its website, reviewers will work within a recognition-based system, which gives them an incentive to build and maintain their reputation. Reviewers of the highest repute may also be sought out by third parties to carry out paid reviews.
But although blockchain has been touted as a resolution to issues surrounding data transfer, security and transparency, there remain unresolved scaling issues that could take some time to fix.
The concept is solid, however, as a distributed ledger of transactions clearly shows a sequential history on the chain and this gives it inbuilt trust and transparency.
This is the space that a directed acyclic graph (DAG) architecture might seek to move into. It has been touted as Blockchain 3.0 and, as with blockchain, once a data block is added to the ledger it can never be changed.
A major advantage of a DAG-based network is that it could in theory handle vast quantities of data without the network slowing to a crawl. Its design means that it could verify thousands of transactions simultaneously, in a matter of seconds.
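As a rough illustration of the idea, the sketch below builds a tiny in-memory DAG ledger in Python. It is not taken from any real network: the names DagLedger, record_id, add and verify are hypothetical, and SHA-256 is simply an assumed choice of hash. Each record's identifier is derived from its content and from the records it points to, so altering a stored record becomes detectable, and two records can attach to the same parent without waiting for one another.

```python
import hashlib
import json


def record_id(payload, parents):
    """Hash the payload together with the ids of the records it points to."""
    body = json.dumps({"payload": payload, "parents": sorted(parents)}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class DagLedger:
    """Hypothetical append-only store; each record may reference several parents."""

    def __init__(self):
        self.records = {}  # id -> {"payload": ..., "parents": [...]}

    def add(self, payload, parents=()):
        parents = list(parents)
        # Parents must already exist, so links only point backwards (no cycles).
        for parent in parents:
            if parent not in self.records:
                raise KeyError(f"unknown parent {parent}")
        rid = record_id(payload, parents)
        self.records[rid] = {"payload": payload, "parents": parents}
        return rid

    def verify(self):
        """Return True if every stored record still matches its id."""
        return all(
            record_id(rec["payload"], rec["parents"]) == rid
            for rid, rec in self.records.items()
        )


ledger = DagLedger()
genesis = ledger.add("genesis")
# Two records attach to the same parent independently, which is what lets a DAG
# accept entries in parallel instead of queueing them into a single chain.
a = ledger.add("dataset A", [genesis])
b = ledger.add("dataset B", [genesis])
merge = ledger.add("analysis of A and B", [a, b])

intact_before = ledger.verify()
ledger.records[a]["payload"] = "dataset A (altered)"  # simulate tampering
intact_after = ledger.verify()
```

In this toy version the check passes before the simulated tampering and fails after it. A real network would add consensus, signatures and replication across many nodes, none of which is modelled here.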
How would a DAG network work in practice?
When a researcher has a study published in a scientific journal, they would be able to upload all of the supporting data onto the DAG network in the form of a fully decentralized database.
Following on from this, access permissions can be set so the data is read-only for the wider public while a select few are able to modify or add data. Any changes would be recorded in a clearly visible series of ledgers where the history of edits can be seen by all.
This is part of China-based startup CyberVein’s vision: the idea is that academic researchers will find it much easier to reproduce the results of studies when they have unfettered access to raw data.
Its network seeks to store datasets across a distributed network of user devices, whose owners donate disk space in return for a reward of CyberVein Tokens (CVT).
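The permission model described above (public read access, a small group of editors, and an edit history open to all) can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions and not CyberVein's actual interface: the SharedDataset class and the user names alice, bob and mallory are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDataset:
    """Hypothetical dataset anyone may read but only listed editors may change."""

    owner: str
    editors: set = field(default_factory=set)
    rows: list = field(default_factory=list)
    history: list = field(default_factory=list)  # public, append-only edit log

    def read(self, user):
        # Reading is open to everyone; return a copy so callers cannot edit in place.
        return list(self.rows)

    def append(self, user, row):
        # Only the owner and named editors may add data.
        if user != self.owner and user not in self.editors:
            raise PermissionError(f"{user} has read-only access")
        self.rows.append(row)
        # Every accepted change is logged with a sequence number and its author.
        self.history.append((len(self.history) + 1, user, "append", row))


data = SharedDataset(owner="alice", editors={"bob"})
data.append("alice", {"sample": 1, "value": 0.42})
data.append("bob", {"sample": 2, "value": 0.40})

try:
    data.append("mallory", {"sample": 3, "value": 9.99})
    rejected = False
except PermissionError:
    rejected = True
```

Here the two authorised writes are logged in order, while the third is refused and leaves both the data and the history untouched. On a distributed network the log would itself live on the ledger rather than in a local list.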
There may be more solutions forthcoming
Potential solutions for academic data sharing seem to be thin on the ground right now. Indeed, CyberVein’s DAG-based network appears to be the only one that clearly outlines the technological and economic framework for achieving this.
Expect more solutions for academic data sharing to come out of the blockchain and DAG space; the technological and informational value is simply too great. There is potential here to foster real trust in scientific research.
Whenever a successful implementation comes, and whatever form it takes, it is likely to be a boon to future scientific progress. The key lies with the data, and improved sharing has the potential to usher in a global information and technology boom on a scale never before seen.
Aubrey Hansen is a freelance writer, a graduate of Aarhus University and a crypto enthusiast. She writes about blockchain technology, fintech and cryptocurrencies, and has been researching major developments in the crypto world over the past couple of years.