How do I ensure my files remain accessible in the future?

Source: SA Migration, 14/05/2018

How to stop your valuable information becoming extinct as old formats die, according to experts

The BBC made an expensive mistake in the 1980s. It spent £2.5 million (£7.1 million in today's money) building one of the first computer encyclopedias. The massively ambitious Domesday Project, in commemoration of the 900th anniversary of the Domesday Book, shipped on a pair of LaserDiscs, a standard that's largely disappeared. It was programmed using BCPL, a 51-year-old language that's no longer in common use, and used analogue video stills layered on top of the interface where it needed to show a photo. This was, after all, the pre-JPEG era. Even the hardware on which it ran (the BBC Micro and a LaserDisc player) was bespoke, and cost £5,000.

Inevitably, much of the data was lost as the discs degraded, formats moved on and the hardware came to the end of its useful life. Work is still ongoing to try and recover the contents, some of which have been posted online.

It's hard to imagine the same thing happening now. Today, we have ubiquitous formats, and everything lives in the cloud. Doesn't it?

Backups aren't archives

In 2015, Google's "chief internet evangelist", Vint Cerf, warned that we face a "forgotten generation or even a forgotten century" as formats fall out of favour and hardware degrades. "We digitise things because we think we will preserve them, but what we don't understand is that unless we take other steps, those digital versions may not be any better, and may even be worse, than the artefacts that we digitised."

It's a theme picked up by Arkivum's Paula Keogh, who makes a clear distinction between archiving and backup, two allied fields that people who don't work in digital preservation frequently confuse. "A backup won't be migrating the infrastructure or file format over time," she said.
"You're locking your data in a metaphorical room, throwing away the key and hoping it will still be there in the future."

Arkivum's clients sign 25-year contracts for the preservation of their data which, in Keogh's words, "is a lifetime in IT, but a drop in the ocean for an archive". Critically, they need their data to be not only secure, but also accessible. "Life science organisations [and others] want to be able to double-click a file in a couple of decades and open it... so media is one lifecycle management process that we undertake. The other is file format preservation. It's not backup, scanning or digitisation, all of which can, and does, get confused with the term digital preservation. It's about migrating the file formats into the most preservable version at that point."

Format deprecation

It seems almost inconceivable that industry standards such as Word and Excel might disappear, but this is precisely what the data archiving standards body, the Association for Information and Image Management (AIIM), is planning for. "The industry has decided that [archival-focused] PDF/A is going to be a future-proof format," said Howard Frear of Easy Software, which sits on the body's board. "It contains all of the data and metadata within the document itself, so you don't necessarily need an application to open it, as there will always be an industry standard viewer."

This will be more important to certain industries than others. Easy Software works with pensions providers, for example, who maintain their records for the life of each subscriber, plus 20 years, and need to know that the records they produce will still be accessible, potentially, 100 years from now. That's not guaranteed with proprietary formats. "With Microsoft Word, older and newer versions, they aren't that compatible," Frear said.
"Backwards compatibility has been problematic but looking at forwards compatibility is nigh-on impossible unless you have a standard."

However, if PDF/A is the way ahead, when should the file actually be generated? At the point when we save our assets, or when they're added to an archive? "It should be a problem for Apple, Microsoft, IBM and Amazon, but it's not," explained Keogh. "For us to be looking after our data well, when we're creating the data in whatever format, that's when you should have the option to make it as future-proof as possible."

"To some degree, it's down to the user to put in some extra effort," Frear said, explaining that Microsoft Word can output PDF/A using an add-in. "Perhaps developers could do a little bit more and store both copies as part of the single save function, but then everybody is battling against the volume of data that creates."

Keeping data alive

It's easy to forget when we have become so used to the idea of putting our assets in the cloud that it, like your local hard drive, is still a limited resource backed by fallible hardware. That's why taking responsibility for your own archive is essential.

"Cloud providers perhaps aren't as mindful as the software community is," Frear said. "Software and records management communities are driving the standards and we need to remind cloud vendors that it's all very well bringing in new hardware, but that they have a responsibility to ensure that the data we put up to the cloud lives beyond the hardware's usable life, and that as they move on to different hardware they have a responsibility to move the data across smoothly," he continued.

If that archive remains usable, so much the better. PDF/A looks like the best compromise, preserving both the final look of the archived document, and extractable content for reuse. "Could you read a WordPerfect file?" Keogh asked.
"I couldn't, not without an emulator, and that's only from the 1990s, which from a data protection point of view, for something like the deeds of a house, someone's pension scheme, a clinical trial or the research that meant you could bring a drug to market, is no time at all."

Yet, despite warnings like this, a study published by the journal Current Biology found that only a fifth of all the research published in the early 1990s remains accessible. The Digital Preservation Coalition, founded by the British Library and JISC (Joint Information Systems Committee), published a list of the world's endangered digital species at the end of 2017. It classified data from marginalised sub-groups and the photo archives of SMEs as critically endangered, requiring urgent action and assessment within 12 months. Even documents stored on Google Drive and Dropbox, where access is restricted to specific users, were listed as endangered, along with digital images with no analogue equivalent posted to social networks.

Archives and the right to be forgotten

The implementation of GDPR this May will have implications for archive-keeping, which Frear described as "another piece of the puzzle". Keogh sees potential conflicts, particularly over the question of what should and shouldn't be removed on request. "There's a lot still to be ironed out," she said. "When you talk about things like [archived] genome sequencing or thumbprints you need to start asking what is identifiable about an individual. Is it their NI number, their first and last name, their DNA sequence? You can't take an individual out of [a study] because it skews the figures. Yet, they still have the right to be forgotten, so how do those two conflicting things work in reality?"

It's likely the answer will become clear in the months following GDPR coming into force through trial cases and legal guidance.
It illustrates once again, though, the crucial difference between a static backup that rots with age, and a live, accessible archive, which remains an asset for the organisation that created it years or even decades into the future.

