Case Study
If you have never heard of DoroStore, you would be forgiven. DoroStore garnered a modest following in the 1990s as a data management platform (like an early hierarchical storage management, or HSM, system) and was adopted in a range of industries, one of these being the oil sector.
The DoroStore system was created by a French company that is no longer trading, but the system was found to still be in use at a major oil company.
The Problem
Selected assets from the customer's data archives were preserved on an ageing hardware and software system and were facing redundancy. The customer made multiple attempts to retrieve the data, but tape software errors hindered the process. The data trapped on these tapes was highly sensitive and of significant importance to the customer, consisting of seismic records and well logs. The customer needed assistance from data migration experts, Tape Ark, to liberate the data and position it in Amazon Web Services (AWS) for long-term preservation.
The Solution
This project is a perfect example of the innovative and dedicated approach of the Tape Ark team to solving complex customer problems. After an initial review of the tapes, it was clear that the data was not structured as originally thought. The datasets were written in an interleaved format that was going to be complex and time-consuming to unravel and fully extract. To add to the complexity, the tapes had been created on a range of machines, including RS6000, SUN SPARC Server and HP9000/7xx, and no relevant documentation was available for reference. Through analysing the file structure and considerable testing, Tape Ark reverse-engineered a bespoke data liberation routine to read these now defunct tapes, and successfully extracted and reformatted the data.
The key steps in the process are set out below:
Tape Cataloguing – Tape Ark received the tapes at its Mass Ingest Facility in Perth. Upon receipt of the archive, a media audit was performed, which included photographing each tape, reading RFID data chips, and applying a supplier QR code.
Data Extraction – Data was extracted from the tapes using a bespoke processing flow, and each file was interrogated to ensure it was fully extracted. Each file underwent a complex restore process, and detailed metadata, including media information and checksums, was generated for a central database.
Data Upload – The data was securely uploaded to the customer’s Amazon Simple Storage Service (S3) bucket. Tape Ark worked closely with the customer to define criteria for data tiering, as well as duplicate detection and removal, in order to optimise the storage footprint. The data was placed into the Tape Ark Nexus Portal, hosted on the AWS platform, for easy search and access by the customer and for permanent archiving.
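To give a feel for the checksumming and duplicate detection mentioned in the steps above, here is a minimal Python sketch. It is an illustration only and not Tape Ark's actual processing flow: the function names, the record fields (tape_label, file_name, size_bytes, sha256) and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_records(paths, tape_label):
    """Build one metadata record per extracted file (hypothetical fields)."""
    records = []
    for path in paths:
        path = Path(path)
        records.append({
            "tape_label": tape_label,          # media information
            "file_name": path.name,
            "size_bytes": path.stat().st_size,
            "sha256": sha256_of_file(path),    # content checksum
        })
    return records


def find_duplicates(records):
    """Group file names by checksum and keep groups with more than one file."""
    by_checksum = defaultdict(list)
    for record in records:
        by_checksum[record["sha256"]].append(record["file_name"])
    return {c: names for c, names in by_checksum.items() if len(names) > 1}
```

In a sketch like this, files that share a checksum are candidates for removal before upload, which is one simple way a storage footprint could be reduced. The real criteria agreed with the customer are not described here.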
The Outcome
Tape Ark extracted over 90% of the data from the complex DoroStore structure, with a handful of tapes presenting as empty or with unresolvable tape read errors. The customer was extremely happy to have retrieved key datasets and considered the project a huge win for their team. Legacy data that was all but lost is now readily accessible on the AWS Cloud for sharing and long-term preservation.