Reading RefTek Disks and Converting to CSS3.0 DAY Volumes

Here's a partial description of what to do to read RefTek portable disks. Keep in mind that our processing needs may differ and that each deployment may have its own particular quirks. If you see any glaring errors in the steps listed below, feel free to bring them to my attention (jeakins@ucsd.edu).

Note that although you can often "hot-swap" the portable disks, it is not always smart to do so. The best procedure is to reboot the Sun with the disk attached (make sure you are not booting someone else off the system if it is a shared host). If you don't want to start with a reboot, make sure you inform other users that your actions may result in the host going down.

Types of RefTek disks

I have encountered two types of RefTek disks: type ID#1 and ID#2. Normally, ID#1 disks are referenced as /dev/rsd30c and ID#2 disks as /dev/rsd31c.

When a disk is attached, type "format". In the format list an ID#2 portable disk may appear as:

c2t2d0 /sbus@1f,0/esp@0,200000/sd@2,0

Reading data

Use refdump to read a raw data file from the SCSI disk:

refdump /dev/rsd30c file_name.raw

If refdump fails to read all of the data, or does not work at all, try dd.

dd if=/dev/rsd30c of=file2_name.raw bs=##

You may have to fiddle with the block size (bs) to get all/most of the data. Occasionally, you may have to use the dd command with the "skip" option to by-pass bad sectors on the disk.
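
The dd fallback can be wrapped in a couple of small shell functions. This is only a sketch: the function names copy_raw and resume_raw and the default block size of 512 are my own assumptions, not part of any PASSCAL tool. It relies on the standard dd operands conv=noerror,sync (keep going after a read error and pad the failed block with zeros) and skip/seek (start partway into the input and write at the same offset in the output).

```shell
# copy_raw SRC DST [BS]
# Copy a device (or file) block by block. conv=noerror keeps dd running
# after a read error; conv=sync pads each short or failed block with zeros
# so later data stays at the right offset in the output file.
copy_raw() {
    src=$1; dst=$2; bs=${3:-512}
    dd if="$src" of="$dst" bs="$bs" conv=noerror,sync 2>/dev/null
}

# resume_raw SRC DST NBLOCKS [BS]
# Resume a copy after skipping the first NBLOCKS blocks of the input.
# seek= writes at the same offset in DST and conv=notrunc leaves what was
# already written to DST in place.
resume_raw() {
    src=$1; dst=$2; nblocks=$3; bs=${4:-512}
    dd if="$src" of="$dst" bs="$bs" skip="$nblocks" seek="$nblocks" \
        conv=noerror,sync,notrunc 2>/dev/null
}

# Hypothetical usage (device name as used earlier in this document):
#   copy_raw   /dev/rsd30c file_name.raw 512
#   resume_raw /dev/rsd30c file_name.raw 200000 512
```

Because conv=sync pads the final partial block as well, the output can be slightly longer than the input when the data size is not a multiple of the block size.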

As a precaution, you may want to make a backup tape of this raw data file before you begin to convert to another format. Assuming you were able to read the data successfully, you can now continue with creating state of health files and converting the data to your preferred data format. If that was too bold an assumption, read on for some steps to take if you were not able to get all of the data off the disk.

Additional steps to take if you can't read the disk

Verify that no one is using your host, or any data on your host! (You should probably send email to your sys-admin that you are about to reboot.) With your RefTek disk attached, re-boot with the following command:

sudo /usr/sbin/init 0

At the "ok" prompt, type:

boot -r

You should see the c2t2d0 disk referenced in the reboot output.

There may be an error about label size. Something like this:

cklabel: Error opening disk /dev/rsd31c for reading
cklabel: disk opening problems:  Permission denied
refdump: ERROR!  Couldn't check/fix disk label against kernel.

If so, proceed as follows:

Use sudo to run "format" (or get your sys-admin to do it). Select the c2t2d0 disk (or whatever your SCSI disk label is). It will probably say: "Disk not labeled. Label it now?" Respond with y only if you are absolutely, positively certain you know what you are doing. If not, talk to someone who might know what to do (Glen/Jennifer). If you are sure about the action you are about to take, choose "label" from the format menu. This will write the proper Solaris label to the DAS (you should already have a proper REFTEK label) and will allow you to read the data. DO NOT DO A LOW LEVEL FORMAT AS YOU WILL ***LOSE ALL OF YOUR DATA***!!!

You should now be able to read your REFTEK data on-line with either refdump, dd, or ref2db.

Converting the raw RefTek data

There are numerous PASSCAL programs, such as ref2mseed and ref2segy, that can help you convert raw RefTek data to various other formats. See http://www.iris.washington.edu/DOCS/software.htm for the latest software release. Read the man pages for details (refdump, logview, ref2log, ref2mseed, ref2segy).

Decide what your immediate needs are. How do you want to process the data? Do you only need state of health or do you want to process css3.0 data? If you only need state of health information, ref2log followed by logview might be all you need.

ref2log -f myfile.raw
logview I????.log

Timing errors

Ideally, looking at the logview output, you would note that you have no significant timing errors. You could then proceed with the format conversion. If this is not the case, you will need to run two programs to identify and fix the timing errors. You really should read the man pages for refrate and clockcor. Refrate will generate information on the drift rate for each DAS unit. Clockcor will apply those corrections to the waveforms. You should be able to apply these corrections at any point in the processing, but the corrections should only be applied once!

refrate I????.log > ratecor.out
clockcor ratecor.out file(s)
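
Since the corrections must only be applied once, it can help to guard the clockcor step with a marker file. The following is a minimal sketch of that idea, not something clockcor provides itself: the function name run_once and the marker file name .clockcor_applied are hypothetical.

```shell
# run_once MARKER CMD [ARGS...]
# Run CMD only if MARKER does not exist yet. The marker is created only
# when CMD succeeds, so a failed run can be retried.
run_once() {
    marker=$1; shift
    if [ -e "$marker" ]; then
        echo "skipping: $marker exists, step was already applied" >&2
        return 0
    fi
    "$@" && : > "$marker"
}

# Hypothetical usage, one marker per data directory:
#   run_once .clockcor_applied clockcor ratecor.out file(s)
```

The marker only protects against re-running the step through this wrapper. It does not know about corrections applied by hand, so keep a note of those as well.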

Prior to the advent of the ref2mseed program, you had to run ref2segy, segy2css, and finally dbsteimc to get css3.0 format miniSEED data. This was a very space-hungry way to process the data because you ended up with four versions of the data on disk (raw RefTek, segy, uncompressed css3.0, miniSEED)!

The tortuous path to convert was as follows:

ref2segy -f file_name.raw
segy2css -fsegy2css.tbl "R*/*"
cat 19*wfdisc > tmp.wfdisc
extrd_comp_days tmp.wfdisc start_day end_day network_code (see below)
(make a descriptor file for extracted volumes)
run_ucsdwf2db (see below)
run_verify 

Segy2css needed a table that would map the DAS name to station names (see segy2css.tbl below for an example). For extrd_comp_days you needed to have a network code. This corresponds to the two-character network code supplied by IRIS (e.g. AZ for Anza, KN for Knet, etc.). I've attached a copy of the old hack below, but note that the program extrd has been deprecated: use trexcerpt, which has extensive options for extracting waveforms and database tables.

Ref2mseed allows you to read the raw RefTek data from disk and immediately convert it to miniSEED (put in the proper channel code with the -X option). However, the output wfdiscs do not have the proper station or channel names. This can be fixed using the program ucsdwf2db if you have a proper set of stapars built for the database. Otherwise, you must use multiple dbset commands. See stapar_description for the format of a stapar file.

ref2mseed -d /dev/rsd30c -X XZ
cat [0-9]*wfdisc > mydb.wfdisc

Next, make a descriptor file using vi (filename = fixed_db):

css3.0
./{fixed_db}:/some/path/to/field_tables/{fixed_db}

Continue with either ucsdwf2db or dbset.

ucsdwf2db stapardir mydb fixed_db

-or-

dbset -v mydb.wfdisc chan '4' 'EHZ'
dbset -v mydb.wfdisc chan '5' 'EHN'
dbset -v mydb.wfdisc chan '6' 'EHE'
dbset -v mydb.wfdisc sta '7409' 'ISTA'
dbset -v mydb.wfdisc sta '7410' 'USTA'
dbset -v mydb.wfdisc sta '7411' 'WASTA'
etc.

Note that the dbset command assumes that DAS#7409 was always station ISTA. If this is not the case, you will have to do some fiddling with the ref2mseed command and only process particular jdays, or DAS ids.
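
When there are many DAS units, typing the dbset commands by hand gets tedious. As a rough sketch, the commands can be generated from a two-column mapping file and reviewed before they are run. The function name gen_dbset and the mapping file names are my own inventions for illustration; the dbset command line is copied from the form used above.

```shell
# gen_dbset WFDISC FIELD < mapfile
# Read "old new" pairs from standard input and print one dbset command per
# pair. Blank lines and lines starting with '#' are ignored. Nothing is
# executed, so the output can be inspected first.
gen_dbset() {
    wfdisc=$1; field=$2
    while read -r old new; do
        case $old in
            ''|'#'*) continue ;;
        esac
        printf "dbset -v %s %s '%s' '%s'\n" "$wfdisc" "$field" "$old" "$new"
    done
}

# Hypothetical usage, with sta.map holding lines such as "7409 ISTA":
#   gen_dbset mydb.wfdisc sta  < sta.map  > fix_sta.sh
#   gen_dbset mydb.wfdisc chan < chan.map > fix_chan.sh
#   sh fix_sta.sh ; sh fix_chan.sh
```

Like the hand-typed version, this assumes each DAS id maps to a single station for the whole data set.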

Run dbverify on the whole database to find problems:

dbverify fixed_db

Creating day volumes

You might be able to use extrd_comp_days, but I would suggest you use trexcerpt_days instead. The "extrd" command has been deprecated and was rather buggy. Trexcerpt is more robust and allows the user many additional options. The script trexcerpt_days is a very basic hack that allows you to extract all day volumes for a given input database. You can choose the format of the output data (see trexcerpt(1) for acceptable formats).

trexcerpt_days sd fixed_db

This will create wfdisc files of the form YYYYJJJ and waveforms in miniSEED format will be stored according to: wf/YYYY/JJJ/STA.CHAN.YYYY:JJJ:HH:MM:SS.
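
One thing to watch when looping over YYYYJJJ day names is the year boundary: adding 1 to 1999365 gives 1999366, not 2000001. The sketch below steps from one yearday to the next using only shell arithmetic. The function name next_yearday is hypothetical, and the leap-year test is the usual Gregorian rule.

```shell
# next_yearday YYYYJJJ
# Print the yearday that follows the given one, rolling over to day 001 of
# the next year after day 365 (or 366 in a leap year).
next_yearday() {
    yd=$1
    year=$((yd / 1000))
    day=$((yd % 1000))
    if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
        last=366
    else
        last=365
    fi
    if [ "$day" -ge "$last" ]; then
        echo $(( (year + 1) * 1000 + 1 ))
    else
        echo $((yd + 1))
    fi
}

# Hypothetical usage: list every day volume name between two yeardays.
#   d=1999363
#   while [ "$d" -le 2000002 ]; do echo "$d"; d=$(next_yearday "$d"); done
```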

If you want a single database, you can either cat the individual YYYYJJJ wfdiscs into a single wfdisc:

cat 19*wfdisc 20*wfdisc > endall.wfdisc
dbfixids endall wfid >& /dev/null

or you can run multiple "dbmerge" commands on the individual databases you want added together.

Additional files and scripts:

trexcerpt_days

: # use perl
eval 'exec `dirname $ANTELOPE`/perl/bin/perl -S $0 "$@"'
if 0;

#
# extract input database into daylong volumes
#
# J.Eakins 8/23/2000
# jeakins@ucsd.edu
#

use lib "$ENV{ANTELOPE}/data/perl" ;

    if ( @ARGV != 2 )
        { die ( "USAGE: $0 format dbin \n" ) ; }

    use Datascope ;

    $format = $ARGV[0];
    $dbin   = $ARGV[1];
#
# open input dbs
#

@dbin = dbopen($dbin, "r");
    
@dbwfdisc = dblookup( @dbin,"","wfdisc","","") ;

$mintime  = dbex_eval(@dbwfdisc, "min(time)");
$maxtime  = dbex_eval(@dbwfdisc, "max(endtime)");

$nrows = dbquery(@dbwfdisc,"dbRECORD_COUNT");

print STDERR "     ", strtime($mintime), " minimum time in database: $dbin \n";
print STDERR "     ", strtime($maxtime), " maximum time in database: $dbin \n";

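$end = yearday($maxtime);

# Step through the data one day (86400 s) at a time. Recomputing the yearday
# from the epoch time rolls over correctly at year boundaries, which
# incrementing the YYYYJJJ integer does not (1999365 + 1 is not 2000001).
for ( $time = $mintime ; yearday($time) <= $end ; $time += 86400 ) {
    $start = yearday($time);
    # keep the command on one line so the shell sees a single command
    $cmd = "trexcerpt -vv -a -o $format -w \"wf/%Y/%j/%{sta}.%{chan}.%Y:%j:%H:%M:%S\" $dbin $start $start 86400" ;
    &run($cmd);
}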

exit;

sub run {

    my ( $cmd ) = @_ ;
    my $line         ;
    print STDERR "$cmd\n" if $opt_V ;
    system ( $cmd ) ;
    if ($?) {
        print STDERR "$cmd error $? \n";
    }
}

extrd_comp_days

#!/bin/csh -f
#
  if ($#argv != 4) then
    echo ""
    echo "USAGE:  extrd_comp_days wfdisc start_day end_day network_code"
    echo ""
    echo "for example: extrd_comp_days anzac_tmp.wfdisc 1998021 1998037 AZ" 
    exit 1
  endif

  set wfd = $argv[1]
  set cnt = $argv[2]
  set end = $argv[3]
  set nw = $argv[4] 
  set cdir = COMP

  while ($cnt <= $end)
    echo $cnt
    set s = $cnt":00:00:00.000"
    set e = $cnt":23:59:59.990"
    extrd -n $nw -o DAY_VOLS "$wfd" $s $e     
    @ cnt ++
  end
 
  echo "Done extracting day volumes"
  echo `date`
  echo "Going to compress day volumes"
  mkdir DAY_VOLS/$cdir
  cd DAY_VOLS
  foreach wf ([0-9]*.wfdisc)
    set db = `basename $wf .wfdisc`
    echo $db
    cd $cdir
    dbsteimc -v -n $nw -s $db".w" -r -2147483648 -r 2147483647 ../$db $db
    cd ..
  end

segy2css.tbl

The segy2css.tbl has the following format:

DAS# starttime endtime STA

7338 1997300:00:00:00.0 1999365:23:59:59.9 PUPE
7358 1997298:00:00:00.0 1999365:23:59:59.9 ELAR
7363 1997299:00:00:00.0 1999365:23:59:59.9 SACA
7432 1997289:20:00:00.0 1999365:23:59:59.0 TELM
7441 1997293:22:00:00.0 1999365:23:59:59.9 ALAM
7442 1997293:02:00:00.0 1999365:23:59:59.0 SAJO
7443 1997293:19:00:00.0 1999365:23:59:59.0 OBTO
7450 1997296:00:00:00.0 1999365:23:59:59.0 LACB
7453 1997290:01:00:00.0 1999365:23:59:59.0 LOQI
7458 1997297:00:00:00.0 1999365:23:59:59.9 SAFE
7465 1997302:00:00:00.0 1999365:23:59:59.9 ROKO
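
To illustrate how the four columns fit together, here is a small sketch that looks up the station name for a DAS number at a given time. It is only meant to show the table layout and is not how segy2css itself reads the file. The function name lookup_sta is hypothetical, and the comparison relies on the timestamps being fixed-width YYYYJJJ:HH:MM:SS.S strings so that they sort correctly as text.

```shell
# lookup_sta DAS TIME TABLE
# Print the station whose DAS number matches and whose
# [starttime, endtime] window contains TIME. Prints nothing if no row
# matches. Appending "" forces awk to compare the times as strings.
lookup_sta() {
    das=$1; when=$2; tbl=$3
    awk -v das="$das" -v t="$when" '
        $1 == das && ($2 "") <= (t "") && (t "") <= ($3 "") { print $4; exit }
    ' "$tbl"
}

# Hypothetical usage:
#   lookup_sta 7338 1998021:12:00:00.0 segy2css.tbl
```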

URL: http://eqinfo.ucsd.edu/faq/reftek.php [Last updated: 2015-10-22 (295) 23:07:44 UTC]