Reading RefTek Disks and Converting to CSS3.0 DAY Volumes
Here's a partial description of what to do to read RefTek portable disks. Keep in mind that your processing needs may differ from ours and that each deployment may have its own particular quirks. If you see any glaring errors in the steps listed below, feel free to bring them to my attention (jeakins@ucsd.edu).
Note that although you can often "hot-swap" the portable disks, it is not always wise to do so. The best procedure is to reboot the Sun with the disk attached (make sure you are not booting someone else off the system if it is a shared host). If you don't want to start with a reboot, make sure you inform other users that your actions may result in the host going down.
Types of RefTek disks
I have encountered two types of RefTek disks: ID#1 and ID#2. Normally, ID#1 disks are referenced as /dev/rsd30c and ID#2 disks as /dev/rsd31c.
When a disk is attached, type "format". In the format list an ID#2 portable disk may appear as:
c2t2d0/sbus@1f,0/esp@0,200000/sd@2,0
Reading data
Use refdump to read a raw data file from the SCSI disk:
refdump /dev/rsd30c file_name.raw
If refdump fails to read all of the data, or does not work at all, try dd.
dd if=/dev/rsd30c of=file2_name.raw bs=##
You may have to fiddle with the block size (bs) to get all or most of the data. Occasionally, you may have to use the dd command with the "skip" option to bypass bad sectors on the disk.
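If you know roughly where the bad spot is, the "skip" approach can be sketched as a small Bourne shell function. This is only a sketch under stated assumptions: the function name copy_skipping and its arguments are made up for illustration, block numbers start at 0, and the bad region is assumed to fit inside one block of the size you chose. It copies everything before the bad block, then everything after it, and appends the second piece to the first.

```shell
# copy_skipping SRC DST BS BAD   (hypothetical helper, not a PASSCAL tool)
# Copy SRC to DST in BS-byte blocks, leaving out block number BAD (0-based).
copy_skipping() {
    src=$1
    dst=$2
    bs=$3
    bad=$4
    # Part 1: the blocks in front of the bad one.
    dd if="$src" of="$dst" bs="$bs" count="$bad" 2>/dev/null || return 1
    # Part 2: skip past the bad block and append the rest.
    dd if="$src" bs="$bs" skip=$((bad + 1)) 2>/dev/null >> "$dst"
}
```

For example, copy_skipping /dev/rsd30c file_name.raw 1024 5000 would leave out block 5000. dd also accepts conv=noerror,sync, which keeps going after a read error and pads the failed block with zeros; that may be easier when you do not know where the bad sectors are.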
As a precaution, you may want to make a backup tape of this raw data file before you begin to convert to another format. Assuming you were able to read the data successfully, you can now continue with creating state of health files and converting the data to your preferred data format. If that was too bold an assumption, read on for some steps to take if you were not able to get all of the data off the disk.
Additional steps to take if you can't read the disk
Verify that no one is using your host, or any data on your host! (You should probably send email to your sys-admin that you are about to reboot.) With your RefTek disk attached, re-boot with the following command:
sudo /usr/sbin/init 0
At the "ok" prompt, type:
boot -r
You should see the c2t2d0 disk referenced in the reboot output.
There may be an error about label size. Something like this:
cklabel: Error opening disk /dev/rsd31c for reading
cklabel: disk opening problems: Permission denied
refdump: ERROR! Couldn't check/fix disk label against kernel.
If so, proceed as follows:
Use sudo and do a "format" (or get your sys-admin to do it). Select the c2t2d0 disk (or whatever your SCSI disk label is). It will probably say: "Disk not labeled. Label it now?" Respond with y if you are absolutely positively certain you know what you are doing. If not, talk to someone who might know what to do (Glen/Jennifer). If you are sure about the action you are about to take, choose "label" from the format menu. This will write the proper Solaris label to the DAS (you should already have a proper REFTEK label) and will allow you to read the data. DO NOT DO A LOW LEVEL FORMAT AS YOU WILL ***LOSE ALL OF YOUR DATA ***!!!
You should now be able to read your REFTEK data on-line with either refdump, dd, or ref2db.
Converting the raw RefTek data
There are numerous PASSCAL programs, such as ref2mseed and ref2segy, that can help you convert the data from raw RefTek format to various other formats. See http://www.iris.washington.edu/DOCS/software.htm for the latest software release. Read the manpages for details (refdump, logview, ref2log, ref2mseed, ref2segy).
Decide what your immediate needs are. How do you want to process the data? Do you only need state of health or do you want to process css3.0 data? If you only need state of health information, ref2log followed by logview might be all you need.
ref2log -f myfile.raw
logview I????.log
Timing errors
Ideally, looking at the logview output, you would note that you have no significant timing errors. You could then proceed with the format conversion. If this is not the case, you will need to run two programs to identify and fix the timing errors. You really should read the man pages for refrate and clockcor. Refrate will generate information on the drift rate for each DAS unit. Clockcor will apply those corrections to the waveforms. You should be able to apply these corrections at any point in the processing, but the corrections should only be applied once!
refrate I????.log > ratecor.out
clockcor ratecor.out file(s)
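Because the corrections must only be applied once, one way to guard the clockcor step is a marker file that records that it has already been run. The following is a minimal sketch, not part of the PASSCAL software: run_once and the marker name .clockcor_done are hypothetical, and the marker is only written if the command succeeds.

```shell
# run_once MARKER COMMAND [ARGS...]   (hypothetical helper)
# Run COMMAND only if MARKER does not exist yet; create MARKER on success.
run_once() {
    marker=$1
    shift
    if [ -e "$marker" ]; then
        echo "skipping: $marker exists, this step was already done" >&2
        return 0
    fi
    # Run the command; touch the marker only when it exits cleanly.
    "$@" && : > "$marker"
}

# Example (not executed here):
#   run_once .clockcor_done clockcor ratecor.out file(s)
```

Keeping the marker in the same directory as the waveform files makes it harder to forget that the files have already been corrected.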
Prior to the advent of the ref2mseed program, you had to run ref2segy, segy2css, and finally dbsteimc to get css3.0 format miniSEED data. This was a very space-hungry way to process the data, since you end up with four versions of the data on disk (raw RefTek, segy, uncompressed css3.0, miniSEED)!
The tortuous path to convert was as follows:
ref2segy -f file_name.raw
segy2css -fsegy2css.tbl "R*/*"
cat 19*wfdisc > tmp.wfdisc
extrd_comp_days tmp.wfdisc start_day end_day (see below)
(make a descriptor file for extracted volumes)
run_ucsdwf2db (see below)
run_verify
Segy2css needed a table that mapped DAS numbers to station names (see segy2css.tbl below for an example). For extrd_comp_days you also needed a network code. This corresponds to the two-character network code supplied by IRIS (e.g., AZ for Anza, KN for Knet). I've attached a copy of the old hack below, but note that the program extrd has been deprecated: use trexcerpt, which has extensive options for extracting waveforms and database tables.
Ref2mseed allows you to read the raw RefTek data from disk and convert it immediately to miniSEED (put in the proper channel code with the -X option). However, the output wfdiscs do not have the proper station or channel names. This can be fixed using the program ucsdwf2db if you have a proper set of stapars built for the database. Otherwise, you must use multiple dbset commands. See stapar_description for the format of a stapar file.
ref2mseed -d /dev/rsd30c -X XZ
cat [0-9]*wfdisc > mydb.wfdisc
Next, make a descriptor file using vi (filename = fixed_db):
css3.0
./{fixed_db}:/some/path/to/field_tables/{fixed_db}
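If you would rather not use vi, the same two-line descriptor could be written from the shell. This is a minimal sketch that simply reproduces the two-line layout shown above; write_descriptor is a hypothetical helper, and /some/path/to/field_tables is still a placeholder that you must replace with your own path.

```shell
# write_descriptor NAME TABLEDIR   (hypothetical helper)
# Write a descriptor file called NAME in the current directory:
#   line 1: the schema name
#   line 2: the search path for the tables
write_descriptor() {
    name=$1
    tables=$2
    printf 'css3.0\n./{%s}:%s/{%s}\n' "$name" "$tables" "$name" > "$name"
}

# Example (not executed here):
#   write_descriptor fixed_db /some/path/to/field_tables
```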
Continue with ucsdwf2db and dbset.
ucsdwf2db stapardir mydb fixed_db
-or-
dbset -v mydb.wfdisc chan '4' 'EHZ'
dbset -v mydb.wfdisc chan '5' 'EHN'
dbset -v mydb.wfdisc chan '6' 'EHE'
dbset -v mydb.wfdisc sta '7409' 'ISTA'
dbset -v mydb.wfdisc sta '7410' 'USTA'
dbset -v mydb.wfdisc sta '7411' 'WASTA'
etc.
Note that the dbset command assumes that DAS#7409 was always station ISTA. If this is not the case, you will have to do some fiddling with the ref2mseed command and only process particular jdays, or DAS ids.
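With more than a handful of DAS units, typing the dbset lines by hand gets tedious. A minimal sketch of generating them from a two-column list follows. It only prints commands in the same form as the dbset lines above and does not run them; print_dbset_cmds and the input layout (old value, then new value, separated by whitespace) are assumptions made for illustration.

```shell
# print_dbset_cmds DB FIELD   (hypothetical helper)
# Read "old new" pairs on standard input and print one dbset command
# per pair. Blank lines and lines starting with # are ignored.
print_dbset_cmds() {
    db=$1
    field=$2
    while read -r old new; do
        case $old in
            ''|'#'*) continue ;;
        esac
        printf "dbset -v %s.wfdisc %s '%s' '%s'\n" "$db" "$field" "$old" "$new"
    done
}

# Example (not executed here):
#   print_dbset_cmds mydb sta < das2sta.txt
```

Check the printed commands by eye first; once they look right, the same output can be piped to sh.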
Run dbverify to find any problems:
dbverify fixed_db
Creating day volumes
You might be able to use extrd_comp_days, but I would suggest you use trexcerpt_days instead. The "extrd" command has been deprecated and was rather buggy. Trexcerpt is more robust and allows the user many additional options. The script trexcerpt_days is a very basic hack that allows you to extract all day volumes for a given input database. You can choose the format of the output data (see trexcerpt(1) for acceptable formats).
trexcerpt_days sd fixed_db
This will create wfdisc files of the form YYYYJJJ and waveforms in miniSEED format will be stored according to: wf/YYYY/JJJ/STA.CHAN.YYYY:JJJ:HH:MM:SS.
If you want a single database, you can either cat the individual YYYYJJJ wfdiscs into a single wfdisc:
cat 19*wfdisc 20*wfdisc > endall.wfdisc
dbfixids endall wfid >& /dev/null
or you can run multiple "dbmerge" commands on the individual databases you want added together.
Additional files and scripts:
trexcerpt_days
: # use perl
eval 'exec `dirname $ANTELOPE`/perl/bin/perl -S $0 "$@"'
if 0;
#
# extract input database into daylong volumes
#
# J.Eakins 8/23/2000
# jeakins@ucsd.edu
#
use lib "$ENV{ANTELOPE}/data/perl" ;
if ( @ARGV != 2 )
{ die ( "USAGE: $0 format dbin \n" ) ; }
use Datascope ;
$format = $ARGV[0];
$dbin = $ARGV[1];
#
# open input dbs
#
@dbin = dbopen($dbin, "r");
@dbwfdisc = dblookup( @dbin,"","wfdisc","","") ;
$mintime = dbex_eval(@dbwfdisc, "min(time)");
$maxtime = dbex_eval(@dbwfdisc, "max(endtime)");
$nrows = dbquery(@dbwfdisc,"dbRECORD_COUNT");
print STDERR " ", strtime($mintime), " minimum time in database: $dbin \n";
print STDERR " ", strtime($maxtime), " maximum time in database: $dbin \n";
$start = yearday($mintime);
$end = yearday($maxtime);
while ( $start <= $end ) {
# build the whole trexcerpt command on one line so the shell sees a single command
$cmd = "trexcerpt -vv -a -o $format -w \"wf/%Y/%j/%{sta}.%{chan}.%Y:%j:%H:%M:%S\" $dbin $start $start 86400" ;
&run($cmd);
$start++ ;
}
exit;
sub run {
my ( $cmd ) = @_ ;
my $line ;
print STDERR "$cmd\n" if $opt_V ;
system ( $cmd ) ;
if ($?) {
print STDERR "$cmd error $? \n";
}
}
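One thing to watch for: both trexcerpt_days and extrd_comp_days move to the next day by adding 1 to a YYYYJJJ number, so a database that spans a year boundary would step from 1999365 to 1999366 and upward rather than rolling over to 2000001. A minimal sketch of a rollover-aware increment is given here in Bourne shell for illustration only; next_yearday is a hypothetical helper, not part of Antelope, and it assumes the usual Gregorian leap-year rule.

```shell
# next_yearday YYYYJJJ   (hypothetical helper)
# Print the YYYYJJJ value of the following day, rolling over at year end.
next_yearday() {
    yd=$1
    year=$((yd / 1000))
    day=$((yd % 1000))
    # Leap years have 366 days: divisible by 4, except centuries
    # that are not divisible by 400.
    if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
        last=366
    else
        last=365
    fi
    if [ "$day" -ge "$last" ]; then
        echo $(( (year + 1) * 1000 + 1 ))
    else
        echo $((yd + 1))
    fi
}
```

For example, next_yearday 1999365 prints 2000001, while next_yearday 2000365 prints 2000366 because 2000 was a leap year.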
extrd_comp_days
#!/bin/csh -f
#
if ($#argv != 4) then
echo ""
echo "USAGE: extrd_comp_days wfdisc start_day end_day network_code"
echo ""
echo "for example: extrd_comp_days anzac_tmp.wfdisc 1998021 1998037 AZ"
exit 1
endif
set wfd = $argv[1]
set cnt = $argv[2]
set end = $argv[3]
set nw = $argv[4]
set cdir = COMP
while ($cnt <= $end)
echo $cnt
set s = $cnt":00:00:00.000"
set e = $cnt":23:59:59.990"
extrd -n $nw -o DAY_VOLS "$wfd" $s $e
@ cnt ++
end
echo "Done extracting day volumes"
echo `date`
echo "Going to compress day volumes"
mkdir DAY_VOLS/$cdir
cd DAY_VOLS
foreach wf ([0-9]*.wfdisc)
set db = `basename $wf .wfdisc`
echo $db
cd $cdir
dbsteimc -v -n $nw -s $db".w" -r -2147483648 -r 2147483647 ../$db $db
cd ..
end
segy2css.tbl
The segy2css.tbl has the following format:
DAS#  starttime           endtime             STA
7338  1997300:00:00:00.0  1999365:23:59:59.9  PUPE
7358  1997298:00:00:00.0  1999365:23:59:59.9  ELAR
7363  1997299:00:00:00.0  1999365:23:59:59.9  SACA
7432  1997289:20:00:00.0  1999365:23:59:59.0  TELM
7441  1997293:22:00:00.0  1999365:23:59:59.9  ALAM
7442  1997293:02:00:00.0  1999365:23:59:59.0  SAJO
7443  1997293:19:00:00.0  1999365:23:59:59.0  OBTO
7450  1997296:00:00:00.0  1999365:23:59:59.0  LACB
7453  1997290:01:00:00.0  1999365:23:59:59.0  LOQI
7458  1997297:00:00:00.0  1999365:23:59:59.9  SAFE
7465  1997302:00:00:00.0  1999365:23:59:59.9  ROKO
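To check which station a table like this assigns to a DAS at a given time, a small lookup can help. The sketch below is an illustration only: lookup_sta is a hypothetical helper, and it assumes four whitespace-separated columns with times in the fixed-width YYYYJJJ:HH:MM:SS.S form shown above, so that plain string comparison orders them correctly.

```shell
# lookup_sta TABLE DAS TIME   (hypothetical helper)
# Print the station whose DAS number matches and whose
# starttime..endtime window contains TIME.
lookup_sta() {
    awk -v das="$2" -v t="$3" '
        # Appending "" forces string comparison of the time fields.
        $1 == das && ($2 "") <= (t "") && (t "") <= ($3 "") { print $4 }
    ' "$1"
}

# Example (not executed here):
#   lookup_sta segy2css.tbl 7358 1998021:12:00:00.0
```

If the lookup prints nothing, the DAS has no entry covering that time, which is worth catching before running the conversion.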
URL: http://eqinfo.ucsd.edu/faq/reftek.php [Last updated: 2006-11-03 (307) 00:12:36 UTC]