Deleting Pieces of a Data Set
HDF5 files are inherently mutable, and pyasdf currently allows the deletion
of any piece of information in an ASDF file. There is no in-place update,
however: overwriting an array currently only works as a combination of
deleting it and writing it again. Please contact the developers if you
require additional functionality.
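The delete-then-write pattern can be sketched for auxiliary data as follows. This is a minimal illustration, not part of pyasdf: the helper name `replace_auxiliary_data` is hypothetical, `ds` is assumed to be an open `pyasdf.ASDFDataSet`, and the entry being replaced is assumed to exist already.

```python
def replace_auxiliary_data(ds, data_type, path, data, parameters):
    """Overwrite one auxiliary data array by deleting it and writing it again.

    Hypothetical helper, not part of pyasdf. `ds` is expected to be an open
    pyasdf.ASDFDataSet in which the entry `data_type`/`path` already exists.
    """
    # Step 1: remove the existing entry. This modifies the file directly.
    del ds.auxiliary_data[data_type][path]
    # Step 2: write the new array under the same data type and path.
    ds.add_auxiliary_data(
        data=data, data_type=data_type, path=path, parameters=parameters
    )
```

With an open data set, a call might look like `replace_auxiliary_data(ds, "RandomArrays", "array_a", new_array, parameters)`, where `new_array` and `parameters` are whatever would otherwise be passed to `add_auxiliary_data`. The two steps are not atomic: if the write fails, the old array is already gone.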
Deleting Things
Almost anything can be deleted with the del operator. The following code
snippet illustrates the different ways to delete data. Please be careful:
this directly modifies the file and cannot be undone.
import pyasdf

# Each pair of lines below shows two equivalent spellings (attribute access
# and key access) of the same deletion; use one of them, not both.
with pyasdf.ASDFDataSet("example.h5") as ds:
    # Delete all events.
    del ds.events

    # Delete all information from a particular station.
    del ds.waveforms.BW_RJOB
    del ds.waveforms["BW.RJOB"]

    # Delete all waveforms with a certain tag from a particular station.
    del ds.waveforms.BW_RJOB.example
    del ds.waveforms["BW.RJOB"]["example"]

    # Delete the StationXML file for a certain station.
    del ds.waveforms.BW_RJOB.StationXML
    del ds.waveforms["BW.RJOB"]["StationXML"]

    # Directly delete a certain piece of waveform information.
    del ds.waveforms["BW.RJOB"][
        "BW.RJOB..EHE__2009-08-24T00:20:03__2009-08-24T00:20:32__example"]

    # Delete a provenance document.
    del ds.provenance.example_document
    del ds.provenance["example_document"]

    # Delete an auxiliary data group.
    del ds.auxiliary_data.RandomArrays
    del ds.auxiliary_data["RandomArrays"]

    # Delete a certain piece of auxiliary data.
    del ds.auxiliary_data.RandomArrays.array_a
    del ds.auxiliary_data["RandomArrays"]["array_a"]

    # Also works with nested paths.
    del ds.auxiliary_data.RandomArrays.nested.path.array_a
    del ds.auxiliary_data["RandomArrays"]["nested"]["path"]["array_a"]
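Because deletions cannot be undone, it may be worth copying the file before modifying it. The following is a minimal sketch using only the Python standard library; the helper name `backup_file` and the `.bak` suffix are illustrative choices, not part of pyasdf.

```python
import shutil
from pathlib import Path


def backup_file(path, suffix=".bak"):
    """Copy `path` next to itself and return the path of the copy.

    Hypothetical helper: refuses to overwrite an existing backup.
    """
    source = Path(path)
    target = source.with_name(source.name + suffix)
    if target.exists():
        raise FileExistsError(f"Backup already exists: {target}")
    # copy2 also tries to preserve timestamps and permission bits.
    shutil.copy2(source, target)
    return target
```

Calling `backup_file("example.h5")` before opening the data set for deletion leaves an untouched copy named `example.h5.bak`. The copy should be made while the file is not open for writing elsewhere.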
Freeing Space
Deleting data sets or groups within an HDF5 file does not, in general,
physically remove the data from the file; it only removes the item from the
index. To actually reclaim the space that is no longer needed, use the
h5repack program that ships with HDF5.
Assuming a file has been created with the following code snippet:
import pyasdf

with pyasdf.ASDFDataSet("example.h5") as ds:
    ds.add_waveforms(..., tag="example")
Current file size:
$ ls -l example.h5
-rw-r--r-- 1 lion staff 144424 Jan 19 15:47 example.h5
Delete some waveform data:
import pyasdf

with pyasdf.ASDFDataSet("example.h5") as ds:
    del ds.waveforms.BW_RJOB
The file size is unchanged at first. The physical space is only regained
after h5repack is used, which writes a compacted copy of the file:
$ ls -l example.h5
-rw-r--r-- 1 lion staff 144424 Jan 19 15:47 example.h5
$ h5repack example.h5 example_repacked.h5
$ ls -l example*
-rw-r--r-- 1 lion staff 144424 Jan 19 15:47 example.h5
-rw-r--r-- 1 lion staff 20092 Jan 19 15:48 example_repacked.h5
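The repacking step can also be driven from Python. The sketch below assumes the h5repack executable is on the PATH and that the file is not currently open; the function name `repack_in_place` and its `runner` parameter are hypothetical and exist only to make the example easy to try out.

```python
import os
import subprocess
import tempfile


def repack_in_place(path, runner=subprocess.run):
    """Repack `path` with h5repack and replace the original file.

    Returns the file sizes in bytes as a (before, after) tuple.
    """
    before = os.path.getsize(path)
    directory = os.path.dirname(os.path.abspath(path))
    # Work in a scratch directory next to the file so that the final
    # os.replace stays on the same file system.
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        repacked = os.path.join(scratch, "repacked.h5")
        # Same call as on the command line: h5repack <input> <output>
        runner(["h5repack", path, repacked], check=True)
        after = os.path.getsize(repacked)
        # Only swap the files once h5repack has finished successfully.
        os.replace(repacked, path)
    return before, after
```

If h5repack fails, `subprocess.run` raises an exception because of `check=True`, the scratch directory is removed, and the original file is left as it was. Unlike the command-line session above, this variant does not keep the original file, so it is best combined with a backup.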