Alexandria 2.31.0
SDC-CH common library for the Euclid project
XYDataset module
This module provides an interface for accessing two-dimensional datasets (pairs of (X,Y) values) held in some storage backend (file system, database, etc).
Datasets are organized in groups (nested groups are allowed, forming a tree) and are uniquely identified by their qualified name, which consists of the group names and the dataset name separated by slashes "/", for instance "groupA/groupB/name". Datasets need not belong to any group (equivalently, they may belong to the root group), in which case they are accessed by name alone, with no leading slash. The module abstracts the nature of the storage; its only assumption is that datasets can be accessed by their qualified names.
The following sections briefly describe how to use this module and can serve as a quick-start guide. For more detailed information, refer to the documentation of each individual class and method in the XYDataset namespace.
Three terms are central to this module: QualifiedName, Group and dataset name. They are described below.
A Group is a sequence of words separated by the '/' character, e.g. group1/group2/name. In that example group1 and group2 could be directories, and name could be a dataset name.
A dataset name is either a file name (without path and extension) or a name defined inside a file (a specific keyword in a FITS file, or a name given on a comment line of an ASCII file).
QualifiedName is itself a class. It represents a name qualified with a set of groups. The groups and the name are separated by the '/' character (e.g. group1/group2/name). The qualified name is assumed to be unique and is used as an identifier.
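To make the relation between groups and dataset name concrete, here is a minimal sketch that splits a qualified name at its slashes. The type QualifiedNameSketch and the function splitQualifiedName are hypothetical stand-ins written with the standard library only; they are not the QualifiedName class of this module.

```cpp
#include <string>
#include <vector>

// Stand-in for illustration only; this is not the Alexandria QualifiedName class.
struct QualifiedNameSketch {
  std::vector<std::string> groups;  // e.g. {"groupA", "groupB"}
  std::string name;                 // e.g. "name"
};

// Splits "groupA/groupB/name" into its group names and its dataset name.
// A string without any '/' is treated as a dataset in the root group.
QualifiedNameSketch splitQualifiedName(const std::string& qualified) {
  QualifiedNameSketch result;
  std::string::size_type start = 0;
  std::string::size_type slash = qualified.find('/', start);
  while (slash != std::string::npos) {
    result.groups.push_back(qualified.substr(start, slash - start));
    start = slash + 1;
    slash = qualified.find('/', start);
  }
  result.name = qualified.substr(start);  // everything after the last '/'
  return result;
}
```

With this sketch, "groupA/groupB/name" yields the groups groupA and groupB and the dataset name name, while a bare "name" yields no groups at all.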
The following examples show how to create a new XYDataset object and perform some operations on it.
Note: the second line makes the example code more readable by introducing all the symbols of the XYDataset namespace.
An XYDataset object can be created in two ways: from a vector of pairs of doubles, or from two vectors of doubles (one holding the X values, the other the Y values).
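The two input shapes carry the same information. The sketch below zips two vectors into the pair representation; the alias XYPairs and the function fromTwoVectors are hypothetical names, and rejecting vectors of different length is an assumption of this sketch rather than documented behaviour of the XYDataset class.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical alias for the pair-based representation; not the library type.
using XYPairs = std::vector<std::pair<double, double>>;

// Zips an X vector and a Y vector into (x, y) pairs.
// Assumption of this sketch: vectors of different length are an error.
XYPairs fromTwoVectors(const std::vector<double>& x, const std::vector<double>& y) {
  if (x.size() != y.size()) {
    throw std::invalid_argument("X and Y vectors must have the same size");
  }
  XYPairs pairs;
  pairs.reserve(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    pairs.emplace_back(x[i], y[i]);  // the i-th X goes with the i-th Y
  }
  return pairs;
}
```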
The easiest way to access the data of an XYDataset object is through the iterators provided by the XYDataset class. The XYDataset::begin() and XYDataset::end() methods return iterators which can be used to walk through all the data. The following lines of code demonstrate how to print the full contents of an XYDataset object on the screen using these iterators:
Using the vector1 and vector2 defined above, the code is as follows:
Output:
You can get the size of the XYDataset object above by using the following code:
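The iteration and size queries described above can be sketched with a small stand-in type. XYDatasetSketch and formatDataset are hypothetical names; only the begin(), end() and size() member names mirror the text, and the real XYDataset class is not reproduced here.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Stand-in for illustration; the real XYDataset class is not reproduced here.
class XYDatasetSketch {
public:
  using Pairs = std::vector<std::pair<double, double>>;
  explicit XYDatasetSketch(Pairs data) : m_data(std::move(data)) {}
  Pairs::const_iterator begin() const { return m_data.begin(); }
  Pairs::const_iterator end() const { return m_data.end(); }
  std::size_t size() const { return m_data.size(); }
private:
  Pairs m_data;
};

// Builds one "x y" line per point, walking the dataset from begin() to end().
std::string formatDataset(const XYDatasetSketch& dataset) {
  std::ostringstream out;
  for (auto it = dataset.begin(); it != dataset.end(); ++it) {
    out << it->first << ' ' << it->second << '\n';
  }
  return out.str();
}
```

Because the stand-in exposes begin() and end(), a range-based for loop over the dataset works as well.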
The AsciiParser class does this work. It reads ASCII files which contain space- or tab-separated tables of two columns. The first column contains the X data and the second the Y data. Comments are supported through the "#" character.
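The file format described above can be read with a few lines of standard C++. The function parseTwoColumns below is a hypothetical sketch, not the AsciiParser implementation; it reads from a stream, and it assumes that a "#" may also start a comment in the middle of a line and that lines without two numbers are skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Reads a two-column, space- or tab-separated table from a stream.
// Assumptions of this sketch: anything after '#' is a comment, and lines that
// do not hold two numbers (blank or comment-only lines) are skipped.
std::vector<std::pair<double, double>> parseTwoColumns(std::istream& in) {
  std::vector<std::pair<double, double>> points;
  std::string line;
  while (std::getline(in, line)) {
    const std::string::size_type hash = line.find('#');
    if (hash != std::string::npos) {
      line.erase(hash);  // drop the comment part of the line
    }
    std::istringstream fields(line);
    double x = 0.0;
    double y = 0.0;
    if (fields >> x >> y) {
      points.emplace_back(x, y);  // first column is X, second is Y
    }
  }
  return points;
}
```

Passing a std::ifstream opened on the data file is enough to use the sketch on a real file.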
The following example shows how to obtain an XYDataset object by reading an ASCII file:
The ascii_ptr is a unique pointer to the XYDataset object. An exception will be thrown if the file is not found.
The AsciiParser class also extracts the dataset name. It is taken from the first non-empty line of the file, as the first match of a regular expression. If the regular expression does not match, the name of the file (excluding the extension) is used as the name of the dataset. To get the dataset name proceed as follows:
Note that the AsciiParser class inherits from the FileParser interface class. The FileParser class declares two virtual functions: getName and getDataset.
The FitsParser class serves that purpose. It offers the same functionality as the AsciiParser class. To get the dataset and the dataset name proceed as follows:
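The name lookup rule above (first non-empty line, regular expression, file name as fallback) can be sketched as follows. The function datasetNameSketch is hypothetical and takes the file contents as a string to stay self-contained. The pattern is a parameter because the exact expression is not given here, and reading the name from the first capture group is an assumption of this sketch.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Returns the dataset name found on the first non-empty line of `contents`,
// or the file name without directory and extension when the pattern fails.
std::string datasetNameSketch(const std::string& contents, const std::string& file_path,
                              const std::regex& name_pattern) {
  std::istringstream in(contents);
  std::string line;
  while (std::getline(in, line)) {
    if (line.find_first_not_of(" \t\r") == std::string::npos) {
      continue;  // skip empty or whitespace-only lines
    }
    std::smatch match;
    if (std::regex_search(line, match, name_pattern) && match.size() > 1) {
      return match[1].str();  // assumption: capture group 1 holds the name
    }
    break;  // only the first non-empty line is inspected
  }
  // Fallback: strip the directory part and the extension from the path.
  const std::string::size_type slash = file_path.find_last_of('/');
  std::string stem = (slash == std::string::npos) ? file_path : file_path.substr(slash + 1);
  const std::string::size_type dot = stem.find_last_of('.');
  if (dot != std::string::npos && dot != 0) {
    stem.erase(dot);
  }
  return stem;
}
```

A pattern such as `^\s*#\s*(\w+)\s*$` would pick up a name written alone on a comment line; that pattern is only an example.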
The FileSystemProvider class does that for you. This class inherits from the XYDatasetProvider class. The FileSystemProvider class handles files in a directory tree of the file system. The directory path of each file and the name of the dataset are used to construct the qualified name that is matched against the identifier. To support different file formats, file-related operations (getting the dataset name and the data) are delegated to the FileParser interface.
Let's see a few examples of how it works. In these examples we consider the following two ASCII files:
We create a unique pointer to a FileParser object in order to build a FileSystemProvider object as follows:
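How a directory path and a dataset name combine into a qualified name can be sketched with plain string handling. The function qualifiedNameFromPath and its three parameters are hypothetical; the sketch assumes '/' as the path separator and does not reproduce the FileSystemProvider implementation.

```cpp
#include <stdexcept>
#include <string>

// Builds "group/.../name" from the root path, the path of a file below that
// root, and the dataset name reported by a parser.
std::string qualifiedNameFromPath(std::string root, const std::string& file_path,
                                  const std::string& dataset_name) {
  if (!root.empty() && root.back() != '/') {
    root += '/';  // normalise so that "/tmp/euclid" and "/tmp/euclid/" agree
  }
  if (file_path.compare(0, root.size(), root) != 0) {
    throw std::invalid_argument("file is not located under the root path");
  }
  // Path relative to the root, e.g. "filter/MER/some_file.txt".
  const std::string relative = file_path.substr(root.size());
  // Keep only the directory part, including its trailing '/'.
  const std::string::size_type slash = relative.find_last_of('/');
  const std::string groups =
      (slash == std::string::npos) ? std::string() : relative.substr(0, slash + 1);
  return groups + dataset_name;
}
```

Note that the file name itself does not appear in the result: the groups come from the directories and the last component comes from the dataset name.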
The "/tmp/euclid/" string is the root path to the data.
We use the listContents function to get all qualified names for the "filter/MER" group as follows:
The following code displays the dataset names and qualified names of the result vector:
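The idea behind listing a group can be sketched as a prefix filter over known qualified names. The function listGroupSketch is a hypothetical stand-in that works on plain strings; it is not the listContents implementation, and including datasets of nested groups in the result is an assumption of this sketch.

```cpp
#include <string>
#include <vector>

// Returns the qualified names that sit under `group`.
// Assumption of this sketch: nested groups are included in the result.
std::vector<std::string> listGroupSketch(const std::vector<std::string>& all_names,
                                         const std::string& group) {
  // The trailing '/' keeps "filter/MER" from matching "filter/MERX/...".
  const std::string prefix = group.empty() ? std::string() : group + "/";
  std::vector<std::string> result;
  for (const std::string& name : all_names) {
    if (name.compare(0, prefix.size(), prefix) == 0) {
      result.push_back(name);
    }
  }
  return result;
}
```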
And the output is:
Now that we have all qualified names for a specific group of files, we can get an XYDataset object for a specific qualified name (the first element, for instance) as follows:
In this example, the data held by "dataset_ptr" can be displayed as follows:
and the result is: