GSD documentation#
The GSD file format is the native file format for HOOMD-blue. GSD files store trajectories of the HOOMD-blue system state in a binary file with efficient random access to frames. GSD allows all particle and topology properties to vary from one frame to the next. Use the GSD Python API to specify the initial condition for a HOOMD-blue simulation or analyze trajectory output with a script. Read a GSD trajectory with a visualization tool to explore the behavior of the simulation.
GitHub Repository: GSD source code and issue tracker.
HOOMD-blue: Simulation engine that reads and writes GSD files.
hoomd-users Google Group: Ask questions to the HOOMD-blue community.
freud: A powerful set of tools for analyzing trajectories.
OVITO: The Open Visualization Tool works with GSD files.
gsd-vmd plugin: VMD plugin to support GSD files.
Installation#
gsd binaries are available in the glotzerlab-software Docker/Singularity images and in packages on conda-forge and PyPI. You can also compile gsd from source, embed gsd.c in your code, or read gsd files with a pure Python reader pygsd.py.
Binaries#
Conda package#
gsd is available on conda-forge on the linux-64, linux-aarch64, linux-ppc64le, osx-64, osx-arm64 and win-64 platforms. To install, download and install miniforge or miniconda. Then install gsd from the conda-forge channel:
$ conda install -c conda-forge gsd
Singularity / Docker images#
See the glotzerlab-software documentation for instructions to install and use the containers.
PyPI#
Use pip to install gsd binaries:
$ python3 -m pip install gsd
Compile from source#
To build the gsd Python package from source:
- $ <package-manager> install cmake cython git numpy python pytest
- $ git clone https://github.com/glotzerlab/gsd
- $ python3 -m pip install -e gsd
OR Build with CMake for development:
$ cmake -B build/gsd -S gsd
$ cmake --build build/gsd
To run the tests (optional):
- $ pytest --pyargs gsd
To build the documentation from source (optional):
- $ <package-manager> install breathe doxygen sphinx furo ipython
- $ cd gsd && doxygen && cd ..
$ sphinx-build -b html gsd/doc build/gsd-documentation
The sections below provide details on each of these steps.
Install prerequisites#
gsd requires a number of tools and libraries to build.
Note
This documentation is generic. Replace <package-manager> with your package or module manager. You may need to adjust package names and/or install additional packages, such as -dev packages that provide headers needed to build gsd.
Tip
Create or use an existing virtual environment, one place where you can install dependencies and gsd:
$ python3 -m venv gsd-venv
You will need to activate your environment before installing or configuring gsd:
$ source gsd-venv/bin/activate
General requirements:
C compiler (tested with gcc 9-12, clang 10-14, visual studio 2019-2022)
Python >= 3.8
numpy >= 1.17.3
Cython >= 0.22
To build the documentation:
breathe
Doxygen
Sphinx
IPython
furo
an internet connection
To execute unit tests:
pytest >= 3.9.0
Obtain the source#
Clone using Git:
$ git clone https://github.com/glotzerlab/gsd
Release tarballs are also available on the GitHub release pages.
Install with setuptools#
Use pip to install the Python module into your virtual environment:
$ python3 -m pip install -e gsd
Build with CMake for development#
In addition to the setuptools build system, GSD also provides a CMake configuration for development and testing. You can assemble a functional Python module in the given build directory.
First, configure the build with cmake:
$ cmake -B build/gsd -S gsd
Then, build the code:
$ cmake --build build/gsd
When modifying code, you only need to repeat the build step to update your build - it will automatically reconfigure as needed.
Tip
Place your build directory in /tmp or /scratch for faster builds. CMake performs out-of-source builds, so the build directory can be anywhere on the filesystem.
Tip
Pass the following options to cmake to optimize the build for your processor: -DCMAKE_CXX_FLAGS=-march=native -DCMAKE_C_FLAGS=-march=native.
Important
When using a virtual environment, activate the environment and set the cmake prefix path before running CMake:
$ export CMAKE_PREFIX_PATH=<path-to-environment>
Run tests#
Use pytest to execute unit tests:
$ python3 -m pytest --pyargs gsd
Add the --validate option to include longer-running validation tests:
$ python3 -m pytest --pyargs gsd -p gsd.pytest_plugin_validate --validate
Tip
When using CMake builds, change to the build directory before running pytest:
$ cd build/gsd
Build the documentation#
Run Doxygen to generate the C documentation:
$ cd gsd
$ doxygen
$ cd ..
Run Sphinx to build the HTML documentation:
$ sphinx-build -b html gsd/doc build/gsd-documentation
Open the file build/gsd-documentation/index.html in your web browser to view the documentation.
Tip
When iteratively modifying the documentation, the sphinx options -a -n -W -T --keep-going are helpful to produce docs with consistent links in the side panel and to see more useful error messages:
$ sphinx-build -a -n -W -T --keep-going -b html gsd/doc build/gsd-documentation
Tip
When using CMake builds, set PYTHONPATH to the build directory before running sphinx-build:
$ PYTHONPATH=build/gsd sphinx-build -b html gsd/doc build/gsd-documentation
Embedding GSD in your project#
Using the C library#
gsd is implemented in a single C file. Copy gsd/gsd.h and gsd/gsd.c into your project.
Using the pure Python reader#
If you only need to read files, you can skip installing and just extract the modules gsd/pygsd.py and gsd/hoomd.py. Together, these implement a pure Python reader for gsd and HOOMD files - no C compiler required.
Change Log#
GSD releases follow semantic versioning.
3.x#
3.1.1 (2023-08-03)#
Fixed:
Raise a FileExistsError when opening a file that already exists with mode = 'x'.
3.1.0 (2023-07-28)#
Fixed:
hoomd.read_log no longer triggers a numpy deprecation warning.
Added:
HOOMDTrajectory.flush - flush buffered writes on an open HOOMDTrajectory.
3.0.1 (2023-06-20)#
Fixed:
Prevent ValueError: signal only works in main thread of the main interpreter when importing gsd in a non-main thread.
3.0.0 (2023-06-16)#
Added:
gsd.version.version - version string identifier. PEP8 compliant name replaces __version__.
GSDFile.flush - flush write buffers (C API gsd_flush) (#237).
GSDFile.maximum_write_buffer_size - get/set the write buffer size (C API gsd_get_maximum_write_buffer_size / gsd_set_maximum_write_buffer_size) (#237).
GSDFile.index_entries_to_buffer - get/set the write buffer size (C API index_entries_to_buffer / index_entries_to_buffer) (#237).
On importing gsd, install a SIGTERM handler that calls sys.exit(1) (#237).
Changed:
write_chunk buffers writes across frames to increase performance (#237).
Use Doxygen and breathe to generate C API documentation in Sphinx (#237).
Removed:
2.x#
2.9.0 (2023-05-19)#
Added:
File modes 'r', 'r+', 'w', 'x', and 'a' (#238).
Changed:
Test on gcc9, clang10, and newer (#235).
Test and provide binary wheels on Python 3.8 and newer (#235).
Deprecated:
v2.8.1 (2023-03-13)#
Fixed:
Reduce memory usage in most use cases.
Reduce likelihood of data corruption when writing GSD files.
v2.8.0 (2023-02-24)#
Added:
gsd.hoomd.read_log - Read log quantities from a GSD file.
gsd.hoomd.Frame class to replace gsd.hoomd.Snapshot.
Changed:
Improved documentation.
Deprecated:
gsd.hoomd.Snapshot.
v2.7.0 (2022-11-30)#
Added
Support Python 3.11.
v2.6.1 (2022-11-04)#
Fixed:
Default values are now written to frame N (N != 0) when non-default values exist in frame 0.
Data chunks can now be read from files opened in ‘wb’, ‘xb’, and ‘ab’ modes.
v2.6.0 (2022-08-19)#
Changed:
Raise an error when writing a frame with duplicate types.
v2.5.3 (2022-06-22)#
Fixed
Support Python >=3.6.
v2.5.2 (2022-04-15)#
Fixed
Correctly handle non-ASCII characters on Windows.
Document that the fname argument to gsd_ C API functions is UTF-8 encoded.
v2.5.1 (2021-11-17)#
Added
Support for Python 3.10.
Support for clang 13.
v2.5.0 (2021-10-13)#
Changed
Improved documentation.
Deprecated
HOOMDTrajectory.read_frame - use indexing (trajectory[index]) to access frames from a trajectory.
v2.4.2 (2021-04-14)#
Added
MacOS and Windows wheels on PyPI.
Fixed
Documented array shapes for angles, dihedrals, and impropers.
v2.4.1 (2021-03-11)#
Added
Support macos-arm64.
Changed
Stop testing with clang 4-5, gcc 4.8-6.
v2.4.0 (2020-11-11)#
Changed
Set gsd.hoomd.ConfigurationData.dimensions default based on box’s \(L_z\) value.
Fixed
Failure in test_fl.py when run by a user and GSD was installed by root.
v2.3.0 (2020-10-30)#
Added
Support clang 11.
Support Python 3.9.
Changed
Install unit tests with the Python package.
Fixed
Compile error on macOS 10.15.
v2.2.0 (2020-08-05)#
Added
Command line convenience interface for opening a GSD file.
v2.1.2 (2020-06-26)#
Fixed
Adding missing close method to HOOMDTrajectory.
Documentation improvements.
v2.1.1 (2020-04-20)#
Fixed
List defaults in gsd.fl.open documentation.
v2.1.0 (2020-02-27)#
Added
Shape specification for sphere unions.
v2.0.0 (2020-02-03)#
Note
This release introduces a new file storage format.
GSD >= 2.0 can read and write to files created by GSD 1.x.
Files created or upgraded by GSD >= 2.0 cannot be opened by GSD 1.x.
Added
The upgrade method converts a GSD 1.0 file to a GSD 2.0 file in place.
Support arbitrarily long chunk names (only in GSD 2.0 files).
Changed
gsd.fl.open accepts None for application, schema, and schema_version when opening files for reading.
Improve read latency when accessing files with thousands of chunk names in a frame (only for GSD 2.0 files).
Buffer small writes to improve write performance.
Improve performance and reduce memory usage in read/write modes (‘rb+’, ‘wb+’ and ‘xb+’).
C API: functions return error codes from the gsd_error enum. v2.x integer error codes differ from v1.x, use the enum to check. For example: if (retval == GSD_ERROR_IO).
Python, Cython, and C code must follow strict style guidelines.
Removed
gsd.fl.create - use gsd.fl.open.
gsd.hoomd.create - use gsd.hoomd.open.
GSDFile v1.0 compatibility mode - use gsd.fl.open.
hoomdxml2gsd.py.
Fixed
Allow more than 127 data chunk names in a single GSD file.
v1.x#
v1.10.0 (2019-11-26)#
Improve performance of first frame write.
Allow pickling of GSD file handles opened in read only mode.
Removed Cython-generated code from repository. fl.pyx will be cythonized during installation.
v1.9.3 (2019-10-04)#
Fixed preprocessor directive affecting Windows builds using setup.py.
Documentation updates
v1.9.2 (2019-10-01)#
Support chunk sizes larger than 2GiB
v1.9.1 (2019-09-23)#
Support writing chunks wider than 255 from Python.
v1.9.0 (2019-09-18)#
File API: Add find_matching_chunk_names()
HOOMD schema 1.4: Add user defined logged data.
HOOMD schema 1.4: Add type_shapes specification.
pytest >= 3.9.0 is required to run unit tests.
gsd.fl.open and gsd.hoomd.open accept objects implementing os.PathLike.
Report an error when attempting to write a chunk that fails to allocate a name.
Reduce virtual memory usage in rb and wb open modes.
Additional checks for corrupt GSD files on open.
Synchronize after expanding file index.
v1.8.1 (2019-08-19)#
Correctly raise IndexError when attempting to read frames before the first frame.
Raise RuntimeError when importing gsd in unsupported Python versions.
v1.8.0 (2019-08-05)#
Slicing a HOOMDTrajectory object returns a view that can be used to directly select frames from a subset or sliced again.
raise IndexError when attempting to read frames before the first frame.
Dropped support for Python 2.
v1.7.0 (2019-04-30)#
Add hpmc/sphere/orientable to HOOMD schema.
HOOMD schema 1.3
v1.6.2 (2019-04-16)#
PyPI binary wheels now support numpy>=1.9.3,<2
v1.6.1 (2019-03-05)#
Documentation updates
v1.6.0 (2018-12-20)#
The length of sliced HOOMDTrajectory objects can be determined with the built-in len() function.
v1.5.5 (2018-11-28)#
Silence numpy deprecation warnings
v1.5.4 (2018-10-04)#
Add pyproject.toml file that defines numpy as a proper build dependency (requires pip >= 10)
Reorganize documentation
v1.5.3 (2018-05-22)#
Revert setup.py changes in v1.5.2 - these do not work in most circumstances.
Include sys/stat.h on all architectures.
v1.5.2 (2018-04-04)#
Close file handle on errors in gsd_open.
Always close file handle in gsd_close.
setup.py now correctly pulls in the numpy dependency.
v1.5.1 (2018-02-26)#
Documentation fixes.
v1.5.0 (2018-01-18)#
Read and write HPMC shape state data.
v1.4.0 (2017-12-04)#
Support reading and writing chunks with 0 length. No schema changes are necessary to support this.
v1.3.0 (2017-11-17)#
Document state entries in the HOOMD schema.
No changes to the gsd format or reader code in v1.3.
v1.2.0 (2017-02-21)#
Add gsd.hoomd.open() method which can create and open hoomd gsd files.
Add gsd.fl.open() method which can create and open gsd files.
The previous create/class GSDFile instantiation is still supported for backward compatibility.
v1.1.0 (2016-10-04)#
Add special pairs section pairs/ to HOOMD schema.
HOOMD schema version is now 1.1.
v1.0.1 (2016-06-15)#
Fix compile error on more strict POSIX systems.
v1.0.0 (2016-05-24)#
Initial release.
User community#
hoomd-users mailing list#
GSD primarily exists as a file format for HOOMD-blue, so please use the hoomd-users mailing list. Subscribe for release announcements, to post questions for advice on using the software, and to discuss potential new features.
Issue tracker#
File bug reports on GSD’s issue tracker.
HOOMD examples#
gsd.hoomd provides high-level access to HOOMD schema GSD files.
View the page source to find unformatted example code.
Import the module#
In [1]: import gsd.hoomd
...: import numpy
Define a frame#
In [2]: frame = gsd.hoomd.Frame()
In [3]: frame.particles.N = 4
In [4]: frame.particles.types = ['A', 'B']
In [5]: frame.particles.typeid = [0,0,1,1]
In [6]: frame.particles.position = [[0,0,0],[1,1,1], [-1,-1,-1], [1,-1,-1]]
In [7]: frame.configuration.box = [3, 3, 3, 0, 0, 0]
gsd.hoomd.Frame stores the state of a single system configuration, or frame, in the file. Instantiate this class to create a system configuration. All fields default to None. Each field is written to the file when not None and when the data does not match the data in the first frame or defaults specified in the schema.
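The write rule above can be modeled in a few lines. This is a simplified sketch, not gsd's implementation: the helper name should_write and its arguments are hypothetical, and plain Python lists stand in for numpy arrays.

```python
def should_write(value, frame_index, first_frame_value, schema_default):
    """Simplified model of when a field needs to be stored (hypothetical helper)."""
    if value is None:
        # Unset fields are never written.
        return False
    if frame_index == 0 or first_frame_value is None:
        # Nothing stored in the first frame: compare against the schema default.
        return value != schema_default
    # Later frames: compare against the data stored in the first frame.
    return value != first_frame_value


default_typeid = [0, 0, 0, 0]
print(should_write(None, 0, None, default_typeid))                  # False
print(should_write([0, 0, 1, 1], 0, None, default_typeid))          # True
print(should_write([0, 0, 1, 1], 3, [0, 0, 1, 1], default_typeid))  # False
print(should_write([0, 0, 0, 0], 3, [0, 0, 1, 1], default_typeid))  # True
```

The last call corresponds to the case described in the 2.6.1 change log entry: a default value in a later frame still has to be stored when the first frame holds non-default data.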
Create a hoomd gsd file#
In [8]: f = gsd.hoomd.open(name='file.gsd', mode='w')
Use gsd.hoomd.open to open a GSD file as a gsd.hoomd.HOOMDTrajectory instance.
Write frames to a gsd file#
In [9]: def create_frame(i):
...:     frame = gsd.hoomd.Frame()
...:     frame.configuration.step = i
...:     frame.particles.N = 4+i
...:     frame.particles.position = numpy.random.random(size=(4+i,3))
...:     return frame
...:
In [10]: f = gsd.hoomd.open(name='example.gsd', mode='w')
In [11]: f.extend( (create_frame(i) for i in range(10)) )
In [12]: f.append( create_frame(10) )
In [13]: len(f)
Out[13]: 11
gsd.hoomd.HOOMDTrajectory is similar to a sequence of gsd.hoomd.Frame objects. The append and extend methods add frames to the trajectory.
Tip
When using extend, pass in a generator or generator expression to avoid storing the entire trajectory in memory before writing it out.
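To illustrate the tip, the sketch below contrasts a list with a generator expression. The make_frame function and its dictionary payload are stand-ins chosen for illustration; in real code the frames would be gsd.hoomd.Frame objects passed to extend.

```python
def make_frame(i):
    """Stand-in for building a gsd.hoomd.Frame (hypothetical payload)."""
    return {"step": i, "position": [[0.0, 0.0, 0.0]] * (4 + i)}


# A list builds every frame up front and keeps all of them alive at once.
all_frames = [make_frame(i) for i in range(1000)]

# A generator expression builds each frame only when the consumer asks for it.
lazy_frames = (make_frame(i) for i in range(1000))

# The consumer (here a simple loop) sees one frame at a time.
steps = [frame["step"] for frame in lazy_frames]
```

With the generator, each frame can be released as soon as the consumer is done with it, so peak memory stays close to the size of a single frame.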
Randomly index frames#
In [14]: f = gsd.hoomd.open(name='example.gsd', mode='r')
In [15]: frame = f[5]
In [16]: frame.configuration.step
Out[16]: 5
In [17]: frame.particles.N
Out[17]: 9
In [18]: frame.particles.position
Out[18]:
array([[0.76987386, 0.8500083 , 0.6554387 ],
[0.70285565, 0.39068496, 0.9691509 ],
[0.2806916 , 0.4078087 , 0.21823367],
[0.21312587, 0.5539063 , 0.63151246],
[0.39110786, 0.45884034, 0.64764583],
[0.05678358, 0.5388946 , 0.44161376],
[0.1593161 , 0.45715162, 0.34260198],
[0.12676366, 0.15313694, 0.91920704],
[0.9035517 , 0.4697146 , 0.30803636]], dtype=float32)
gsd.hoomd.HOOMDTrajectory supports random indexing of frames in the file. Indexing into a trajectory returns a gsd.hoomd.Frame.
Slicing and selection#
Use the slicing operator to select individual frames or a subset of a trajectory.
In [19]: f = gsd.hoomd.open(name='example.gsd', mode='r')
In [20]: for frame in f[5:-2]:
....:     print(frame.configuration.step, end=' ')
....:
5 6 7 8
In [21]: every_2nd_frame = f[::2] # create a view of a trajectory subset
In [22]: for frame in every_2nd_frame[:4]:
....:     print(frame.configuration.step, end=' ')
....:
0 2 4 6
Slicing a trajectory creates a trajectory view, which can then be queried for length or sliced again.
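One way to picture a trajectory view is as a lazily composed selection of frame indices. Python's built-in range composes slices the same way, so it can reproduce the frame numbers printed above. This is an analogy only, not a description of how gsd implements views.

```python
frame_indices = range(11)        # example.gsd above holds 11 frames

subset = frame_indices[5:-2]     # same slice as f[5:-2]
print(list(subset))              # [5, 6, 7, 8]

every_2nd = frame_indices[::2]   # same slice as f[::2]
print(len(every_2nd))            # 6
print(list(every_2nd[:4]))       # [0, 2, 4, 6]
```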
Pure python reader#
In [23]: f = gsd.pygsd.GSDFile(open('example.gsd', 'rb'))
In [24]: trajectory = gsd.hoomd.HOOMDTrajectory(f);
In [25]: trajectory[3].particles.position
Out[25]:
array([[0.82144237, 0.7534815 , 0.20822531],
[0.78262943, 0.6135626 , 0.6509529 ],
[0.14822051, 0.12288113, 0.5776071 ],
[0.6893253 , 0.18314475, 0.63959736],
[0.21940948, 0.8104691 , 0.3400011 ],
[0.7660661 , 0.22540931, 0.8791339 ],
[0.04646937, 0.18391052, 0.31848043]], dtype=float32)
You can use GSD without needing to compile C code to read GSD files using gsd.pygsd.GSDFile in combination with gsd.hoomd.HOOMDTrajectory. It only supports the rb mode and does not read files as fast as the C implementation. It takes in a python file-like object, so it can be used with in-memory IO classes, and grid file classes that access data over the internet.
Warning
gsd.pygsd is slow. Use gsd.hoomd.open whenever possible.
Access logged data#
In [26]: with gsd.hoomd.open(name='log-example.gsd', mode='w') as f:
....:     frame = gsd.hoomd.Frame()
....:     frame.particles.N = 4
....:     for i in range(10):
....:         frame.configuration.step = i*100
....:         frame.log['particles/net_force'] = numpy.array([[-1,2,-3+i],
....:                                                         [0,2,-4],
....:                                                         [-3,2,1],
....:                                                         [1,2,3]],
....:                                                        dtype=numpy.float32)
....:         frame.log['value/potential_energy'] = 1.5+i
....:         f.append(frame)
....:
Logged data is stored in the log dictionary as numpy arrays. Place data into this dictionary directly without the 'log/' prefix and gsd will include it in the output. Store per-particle quantities with the prefix particles/. Choose another prefix for other quantities.
In [27]: log = gsd.hoomd.read_log(name='log-example.gsd', scalar_only=True)
In [28]: list(log.keys())
Out[28]: ['configuration/step', 'log/value/potential_energy']
In [29]: log['log/value/potential_energy']
Out[29]: array([ 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5])
In [30]: log['configuration/step']
Out[30]: array([ 0, 100, 200, 300, 400, 500, 600, 700, 800, 900], dtype=uint64)
Read logged data from the log dictionary.
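The naming convention in the session above (keys written without the 'log/' prefix, read back with it, and only scalar quantities kept when scalar_only=True) can be mimicked with plain dictionaries. The sketch below is a toy stand-in: collect_scalars and the frame dictionaries are hypothetical and do not call gsd.

```python
# Per-frame data shaped like the log-example.gsd session above.
frames = []
for i in range(10):
    frames.append({
        "step": i * 100,
        "log": {
            "particles/net_force": [[-1, 2, -3 + i], [0, 2, -4], [-3, 2, 1], [1, 2, 3]],
            "value/potential_energy": 1.5 + i,
        },
    })


def collect_scalars(frames):
    """Toy stand-in for gathering scalar log quantities across frames."""
    result = {"configuration/step": []}
    for frame in frames:
        result["configuration/step"].append(frame["step"])
        for key, value in frame["log"].items():
            if isinstance(value, (int, float)):  # keep scalar quantities only
                # Stored names carry the 'log/' prefix.
                result.setdefault("log/" + key, []).append(value)
    return result


log = collect_scalars(frames)
print(list(log.keys()))  # ['configuration/step', 'log/value/potential_energy']
```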
Note
Logged data must be convertible to a numpy array of a supported type.
In [31]: with gsd.hoomd.open(name='example.gsd', mode='w') as f:
....:     frame = gsd.hoomd.Frame()
....:     frame.particles.N = 4
....:     frame.log['invalid'] = dict(a=1, b=5)
....:     f.append(frame)
....:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[31], line 5
3 frame.particles.N = 4
4 frame.log['invalid'] = dict(a=1, b=5)
----> 5 f.append(frame)
File ~/checkouts/readthedocs.org/user_builds/gsd/envs/v3.1.1/lib/python3.11/site-packages/gsd/hoomd.py:780, in HOOMDTrajectory.append(self, frame)
778 # write log data
779 for log, data in frame.log.items():
--> 780 self.file.write_chunk('log/' + log, data)
782 self.file.end_frame()
File ~/checkouts/readthedocs.org/user_builds/gsd/envs/v3.1.1/lib/python3.11/site-packages/gsd/fl.pyx:611, in gsd.fl.GSDFile.write_chunk()
ValueError: invalid type for chunk: log/invalid
Use multiprocessing#
import multiprocessing
import gsd.hoomd
def count_particles(args):
    # Unpack the pickled trajectory handle and the index of the frame to read.
    t, frame_idx = args
    return len(t[frame_idx].particles.position)
with gsd.hoomd.open(name='example.gsd', mode='r') as t:
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        result = pool.map(count_particles, [(t, frame_idx) for frame_idx in range(len(t))])
result
gsd.hoomd.HOOMDTrajectory can be pickled when in read mode to allow for multiprocessing through Python’s multiprocessing library. Here count_particles finds the number of particles in a single frame, and pool.map collects the counts for all frames in a list.
Using the command line#
The GSD library provides a command line interface for reading files with first-class support for reading HOOMD GSD files. The CLI opens a Python interpreter with a file opened in a specified mode.
$ gsd read -s hoomd 'example.gsd'
...
File: example.gsd
Number of frames: 11
The GSD file handle is available via the "handle" variable.
For supported schema, you may access the trajectory using the "traj" variable.
Type "help(handle)" or "help(traj)" for more information.
The gsd and gsd.fl packages are always loaded.
Schema-specific modules (e.g. gsd.hoomd) are loaded if available.
>>> len(traj)
11
>>> traj[0].particles.position.shape == (4, 3)
True
>>> handle.read_chunk(0, 'particles/N')
array([4], dtype=uint32)
File layer examples#
The file layer python module gsd.fl allows direct low level access to read and write GSD files of any schema. The HOOMD reader (gsd.hoomd) provides higher level access to HOOMD schema files, see HOOMD examples.
View the page source to find unformatted example code.
Import the module#
In [1]: import gsd.fl
...: import numpy
Open a gsd file#
In [2]: f = gsd.fl.open(name="file.gsd",
...: mode='w',
...: application="My application",
...: schema="My Schema",
...: schema_version=[1,0])
...:
Use gsd.fl.open to open a gsd file.
Note
When creating a new file, you must specify the application name, schema name, and schema version.
Warning
Opening a gsd file with the ‘w’ mode overwrites any existing file with the given name. The ‘x’ mode does not overwrite: it raises a FileExistsError when the file already exists.
Close a gsd file#
In [3]: f.close()
Call the close method to close the file.
Write data#
In [4]: f = gsd.fl.open(name="file.gsd",
...: mode='w',
...: application="My application",
...: schema="My Schema",
...: schema_version=[1,0]);
...:
In [5]: f.write_chunk(name='chunk1', data=numpy.array([1,2,3,4], dtype=numpy.float32))
In [6]: f.write_chunk(name='chunk2', data=numpy.array([[5,6],[7,8]], dtype=numpy.float32))
In [7]: f.end_frame()
In [8]: f.write_chunk(name='chunk1', data=numpy.array([9,10,11,12], dtype=numpy.float32))
In [9]: f.write_chunk(name='chunk2', data=numpy.array([[13,14],[15,16]], dtype=numpy.float32))
In [10]: f.end_frame()
In [11]: f.close()
Add any number of named data chunks to each frame in the file with write_chunk. The data must be a 1 or 2 dimensional numpy array of a simple numeric type (or a data type that will automatically convert when passed to numpy.array(data)). Call end_frame to end the frame and start the next one.
Note
While supported, implicit conversion to numpy arrays creates a copy of the data in memory and adds conversion overhead.
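The data requirements and the conversion cost described above can be checked before writing. The helper below is a hypothetical sketch (as_chunk_data is not part of gsd), and its dtype test is only an approximation of "simple numeric type".

```python
import numpy


def as_chunk_data(data):
    """Hypothetical helper: coerce data to a 1D or 2D numeric numpy array."""
    array = numpy.asarray(data)  # returns the input unchanged if it is already an ndarray
    if array.ndim not in (1, 2):
        raise ValueError("chunk data must be 1 or 2 dimensional")
    if array.dtype.kind not in "iuf":  # signed int, unsigned int, float (approximation)
        raise ValueError("chunk data must have a simple numeric type")
    return array


positions = numpy.zeros((4, 3), dtype=numpy.float32)
same = as_chunk_data(positions)             # existing array: passed through, no copy
converted = as_chunk_data([[1, 2], [3, 4]])  # nested list: a new array is allocated
print(same is positions, converted.shape)    # True (2, 2)
```

Building the numpy array once, with the intended dtype, avoids paying the conversion on every write.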
Warning
Call end_frame to write the last frame before closing the file.
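A toy model may help show why the final end_frame call matters. The class below is purely illustrative and is not how gsd is implemented: ToyFrameWriter and its attributes are hypothetical names, and nothing is written to disk.

```python
class ToyFrameWriter:
    """Toy model of frame-based writing (not the gsd implementation)."""

    def __init__(self):
        self.frames = []    # completed frames
        self._pending = {}  # chunks written since the last end_frame()

    def write_chunk(self, name, data):
        self._pending[name] = data

    def end_frame(self):
        # Commit the pending chunks as one frame and start the next frame.
        self.frames.append(self._pending)
        self._pending = {}

    def close(self):
        # Chunks that were never committed with end_frame() are dropped.
        self._pending = {}


w = ToyFrameWriter()
w.write_chunk("chunk1", [1, 2, 3, 4])
w.end_frame()
w.write_chunk("chunk1", [9, 10, 11, 12])  # no end_frame() after this write
w.close()
print(len(w.frames))  # 1
```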
Read data#
In [12]: f = gsd.fl.open(name="file.gsd", mode='r')
In [13]: f.read_chunk(frame=0, name='chunk1')
Out[13]: array([1., 2., 3., 4.], dtype=float32)
In [14]: f.read_chunk(frame=1, name='chunk2')
Out[14]:
array([[13., 14.],
[15., 16.]], dtype=float32)
In [15]: f.close()
read_chunk reads the named chunk at the given frame index in the file and returns it as a numpy array.
Test if a chunk exists#
In [16]: f = gsd.fl.open(name="file.gsd", mode='r')
In [17]: f.chunk_exists(frame=0, name='chunk1')
Out[17]: True
In [18]: f.chunk_exists(frame=1, name='chunk2')
Out[18]: True
In [19]: f.chunk_exists(frame=2, name='chunk1')
Out[19]: False
In [20]: f.close()
chunk_exists tests to see if a chunk by the given name exists in the file at the given frame.
Discover chunk names#
In [21]: f = gsd.fl.open(name="file.gsd", mode='r')
In [22]: f.find_matching_chunk_names('')
Out[22]: ['chunk1', 'chunk2']
In [23]: f.find_matching_chunk_names('chunk')
Out[23]: ['chunk1', 'chunk2']
In [24]: f.find_matching_chunk_names('chunk1')
Out[24]: ['chunk1']
In [25]: f.find_matching_chunk_names('other')
Out[25]: []
find_matching_chunk_names finds all chunk names present in a GSD file that start with the given string.
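The matching rule is a plain prefix test. As a minimal sketch, assuming the chunk names are available as a Python list (the find_matching function here is a hypothetical stand-in, not the gsd method), the results shown above can be reproduced like this:

```python
def find_matching(names, match):
    """Return the names that start with the string match (prefix test)."""
    return [name for name in names if name.startswith(match)]


names = ['chunk1', 'chunk2']
print(find_matching(names, ''))        # ['chunk1', 'chunk2']
print(find_matching(names, 'chunk'))   # ['chunk1', 'chunk2']
print(find_matching(names, 'chunk1'))  # ['chunk1']
print(find_matching(names, 'other'))   # []
```

An empty match string is a prefix of every name, which is why it lists all chunk names.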
Read-only access#
In [26]: f = gsd.fl.open(name="file.gsd", mode='r')
In [27]: if f.chunk_exists(frame=0, name='chunk1'):
....:     data = f.read_chunk(frame=0, name='chunk1')
....:
In [28]: data
Out[28]: array([1., 2., 3., 4.], dtype=float32)
# Fails because the file is open read only
In [29]: f.write_chunk(name='error', data=numpy.array([1]))
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[29], line 1
----> 1 f.write_chunk(name='error', data=numpy.array([1]))
File ~/checkouts/readthedocs.org/user_builds/gsd/envs/v3.1.1/lib/python3.11/site-packages/gsd/fl.pyx:627, in gsd.fl.GSDFile.write_chunk()
File ~/checkouts/readthedocs.org/user_builds/gsd/envs/v3.1.1/lib/python3.11/site-packages/gsd/fl.pyx:55, in gsd.fl.__raise_on_error()
RuntimeError: File must be writable: file.gsd
In [30]: f.close()
Writes fail when a file is opened in a read only mode.
Access file metadata#
In [31]: f = gsd.fl.open(name="file.gsd", mode='r')
In [32]: f.name
Out[32]: 'file.gsd'
In [33]: f.mode
Out[33]: 'r'
In [34]: f.gsd_version
Out[34]: (2, 0)
In [35]: f.application
Out[35]: 'My application'
In [36]: f.schema
Out[36]: 'My Schema'
In [37]: f.schema_version
Out[37]: (1, 0)
In [38]: f.nframes
Out[38]: 2
In [39]: f.close()
Read file metadata from properties of the file object.
Open a file in read/write mode#
In [40]: f = gsd.fl.open(name="file.gsd",
....: mode='w',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [41]: f.write_chunk(name='double', data=numpy.array([1,2,3,4], dtype=numpy.float64));
In [42]: f.end_frame()
In [43]: f.nframes
Out[43]: 1
In [44]: f.read_chunk(frame=0, name='double')
Out[44]: array([1., 2., 3., 4.])
Open a file in read/write mode to allow both reading and writing.
Use as a context manager#
In [45]: with gsd.fl.open(name="file.gsd", mode='r') as f:
....:     data = f.read_chunk(frame=0, name='double');
....:
In [46]: data
Out[46]: array([1., 2., 3., 4.])
Use gsd.fl.GSDFile as a context manager for guaranteed file closure and cleanup when exceptions occur.
Store string chunks#
In [47]: f = gsd.fl.open(name="file.gsd",
....: mode='w',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [48]: f.mode
Out[48]: 'w'
In [49]: s = "This is a string"
In [50]: b = numpy.array([s], dtype=numpy.dtype((bytes, len(s)+1)))
In [51]: b = b.view(dtype=numpy.int8)
In [52]: b
Out[52]:
array([ 84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 116, 114,
105, 110, 103, 0], dtype=int8)
In [53]: f.write_chunk(name='string', data=b)
In [54]: f.end_frame()
In [55]: r = f.read_chunk(frame=0, name='string')
In [56]: r
Out[56]:
array([ 84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 116, 114,
105, 110, 103, 0], dtype=int8)
In [57]: r = r.view(dtype=numpy.dtype((bytes, r.shape[0])));
In [58]: r[0].decode('UTF-8')
Out[58]: 'This is a string'
In [59]: f.close()
To store a string in a gsd file, convert it to a numpy array of bytes and store that data in the file. Decode the byte sequence to get back a string.
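The conversion in both directions can be wrapped in two small helpers. This sketch uses only numpy and follows the same steps as the session above; the function names string_to_chunk and chunk_to_string are hypothetical, and the explicit UTF-8 encoding step is an addition so that non-ASCII text is sized correctly.

```python
import numpy


def string_to_chunk(text):
    """Encode a string as a 1D int8 array with a trailing NUL byte."""
    encoded = text.encode("utf-8")
    # One fixed-width bytes element, one byte longer than the encoded text.
    b = numpy.array([encoded], dtype=numpy.dtype((bytes, len(encoded) + 1)))
    return b.view(dtype=numpy.int8)


def chunk_to_string(chunk):
    """Decode a 1D int8 array produced by string_to_chunk."""
    raw = chunk.view(dtype=numpy.dtype((bytes, chunk.shape[0])))
    return raw[0].decode("utf-8")  # trailing NUL bytes are dropped by numpy


chunk = string_to_chunk("This is a string")
print(chunk.dtype, chunk.shape)  # int8 (17,)
print(chunk_to_string(chunk))    # This is a string
```

The array returned by string_to_chunk has the same form as the b array passed to write_chunk above.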
Truncate#
In [60]: f = gsd.fl.open(name="file.gsd", mode='r+')
In [61]: f.nframes
Out[61]: 1
In [62]: f.schema, f.schema_version, f.application
Out[62]: ('My Schema', (1, 0), 'My application')
In [63]: f.truncate()
In [64]: f.nframes
Out[64]: 0
In [65]: f.schema, f.schema_version, f.application
Out[65]: ('My Schema', (1, 0), 'My application')
In [66]: f.close()
Truncating a gsd file removes all data chunks from it, but retains the same schema, schema version, and application name. The file is not closed during this process. This is useful when writing restart files on a Lustre file system when file open operations need to be kept to a minimum.
gsd Python package#
GSD provides a Python API. Use the gsd.hoomd module to read and write files for HOOMD-blue.
Submodules#
gsd.fl module#
GSD file layer API.
Low level access to gsd files. gsd.fl allows direct access to create, read, and write gsd files. The module is implemented in C and is optimized.
See File layer examples for detailed example code.
- class gsd.fl.GSDFile#
GSD file access interface.
- Parameters:
name (str) – Name of the open file.
mode (str) – Mode of the open file.
gsd_version (tuple[int, int]) – GSD file layer version number (major, minor).
application (str) – Name of the generating application.
schema (str) – Name of the data schema.
schema_version (tuple[int, int]) – Schema version number (major, minor).
nframes (int) – Number of frames.
GSDFile implements an object oriented class interface to the GSD file layer. Use open() to open a GSD file and obtain a GSDFile instance. GSDFile can be used as a context manager.
- __reduce__()#
Allows filehandles to be pickled when in read only mode.
- chunk_exists(frame, name)#
Test if a chunk exists.
- Parameters:
- Returns:
True if the chunk exists in the file at the given frame. False if it does not.
- Return type:
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
...: application="My application",
...: schema="My Schema", schema_version=[1,0]) as f:
...:     f.write_chunk(name='chunk1',
...:         data=numpy.array([1,2,3,4],
...:         dtype=numpy.float32))
...:     f.write_chunk(name='chunk2',
...:         data=numpy.array([[5,6],[7,8]],
...:         dtype=numpy.float32))
...:     f.end_frame()
...:     f.write_chunk(name='chunk1',
...:         data=numpy.array([9,10,11,12],
...:         dtype=numpy.float32))
...:     f.write_chunk(name='chunk2',
...:         data=numpy.array([[13,14],[15,16]],
...:         dtype=numpy.float32))
...:     f.end_frame()
...:
In [2]: f = gsd.fl.open(name='file.gsd', mode='r',
...: application="My application",
...: schema="My Schema", schema_version=[1,0])
...:
In [3]: f.chunk_exists(frame=0, name='chunk1')
Out[3]: True
In [4]: f.chunk_exists(frame=0, name='chunk2')
Out[4]: True
In [5]: f.chunk_exists(frame=0, name='chunk3')
Out[5]: False
In [6]: f.chunk_exists(frame=10, name='chunk1')
Out[6]: False
In [7]: f.close()
- close()#
Close the file.
Once closed, any other operation on the file object will result in a ValueError. close() may be called more than once. The file is automatically closed when garbage collected or when the context manager exits.
Example
In [1]: f = gsd.fl.open(name='file.gsd', mode='w',
...: application="My application",
...: schema="My Schema", schema_version=[1,0])
...:
In [2]: f.write_chunk(name='chunk1',
...: data=numpy.array([1,2,3,4], dtype=numpy.float32))
...:
In [3]: f.end_frame()
In [4]: data = f.read_chunk(frame=0, name='chunk1')
In [5]: f.close()
# Read fails because the file is closed
In [6]: data = f.read_chunk(frame=0, name='chunk1')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[6], line 1
----> 1 data = f.read_chunk(frame=0, name='chunk1')
File ~/checkouts/readthedocs.org/user_builds/gsd/envs/v3.1.1/lib/python3.11/site-packages/gsd/fl.pyx:740, in gsd.fl.GSDFile.read_chunk()
ValueError: File is not open
- end_frame()#
Complete writing the current frame. After calling end_frame() future calls to write_chunk() will write to the next frame in the file.
Danger
Call end_frame() to complete the current frame before closing the file. If you fail to call end_frame(), the last frame will not be written to disk.
Example
In [1]: f = gsd.fl.open(name='file.gsd', mode='w',
...: application="My application",
...: schema="My Schema", schema_version=[1,0])
...:
In [2]: f.write_chunk(name='chunk1',
...: data=numpy.array([1,2,3,4], dtype=numpy.float32))
...:
In [3]: f.end_frame()
In [4]: f.write_chunk(name='chunk1',
...: data=numpy.array([9,10,11,12],
...: dtype=numpy.float32))
...:
In [5]: f.end_frame()
In [6]: f.write_chunk(name='chunk1',
...: data=numpy.array([13,14],
...: dtype=numpy.float32))
...:
In [7]: f.end_frame()
In [8]: f.nframes
Out[8]: 3
In [9]: f.close()
- find_matching_chunk_names(match)#
Find all the chunk names in the file that start with the string match.
- Parameters:
match (str) – Start of the chunk name to match
- Returns:
Matching chunk names
- Return type:
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
...: application="My application",
...: schema="My Schema", schema_version=[1,0]) as f:
...:     f.write_chunk(name='data/chunk1',
...:         data=numpy.array([1,2,3,4],
...:         dtype=numpy.float32))
...:     f.write_chunk(name='data/chunk2',
...:         data=numpy.array([[5,6],[7,8]],
...:         dtype=numpy.float32))
...:     f.write_chunk(name='input/chunk3',
...:         data=numpy.array([9, 10],
...:         dtype=numpy.float32))
...:     f.end_frame()
...:     f.write_chunk(name='input/chunk4',
...:         data=numpy.array([11, 12, 13, 14],
...:         dtype=numpy.float32))
...:     f.end_frame()
...:
In [2]: f = gsd.fl.open(name='file.gsd', mode='r',
...: application="My application",
...: schema="My Schema", schema_version=[1,0])
...:
In [3]: f.find_matching_chunk_names('')
Out[3]: ['data/chunk1', 'data/chunk2', 'input/chunk3', 'input/chunk4']
In [4]: f.find_matching_chunk_names('data')
Out[4]: ['data/chunk1', 'data/chunk2']
In [5]: f.find_matching_chunk_names('input')
Out[5]: ['input/chunk3', 'input/chunk4']
In [6]: f.find_matching_chunk_names('other')
Out[6]: []
In [7]: f.close()
- flush()#
Flush all buffered frames to the file.
- read_chunk(frame, name)#
Read a data chunk from the file and return it as a numpy array.
- Parameters:
- Returns:
Data read from file. N, M, and type are determined by the chunk metadata. If the data is NxM in the file and M > 1, return a 2D array. If the data is Nx1, return a 1D array.
- Return type:
(N,M) or (N,) numpy.ndarray of type
Tip
Each call invokes a disk read and allocation of a new numpy array for storage. To avoid overhead, call read_chunk() on the same chunk only once.
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()

In [2]: f = gsd.fl.open(name='file.gsd', mode='r',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])

In [3]: f.read_chunk(frame=0, name='chunk1')
Out[3]: array([1., 2., 3., 4.], dtype=float32)

In [4]: f.read_chunk(frame=1, name='chunk1')
Out[4]: array([ 9., 10., 11., 12.], dtype=float32)

In [5]: f.read_chunk(frame=2, name='chunk1')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[5], line 1
----> 1 f.read_chunk(frame=2, name='chunk1')

File ~/checkouts/readthedocs.org/user_builds/gsd/envs/v3.1.1/lib/python3.11/site-packages/gsd/fl.pyx:753, in gsd.fl.GSDFile.read_chunk()

KeyError: 'frame 2 / chunk chunk1 not found in: file.gsd'

In [6]: f.close()
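The shape rule stated for read_chunk (2D when M > 1, 1D when the data is Nx1) can be written down as a short sketch. The function chunk_shape is a hypothetical illustration, not a gsd API; it only computes the shape that the returned array would have for given N and M.

```python
def chunk_shape(n_rows, n_cols):
    """Shape of the array returned for an N x M chunk, following the rule above."""
    if n_cols > 1:
        return (n_rows, n_cols)  # N x M data with M > 1 comes back as a 2D array
    return (n_rows,)             # N x 1 data comes back as a 1D array


shape_1d = chunk_shape(4, 1)  # like 'chunk1' in the example: four values
shape_2d = chunk_shape(2, 2)  # like 'chunk2' in the example: a 2 x 2 block
```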
- truncate()#
Truncate all data from the file. After truncation, the file has no frames and no data chunks. The application, schema, and schema version remain the same.
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     for i in range(10):
   ...:         f.write_chunk(name='chunk1',
   ...:                       data=numpy.array([1,2,3,4],
   ...:                                        dtype=numpy.float32))
   ...:         f.end_frame()

In [2]: f = gsd.fl.open(name='file.gsd', mode='r+',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])

In [3]: f.nframes
Out[3]: 10

In [4]: f.schema, f.schema_version, f.application
Out[4]: ('My Schema', (1, 0), 'My application')

In [5]: f.truncate()

In [6]: f.nframes
Out[6]: 0

In [7]: f.schema, f.schema_version, f.application
Out[7]: ('My Schema', (1, 0), 'My application')

In [8]: f.close()
- upgrade()#
Upgrade a GSD file to the v2 specification in place. The file must be open in a writable mode.
- write_chunk(name, data)#
Write a data chunk to the file. After writing all chunks in the current frame, call end_frame().
- Parameters:
name (str) – Name of the chunk
data – Data to write into the chunk. Must be a numpy array, or array-like, with 2 or fewer dimensions.
Warning
write_chunk() implicitly converts array-like and non-contiguous numpy arrays to contiguous numpy arrays with numpy.ascontiguousarray(data). This may or may not produce the desired data types in the output file and incurs overhead.
Example
In [1]: f = gsd.fl.open(name='file.gsd', mode='w',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])

In [2]: f.write_chunk(name='float1d',
   ...:               data=numpy.array([1,2,3,4],
   ...:                                dtype=numpy.float32))

In [3]: f.write_chunk(name='float2d',
   ...:               data=numpy.array([[13,14],[15,16],[17,19]],
   ...:                                dtype=numpy.float32))

In [4]: f.write_chunk(name='double2d',
   ...:               data=numpy.array([[1,4],[5,6],[7,9]],
   ...:                                dtype=numpy.float64))

In [5]: f.write_chunk(name='int1d',
   ...:               data=numpy.array([70,80,90],
   ...:                                dtype=numpy.int64))

In [6]: f.end_frame()

In [7]: f.nframes
Out[7]: 1

In [8]: f.close()
- gsd.fl.open(name, mode, application=None, schema=None, schema_version=None)#
open() opens a GSD file and returns a GSDFile instance. The return value of open() can be used as a context manager.
- Parameters:
Valid values for mode:
mode    description
'r'     Open an existing file for reading.
'r+'    Open an existing file for reading and writing.
'w'     Open a file for reading and writing. Creates the file if needed, or overwrites an existing file.
'x'     Create a gsd file exclusively and open it for reading and writing. Raise FileExistsError if it already exists.
'a'     Open a file for reading and writing. Creates the file if it doesn’t exist.
When opening a file for reading ('r' and 'r+' modes): application and schema_version are ignored and may be None. When schema is not None, open() raises an exception if the file’s schema does not match schema.
When opening a file for writing ('w', 'x', or 'a' modes): The given application, schema, and schema_version must not be None.
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application", schema="My Schema",
   ...:                  schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()

In [2]: f = gsd.fl.open(name='file.gsd', mode='r')

In [3]: if f.chunk_exists(frame=0, name='chunk1'):
   ...:     data = f.read_chunk(frame=0, name='chunk1')

In [4]: data
Out[4]: array([1., 2., 3., 4.], dtype=float32)

In [5]: f.close()
gsd.hoomd module#
Read and write HOOMD schema GSD files.
gsd.hoomd reads and writes GSD files with the hoomd schema.
HOOMDTrajectory - Read and write hoomd schema GSD files.
Frame - Store the state of a single frame.
ConfigurationData - Store configuration data in a frame.
ParticleData - Store particle data in a frame.
BondData - Store topology data in a frame.
open - Open a hoomd schema GSD file.
read_log - Read log from a hoomd schema GSD file into a dict of time-series arrays.
See also
See HOOMD examples for full examples.
- class gsd.hoomd.BondData(M)#
Bases:
object
Store bond data chunks.
Use the Frame.bonds, Frame.angles, Frame.dihedrals, Frame.impropers, and Frame.pairs attributes to access the bond topology.
Instances resulting from file read operations will always store array quantities in numpy.ndarray objects of the defined types. User created frames may provide input data that can be converted to a numpy.ndarray.
See also
hoomd.State for a full description of how HOOMD interprets this data.
Note
M varies depending on the type of bond. BondData represents all types of topology connections.
Type      M
Bond      2
Angle     3
Dihedral  4
Improper  4
Pair      2
- N#
Number of bonds/angles/dihedrals/impropers/pairs in the frame (bonds/N, angles/N, dihedrals/N, impropers/N, pairs/N).
- Type:
- types#
Names of the bond types (bonds/types, angles/types, dihedrals/types, impropers/types, pairs/types).
- typeid#
Bond type id (bonds/typeid, angles/typeid, dihedrals/typeid, impropers/typeid, pairs/typeid).
- Type:
(N,) numpy.ndarray of numpy.uint32
- group#
Tags of the particles in the bond (bonds/group, angles/group, dihedrals/group, impropers/group, pairs/group).
- Type:
(N, M) numpy.ndarray of numpy.uint32
- validate()#
Validate all attributes.
Convert every array attribute to a numpy.ndarray of the proper type and check that all attributes have the correct dimensions.
Ignore any attributes that are None.
Warning
Array attributes that are not contiguous numpy arrays will be replaced with contiguous numpy arrays of the appropriate type.
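As a rough illustration of how M ties the group array to the kind of topology connection, the sketch below encodes the Type/M table for BondData as a dictionary and checks that every row of a group holds M particle tags. The names GROUP_SIZE and check_group are hypothetical and not part of gsd; validate() may perform additional or different checks.

```python
# M for each kind of topology connection, copied from the BondData table.
GROUP_SIZE = {'bonds': 2, 'angles': 3, 'dihedrals': 4, 'impropers': 4, 'pairs': 2}


def check_group(kind, group):
    """Return True when every row of group has M particle tags for this kind."""
    m = GROUP_SIZE[kind]
    return all(len(row) == m for row in group)


# Two bonds between particle tags (0, 1) and (1, 2): valid, since M = 2.
bonds_ok = check_group('bonds', [[0, 1], [1, 2]])
# The same rows offered as angles are invalid, since an angle needs M = 3 tags.
angles_ok = check_group('angles', [[0, 1], [1, 2]])
```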
- class gsd.hoomd.ConfigurationData#
Bases:
object
Store configuration data.
Use the Frame.configuration attribute to access the configuration.
- step#
Time step of this frame (configuration/step).
- Type:
- dimensions#
Number of dimensions (configuration/dimensions). When not set explicitly, dimensions will default to different values based on the value of \(L_z\) in box. When \(L_z = 0\) dimensions will default to 2, otherwise 3. User set values always take precedence.
- Type:
- property box#
Box dimensions (configuration/box).
[lx, ly, lz, xy, xz, yz].
- Type:
((6, 1) numpy.ndarray of numpy.float32)
- validate()#
Validate all attributes.
Convert every array attribute to a numpy.ndarray of the proper type and check that all attributes have the correct dimensions.
Ignore any attributes that are None.
Warning
Array attributes that are not contiguous numpy arrays will be replaced with contiguous numpy arrays of the appropriate type.
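The default for ConfigurationData.dimensions described above (2 when \(L_z = 0\), otherwise 3, with user set values taking precedence) can be sketched as follows. The function default_dimensions is a hypothetical helper written for illustration; it is not how gsd is necessarily implemented.

```python
def default_dimensions(box, dimensions=None):
    """Pick the number of dimensions for a box given as [lx, ly, lz, xy, xz, yz]."""
    if dimensions is not None:
        return dimensions  # a user set value always takes precedence
    lz = box[2]
    return 2 if lz == 0 else 3


dims_2d = default_dimensions([10.0, 10.0, 0.0, 0.0, 0.0, 0.0])
dims_3d = default_dimensions([10.0, 10.0, 10.0, 0.0, 0.0, 0.0])
dims_user = default_dimensions([10.0, 10.0, 0.0, 0.0, 0.0, 0.0], dimensions=3)
```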
- class gsd.hoomd.ConstraintData#
Bases:
object
Store constraint data.
Use the Frame.constraints attribute to access the constraints.
Instances resulting from file read operations will always store array quantities in numpy.ndarray objects of the defined types. User created frames may provide input data that can be converted to a numpy.ndarray.
See also
hoomd.State for a full description of how HOOMD interprets this data.
- N#
Number of constraints in the frame (constraints/N).
- Type:
- value#
Constraint length (constraints/value).
- Type:
(N, ) numpy.ndarray of numpy.float32
- group#
Tags of the particles in the constraint (constraints/group).
- Type:
(N, 2) numpy.ndarray of numpy.uint32
- validate()#
Validate all attributes.
Convert every array attribute to a numpy.ndarray of the proper type and check that all attributes have the correct dimensions.
Ignore any attributes that are None.
Warning
Array attributes that are not contiguous numpy arrays will be replaced with contiguous numpy arrays of the appropriate type.
- class gsd.hoomd.Frame#
Bases:
object
System state at one point in time.
- configuration#
Configuration data.
- Type:
- particles#
Particles.
- Type:
- constraints#
Distance constraints.
- Type:
- log#
Logged data (values must be numpy.ndarray or array_like)
- Type:
- validate()#
Validate all contained frame data.
- class gsd.hoomd.HOOMDTrajectory(file)#
Bases:
object
Read and write hoomd gsd files.
- Parameters:
file (gsd.fl.GSDFile) – File to access.
Open hoomd GSD files with open.
- __enter__()#
Enter the context manager.
- __exit__(exc_type, exc_value, traceback)#
Close the file when the context manager exits.
- __getitem__(key)#
Index trajectory frames.
The index can be a positive integer, negative integer, or slice and is interpreted the same as list indexing.
Warning
As you loop over frames, each frame is read from the file when it is reached in the iteration. Multiple passes may lead to multiple disk reads if the file does not fit in cache.
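To make "interpreted the same as list indexing" concrete, the sketch below resolves an integer or slice key into frame numbers using Python's own range indexing. The helper resolve_frames is hypothetical and only illustrates the indexing semantics; it does not read any file.

```python
def resolve_frames(key, nframes):
    """Translate a list-style index into frame number(s) for a trajectory."""
    frames = range(nframes)
    if isinstance(key, slice):
        return list(frames[key])  # a slice selects several frames
    return frames[key]            # an integer selects one frame (IndexError if out of range)


last_frame = resolve_frames(-1, 10)                   # negative indices count from the end
every_fifth = resolve_frames(slice(None, None, 5), 10)
first_three = resolve_frames(slice(0, 3), 10)
```

Because each selected frame is read from disk when it is accessed, selecting a sparse slice reads fewer frames than looping over the whole trajectory.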
- __iter__()#
Iterate over frames in the trajectory.
- __len__()#
The number of frames in the trajectory.
- append(frame)#
Append a frame to a hoomd gsd file.
- Parameters:
frame (Frame) – Frame to append.
Write the given frame to the file at the current frame and increase the frame counter. Do not write any fields that are None. For all non-None fields, scan them and see if they match the initial frame or the default value. If the given data differs, write it out to the frame. If it is the same, do not write it out as it can be instantiated either from the value at the initial frame or the default value.
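The selection rule that append describes can be sketched in pure Python. This is a simplified illustration under stated assumptions: frames are plain dictionaries of lists rather than Frame objects with numpy arrays, and the helper chunks_to_write and the sample default for particles/mass are hypothetical, not taken from gsd.

```python
def chunks_to_write(frame_data, initial_frame, defaults):
    """Select the fields that must be stored for this frame."""
    selected = {}
    for name, value in frame_data.items():
        if value is None:
            continue  # fields that are None are never written
        # Compare against the initial frame when it has the field, else the default.
        reference = initial_frame.get(name, defaults.get(name))
        if value != reference:
            selected[name] = value  # the data differs, so it must be written
    return selected


defaults = {'particles/mass': [1.0, 1.0]}  # assumed default, for illustration only
initial = {'particles/position': [[0, 0, 0], [1, 0, 0]]}
frame = {
    'particles/position': [[0, 0, 0], [2, 0, 0]],  # differs from the initial frame
    'particles/mass': [1.0, 1.0],                  # equal to the default
    'particles/charge': None,                      # not set
}
written = chunks_to_write(frame, initial, defaults)
```

Only the position is selected here, which is why trajectories with constant masses, charges, or topology stay small on disk.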
- close()#
Close the file.
- extend(iterable)#
Append each item of the iterable to the file.
- Parameters:
iterable – An iterable object that provides Frame instances. This could be another HOOMDTrajectory, a generator that modifies frames, or a list of frames.
- property file#
The file handle.
- Type:
- flush()#
Flush all buffered frames to the file.
- truncate()#
Remove all frames from the file.
- class gsd.hoomd.ParticleData#
Bases:
object
Store particle data chunks.
Use the Frame.particles attribute to access the particles.
Instances resulting from file read operations will always store array quantities in numpy.ndarray objects of the defined types. User created frames may provide input data that can be converted to a numpy.ndarray.
See also
hoomd.State for a full description of how HOOMD interprets this data.
- N#
Number of particles in the frame (particles/N).
- Type:
- types#
Names of the particle types (particles/types).
- position#
Particle position (particles/position).
- Type:
(N, 3) numpy.ndarray of numpy.float32
- orientation#
Particle orientation (particles/orientation).
- Type:
(N, 4) numpy.ndarray of numpy.float32
- typeid#
Particle type id (particles/typeid).
- Type:
(N, ) numpy.ndarray of numpy.uint32
- mass#
Particle mass (particles/mass).
- Type:
(N, ) numpy.ndarray of numpy.float32
- charge#
Particle charge (particles/charge).
- Type:
(N, ) numpy.ndarray of numpy.float32
- diameter#
Particle diameter (particles/diameter).
- Type:
(N, ) numpy.ndarray of numpy.float32
- body#
Particle body (particles/body).
- Type:
(N, ) numpy.ndarray of numpy.int32
- moment_inertia#
Particle moment of inertia (particles/moment_inertia).
- Type:
(N, 3) numpy.ndarray of numpy.float32
- velocity#
Particle velocity (particles/velocity).
- Type:
(N, 3) numpy.ndarray of numpy.float32
- angmom#
Particle angular momentum (particles/angmom).
- Type:
(N, 4) numpy.ndarray of numpy.float32
- image#
Particle image (particles/image).
- Type:
(N, 3) numpy.ndarray of numpy.int32
- type_shapes#
Shape specifications for visualizing particle types (particles/type_shapes).
- validate()#
Validate all attributes.
Convert every array attribute to a numpy.ndarray of the proper type and check that all attributes have the correct dimensions.
Ignore any attributes that are None.
Warning
Array attributes that are not contiguous numpy arrays will be replaced with contiguous numpy arrays of the appropriate type.
- gsd.hoomd.open(name, mode='r')#
Open a hoomd schema GSD file.
The return value of open can be used as a context manager.
- Parameters:
- Returns:
HOOMDTrajectory instance that accesses the file name with the given mode.
Valid values for mode:
mode    description
'r'     Open an existing file for reading.
'r+'    Open an existing file for reading and writing.
'w'     Open a file for reading and writing. Creates the file if needed, or overwrites an existing file.
'x'     Create a gsd file exclusively and open it for reading and writing. Raise FileExistsError if it already exists.
'a'     Open a file for reading and writing. Creates the file if it doesn’t exist.
- gsd.hoomd.read_log(name, scalar_only=False)#
Read log from a hoomd schema GSD file into a dict of time-series arrays.
- Parameters:
The log data includes configuration/step and all matching log/user_defined, log/bonds/user_defined, and log/particles/user_defined quantities in the file.
- Returns:
Note
read_log issues a RuntimeWarning when there are no matching log/ quantities in the file.
Caution
read_log requires that a logged quantity has the same shape in all frames. Use open and Frame.log to read files where the shape changes from frame to frame.
To create a pandas DataFrame with the logged data:
In [1]: import pandas

In [2]: df = pandas.DataFrame(gsd.hoomd.read_log('log-example.gsd',
   ...:                                          scalar_only=True))

In [3]: df
Out[3]:
   configuration/step  log/value/potential_energy
0                   0                         1.5
1                 100                         2.5
2                 200                         3.5
3                 300                         4.5
4                 400                         5.5
5                 500                         6.5
6                 600                         7.5
7                 700                         8.5
8                 800                         9.5
9                 900                        10.5
gsd.pygsd module#
GSD reader written in pure Python.
pygsd.py is a pure Python implementation of a GSD reader. If your analysis tool is written in Python and you want to embed a GSD reader without requiring C code compilation or requiring the gsd Python package as a dependency, then use the following Python files from the gsd/ directory to make a pure Python reader. It is not as high performance as the C reader.
gsd/
__init__.py
pygsd.py
hoomd.py
The reader reads from file-like Python objects, which may be useful for reading from in-memory buffers and in-database grid files. For regular files on the filesystem, and for writing gsd files, use gsd.fl.
The GSDFile in this module can be used with the gsd.hoomd.HOOMDTrajectory hoomd reader:
>>> with gsd.pygsd.GSDFile(open('test.gsd', 'rb')) as f:
...     t = gsd.hoomd.HOOMDTrajectory(f)
...     pos = t[0].particles.position
- class gsd.pygsd.GSDFile(file)#
GSD file access interface.
Implemented in pure Python and accepts any Python file-like object.
- Parameters:
file – File-like object to read.
GSDFile implements an object oriented class interface to the GSD file layer. Use it to open an existing file in a read-only mode. For read-write access to files, use the full featured C implementation in gsd.fl. Otherwise, the two implementations can be used interchangeably.
Examples
Open a file in read-only mode:
f = GSDFile(open('file.gsd', mode='rb'))
if f.chunk_exists(frame=0, name='chunk'):
    data = f.read_chunk(frame=0, name='chunk')
Access file metadata:
f = GSDFile(open('file.gsd', mode='rb'))
print(f.name, f.mode, f.gsd_version)
print(f.application, f.schema, f.schema_version)
print(f.nframes)
Use as a context manager:
with GSDFile(open('file.gsd', mode='rb')) as f:
    data = f.read_chunk(frame=0, name='chunk')
- __enter__()#
Implement the context manager protocol.
- __exit__(exc_type, exc_value, traceback)#
Implement the context manager protocol.
- __getstate__()#
Implement the pickle protocol.
- __setstate__(state)#
Implement the pickle protocol.
- chunk_exists(frame, name)#
Test if a chunk exists.
- Parameters:
- Returns:
True if the chunk exists in the file. False if it does not.
- Return type:
Example
Handle non-existent chunks:
def read_chunk_or_none():
    # Open the file-like object in binary mode; GSD is a binary format.
    with GSDFile(open('file.gsd', mode='rb')) as f:
        if f.chunk_exists(frame=0, name='chunk'):
            return f.read_chunk(frame=0, name='chunk')
        else:
            return None
- close()#
Close the file.
Once closed, any other operation on the file object will result in a ValueError. close() may be called more than once. The file is automatically closed when garbage collected or when the context manager exits.
- end_frame()#
Not implemented.
- property file#
File-like object opened.
- find_matching_chunk_names(match)#
Find chunk names in the file that start with the string match.
- property gsd_version#
GSD file layer version number.
The tuple is in the order (major, minor).
- read_chunk(frame, name)#
Read a data chunk from the file and return it as a numpy array.
- Parameters:
- Returns:
Data read from file.
- Return type:
Examples
Read a 1D array:
with GSDFile(open(filename, mode='rb')) as f:
    data = f.read_chunk(frame=0, name='chunk1d')
    # data.shape == [N]
Read a 2D array:
with GSDFile(open(filename, mode='rb')) as f:
    data = f.read_chunk(frame=0, name='chunk2d')
    # data.shape == [N,M]
Read multiple frames:
with GSDFile(open(filename, mode='rb')) as f:
    data0 = f.read_chunk(frame=0, name='chunk')
    data1 = f.read_chunk(frame=1, name='chunk')
    data2 = f.read_chunk(frame=2, name='chunk')
    data3 = f.read_chunk(frame=3, name='chunk')
Tip
Each call invokes a disk read and allocation of a new numpy array for storage. To avoid overhead, don’t call read_chunk() on the same chunk repeatedly. Cache the arrays instead.
- property schema_version#
Schema version number.
The tuple is in the order (major, minor).
- truncate()#
Not implemented.
- write_chunk(name, data)#
Not implemented.
gsd.version module#
Define the current version of the gsd package.
Package contents#
The GSD main module.
The main package gsd is the root package. It holds the submodules gsd.fl and gsd.hoomd, but does not import them by default. You must explicitly import these modules before use:
import gsd.fl
import gsd.hoomd
Logging#
All Python modules in GSD use the Python standard library module logging to log events. Use this module to control the verbosity and output destination:
import logging
logging.basicConfig(level=logging.INFO)
Signal handling#
On import, gsd installs a SIGTERM signal handler that calls sys.exit so that open gsd files have a chance to flush write buffers (GSDFile.flush) when a user’s process is terminated. Use signal.signal to adjust this behavior as needed.
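One way to adjust the handler is to restore the default SIGTERM action with the standard library signal module. This is a minimal sketch, intended to run in the main thread after gsd has been imported; the function name restore_default_sigterm is hypothetical.

```python
import signal


def restore_default_sigterm():
    """Reinstate the default SIGTERM action and return the handler that was replaced."""
    return signal.signal(signal.SIGTERM, signal.SIG_DFL)


# Keep the previous handler so that it can be installed again later if needed.
previous_handler = restore_default_sigterm()
```

With the default action restored, a SIGTERM ends the process immediately, so frames still held in write buffers are not flushed. Call flush() at suitable points if that matters for your workflow.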
gsd command line interface#
GSD provides a command line interface for rapid inspection of files.
The GSD command line interface.
To simplify ad hoc usage of gsd, this module provides a command line interface for interacting with GSD files. The primary entry point is a single command for starting a Python interpreter with a GSD file pre-loaded:
$ gsd read trajectory.gsd
The following options are available for the read subcommand:
- -s schema, --schema schema#
The schema of the GSD file. Supported values for schema are “hoomd” and “none”.
- -m mode, --mode mode#
The mode in which to open the file. Valid modes are identical to those accepted by gsd.fl.open().
C API#
The GSD C API consists of a single header and source file. Developers can drop the implementation into any package that needs it.
-
struct gsd_byte_buffer#
- #include <gsd.h>
Byte buffer.
Used to buffer small data chunks held for a buffered write at the end of a frame. Also used to hold the names.
-
struct gsd_handle#
- #include <gsd.h>
File handle.
A handle to an open GSD file.
This handle is obtained when opening a GSD file and is passed into every method that operates on the file.
Warning
All members are read-only to the caller.
Public Members
-
int fd#
File descriptor.
-
struct gsd_header header#
The file header.
-
struct gsd_index_buffer file_index#
Mapped data chunk index.
-
struct gsd_index_buffer frame_index#
Index entries to append to the current frame.
-
struct gsd_index_buffer buffer_index#
Buffered index entries to append to the current frame.
-
struct gsd_byte_buffer write_buffer#
Buffered write data.
-
struct gsd_name_buffer file_names#
List of names stored in the file.
-
struct gsd_name_buffer frame_names#
List of names added in the current frame.
-
enum gsd_open_flag open_flags#
Flags passed to gsd_open() when opening this handle.
-
struct gsd_name_id_map name_map#
Access the names in the namelist.
-
struct gsd_header#
- #include <gsd.h>
GSD file header.
The in-memory and on-disk storage of the GSD file header. Stored in the first 256 bytes of the file.
Warning
All members are read-only to the caller.
Public Members
-
uint32_t schema_version#
Schema version: from gsd_make_version().
-
uint32_t gsd_version#
GSD file format version from gsd_make_version().
-
char application[GSD_NAME_SIZE]#
Name of the application that generated this file.
-
char schema[GSD_NAME_SIZE]#
Name of data schema.
-
char reserved[GSD_RESERVED_BYTES]#
Reserved for future use.
-
struct gsd_index_buffer#
- #include <gsd.h>
Array of index entries.
May point to a mapped location of index entries in the file or an in-memory buffer.
-
struct gsd_index_entry#
- #include <gsd.h>
Index entry.
An index entry for a single chunk of data.
Warning
All members are read-only to the caller.
Public Members
-
struct gsd_name_buffer#
- #include <gsd.h>
Name buffer.
Holds a list of string names in order separated by NULL terminators. In v1 files, each name is 64 bytes. In v2 files, only one NULL terminator is placed between each name.
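The two name layouts described for gsd_name_buffer can be decoded with a few lines of Python. This is a sketch under assumptions that the text above does not state: names are taken to be UTF-8, and an empty name is taken to mark the end of the list. The helpers parse_names_v1 and parse_names_v2 are hypothetical and not part of gsd.

```python
def parse_names_v1(buffer, name_size=64):
    """Decode fixed-width records: each name occupies 64 bytes, padded with NULLs."""
    names = []
    for start in range(0, len(buffer), name_size):
        name = buffer[start:start + name_size].split(b'\x00', 1)[0]
        if not name:
            break  # assumption: an empty record ends the list
        names.append(name.decode('utf-8'))
    return names


def parse_names_v2(buffer):
    """Decode packed names: a single NULL terminator separates consecutive names."""
    names = []
    for raw in buffer.split(b'\x00'):
        if not raw:
            break  # assumption: an empty name ends the list
        names.append(raw.decode('utf-8'))
    return names


v1_buffer = b'chunk1'.ljust(64, b'\x00') + b'chunk2'.ljust(64, b'\x00')
v2_buffer = b'particles/N\x00particles/position\x00\x00'
v1_names = parse_names_v1(v1_buffer)
v2_names = parse_names_v2(v2_buffer)
```

The packed v2 layout is what allows arbitrarily long chunk names, since no name is limited to a fixed-width record.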
-
struct gsd_name_id_map#
- #include <gsd.h>
Name/id hash map.
A hash map of string names to integer identifiers.
Public Members
-
struct gsd_name_id_pair *v#
Name/id mappings.
-
struct gsd_name_id_pair#
- #include <gsd.h>
Name/id mapping.
A string name paired with an ID. Used for storing sorted name/id mappings in a hash map.
Public Members
-
char *name#
Pointer to name (actual name storage is allocated in gsd_handle)
-
struct gsd_name_id_pair *next#
Next name/id pair with the same hash.
- file gsd.h
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
Declare GSD data types and C API.
Enums
-
enum gsd_type#
Identifiers for the gsd data chunk element types.
Values:
-
enumerator GSD_TYPE_UINT8#
Unsigned 8-bit integer.
-
enumerator GSD_TYPE_UINT16#
Unsigned 16-bit integer.
-
enumerator GSD_TYPE_UINT32#
Unsigned 32-bit integer.
-
enumerator GSD_TYPE_UINT64#
Unsigned 64-bit integer.
-
enumerator GSD_TYPE_INT8#
Signed 8-bit integer.
-
enumerator GSD_TYPE_INT16#
Signed 16-bit integer.
-
enumerator GSD_TYPE_INT32#
Signed 32-bit integer.
-
enumerator GSD_TYPE_INT64#
Signed 64-bit integer.
-
enumerator GSD_TYPE_FLOAT#
32-bit floating point number.
-
enumerator GSD_TYPE_DOUBLE#
64-bit floating point number.
-
enum gsd_open_flag#
Flag for GSD file open options.
Values:
-
enumerator GSD_OPEN_READWRITE#
Open for both reading and writing.
-
enumerator GSD_OPEN_READONLY#
Open only for reading.
-
enumerator GSD_OPEN_APPEND#
Open only for writing.
-
enum gsd_error#
Error return values.
Values:
-
enumerator GSD_SUCCESS#
Success.
-
enumerator GSD_ERROR_IO#
IO error. Check errno for details.
-
enumerator GSD_ERROR_INVALID_ARGUMENT#
Invalid argument passed to function.
-
enumerator GSD_ERROR_NOT_A_GSD_FILE#
The file is not a GSD file.
-
enumerator GSD_ERROR_INVALID_GSD_FILE_VERSION#
The GSD file version cannot be read.
-
enumerator GSD_ERROR_FILE_CORRUPT#
The GSD file is corrupt.
-
enumerator GSD_ERROR_MEMORY_ALLOCATION_FAILED#
GSD failed to allocate memory.
-
enumerator GSD_ERROR_NAMELIST_FULL#
The GSD file cannot store any additional unique data chunk names.
-
enumerator GSD_ERROR_FILE_MUST_BE_WRITABLE#
This API call requires that the GSD file be opened with the mode GSD_OPEN_APPEND or GSD_OPEN_READWRITE.
-
enumerator GSD_ERROR_FILE_MUST_BE_READABLE#
This API call requires that the GSD file be opened with the mode GSD_OPEN_READONLY or GSD_OPEN_READWRITE.
Functions
-
uint32_t gsd_make_version(unsigned int major, unsigned int minor)#
Specify a version.
- Parameters:
major – major version
minor – minor version
- Returns:
a packed version number aaaa.bbbb suitable for storing in a gsd file version entry.
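The aaaa.bbbb notation suggests that the major and minor numbers each occupy 16 bits of the 32-bit value. The sketch below packs and unpacks a version under that assumption; make_version and split_version are hypothetical Python helpers written for illustration, not bindings to the C function.

```python
def make_version(major, minor):
    """Pack major and minor into one 32-bit value (assumes 16 bits for each part)."""
    return (major << 16) | minor


def split_version(version):
    """Recover the (major, minor) tuple from a packed version value."""
    return (version >> 16, version & 0xFFFF)


packed = make_version(1, 0)
unpacked = split_version(make_version(2, 1))
```

The Python properties gsd_version and schema_version report versions as (major, minor) tuples, which corresponds to the unpacked form here.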
-
int gsd_create(const char *fname, const char *application, const char *schema, uint32_t schema_version)#
Create a GSD file.
The generated gsd file is not opened. Call gsd_open() to open it for writing.
- Parameters:
fname – File name (UTF-8 encoded).
application – Generating application name (truncated to 63 chars).
schema – Schema name for data to be written in this GSD file (truncated to 63 chars).
schema_version – Version of the schema data to be written (make with gsd_make_version()).
- Post:
Create an empty gsd file in a file of the given name. Overwrite any existing file at that location.
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
-
int gsd_create_and_open(struct gsd_handle *handle, const char *fname, const char *application, const char *schema, uint32_t schema_version, enum gsd_open_flag flags, int exclusive_create)#
Create and open a GSD file.
Open the generated gsd file in handle.
The file descriptor is closed if there is an error opening the file.
- Parameters:
handle – Handle to open.
fname – File name (UTF-8 encoded).
application – Generating application name (truncated to 63 chars).
schema – Schema name for data to be written in this GSD file (truncated to 63 chars).
schema_version – Version of the schema data to be written (make with gsd_make_version()).
flags – Either GSD_OPEN_READWRITE or GSD_OPEN_APPEND.
exclusive_create – Set to non-zero to force exclusive creation of the file.
- Post:
Create an empty gsd file with the given name. Overwrite any existing file at that location.
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
GSD_ERROR_NOT_A_GSD_FILE: Not a GSD file.
GSD_ERROR_INVALID_GSD_FILE_VERSION: Invalid GSD file version.
GSD_ERROR_FILE_CORRUPT: Corrupt file.
GSD_ERROR_MEMORY_ALLOCATION_FAILED: Unable to allocate memory.
-
int gsd_open(struct gsd_handle *handle, const char *fname, enum gsd_open_flag flags)#
Open a GSD file.
The file descriptor is closed if there is an error opening the file.
- Parameters:
handle – Handle to open.
fname – File name to open (UTF-8 encoded).
flags – Either GSD_OPEN_READWRITE, GSD_OPEN_READONLY, or GSD_OPEN_APPEND.
- Pre:
The file name fname is a GSD file.
- Post:
Open a GSD file and populate the handle for use by API calls.
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
GSD_ERROR_NOT_A_GSD_FILE: Not a GSD file.
GSD_ERROR_INVALID_GSD_FILE_VERSION: Invalid GSD file version.
GSD_ERROR_FILE_CORRUPT: Corrupt file.
GSD_ERROR_MEMORY_ALLOCATION_FAILED: Unable to allocate memory.
-
int gsd_truncate(struct gsd_handle *handle)#
Truncate a GSD file.
After truncating, a file will have no frames and no data chunks. The file size will be that of a newly created gsd file. The application, schema, and schema version metadata will be kept. Truncate does not close and reopen the file, so it is suitable for writing restart files on Lustre file systems without any metadata access.
- Parameters:
handle – Open GSD file to truncate.
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
GSD_ERROR_NOT_A_GSD_FILE: Not a GSD file.
GSD_ERROR_INVALID_GSD_FILE_VERSION: Invalid GSD file version.
GSD_ERROR_FILE_CORRUPT: Corrupt file.
GSD_ERROR_MEMORY_ALLOCATION_FAILED: Unable to allocate memory.
-
int gsd_close(struct gsd_handle *handle)#
Close a GSD file.
Warning
Ensure that all gsd_write_chunk() calls are completed with gsd_end_frame() before closing the file.
- Parameters:
handle – GSD file to close.
- Pre:
handle was opened by gsd_open().
- Post:
Writable files: All data and index entries buffered before the previous call to gsd_end_frame() are written to the file (see gsd_flush()).
- Post:
The file is closed.
- Post:
handle is freed and can no longer be used.
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
GSD_ERROR_INVALID_ARGUMENT: handle is NULL.
-
int gsd_end_frame(struct gsd_handle *handle)#
Complete the current frame.
- Parameters:
handle – Handle to an open GSD file
- Pre:
handle was opened by gsd_open().
- Post:
The current frame counter is increased by 1.
- Post:
Flush the write buffer if it has overflowed. See gsd_flush().
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
GSD_ERROR_INVALID_ARGUMENT: handle is NULL.
GSD_ERROR_FILE_MUST_BE_WRITABLE: The file was opened read-only.
GSD_ERROR_MEMORY_ALLOCATION_FAILED: Unable to allocate memory.
-
int gsd_flush(struct gsd_handle *handle)#
Flush the write buffer.
- Parameters:
handle – Handle to an open GSD file
- Pre:
handle was opened by gsd_open().
- Post:
All data buffered by gsd_write_chunk() are present in the file.
- Post:
All index entries buffered by gsd_write_chunk() prior to the last call to gsd_end_frame() are present in the file.
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
GSD_ERROR_INVALID_ARGUMENT: handle is NULL.
GSD_ERROR_FILE_MUST_BE_WRITABLE: The file was opened read-only.
GSD_ERROR_MEMORY_ALLOCATION_FAILED: Unable to allocate memory.
-
int gsd_write_chunk(struct gsd_handle *handle, const char *name, enum gsd_type type, uint64_t N, uint32_t M, uint8_t flags, const void *data)#
Add a data chunk to the current frame.
Note
If the GSD file is version 1.0, the chunk name is truncated to 63 bytes. GSD version 2.0 files support arbitrarily long names.
Note
N == 0 is allowed. When N is 0, data may be NULL.
- Parameters:
handle – Handle to an open GSD file.
name – Name of the data chunk.
type – type ID that identifies the type of data in data.
N – Number of rows in the data.
M – Number of columns in the data.
flags – set to 0, non-zero values reserved for future use.
data – Data buffer.
- Pre:
handle was opened by gsd_open().
- Pre:
name is a unique name for data chunks in the given frame.
- Pre:
data is allocated and contains at least N * M * gsd_sizeof_type(type) bytes.
- Post:
When there is space in the buffer: The given data is present in the write buffer. Otherwise, the data is present at the end of the file.
- Post:
The index is present in the buffer.
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
GSD_ERROR_INVALID_ARGUMENT: handle is NULL, M == 0, type is invalid, or flags != 0.
GSD_ERROR_FILE_MUST_BE_WRITABLE: The file was opened read-only.
GSD_ERROR_NAMELIST_FULL: The file cannot store any additional unique chunk names.
GSD_ERROR_MEMORY_ALLOCATION_FAILED: failed to allocate memory.
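The buffer-size precondition for gsd_write_chunk() is simple arithmetic. The following Python sketch only models that arithmetic and does not call the C library. The TYPE_SIZES table and the required_bytes helper are hypothetical names introduced for illustration, and the string keys are labels rather than GSD type IDs.

```python
# Element sizes in bytes. The keys are illustrative labels, not GSD type IDs.
TYPE_SIZES = {
    "uint8": 1, "uint16": 2, "uint32": 4, "uint64": 8,
    "int8": 1, "int16": 2, "int32": 4, "int64": 8,
    "float": 4, "double": 8,
}


def required_bytes(n_rows, n_columns, type_name):
    """Return N * M * sizeof(type), the minimum size of the data buffer."""
    if n_columns == 0:
        raise ValueError("M must be at least 1")
    return n_rows * n_columns * TYPE_SIZES[type_name]


# An Nx3 float array for 1000 particles needs at least this many bytes.
position_bytes = required_bytes(1000, 3, "float")
```

With N == 0 the result is 0 bytes, which matches the note that data may be NULL in that case.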
-
const struct gsd_index_entry *gsd_find_chunk(struct gsd_handle *handle, uint64_t frame, const char *name)#
Find a chunk in the GSD file.
The found entry contains size and type metadata and can be passed to gsd_read_chunk() to read the data.
Note
gsd_find_chunk() calls gsd_flush() when the file is writable.
- Parameters:
handle – Handle to an open GSD file
frame – Frame to look for chunk
name – Name of the chunk to find
- Pre:
handle was opened by gsd_open() in read or readwrite mode.
- Returns:
A pointer to the found chunk, or NULL if not found.
-
int gsd_read_chunk(struct gsd_handle *handle, void *data, const struct gsd_index_entry *chunk)#
Read a chunk from the GSD file.
Note
gsd_read_chunk() calls gsd_flush() when the file is writable.
- Parameters:
handle – Handle to an open GSD file.
data – Data buffer to read into.
chunk – Chunk to read.
- Pre:
handle was opened in read or readwrite mode.
- Pre:
chunk was found by gsd_find_chunk().
- Pre:
data points to an allocated buffer with at least N * M * gsd_sizeof_type(type) bytes.
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
GSD_ERROR_INVALID_ARGUMENT: handle is NULL, data is NULL, or chunk is NULL.
GSD_ERROR_FILE_MUST_BE_READABLE: The file was opened in append mode.
GSD_ERROR_FILE_CORRUPT: The GSD file is corrupt.
-
uint64_t gsd_get_nframes(struct gsd_handle *handle)#
Get the number of frames in the GSD file.
- Parameters:
handle – Handle to an open GSD file
- Pre:
handle was opened by gsd_open().
- Returns:
The number of frames in the file, or 0 on error.
-
size_t gsd_sizeof_type(enum gsd_type type)#
Query size of a GSD type ID.
- Parameters:
type – Type ID to query.
- Returns:
Size of the given type in bytes, or 0 for an unknown type ID.
-
const char *gsd_find_matching_chunk_name(struct gsd_handle *handle, const char *match, const char *prev)#
Search for chunk names in a gsd file.
To find the first matching chunk name, pass NULL for prev. Pass the previously found string to find the next match after it, and so on. Chunk names match if they begin with the string in match. Chunk names returned by this function may be present in at least one frame.
Note
gsd_find_matching_chunk_name() calls gsd_flush() when the file is writable.
- Parameters:
handle – Handle to an open GSD file.
match – String to match.
prev – Search starting point.
- Pre:
handle was opened by gsd_open()
- Pre:
prev was returned by a previous call to gsd_find_matching_chunk_name()
- Returns:
Pointer to a string, NULL if no more matching chunks are found, or NULL if prev is invalid.
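The prev argument turns gsd_find_matching_chunk_name() into an iterator over matching names. The sketch below models that calling pattern in plain Python on a list of names. It is not the C implementation, and the order in which the real function returns names is an assumption here (list order).

```python
def find_matching_chunk_name(names, match, prev=None):
    """Return the first name after prev that begins with match, else None."""
    if prev is None:
        start = 0
    elif prev in names:
        start = names.index(prev) + 1
    else:
        return None  # an invalid prev ends the search
    for name in names[start:]:
        if name.startswith(match):
            return name
    return None


def all_matching(names, match):
    """Collect every match by feeding each result back in as prev."""
    found = []
    name = find_matching_chunk_name(names, match)
    while name is not None:
        found.append(name)
        name = find_matching_chunk_name(names, match, name)
    return found
```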
-
int gsd_upgrade(struct gsd_handle *handle)#
Upgrade a GSD file to the latest specification.
- Parameters:
handle – Handle to an open GSD file
- Pre:
handle was opened by gsd_open() with a writable mode.
- Pre:
There are no pending data to write to the file in gsd_end_frame()
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_IO: IO error (check errno).
GSD_ERROR_INVALID_ARGUMENT: handle is NULL
GSD_ERROR_FILE_MUST_BE_WRITABLE: The file was opened in read-only mode.
-
uint64_t gsd_get_maximum_write_buffer_size(struct gsd_handle *handle)#
Get the maximum write buffer size.
- Parameters:
handle – Handle to an open GSD file
- Pre:
handle was opened by gsd_open().
- Returns:
The maximum write buffer size in bytes, or 0 on error.
-
int gsd_set_maximum_write_buffer_size(struct gsd_handle *handle, uint64_t size)#
Set the maximum write buffer size.
- Parameters:
handle – Handle to an open GSD file
size – Maximum number of bytes to allocate in the write buffer (must be greater than 0).
- Pre:
handle was opened by gsd_open().
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_INVALID_ARGUMENT: handle is NULL
GSD_ERROR_INVALID_ARGUMENT: size == 0
-
uint64_t gsd_get_index_entries_to_buffer(struct gsd_handle *handle)#
Get the number of index entries to buffer.
- Parameters:
handle – Handle to an open GSD file
- Pre:
handle was opened by gsd_open().
- Returns:
The number of index entries to buffer, or 0 on error.
-
int gsd_set_index_entries_to_buffer(struct gsd_handle *handle, uint64_t number)#
Set the number of index entries to buffer.
Note
GSD may allocate more than this number of entries in the buffer, as needed to store all index entries for the already buffered frames and the current frame.
- Parameters:
handle – Handle to an open GSD file
number – Number of index entries to buffer before automatically flushing in gsd_end_frame() (must be greater than 0).
- Pre:
handle was opened by gsd_open().
- Returns:
GSD_SUCCESS (0) on success. Negative value on failure:
GSD_ERROR_INVALID_ARGUMENT: handle is NULL
GSD_ERROR_INVALID_ARGUMENT: number == 0
-
enum gsd_type#
- dir gsd
-
type uint8_t#
8-bit unsigned integer (defined by C compiler).
-
type uint16_t#
16-bit unsigned integer (defined by C compiler).
-
type uint32_t#
32-bit unsigned integer (defined by C compiler).
-
type uint64_t#
64-bit unsigned integer (defined by C compiler).
-
type int64_t#
64-bit signed integer (defined by C compiler).
-
type size_t#
unsigned integer (defined by C compiler).
Specification#
HOOMD Schema#
HOOMD-blue supports a wide variety of per particle attributes and properties.
Particles, bonds, and types can be dynamically added and removed during
simulation runs. The hoomd schema can handle all of these situations in a
reasonably space efficient and high performance manner. It is also backwards
compatible with previous versions of itself, as we only add new additional data
chunks in new versions and do not change the interpretation of the existing data
chunks. Any newer reader will initialize new data chunks with default values
when they are not present in an older version file.
- Schema name:
hoomd
- Schema version:
1.4
See also
hoomd.State for a full description of how HOOMD interprets this data.
Use-cases#
The GSD schema hoomd provides:
Every frame of GSD output is viable to restart a simulation
Support varying numbers of particles, bonds, etc…
Support varying attributes (type, mass, etc…)
Support orientation, angular momentum, and other fields.
Binary format on disk
High performance file read and write
Support logging computed quantities
Data chunks#
In each frame, the hoomd schema may contain one or more data chunks. The layout
and names of the chunks match those of the binary frame API in HOOMD-blue
itself. Data chunks are organized in categories. These categories have no
meaning in the hoomd schema specification and are simply an organizational
tool. Some file writers may implement options that act on categories (e.g. write
attributes out to every frame, or just frame 0).
Values are well defined for all fields at all frames. When a data chunk is present in frame i, it defines the values for the frame. When it is not present, the data chunk of the same name at frame 0 defines the values for frame i (when N is equal between the frames). If the data chunk is not present in frame 0, or N differs between frames, values are default. Default values allow file sizes to remain small. For example, a simulation with point particles where orientation is always (1,0,0,0) would not write any orientation chunk to the file.
N may be zero. When N is zero, an index entry may be written for a data chunk with no actual data written to the file for that chunk.
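The fallback rule above (frame i, then frame 0 when N matches, then the default) can be sketched as a small lookup. This is an illustrative model rather than the gsd reader itself: frames are plain dictionaries, and resolve_chunk, n_of_frame, and default_row are hypothetical names.

```python
def resolve_chunk(frames, i, name, n_of_frame, default_row):
    """Return the rows of chunk `name` in frame `i`.

    frames      -- list of dicts mapping chunk names to lists of rows
    n_of_frame  -- list holding the value of N in each frame
    default_row -- the schema default for a single row
    """
    if name in frames[i]:
        return frames[i][name]
    if name in frames[0] and n_of_frame[0] == n_of_frame[i]:
        return frames[0][name]
    return [default_row] * n_of_frame[i]


frames = [{"particles/mass": [2.0, 2.0]}, {}, {"particles/mass": [3.0, 3.0, 3.0]}, {}]
n_of_frame = [2, 2, 3, 3]
mass_in_frame_1 = resolve_chunk(frames, 1, "particles/mass", n_of_frame, 1.0)
mass_in_frame_3 = resolve_chunk(frames, 3, "particles/mass", n_of_frame, 1.0)
```

Frame 1 inherits the masses from frame 0 because N is unchanged. Frame 3 has a different N and no chunk of its own, so it receives default values.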
Name | Category | Type | Size | Default | Units
---|---|---|---|---|---
Configuration | | | | |
configuration/step | | uint64 | 1x1 | 0 | number
configuration/dimensions | | uint8 | 1x1 | 3 | number
configuration/box | | float | 6x1 | | varies
Particle data | | | | |
particles/N | attribute | uint32 | 1x1 | 0 | number
particles/types | attribute | int8 | NTxM | [‘A’] | UTF-8
particles/typeid | attribute | uint32 | Nx1 | 0 | number
particles/type_shapes | attribute | int8 | NTxM | | UTF-8
particles/mass | attribute | float | Nx1 | 1.0 | mass
particles/charge | attribute | float | Nx1 | 0.0 | charge
particles/diameter | attribute | float | Nx1 | 1.0 | length
particles/body | attribute | int32 | Nx1 | -1 | number
particles/moment_inertia | attribute | float | Nx3 | 0,0,0 | mass * length^2
particles/position | property | float | Nx3 | 0,0,0 | length
particles/orientation | property | float | Nx4 | 1,0,0,0 | unit quaternion
particles/velocity | momentum | float | Nx3 | 0,0,0 | length/time
particles/angmom | momentum | float | Nx4 | 0,0,0,0 | quaternion
particles/image | momentum | int32 | Nx3 | 0,0,0 | number
Bond data | | | | |
bonds/N | topology | uint32 | 1x1 | 0 | number
bonds/types | topology | int8 | NTxM | | UTF-8
bonds/typeid | topology | uint32 | Nx1 | 0 | number
bonds/group | topology | uint32 | Nx2 | 0,0 | number
Angle data | | | | |
angles/N | topology | uint32 | 1x1 | 0 | number
angles/types | topology | int8 | NTxM | | UTF-8
angles/typeid | topology | uint32 | Nx1 | 0 | number
angles/group | topology | uint32 | Nx3 | 0,0,0 | number
Dihedral data | | | | |
dihedrals/N | topology | uint32 | 1x1 | 0 | number
dihedrals/types | topology | int8 | NTxM | | UTF-8
dihedrals/typeid | topology | uint32 | Nx1 | 0 | number
dihedrals/group | topology | uint32 | Nx4 | 0,0,0,0 | number
Improper data | | | | |
impropers/N | topology | uint32 | 1x1 | 0 | number
impropers/types | topology | int8 | NTxM | | UTF-8
impropers/typeid | topology | uint32 | Nx1 | 0 | number
impropers/group | topology | uint32 | Nx4 | 0,0,0,0 | number
Constraint data | | | | |
constraints/N | topology | uint32 | 1x1 | 0 | number
constraints/value | topology | float | Nx1 | 0 | length
constraints/group | topology | uint32 | Nx2 | 0,0 | number
Special pairs data | | | | |
pairs/N | topology | uint32 | 1x1 | 0 | number
pairs/types | topology | int8 | NTxM | | UTF-8
pairs/typeid | topology | uint32 | Nx1 | 0 | number
pairs/group | topology | uint32 | Nx2 | 0,0 | number
Configuration#
- configuration/step#
- Type:
uint64
- Size:
1x1
- Default:
0
- Units:
number
Simulation time step.
- configuration/dimensions#
- Type:
uint8
- Size:
1x1
- Default:
3
- Units:
number
Number of dimensions in the simulation. Must be 2 or 3.
Note
When using gsd.hoomd.Frame, the object will try to intelligently default to a dimension. When setting a box with \(L_z = 0\), dimensions will default to 2, otherwise 3. Explicit setting of this value by users always takes precedence.
- configuration/box#
- Type:
float
- Size:
6x1
- Default:
[1,1,1,0,0,0]
- Units:
varies
Simulation box. Each array element defines a different box property. See the hoomd documentation for a full description on how these box parameters map to a triclinic geometry.
box[0:3]: \((l_x, l_y, l_z)\) the box length in each direction, in length units
box[3:]: \((xy, xz, yz)\) the tilt factors, dimensionless values
Particle data#
Within a single frame, the number of particles N and NT are fixed for all chunks. N and NT may vary from one frame to the next. All values are stored in hoomd native units.
Attributes#
- particles/N#
- Type:
uint32
- Size:
1x1
- Default:
0
- Units:
number
Define N, the number of particles, for all data chunks particles/*.
- particles/types#
- Type:
int8
- Size:
NTxM
- Default:
[‘A’]
- Units:
UTF-8
Implicitly define NT, the number of particle types, for all data chunks particles/*. M must be large enough to accommodate each type name as a null-terminated UTF-8 character string. Row i of the 2D matrix is the type name for particle type i.
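The NTxM layout for type names can be illustrated with a short round trip. This is a standard-library sketch that assumes a non-empty list of names. encode_types and decode_types are hypothetical helpers, not part of the gsd Python API, and rows are plain lists of signed 8-bit values standing in for an int8 array.

```python
def encode_types(names):
    """Encode names as NT rows of M signed bytes, null-padded."""
    encoded = [name.encode("utf-8") for name in names]
    m = max(len(e) for e in encoded) + 1  # room for the null terminator
    rows = []
    for e in encoded:
        padded = e.ljust(m, b"\0")
        # Map 0..255 byte values onto the int8 range -128..127.
        rows.append([b - 256 if b > 127 else b for b in padded])
    return rows


def decode_types(rows):
    """Decode each row up to its first null byte."""
    names = []
    for row in rows:
        raw = bytes(v & 0xFF for v in row)
        names.append(raw.split(b"\0", 1)[0].decode("utf-8"))
    return names


rows = encode_types(["A", "polymer"])
```

Here M is 8: the longest name has 7 bytes and one more byte holds the terminator.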
- particles/typeid#
- Type:
uint32
- Size:
Nx1
- Default:
0
- Units:
number
Store the type id of each particle. All ids must be less than NT. A particle with a given type id has the type name stored in the corresponding row of particles/types.
- particles/type_shapes#
- Type:
int8
- Size:
NTxM
- Default:
empty
- Units:
UTF-8
Store a per-type shape definition for visualization. A dictionary is stored for each of the NT types, corresponding to a shape for visualization of that type. M must be large enough to accommodate the shape definition as a null-terminated UTF-8 JSON-encoded string. See: Shape Visualization for examples.
- particles/mass#
- Type:
float (32-bit)
- Size:
Nx1
- Default:
1.0
- Units:
mass
Store the mass of each particle.
- particles/charge#
- Type:
float (32-bit)
- Size:
Nx1
- Default:
0.0
- Units:
charge
Store the charge of each particle.
- particles/diameter#
- Type:
float (32-bit)
- Size:
Nx1
- Default:
1.0
- Units:
length
Store the diameter of each particle.
- particles/body#
- Type:
int32
- Size:
Nx1
- Default:
-1
- Units:
number
Store the composite body associated with each particle. The value -1 indicates no body. The body field may be left out of input files, as hoomd will create the needed constituent particles.
- particles/moment_inertia#
- Type:
float (32-bit)
- Size:
Nx3
- Default:
0,0,0
- Units:
mass * length^2
Store the moment_inertia of each particle \((I_{xx}, I_{yy}, I_{zz})\). This inertia tensor is diagonal in the body frame of the particle. The default value is for point particles.
Properties#
- particles/position#
- Type:
float (32-bit)
- Size:
Nx3
- Default:
0,0,0
- Units:
length
Store the position of each particle (x, y, z).
All particles in the simulation are referenced by a tag. The position data chunk (and all other per-particle data chunks) lists particles in tag order. The first particle listed has tag 0, the second has tag 1, …, and the last has tag N-1 where N is the number of particles in the simulation.
All particles must be inside the box:
\(-l_x/2 + (xz-xy \cdot yz) \cdot z + xy \cdot y \le x < l_x/2 + (xz-xy \cdot yz) \cdot z + xy \cdot y\)
\(-l_y/2 + yz \cdot z \le y < l_y/2 + yz \cdot z\)
\(-l_z/2 \le z < l_z/2\)
Where \(l_x\), \(l_y\), \(l_z\), \(xy\), \(xz\), and \(yz\) are the simulation box parameters (configuration/box).
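The three inequalities for particles/position translate directly into a check. The sketch below is a plain-Python transcription that assumes box is given in the order of configuration/box. The function name is_inside_box is hypothetical.

```python
def is_inside_box(position, box):
    """Check the three box inequalities for one particle position."""
    x, y, z = position
    lx, ly, lz, xy, xz, yz = box
    x_shift = (xz - xy * yz) * z + xy * y
    y_shift = yz * z
    return (
        -lx / 2 + x_shift <= x < lx / 2 + x_shift
        and -ly / 2 + y_shift <= y < ly / 2 + y_shift
        and -lz / 2 <= z < lz / 2
    )


cubic = [10, 10, 10, 0, 0, 0]
sheared = [10, 10, 10, 1, 0, 0]  # tilt factor xy = 1
```

The lower bounds are inclusive and the upper bounds exclusive, so a particle at x = -5 is inside the cubic box above while one at x = 5 is not.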
- particles/orientation#
- Type:
float (32-bit)
- Size:
Nx4
- Default:
1,0,0,0
- Units:
unit quaternion
Store the orientation of each particle. In scalar + vector notation, this is \((r, a_x, a_y, a_z)\), where the quaternion is \(q = r + a_xi + a_yj + a_zk\). A unit quaternion has the property: \(\sqrt{r^2 + a_x^2 + a_y^2 + a_z^2} = 1\).
Momenta#
- particles/velocity#
- Type:
float (32-bit)
- Size:
Nx3
- Default:
0,0,0
- Units:
length/time
Store the velocity of each particle \((v_x, v_y, v_z)\).
- particles/angmom#
- Type:
float (32-bit)
- Size:
Nx4
- Default:
0,0,0,0
- Units:
quaternion
Store the angular momentum of each particle as a quaternion. See the HOOMD documentation for information on how to convert to a vector representation.
- particles/image#
- Type:
int32
- Size:
Nx3
- Default:
0,0,0
- Units:
number
Store the number of times each particle has wrapped around the box \((i_x, i_y, i_z)\). In constant volume simulations, the unwrapped position in the particle’s full trajectory is
\(x_u = x + i_x \cdot l_x + xy \cdot i_y \cdot l_y + xz \cdot i_z \cdot l_z\)
\(y_u = y + i_y \cdot l_y + yz \cdot i_z \cdot l_z\)
\(z_u = z + i_z \cdot l_z\)
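The unwrapping equations for particles/image can be applied per particle as follows. This is a minimal sketch under the same constant-volume assumption stated above. unwrap is a hypothetical helper, and box follows the order of configuration/box.

```python
def unwrap(position, image, box):
    """Return the unwrapped position (x_u, y_u, z_u) of one particle."""
    x, y, z = position
    ix, iy, iz = image
    lx, ly, lz, xy, xz, yz = box
    xu = x + ix * lx + xy * iy * ly + xz * iz * lz
    yu = y + iy * ly + yz * iz * lz
    zu = z + iz * lz
    return (xu, yu, zu)


unwrapped = unwrap((1.0, 2.0, 3.0), (1, -1, 2), [10, 10, 10, 0, 0, 0])
```

In a tilted box, crossing a y boundary also shifts x by xy * l_y, which is why the tilt factors appear in the first two equations.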
Topology#
- bonds/N#
- Type:
uint32
- Size:
1x1
- Default:
0
- Units:
number
Define N, the number of bonds, for all data chunks bonds/*.
- bonds/types#
- Type:
int8
- Size:
NTxM
- Default:
empty
- Units:
UTF-8
Implicitly define NT, the number of bond types, for all data chunks bonds/*. M must be large enough to accommodate each type name as a null-terminated UTF-8 character string. Row i of the 2D matrix is the type name for bond type i. By default, there are 0 bond types.
- bonds/typeid#
- Type:
uint32
- Size:
Nx1
- Default:
0
- Units:
number
Store the type id of each bond. All ids must be less than NT. A bond with a given type id has the type name stored in the corresponding row of bonds/types.
- bonds/group#
- Type:
uint32
- Size:
Nx2
- Default:
0,0
- Units:
number
Store the particle tags in each bond.
- angles/N#
- Type:
uint32
- Size:
1x1
- Default:
0
- Units:
number
Define N, the number of angles, for all data chunks angles/*.
- angles/types#
- Type:
int8
- Size:
NTxM
- Default:
empty
- Units:
UTF-8
Implicitly define NT, the number of angle types, for all data chunks angles/*. M must be large enough to accommodate each type name as a null-terminated UTF-8 character string. Row i of the 2D matrix is the type name for angle type i. By default, there are 0 angle types.
- angles/typeid#
- Type:
uint32
- Size:
Nx1
- Default:
0
- Units:
number
Store the type id of each angle. All ids must be less than NT. An angle with a given type id has the type name stored in the corresponding row of angles/types.
- angles/group#
- Type:
uint32
- Size:
Nx3
- Default:
0,0,0
- Units:
number
Store the particle tags in each angle.
- dihedrals/N#
- Type:
uint32
- Size:
1x1
- Default:
0
- Units:
number
Define N, the number of dihedrals, for all data chunks dihedrals/*.
- dihedrals/types#
- Type:
int8
- Size:
NTxM
- Default:
empty
- Units:
UTF-8
Implicitly define NT, the number of dihedral types, for all data chunks dihedrals/*. M must be large enough to accommodate each type name as a null-terminated UTF-8 character string. Row i of the 2D matrix is the type name for dihedral type i. By default, there are 0 dihedral types.
- dihedrals/typeid#
- Type:
uint32
- Size:
Nx1
- Default:
0
- Units:
number
Store the type id of each dihedral. All ids must be less than NT. A dihedral with a given type id has the type name stored in the corresponding row of dihedrals/types.
- dihedrals/group#
- Type:
uint32
- Size:
Nx4
- Default:
0,0,0,0
- Units:
number
Store the particle tags in each dihedral.
- impropers/N#
- Type:
uint32
- Size:
1x1
- Default:
0
- Units:
number
Define N, the number of impropers, for all data chunks impropers/*.
- impropers/types#
- Type:
int8
- Size:
NTxM
- Default:
empty
- Units:
UTF-8
Implicitly define NT, the number of improper types, for all data chunks impropers/*. M must be large enough to accommodate each type name as a null-terminated UTF-8 character string. Row i of the 2D matrix is the type name for improper type i. By default, there are 0 improper types.
- impropers/typeid#
- Type:
uint32
- Size:
Nx1
- Default:
0
- Units:
number
Store the type id of each improper. All ids must be less than NT. An improper with a given type id has the type name stored in the corresponding row of impropers/types.
- impropers/group#
- Type:
uint32
- Size:
Nx4
- Default:
0,0,0,0
- Units:
number
Store the particle tags in each improper.
- constraints/N#
- Type:
uint32
- Size:
1x1
- Default:
0
- Units:
number
Define N, the number of constraints, for all data chunks constraints/*.
- constraints/value#
- Type:
float
- Size:
Nx1
- Default:
0
- Units:
length
Store the distance of each constraint. Each constraint defines a fixed distance between two particles.
- constraints/group#
- Type:
uint32
- Size:
Nx2
- Default:
0,0
- Units:
number
Store the particle tags in each constraint.
- pairs/N#
- Type:
uint32
- Size:
1x1
- Default:
0
- Units:
number
Define N, the number of special pair interactions, for all data chunks pairs/*.
New in version 1.1.
- pairs/types#
- Type:
int8
- Size:
NTxM
- Default:
empty
- Units:
UTF-8
Implicitly define NT, the number of special pair types, for all data chunks pairs/*. M must be large enough to accommodate each type name as a null-terminated UTF-8 character string. Row i of the 2D matrix is the type name for special pair type i. By default, there are 0 special pair types.
New in version 1.1.
- pairs/typeid#
- Type:
uint32
- Size:
Nx1
- Default:
0
- Units:
number
Store the type id of each special pair interaction. All ids must be less than NT. A pair with a given type id has the type name stored in the corresponding row of pairs/types.
New in version 1.1.
- pairs/group#
- Type:
uint32
- Size:
Nx2
- Default:
0,0
- Units:
number
Store the particle tags in each special pair interaction.
New in version 1.1.
Logged data#
Users may store logged data in log/* data chunks. Logged data encompasses
values computed at simulation time that are too expensive or cumbersome to
re-compute in post processing. This specification does not define specific chunk
names or define logged data. Users may select any valid name for logged data
chunks as appropriate for their workflow.
For any named logged data chunk present in any frame of the file: If the chunk is not present in a given frame i != 0, the implementation should provide the quantity as read from frame 0 for that frame. GSD files that include a logged data chunk only in some frames i != 0 and not in frame 0 are invalid.
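The validity rule for logged data can be expressed as a set difference. The sketch below models each frame as a dictionary of chunk names. invalid_log_chunks and read_log are hypothetical helpers, not part of the gsd package.

```python
def invalid_log_chunks(frames):
    """Return log/* names that appear in a later frame but not in frame 0."""
    if not frames:
        return set()
    in_frame_0 = {name for name in frames[0] if name.startswith("log/")}
    later = set()
    for frame in frames[1:]:
        later.update(name for name in frame if name.startswith("log/"))
    return later - in_frame_0


def read_log(frames, i, name):
    """Read a logged quantity, falling back to frame 0 when it is absent."""
    if name in frames[i]:
        return frames[i][name]
    return frames[0][name]


frames = [
    {"log/energy": [1.0]},
    {"log/energy": [0.5], "log/pressure": [2.0]},
    {},
]
problems = invalid_log_chunks(frames)
```

In this example log/pressure makes the file invalid because it is written in frame 1 but never in frame 0, while reading log/energy from frame 2 falls back to the value in frame 0.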
By convention, per-particle and per-bond logged data should have a chunk name
starting with log/particles/ and log/bonds/, respectively. Scalar,
vector, and string values may be stored under a different prefix starting with
log/. This specification may recognize additional conventions in later
versions without invalidating existing files.
Name | Type | Size | Units
---|---|---|---
log/particles/user_defined | n/a | NxM | user-defined
log/bonds/user_defined | n/a | NxM | user-defined
log/user_defined | n/a | NxM | user-defined
- log/particles/user_defined#
- Type:
user-defined
- Size:
NxM
- Units:
user-defined
This chunk is a placeholder for any number of user-defined per-particle quantities. N is the number of particles in this frame. M, the data type, the units, and the chunk name (after the prefix log/particles/) are user-defined.
New in version 1.4.
- log/bonds/user_defined#
- Type:
user-defined
- Size:
NxM
- Units:
user-defined
This chunk is a placeholder for any number of user-defined per-bond quantities. N is the number of bonds in this frame. M, the data type, the units, and the chunk name (after the prefix log/bonds/) are user-defined.
New in version 1.4.
- log/user_defined#
- Type:
user-defined
- Size:
NxM
- Units:
user-defined
This chunk is a placeholder for any number of user-defined quantities. N, M, the data type, the units, and the chunk name (after the prefix log/) are user-defined.
New in version 1.4.
State data#
HOOMD stores auxiliary state information in state/* data chunks. Auxiliary
state encompasses internal state to any integrator, updater, or other class that
is not part of the particle system state but is also not a fixed parameter. For
example, the internal degrees of freedom in an integrator. Auxiliary state is
useful when restarting simulations.
HOOMD only stores state in GSD files when requested explicitly by the user. Only a few of the documented state data chunks will be present in any GSD file and not all state chunks are valid. Thus, state data chunks do not have default values. If a chunk is not present in the file, that state does not have a well-defined value.
Note
HOOMD-blue versions >= 3.0.0 do not write state data.
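Name | Type | Size | Units
---|---|---|---
HPMC integrator state | | |
state/hpmc/integrate/d | double | 1x1 | length
state/hpmc/integrate/a | double | 1x1 | number
state/hpmc/sphere/radius | float | NTx1 | length
state/hpmc/sphere/orientable | uint8 | NTx1 | boolean
state/hpmc/ellipsoid/a | float | NTx1 | length
state/hpmc/ellipsoid/b | float | NTx1 | length
state/hpmc/ellipsoid/c | float | NTx1 | length
state/hpmc/convex_polyhedron/N | uint32 | NTx1 | number
state/hpmc/convex_polyhedron/vertices | float | sum(N)x3 | length
state/hpmc/convex_spheropolyhedron/N | uint32 | NTx1 | number
state/hpmc/convex_spheropolyhedron/vertices | float | sum(N)x3 | length
state/hpmc/convex_spheropolyhedron/sweep_radius | float | NTx1 | length
state/hpmc/convex_polygon/N | uint32 | NTx1 | number
state/hpmc/convex_polygon/vertices | float | sum(N)x2 | length
state/hpmc/convex_spheropolygon/N | uint32 | NTx1 | number
state/hpmc/convex_spheropolygon/vertices | float | sum(N)x2 | length
state/hpmc/convex_spheropolygon/sweep_radius | float | NTx1 | length
state/hpmc/simple_polygon/N | uint32 | NTx1 | number
state/hpmc/simple_polygon/vertices | float | sum(N)x2 | length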
HPMC integrator state#
NT is the number of particle types.
- state/hpmc/integrate/d#
- Type:
double
- Size:
1x1
- Units:
length
d is the maximum trial move displacement.
New in version 1.2.
- state/hpmc/integrate/a#
- Type:
double
- Size:
1x1
- Units:
number
a is the size of the maximum rotation move.
New in version 1.2.
- state/hpmc/sphere/radius#
- Type:
float
- Size:
NTx1
- Units:
length
Sphere radius for each particle type.
New in version 1.2.
- state/hpmc/sphere/orientable#
- Type:
uint8
- Size:
NTx1
- Units:
boolean
Orientable flag for each particle type.
New in version 1.3.
- state/hpmc/ellipsoid/a#
- Type:
float
- Size:
NTx1
- Units:
length
Size of the first ellipsoid semi-axis for each particle type.
New in version 1.2.
- state/hpmc/ellipsoid/b#
- Type:
float
- Size:
NTx1
- Units:
length
Size of the second ellipsoid semi-axis for each particle type.
New in version 1.2.
- state/hpmc/ellipsoid/c#
- Type:
float
- Size:
NTx1
- Units:
length
Size of the third ellipsoid semi-axis for each particle type.
New in version 1.2.
- state/hpmc/convex_polyhedron/N#
- Type:
uint32
- Size:
NTx1
- Units:
number
Number of vertices defined for each type.
New in version 1.2.
- state/hpmc/convex_polyhedron/vertices#
- Type:
float
- Size:
sum(N)x3
- Units:
length
Position of the vertices in the shape for all types. The shape for type 0 is the first N[0] vertices, the shape for type 1 is the next N[1] vertices, and so on…
New in version 1.2.
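The packed vertex layout (all types concatenated, with N giving the count per type) can be unpacked with a running offset. This is an illustrative sketch on plain lists. split_vertices is a hypothetical helper, and the same logic applies to every N / vertices pair of chunks in this section.

```python
def split_vertices(counts, vertices):
    """Split a sum(N)-row vertex list into one list per particle type."""
    if sum(counts) != len(vertices):
        raise ValueError("vertices must contain exactly sum(N) rows")
    shapes = []
    start = 0
    for n in counts:
        shapes.append(vertices[start:start + n])
        start += n
    return shapes


# Two types: a triangle-like set of 3 vertices, then a set of 2 vertices.
counts = [3, 2]
vertices = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [5, 5, 5], [6, 6, 6]]
per_type = split_vertices(counts, vertices)
```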
- state/hpmc/convex_spheropolyhedron/N#
- Type:
uint32
- Size:
NTx1
- Units:
number
Number of vertices defined for each type.
New in version 1.2.
- state/hpmc/convex_spheropolyhedron/vertices#
- Type:
float
- Size:
sum(N)x3
- Units:
length
Position of the vertices in the shape for all types. The shape for type 0 is the first N[0] vertices, the shape for type 1 is the next N[1] vertices, and so on…
New in version 1.2.
- state/hpmc/convex_spheropolyhedron/sweep_radius#
- Type:
float
- Size:
NTx1
- Units:
length
Sweep radius for each type.
New in version 1.2.
- state/hpmc/convex_polygon/N#
- Type:
uint32
- Size:
NTx1
- Units:
number
Number of vertices defined for each type.
New in version 1.2.
- state/hpmc/convex_polygon/vertices#
- Type:
float
- Size:
sum(N)x2
- Units:
length
Position of the vertices in the shape for all types. The shape for type 0 is the first N[0] vertices, the shape for type 1 is the next N[1] vertices, and so on…
New in version 1.2.
- state/hpmc/convex_spheropolygon/N#
- Type:
uint32
- Size:
NTx1
- Units:
number
Number of vertices defined for each type.
New in version 1.2.
- state/hpmc/convex_spheropolygon/vertices#
- Type:
float
- Size:
sum(N)x2
- Units:
length
Position of the vertices in the shape for all types. The shape for type 0 is the first N[0] vertices, the shape for type 1 is the next N[1] vertices, and so on…
New in version 1.2.
- state/hpmc/convex_spheropolygon/sweep_radius#
- Type:
float
- Size:
NTx1
- Units:
length
Sweep radius for each type.
New in version 1.2.
- state/hpmc/simple_polygon/N#
- Type:
uint32
- Size:
NTx1
- Units:
number
Number of vertices defined for each type.
New in version 1.2.
- state/hpmc/simple_polygon/vertices#
- Type:
float
- Size:
sum(N)x2
- Units:
length
Position of the vertices in the shape for all types. The shape for type 0 is the first N[0] vertices, the shape for type 1 is the next N[1] vertices, and so on…
New in version 1.2.
Shape Visualization#
The chunk particles/type_shapes stores information about shapes
corresponding to particle types. Shape definitions are stored for each type as a
UTF-8 encoded JSON string containing key-value pairs. The class of a shape is
defined by the type key. All other keys define properties of that shape.
Keys without a default value are required for a valid shape specification.
Empty (Undefined) Shape#
An empty dictionary can be used for undefined shapes. A visualization application may choose how to interpret this, e.g. by drawing nothing or drawing spheres.
Example:
{}
Spheres#
Type: Sphere
Spheres’ dimensionality (2D circles or 3D spheres) can be inferred from the system box dimensionality.
Key | Description | Type | Size | Default | Units
---|---|---|---|---|---
diameter | Sphere diameter | float | 1x1 | | length
Example:
{
"type": "Sphere",
"diameter": 2.0
}
Ellipsoids#
Type: Ellipsoid
The ellipsoid class has principal axes a, b, c corresponding to its radii in the x, y, and z directions.
Key | Description | Type | Size | Default | Units
---|---|---|---|---|---
a | Radius in x direction | float | 1x1 | | length
b | Radius in y direction | float | 1x1 | | length
c | Radius in z direction | float | 1x1 | | length
Example:
{
"type": "Ellipsoid",
"a": 7.0,
"b": 5.0,
"c": 3.0
}
Polygons#
Type: Polygon
A simple polygon with its vertices specified in a counterclockwise order.
Spheropolygons can be represented using this shape type, through the rounding_radius key.
Key | Description | Type | Size | Default | Units
---|---|---|---|---|---
rounding_radius | Rounding radius | float | 1x1 | 0.0 | length
vertices | Shape vertices | float | Nx2 | | length
Example:
{
"type": "Polygon",
"rounding_radius": 0.1,
"vertices": [[-0.5, -0.5], [0.5, -0.5], [0.5, 0.5]]
}
Convex Polyhedra#
Type: ConvexPolyhedron
A convex polyhedron with vertices specifying the convex hull of the shape.
Spheropolyhedra can be represented using this shape type, through the rounding_radius key.
Key | Description | Type | Size | Default | Units
---|---|---|---|---|---
rounding_radius | Rounding radius | float | 1x1 | 0.0 | length
vertices | Shape vertices | float | Nx3 | | length
Example:
{
"type": "ConvexPolyhedron",
"rounding_radius": 0.1,
"vertices": [[0.5, 0.5, 0.5], [0.5, -0.5, -0.5], [-0.5, 0.5, -0.5], [-0.5, -0.5, 0.5]]
}
General 3D Meshes#
Type: Mesh
A list of lists of indices are used to specify faces. Faces must contain 3 or more vertex indices. The vertex indices must be zero-based. Faces must be defined with a counterclockwise winding order (to produce an “outward” normal).
Key | Description | Type | Size | Default | Units
---|---|---|---|---|---
vertices | Shape vertices | float | Nx3 | | length
indices | Vertex indices | uint32 | | | number
Example:
{
"type": "Mesh",
"vertices": [[0.5, 0.5, 0.5], [0.5, -0.5, -0.5], [-0.5, 0.5, -0.5], [-0.5, -0.5, 0.5]],
"indices": [[0, 1, 2], [0, 3, 1], [0, 2, 3], [1, 3, 2]]
}
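The winding rule can be checked numerically: for a counterclockwise face, the cross product of two edge vectors points away from the interior. The sketch below applies this to the example mesh. It assumes the mean of the vertices lies inside the shape, which holds for convex meshes like this tetrahedron but not for every mesh. Both function names are hypothetical.

```python
def face_normal(vertices, face):
    """Unnormalized normal from the first three vertices of a face."""
    a, b, c = (vertices[i] for i in face[:3])
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def faces_point_outward(vertices, indices):
    """True when every face normal points away from the vertex centroid."""
    n = len(vertices)
    center = [sum(p[k] for p in vertices) / n for k in range(3)]
    for face in indices:
        normal = face_normal(vertices, face)
        a = vertices[face[0]]
        offset = [a[k] - center[k] for k in range(3)]
        if sum(normal[k] * offset[k] for k in range(3)) <= 0:
            return False
    return True


vertices = [[0.5, 0.5, 0.5], [0.5, -0.5, -0.5], [-0.5, 0.5, -0.5], [-0.5, -0.5, 0.5]]
indices = [[0, 1, 2], [0, 3, 1], [0, 2, 3], [1, 3, 2]]
outward = faces_point_outward(vertices, indices)
```

Reversing the order of any face, for example [0, 2, 1] instead of [0, 1, 2], flips its normal and makes the check fail.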
Sphere Unions#
Type: SphereUnion
A collection of spheres, defined by their diameters and centers.
Key | Description | Type | Size | Default | Units
---|---|---|---|---|---
diameters | Sphere diameters | float | Nx1 | | length
centers | Sphere centers | float | Nx3 | | length
Example:
{
"type": "SphereUnion",
"centers": [[0, 0, 1.0], [0, 1.0, 0], [1.0, 0, 0]],
"diameters": [0.5, 0.5, 0.5]
}
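A reader of particles/type_shapes has to apply defaults and reject shapes with missing required keys. The sketch below shows one way to do that with the standard json module. The SHAPE_KEYS table is transcribed from the tables in this section, and parse_shape is a hypothetical helper rather than a gsd function.

```python
import json

# Keys per shape type, from the tables above. None marks a required key.
SHAPE_KEYS = {
    "Sphere": {"diameter": None},
    "Ellipsoid": {"a": None, "b": None, "c": None},
    "Polygon": {"rounding_radius": 0.0, "vertices": None},
    "ConvexPolyhedron": {"rounding_radius": 0.0, "vertices": None},
    "Mesh": {"vertices": None, "indices": None},
    "SphereUnion": {"diameters": None, "centers": None},
}


def parse_shape(text):
    """Parse one JSON shape string, filling in defaults."""
    shape = json.loads(text.rstrip("\0"))  # drop null padding, if any
    if not shape:
        return {}  # empty (undefined) shape
    result = {"type": shape["type"]}
    for key, default in SHAPE_KEYS[shape["type"]].items():
        if key in shape:
            result[key] = shape[key]
        elif default is not None:
            result[key] = default
        else:
            raise ValueError(f"{shape['type']} requires the key {key!r}")
    return result


triangle = parse_shape('{"type": "Polygon", "vertices": [[0, 0], [1, 0], [0, 1]]}')
```

The triangle above omits rounding_radius, so the parsed result carries the default of 0.0.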
File layer#
Version: 2.0
General simulation data (GSD) file layer design and rationale. These use cases and design specifications define the low level GSD file format.
Differences from the 1.0 specification are noted.
Use-cases#
capabilities
efficiently store many frames of data from simulation runs
high performance file read and write
support arbitrary chunks of data in each frame (position, orientation, type, etc…)
variable number of named chunks in each frame
variable size of chunks in each frame
each chunk identifies data type
common use cases: NxM arrays in double, float, int, char types.
generic use case: binary blob of N bytes
can be integrated into other tools
append frames to an existing file with a monotonically increasing frame number
resilient to job kills
queries
number of frames
is named chunk present in frame i
type and size of named chunk in frame i
read data for named chunk in frame i
read only a portion of a chunk
list chunk names in the file
writes
write data to named chunk in the current frame
end frame and commit to disk
These capabilities enable a simple and rich higher level schema for storing particle and other types of data. The schema determines which named chunks exist in a given file and what they mean.
Non use-cases#
These capabilities are use-cases that GSD does not support, by design.
1. Modify data in the file: GSD is designed to capture simulation data.
2. Add chunks to frames in the middle of a file: See (1).
3. Transparent conversion between float and double: Callers must take care of this.
4. Transparent compression: this gets in the way of parallel I/O. Disk space is cheap.
Dependencies#
The file layer is implemented in C (not C++) with no dependencies to enable
trivial installation and incorporation into existing projects. A single header
and C file completely implement the entire file layer. Python based projects
that need only read access can use gsd.pygsd, a pure Python gsd reader
implementation.
A Python interface to the file layer allows reference implementations and convenience methods for schemas. Most non-technical users of GSD will probably use these reference implementations directly in their scripts.
The low level C library is wrapped with cython. A Python pyproject.toml file will
provide simple installation on as many systems as possible. Cython c++ output is
checked in to the repository so users do not even need cython as a dependency.
Specifications#
Support:
Files as large as the underlying filesystem allows (up to 64-bit address limits)
Data chunk names of arbitrary length (v1.0 limits chunk names to 63 bytes)
Reference up to 65535 different chunk names within a file
Application and schema names up to 63 characters
Store as many frames as can fit in a file up to file size limits
Data chunks up to (64-bit) x (32-bit) elements
Name indices are limited to 16 bits and column indices to 32 bits to keep each index entry as small as possible and avoid wasting space in the file index. The primary use cases in mind for column indices are Nx3 and Nx4 arrays for position and quaternion values. Schemas that wish to store larger, truly n-dimensional arrays can store their dimensionality in metadata in another chunk and store the data as an Nx1 index entry, or use a file format more suited to N-dimensional arrays such as HDF5.
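The Nx1 workaround described above can be sketched as follows. This is only an illustration of the idea, not part of the GSD API: the helper names `flatten_for_storage` and `unravel` are hypothetical, and keeping the shape in a companion chunk is one possible schema choice among many.

```python
from math import prod


def flatten_for_storage(shape, values):
    """Return (shape_column, data_column) for an n-dimensional array.

    Both results are flat lists, so each one fits in an Nx1 chunk.
    """
    values = list(values)
    if len(values) != prod(shape):
        raise ValueError("values do not match the requested shape")
    return list(shape), values


def unravel(shape, flat_index):
    """Convert a row number in the Nx1 chunk to C-ordered n-d indices."""
    coords = []
    for extent in reversed(shape):
        flat_index, remainder = divmod(flat_index, extent)
        coords.append(remainder)
    return tuple(reversed(coords))


# A 2x3x4 array stored as a 24x1 data column plus a 3x1 shape column.
shape = (2, 3, 4)
shape_column, data_column = flatten_for_storage(shape, range(prod(shape)))
```

A reader would load the shape column first and then use it to interpret rows of the data column.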
File format#
There are four types of data blocks in a GSD file.
Header block
Overall header for the entire file. It contains the magic cookie, a format version, the name of the generating application, the schema name, and its version. Some bytes in the header are reserved for future use. Header size: 256 bytes. The header block also includes a pointer to the index, the number of allocated entries in the index, a pointer to the name list, and the size of the name list block.
The header is the first 256 bytes in the file.
Index block
Indexes the frame data: size information, location, name id, etc…
The index contains space for any number of index_entry structs.
The first index entry in the list with a location of 0 marks the end of the list.
When the index fills up, a new index block is allocated at the end of the file with more space and all current index entries are rewritten there.
Index entry size: 32 bytes
Name list
List of string names used by index entries.
v1.0 files: Each name is a 64-byte character string.
v2.0 files: Names may have any length and are separated by 0 terminators.
The first name that starts with the 0 byte marks the end of the list.
The header stores the total size of the name list block.
Data chunk
Raw binary data stored for the named frame data blocks.
Header, index, and name blocks are stored in memory as C structs (or arrays of C structs) and written to disk in whole chunks.
Header block#
This is the header block:
struct gsd_header
{
    uint64_t magic;
    uint64_t index_location;
    uint64_t index_allocated_entries;
    uint64_t namelist_location;
    uint64_t namelist_allocated_entries;
    uint32_t schema_version;
    uint32_t gsd_version;
    char application[64];
    char schema[64];
    char reserved[80];
};
magic is the magic number identifying this as a GSD file (0x65DF65DF65DF65DF).
gsd_version is the version number of the gsd file layer (0xaaaabbbb => aaaa.bbbb).
application is the name of the generating application.
schema is the name of the schema for data in this gsd file.
schema_version is the version of the schema (0xaaaabbbb => aaaa.bbbb).
index_location is the file location of the index block.
index_allocated_entries is the number of entries allocated in the index block.
namelist_location is the file location of the namelist block.
namelist_allocated_entries is the number of 64-byte segments available in the namelist block.
reserved are bytes saved for future use.
This structure is ordered so that all known compilers at the time of writing produced a tightly packed 256-byte header. Some compilers may require non-standard packing attributes or pragmas to enforce this.
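As a rough illustration of this layout, the header can be unpacked with Python's standard `struct` module. The sketch assumes the file was written on a little-endian machine, and `parse_header` and `decode_version` are hypothetical helper names, not part of the gsd package. The values in the synthetic header are made up.

```python
import struct

# Field-by-field mirror of struct gsd_header; "<" assumes little-endian data
# and disables padding, so the size must come out to exactly 256 bytes.
HEADER = struct.Struct("<QQQQQII64s64s80s")
GSD_MAGIC = 0x65DF65DF65DF65DF


def decode_version(value):
    """Split 0xaaaabbbb into the pair (aaaa, bbbb)."""
    return value >> 16, value & 0xFFFF


def parse_header(raw):
    """Unpack the first 256 bytes of a file into a dict of header fields."""
    (magic, index_location, index_allocated_entries, namelist_location,
     namelist_allocated_entries, schema_version, gsd_version, application,
     schema, _reserved) = HEADER.unpack(raw[:HEADER.size])
    if magic != GSD_MAGIC:
        raise ValueError("bad magic number: not a GSD file")
    return {
        "index_location": index_location,
        "index_allocated_entries": index_allocated_entries,
        "namelist_location": namelist_location,
        "namelist_allocated_entries": namelist_allocated_entries,
        "schema_version": decode_version(schema_version),
        "gsd_version": decode_version(gsd_version),
        # Fixed-width char arrays hold 0-terminated strings.
        "application": application.split(b"\0", 1)[0].decode("utf-8"),
        "schema": schema.split(b"\0", 1)[0].decode("utf-8"),
    }


# Round trip on a synthetic header (illustrative values only).
raw = HEADER.pack(GSD_MAGIC, 256, 128, 4352, 64,
                  (1 << 16) | 4, (2 << 16) | 0, b"my-app", b"hoomd", b"")
header = parse_header(raw)
```

With a real file, `raw` would come from reading the first 256 bytes in binary mode.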
Index block#
An index block is made of a number of line items, each of which stores a pointer to a single data chunk:
struct gsd_index_entry
{
    uint64_t frame;
    uint64_t N;
    int64_t location;
    uint32_t M;
    uint16_t id;
    uint8_t type;
    uint8_t flags;
};
frame is the index of the frame this chunk belongs to.
N and M define the dimensions of the data matrix (NxM in C ordering with M as the fast index).
location is the location of the data chunk in the file.
id is the index of the name of this entry in the namelist.
type is the type of the data (char, int, float, double) indicated by index values.
flags is reserved for future use.
Many gsd_index_entry structs are combined into one index block. They are
stored densely packed and in the same order as the corresponding data chunks are
written to the file.
This structure is ordered so that all known compilers at the time of writing produced a tightly packed 32-byte entry. Some compilers may require non-standard packing attributes or pragmas to enforce this.
In v1.0 files, the frame index must monotonically increase from one index entry to the next. The GSD API ensures this.
In v2.0 files, the entire index block is stored sorted first by frame, then by id.
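The index rules above (32-byte entries, a location of 0 ending the list, and the v2.0 sort order) can be sketched in Python. The code assumes little-endian data. `parse_index` and `is_sorted_v2` are hypothetical helpers, and the entries in the sample block are synthetic, with a placeholder value for `type`.

```python
import struct
from collections import namedtuple

# Field-by-field mirror of struct gsd_index_entry (assumed little-endian).
IndexEntry = namedtuple("IndexEntry", "frame N location M id type flags")
ENTRY = struct.Struct("<QQqIHBB")


def parse_index(block):
    """Return the used entries, stopping at the first location of 0."""
    entries = []
    for offset in range(0, len(block) - ENTRY.size + 1, ENTRY.size):
        entry = IndexEntry._make(ENTRY.unpack_from(block, offset))
        if entry.location == 0:  # marks the end of the list
            break
        entries.append(entry)
    return entries


def is_sorted_v2(entries):
    """Check the v2.0 ordering: sorted first by frame, then by id."""
    keys = [(e.frame, e.id) for e in entries]
    return keys == sorted(keys)


# Synthetic block: three used entries followed by one unused, zeroed slot.
block = b"".join([
    ENTRY.pack(0, 10, 256, 3, 0, 9, 0),
    ENTRY.pack(0, 10, 376, 4, 1, 9, 0),
    ENTRY.pack(1, 10, 536, 3, 0, 9, 0),
    bytes(ENTRY.size),
])
entries = parse_index(block)
```

Because a v2.0 index is sorted by `(frame, id)`, a reader could locate a chunk with a binary search rather than a linear scan. That is a design observation, not a description of how the C library is implemented.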
Namelist block#
In v2.0 files, the namelist block stores a list of strings separated by 0 terminators.
In v1.0 files, the namelist block stores a list of 0-terminated strings in 64-byte segments.
The first string that starts with 0 marks the end of the list.
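A minimal sketch of both namelist layouts follows. The function names are hypothetical, and the chunk names in the sample data are placeholders chosen for illustration.

```python
def parse_namelist_v1(block):
    """v1.0: 0-terminated strings stored in fixed 64-byte segments."""
    names = []
    for offset in range(0, len(block), 64):
        segment = block[offset:offset + 64]
        if segment[0] == 0:  # a string starting with 0 ends the list
            break
        names.append(segment.split(b"\0", 1)[0].decode("utf-8"))
    return names


def parse_namelist_v2(block):
    """v2.0: variable-length strings separated by 0 terminators."""
    names = []
    for raw in block.split(b"\0"):
        if not raw:  # an empty string starts with 0 and ends the list
            break
        names.append(raw.decode("utf-8"))
    return names


# Synthetic namelist blocks holding the same two names in each layout.
v1_block = (b"particles/position".ljust(64, b"\0")
            + b"particles/N".ljust(64, b"\0")
            + bytes(64))
v2_block = b"particles/position\0particles/N\0" + bytes(32)

names_v1 = parse_namelist_v1(v1_block)
names_v2 = parse_namelist_v2(v2_block)
```

The `id` field of an index entry is then a position in the resulting list.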
Data block#
A data block stores raw data bytes on the disk. For a given index entry entry,
the data starts at location entry.location and occupies the next
entry.N * entry.M * gsd_sizeof_type(entry.type) bytes.
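The byte-range arithmetic can be sketched as below. The type id to size table reflects my understanding of the `gsd_type` enum in gsd.h and should be treated as an assumption to verify against the header. `chunk_byte_range` is a hypothetical helper, and the in-memory "file" is synthetic.

```python
import io
import struct

# ASSUMPTION: type ids and sizes as understood from gsd.h (1-4 unsigned ints,
# 5-8 signed ints, 9 float, 10 double). Verify against the real header.
TYPE_SIZES = {1: 1, 2: 2, 3: 4, 4: 8, 5: 1, 6: 2, 7: 4, 8: 8, 9: 4, 10: 8}


def chunk_byte_range(location, n, m, type_id):
    """Return (start, stop) file offsets for an NxM chunk of the given type."""
    size = n * m * TYPE_SIZES[type_id]
    return location, location + size


# Synthetic file: 256 bytes of padding, then a 2x3 array of 32-bit floats.
data = io.BytesIO(bytes(256) + struct.pack("<6f", 1, 2, 3, 4, 5, 6))

start, stop = chunk_byte_range(location=256, n=2, m=3, type_id=9)
data.seek(start)
raw = data.read(stop - start)

# C ordering with M as the fast index: each row holds M consecutive values.
rows = [struct.unpack_from("<3f", raw, row * 3 * 4) for row in range(2)]
```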
Contributing#
Contributions are welcomed via pull requests on GitHub. Contact the GSD developers before starting work to ensure it meshes well with the planned development direction and standards set for the project.
Features#
Implement functionality in a general and flexible fashion#
New features should be applicable to a variety of use-cases. The GSD developers can assist you in designing flexible interfaces.
Maintain performance of existing code paths#
Expensive code paths should only execute when requested.
Version control#
Base your work off the correct branch#
Base backwards compatible bug fixes on trunk-patch.
Base additional functionality on trunk-minor.
Base API incompatible changes on trunk-major.
Agree to the Contributor Agreement#
All contributors must agree to the Contributor Agreement before their pull request can be merged.
Set your git identity#
Git identifies every commit you make with your name and e-mail. Set your identity to correctly identify your work and set it identically on all systems and accounts where you make commits.
Source code#
Use a consistent style#
The Code style section of the documentation sets the style guidelines for GSD code.
Document code with comments#
Use doxygen header comments for classes, functions, etc. Also comment complex sections of code so that other developers can understand them.
Compile without warnings#
Your changes should compile without warnings.
Tests#
Write unit tests#
Add unit tests for all new functionality.
Validity tests#
The developer should run research-scale simulations using the new functionality and ensure that it behaves as intended.
User documentation#
Write user documentation#
Document public-facing API with Python docstrings in Google style.
Example notebooks#
Demonstrate new functionality in the documentation examples pages.
Document version status#
Each user-facing Python class, method, etc. with a docstring should have versionadded, versionchanged, and deprecated Sphinx directives.
Add developer to the credits#
Update the credits documentation to name each developer that has contributed to the code.
Propose a change log entry#
Propose a short, concise entry describing the change in the pull request description.
Code style#
All code in GSD must follow a consistent style to ensure readability. We provide configuration files for linters (specified below) so that developers can automatically validate and format files.
These tools are configured for use with pre-commit in .pre-commit-config.yaml. You can
install pre-commit hooks to validate your code. Checks will run on pull requests. Run checks
manually with:
pre-commit run --all-files
Python#
Python code in GSD should follow PEP8 with the formatting performed by yapf (configuration in
setup.cfg). Code should pass all flake8 tests and be formatted by yapf.
Tools#
Autoformatter: yapf.
Linter: flake8.
Documentation#
Python code should be documented with docstrings and added to the Sphinx documentation index in
doc/. Docstrings should follow Google style formatting for use in Napoleon.
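For reference, a Google style docstring looks roughly like the following. `scale_positions` is a made-up function used only to show the section layout (Args, Returns, Raises). It is not part of GSD.

```python
def scale_positions(positions, factor):
    """Scale particle positions by a constant factor.

    Args:
        positions (list[tuple[float, float, float]]): Particle positions.
        factor (float): Multiplier applied to every coordinate.

    Returns:
        list[tuple[float, float, float]]: The scaled positions.

    Raises:
        ValueError: When ``factor`` is not positive.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [tuple(factor * x for x in position) for position in positions]


scaled = scale_positions([(1.0, 2.0, 3.0)], 2.0)
```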
C#
Style is set by clang-format=11
Whitesmith’s indentation style.
100 character line width.
Indent only with spaces.
4 spaces per indent level.
See .clang-format for the full clang-format configuration.
Naming conventions:
Functions: lowercase with words separated by underscores function_name.
Structures: lowercase with words separated by underscores struct_name.
Constants: all upper-case with words separated by underscores SOME_CONSTANT.
Tools#
Autoformatter: clang-format.
Linter: clang-tidy
Compile GSD with CMake to see clang-tidy output.
Documentation#
Documentation comments should be in Javadoc format and precede the item they document for
compatibility with Doxygen and most source code editors. Multi-line documentation comment blocks
start with /** and single line ones start with ///.
See gsd.h for an example.
Restructured Text/Markdown files#
80 character line width.
Use spaces to indent.
Indentation levels are set by the respective formats.
Other file types#
Use your best judgment and follow existing patterns when styling CMake and other file types. The following general guidelines apply:
100 character line width.
4 spaces per indent level.
Editor configuration#
Visual Studio Code users: Open the provided workspace file (gsd.code-workspace),
which provides configuration settings for these style guidelines.
Credits#
The following people contributed to GSD.
Joshua A. Anderson, University of Michigan
Carl Simon Adorf, University of Michigan
Bradley Dice, University of Michigan
Jenny W. Fothergill, Boise State University
Jens Glaser, University of Michigan
Vyas Ramasubramani, University of Michigan
Luis Y. Rivera-Rivera, University of Michigan
Brandon Butler, University of Michigan
Arthur Zamarin, Gentoo Linux
Alexander Stukowski, OVITO GmbH
Charlotte Shiqi Zhao, University of Michigan
Tim Moore, University of Michigan
License#
GSD is available under the following license.
Copyright (c) 2016-2023 The Regents of the University of Michigan
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.