File naming convention
The following naming convention should be used to name all files associated with the S33 Polytunnel that are being uploaded into HIEv:
S33PT_<PROJECT CODE>_<VARIABLE COLLECTION CODE>_<DATA PROCESSING>_<DATE or DATERANGE>[_<VERSION>].extension
The individual elements that make up the naming convention are described in detail below. First, however, please consider the following points:
- All terms in the filename should be upper case apart from the file ending which should be lower case
- An underscore is used to separate the different ‘elements’ of the filename. An underscore should therefore not be used elsewhere within the filename
- Do not use illegal characters within a filename, e.g. / ? < > : * | " or a space.
- The file naming convention has been constructed to balance three needs: useful detail in the name, file uniqueness, and a filename that is as short as possible.
- Whilst the naming convention should cover most naming scenarios, there may be occasions when your particular data does not lend itself to the convention. In that scenario, it is best to consult with the facility leader/team and/or data manager.
The elements of the filename are as follows:
- S33PT: Official HIE facility code representing data sourced from the S33 Polytunnel facility (all data from the S33 Polytunnel facility should be prefixed with the code ‘S33PT’).
- PROJECT CODE: The S33PT project with which the data file is associated, e.g. use the project code ‘DL’ for the ‘ARC Drought Linkage’ project. If the data is from an automated sensor (i.e. not associated with any specific project), ‘AUTO’ can be used for the project code. The current list of projects associated with the S33 Polytunnel can be found on the ‘S33PT Project list’ page.
- VARIABLE COLLECTION CODE: A term that represents a particular grouping of variables contained within a file. An example could be ‘SOILVARS’ if your data is entirely made up of, or predominantly consists of, soil measurements. You can choose a variable collection code yourself; however, you should check with your group whether any variable collection codes are already in use that you could also use.
- DATA PROCESSING: The level of post-processing applied to the data. The meanings of the different data processing codes used in S33PT filenames are as follows:
- R (Raw Data): Data that has been extracted directly from an instrument and has not been subjected to any data cleaning or post-processing.
- L1 (Level One/Cleansed Data): Data that has been cleaned/gap-filled. Some erroneous data may be included.
- L2 (Level Two/Processed Data): Data that has been rigorously cleaned, processed, and checked for quality control.
- DATE or DATERANGE: The date range that a dataset covers (e.g. for automated time-series data) or the single date on which, for example, a sample was taken. Dates are in YYYYMMDD format, with a hyphen separating the start and end dates of a date range.
- VERSION: An optional version number of the form _Vx, where x is the version number (_V2, _V3, etc.). Use this when a file that already exists within HIEv has been updated or corrected while remaining at the same data processing level.
- extension: The format of the data file, e.g. .csv, .dat (for TOA5 data), etc.
Example scenarios covered by the convention:
- Environmental data from 8th October, 2015, from an automated sensor (not associated with any specific project) installed at S33PT. The data are raw and have not been quality checked.
- A CSV dataset from the S33 Polytunnel containing a variety of soil carbon measurements, collected as part of the ‘Drought Linkage’ project, and measured between 19th March, 2015 and 24th December, 2015. Data have not been thoroughly quality checked, but some checking has been done after data entry.
- An updated version (version 2) of the above dataset that includes edits to erroneous data contained within the original version.
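As an illustration only, the convention can be sketched as a small Python helper that assembles a filename from its elements and checks it against the pattern. This is not an official HIEv tool. The function name `build_filename`, the regular expression, and the restriction of codes to upper-case letters and digits are assumptions made for this sketch. The variable collection code ‘ENVVARS’ and the .dat extension used for the sensor file are hypothetical; ‘SOILVARS’ is the illustrative code mentioned above.

```python
import re
from datetime import date

# Assumption: each code element consists of upper-case letters and digits only.
# The convention itself only requires upper case and no underscores inside an element.
CODE = r"[A-Z0-9]+"

PATTERN = re.compile(
    r"^S33PT"
    rf"_(?P<project>{CODE})"
    rf"_(?P<varcode>{CODE})"
    r"_(?P<level>R|L1|L2)"
    r"_(?P<start>\d{8})(?:-(?P<end>\d{8}))?"   # YYYYMMDD or YYYYMMDD-YYYYMMDD
    r"(?:_V(?P<version>\d+))?"                 # optional version, e.g. _V2
    r"\.(?P<ext>[a-z0-9]+)$"                   # lower-case extension
)


def build_filename(project, varcode, level, start, end=None, version=None, ext="csv"):
    """Assemble a filename from its elements and validate it against PATTERN."""
    if end is not None and end < start:
        raise ValueError("end date is before start date")
    stamp = start.strftime("%Y%m%d")
    if end is not None:
        stamp += "-" + end.strftime("%Y%m%d")
    # All elements are upper case; only the extension is lower case.
    name = f"S33PT_{project.upper()}_{varcode.upper()}_{level.upper()}_{stamp}"
    if version is not None:
        name += f"_V{version}"
    name += "." + ext.lower()
    if PATTERN.match(name) is None:
        raise ValueError(f"not a valid S33PT filename: {name}")
    return name


# Raw data from an automated sensor on 8th October, 2015
# ('ENVVARS' and 'dat' are hypothetical choices).
print(build_filename("AUTO", "ENVVARS", "R", date(2015, 10, 8), ext="dat"))
# S33PT_AUTO_ENVVARS_R_20151008.dat

# Partially checked soil data from the Drought Linkage project.
print(build_filename("DL", "SOILVARS", "L1", date(2015, 3, 19), date(2015, 12, 24)))
# S33PT_DL_SOILVARS_L1_20150319-20151224.csv

# Version 2 of the same dataset, at the same processing level.
print(build_filename("DL", "SOILVARS", "L1", date(2015, 3, 19), date(2015, 12, 24), version=2))
# S33PT_DL_SOILVARS_L1_20150319-20151224_V2.csv
```

Because the underscore is the element separator, a code such as ‘SOIL_VARS’ is rejected by the check. Note that the pattern only checks that dates are eight digits; the helper relies on `date` objects to guarantee they are real calendar dates.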