Data Matrix Uploads
A data matrix is a project-specific file used to submit data for a workshop component. It is an Excel spreadsheet whose columns contain project-specific data and may refer to other uploaded raw-data files. Because of these references, upload the raw-data files first and the accompanying data matrix last.
What if my data matrix is incomplete or invalid?
If data is divided in separate experiments, it is possible to submit multiple data matrices from the same user or from the same lab. The multiple data matrices will be combined as data is analyzed.
It is also acceptable to submit a data matrix if the supporting files (HML / Genotyping or Antibody files) are not yet available when the data matrix is created. You can fix the data matrix later (using the [EDIT] button on the upload page) and re-upload it. Editing and re-uploading replaces the old data matrix with the new one.
Incomplete or “invalid” data sets are probably still usable. Specific validation issues can be fixed, but users are encouraged to upload data even if there are validation issues. The validation feedback is intended to be helpful; it does not indicate that the data is rejected.
Here is how to create and upload a data matrix, using the Immunogenic Epitopes project as an example.
Prepare a spreadsheet for your data based on the data matrix templates.
Templates can be downloaded from the links above; each template shows the data columns that are expected. In the Immunogenic Epitopes project, each row represents one donor/recipient pair, and dropdown boxes are provided for many columns to clarify what data is expected. There is also an “Instructions” tab, which briefly describes how the data matrix is intended to be used and what data can be submitted.
Generate HLA typing and antibody files for Donor / Recipient
HLA genotyping can be submitted in a few ways:
– A formatted GL String
– A UNIQUE HML v1.0.1 file which contains the sample HLA genotypes.
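To illustrate the first option: in the GL String grammar, `^` separates loci and `+` separates the two gene copies at a locus. The grammar also has `/`, `|` and `~` for ambiguity and phase, which this sketch does not handle. The function name and the sample typing below are made up for illustration and are not part of the website.

```python
def split_gl_string(gl_string):
    """Split a simple GL String into a {locus: [alleles]} dictionary.

    Only the '^' (locus) and '+' (gene copy) delimiters are handled.
    """
    typing = {}
    for locus_block in gl_string.strip().split("^"):
        alleles = locus_block.split("+")
        # The locus name is the part before '*', e.g. "HLA-A" in "HLA-A*02:01".
        locus = alleles[0].split("*")[0]
        typing[locus] = alleles
    return typing


# Hypothetical typing, used only to show the structure.
example = "HLA-A*02:01+HLA-A*03:01^HLA-B*07:02+HLA-B*08:01"
print(split_gl_string(example))
```

Splitting a GL String this way before pasting it into the data matrix is a quick way to spot a missing delimiter or a missing locus.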
Antibody files can be reported as CSV files, which are exported from the OneLambda HLA Fusion or Immucor MatchIt software.
Upload the HML and Antibody CSV files to data.ihiws.org. You need an account to upload files; read more about registering an account.
You have to upload one file type at a time.
On the upload files page, choose the appropriate Project (in this example “Definition of Immunogenic Epitopes”) and File Type (ex. Antibody CSV or HML), click choose files and select all the files to be uploaded, and click “Save”.
These files are now individually converted and/or validated:
– HAML Files are created from the Antibody CSV files
– HML files are validated against the XML Schema
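Schema validation is done by the website after upload. Before uploading, one cheap local check is whether a file is well-formed XML at all. The sketch below uses only the Python standard library and does not validate against the HML schema. The function name and the tiny XML snippets are hypothetical stand-ins, not real HML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, root_tag) if the text parses as XML, else (False, error message)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, root.tag


# Hypothetical snippets: the second one has an unclosed <sample> element.
good = "<hml><sample id='Donor12345'/></hml>"
bad = "<hml><sample id='Donor12345'></hml>"

print(is_well_formed(good))
print(is_well_formed(bad)[0])
```

A file that fails this check will also fail schema validation, so fixing it locally saves an upload round trip.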
Fill the data matrix spreadsheet with the appropriate identifiers or typings.
If HML or .csv files contain multiple samples, make sure the sample IDs used in these files correspond to the sample IDs entered in the data matrix.
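A cross-check of this kind can be sketched in Python as a set difference. This assumes you have already collected the sample IDs from the data matrix and from the uploaded files into two lists. The function name and the IDs are hypothetical.

```python
def missing_sample_ids(matrix_ids, file_ids):
    """Return data matrix sample IDs that do not occur in the uploaded files."""
    known = {sample_id.strip() for sample_id in file_ids}
    wanted = {sample_id.strip() for sample_id in matrix_ids}
    return sorted(wanted - known)


# Hypothetical IDs: "Donor99999" is in the data matrix but in no uploaded file.
matrix_ids = ["Donor12345", "Recipient22334", "Donor99999"]
file_ids = ["Donor12345", "Recipient22334 "]

print(missing_sample_ids(matrix_ids, file_ids))
```

Stripping whitespace before comparing catches a common spreadsheet problem, where a trailing space makes two identical-looking IDs differ.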
Tip for preparing spreadsheets:
Pay attention to the cell format for HLA data: the cells should be formatted as text. Otherwise, Excel may interpret an HLA typing such as “02:01” as a time value. It might automatically remove a zero (“2:01”) or convert the genotype to a decimal number.
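One way to spot cells that were already converted is to test each value against a pattern for allele fields. The Python sketch below uses a deliberately simplified pattern and covers only bare field values such as “02:01”, not full allele names or GL Strings. The pattern and function name are assumptions for illustration, not the website's validation rules.

```python
import re

# Simplified pattern: two to four colon-separated fields of 2-3 digits,
# optionally followed by one suffix letter (e.g. "01:01:01:02N").
ALLELE_FIELDS = re.compile(r"^\d{2,3}(:\d{2,3}){1,3}[A-Z]?$")


def looks_intact(cell_value):
    """Return True if the value still looks like text-formatted allele fields."""
    return isinstance(cell_value, str) and bool(ALLELE_FIELDS.match(cell_value))


# A text cell, a cell that lost its leading zero, and a cell stored as a time fraction.
for value in ["02:01", "2:01", 0.08402777777777778]:
    print(repr(value), looks_intact(value))
```

Values that fail the check are candidates for re-entry after the column has been set to the text format.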
HLA can be reported as:
1. HML Filename (Must be unique!)
– The filename as it appears on your local computer (ex Donor12345.xml) is sufficient if it is unique.
– A unique filename is generated by the website for each upload; this can also be used.
2. An anonymized unique sample identifier (ex “Donor12345”). This ID should be part of the HML Filename, to ensure that the data matrix cell refers to only a single, unique file.
3. A GL String containing the relevant typing
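The uniqueness requirement in option 2 can be checked locally before filling in the data matrix. The Python sketch below assumes that an identifier matches a file when it occurs anywhere in the filename. That matching rule is an assumption about how the website resolves cells, and the function name and filenames are hypothetical.

```python
def files_matching(sample_id, uploaded_filenames):
    """Return the uploaded filenames that contain the sample identifier."""
    return [name for name in uploaded_filenames if sample_id in name]


# Hypothetical uploads: "Donor12345" is also a prefix of "Donor123456".
uploaded = ["Donor12345.xml", "Donor123456.xml", "Recipient22334.xml"]

for sample_id in ["Donor12345", "Recipient22334"]:
    matches = files_matching(sample_id, uploaded)
    status = "unique" if len(matches) == 1 else "ambiguous or missing"
    print(sample_id, matches, status)
```

In this example “Donor12345” matches two files, so the full filename would be the safer value to enter in that cell.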
Antibody test results can be reported as:
1. A local filename of the .csv extracts from Immucor or HLA Fusion software (ex. “Recipient22334.csv”)
2. A unique filename that is generated for the converted file:
If there are multiple .csv files for a single sample, separate them with a comma in the data matrix. For example: Recipient22334-CL1.csv, Recipient22334-CL2.csv
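A comma-separated cell of this kind can be split back into individual filenames as follows. This is a minimal Python sketch; the function name is hypothetical and the website's own parsing may differ.

```python
def split_filenames(cell_value):
    """Split a comma-separated data matrix cell into a list of clean filenames."""
    return [part.strip() for part in cell_value.split(",") if part.strip()]


cell = "Recipient22334-CL1.csv, Recipient22334-CL2.csv"
print(split_filenames(cell))
```

Comparing the resulting list with the files you actually uploaded is a quick way to catch a typo in one of the names.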
When the data matrix is complete, upload your data matrix file to data.ihiws.org
– Choose the correct Project (in this example “Definition of Immunogenic Epitopes”) and File Type (Project Data Matrix) and click “Save”.
– This file will now be validated automatically. Data matrix files that seem correct show a green [V] icon to indicate “valid”; if issues are detected, the website shows a red [I] icon for “invalid”.
– The website generates a validation report, “validation_report.xlsx”, with feedback on your data matrix. This report is connected to the data matrix you uploaded and can be found by clicking the [+] button.
If the file is not completely valid, any problems with the submitted data are shown in this report. The invalid cells are highlighted red, and the spreadsheet shows “comments” with feedback text.
Fix validation issues and re-upload the data matrix.
– Perhaps the data is in the wrong format, or certain entries do not map to a single unique file. Change the data matrix cells that have problems, and fix and re-upload the associated HML or Antibody CSV files.
– You can correct your original data matrix file, or you may correct the validation_report.xlsx file if that is more convenient.
– On the original upload (not the validation report), choose the [EDIT] button, select your corrected file, and click “Save”. The data matrix is re-validated and the validation report is re-generated.
Direct specific questions to Ben Matern or the project leader(s).