How we validate data¶
When you submit changes to the library directory, an automated system checks your data before it's added to the live dashboard. This ensures that all library records are accurate, complete, and formatted correctly.
What the validation system does¶
The system performs two levels of checking:
- Format check: Verifies that the data is properly structured
- Content check: Confirms that all information is complete and correct
Both checks run automatically and report their results through GitHub, usually within a minute or two.
When validation runs¶
The system checks your data in two situations:
- Automatically: When you submit a pull request that changes the library data
- On demand: When you manually request a validation check from the GitHub Actions tab
How the format check works¶
The first check ensures your data is properly formatted as a JSON file (a structured text format used to store information).
The system verifies:
- All brackets and braces are correctly matched
- Commas separate fields properly
- Text is properly enclosed in quotation marks
- There are no trailing commas or other syntax errors
- The file uses standard UTF-8 encoding
Example: Missing comma
{
"library": "Example"
"nation": "Country"
}
The system finds the missing comma between fields and alerts you to fix it.
Example: Trailing comma
{
"library": "Example",
}
Commas should not appear after the last field.
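The format check can be approximated locally before you submit. The sketch below is only an illustration that uses Python's standard `json` module; the function name `check_format` and the sample byte strings are assumptions made for this example, and the validator that runs on GitHub may be implemented differently.

```python
import json
from typing import Optional


def check_format(raw: bytes) -> Optional[str]:
    """Return None if raw is valid UTF-8 JSON, otherwise a short error description."""
    try:
        text = raw.decode("utf-8")  # Rejects bytes that are not valid UTF-8.
    except UnicodeDecodeError as exc:
        return f"Not valid UTF-8: {exc.reason} at byte {exc.start}"
    try:
        json.loads(text)  # Raises on unmatched brackets, missing or trailing commas, etc.
    except json.JSONDecodeError as exc:
        return f"{exc.msg}: line {exc.lineno} column {exc.colno}"
    return None


# The two faulty examples from this section, plus a corrected version.
missing_comma = b'{\n"library": "Example"\n"nation": "Country"\n}'
trailing_comma = b'{\n"library": "Example",\n}'
valid = b'{\n"library": "Example",\n"nation": "Country"\n}'

for name, sample in [("missing comma", missing_comma),
                     ("trailing comma", trailing_comma),
                     ("valid", valid)]:
    print(name, "->", check_format(sample) or "OK")
```

For the missing comma, the parser reports line 3, which is the line after the field that lacks the comma: it only notices the problem when it reaches the next field. The exact wording of the trailing-comma message can differ between Python versions.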
How the content check works¶
The second check ensures all your data follows the required structure and contains valid information.
The system verifies:
- All required fields are present
  - Every library record must include: ID, name, country, city, website, copyright information, manuscript count, format support, license type, and project information
- Data types are correct
  - Text fields contain text (not numbers)
  - Yes/No fields use true or false (not the words "true" or "false")
  - Number fields contain whole numbers
- Websites are valid
  - Website addresses follow proper format: https://example.com
  - URLs start with http:// or https://
  - Links are reachable and functional
- Relationships are consistent
  - When you indicate a library is part of a larger project, you must provide the project name and website
  - When a library operates independently, project fields must be empty
- Categories use correct values
  - Manuscript counts must be: "Few", "Dozens", "Hundreds", "Thousands", or "Unknown"
  - No other values are accepted
Example: Multiple violations
{
"id": 1,
"library": "Example Library",
"nation": "Country",
"city": "City",
"website": "not-a-valid-url",
"iiif": "false",
"quantity": "Many"
}
Issues the system finds:
- website: Format is invalid (missing protocol)
- iiif: Should be true or false, not the word "false"
- quantity: "Many" is not an allowed value
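To make the content rules concrete, here is a minimal sketch of a per-record check written with the Python standard library only. It is not the project's validator: the function names are hypothetical, and it covers only the fields whose names appear in this document's examples (the field names for copyright, license, and project information are not assumed here). It also does not test whether a link is reachable, since that requires network access.

```python
from urllib.parse import urlparse

# Field names taken from the examples in this document; the real schema may differ.
REQUIRED = ["id", "library", "nation", "city", "website", "iiif", "quantity"]
ALLOWED_QUANTITIES = ("Few", "Dozens", "Hundreds", "Thousands", "Unknown")


def is_http_url(value) -> bool:
    """True if value is a string with an http(s) scheme and a host part."""
    if not isinstance(value, str):
        return False
    try:
        parsed = urlparse(value)
    except ValueError:
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_record(record: dict) -> list:
    """Return a list of human-readable problems found in one record."""
    issues = []
    for field in REQUIRED:
        if field not in record:
            issues.append(f"{field}: required field is missing")
    if "website" in record and not is_http_url(record["website"]):
        issues.append("website: must start with http:// or https://")
    # isinstance(..., bool) rejects the quoted strings "true" and "false".
    if "iiif" in record and not isinstance(record["iiif"], bool):
        issues.append("iiif: must be true or false without quotation marks")
    if "quantity" in record and record["quantity"] not in ALLOWED_QUANTITIES:
        issues.append(f"quantity: {record['quantity']!r} is not an allowed value")
    return issues


# The "Multiple violations" record shown above.
example = {
    "id": 1,
    "library": "Example Library",
    "nation": "Country",
    "city": "City",
    "website": "not-a-valid-url",
    "iiif": "false",
    "quantity": "Many",
}

for issue in check_record(example):
    print(issue)
```

Run against the example record, this sketch reports the same three fields as the list above: website, iiif, and quantity.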
What you see on GitHub¶
After you submit changes, GitHub displays the validation results next to your pull request.
Validation passed¶
You see a green checkmark (✓) and a "Passed" status.
This means:
- Your data format is correct
- All required information is present
- Content meets all standards
- Your pull request can be reviewed and merged
Validation failed¶
You see a red X (✗) and a "Failed" status.
This means:
- Your data has one or more errors
- The system cannot process your changes until you fix the issues
- Click the "Details" link to see what's wrong
Reading error messages¶
When validation fails, GitHub shows you exactly what needs to be fixed.
Format error
Expecting ',' delimiter: line 45 column 3 (char 1234)
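What it means: The parser expected a comma when it reached line 45, column 3. The missing comma usually belongs at the end of the previous field, which may be on line 44.
How to fix: Check line 45 and the line before it in your data, and add the missing comma between the fields.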
Missing field
data/123 must have required property 'iiif'
What it means: Record number 123 is missing the IIIF field.
How to fix: Add "iiif": true or "iiif": false to that record.
Invalid website
data/456/website must match format "uri"
What it means: The website address in record 456 is not formatted correctly.
How to fix: Ensure the URL starts with http:// or https:// and is a working link. Example: "website": "https://example.com/manuscripts"
Incomplete project information
data/789 must match "then" schema
data/789/is_part_of_project_name must NOT have fewer than 1 characters
What it means: Record 789 says it's part of a project but doesn't provide the project name.
How to fix: Either provide both the project name and project website, or set the library as independent ("is_part_of": false).
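Error paths such as data/456/website can be turned into a record lookup. The helper below is a hypothetical sketch: it assumes the data file holds a list of records and that the number in the path is a zero-based position in that list (so data/1 would be the second record). Neither assumption is confirmed by this guide, so check the result against the record you expect.

```python
def locate(error_path: str, records: list) -> tuple:
    """Split a path like 'data/456/website' into (index, field, record)."""
    parts = error_path.strip("/").split("/")
    index = int(parts[1])  # Assumed to be a zero-based position in the list.
    field = parts[2] if len(parts) > 2 else None  # Absent for record-level errors.
    return index, field, records[index]


# Hypothetical two-record data set, used only to demonstrate the lookup.
records = [
    {"id": 1, "library": "First Example", "website": "https://example.com"},
    {"id": 2, "library": "Second Example", "website": "www.example.com"},
]

index, field, record = locate("data/1/website", records)
print(index, field, record["library"])  # 1 website Second Example
```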
Common errors and how to fix them¶
Missing required fields¶
Error message: must have required property
What went wrong: You forgot to include one or more required fields.
Example:
{
"id": 1,
"library": "Example"
}
Missing: country, city, website, copyright information, manuscript count, format support, license type, and project information.
How to fix: Review the Data Structure Guide and add all required fields to your record.
Text written as "true" or "false" instead of true/false¶
Error message: must be boolean
What went wrong: You used quotation marks around true or false, making them text instead of yes/no values.
Example:
{
"iiif": "false"
}
How to fix: Remove quotation marks:
{
"iiif": false
}
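The difference between the two forms is easy to see once the JSON is parsed. This short illustration uses Python's standard `json` module, which is an assumption made for the example and not necessarily what the dashboard uses:

```python
import json

quoted = json.loads('{"iiif": "false"}')
unquoted = json.loads('{"iiif": false}')

print(type(quoted["iiif"]).__name__)    # str
print(type(unquoted["iiif"]).__name__)  # bool

# A non-empty string is truthy, so the quoted "false" would read as "yes"
# to any code that only tests the value's truthiness.
print(bool(quoted["iiif"]), bool(unquoted["iiif"]))  # True False
```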
Website URL missing the protocol¶
Error message: must match format "uri"
What went wrong: Your website address doesn't start with http:// or https://.
Example:
{
"website": "www.example.com"
}
How to fix: Add the protocol:
{
"website": "https://www.example.com"
}
Manuscript count not in the allowed list¶
Error message: must be equal to one of the allowed values
What went wrong: You used a value that's not in the approved list.
Example:
{
"quantity": "Some"
}
Approved values: "Few", "Dozens", "Hundreds", "Thousands", "Unknown"
How to fix: Use one of the five approved values:
{
"quantity": "Dozens"
}
Project marked as true but no project name provided¶
Error message: is_part_of_project_name must NOT have fewer than 1 characters
What went wrong: You indicated the library is part of a project but didn't provide the project's name.
Example:
{
"is_part_of": true,
"is_part_of_project_name": null,
"is_part_of_url": null
}
How to fix: Either provide the project name and website:
{
"is_part_of": true,
"is_part_of_project_name": "Europeana Manuscripts",
"is_part_of_url": "https://www.europeana.eu/"
}
Or mark the library as independent:
{
"is_part_of": false,
"is_part_of_project_name": null,
"is_part_of_url": null
}
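The project rule is conditional: which values are acceptable in the name and URL fields depends on the value of is_part_of. The sketch below shows one way to express that logic in Python. The function name is hypothetical, and treating both null and an empty string as "empty" is an assumption; the actual schema may be stricter.

```python
def check_project_fields(record: dict) -> list:
    """Check that the project fields agree with the is_part_of flag."""
    issues = []
    part_of = record.get("is_part_of")
    name = record.get("is_part_of_project_name")
    url = record.get("is_part_of_url")

    def filled(value) -> bool:
        # A field counts as filled only if it is a non-blank string.
        return isinstance(value, str) and value.strip() != ""

    if part_of is True:
        if not filled(name):
            issues.append("is_part_of_project_name: required when is_part_of is true")
        if not filled(url):
            issues.append("is_part_of_url: required when is_part_of is true")
    elif part_of is False:
        # Assumption: null and "" both count as empty.
        if filled(name) or filled(url):
            issues.append("project fields must be empty when is_part_of is false")
    else:
        issues.append("is_part_of: must be true or false")
    return issues


# The three records from this section: one failing, two passing.
incomplete = {"is_part_of": True, "is_part_of_project_name": None, "is_part_of_url": None}
complete = {
    "is_part_of": True,
    "is_part_of_project_name": "Europeana Manuscripts",
    "is_part_of_url": "https://www.europeana.eu/",
}
independent = {"is_part_of": False, "is_part_of_project_name": None, "is_part_of_url": None}

for label, rec in [("incomplete", incomplete), ("complete", complete), ("independent", independent)]:
    print(label, "->", check_project_fields(rec) or "OK")
```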
Manual validation check¶
To run a validation check on demand:
- Go to your GitHub repository
- Click the Actions tab
- Find Data Guardrails in the workflow list
- Click the Run workflow dropdown
- Select your branch
- Click the green Run workflow button to start the check
The system checks your data and displays results within 1-2 minutes.
When to use manual validation:
- Before submitting a pull request to catch issues early
- After making changes to schema rules
- To verify data on a specific branch without a pull request
Performance¶
Typical validation time: 20-40 seconds
Cost: Free (included with GitHub)
Preventing validation failures¶
Before you submit changes¶
- Review the Data Structure Guide to understand all required fields
- Check the Update the Dashboard Data guide for step-by-step instructions
- Test your website links in a browser before submitting
- Verify spelling of library names, cities, and countries
If validation fails¶
- Read the error message carefully—it tells you exactly what's wrong
- Click the error details in GitHub to see which record and field have the problem
- Compare your data to the examples in the Data Structure Guide
- Fix the issue and resubmit
- Use the manual validation check to verify before creating a new pull request
Still stuck?¶
If you can't figure out the error:
- Review the "Common errors and how to fix them" section above
- Check the Data Structure Guide for field requirements
- Look at other records in the database to see examples of correct formatting
- Open an issue or contact @Dioscorides for help
Related documentation¶
- Data Structure Guide — Understanding the data fields
- Update the Dashboard Data — How to add or edit libraries
- Contributing Guide — Ways to help the project
Last Updated: February 6, 2026