How to Validate MARC Records Before Importing into Koha
Learn how to check MARC records before importing them into Koha, including structure, encoding, Leader, 008, ISBNs, item data, and matching rules.
Before you import records into Koha, it is worth checking the MARC file carefully. A file can look fine at first glance but still contain problems that affect import, searching, duplicate detection, item creation, or OPAC display.
This is especially important during a migration. Once a large batch has been imported into a live catalogue, fixing structural problems can take longer than checking the file before import.
MARCReady helps with this review by checking records for common MARC issues, applying rule-based repairs, and letting you review changes before exporting a cleaner MARC21 or MARCXML file for Koha.
Why validation matters
Koha can import MARC and MARCXML files through its staged import workflow. That workflow is powerful, but it depends on the quality of the incoming records.
If your source file contains malformed records, broken character encoding, incomplete Leader or 008 fields, invalid ISBNs, or poorly mapped item data, Koha may still import some records but produce inconsistent results.
Typical symptoms include:
- records failing during staging;
- titles appearing with strange characters;
- ISBN search not working as expected;
- duplicate records being created;
- item barcodes missing after import;
- branch, item type, or location values not mapping correctly;
- OPAC display problems after migration.
Validation reduces those risks before the data enters Koha.
What to check before import
1. File format
Confirm that the file is in one of the formats that Koha or your preparation tool expects.
For Koha import, the usual output formats are:
| Format | Extension | Use case |
|---|---|---|
| MARC21 binary | .mrc, .marc | Standard MARC import workflow |
| MARCXML | .xml, .marcxml | XML-based MARC import workflow |
If your source data is in CSV, TSV, Excel, JSON, or another structured format, it usually needs mapping before it becomes valid MARC. MARCReady can help map those columns or fields into MARC21 before export.
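A quick way to confirm what a file really contains, regardless of its extension, is to look at its first bytes. The sketch below is illustrative only: `sniff_marc_format` is a hypothetical helper, and it relies on two assumptions, namely that a binary MARC21 record starts with a 24-character Leader whose record length (positions 0-4) and base address (positions 12-16) are digits, and that MARCXML uses `record` or `collection` elements or the `http://www.loc.gov/MARC21/slim` namespace.

```python
def sniff_marc_format(data: bytes) -> str:
    """Guess whether raw bytes look like binary MARC21, MARCXML, or neither."""
    # Skip a UTF-8 byte order mark and leading whitespace, then inspect the start.
    head = data.lstrip(b"\xef\xbb\xbf \t\r\n")[:2048]
    if head.startswith(b"<") and (
        b"http://www.loc.gov/MARC21/slim" in head
        or b"<record" in head
        or b"<collection" in head
    ):
        return "marcxml"
    # Leader/00-04 (record length) and Leader/12-16 (base address) are numeric.
    if len(data) >= 24 and data[:5].isdigit() and data[12:17].isdigit():
        return "marc21"
    return "unknown"
```

A result of `"unknown"` for a file named `.mrc` is a strong hint that the file is really a delimited text export that still needs mapping.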
2. Record structure
A valid MARC record has a strict structure. It includes the Leader, directory, control fields, variable data fields, indicators, and subfields.
Check for:
- records that cannot be parsed;
- missing record terminators;
- broken field delimiters;
- empty fields;
- missing subfield codes;
- duplicated control fields where only one should exist.
A spreadsheet exported from an old system is not automatically a MARC file. It may contain catalogue data, but it must still be mapped into a MARC structure.
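The structural checks above can be sketched in plain Python for a single binary MARC21 record. This is a minimal illustration, not how MARCReady or Koha perform the check: `check_record_structure` is a hypothetical name, and it assumes the standard layout of a 24-byte Leader, 12-byte directory entries, and the 0x1F, 0x1E, and 0x1D delimiter bytes.

```python
FIELD_END = 0x1E   # field terminator
RECORD_END = 0x1D  # record terminator
SUBFIELD = 0x1F    # subfield delimiter


def check_record_structure(record: bytes) -> list[str]:
    """Return a list of structural problems found in one binary MARC record."""
    problems = []
    if len(record) < 25:
        return ["record shorter than a Leader plus terminator"]
    leader = record[:24]
    if not leader[:5].isdigit() or int(leader[:5]) != len(record):
        problems.append("Leader record length does not match actual length")
    if record[-1] != RECORD_END:
        problems.append("missing record terminator")
    if not leader[12:17].isdigit():
        return problems + ["base address of data is not numeric"]
    base = int(leader[12:17])
    directory = record[24:base - 1]
    if (base < 25 or base > len(record)
            or record[base - 1] != FIELD_END or len(directory) % 12):
        return problems + ["directory is malformed"]
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[:3].decode("ascii", "replace")
        if not entry[3:].isdigit():
            problems.append(f"{tag}: non-numeric directory entry")
            continue
        length, start = int(entry[3:7]), int(entry[7:12])
        field = record[base + start:base + start + length]
        if length == 0 or len(field) != length or field[-1] != FIELD_END:
            problems.append(f"{tag}: field does not end with a field terminator")
        elif len(field) == 1:
            problems.append(f"{tag}: empty field")
        elif not tag.startswith("00") and (len(field) < 5 or field[2] != SUBFIELD):
            # Data fields need two indicators, a delimiter, and a subfield code.
            problems.append(f"{tag}: missing indicators or subfield code")
    return problems
```

An empty list means the record passed these particular checks; it does not prove the record is valid in every respect.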
3. Leader and 008
The Leader (24 characters) and field 008 (40 characters in MARC21 bibliographic records) are fixed-length fields that library systems use to interpret the record. They affect record type, material type, language, date handling, and other coded values.
Common problems include:
- missing Leader;
- Leader with the wrong length;
- invalid record type;
- missing 008;
- 008 field too short;
- invalid date, language, or material codes.
Koha may import records with some fixed-field problems, but the catalogue quality may suffer. Search limits, facets, material type handling, and reporting can all be affected.
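A few of these fixed-field problems can be detected with simple length and code checks. The sketch below assumes MARC21 bibliographic records, where the Leader is 24 characters, Leader/06 holds the record type, field 008 is 40 characters, 008/00-05 holds the date entered as YYMMDD, and 008/35-37 holds the language code. `check_fixed_fields` is a hypothetical helper that covers only those positions.

```python
import re
from datetime import datetime
from typing import Optional

# Record type codes defined for Leader/06 in MARC21 bibliographic records.
RECORD_TYPES = set("acdefgijkmoprt")


def check_fixed_fields(leader: Optional[str], field_008: Optional[str]) -> list[str]:
    """Return problems found in the Leader and 008 of one record."""
    problems = []
    if not leader:
        problems.append("missing Leader")
    elif len(leader) != 24:
        problems.append(f"Leader is {len(leader)} characters, expected 24")
    elif leader[6] not in RECORD_TYPES:
        problems.append(f"invalid record type {leader[6]!r} in Leader/06")

    if not field_008:
        problems.append("missing 008")
    elif len(field_008) != 40:
        problems.append(f"008 is {len(field_008)} characters, expected 40")
    else:
        date_entered = field_008[:6]
        try:
            if not date_entered.isdigit():
                raise ValueError(date_entered)
            datetime.strptime(date_entered, "%y%m%d")
        except ValueError:
            problems.append("008/00-05 is not a valid YYMMDD date")
        # Three lowercase letters, or all blanks or fill characters.
        if not re.fullmatch(r"[a-z]{3}|\|{3}| {3}", field_008[35:38]):
            problems.append("008/35-37 is not a plausible language code")
    return problems
```

The language test only checks the shape of the code. Confirming that a value such as `eng` is a real code would need the MARC Code List for Languages.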
4. ISBNs and standard numbers
ISBNs are often used for searching, duplicate detection, and matching rules. Check whether ISBNs are stored correctly in MARC field 020.
Look for:
- ISBNs with spaces or punctuation in the wrong place;
- invalid check digits;
- ISBN qualifiers mixed into the main ISBN value;
- canceled or invalid ISBNs stored as valid ISBNs;
- multiple ISBNs combined into one subfield.
MARCReady can identify common ISBN issues and apply corrections before Koha import.
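To illustrate what an ISBN check involves, the sketch below validates ISBN-10 and ISBN-13 check digits and flags qualifiers or multiple ISBNs in a single 020 $a value. The function names are hypothetical, and this is not a description of how MARCReady implements its own checks.

```python
import re


def isbn_is_valid(raw: str) -> bool:
    """Check the length and check digit of an ISBN-10 or ISBN-13."""
    isbn = re.sub(r"[\s-]", "", raw).upper()
    if re.fullmatch(r"\d{9}[\dX]", isbn):
        # ISBN-10: weights 10 down to 1, X counts as 10, sum divisible by 11.
        total = sum((10 - i) * (10 if ch == "X" else int(ch))
                    for i, ch in enumerate(isbn))
        return total % 11 == 0
    if re.fullmatch(r"\d{13}", isbn):
        # ISBN-13: alternating weights 1 and 3, sum divisible by 10.
        total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(isbn))
        return total % 10 == 0
    return False


def check_020a(value: str) -> list[str]:
    """Return problems found in the content of one 020 $a subfield."""
    tokens = [t for t in re.split(r"[\s;,]+", value.strip()) if t]
    isbns = [t for t in tokens if re.fullmatch(r"[\dXx-]{10,17}", t)]
    extras = [t for t in tokens if t not in isbns]
    problems = []
    if not isbns:
        problems.append("no recognisable ISBN in $a")
    if len(isbns) > 1:
        problems.append("multiple ISBNs in one subfield")
    if extras:
        problems.append(f"qualifier or other text mixed into $a: {' '.join(extras)}")
    for isbn in isbns:
        if not isbn_is_valid(isbn):
            problems.append(f"fails length or check-digit test: {isbn}")
    return problems
```

For example, `check_020a("0306406152 (pbk.)")` reports the qualifier, which in current MARC21 practice belongs in 020 $q rather than $a.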
5. Character encoding
Encoding problems are common in legacy MARC files. They often appear as broken accents, strange symbols, or replacement characters.
Check for:
- MARC-8 and UTF-8 confusion;
- broken accented characters;
- symbols that display correctly in one tool but not another;
- inconsistent encoding across different batches.
Encoding issues are easier to fix before import than after records have been loaded into Koha.
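One way to spot MARC-8 and UTF-8 confusion is to compare what the record declares with what its bytes contain. In MARC21, Leader position 9 is `a` for Unicode and blank for MARC-8. The sketch below is a heuristic, and `check_encoding` is a hypothetical helper: it flags mismatches, replacement characters, and one common mojibake pattern, and it can produce false positives on text that legitimately contains those character pairs.

```python
import re


def check_encoding(record: bytes) -> list[str]:
    """Return likely encoding problems in one binary MARC record."""
    problems = []
    declared_utf8 = record[9:10] == b"a"  # Leader/09 coding scheme
    try:
        text = record.decode("utf-8")
        decodes = True
    except UnicodeDecodeError:
        text, decodes = "", False
    has_high_bytes = any(b > 0x7F for b in record)

    if declared_utf8 and not decodes:
        problems.append("Leader/09 says UTF-8 but the bytes are not valid UTF-8")
    if not declared_utf8 and decodes and has_high_bytes:
        problems.append("Leader/09 says MARC-8 but the bytes look like UTF-8")
    if "\ufffd" in text:
        problems.append("record contains U+FFFD replacement characters")
    # UTF-8 read as Latin-1 and re-saved typically shows up as U+00C2 or
    # U+00C3 followed by a character in the range U+0080 to U+00BF.
    if re.search("[\u00c2\u00c3][\u0080-\u00bf]", text):
        problems.append("possible double-encoded text (mojibake)")
    return problems
```

Running the same check across every batch also helps reveal inconsistent encoding between exports.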
6. Duplicate and repeatable fields
Some MARC fields are repeatable; others are not. Repeated fields are not always errors, but unexpected repetition can signal bad export logic or poor conversion.
Check for:
- repeated control fields;
- duplicate title fields;
- repeated local fields with conflicting data;
- multiple item fields for the same barcode;
- duplicate ISBN or control number fields.
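Repetition checks like these reduce to counting. The sketch below assumes each record has already been parsed into `(tag, value)` pairs; `find_repeats` is a hypothetical helper, and `NON_REPEATABLE` is a deliberately short, illustrative subset of the MARC21 bibliographic fields defined as non-repeatable.

```python
from collections import Counter

# Illustrative subset of non-repeatable MARC21 bibliographic fields.
NON_REPEATABLE = {"001", "003", "005", "008", "010",
                  "100", "110", "111", "130", "240", "245"}


def find_repeats(fields: list[tuple[str, str]]) -> list[str]:
    """Report repeated non-repeatable tags and exact duplicate fields."""
    tag_counts = Counter(tag for tag, _ in fields)
    problems = [
        f"{tag} appears {n} times but is not repeatable"
        for tag, n in sorted(tag_counts.items())
        if n > 1 and tag in NON_REPEATABLE
    ]
    # Repeatable fields are fine unless the same content appears twice.
    pair_counts = Counter(fields)
    problems += [
        f"{tag} repeated with identical content: {value}"
        for (tag, value), n in sorted(pair_counts.items())
        if n > 1 and tag not in NON_REPEATABLE
    ]
    return problems
```

Passing the barcode as the value for each item field makes the same function catch multiple item fields that share a barcode.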
7. Item data
Bibliographic records describe the title. Item data describes the physical or electronic copies owned by the library.
Before import, confirm whether item data is included and where it is stored. Many Koha migrations place item data in a local field such as 952, but the exact field and subfields depend on the migration plan.
Check for:
- barcode;
- home branch;
- holding branch;
- item type;
- shelving location;
- call number;
- copy number;
- price;
- lost, withdrawn, or damaged status.
If item data is missing, Koha may import bibliographic records but not create usable holdings.
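Item checks depend on the migration plan, so the following is only a sketch under stated assumptions. It assumes items sit in field 952 with the subfield codes used by Koha's default MARC21 framework as commonly documented ($p barcode, $a home branch, $b holding branch, $y item type); confirm these against your own framework before relying on them. `check_items`, `known_branches`, and `known_item_types` are hypothetical names, and each item is represented as a dictionary of subfield code to value.

```python
# Assumed 952 subfield codes; verify against your Koha MARC framework.
REQUIRED_952 = {"p": "barcode", "a": "home branch",
                "b": "holding branch", "y": "item type"}


def check_items(items: list[dict[str, str]],
                known_branches: set[str],
                known_item_types: set[str]) -> list[str]:
    """Return problems found in a batch of 952 item fields."""
    problems = []
    seen_barcodes = set()
    for n, item in enumerate(items, start=1):
        for code, label in REQUIRED_952.items():
            if not item.get(code, "").strip():
                problems.append(f"item {n}: missing {label} (952${code})")
        barcode = item.get("p", "").strip()
        if barcode:
            if barcode in seen_barcodes:
                problems.append(f"item {n}: duplicate barcode {barcode}")
            seen_barcodes.add(barcode)
        # Codes must match values already configured in the target Koha.
        for code in ("a", "b"):
            value = item.get(code, "").strip()
            if value and value not in known_branches:
                problems.append(f"item {n}: unknown branch code {value!r} in 952${code}")
        item_type = item.get("y", "").strip()
        if item_type and item_type not in known_item_types:
            problems.append(f"item {n}: unknown item type {item_type!r} in 952$y")
    return problems
```

The branch and item type sets would come from the values configured in the target Koha instance, for example from an export of its library and item type tables.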
A practical validation workflow
Use this process before importing a full catalogue:
- Keep the original export unchanged.
- Upload a small sample to MARCReady.
- Review the issues identified.
- Check field mapping if the source is CSV, Excel, TSV, or JSON.
- Export a small repaired sample.
- Stage the sample in a Koha test environment.
- Review staged record counts, warnings, and matches.
- Import into the test catalogue.
- Check staff display, OPAC display, search, items, and barcodes.
- Adjust mapping or repair settings before processing the full file.
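The first two steps, keeping the original untouched and working from a small sample, can be sketched for a binary MARC21 file as follows. The helper names and file paths are hypothetical. The sketch splits on the 0x1D record terminator and reads the whole file into memory, so it suits modest exports rather than very large ones.

```python
RECORD_END = b"\x1d"  # MARC21 record terminator


def take_sample(data: bytes, count: int) -> bytes:
    """Return the first `count` records of a binary MARC file."""
    # Splitting drops the terminator, so it is re-added to each record.
    # A trailing fragment with no terminator would also receive one.
    records = [r + RECORD_END for r in data.split(RECORD_END) if r.strip()]
    return b"".join(records[:count])


def write_sample(source_path: str, sample_path: str, count: int = 3) -> int:
    """Write a sample to a new file; the original export is only read."""
    with open(source_path, "rb") as src:
        sample = take_sample(src.read(), count)
    with open(sample_path, "wb") as dst:
        dst.write(sample)
    return sample.count(RECORD_END)  # number of records written
```

For example, `write_sample("export.mrc", "sample.mrc", 3)` would leave `export.mrc` unchanged and produce a three-record file for the first test upload.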
When to use MARCReady
MARCReady is useful when:
- you are preparing a migration to Koha;
- you have vendor-supplied MARC records;
- your old system exports messy MARC or spreadsheet data;
- Koha reports staging errors;
- your records contain encoding problems;
- you want to review catalogue quality before import.
The free preview lets you review up to 3 records per upload, and up to 15 records per month, at no cost before you commit to a full catalogue repair.