Methodology
Standards, cautions, and principles for working with historical sources online.
Primary vs secondary sources
A primary source is an original record created at or near the time of the event — a birth certificate, a military service record, a newspaper published the day of an event. A secondary source interprets or summarises primary sources — a published history book, a genealogy database, a Wikipedia article.
Most online genealogy and archival databases are intermediaries, not primary sources. A genealogy platform may hold a digital image of a primary source (valuable) or only a transcribed index of it (useful but requiring verification). Know what type of evidence you are looking at.
Source reliability
Not all online sources are equally reliable. Consider:
- Who created the record? Official government records, church records, and institutional registers tend to be more reliable than user-submitted trees or community-edited databases.
- When was it created? Records created close to the event are generally more reliable than later recollections or compilations.
- What was the purpose? A death certificate created for legal purposes is different from a memorial page created by a family member decades later.
- Has it been transcribed? Every transcription introduces the possibility of error. Where possible, view the original image.
Conflicting evidence
Conflicting evidence is common in historical research. Ages in census records often differ from those in vital records. Name spellings vary. Birth places are recorded inconsistently. Do not simply ignore conflicting evidence or choose the version that fits your existing conclusions.
When sources conflict, record both (or all) versions, note where each comes from, and consider which is likely more reliable and why.
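One way to keep every version side by side is a small record structure. The following is a minimal sketch, not a prescribed format: the class names `Claim` and `Fact`, their fields, and the census and certificate values are all hypothetical illustrations.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """One version of a fact, as stated by one source."""
    value: str      # what the source says, copied as written
    source: str     # where this version comes from
    note: str = ""  # why it may be more or less reliable


@dataclass
class Fact:
    """A single fact (e.g. a birth year) with every version found so far."""
    name: str
    claims: list[Claim] = field(default_factory=list)

    def add(self, value: str, source: str, note: str = "") -> None:
        # Keep every version; never overwrite an earlier one.
        self.claims.append(Claim(value, source, note))

    def in_conflict(self) -> bool:
        # True when the recorded sources disagree on the value.
        return len({c.value for c in self.claims}) > 1


# Hypothetical example values, for illustration only.
birth_year = Fact("birth year")
birth_year.add("1871", "census return", "year implied by stated age")
birth_year.add("1869", "birth certificate", "created close to the event")
```

Here `birth_year.in_conflict()` returns `True`, and both versions remain on record with their sources and reliability notes, so the reasoning behind any later conclusion can be revisited.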
OCR limitations
Optical character recognition (OCR) converts scanned images of text into searchable text. OCR is imperfect, particularly with older typefaces, damaged pages, multi-column layouts, and non-standard spelling, and it performs especially poorly on handwriting. For handwritten records, consider a handwritten text recognition service such as Transkribus.
A name that appears in a newspaper may be completely unsearchable if the OCR failed on that word. When searching fails to find a known result, consider:
- Alternative spellings and phonetic variants
- Common OCR errors for the letters in the name (e.g., "rn" read as "m", "li" read as "h")
- Browsing pages by date rather than relying entirely on keyword search
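The OCR-error suggestion in the list above can be sketched as a small helper that generates alternative search terms. This is an illustrative sketch only: the function name `ocr_variants` and the confusion table are assumptions, and the table holds just the two example pairs mentioned above (in both directions), not a complete list of real OCR errors.

```python
# Illustrative confusion pairs only; real OCR errors depend on the
# typeface and the quality of the scan.
OCR_CONFUSIONS = [("rn", "m"), ("m", "rn"), ("li", "h"), ("h", "li")]


def ocr_variants(name: str) -> set[str]:
    """Return the name plus each variant made by one substitution at one position."""
    variants = {name}
    lowered = name.lower()
    for printed, misread in OCR_CONFUSIONS:
        start = lowered.find(printed)
        while start != -1:
            # Replace this single occurrence; the replacement is lower case,
            # which is harmless for case-insensitive search engines.
            variants.add(name[:start] + misread + name[start + len(printed):])
            start = lowered.find(printed, start + 1)
    return variants
```

For example, `ocr_variants("Thornton")` yields `Thornton`, `Thomton`, and `Tliornton`, each of which can be tried as a separate search term.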
Citation principles
Cite sources at the time you find them. Digital sources can disappear, change, or become inaccessible. A good citation for an online source should include:
- The title or description of the specific record
- The holding institution or platform
- A direct URL where possible
- The date you accessed it
- Any collection, roll, or folio reference given
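The elements listed above can be captured in a simple record at the moment of discovery. This is a minimal sketch under stated assumptions: the class name `OnlineCitation`, its field names, and the output layout are hypothetical and do not follow any particular citation standard, and the example values are placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class OnlineCitation:
    title: str                       # title or description of the specific record
    holder: str                      # holding institution or platform
    accessed: date                   # date the record was viewed
    url: Optional[str] = None        # direct URL, where one exists
    reference: Optional[str] = None  # collection, roll, or folio reference

    def format(self) -> str:
        # Join the available elements; optional ones are skipped when absent.
        parts = [self.title, self.holder]
        if self.reference:
            parts.append(self.reference)
        if self.url:
            parts.append(self.url)
        parts.append(f"accessed {self.accessed.isoformat()}")
        return "; ".join(parts)


# Placeholder values, for illustration only.
example = OnlineCitation(
    title="Baptism register entry (placeholder)",
    holder="Example Archive",
    accessed=date(2024, 5, 1),
    url="https://example.org/record/1",
    reference="Folio 12",
)
print(example.format())
```

Making the access date a required field means a citation cannot be created without it, which reflects the advice to cite at the time of discovery.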
Copyright and reuse
The age of a source does not automatically mean it is free to reuse. Copyright depends on the jurisdiction, the nature of the work, and the rights held by the institution digitising it. Many digitised archives apply their own terms of use on top of the original material.
Before reproducing, publishing, or distributing material found in online archives:
- Check the rights statement on the specific item
- Check the platform's terms of use
- Do not assume that "free to view" means "free to reuse"
Public domain basics
In most countries, works enter the public domain a set period after the author's death, though the term varies by country and by whether and when the work was published. In the European Union, the general rule is the author's life plus 70 years. In the United States, works published more than 95 years ago are generally in the public domain; the cutoff year advances every 1 January (works published before 1930, as of 2025).
Government records and official documents are often in the public domain, but this varies by country and type of record.
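The date arithmetic behind such terms can be sketched as follows. This is a deliberately simplified illustration, not legal guidance: it assumes a 95-year term from publication for older US works and a life-plus-70 term for the EU, with protection running to the end of the calendar year in both cases. The function names are hypothetical, and real cases involve exceptions that this sketch ignores.

```python
def us_publication_cutoff(current_year: int) -> int:
    """Works published before this year are assumed to be public domain in the US.

    Assumes a 95-year term that expires at the end of the calendar year.
    """
    return current_year - 95


def eu_public_domain_year(death_year: int) -> int:
    """First year in which a work is assumed public domain under life plus 70.

    Assumes the term runs to the end of the 70th year after the author's death.
    """
    return death_year + 71
```

Under these assumptions, `us_publication_cutoff(2025)` gives 1930, and a work by an author who died in 1954 would be treated as public domain from 2025. The rights statement on the specific item still takes precedence over any such calculation.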
Research ethics
When researching people — especially recently deceased or living individuals — consider:
- Privacy of living people: information that is publicly available is not necessarily appropriate to republish or share.
- Sensitive records: Holocaust records, health records, and records of persecution require particular care.
- Accuracy before publication: incorrect claims about people can cause harm to families and communities.
Archive access limitations
Not all archival records are online. A large proportion of historical material in national archives, regional archives, and specialist repositories has not been digitised and requires an in-person visit or a written request. A search that returns "no records found" does not mean the record does not exist; it may not have been digitised or indexed, or it may simply not be accessible online.
When online searches are exhausted, consider contacting the relevant archive directly, hiring a local researcher, or visiting in person.