Introduction to Family Research
For much of my adult life I
have had a curiosity about my ancestry.
This curiosity was based mainly upon family anecdotes that I had been
told, or which I had overheard. They
provided enough clues to stimulate my interest but they did not, even in the
remotest sense, constitute an account of my origins, or those of my
family. Mostly, these anecdotes lacked
specific detail on timing, place or social context but it is worth repeating
some of them so as to give some understanding of the basis for my curiosity,
prior to undertaking serious family research.
I knew that both my
grandfathers, Frederick Fox and Thomas Harriman, had been farmers but had ended
their working lives as farm labourers. Frederick was reputed to
have lost his farm due to gambling and Thomas’ fall from grace was said to be
due to alcohol addiction. I also knew
that my maternal grandmother, Mary Jane, was a first cousin of Thomas and that
she also came from a farming family. My
other grandmother, Jane (nee McAlpine) was a Scottish lady who was reputed to
be related to Sir Robert McAlpine, the founder of the construction company of
the same name.
These stories were in the
back of my mind for decades as I pursued my career, first as an academic,
teaching and researching in genetics and then as a university manager, before
finally entering commercial life as the Chief Executive of Southampton Science
Park Ltd. During this varied life in
work I learned many skills which would stand me in good stead once I found the
resources and the inclination to undertake serious research into my family
background. That opportunity arose after
my retirement in 2007. Then I had the
time to spare and the financial means to pursue the study, which quickly became
a demanding and consuming hobby. Also,
family research had received a substantial boost from the increasing
availability of on-line sources of information and many lines of enquiry could
thus be conducted at home from my PC.
How much truth was there in
the family anecdotes and what else would I find out about my past along the
way? I did not start my study by reading
“how to” books on genealogy but simply allowed myself to be seduced by the
offer of a period of free use of the Ancestry.co.uk website and set about finding
my grandparents in the 1901 Census. This
proved to be straightforward and within a few minutes I became hooked on family
research, as many others have been, due to an addictive desire to know ever
more about my origins and about my ancestors and their lives.
At this point I will
digress to the present. After almost
five years of research I know a lot more about my antecedents and the forebears
of my wife and my son-in-law. I now have
almost 4500 individuals (all but a few being relatives of my wider family) on
my genealogical database. The process of
acquiring this information has also had the effect of stimulating the evolution
of a personal philosophy of family research.
This is how I now view my activities.
The terms “family
research”, “family history” and “genealogy” are used, loosely, as being
interchangeable. There does not seem to
be any generally accepted set of definitions for the terms, which would allow
us to say if they are exact equivalents.
“Genealogy” seems to be the least contentious. The OED defines the term as “the study and
tracing of lines of descent.” This seems
to imply a circumscribed discipline in which, once a line of descent from an
ancestor has been identified with high probability, the study is essentially at
an end. In the initial stages of a study
into one’s ancestors, establishing such relationships is both revealing and
fascinating. Learning that I emerged
from generations of farmers in the East Riding of Yorkshire and North Lincolnshire was initially exciting for me but I
now find that adding another name and another generation to these lineages is
only mildly stimulating, since one can usually know little about the newly
discovered ancestor, except name, year and place of birth. Now I find it more important to have some
understanding of relatives’ lives than simply to know who they were.
Before examining the terms
“family research” and “family history” let us first look at how we might define
“family”. A narrow definition focuses on
there being a genetic relationship between the members of a family and
typically this includes parents and children, but occasionally encompasses
other close relatives, such as grandparents, cousins, aunts and uncles. More broadly, “family” can be defined to
include everyone living together in a household. This will typically include parents and
children and perhaps other genetic relatives but may also include various
categories of non-genetic family members, such as adoptees, new spouses and the
children of new spouses from previous relationships.
When used in a medical
context, the term “family history” has a narrow definition, meaning the recording of family
structure and relationships, including information
about diseases in family members and typically going back about three
generations. This is an aid to
identifying the presence of disease with a genetic causation. Outwith a narrow, medical context, “family
history” also has a wider definition encompassing all family matters, not just
those dealing with health and disease.
So, what are the essential
elements of my philosophy of “family research”, the term I shall use as
shorthand to describe the framework of my activities? Firstly, genetic relatedness, ultimately to
me, is a predominant, but not an exclusive, property linking the people that I
study. This includes people who are my
direct relatives, going forward as well as backwards in time, but also people
who are collateral relatives, such as uncles and aunts and the spouses of
direct and collateral relatives. Close,
collateral relatives are not very interesting if they have left little trace,
but more distant collateral relatives are interesting if they have achievements
in their lives. Secondly, I study the relationships,
activities and interactions within families, in the broadest sense, to whom I
am linked genetically. But families are
not static entities. They evolve, mainly
through marriage, birth and death and my family research covers this
evolutionary process too. Also, families
do not exist in a vacuum, insulated from interactions with their neighbours,
employers and landlords and my research takes frequent forays into these
external relationships of the family. On
a grander scale, families and the individuals within them live within societies
which are themselves evolving, buffeted by social and technological change and
fashioned by national and international events such as wars and natural
disasters. From time to time consideration
of cultural, social and international events is essential in interpreting what
happened within families, for example relatives who moved from the country to
burgeoning towns as a consequence of the industrial revolution, or relatives
caught up in WWI and WWII.
It is also important to say
something about the research methods that I have employed. When I started out on this journey I did so
without preconceptions and with a frankly casual attitude to the collection,
evaluation and recording of data. It did
not take long for me to realise that family research, as I have chosen to
define it, is a serious academic discipline, which needs to adhere to certain
principles. In my opinion they include
the following.
Recording information. Since the strongest theme running through my
family research is that of genetic linkage, the employment of genealogical
software to store these fundamentals is indispensable. I use PAF5, which is freely available from
the Latter Day Saints (LDS – the Mormon Church). Not only does PAF5 allow you to record the
fundamental statistics of people’s lives, such as date of birth/christening,
marriage, death and burial but it also allows you to record genetic linkages
and to navigate with ease through this complex network of interrelationships.
Information sources. As important as recording information is the
need to record the source of a piece of information so that in the future you,
or anyone else, can check the information and evaluate any conclusions based
upon it.
Time sequence. All information on an individual should be
recorded in a time sequence. The “Notes”
section of an individual’s record on PAF5 can be used for this purpose. This should include dates and places of
births, marriages and deaths, census records, will and probate records, press
mentions, etc. Getting an insight into
someone’s life is much easier when data are organised in such a time sequence. Indeed, this is the start of a biography of
the person, though that biography will prove to be sketchy to the point of
triviality for most relatives born before 1750.
Understanding the nature of
data. Registration of a birth will
usually only tell you when and where a birth was registered, not when and where
the birth occurred. Similarly, a
christening would usually have occurred shortly after birth and in the same
village, or one nearby, but there may only be a loose relationship between date
and place of christening and date and place of birth. It is important not to make unwarranted
assumptions in evaluating such data.
Limitations of data
sources. Conventional family research is
only possible once written records became routinely available. The Subsidy Rolls of Edward I, used to
raise taxes for his wars in Scotland
and Wales hardly meet that description.
The same is true of the Poll Tax records which first date from 1377 in
the reign of Edward III. This was a tax
levied at the rate of 4d on all adults to pay for military excursions into France. The first records which were really useful
were parish records of baptisms, marriages and burials. These records were first introduced by the
Catholic Church in Europe in the 15th century but were not
introduced in Britain until
1538 in the reign of Henry VIII, after the split with Rome.
Thomas Cromwell, Henry’s Vicar General, ordered that each parish priest
must keep these records. However, this
instruction was poorly observed and many records, in any case, were
subsequently lost. Statutory
registration of births, marriages and deaths was introduced in England and Wales
in 1837 but not in Scotland
until 1855. A census of the whole
population was first carried out in 1801 and has been repeated every 10 years,
except in 1941. Sadly, the 1931 records
for England and Wales were
destroyed by fire. The censuses for 1801
to 1831 were statistical in nature and did not give information on individuals
or households, so it is only from 1841 that we are able to get a ten-yearly
snapshot of family relationships. In gathering data from official records, primary
sources, eg microfiches of parish records or scans of census records, are
superior to secondary sources, such as transcripts of parish records and census
returns, especially machine-mediated transcripts. Comparison of equivalent primary and
secondary sources will show quite quickly how often transcription, by either
man or machine, can introduce errors.
But primary sources can also contain errors, especially when data are
collected orally and then rendered in writing, eg a census enumerator who was
told that a child was called “Tom” but rendered the child’s name as “John”
because he misheard what was said. One
of the weakest sources of secondary information is a source, such as a
published family tree or an LDS record submitted by an individual (see original
Family Search database), which presents “facts” on family relationships, eg
parentage, but without supporting data sources.
Such “facts” should be treated as “opinions” until they can be
independently verified.
Identifying distant
relatives. It is frequently the case
that in identifying relatives before 1800 we are reduced to using only parish
records and identification may then rely on someone having the right name,
being born in a plausible time bracket and not far from the place of birth of
his or her children. But this is a
dangerous practice, especially where a surname is frequent in a given locality,
eg Harrison in the East Riding of Yorkshire. The most plausible birth is not necessarily
that of your ancestor. Parish records
are notoriously incomplete and your relative may have been christened or
married in a parish whose records have not survived, eg in a non-conformist
church. Ideally you need to arrive at
the same attribution by two independent routes.
If this is not possible, then I usually stop chasing a line back because
the effort may be wasted. It is
important to remember that an error will normally result in all subsequent,
more distant ancestors being wrongly identified too. I find that potential ancestors for whom I do
not have a high level of confidence of correct attribution are simply not
interesting.
Can any facts be relied
upon to be true? The answer to this
question, in an absolute sense, is “no”.
A brief consideration of any “fact” will soon uncover potential sources
of error. Attributed fatherhood may be
unreliable because of hidden liaisons by the mother. Attributed motherhood may be more reliable
but even this apparent “fact” can be in error, eg from accidental
baby-switching in a hospital setting.
But if we can never be 100% sure of our facts, how can we proceed with
family research, or indeed any kind of research? The answer is that we try to reduce the
margin for error to acceptable levels.
Probability is usually expressed as a percentage or as a decimal
fraction. The probability that a
particular statement is true may, in theory, range from 100%, ie it is certainly
true, to 0%, ie it is certainly false.
While the extremes of this distribution of probabilities can never be
reached, the closer we get to 100% the more likely we are to be correct in our
deductions based on the statement. In
quantitative statistical tests the probability that a result was obtained by
chance can be calculated. A probability below 5% (ie a confidence level of 95%)
is usually taken as a practical measure of significance, but this accepts that
in about 1 in 20 such cases the conclusion will be in error. Most of
the time, when conducting family research, we cannot attribute a quantitative
value to the probability that a statement is true but we can use perfectly
legitimate qualitative approaches to increasing the probability that
conclusions are correct. For example, if
we are trying to decide where someone was born we can collect as many
independent statements as possible which give this datum, eg different census
returns, birth certificate. If out of
three such sources only two agree with each other we would have a dilemma in
concluding where someone was born but if we had 10 such sources and 9 agreed,
we would be very confident that we could identify that individual’s place of
birth.
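The weighing of independent sources described above can be sketched in a few lines of Python. This is only an illustration: the function name and the place names are hypothetical, and a simple count of agreeing sources is an assumption of the sketch, not a formal measure of probability.

```python
from collections import Counter

def summarise_sources(statements):
    """Return (most common value, number of sources agreeing, total sources)."""
    counts = Counter(statements)
    value, agreeing = counts.most_common(1)[0]
    return value, agreeing, len(statements)

# Hypothetical birthplaces reported by three sources (only two agree)
few = ["Hull", "Hull", "Beverley"]
# Hypothetical birthplaces reported by ten sources (nine agree)
many = ["Hull"] * 9 + ["Beverley"]

for label, sources in (("3 sources", few), ("10 sources", many)):
    value, agreeing, total = summarise_sources(sources)
    print(f"{label}: {agreeing} of {total} agree on {value}")
```

The count does not tell us whether the sources are truly independent, eg two census returns completed by the same informant, so that judgement still has to be made by the researcher.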
Essay writing. Facts are of limited use without integration
and interpretation. For me, this process
involves writing a series of extended essays covering coherent areas based on
my overall family research database.
Essay writing is an essential process because it drives you to evaluate
fully your thoughts on a topic. It is
firmly based on fact but blatantly and openly strays into interpretation and
hypothesis building, starting from established facts (as qualified by high
probability). An essay might deal with a
branch within my family research database, eg The Foxes or The Spurriers, or
it might deal with an inanimate object, eg “Clipper Ship Conway”, or even an
individual to whom we are not related but with whom our ancestors had
interacted, eg “Captain WH Duguid”. I
like to keep my essays free from references within the body of the work, to
help the flow of argument, uninterrupted by citations. The reader can always refer to my family
research database for the discovery of sources.
Autobiography. I believe every genealogist should write his
or her autobiography, being careful to distinguish fact from opinion and trying
as far as possible to be balanced and accurate.
The autobiographer may wish to exclude some materials which might cause
offence or embarrassment to living people and this approach is likely to be
justified unless it bears upon a matter of substantial interest or importance,
which would otherwise be lost.
It has been pointed out
above that the (almost) unifying theme of my family research is the existence
of a web of genetic links. My own
career, which included a period of 20 years when I was a teacher and researcher
in genetics, gave me a good insight into the nature and significance of these
genetic links and it is useful to give a brief summary of the important aspects
of the academic discipline of genetics which impact most directly on family
research.
The characteristics of an
individual are due to information acquired by two parallel routes. We acquire genes from our two parents which
are involved, often predominantly, in determining many of our characteristics,
from hair colour to our ability to metabolise chemicals that we ingest. On the other hand we acquire other
characteristics almost entirely from the environment in which we are
raised. The ability to speak a language
is genetically determined but the ability to speak Chinese or English is
not. The type of language that we speak
is entirely learned during our upbringing, as are many other attributes, such
as system of religious belief or style of cooking. However, many characteristics are the result
of an interaction between our genes and our environment. For example, the disease Favism occurs when
susceptible individuals eat broad beans.
Their red blood cells then break down due to an enzyme deficiency. The
deficiency is caused by a defective gene but the disease is only expressed if a
deficient individual also eats broad beans (an environmental factor). Genetic inheritance involves the physical
passage of an information-bearing molecule, via egg and sperm, from one
generation to the next, but cultural inheritance involves only the passage of
ideas and information via written and spoken language and via observation of
others within a family, tribe or community.
(Please skip the next few
pages if you have an understanding of basic genetics!)
The information-bearing
molecule alluded to above is deoxyribonucleic acid, universally known by its
abbreviation, DNA. Information is
encoded in DNA using an alphabet of four letters and these letters can form 64
different 3-letter words (4^3). The
sequences of these words on linear DNA molecules provide the instructions which
pass from the parents to the fertilised egg and determine the genetic component
of inheritance.
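The count of 64 three-letter words can be checked by listing them. The sketch below assumes the conventional letters A, C, G and T for the four-letter DNA alphabet, which are not named in the text above.

```python
from itertools import product

LETTERS = "ACGT"  # conventional names for the four letters (an assumption of this sketch)

# Every ordered arrangement of three letters, with repeats allowed
words = ["".join(triple) for triple in product(LETTERS, repeat=3)]

print(len(words))   # 4 x 4 x 4 = 64
print(words[:4])    # the first few words in the list
```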
The fertilised egg cell
divides repeatedly during development and the cells produced eventually differentiate
to form our tissues and organs. Before
cell division the information in DNA is copied exactly and one copy transmitted
to each of the two new cells. All cells
in our bodies (except unfertilised eggs and sperm – see below) thus contain the
same genetic information and cells with different functions, eg liver cells,
brain cells, are made by switching on and off different sets of instructions in
the cell’s DNA.
Most of the DNA in a cell
is present in the nucleus but some is present in the mitochondria (small bodies
generating energy) in the cytoplasm. Mitochondria
copy their DNA and the mitochondria divide, just like the whole cell. However, the copying of DNA and the division
of the mitochondrion are not synchronised.
Each body cell contains ~100 mitochondria and each mitochondrion
contains a variable number of DNA molecules, typically ~5. When the cell divides the cytoplasm is
pinched into two and the mitochondria are distributed roughly equally to the
two daughter cells. The mechanism for
separating copied nuclear DNA into daughter cells is much more precise. The DNA exists as several very long molecules
which are condensed by coiling and looping into structures called chromosomes
for ease of transmission at cell division.
A chromosome prior to cell division consists of two separate parts, each
containing one copy of the DNA information and each daughter cell gets one of
these copies, so that each contains exactly the same number of chromosomes and
exactly the same nuclear DNA information.
When egg and sperm cells
are formed there is a special kind of cell division which treats the
chromosomal and mitochondrial DNA differently.
When an egg cell is formed most of the cytoplasm (and its contained
mitochondria with their DNA molecules) goes to the unfertilised egg cell. On the other hand, when a sperm cell is
formed all the cytoplasm (and mitochondria) is excluded from the part of the
sperm cell (the nucleus) involved in fertilisation. Egg cells contain mitochondria (with their
DNA molecules) but sperm cells do not transmit mitochondria, so our mitochondrial
DNA comes exclusively from our mothers.
Chromosomes are of two
types, those which occur in pairs in the body cells of males and females and
those (sex chromosomes) which differ in number and/or type in the body cells of
the two sexes. Humans have 46
chromosomes, in total, in their body cells. These are made up of 24 different
types. Twenty-two of the types are not concerned with sex determination and
each of these occurs as a pair in body cells of both males and females. The other 2 types of chromosome are the sex
chromosomes, called X and Y respectively.
Female body cells contain two X chromosomes and male body cells have one
X and one Y chromosome. X chromosomes
contain lots of information in their DNA but Y chromosomes contain very little.
There is a special kind of
cell division which generates gametes (unfertilised egg and sperm cells). It ensures that each gamete contains only one
chromosome of each type not concerned with sex determination. This is also true of the pair of X
chromosomes in females. Thus, unfertilised
eggs all contain a single X chromosome. In
males the X and the Y chromosome separate from each other during sperm cell
formation. Half of all sperm cells thus contain a single X chromosome and the
others contain a single Y chromosome.
After fertilisation all eggs have 2 chromosomes of each type not
concerned with sex determination. Half
of all fertilised eggs contain two X chromosomes and the others contain an X
and a Y. A fertilised egg which contains
only X chromosomes develops into a female child. If a Y chromosome is present it switches development
from a pathway leading to femaleness to one leading to maleness.
A gene is a unit of
inheritance. It determines one
particular character, for example the structure of a protein molecule. The information in a gene is made up of the
code words in a linear section of DNA and different genes are arranged in a
linear sequence along a DNA molecule. A
change in the code words within a gene, eg swapping one word for another, may
change the information in the gene in such a way that some body characteristic
is changed. Such a change in DNA
information is called mutation and almost all genes exist in a population in these
different forms, or alleles.
Humans have ~20,000
different genes. While all genes are
made of DNA, not all DNA is used to make genes.
In humans quite a lot of the DNA in the chromosomes does not appear to
contain any information but its sequence of letters is still copied and
distributed accurately at cell division.
This so-called “junk” DNA can still accumulate changes in its sequence
of code letters, which can be detected by determining the sequence of letters
in DNA by chemical analysis. There is far
more “junk” DNA in our cells than DNA used to make genes. “Junk” DNA has proved to be very important in
studying the historical movement of human populations by plotting the
geographical presence or absence of particular sequence variants.
Each of our body cells has
two of each of the 22 different types of non-sex chromosomes. A gene has a fixed position on a chromosome
and each body cell has 2 copies of each of the ~20,000 different genes. If, in a particular human population, there
are two forms, or alleles, of a particular gene, which we will call A and a,
then each individual may potentially have a genetic make-up of AA or Aa or aa
for that gene. If individuals with the
make-up AA look identical to those with Aa but different from aa for a
particular characteristic, we say that allele A is dominant over allele a. Sometimes an allele will cause disease in a
person possessing it. Usually when this
happens, eg in Sickle Cell Anaemia, where there is an abnormal haemoglobin
molecule in the red blood cells, the allele causing the disease is recessive to
the allele which determines normal haemoglobin structure. So only people who have two “a” alleles show
the full-blown disease.
The special kind of cell
division which occurs in the formation of egg and sperm cells has another
important function in addition to halving the number of chromosomes. It also causes the different alleles of the
many genes, both on the same pair of chromosomes and on different pairs of
chromosomes, to be recombined to give combinations which did not occur in
either the father or the mother. Many of
our characteristics are determined, not by single genes, but by many genes
interacting with each other, so recombination is important in generating new
combinations of alleles which may produce new characteristics.
Disease-causing recessive alleles
of genes usually only occur rarely in a population, because people with the
disease often don’t survive to reproductive age, so these alleles tend to be
eliminated. If a disease-causing
recessive allele has a frequency of 1/1000 in a population and mating is at random,
only 1 in a million (1/1000 x 1/1000) people will have two copies of the allele
and show the disease, though almost 2/1000 will possess one copy of the allele
([1/1000 x 999/1000] + [999/1000 x 1/1000] = approximately 2/1000). However in a family where a disease-causing
recessive allele is present, if close relatives mate together, they
have a much higher chance of having a child suffering from the genetic disease,
because relatedness increases the chance that the two relatives, though not suffering
from the disease themselves, will both be carrying the allele. Where both parents are carriers, the chance of each
child showing the disease is 1 in 4 (1/2 x 1/2). This is why inbreeding, for example cousin
marriage, can have unpleasant outcomes and should be avoided.
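The arithmetic in this paragraph can be reproduced exactly with fractions. The sketch below assumes random mating and just two alleles, A and a, as in the text; the function name is hypothetical.

```python
from fractions import Fraction

def genotype_frequencies(q):
    """Genotype frequencies under random mating for a recessive allele of frequency q."""
    p = 1 - q  # frequency of the normal allele
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

freqs = genotype_frequencies(Fraction(1, 1000))
print(freqs["aa"])         # people with two copies: 1/1000000
print(float(freqs["Aa"]))  # carriers of one copy: just under 2/1000

# Two carrier parents (Aa x Aa): each passes on "A" or "a" with equal chance
offspring = [from_mother + from_father for from_mother in "Aa" for from_father in "Aa"]
affected = Fraction(offspring.count("aa"), len(offspring))
print(affected)            # 1/4
```

The last step applies only when both parents are already known to be carriers; it says nothing about how likely that is in a given family.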
All individuals of a
species are genetically related to each other and this is true of humans as it
is of other animals and plants. The
practical test of this relationship is that all individuals of a species are
capable of mating with each other and producing fertile offspring. All humans present on this planet are linked
to each other genetically and, using techniques to identify DNA markers, we are
able to follow this linkage back through time.
On the other hand, looking forward, there is a significant chance that
any one of us will not leave descendants far into the future.
Looking back to our ancestry, each
of us usually has 2 parents, 4 grand-parents, 8 great-grand-parents, etc, with
the number of our direct relatives doubling with each successive generation
going backwards. Now, if this
proposition were absolutely true, the number of our direct relatives would
quickly exceed the number of people in the population of these islands. If we assume that there are 3 generations per
century and we go back 27 generations to a time shortly after the Norman
Conquest, then the population would need to have been at least 2^27, which is
greater than 134 million! It is
estimated that the population of Britain at that time was actually
~2 million. The explanation for this
discrepancy is that if we could trace our relationships back through 27
generations we would find that many, perhaps most, of our marrying ancestors in
those generations were relatives, even if the relationship was often remote.
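The doubling argument is easy to check. The sketch below uses only the figures quoted above (27 generations and a population of about 2 million); the function name is hypothetical.

```python
def ancestor_slots(generations):
    """Number of places in the family tree at a given generation, if no ancestor appears twice."""
    return 2 ** generations

GENERATIONS = 27
ESTIMATED_POPULATION = 2_000_000  # approximate figure quoted in the text

slots = ancestor_slots(GENERATIONS)
print(slots)                          # 134217728
print(slots // ESTIMATED_POPULATION)  # places in the tree per person then alive
```

On these figures there are about 67 places in the tree for every person alive at the time, so the same individuals must fill many places.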
How much genetic similarity
is there between an individual and his/her parents, grandparents and more
remote direct ancestors? The answer
depends upon the location of the genes in question. All the genes which lie on the mitochondrial
DNA of an individual come from his/her mother.
There are 37 genes on the mitochondrial DNA and most of the
mitochondrial DNA is used to code for these genes. All the genes which lie on the Y chromosome
of a male individual come from his father.
The Y chromosome DNA has the capacity to code for several thousand genes
but in fact only codes for ~27, most of which do not have vital functions. Thus, only ~0.3% of our genes are found in
these two locations ((37 + 27) / ~20,000). As a result, they tend to be ignored so that we can make
approximate calculations for the bulk of our genes, which lie on the chromosomes
that carry lots of genes: the chromosomes not concerned with sex
determination and the X chromosome. (Strictly speaking the X chromosome ought
to be treated separately because, while a mother passes her X chromosomes
equally to her sons and daughters, a father only passes his X chromosome to his
daughters.)
With these qualifications,
we can say that each individual has half his/her genes (strictly speaking
alleles of genes) in common with his/her father and the other half with his/her
mother. This halving continues with each
further generation and the general rule for calculating the proportion of genes
in common with a direct relative is ½ ^ n, where n is the number of
generations. If we go back 3 generations
to our 8 great-grand-parents, we only have ½^3 = 1/8 or 12.5% of our genes in
common with each of them. Frequently it
is possible to trace some part of our family tree back for 12 generations (say
400 years). While we may feel
emotionally connected to individuals from so many generations back, we actually
only share ½^12 or <0.025% of our genes with them. We are barely connected at all.
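The halving rule can be tabulated for any number of generations. This is a minimal sketch of the average expectation only; the function name is hypothetical.

```python
from fractions import Fraction

def shared_fraction(n):
    """Average proportion of genes shared with a direct ancestor n generations back."""
    return Fraction(1, 2) ** n

for n in (1, 2, 3, 12):
    share = shared_fraction(n)
    print(f"{n:>2} generations: {share} = {float(share) * 100:.4f}%")
```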
The genetic relationship
with remote ancestors, even where a reliable family tree is present (and
ignoring the possibility of hidden illegitimacy) may actually be more or less
than the estimate (which is an average) and can even be zero, because of the
way that recombination of alleles occurs during the special kind of cell
division which precedes the formation of eggs and sperm. This recombination of alleles occurs due to
two different mechanisms, depending upon whether the genes being examined lie
on the same chromosome or upon different chromosomes.
(Consider just 2 chromosome
types in a child’s body cells, called A and B.
There are 2 chromosomes of type A present and 2 chromosomes of type B,
which we will call A1, A2 and B1, B2, where chromosomes labelled “1” come from
the mother and chromosomes labelled “2” come from the father. When the child starts to produce egg or sperm
cells each gamete receives either A1 or A2 and either B1 or B2. A quarter of the gametes will have the
chromosome complement A1 B2 and another quarter will have the complement A2
B1. Neither of these combinations
occurred in either parent, so there has been recombination.
When genes occur on the
same chromosome type, they can also be recombined to give new combinations of
alleles on the same chromosome type by the operation of a mechanism which swaps
sections of chromosome. This adds to the
mechanism of recombination of genes which lie on different chromosome
types. However, the swaps do not occur
very often, typically 1-3 per chromosome pair. Some copies of the chromosomes
have no swaps at all, and swaps tend to occur in a small number of places,
meaning that groups of genes tend to be inherited together without being
recombined.)
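The assortment of whole chromosomes described in the note above can be listed exhaustively. The sketch uses the labels A1, A2, B1 and B2 from the text and deliberately ignores the swapping of sections within a chromosome.

```python
from itertools import product

maternal = ("A1", "B1")  # the combination received from the mother
paternal = ("A2", "B2")  # the combination received from the father

# Each gamete receives one type A chromosome and one type B chromosome
gametes = list(product(("A1", "A2"), ("B1", "B2")))

# Combinations that the child did not receive from either parent
recombinant = [g for g in gametes if g not in (maternal, paternal)]

print(gametes)
print(recombinant)
```

Two of the four equally likely combinations are new, which matches the two quarters described in the text.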
Ignoring the complications
caused by genes in mitochondria and genes on sex chromosomes, we can calculate
what proportion of other genes (the great majority) that two related
individuals have in common due to recent common ancestors. This value has the grand title of the
Coefficient of Relatedness and the symbol R.
Two individuals may be direct relatives or collateral relatives. The line of direct relatives is, for example,
son-father-grandmother-great grandmother.
A collateral relationship is, for example, son-(father and mother)-son
(ie brothers) or son-father-(grandfather and grandmother)-son (ie nephew-uncle).
R can easily be calculated. Sons (or daughters) share half their genes
with their fathers (or mothers). We can
calculate R by multiplying the proportions together for each link in the
genealogical chain in travelling from one individual to the other and, where
there is more than one route, adding the results for the separate routes. In the case of two brothers there are two
separate routes, each of two links (son-father-son, son-mother-son) and so R, the
proportion of genes in common between two brothers, is (½ x ½) + (½ x ½) = ½,
or 50%. In the case of nephew-uncle
there are two separate routes, each of 3 links (son-[father or
mother]-grandfather-son, son-[father or mother]-grandmother-son) so they have
(½ x ½ x ½) + (½ x ½ x ½) = ¼, or 25% of their genes in common. The calculation of R can be extended to any
relationship, no matter how remote.
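For readers who like to check the arithmetic, the route-counting rule can be written as a short Python sketch. The function name relatedness is my own, and the sketch assumes no inbreeding, so that every link contributes exactly one half. The first-cousin line is an extra illustration not worked through in the text above.

```python
from fractions import Fraction


def relatedness(route_lengths):
    """Coefficient of relatedness R for two relatives.

    route_lengths lists the number of parent-child links in each separate
    route between the two individuals. Each link contributes a factor of
    one half, and the separate routes are added together.
    Assumes no inbreeding anywhere in the pedigree.
    """
    return sum(Fraction(1, 2) ** links for links in route_lengths)


print("father-son:   ", relatedness([1]))      # one route of one link
print("brothers:     ", relatedness([2, 2]))   # via father and via mother
print("nephew-uncle: ", relatedness([3, 3]))   # via grandfather and via grandmother
print("first cousins:", relatedness([4, 4]))   # extra example, two routes of 4 links
```

Using fractions rather than decimals keeps the results exact, so brothers come out as 1/2 and nephew-uncle as 1/4, matching the figures above.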
The above calculation of
the coefficient of relatedness applies to a situation where there has been no
inbreeding in the last few generations of a family. Where there has been inbreeding, the
coefficient of relatedness will be increased in value. The most common inbreeding situation
encountered by family researchers is cousin marriage. As we have seen, R for two brothers whose
parents are unrelated is 0.5, ie they have 50% of their genes in common. If their parents were first cousins, then the
value of R increases to 0.5625. If their
parents were double first cousins, then the value for R increases again to
0.625.
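The figures for cousin marriage can be reproduced with a small Python sketch. It rests on an assumption of mine, not a formula stated here: that R for two brothers equals one half plus half of the parents' own value of R, with no further correction for the brothers themselves being inbred. The function name brothers_relatedness is likewise only illustrative.

```python
from fractions import Fraction


def brothers_relatedness(parents_r):
    """R for two full brothers whose parents have relatedness parents_r.

    Assumption (chosen because it reproduces the quoted figures):
    R = 1/2 + parents_r / 2, with no correction for the brothers' own
    inbreeding.
    """
    return Fraction(1, 2) + Fraction(parents_r) / 2


cases = [
    ("unrelated parents", Fraction(0)),
    ("parents are first cousins", Fraction(1, 8)),
    ("parents are double first cousins", Fraction(1, 4)),
]

for label, parents_r in cases:
    print(f"{label}: R = {float(brothers_relatedness(parents_r))}")
```

Under this assumption the three cases give 0.5, 0.5625 and 0.625 respectively.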
Inbreeding is the mating of
genetically-related individuals. The
degree of inbreeding can be quantified by measuring the Coefficient of
Inbreeding, F. It measures the
probability that the two genes of a particular kind in an individual are
identical due to inheritance from a common ancestor. In an otherwise outbred population F is
approximately half the value of R, the coefficient of relatedness, for the two
parents. The method for calculating F
will not be given or explained. Any good
book on quantitative inheritance or plant/animal breeding will give the
formula.
(That’s the basic genetics
over!)
Given names and surnames
are a vital component of family research, because they are a major tool in
identifying individuals at different times and places throughout their lives
and, in the case of surnames, in identifying the parents and children of
individuals. Indeed, family research
could not proceed if individuals changed their given names with a significant
frequency and if surnames were not inherited between generations and, on
marriage, if the female marriage partner did not routinely give up her parents’
surname and adopt the surname of her husband.
Also, if the big events in life were not recorded in writing, names
would be nearly useless to the family researcher. But the recording of life events, the
maintenance of names within and between generations and the discarding of
female surnames on marriage are practices which are only a few hundred years
old. It is worth looking at the origin
of given and inherited names, so that we can understand the limitations that
this history places on family researchers.
In
primitive hunter/gatherer societies it was probably essential that each
individual had a unique name, so that accurate communication could occur between
members of the group, for example during hunting. There was no need to have more than one name,
provided that every member of the group had a unique name. In
Anglo-Saxon England there were so many given names that within a family or small
group, such as a village, each is likely to have been unique. With the coming of Christianity there was a
trend to use biblical names as given names and thus the pool of given names
became much smaller. The Norman invasion
of 1066 had a profound effect on Britain. Given names introduced
after the invasion tended to be either biblical names or names popular in France at that
time. Most Anglo-Saxon names were
discarded.
In addition to a given name, many people acquired a second
name, or byename, probably to differentiate them from other individuals who had
the same given name. When a byename
became fixed, adopted by all members of a family and inherited between
generations, it became transformed into a surname. Byenames were ephemeral but surnames have
survived, though byenames probably had many of the characteristics of surnames. Surnames are classified into four principal
subdivisions, those derived from place, eg Bielby, those derived from kinship,
eg Harrison, those derived from nicknames, eg Fox and those derived from
occupation, eg Harriman.
This fixing of byenames into surnames did not occur quickly
or uniformly. At the time of the
Conquest, a few Norman barons had surnames which were derived from the names of
their estates back in France;
for example, Warenne (modern Warren) was derived from the hamlet of Varenne near
Dieppe. Some barons took surnames derived from the
names of their English estates as a means of establishing ownership. However, even long after 1066 surnames were
confined to the ruling barons. By 1250
most knights (landowners at a lower level than the barons) had surnames. Also
by this date some families in provincial towns had surnames and by about 1350
most families in England
had surnames. The convention of women adopting their husband’s surname did not
necessarily occur immediately on the adoption of surnames. Sometimes women kept their old surname on
marriage, or even changed it for one different to the surname of their husband. The rules were not fixed until well after
1300 but had become fixed by 1400. Once
surnames had been adopted they did not necessarily remain immutable. Some surnames
evolved by truncation and some by variation in spelling, which was often a
variable rendering by clerks of local pronunciations. After the Conquest,
French words were not understood and were often twisted due to English
pronunciation, eg Bohun became Boon, Bone or Bown. Thus whole families of similar surnames arose
from the same original name. Even as
late as the mid-19th century variable spellings of surnames can be
found in parish and census records.
One major stimulus for the fixing of byenames into surnames
was the creation of written records for the purposes of taxation. In the reign
of Edward I (1272-1307) the whole country was taxed to raise money for the
king’s wars in Wales and Scotland and the names of all those who paid their
taxes were written down and survive today in the Subsidy Rolls in the Public
Record Office. A century later, in the
time of Richard II, another tax, the Poll Tax, was introduced and names were
written down again. Some of these lists
have also survived. In the Subsidy
Rolls, names such as “Hugh the Baker” probably indicate current occupation (ie “the
Baker” is a byename) but “John Carter, draper” clearly shows that in some cases
a surname had already evolved. Ignoring 20th century immigration, at
least 30,000 surnames are present in Great Britain. Some surnames are much more common than
others, Smith being the most frequent.
The practice of assuming the surname of the male partner on
marriage parallels the mode of inheritance of the Y chromosome in one respect,
ie it passes from father to son, to son, etc.
However, unlike genetic inheritance, social inheritance of surname is
blind to such events as hidden illegitimacy and adoption. Individual surnames,
even common names such as Smith, have characteristic geographical distributions
today, which may contain information about the geographical origins of the names. It is likely that many of the rarer surnames
had a single geographical origin and even that all bearers of such a name are
potentially genetically related.
However, in order to draw conclusions about the current geographical
distributions of surnames it is essential first to trace the origin and
evolution of that name in written records through the ages. Also, studies of particular DNA sequences on
the Y chromosome show that while such sequences are associated with some
surnames, this is never exclusively so, presumably due to hidden
illegitimacy. In the case of very
frequent surnames there is usually a diversity of sequence variants present,
perhaps mainly due to the name having arisen independently on several or even
many occasions.
Prior to the Marriage Act of 1753 marriage was governed
entirely by church law and the only requirement was that it should be
celebrated by a priest, normally, but not exclusively, after banns or the
obtaining of a marriage licence. The
1753 Act required for the first time in England and Wales,
under the civil law, that marriages must take place in the parish church. Curiously, the law applied to Anglicans and
Roman Catholics but not to Jews or Quakers.
Before this enactment it was normal for couples to live and sleep
together and for the bride to be pregnant at the time of marriage. By the start of the 19th century
it had become the social convention that there should be abstention from sexual
intercourse before marriage. However,
premarital conception was at a level of ~40% at the start of the 19th
century, though it dropped to ~20% during the Victorian era. By the end of the 20th century it
had risen to ~40% again.
Most extra-marital conceptions were actually pre-marital
conceptions, since marriage usually followed.
In the late 1700s in England
only 2-4% of births were illegitimate.
The illegitimate were often stigmatised, especially during the Victorian
era.
It was not unusual for men to delay marriage until their
late 20s in the 18th and 19th centuries, though women
usually married at an earlier age. Women
then usually had children regularly, approximately every 2 years until death or
infertility intervened. In the 1730s,
24% of marriages were ended within 10 years and 56% within 25 years, due to the
death of a spouse. These rates fell
substantially during the 19th and 20th centuries as
health and longevity improved. While women could have 12 or more children, they
typically had only 4 or 5. Towards the
end of the 19th century there is evidence that couples had started
to limit conception voluntarily.
Family research depends crucially upon birth registration,
both parish records and statutory registration, to identify the genetical
parents of a child. There is usually
little doubt as to who was the mother, since her pregnancy was obvious, but the
father may sometimes have been someone other than the registered father. Modern genetical studies have shown that in
the UK 1-2% of all births are associated with a so-called non-paternity event, ie the
registered father is not the genetical father.
It is impossible to estimate the frequency of non-paternity events in
the 17th, 18th and 19th centuries but we can
state with confidence that they occurred.
A non-paternity event normally broke the link between Y-chromosome type
and surname (but not if the child was conceived by the brother of the husband)! Modern studies looking for the association of
surnames with DNA markers on the Y chromosome support the presumption that
non-paternity events did occur in the past.
In former times, in addition to extramarital liaisons by a wife, other possible
mechanisms involved in non-paternity events were informal adoption and
accidental baby swapping by nursemaids.
Contemporary society provides additional, more exotic mechanisms, such
as egg and sperm donation and surrogacy, for non-paternity (and non-maternity)
events.
It is worth going back over this introduction to family
research to summarise the clear limitations under which such research labours. Once we understand these limitations we can
appreciate what it is possible to achieve with our own family studies and what
is beyond knowing.
The evolution of surnames to create firm links between
generations, and the regular recording of
the great events of life (birth or baptism, marriage, and death or burial) from the
mid-16th century onwards, created the opportunity to pursue family
research back from the present to that time.
However, there is little chance of pushing back knowledge of our individual
ancestors to earlier times. In the case
of my own family researches, the earliest entries on my database are for the
Maw family, farmers in Epworth, Lincolnshire
from as early as 1467 to the present.
Although the Maws were not my direct ancestors, they intermarried with
my Fox farming relatives extensively in the 19th century. But they constitute a rare case where family
research can penetrate back before the 1650s.
My earliest direct relatives who have been traced are various
individuals, probably farmers, on the Yorkshire Wolds going back 9 generations
to the mid 17th century. The earliest records tell us little about
the lives of individuals other than names and dates. It is difficult to get any appreciation of
the lives they led. Even if we are lucky
and can trace back our relatives to the mid-16th century, we have to
ask ourselves if such information is useful.
We also need to remember that the greater the number of presumed
genetical links back into history, the poorer is the evidence on which that
supposition of genetical linkage is based and the lower the probability that we
have correctly identified the next set of genetical relatives in the chain as
it disappears into the mists of time.
Genetical linkage, either direct or collateral, is a strong
driver for most family researchers.
However, the approximate halving of genetical relatedness with each
direct link going either forwards, backwards or collaterally, quickly takes us
to individuals with whom we have little genetically in common. Genetic dilution is a factor which is ignored
by, or unknown to, most family researchers, but it is an important and
inevitable fact. In addition, it is also
worth bearing in mind the more esoteric consideration that, in spite of a
proven genealogical link to a remote ancestor, sampling error during the recombination
of genes prior to gamete formation may have totally excised a genetic
linkage. We may be “related” to someone
but have no genes in common, at least due to recent reproductive history! The final point in relation to genetic
relatedness is that there may be non-paternity events in our lineage. If we assume that the frequency of
non-paternity events is 2% per generation, then if we can trace a line of
ancestors back for 11 generations the probability that there have been no
non-paternity events in this chain of 10 genetic links is (0.98)^10, which is
approximately 0.82, ie there would be
an ~18% chance that there would be at least one non-paternity event in the
chain. A non-paternity event is, like a
wrong attribution of parenthood based on records, a fundamental break in
understanding our genetic link to the past.
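The non-paternity calculation is easy to repeat for other chain lengths. The sketch below assumes, as in the paragraph above, a constant rate of 2% per link and that each link is independent of the others. The function names are my own, and the chain lengths of 5 and 20 links are added only for comparison.

```python
def prob_unbroken(links, rate=0.02):
    """Probability that no non-paternity event occurs in a chain of links.

    Assumes the same rate applies to every link and that links are
    independent of one another.
    """
    return (1 - rate) ** links


def prob_at_least_one_break(links, rate=0.02):
    """Probability of at least one non-paternity event in the chain."""
    return 1 - prob_unbroken(links, rate)


for links in (5, 10, 20):
    print(
        f"{links} links: unbroken {prob_unbroken(links):.3f}, "
        f"at least one break {prob_at_least_one_break(links):.3f}"
    )
```

For 10 links the chance of an unbroken chain is about 0.817, and it falls steadily as the chain is lengthened.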
Taking all these considerations together, I find the most
interesting area of family research relates to direct ancestors born from ~1770
onwards and extending back from me by 5 generations, ie to my great, great
grandparents. These are people of whom
we may come to know something significant about their lives.
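To put a figure on the genetic dilution over this range, the expected proportion of genes shared with each direct ancestor can be tabulated in Python. The sketch assumes no inbreeding and counts myself as the first of the 5 generations, so that my great, great grandparents are 4 links away; the names used are illustrative only, and the values are averages which individual cases may depart from.

```python
from fractions import Fraction

# Direct ancestors in order of distance; position 1 = one link away.
ANCESTORS = [
    "parent",
    "grandparent",
    "great grandparent",
    "great, great grandparent",
]


def expected_share(links):
    """Average proportion of genes shared with a direct ancestor.

    Each parent-child link halves the expected proportion.
    Assumes no inbreeding.
    """
    return Fraction(1, 2) ** links


for links, name in enumerate(ANCESTORS, start=1):
    share = expected_share(links)
    print(f"{name}: {share} ({float(share):.2%})")
```

On this reckoning each great, great grandparent contributes on average 1/16, or 6.25%, of my genes.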
Inheritance, as we noted at the start of this discourse,
consists of two fundamental elements, cultural inheritance and genetical
inheritance. Cultural inheritance, like
genetical inheritance, is diluted with the passage of time but not in such a
precise and predictable way as genetical inheritance. But in one way it endures and may have the
power to influence people in future generations. That is the written word, the creation of
music or works of art, or any number of contemporary digital outpourings. A book, such as “The Origin of Species” by
Charles Darwin, could be cited as such a cultural work which has not lost its
impact, even though it was first published more than 140 years ago. Few of us will have a cultural inheritance as
profound as that of Darwin’s writings lurking in our family history but through
family research we usually find something that is insightful of its times and
worthy of preservation.
In recent years it has become clear that each of us carries
a precise information record within the 20,000 genes and associated “junk” DNA
in our cells which will, in time, become the means to create a deep insight
into our individual genetic origins.
Although this branch of genealogical research is in its infancy, it has
already given some understanding of the migration of ancient human
populations. These DNA sequences provide
a link with a past much more remote than can ever be achieved with conventional
family research. DNA information does
not depend on the written word and is unaffected by name changes, sampling
error and non-paternity events.
Human mitochondrial DNA has a 400 letter variable sequence
which can be classified into a branching tree.
This leads to the recognition of ~36 different major branches in human
populations throughout the world but only 8 in Europe. Because mitochondrial DNA
is passed on only by mothers, these 8 European families represent only 8
original females in the original population which invaded Europe
and left offspring generation after generation continuously to the present
day. These females have been called the
“Seven Daughters of Eve” (even though we now recognise eight of them). Their places and times of origin can be
estimated and most seem to have originated in SW Europe
10 – 45 thousand years ago.
A similar exercise to that for mitochondrial DNA can be
carried out for Y-chromosome DNA which is passed exclusively from father to
son. It turns out that there are ~21
major Y-chromosome clans in the world, only 8 of which occur in Europe and, of
these, only 5 occur in the British Isles.
Analysis of minor variants in DNA sequence for both
mitochondrial DNA and Y-DNA gives a finer grain understanding of population
movements in more recent times. Of
course these two DNA sources deal only with a small percentage of all human DNA
and development of markers in the non-sex and X chromosomes has the potential
to give an even greater insight into the geographical and racial origins of
each of us, if we are prepared to pay the necessary fee!
Don Fox
20120629
donaldpfox@gmail.com