Dead people can talk! DNA Testing
Let me discuss DNA and its applications in our day-to-day life. DNA is the basic building block of life and is present in almost all our cells. It is built from four nucleotide bases, A, T, G and C, which pair up (A with T, G with C) to form the DNA molecule. About 3 billion of these base pairs are arranged in a specific sequence that contains the programmed instructions for you and me, and that sequence is copied into our cells. In other words, it is similar to a classic computer, which uses a two-symbol (binary) system to execute programs, whereas living things use a four-letter (A, T, G, C) system programmed to do various things. The variation in the sequence or in the number of repeats, caused by mutation, is the key that differentiates each one of us. About 99.9% of DNA is common across all humans; only 0.1% varies. But that still amounts to about 3 million base pairs, since each of us has 3 billion of them in each cell. We also share more than 97% of our DNA with chimpanzees, our closest living relatives (they are not our ancestors; we descend from a common ancestor), which strongly supports Darwin's theory of evolution.
I am sure that once you learn the basic concepts of DNA you will be hooked. You will be interested to know that many features of our body are determined by the arrangement of the A, T, G, C base pairs, called the sequence, whether it is eye or hair colour or even a rare defect like retinitis pigmentosa. The Human Genome Project, led by the US, sequenced essentially the entire 3 billion base pairs; it started in 1990, was declared complete in 2003 (about 13 years) and cost roughly 3 billion dollars. Now scientists are working on how to prevent terminal diseases by altering the sequence, which may lead to selective breeding and can raise many ethical issues. If we look at the positive side, it could eliminate many hereditary diseases and might slow the aging process. It has been found that telomeres, the protective stretches of DNA at the ends of chromosomes, shorten over time, which is thought to be one factor behind aging. Scientists have shown in the laboratory that this shortening can be slowed, and some expect that treatments could reach the common man by around 2050 if the research is allowed to continue in spite of the ethical issues related to it. Scientists also predict that faces could be reconstructed from DNA, which would be really helpful in solving crimes or even in picturing our ancestors.
In this article you will find answers to your general questions about DNA and its use in genealogy, to trace your ancestors. There will be a series of articles exploring the uses of DNA study in various other fields, provided you are interested in knowing more about it, so I count on your feedback to judge the interest. Let us start with DNA applications in genealogy, to find someone's ancestral roots and ethnicity. The article will answer the following questions.
- What is DNA?
- What is DNA testing?
- How to find your missing cousins in your family tree?
- What are Y-DNA, mtDNA and atDNA testing?
- What are haplogroups? How do they help in finding your ethnicity?
- Which DNA test do you need to answer your specific questions about your ancestry?
- Who are Y-Adam and mtEve? Scientific Adam and Eve vs. religious Adam and Eve: are they the same?
Let us look at why someone would want to use DNA for genealogy. Genealogy is the study of family ancestors and history.
- To learn more about one’s ancestry
- To confirm that one’s family tree reflects one’s actual ancestry
- To confirm the relationship between two people
- To validate a theory of where people came from
- To break down a brick wall in one’s genealogy research
- To find relatives for those that were adopted, gave up a child for adoption or otherwise do not know their ancestry
- To learn from which ancestor(s) certain traits were inherited
- To learn one’s ethnicity
The importance of Genealogy in tracing your ancestry
DNA tests can be useful in presenting scientific evidence about your ancestors, relatives or even ethnicity. But they cannot give you the names of your great-grandfather or great-grandmother, or even the exact period in which they lived. For that you need the help of genealogical records: one's known family tree, recording all the major events in the family such as births, deaths and marriages. Both of these work in tandem to give you the full picture of your ancestry. In searching for a missing relative or finding a missing link in your family tree, DNA will be really helpful. If your documented family history has no entries beyond a couple of centuries, you cannot be sure about your ethnicity from records alone, but you can check it using DNA testing (haplogroups).
I got seriously interested in DNA for a specific reason. As Megan and Ann, the authors of the book “Trace Your Roots with DNA”, put it: “At some level we all have a desire to know about our origins. Some of us join the quest early in life, others are successful at ignoring the pull for many years, and sooner or later it gets us.” Alex Haley, the author of the best seller “Roots”, once said: “In all of us, there is a hunger, bone marrow deep, to know our heritage – to know who we are and where we have come from. Without this enriching knowledge, there is a hollow yearning.”
Why did I get interested in DNA?
Once you cross a certain age you feel it is time for reflection. The age-old questions come up: “Who am I?”, “What is the purpose of my life?” and so on. Those who are spiritually inclined can search for the answers in religious books. Some may dig into philosophical books. I am not at all religious; I prefer science and logical reasoning to answer these questions, and a combination of cosmology and genetics may help. So the search was on. Over the years I realized that one can ignore religion (though this is not easy, especially when you are born into an organised religion, because you will be isolated very fast; I went through this pain early in my life, as a teenager in the 9th standard), but one cannot ignore the culture and food habits of the community one belongs to, the community of one's parents. They stick with you for life, wherever you live. So to answer the question “Who am I?” your search cannot be restricted to science alone; you have to find out more about the history and culture of the community you were born into.
So my next search was to find out the origin of our community and its culture: the Syrian Christians / St. Thomas Christians (https://en.wikipedia.org/wiki/Saint_Thomas_Christians), believed to be of Apostolic origin and one of the oldest Christian communities in the world. As per oral tradition the community originated in the 1st century AD. It now makes up 18% of the population of Kerala State, down from 21% a decade earlier due to negative growth. Admission is only through birth within the community, so it can more or less be categorised as a caste. We are known as Syrian Christians because the liturgical language we use is Syriac, which is derived from Aramaic, the language spoken by Jews at the time of Jesus. I studied Syriac for 4 years as my 2nd language, instead of Malayalam or Hindi, during my graduate days.
During the decade starting from 2000 there was a frenzy within our community to write family histories. Many families were ready to go to any extent to find the details and compile them. Our family history, in fact that of my mother's lineage, was also published, in 500 pages. The authors (volunteers) were successful in tracing the entire history from AD 1750 to 2000. AD 1750 was when our forefather (the most recent common ancestor) migrated to my native place, less than 10 km away from Kuravilangadu, where our ancestors are believed to have lived from the 2nd or 3rd century AD. Kuravilangadu was the power centre of the Syrian Christians for 15 centuries. The community was led by Archdeacons from Pakalomattom, a single family, from the 1st century to the 17th century, and after the split that family led one of the factions until almost the 19th century. I belong to another family and am of the 9th generation from 1750. Even children who died within days of birth figure in the history book. That was not a big surprise, because the Church has kept records of all births and deaths from its inception. Still, it was difficult to trace all of those who migrated to different parts of the world. This is called a genealogical record, and it can be corroborated with DNA testing.
In my quest for our roots, my friend and I created a Facebook group called “Syrian Christians: In search of our roots” around 3 years back to discuss our history and culture. It now has around 500 followers and is growing, even though we (the admins) are very choosy about admissions to the group. We have very clear historical proof from the 15th century onwards, but the majority of the earlier historical evidence (papers and books written on palm leaves) was systematically destroyed by the Portuguese in an attempt to convert us from Syriac to Latin in liturgical language, and from Nestorianism to Catholicism in theology. They achieved what they wanted, dividing the community into Catholics, Jacobites, Orthodox and Marthomites from 1653 onwards, but Syriac still survived. The oral traditions in the form of folk songs (Ramban Pattu), still popular, narrate the ancient history.
Now the question is: who were our forefathers who embraced Christianity so early, probably earlier than much of the Western world? Were they locals, Jewish traders settled on our shores, Arab merchants, someone else, or a mix of all of these? Jews settled in Kerala during 3 or 4 periods, starting from the 10th or 6th century BC and again in the 1st century AD and the 3rd century AD. In any case, a large-scale Jewish settlement happened in Kerala in the early centuries, and evidence of it is certainly present: Jewish tombs are still preserved in many parts of Kerala. So what is the percentage of Jewish ancestry among Syrian Christians, and among other communities? It is a big debate within the community as well as outside, and it can be settled only by DNA haplogroup mapping. DNA testing on a very limited number of samples yielded 0 to 10% Jewish origin across various factions within the community, which may turn out to be far from the truth as more and more people get tested. The oral traditions can be tested only if a considerable number of people go for DNA testing. I think I have given enough background on where my interest originated.
So let us look at the science behind DNA testing.
Our bodies are composed of many types of cells, including skin cells, blood cells, muscle cells, fat cells and many more. The headquarters of each cell is the nucleus, and it contains 23 pairs of chromosomes: 23 chromosomes from your father and 23 from your mother. Inside each chromosome there is DNA (deoxyribonucleic acid), which is the unique instruction manual for you. A gene is a stretch of DNA. Each cell has an identical copy of the DNA.
The following diagram shows the 23 pairs of chromosomes you got from your parents, 23 from your father and 23 from your mother. Of these, 22 pairs are similar in both sexes and are called autosomes. The 23rd pair is XX or XY and determines the sex of the baby: for males it is XY and for females it is XX. Since the Y chromosome can only be inherited from the father, it is very special; it carries information about a male's paternal line. A daughter receives one X from her father and one X from her mother, and the X from her mother is itself a mix of the mother's two X chromosomes, so the X chromosome is of little value for tracing a single ancestral line. Fortunately other methods are available to us, e.g. mtDNA testing.
To get an idea of the relative sizes of chromosomes and DNA, visit the link http://learn.genetics.utah.edu/content/cells/scale/
The structure of DNA was published by James Watson and Francis Crick in 1953, for which they later received the Nobel Prize. The building blocks of DNA are called nucleotides.
DNA looks like a twisted ladder, often known as the “double helix”. The double helix consists of two complementary DNA strands twisted together. If we were to hypothetically untwist the DNA and lay it flat, it would look like a ladder. The two sides of the ladder are called DNA's backbone, and the steps of the ladder represent the “bases”. There are four types of bases in DNA: A for adenine, T for thymine, G for guanine and C for cytosine. A always pairs with T, and G always pairs with C. An easy way to remember this: the letters made of straight lines (T and A) pair together, and the curved ones (C and G) pair together. The order in which A, T, G and C are arranged is called the sequence, and reading that order is called sequencing. The Human Genome Project was successful in sequencing essentially the entire 3 billion of them. The unique sequence of A, T, G and C forms the code that carries genetic information.
Scientists use three key features to tell chromosomes apart and identify their similarities and differences (refer to the above figure, far right):
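The pairing rule above (A with T, G with C) means that one strand fully determines the other. Here is a minimal Python sketch of that idea; the function names and the sample sequence are my own illustrations and do not come from any genetics library.

```python
# Pairing rule described above: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Partner strand read in the opposite direction, as the two strands run antiparallel."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    sample = "GATTACA"  # made-up example sequence
    print(complement(sample))          # CTAATGT
    print(reverse_complement(sample))  # TGTAATC
```

Taking the complement twice gives back the original strand, which is why each half of the ladder can be used to rebuild the other when a cell copies its DNA.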
- Size. This is the easiest way to tell chromosomes apart.
- Banding pattern. The size and location of Giemsa bands make each chromosome unique.
- Centromere position. Centromeres appear as a constriction. They have a role in the separation of chromosomes into daughter cells during cell division (mitosis and meiosis).
The basic building blocks of DNA, the nucleotides, are each made of a sugar, a phosphate group and a nitrogen-containing base, and the two bases of each pair are held together by hydrogen bonds. So basically our DNA is made of carbon, hydrogen, oxygen, nitrogen and phosphorus, elements that are found throughout our Solar System.
DNA sequencing determines the precise order of nucleotides (ATCG) in a piece of DNA. Knowledge of DNA sequences has become indispensable for basic biological research, other research branches utilizing DNA sequencing, and in numerous applied fields such as:
- forensic biology
- biological systematics
A mutation is a sudden and inheritable alteration in DNA. A mutation that occurs in a reproductive cell (ovum or sperm) may be passed on to the next generation. A mutation that occurs in a non-reproductive cell is not passed on to the next generation, but it can be copied to the daughter cells of that cell within the body, as happens in cancer. Genetic variation is useful because it helps populations change over time. Variations that help an organism survive and reproduce are passed on to the next generation. Variations that hinder survival and reproduction tend to be eliminated from the population. This process of natural selection can lead to significant changes in the appearance, behaviour or physiology of individuals in a population, sometimes within relatively few generations.
A gene is the smallest unit of inheritance that can be recognised as being passed from parent to offspring. It is the section of DNA that contains the code for a certain characteristic, such as eye colour. But genes are not the same in everyone. The different forms of a gene are called alleles, and a gene may have two or more alleles. In general everyone inherits two alleles from their parents, one from the father and one from the mother. Which allele determines an inherited characteristic is governed by a system called “dominance”: dominant alleles are expressed in preference to recessive ones.
So far we have described the basic concepts and with this background we can probe further into DNA testing.
DNA Testing and Genetic Genealogy
There are 3 types of DNA genetic testing.
- Y-DNA – paternal lineage (and paternal haplogroup)
- mtDNA – maternal lineage (and maternal haplogroup)
- atDNA – relatives on all lines and ethnicity estimates
But if we inherited the same pattern of DNA unaltered for generations, it would not be useful for finding out ancestry. We need variations (mutations) that occur neither too fast nor too slowly. Fortunately there is one type of marker called an STR (Short Tandem Repeat), which mutates every couple of centuries on average, and another called an SNP, which mutates every 50 generations or so, which amounts to 1000 years or more. In an STR the number of repetitions of the same short sequence of base pairs changes; in an SNP a letter in the A, T, C, G sequence itself changes. Tests are available to track both, so together with Y-DNA and mtDNA we can track one's relatives, most recent common ancestor, location of origin and ethnicity.
Let us look at some of the jargon used in this article before getting into the details of DNA testing.
UEP (Unique Event Polymorphism)
The use of the word “unique” is an overstatement, but the mutation rate is so slow that the mutation can be treated as a one-time event. UEP and SNP are loosely used interchangeably. SNP stands for Single Nucleotide Polymorphism: a change in a single letter of the A, T, G, C sequence, different from the parents.
Haplotype – Fast marker
A haplotype is the set of values a person has at fast-mutating markers (such as Y-STRs), where distinct mutations occur within a short time frame (a few generations).
Haplogroup – Slow marker
A haplogroup is a cluster of people who share the same UEP, a distinctive marker that all of them have inherited from a single ancestor. They may not have the same haplotype, which is based on fast markers (e.g. DYS markers, described below). Within a haplogroup there will be multiple haplotypes.
Y-DNA, found in the Y chromosome, has a unique inheritance pattern that makes it valuable for genetic genealogy testing. The Y chromosome is always passed from a father to his son. The father's cells make a near-exact copy of his Y chromosome and pass it down to his sons through his sperm. Note that if a man has only daughters, his Y chromosome is not passed to the next generation; he is “daughtered out”.
In the diagram given below right, A passed his Y-DNA to Sky (4th generation) through C and K. Similarly, it reached LTY through C and L. But in the same 4th generation WOY carries a different Y-DNA, since in the 2nd generation E was daughtered out (E has two daughters but no son), so E could not pass his Y chromosome further. In short, a Y-DNA test will show exactly the same Y-DNA for SKT, LTY, D, L, C, D, N and E.
If every father down through the ages had made a perfect copy of his Y chromosome for his son, every man in the world today would have an absolutely identical Y chromosome. This would have made the Y chromosome useless for finding out one's ancestry. Fortunately that is not the case: mutations occur while the Y is passed from father to son, at a rate that happens to be useful for tracing ancestry. Mutations don't occur like clockwork, and we cannot predict when one will happen. The original value may be preserved for hundreds of generations, and then suddenly two brothers may end up with different values. So a Y chromosome test cannot by itself prove that you share a particular named ancestor with another person, only that you have a common ancestor at some point.
Y-STR (Short Tandem Repeat)
STR markers are repeating sequences of DNA found on the Y chromosome. They have names starting with DYS (DNA Y-chromosome Segment), such as DYS19 or DYS390. By this time you know that DNA consists of a sequence of the 4 letters A, T, C and G, combined in pairs in a spiralling ladder structure. Sometimes the DNA appears to stutter and a short sequence of letters is repeated. At the marker DYS393, many men have 13 repeats of the same short sequence (e.g. GATA, GATA, ...). The number of repeats is counted at each marker to give a value, or allele, for every marker tested. Normally a son has the same Y-STR values as his father, but over generations mutations can occur when there is an error in copying. The son then passes the new, mutated value on to his own son, who will in turn pass it on to future generations.
Typically a commercial Y-DNA test will provide results for 37 or 43 markers, which is low resolution. For higher resolution and accuracy you may go for 67 or 111 markers. On their own these test results are meaningless unless you are able to compare them with somebody else's results. A 37/37 match indicates a close relationship on the paternal line, perhaps first or 2nd cousins. A 36/37 match may indicate a 3rd cousin, and 6th cousins may match 35/37, though these are only rough guides because mutations occur at random. DNA testing companies have large databases of test results to compare yours with, which helps in tracing ancestry.
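To make the idea of “counting repeats” concrete, here is a small Python sketch that counts how many times a motif such as GATA repeats back to back. The flanking letters (CCT, TTG) and the father/son sequences are invented for illustration; real test labs do not work on strings like this, and the function name is my own.

```python
import re


def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the length, in repeat units, of the longest unbroken run of `motif`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical stretches of DNA around one STR marker.
father = "CCT" + "GATA" * 13 + "TTG"
son = "CCT" + "GATA" * 14 + "TTG"  # one extra repeat: a copying error (mutation)

if __name__ == "__main__":
    print(count_tandem_repeats(father, "GATA"))  # 13
    print(count_tandem_repeats(son, "GATA"))     # 14
```

The value reported for a marker is simply this repeat count, so the father above would be reported as 13 and the son as 14 at that marker.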
What is a match?
Your DNA report will list the names of the markers (e.g. DYS390) and the result for each, e.g. 24, which is the number of repeats. It is convenient to string the numbers together in a compact fashion, such as 14-12-24-11-13-13, where each number is the repeat count at a different marker. This makes it easy to compare sets of results. But you need to be careful: make sure that you are comparing the numbers at the same markers, because different companies may not list them in the same order.
Let us look at 4 test results. Markers used are DYS19, DYS388, DYS390, DYS391, DYS392, and DYS393…
From this we can make out that A and C are a perfect match, that A, B and C are closely related, and that D, with so many differences, may be unrelated.
A sample test result is given below: STR values for 37 markers, from which the common Y-DNA haplogroup R-M269 is predicted.
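A comparison like this is easy to sketch in Python. The repeat counts below are invented purely for illustration (they are not real test results); only the marker names are taken from the text. Results are compared by marker name rather than by position, which avoids the ordering problem between companies mentioned earlier.

```python
MARKERS = ["DYS19", "DYS388", "DYS390", "DYS391", "DYS392", "DYS393"]

# Hypothetical results for four men; the values are made up for this sketch.
results = {
    "A": {"DYS19": 14, "DYS388": 12, "DYS390": 24, "DYS391": 11, "DYS392": 13, "DYS393": 13},
    "B": {"DYS19": 14, "DYS388": 12, "DYS390": 25, "DYS391": 11, "DYS392": 13, "DYS393": 13},
    "C": {"DYS19": 14, "DYS388": 12, "DYS390": 24, "DYS391": 11, "DYS392": 13, "DYS393": 13},
    "D": {"DYS19": 15, "DYS388": 13, "DYS390": 22, "DYS391": 10, "DYS392": 11, "DYS393": 13},
}


def matching_markers(x: dict, y: dict) -> tuple:
    """Return (number of equal markers, number of markers both men were tested on)."""
    shared = [m for m in MARKERS if m in x and m in y]
    equal = sum(1 for m in shared if x[m] == y[m])
    return equal, len(shared)


if __name__ == "__main__":
    for other in ("B", "C", "D"):
        equal, total = matching_markers(results["A"], results[other])
        print(f"A vs {other}: {equal}/{total}")
```

With these made-up values, A and C match on 6/6 markers, A and B on 5/6, and A and D on only 1/6, which mirrors the kind of conclusion drawn above.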
SNP (Single Nucleotide Polymorphism)
A SNP is a genetic change at a particular base pair that is also passed on to subsequent generations. SNP testing gives a more precise haplogroup, i.e., a specific sub-clade. STR-37, STR-67, STR-111 and Big Y tests are available from FamilyTreeDNA, along with pre-defined individual SNP tests.
An STR is about the number of repeats of a short pattern of A, T, G, C base pairs, whereas a SNP is about a change in a specific position of the A, T, G, C sequence itself.
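The SNP side of that contrast can be sketched as a letter-by-letter comparison of two sequences of equal length. The two short sequences below are invented for illustration, and the function name is my own; this is only meant to show what “a change at a specific position” looks like.

```python
def snp_positions(reference: str, sample: str) -> list:
    """Return (position, reference base, sample base) wherever the two sequences differ.

    Positions are counted from 1. Both sequences must have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length for this simple comparison")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


if __name__ == "__main__":
    # Made-up sequences: the 5th letter changes from A to G.
    print(snp_positions("GATTACAGGT", "GATTGCAGGT"))  # [(5, 'A', 'G')]
```

In an STR mutation the two sequences would differ in length (one more or one fewer repeat), whereas here the length stays the same and a single letter is swapped.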
mtDNA testing (Mitochondrial –DNA testing)
Unlike Y-DNA, which is inherited from father to son, a daughter inherits one X chromosome from her mother and one from her father, so testing X-DNA won't give a clean picture of a single ancestral line. Fortunately there is another form of DNA called mtDNA, found in the mitochondria, which are the energy source of the cell. It is passed on through the egg, so it can be inherited only from the mother. The mother got it from her mother, and the link can be traced all the way back to “mitochondrial Eve”. Mitochondria are tiny energy factories located in almost every cell of the body. Most cells contain hundreds to thousands of mitochondria, and each mitochondrion carries several copies of mtDNA.
In the diagram above, if you want to trace the mtDNA of Joan, you may think it is difficult, since Joan did not have any daughters to carry her mtDNA, and so there is currently no living descendant of Joan who can take the test. In such cases you may be successful if you go some generations up the genealogical tree. We need to work backwards to go forward. You can see that Joan's mother Anne has a living descendant, Samuel, who carries the same mtDNA. Again, Samuel cannot pass the same mtDNA on to his children; it is passed on only through mothers.
mtDNA is a small circular DNA molecule consisting of 16,569 base pairs, which code for 37 genes. It consists of 3 regions, HVR1, HVR2 and CR (Coding Region), as shown below.
HVR1: base pairs 16001 -16569
HVR2: base pairs 001 -574
CR: Base pairs 575-16000
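The three ranges above can be turned into a small lookup, which is handy when reading a list of mutation positions. This is only a sketch using the boundaries as listed here; the function name is my own.

```python
MTDNA_LENGTH = 16569  # total number of base pairs in mtDNA, as stated above


def mtdna_region(position: int) -> str:
    """Return 'HVR1', 'HVR2' or 'CR' for a 1-based mtDNA position."""
    if not 1 <= position <= MTDNA_LENGTH:
        raise ValueError(f"position must be between 1 and {MTDNA_LENGTH}")
    if position >= 16001:
        return "HVR1"
    if position <= 574:
        return "HVR2"
    return "CR"


if __name__ == "__main__":
    for pos in (263, 575, 16000, 16519):
        print(pos, mtdna_region(pos))
```

Because mtDNA is circular, HVR1 (the end of the numbering) and HVR2 (the start of the numbering) actually sit next to each other on the molecule.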
Traditionally mtDNA tests covered only the HVR1 and HVR2 base pairs. Now, with prices coming down, all 16,569 base pairs are tested, which gives good information about one's maternal line.
Once the mtDNA is tested by one of the above methods, it is compared with a reference mtDNA sequence, and any differences between the reference DNA and the test taker's DNA are listed.
There are 3 different reference sequences to compare with:
- The Cambridge Reference Sequence (CRS). This was published in 1981, with a sample taken from a European female.
- The revised Cambridge Reference Sequence (rCRS). It is an updated version of the CRS, with several errors corrected.
- The Reconstructed Sapiens Reference Sequence (RSRS). This was introduced in 2012.
When the mtDNA contains a different nucleotide from the reference sequence, the difference is indicated by the position and the abbreviated nucleotide, such as 538C for a cytosine that has replaced the reference nucleotide at position 538. More details are shown in the table below.
|Mutation|What it means|
|263G|Unlike the reference sequence, the tested mtDNA has a G (guanine) at position 263|
|A263G|The tested mtDNA has replaced the A (adenine) at position 263 of the reference sequence with G|
|309.1C|Compared to the reference sequence, the tested mtDNA has an extra C (cytosine) after position 309|
|309.2C|Compared to the reference sequence, the tested mtDNA has two extra C after position 309|
|522-|The tested mtDNA is missing the nucleotide found at position 522 in the reference sequence|
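The notation in the table follows a regular pattern, so it can be decoded mechanically. Below is a rough Python sketch that handles only the four forms shown in the table; the function name and the wording of the output are my own, and real reports may contain other forms that this does not cover.

```python
import re

BASES = {"A": "adenine", "C": "cytosine", "G": "guanine", "T": "thymine"}

# Optional reference base, position, optional ".n" for insertions, then new base or "-".
PATTERN = re.compile(r"^([ACGT])?(\d+)(?:\.(\d+))?([ACGT]|-)$")


def describe_mutation(code: str) -> str:
    """Turn a code such as '263G', 'A263G', '309.1C' or '522-' into plain words."""
    match = PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised mutation code: {code!r}")
    ref_base, position, insertion, new_base = match.groups()
    if new_base == "-":
        return f"deletion at position {position}"
    if insertion is not None:
        return f"insertion {insertion} of {BASES[new_base]} after position {position}"
    if ref_base is not None:
        return f"{BASES[ref_base]} at position {position} replaced by {BASES[new_base]}"
    return f"{BASES[new_base]} at position {position} differs from the reference"


if __name__ == "__main__":
    for code in ("263G", "A263G", "309.1C", "309.2C", "522-"):
        print(code, "->", describe_mutation(code))
```

A code like 309.2C is read here as “the second inserted base after position 309”, which is why it normally appears together with 309.1C in a report.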
Family Tree DNA uses the RSRS as a reference sequence and lists the differences between the test taker's DNA and the reference DNA. 23andMe also tests mtDNA, but it uses SNP testing rather than sequencing HVR1 and HVR2: it examines approximately 3000 SNPs located all along the mtDNA. 23andMe does not provide a list of differences between the test taker's mtDNA and the reference sequence, although test takers can review or download their mtDNA information in order to compare it to a reference sequence themselves.
Compare and Contrast Y-DNA and mtDNA
The following table gives a comparison of Y-DNA and mtDNA.
atDNA test (Autosomal Test)
Autosomal DNA (atDNA) refers to the twenty-two pairs of non-sex chromosomes found within the nucleus of every cell. atDNA, unlike mtDNA and Y-DNA, is inherited equally from both parents. Accordingly, an individual gets one chromosome of each pair from dad and the other from mom, and unfortunately the test alone cannot tell which one came from whom.
A child inherits only 50% of each parent's autosomal DNA, leaving the other half behind. This occurs at every generation, so that on average he inherits only about 25% of each grandparent's DNA, 12.5% of each great-grandparent's and 6.25% of each great-great-grandparent's.
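The halving at every generation is just repeated multiplication by one half, so the expected share from an ancestor n generations back is (1/2)^n. A tiny Python sketch follows, with the caveat that these are averages; the actual share from any ancestor beyond a parent varies from person to person. The function name is my own.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from one ancestor n generations back."""
    if generations_back < 1:
        raise ValueError("generations_back must be 1 (parent) or more")
    return 0.5 ** generations_back


if __name__ == "__main__":
    labels = ["parent", "grandparent", "great-grandparent", "great-great-grandparent"]
    for n, label in enumerate(labels, start=1):
        print(f"{label}: {expected_share(n) * 100:g}%")
```

By 10 generations back the expected share from any single ancestor is below 0.1%, which is one reason autosomal tests are most useful for fairly recent relatives.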
DNA Genealogy Testing
Warning: If you are going for DNA testing you should be ready for surprises. Sometimes it throws up shocking news and reveals family secrets! Someone in the family line may turn out to have been adopted, or the news can be still more shocking. For example, Thomas Jefferson, one of the founding fathers of the American nation, left no son from his marriage who survived to adulthood. But an African American family had long claimed to be descendants of Jefferson, and in 1998 a Y-DNA study linked a male-line descendant of that family to the Jefferson male line. This lent strong support to the rumours, which had circulated for about two centuries, that he fathered children with Sally Hemings, his slave.
There are many companies doing DNA genealogy testing. Family Tree DNA, 23andMe and Ancestry have good reputations and large databases to compare with. Testing involves reading your DNA as well as interpreting the results by comparing them with a reference database. Most of these databases have a lot of data from Europe and the USA but very little reference data from India. So Indians should not expect an accurate interpretation of their ethnicity until more samples from India are tested. Fortunately you can upload your DNA results to third-party sites (e.g. GEDmatch) and get a report free.
You can get the test kit from Amazon; the collection method varies from saliva to cheek swab. You can read a little more about this at http://in.pcmag.com/software/117456/guide/the-best-dna-testing-kits-of-2018
The following table shows which test (Y-DNA, mtDNA or atDNA) to take once you have finalized your goal.
Once you have received the DNA analysis data from any of the sites mentioned above, you can upload the raw data to GEDmatch and get a more detailed analysis report, as shown below.